

Chapter Five

Complexity, Chaos, and Challenges in Modeling Complex Systems

5.0 A Road Map

We concluded the last chapter with something of a cliff-hanger: I argued that while the classical scientific method of decomposing systems into their constituent parts and studying the behavior of those parts in isolation has been spectacularly successful in the history of science, a number of contemporary problems have forced us to look for tools to supplement that approach. We saw that both biology and climate science have begun to explore more holistic models, with the hope that those perspectives will shed some light on issues that have stymied the decompositionalist approach. The bulk of the last chapter was dedicated to exploring a simplified climate model—the zero-dimensional energy balance model—and to articulating the physical intuitions behind the mathematics of that model. Near the end, we discussed the highly heterogeneous family of models called “Earth models of intermediate complexity,” and thought about the relationship between those models and the concept of dynamical complexity. I suggested that while EMICs shouldn’t be thought of as inferior imitations of more comprehensive models, the project of getting a clear understanding of the patterns that underlie the global climate will involve recruiting all available tools. To that end, I would like to spend this chapter discussing cutting-edge, high-level climate models, with particular attention to the computer simulations in which many of these models are implemented. This chapter will be the first to engage with some of the more controversial aspects of climate science, and will constitute a direct response to the critique of climatology as a “cooked up” enterprise—a “science by simulation.”

Here’s how things will go. In Section 5.1, we’ll begin to examine some of the more difficult points of climate science, with special attention to features of the global climate system that contribute to its high dynamical complexity. In particular, we’ll focus on two aspects of the global climate which, while neither necessary nor sufficient for high dynamical complexity in themselves, are characteristic of complex systems: the presence of non-linear feedback mechanisms, and the presence of chaotic behavior. We’ll think about what it means for a system to be chaotic, and how the presence of feedback mechanisms (which are represented as non-linearities in the mathematics describing the system’s behavior) can contribute to chaos. I shall argue that careful attention to these two factors can shed a tremendous amount of light on some of the vagaries of climatology. We will see that the kind of model we constructed in 4.1 is incapable of handling these issues, and will survey some more robust models which attempt to come to terms with them.

After describing some of the problems endemic to the study of the Earth’s climate (and the models designed to solve them), we shall consider how climate scientists meet the methodological challenges they face in actually using more sophisticated models. In Section 5.2, we will discuss one of the defining tools in the climatologist’s tool-kit: computer simulation. The construction of simulations—computer-solved models designed to be run repeatedly—is a methodological innovation common to many complex system sciences; we’ll think about why this is the case, and consider the relationship between the challenges presented by non-linearity and chaos, and the unprecedented methodological opportunities presented by modern supercomputers. I will argue that while “science by simulation” is an absolutely indispensable approach that climate science must take advantage of, it also comes with its own set of novel pitfalls, which must be carefully marked if they are to be avoided. More specifically, I argue that careful attention to the nature of chaos should force us to attend to the limitations of science by simulation, even in ideal conditions. It is worth emphasizing that these limitations are just that, though: limitations, and not absolute barriers. Popular dissatisfaction with the role that computational models play in climate sciences is largely a result of conflating these two notions, and even some people who ought to know better sometimes confuse the existence of chaos with the impossibility of any significant forecasting. We’ll think about the nature of the limitations imposed by chaos (especially in light of the method of computational model building), and see how those general limitations apply to climate science. Finally, I’ll argue that even with these limitations taken into account, the legitimate predictions made by climate science have serious implications for life on Earth.

5.1 The Challenges of Modeling Complexity

Individual special sciences have been increasingly adopting the concepts and methods of complexity theory, but this adoption has been a piecemeal response to the failures of the decompositionalist method in individual domains. So far, there exists little in the way of an integrative understanding of the methods, problems, or even central concepts underlying the individual approaches. Given the highly practical nature of science, this should not be terribly surprising: science does the best with the tools it has, and creates new tools only in response to new problems. The business of science is to figure out patterns in how the world changes over time, and this business requires a degree of specialized knowledge that makes it natural to focus on the trees rather than the forest (unless you happen to be working in forestry science). As a result, we’re at one of those relatively unusual (so far) junctures where there is genuinely important multidisciplinary conceptual clarification waiting to be done.

We’ve been in this situation before. The mechanistic revolution of the scientific enlightenment forced us to confront the question of how humanity might fit into a world that was fundamentally physical, leading to an explosion of new philosophical ideas about man and his place in nature. More recently, the non-classical revolution in the early 20th century forced us to refine concepts that we’d taken to be rock-solid in our conception of the world, and the philosophical implications of quantum mechanics and relativity are still being fought out in ways that are actually relevant to the progress of science.[1] There is similar room for conceptual work here. The time is ripe for philosophical analysis, which makes it all the more distressing that so little philosophical attention has been paid to the topic of complexity.

One of the consequences of the piecemeal way in which complexity-theoretic considerations have taken hold in the special sciences is that there’s a good deal of confusion about how to use some of the central concepts. It is instructive to note that many of the same terms (e.g. “emergence,” “self-organized,” “chaotic”) show up in complexity-motivated discussions of very diverse sciences, and there’s surely a sense in which most variations of those terms show a kind of family resemblance. Still, the fact that they are often defined with a specific context in mind means that it is not always easy to explicitly state the common core of these important terms as they appear across disciplines. Articulating this common core in a systematic way is one of the most important foundational contributions that remains to be made, as it will provide a common language in which scientists interested in complexity (but trained in different disciplines) can come together to discuss their work. Doing this ground-clearing work is also a necessary precursor to the more daunting task of defining complexity itself. While I cannot hope to disentangle all the relevant concepts here, I would like to now turn to an examination of two of the most important for our purposes: non-linearity and chaos. Where our discussions of complexity have thus far been principally focused on defining complexity, this section focuses on the practical challenges of actually working with dynamically complex systems. We would do well to keep the distinction between these two lines of discussion clear in our minds, though—while the issues we’ll be discussing in this chapter are characteristic of complex systems, they are not definitive of them. That is, neither non-linearity nor chaos (nor the conjunction of the two) is sufficient for dynamical complexity[2].

5.1.1 Non-Linearity

Before we can tackle what it means to say that a system’s behavior is non-linear, we need to get some basic terminology under our belt. Complex systems theory is built largely on the back of a more general approach to scientific modeling called dynamical systems theory, which deals with the creation of mathematical models describing change (“dynamics”) in parts of the world (“systems”) as time progresses. For our purposes, a few of the methods of dynamical systems theory (DyST) are particularly worth flagging.

First, it’s important to note that DyST takes change as its primary object of interest. This might seem obvious given the name of the field, but it is vital that we appreciate the degree to which this assumption colors the DyST approach to scientific model-building. Rather than focusing on particular instantaneous states of systems—say, the position and momentum of each particle in a box of gas, or particular weather-states (the like of which were the focus of the qualitative approach to weather forecasting discussed in Chapter Four)—DyST focuses on ensembles of states that describe a system over some time period, not just at a single instant. The central mathematical tool of DyST is an equation that describes how different physical quantities of a system (e.g. force, mass, and velocity in Newtonian physics; populations of predator animals and prey animals in ecology; presence and concentration of certain atmospheric chemicals and global temperature in climatology) vary in relation to one another over time. That is, DyST is concerned with modeling how physical quantities differ with respect to one another at different times in a system’s lifetime—in most systems, this is accomplished through the use of differential equations, which describe how variables change in response to one another[3]. The familiar Newtonian equation of motion (F = ma) is a simple differential equation, as it relates the change in velocity[4] (acceleration) to other quantities of interest (force and mass) in physical systems.

We can think of a system of interest (for example, a box of gas) as being represented by a very large space of possible states that the system can take. For something like a box of gas, this space would be composed of points, each of which represents the specific position and velocity of each molecule in the system.[5] For Newtonian systems like gasses, this space is called a phase space. More generally, a space like this—where the complete state of a system at a particular time is represented by a single point—is called a configuration space or state space. Since DyST is concerned with modeling not just a system at a particular time (but rather over some stretch of time), we can think of a DyST model as describing a path that a system takes through its state space. The succession of points represents the succession of states that the system goes through as it changes over time.

Given a configuration space and a starting point for a system, then, DyST is concerned with watching how the system moves from its starting position. The differential equations describing the system give a kind of “map”—a set of directions for how to figure out where the system will go next, given a particular position. The configuration space and the differential equations work together as a tool-kit to model the behavior of the system in question over time. The differential equation describes how interesting quantities (e.g. position and velocity) of the system change, and the configuration space is a representation of all the different possible values those quantities can take. The advantage of this approach should be obvious: it lets us reduce difficult questions about how complicated systems behave to mathematically-tractable questions about tracing a path through a space according to a rule. This powerful modeling tool is the heart of DyST.
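To make this picture concrete, here is a minimal sketch in Python (the function names, the step size, and the toy equation dx/dt = −x are my own choices, made purely for illustration) of how a differential equation plus an initial condition can be used to trace a path through a state space, one small step at a time:

    import numpy as np

    def trace_trajectory(f, x0, dt=0.01, steps=1000):
        """Follow the 'map' given by dx/dt = f(x): starting from the point x0 in
        state space, repeatedly take a small step in the direction that the
        differential equation dictates (simple Euler integration)."""
        path = [np.asarray(x0, dtype=float)]
        for _ in range(steps):
            x = path[-1]
            path.append(x + dt * f(x))
        return np.array(path)

    # Toy example: exponential decay toward equilibrium, dx/dt = -x
    trajectory = trace_trajectory(lambda x: -x, x0=[1.0])
    print(trajectory[0], trajectory[-1])  # starts at 1.0, ends very near 0

Real modeling work uses more sophisticated integration schemes, but the logic is the same: the equation tells you which way to move next, and the succession of points is the system's trajectory through its state space.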

Some systems can be modeled by a special class of differential equations: linear differential equations. Intuitively, a system’s behavior can be modeled by a set of linear differential equations if: (1) the behavior of the system is (in a sense that we shall articulate more precisely soon) the sum of the behavior of the parts of the system, and (2) the variables in the model of the system vary with respect to one another at constant rates[6]. (1) should be relatively familiar: it’s just the decompositionalist assumption[7] we discussed back at the end of Chapter Four! This assumption, as we saw, is innocuous in many cases. In the case of a box of gas, for example, we could take the very long and messy differential equation describing how all the trillions of molecules behave together and break it up into a very large collection of equations describing the behavior of individual molecules, and (hopefully) arrive at the very same predictions. There’s no appreciable[8] interaction between individual molecules in a gas, so breaking the system apart into its component parts, analyzing the behavior of each part, and then taking the system to be (in some sense) the “sum” of that behavior should yield the same prediction as considering the gas as a whole.

It’s worth briefly considering some of the technicalities behind this condition. Strictly speaking, the additivity condition on linearity makes no reference to “parts,” as it is a condition on equations, not physical systems being modeled by equations. Rather, the condition demands that given any set of valid solutions to the equation describing the behavior of the system, the linear combination of those solutions is itself a solution. This formal statement, though more precise, runs the risk of obfuscating the physical (and philosophical) significance of linearity, so it is worth thinking more carefully about this condition with a series of examples.

Linearity is sometimes referred to as “convexity,” especially in discussions that are grounded in set-theoretic ways of framing the issue[9]. In keeping with our broadly geometric approach to thinking about these issues, this is perhaps the most intuitive way of presenting the concept. Consider, for instance, the set of points that define a sphere in Euclidean space. This set is convex (in both the ordinary sense and the specialized sense under consideration here), since if we take any two points that are inside the sphere, then the linear combination—the weighted average of the two points—is also inside the sphere. Moreover, the line connecting the two points will be inside the sphere, the triangle defined by connecting any three points will lie entirely inside the sphere, and so on. More formally, we can say that a set of points is convex if for all points x₁, x₂, …, xₙ in the set,

5(a)		a₁x₁ + a₂x₂ + … + aₙxₙ

is also in the set as long as

5(b)		a₁ + a₂ + … + aₙ = 1, where each aᵢ ≥ 0

The condition in 5(b) is necessary to ensure that the summation in 5(a) is just a weighted average of the values of the points; otherwise we could always produce points outside the initial set just by multiplying the points under consideration by arbitrarily large values. It’s easy to see that while the set of points defining a sphere is convex, the set of points defining a torus—a donut shape—is not. Two points can be inside the set, while their weighted average--a point on the line connecting them--is outside the set (think of two points on either side of the “hole” in the middle of a donut, for instance).
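The contrast between the sphere and the donut can also be checked numerically. The following is a minimal sketch in Python (the helper functions in_ball and in_annulus and all the constants are invented for illustration): it hunts for pairs of points that belong to a set but whose weighted average does not.

    import numpy as np

    rng = np.random.default_rng(0)

    def in_ball(p):
        # A filled sphere (ball) of radius 1 centered at the origin (any dimension)
        return np.linalg.norm(p) <= 1.0

    def in_annulus(p):
        # A two-dimensional "donut": points between 0.5 and 1 unit from the origin
        return 0.5 <= np.linalg.norm(p) <= 1.0

    def convexity_counterexample(membership, trials=10000):
        """Look for two points in the set whose midpoint (an equally weighted
        average) falls outside the set."""
        for _ in range(trials):
            x, y = rng.uniform(-1, 1, 2), rng.uniform(-1, 1, 2)
            if membership(x) and membership(y) and not membership(0.5 * x + 0.5 * y):
                return x, y
        return None

    print(convexity_counterexample(in_ball))     # None: no counterexample turns up
    print(convexity_counterexample(in_annulus))  # typically two points straddling the hole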

Why is this particular sort of geometric structure relevant to our discussion here? What is it about sets that behave like spheres rather than like donuts that make them more well-behaved mathematical representations of physical systems? We’ll return to that question in just a moment, but first let’s briefly examine the other way of articulating the linearity condition—(2) described above. Ultimately, we shall see that these two conditions are, at least in most cases of relevance to us, just different ways of looking at the same phenomenon. For the moment, though, it is dialectically useful to examine each of the two approaches on its own.

The second condition for linearity given above is a condition not on the relationship between the parts of the system, but on the relationship between the quantities described by the differential equation in question. (2) demands that the way that the quantities described by the equation vary with respect to one another remain constant. To get a sense of what that means, it’s probably easiest to think about some cases where the requirement holds, and then think about some cases where the requirement doesn’t hold. Suppose you’re walking on a treadmill, and want to vary the speed at which the belt is moving so that you walk more quickly or more slowly. You can do this by pressing the up and down arrows on the speed control; each time you press one of the arrows, the speed of the belt will change by (say) .1 MPH. This is an example of a variation that satisfies condition (2). We could write down a simple differential equation relating two quantities: the number of times you’ve pressed each button, and the speed at which the treadmill’s belt is moving. No matter how many times you press the button, though, the value of the button press will remain constant: the amount by which pressing the up arrow varies the speed doesn’t depend on how many times you’ve pressed the button, or on how fast the treadmill is already turning. Whether you’re walking slowly at one mile per hour or sprinting at 15 miles per hour, pressing that button will always result in a change of .1 mile per hour. Condition (2) is satisfied.[10]

OK, with an understanding of what a system must look like in order to be linear, let’s think about what sorts of systems might fail to satisfy these requirements. Let’s return to the treadmill example again, and think about how it might be designed so that it fails to satisfy (2). Suppose that we were designing a treadmill to be used by Olympic sprinters in training. We might decide that we need fine-grained speed control only at very high speeds, and that it’s more important for the athletes to get up to sprint speed quickly than to have fine control over lower speeds. With that in mind, we might design the treadmill such that if the speed is less than (say) 10 MPH, each button press increments or decrements the speed by 2 MPH. Once the speed hits 10 MPH, though, we need more fine-grained control, so each button press only changes the current speed by 1 MPH. At 15 MPH, things get even more fine-grained, and each press once again changes things by .1 MPH. In this case, condition (2) is not satisfied: the relationship between the quantities of interest in the system (number of button presses and speed of the belt) doesn’t vary at a constant rate. Just knowing that you’ve pressed the “up arrow” button three times in the last minute is no longer enough for me to calculate how much the speed of the belt has changed: I need to know what the starting speed was, and I need to know how the relationship between button presses and speed changes varies with speed. Predicting the behavior of systems like this is thus a bit more complicated, as there is a higher-order relationship present between the changing quantities of the system.
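The difference between the two treadmills is easy to express in code. Here is a minimal sketch in Python (the function names and the particular speed thresholds simply restate the hypothetical design described above):

    def standard_treadmill(presses, start_speed=0.0):
        """Linear case: every press of the up arrow adds 0.1 MPH,
        no matter what the current speed is."""
        return start_speed + 0.1 * presses

    def sprinter_treadmill(presses, start_speed=0.0):
        """Non-linear case: what a button press is worth depends on the current
        speed (2 MPH below 10 MPH, 1 MPH between 10 and 15 MPH, 0.1 MPH above)."""
        speed = start_speed
        for _ in range(presses):
            if speed < 10:
                speed += 2.0
            elif speed < 15:
                speed += 1.0
            else:
                speed += 0.1
        return speed

    # Three presses always mean the same thing on the standard treadmill...
    print(standard_treadmill(3, 1.0), standard_treadmill(3, 14.0))   # 1.3 and 14.3 (up to rounding)
    # ...but on the sprinter's treadmill the starting speed matters.
    print(sprinter_treadmill(3, 1.0), sprinter_treadmill(3, 14.0))   # 7.0 and roughly 15.2

In the first function, knowing the number of presses is enough to know the change in speed; in the second, the same three presses produce different changes depending on where the system already is.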

5.1.2 Two Illustrations of Non-Linearity

The logistic function for population growth in ecology is an oft-cited example of a real-world non-linear system. The logistic function models the growth of a population of individuals as a function of time, given some basic information about the context in which the population exists (e.g. the carrying-capacity of the environment). One way of formulating the equation is:

5(c)		dP/dt = rP(1 − P/K)

P represents the number of individuals in the population, r represents the relative rate at which the members of the population reproduce when unchecked, and K represents the carrying capacity of the environment. Though quite simple, the logistic equation displays quite interesting behavior across a wide spectrum of circumstances. When P is low—when there are relatively few members of a population—growth can proceed almost unchecked, as the P/K term is negligible and the right side of the equation is dominated by rP. As the population grows in size, though, the value of P/K increases, and the carrying capacity of the environment K—how many (say) deer the woods can support before they begin to eat themselves out of house and home—becomes increasingly important. Eventually, the contribution of the P/K term outpaces the contribution of r, putting a check on population growth. More sophisticated versions of the logistic equation—versions in which, for instance, K itself varies as a function of time or even as a function of P—show even stronger non-linear behavior.[11] It is this interrelationship between the variables in the equation that makes models like this one non-linear. Just as with the Olympic treadmill we described above, the values of the relevant variables in the system of differential equations describing the system depend on one another in non-trivial ways; in the case of the treadmill, the value of a button-press varies with (and affects) the speed of the belt, and in the case of the logistic equation, the rate of population growth varies with (and affects) extant population. This general behavior—the presence of feedbacks—is characteristic of non-linear systems.
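A few lines of Python make the behavior of 5(c) easy to see. This is a minimal sketch (the parameter values and the crude Euler integration scheme are my own choices for illustration):

    import numpy as np

    def logistic_growth(P0, r, K, dt=0.1, steps=1000):
        """Integrate dP/dt = r * P * (1 - P/K) with simple Euler steps."""
        P = np.empty(steps + 1)
        P[0] = P0
        for i in range(steps):
            P[i + 1] = P[i] + dt * r * P[i] * (1 - P[i] / K)
        return P

    population = logistic_growth(P0=10, r=0.5, K=1000)
    # Early on the growth is nearly exponential; later it levels off near K.
    print(population[0], population[50], population[200], population[-1])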

Let us consider a more realistic concrete example by way of illustration: the relationship between material wealth and subjective utility. On the face of it, we might assume that the relationship between these two quantities is linear, at least in most cases. It seems reasonable, that is, to think that getting $10 would not only leave you with more utility--make you happier--than getting $5 would, but also that it would leave you with twice as much utility. Empirical investigation has not supported this idea, though, and contemporary economic theory generally holds that the relationship between wealth and utility is non-linear.

This principle, called the principle of diminishing marginal utility, was originally developed as a response to the St. Petersburg Paradox of decision theory. Consider a casino game in which the pot begins at a single dollar, and a fair coin is tossed repeatedly. After each toss, if the coin comes up heads the quantity of money in the pot is doubled. If the coin comes up tails, the game ends and the player wins whatever quantity is in the pot (i.e. a single dollar if the first toss comes up tails, two dollars if the second toss comes up tails, four if the third toss comes up tails, &c.). The problem asks us to consider what a rational gambler ought to be willing to pay for the privilege of playing the game. On the face of it, it seems as if a rational player ought to be willing to pay anything less than the expected value of a session of the game--that is, if the player wants a shot at actually making some money, she should be willing to pay the casino anything less than the sum of all the possible amounts of money she could win, each multiplied by the probability of winning that amount. The problem is that the value of this sum grows without bound: there is a probability of one-half that she will win one dollar, probability one-fourth that she’ll win two dollars, probability one-eighth that she’ll win four dollars, &c. More formally, the probability that the game ends on the kth toss—and thus that she wins 2^(k−1) dollars—is (1/2)^k, and so the overall expected value of playing the game (assuming that the house has unlimited resources and will allow the game to continue until a flip comes up tails) is given by:

5(d)		E = (1/2)(1) + (1/4)(2) + (1/8)(4) + … = 1/2 + 1/2 + 1/2 + …

If the amount of money that our gambler should be willing to pay to play a game is constrained only by the demand that it be less than the expected return from the game, then this suggests that she should pay any finite amount of money for a chance to play the game just once. That seems very strange. While there are a number of solutions to this problem, the one of most immediate interest to us was proposed in Bernoulli (1738).[12] Bernoulli suggested that we ought to think of utility gained from the receipt of a quantity of some good (in this case money) as being inversely proportional to the quantity of that same good already possessed. He justifies this by pointing out that

The price of the item is dependent only on the thing itself and is equal for everyone; the utility, however, is dependent on the particular circumstances of the person making the estimate. Thus there is no doubt that a gain of one thousand ducats is more significant to a pauper than to a rich man though both gain the same amount[13]

Bernoulli’s original suggestion of this fairly straightforward (albeit still non-linear) relationship between wealth and utility has been refined and expanded by a number of thinkers.[14] The failure of variations in utility to be tied linearly to variations in wealth, though, can be understood as a failure of condition (2) from Section 5.1.1—the wealth/utility relationship is like the Olympic treadmill. More recently, empirical work in the social sciences has gone even further. Kahneman and Deaton (2010) argue that utility (or, as they put it, “emotional well-being”) increases with the logarithm of wealth, but only up to a point. On their account, plotting the relationship between utility and wealth yields a strongly concave function, which is what we ought to expect. However, they also argue that there is a leveling off point in the function, beyond which “there is no improvement whatever in any of the three measures of emotional well-being[15].”
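The contrast between linear utility and Bernoulli's logarithmic proposal is easy to see numerically. Here is a minimal sketch in Python (the function name and the truncation scheme are mine; cutting the infinite sum off at a maximum number of flips is just a way of watching how the two valuations behave as the game gets longer):

    import math

    def truncated_expected_utility(max_flips, utility=lambda payout: payout):
        """Expected utility of the St. Petersburg game, cut off after max_flips
        tosses. The game that ends on toss k pays 2**(k - 1) dollars and occurs
        with probability (1/2)**k."""
        return sum((0.5 ** k) * utility(2 ** (k - 1)) for k in range(1, max_flips + 1))

    # With linear utility the truncated sum just keeps climbing (k/2 after k flips)...
    print([truncated_expected_utility(n) for n in (10, 100, 1000)])
    # ...but with logarithmic utility it converges to a modest, finite value.
    print([round(truncated_expected_utility(n, utility=math.log), 4) for n in (10, 100, 1000)])

The first list grows without bound as the cutoff is raised; the second settles down, which is the formal heart of Bernoulli's resolution of the paradox.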

Of course, it is worth noting that Kahneman and Deaton’s investigation involved observation only of residents of the United States. Interestingly, as Kahneman and Deaton point out, the mean income in the United States at the time in which they conducted their research was just under $72,000: very close to the mark at which they observed the disappearance of any impact of increased income on emotional well-being.[16] There is at least some reason to think that this is not entirely a coincidence. McBride (2001) argues that the impact of changes in wealth on an agent’s subjective utility depends not just on how much wealth the subject already possesses, but also on wealth possessed by others in the agent’s social circles. That is, being wealthier than those around you might itself have a positive impact on your subjective utility--an impact that is at least partially independent of the absolute quantity of wealth you possess. McBride found that people are made happier by being the richest people in a poorer neighborhood, and that increasing their wealth (but moving them to a cohort where they’d be among the poorest members) might result in a decrease in subjective utility! This hints at what might be a partial explanation for the effect described by Kahneman and Deaton: being less wealthy than average is itself a source of negative subjective utility.

This suggests that the relationship between wealth and utility also fails to satisfy condition (1) from Section 5.1.1. Given a group of people (neighbors, for instance), the differential equations describing the change in utility of members of the group relative to their changes in wealth will resist decomposition, because their utilities are a function not just of their own wealth, but of the wealth of other members of the community as well. By decomposing the system into component parts, we would miss this factor, which means that even if we took the principle of diminishing marginal utility into account in our calculations, the decompositionalist approach would still fail to capture the actual dynamics of the overall system. A more holistic approach is required.

This suggests an important lesson for the study of natural systems in which non-linearities play a significant role: the presence of unexpected feedback and variable degrees of mutual influence between different components of a system might well mean that attempts to model the system’s behavior by way of aggregating models of the components are, if not exactly doomed to failure, at least of very limited use. We must be extraordinarily careful when we attempt to tease general predictions about the future of the global climate out of families of EMICs for precisely this reason. We shall return to this point in Section 5.2, but first let us turn our attention to the other central challenge to be discussed here: chaotic behavior.

5.1.3 Chaos

Like non-linearity, chaos is best understood as a dynamical concept—a feature of how systems change over time that is represented by certain conditions on the DyST models of those systems. Chaos has played an increasingly central role in a number of sciences since the coinage of the term “butterfly effect” in the mid-20th century as a response to Lorenz (1963)[17]. Indeed, the evocative idea of the butterfly effect—the idea that the flapping of a butterfly’s wings on one side of the world can lead to a hurricane on the other side of the world days later—has percolated so thoroughly into popular culture that the broad strokes of the concept are familiar even to many laypeople. Still, the specifics of the concept are often misunderstood, even by many philosophers of science. In particular, chaotic systems are sometimes thought to be indeterministic, a mistake which has the potential to create a great deal of confusion. Let’s think things through slowly, and add on the formalism as we get a better handle on the concept.

Let’s start here: suppose that it is in fact true that the flapping of a butterfly’s wings in Portugal can spawn a hurricane off the coast of Mexico days later. Here’s a question that should immediately jump out at us: under what conditions does something like this happen? Clearly, it cannot be the case that every butterfly’s flapping has this sort of catastrophic effect, as there are far more butterfly flaps than there are hurricanes. That is, just saying that a tiny change (like a flap) can cause a big change (like a hurricane) doesn’t tell us that it will, or give us any information about what the preconditions are for such a thing to happen. This point is worth emphasizing: whatever a chaotic system is, it is not a system where every small change immediately “blows up” into a big change after a short time. We’ll need to get more precise.

Let’s stick with the butterfly effect as our paradigm case, but now consider things from the perspective of DyST. Suppose we’ve represented the Earth’s atmosphere in a state space that takes into account the position and velocity of every gas molecule on the planet. First, consider the trajectory in which the nefarious butterfly doesn’t flap its wings at some time t1, and the hurricane doesn’t develop at a later time t2. This is a perfectly well-defined path through the state space of the system that can be picked out by giving an initial condition (starting point in the space), along with the differential equations describing the behavior of the air molecules. Next, consider the trajectory in which the butterfly does flap its wings at t1, and the hurricane does develop at t2. What’s the relationship between these two cases? Here’s one obvious feature: the two trajectories will be very close together in the state space at t1—they’ll differ only with respect to the position of the few molecules of air that have been displaced by the butterfly’s wings—but they’ll be very far apart at t2. Whatever else a hurricane does, it surely changes the position and velocity of a lot of air molecules (to say the least!). This is an interesting observation: given the right conditions, two trajectories through state space can start off very close together, then diverge as time goes on. This simple observation is the foundation of chaos theory.

Contrast this case with the case of a clearly non-chaotic system: a pendulum, like the arm on a grandfather clock. Suppose we define a state space where each point represents a particular angular velocity and displacement angle from the vertical position for the pendulum. Now, look at the trajectory that the pendulum takes through the state space based on different initial conditions. Suppose our initial condition consists in the pendulum being held up at 70 degrees from its vertical position and released. Think about the shape that the pendulum will trace through its state space as it swings. At first, the angular velocity will be zero (as the pendulum is held ready). As the pendulum falls, its position will change in an arc, so its angular displacement will approach zero until it hits the vertical position, where its angular velocity will peak. The pendulum is now one-quarter of the way through a full period, and begins its upswing. Now, its angular displacement starts to increase (it gets further away from vertical), while its angular velocity decreases (it slows down). Eventually, it will hit the top of this upswing, and pause for a moment (zero angular velocity, high angular displacement), and then start swinging back down. If the pendulum is a real-world one (and isn’t being fed by some energy source), it will repeat this cycle some number of times. Each time, though, its maximum angular displacement will be slightly lower—it won’t make it quite as high—and its maximum angular velocity (when it is vertical) will be slightly smaller as it loses energy to friction. Eventually it will come to rest.

If we plot this behavior in a two-dimensional state space (with angular displacement on one axis and angular velocity on the other), we will see the system trace a spiral-shaped trajectory ending at the origin. Angular velocity always falls as angular displacement grows (and vice-versa), so each full period will look like an ellipse, and the loss of energy to friction will mean that each period will be represented by a slightly smaller ellipse as the system spirals toward its equilibrium position of zero displacement and zero velocity: straight up and down, and not moving. See Figure 5.1 for a rough plot of what the graph of this situation would look like in a state-space for the pendulum.



Fig. 5.1


Now, consider the difference between this case and a case where we start the pendulum at a slightly smaller displacement angle (say, 65 degrees instead of 70). The two trajectories will (of course) start in slightly different places in the state space (both will start at zero angular velocity, but will differ along the other axis). What happens when you let the system run this time? Clearly, the shape it traces out through the state space will look much the same as the shape traced out by the first system: a spiral approaching the point (0,0). Moreover, the two trajectories should never get further apart, but rather will continue to approach each other more and more quickly as they near their point of intersection[18]. The two trajectories are similar enough that it is common to present the phase diagram like Figure 5.1: with just a single trajectory standing in for all the variations. Trajectories which all behave similarly in this way are said to be qualitatively identical. The trajectories for any initial condition like this are sufficiently similar that we simplify things by just letting one trajectory stand in for all the others (this is really handy when, for instance, the same system can show several different classes of behavior for different initial conditions, and keeps the phase diagram from becoming too crowded)[19].
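This qualitative similarity is easy to verify numerically. Here is a minimal sketch in Python using SciPy's general-purpose integrator (the damping coefficient and other constants are arbitrary illustrative choices): it evolves the 70-degree and 65-degree releases side by side and shows that the gap between them shrinks rather than grows.

    import numpy as np
    from scipy.integrate import solve_ivp

    def damped_pendulum(t, state, g=9.8, L=1.0, damping=0.25):
        """State = (angular displacement, angular velocity) for a pendulum with friction."""
        theta, omega = state
        return [omega, -(g / L) * np.sin(theta) - damping * omega]

    t_eval = np.linspace(0, 30, 3001)
    a = solve_ivp(damped_pendulum, (0, 30), [np.radians(70), 0.0], t_eval=t_eval)
    b = solve_ivp(damped_pendulum, (0, 30), [np.radians(65), 0.0], t_eval=t_eval)

    separation = np.linalg.norm(a.y - b.y, axis=0)
    # The two trajectories start close together and end even closer,
    # as both spiral in toward the resting state (0, 0).
    print(separation[0], separation[-1])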

Contrast this to the butterfly-hurricane case from above, when trajectories that started very close together diverged over time; the small difference in initial conditions was magnified over time in one case, but not in the other. This is what it means for a system to behave chaotically: small differences in initial condition are magnified into larger differences as the system evolves, so trajectories that start very close together in state space need not stay close together.

Lorenz (1963) discusses a system of equations first articulated by Saltzman (1962) to describe the convective transfer of some quantity (e.g. average kinetic energy) across regions of a fluid:

5(e)		dx/dt = σ(y − x)
5(f)		dy/dt = x(ρ − z) − y
5(g)		dz/dt = xy − βz

In this system of equations, x, y, and z represent the modeled system’s position in a three-dimensional state space[20] (x, for instance, represents the intensity of convective motion), while σ, ρ, and β are parameterizations representing how strongly (and in what way) changes in each of the state variables are connected to one another.

The important feature of Lorenz’s system for our discussion is this: the system exhibits chaotic behavior only for some parameterizations. That is, it’s possible to assign values to σ, ρ, and β such that the behavior of the system in some sense resembles that of the pendulum discussed above: similar initial conditions remain similar as the system evolves over time. This suggests that it isn’t always quite right to say that systems themselves are chaotic. It’s possible for some systems to have chaotic regions in their state spaces, such that small differences in overall state are magnified over time not when the system is initialized, but rather when (and if) it enters the chaotic region. That is, it is possible for a system’s behavior to go from non-chaotic (where trajectories that are close together at one time stay close together) to chaotic (where trajectories that are close together at one time diverge)[21]. Similarly, it is possible for systems to find their way out of chaotic behavior. Attempting to simply divide systems into chaotic and non-chaotic groups drastically over-simplifies things, and obscures the importance of finding predictors of chaos—signs that a system may be approaching a chaotic region of its state space before it actually gets there[22].
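Both kinds of behavior are easy to exhibit with the system 5(e)-5(g) itself. Here is a minimal sketch in Python using SciPy (σ = 10 and β = 8/3 are the values Lorenz studied; ρ = 28 is his chaotic case, while ρ = 10 is simply an illustrative choice for which the system settles onto a stable fixed point; the perturbation size and integration time are arbitrary):

    import numpy as np
    from scipy.integrate import solve_ivp

    def lorenz(t, state, sigma, rho, beta):
        """The system of equations 5(e)-5(g)."""
        x, y, z = state
        return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

    def final_separation(rho, perturbation=1e-6, t_end=30.0):
        """Evolve two initial conditions that differ by a tiny perturbation
        and report how far apart they end up."""
        args = (10.0, rho, 8.0 / 3.0)
        a = solve_ivp(lorenz, (0, t_end), [1.0, 1.0, 1.0],
                      args=args, rtol=1e-9, atol=1e-12)
        b = solve_ivp(lorenz, (0, t_end), [1.0 + perturbation, 1.0, 1.0],
                      args=args, rtol=1e-9, atol=1e-12)
        return np.linalg.norm(a.y[:, -1] - b.y[:, -1])

    print(final_separation(rho=28.0))  # chaotic parameterization: the tiny difference blows up
    print(final_separation(rho=10.0))  # non-chaotic parameterization: the trajectories stay together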

Another basic issue worth highlighting is that chaos has absolutely nothing to do with indeterminism: a chaotic system can be deterministic or stochastic, according to its underlying dynamics. If the differential equations defining the system’s path through its state space contain no probabilistic elements, then the system will be deterministic. Many (most?) chaotic systems of scientific interest are deterministic. The confusion here stems from the observation that the behavior of systems in chaotic regions of their state space can be difficult to predict over significant time-scales, but this is not at all the same as their being non-deterministic. Rather, it just means that the more unsure I am about the system’s exact initial position in state space, the more unsure I am about where it will end up after some time has gone by. Such systems can be difficult to forecast in virtue of uncertainty about whether things started out in exactly one condition or another, but that (again) does not make them indeterministic. We will return to this in much greater detail in Section 3 once we are in a position to synthesize our discussions of chaos and path-dependence.

Exactly how hard is it to predict the behavior of a system once it finds its way into a chaotic region? It’s difficult to answer that question in any general way, and saying anything precise is going to require that we at least dip our toes into the basics of the mathematics behind chaotic behavior. We’ve seen that state space trajectories in chaotic regions diverge from one another, but we’ve said nothing at all about how quickly that divergence happens. As you might expect, this is a feature that varies from system to system: not all chaotic behavior is created equal. The rate of divergence between two trajectories is given by a particular number—the Lyapunov exponent—that varies from system to system (and from trajectory to trajectory within the system[23]). The distance between two trajectories—one starting from the point x₀ and the other from the point y₀—after a time t can, for any given system, be expressed as:

5(h)		|xₜ − yₜ| ≈ |x₀ − y₀| e^(λt)


where λ is the “Lyapunov exponent,” which quantifies the rate of divergence. The time-scales at which chaotic effects come to dominate the dynamics of the system, then, depend on two factors: the value of the Lyapunov exponent, and how much divergence we’re willing to allow between two trajectories before we’re willing to consider it significant. For systems with a relatively small Lyapunov exponent, divergence at short timescales will be very small, and will thus likely play little role in our treatment of the system (unless we have independent reasons for requiring very great precision in our predictions). Likewise, there may be cases when we care only about whether the trajectory of the system after a certain time falls into one or another region of state space, and thus can treat some amount of divergence as irrelevant.

This point is not obvious but it is very important; it is worth considering some of the mathematics in slightly more detail before we continue on. In particular, let’s spend some time thinking about what we can learn by playing around a bit with the definition of a chaotic system given above.

To begin, let D be some neighborhood on ℝⁿ such that, for all pairs of points, x₀, y₀ ∈ D iff

5(i)		|x₀ − y₀| ≤ ε

That is, let D be some neighborhood in an n-dimensional space such that for all pairs of points that are in D, the distance between those two points is less than or equal to some small value ε. If D is a neighborhood in the state space of some dynamical system with Lyapunov exponent λ, then combining 5(h) and 5(i) lets us deduce

5(j)		|xₜ − yₜ| ≤ ε·e^(λt)

In other (English) words, if the space is a state space for some dynamical system with chaotic behavior, then for all times after the initialization time, the size of the smallest neighborhood that must include the successors to some collection of states that started off arbitrarily close together will increase as a function of the fastest rate at which any two trajectories in the system could diverge (i.e. the MLE) and the amount of time that has passed (whew!). That’s a mouthful, but the concepts behind the mathematics are actually fairly straightforward. In chaotic systems, the distance between two trajectories through the state space of the system increases exponentially as time goes by—two states that start off very close together will eventually evolve into states that are quite far apart. How quickly this divergence takes place is captured by the value of the Lyapunov exponent for the trajectories under consideration (with the “worst-case” rate of divergence defining the MLE). Generalizing from particular pairs of trajectories, we can think about defining a region in the state space. Since regions are just sets of points, we can think about the relationship between our region’s volume at one time and the smallest region encompassing the end-state of all the trajectories that started in that region at some later time. This size increase will be straightforwardly related to the rate at which individual trajectories in the region diverge, so the size of the later region will depend on three things: the size of the initial region, the rate at which paths through the system diverge, and the amount of time elapsed[24]. If our system is chaotic, then no matter how small we make our region the trajectories followed by the states that are included in it will, given enough time, diverge significantly[25].
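The Lyapunov exponent in 5(h) can be estimated directly from this kind of divergence. Here is a minimal sketch in Python (again using the Lorenz system with its classic parameter values; the perturbation size, the integration time, and the cutoff used to restrict the fit to the early growth phase are all arbitrary illustrative choices):

    import numpy as np
    from scipy.integrate import solve_ivp

    def lorenz(t, state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
        x, y, z = state
        return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

    # Evolve two trajectories that start a distance epsilon apart and watch
    # how the logarithm of their separation grows with time.
    epsilon = 1e-8
    t_eval = np.linspace(0, 15, 1501)
    a = solve_ivp(lorenz, (0, 15), [1.0, 1.0, 1.0], t_eval=t_eval, rtol=1e-10, atol=1e-12)
    b = solve_ivp(lorenz, (0, 15), [1.0 + epsilon, 1.0, 1.0], t_eval=t_eval, rtol=1e-10, atol=1e-12)

    separation = np.linalg.norm(a.y - b.y, axis=0)
    # While the separation stays small, log(separation) grows roughly linearly;
    # the slope of that line is an estimate of the maximal Lyapunov exponent.
    growth_phase = separation < 1.0
    slope, _ = np.polyfit(t_eval[growth_phase], np.log(separation[growth_phase]), 1)
    print(slope)  # on the order of 1 for these parameter values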

How much does this behavior actually limit the practice of predicting what chaotic systems will do in the future? Let’s keep exploring the mathematics and see what we can learn. Consider two limit cases of the inequality in 5(j). First:

5(k)		lim (|x₀ − y₀| → 0) |xₜ − yₜ| = 0

This is just the limiting case of perfect measurement of the initial condition of the system—a case where there’s absolutely no uncertainty in our first measurement, and so the size of our “neighborhood” of possible initial conditions is zero. As the distance between the two points in the initial pair approaches zero, the distance between the corresponding pair at time t will also shrink toward zero. Equivalently, if the size of the neighborhood is zero—if the neighborhood includes one and only one point—then we can be sure of the system’s position in its state space at any later time (assuming no stochasticity in our equations). This is why the point that chaotic dynamics are not the same thing as indeterministic dynamics is so important. However:

5(l)		lim (λ → 0) e^(λt) = 1

As the Lyapunov exponent approaches zero, the second term on the right side of the inequality in 5(j) approaches unity. This represents another limiting case—one which is perhaps even more interesting than the first one. Note that 5(k) is still valid for non-chaotic systems: the MLE is just set to zero, and so the distance between two trajectories will remain constant as those points are evolved forward in time[26]. More interestingly, think about what things look like if λ > 0 (the system is chaotic) but still very small. No matter how small λ is, chaotic behavior will appear whenever t ≫ 1/λ: even a very small amount of divergence becomes significant on long enough time scales. Similarly, if t ≪ 1/λ then we can generally treat the system as if it is non-chaotic (as in the case of the orbits of planets in our solar system). The lesson to be drawn is that it isn’t the value of either t or λ that matters so much as the ratio between the two values.
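Put in terms of prediction: if we fix a tolerance for how much error we are willing to live with, the inequality in 5(j) can be turned around to give a rough time horizon. The following is a minimal sketch (the numbers are entirely made up for illustration):

    import math

    def prediction_horizon(lyapunov, epsilon, tolerance):
        """Time at which an initial uncertainty of size epsilon grows to the given
        tolerance, assuming divergence at the rate e^(lyapunov * t) as in 5(j)."""
        return math.log(tolerance / epsilon) / lyapunov

    # Hypothetical numbers: initial measurement uncertainty of 1e-6 (in whatever
    # units the state space uses), with errors mattering once they exceed 1.0.
    print(prediction_horizon(lyapunov=0.9, epsilon=1e-6, tolerance=1.0))   # about 15
    print(prediction_horizon(lyapunov=0.09, epsilon=1e-6, tolerance=1.0))  # about 154

Dividing the exponent by ten buys roughly ten times the horizon, which is just the point about the ratio again: what matters is how the elapsed time compares to 1/λ.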

5.1.4 Prediction and Chaos

It can be tempting to conclude from this that if we know λ, ε, and t, then we can put a meaningful and objective “horizon” on our prediction attempts. If we know the amount of uncertainty in the initial measurement of the system’s state (ε), the maximal rate at which two paths through the state space could diverge (λ), and the amount of time that has elapsed between the initial measurement and the time at which we’re trying to make our prediction (t), then shouldn’t we be able to design things to operate within the uncertainty by defining the relevant macroconditions of our system so that none is finer-grained than ε·e^(λt)? If this were true, it would be very exciting—it would let us deduce the best way to construct our models from the dynamics of the system under consideration, and would tell us how to carve up the state space of some system of interest optimally given the temporal scales involved.

Unfortunately, things are not this simple. In particular, this suggestion assumes that the state space can be neatly divided into continuously connected macroconditions, and that it is not possible for a single macrostate’s volume to be distributed across a number of isolated regions. It assumes, that is, that simple distance in state-space is always going to be the best measure of qualitative similarity between two states. This is manifestly not the case. Consider, for instance, the situation in classical statistical mechanics. Given some macrocondition M* at an initial time t₀, what are the constraints on the system’s state at a later time t₁? We can think of M* as being defined in terms of 5(j)—that is, we can think of M* as being a macrocondition that’s picked out in terms of some neighborhood of the state space of S that satisfies 5(j).

By Liouville’s Theorem, we know that the total density ρ of states is constant along any trajectory through phase space. That is:

5(m)		dρ/dt = 0

However, as Albert (2000) points out, this only implies that the total phase space volume is invariant with respect to time. Liouville’s theorem says absolutely nothing about how that volume is distributed; it only says that all the volume in the initial macrocondition has to be accounted for somewhere in the later macrocondition(s). In particular, we have no reason to expect that all the volume will be distributed as a single path-connected region at t₁: we just know that the original volume of M* must be accounted for somehow. That volume could be scattered across a number of disconnected states, as shown in Figure 5.2.



Fig. 5.2


While the specifics of this objection are only relevant to statistical mechanics, there is a more general lesson that we can draw: the track that we started down a few pages ago—of using formal features of chaos theory to put a straightforward cap on the precision of our predictions about a given system after a certain amount of time—is not as smooth and straight as it may have initially seemed. In particular, we have to attend to the fact that simple distance across a state-space may not always be the best measure of the relative “similarity” between two different states; the case of thermodynamics and statistical mechanics provides an existence proof for this claim. Without an independent measure of how to group regions of a state space into qualitatively similar conditions—thermodynamic macroconditions in this case—we have no way of guaranteeing that just because some collection of states falls within the bounds of the region defined by 5(j) they are necessarily all similar to one another in the relevant respect. This account ignores the fact that two states might be very close together in state space, and yet differ in other important dynamical respects.

Generalizing from this case, we can conclude that knowing λ, ε, and t is enough to let us put a meaningful cap on the resolution of future predictions (i.e. that they can be only as fine-grained as the size of the neighborhood given by ε·e^(λt)) only if we stay agnostic about the presence (and location) of interesting macroconditions when we make our predictions. That is, while the inequality in 5(j) does indeed hold, we have no way of knowing whether or not the size and distribution of interesting, well-behaved regions of the state-space will correspond neatly with the size of the neighborhoods defined by that inequality.

To put the point another way, restricting our attention to the behavior of some system considered as a collection of states can distract us from relevant factors in predicting the future of the system. In cases where the dynamical form of a system can shift as a function of time, we need to attend to patterns in the formation of well-behaved regions (like those of thermodynamic macroconditions)—including critical points and bifurcations—with just as much acumen as we attend to patterns in the transition from one state to another. Features like those are obscured when we take a static view of systems, and only become obvious when we adopt the tools of DyST.

5.1.5 Feedback Loops

In Section 5.1.2, we considered the relationship between non-linearities in the models of dynamical systems and the presence of feedback generally. Our discussion there, however, focused on an example drawn from economics. Moreover, we didn’t discuss feedback mechanisms themselves in much detail. Let us now fill in both those gaps. While CGCMs are breathtakingly detailed models in many respects, one of their most distinctive features is their detailed incorporation of feedback mechanisms into their outputs--a task that is impossible for EBMs and is met by individual EMICs only within their narrow domains of application (if it is met at all). Since CGCMs are characterized as a group by their melding of atmospheric, oceanic, and land-based models, let’s begin by considering a representative sample of an important feedback mechanism from each of these three domains.

While feedback mechanisms are not definitive of complex systems like the climate, they are frequently the sources of non-linear behavior in the natural world, and so are often found in real-world complex systems. It’s not difficult to see why this is the case; dynamically complex systems are systems in which interesting behavioral patterns are present from many perspectives and at many scales (see Chapter Three), and thus their behavior is regulated by a large number of mutually interacting constraints. [27] Feedback mechanisms are a very common way for natural systems to regulate their own behavior. Dynamically complex systems, with their layers of interlocking constraints, have ample opportunity to develop a tangled thicket of feedback loops. Jay Forrester, in his 1969 textbook on the prospects for developing computational models of city growth, writes that “a complex system is not a simple feedback loop where one system state dominates the behavior. It is a multiplicity of interacting feedback loops [the behavior of which is] controlled by nonlinear relationships.[28]” The global climate is, in this respect, very similar to an active urban center.

Feedback mechanisms are said to be either positive or negative, and the balance and interplay between these two different species of feedback is often the backbone of self-regulating dynamical systems: the global climate is no exception. Positive feedback mechanisms are those in which the action of the mechanism serves to increase the parameter representing the input of the mechanism itself. If the efficacy of the mechanism for producing some compound A depends (in part) on the availability of another compound B and the mechanism which produces compound B also produces compound A, then the operation of these two mechanisms can form a positive feedback loop—as more B is produced, more A is produced, which in turn causes B to be produced at a greater rate, and so on. Consider, for example, two teenage lovers (call them Romeo and Juliet) who are particularly receptive to each other’s affections. As Romeo shows more amorous interest in Juliet, she becomes more smitten with him as well. In response, Romeo—excited by the attention of such a beautiful young woman—becomes still more affectionate. Once the two teenagers are brought into the right sort of contact—once they’re aware of each other’s romantic feelings—their affection for each other will rapidly grow. Positive feedback mechanisms are perhaps best described as “runaway” mechanisms; unless they’re checked (either by other mechanisms that are part of the system itself or by a change in input from the system’s environment), they will tend to increase the value of some parameter of the system without limit. In the case of Romeo and Juliet, it’s easy to see that once the cycle is started, the romantic feelings that each of them has toward the other will, if left unchecked, grow without bound. This can, for obvious reasons, lead to serious instability in the overall system—most interesting systems cannot withstand the unbounded increase of any of their parameters without serious negative consequences. The basic engineering principles underlying the creation of nuclear weapons exploit this feature of positive feedback mechanisms: the destructive output of nuclear weapons results from the energy released during the fission of certain isotopes of (in most cases) uranium or plutonium. Since fission of these heavy isotopes produces (among other things) the high-energy neutrons necessary to begin the fission process in other nearby atoms of the same isotope, the fission reaction (once begun) can—given the right conditions—become a self-sustaining chain reaction, where the result of each step in the cycle causes subsequent steps, which are both similar and amplified. Once the fission reaction begins it reinforces itself, resulting in the rapid release of energy that is the nominal purpose of nuclear weapons.

Of course, in most real-world cases the parameters involved in positive feedback loops are not able to increase without bound. In most cases, that is, dynamical systems that include positive feedback loops also include related negative feedback loops, which provide a check on the otherwise-unbounded amplification of the factors involved in the positive feedback loops. While positive feedback loops are self-reinforcing, negative feedback loops are self-limiting; in the same way that positive loops can lead to the rapid destabilization of dynamical systems in which they figure, negative loops can help keep dynamical systems in which they figure stable.

Consider, for instance, a version of the story of Romeo and Juliet in which the teenage lovers are somewhat more dysfunctional. In this version of the tale, Romeo and Juliet still respond to each others’ affections, but they do so in the opposite way as in the story told above. Romeo, in this story, likes to “play hard to get:” the more he sees that Juliet’s affections for him are growing, the less interested he is in her. Juliet, on the other hand, is responsive to encouragement: the more Romeo seems to like her, the more she likes him. It’s easy to see that the story’s outcome given this behavior will be far different than the outcome in which their affections are purely driven by mutually reinforcing positive feedback loops. Rather than growing without bound, their affections will tend to stabilize at a particular level, the precise nature of which is determined by two factors: the initial conditions (how much they like each other to begin with), and the level of responsiveness by each teen (how much Juliet’s affection responds to Romeo’s reciprocity, and how much Romeo’s affection responds to Juliet’s enthusiasm). Depending on the precise tuning of these values, the relationship may either stabilize in a mutually congenial way (as both lovers are drawn toward a middle ground of passion), or it may stabilize in a way that results in the relationship ending (as Romeo’s lack of interest frustrates Juliet and she gives up). In either case, the important feature of the example is its eventual movement toward a stable attractor.[29]
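The two versions of the story can be captured in a toy pair of coupled differential equations. Here is a minimal sketch in Python (the coefficients, the small "fading" term, and the initial affections are my own arbitrary choices; this is an illustration of the feedback structure, not a serious model of anything):

    import numpy as np
    from scipy.integrate import solve_ivp

    def lovers(t, state, a, b, fade=0.2):
        """dR/dt = a*J - fade*R, dJ/dt = b*R - fade*J: each lover's affection
        responds to the other's (a and b) and fades a little on its own (fade)."""
        R, J = state
        return [a * J - fade * R, b * R - fade * J]

    t_span, start = (0, 10), [1.0, 1.0]
    mutual = solve_ivp(lovers, t_span, start, args=(0.5, 0.5))        # both respond positively
    hard_to_get = solve_ivp(lovers, t_span, start, args=(-0.5, 0.5))  # Romeo pulls away

    print(mutual.y[:, -1])       # runaway growth: the positive feedback loop dominates
    print(hard_to_get.y[:, -1])  # the affections oscillate and damp down toward a resting point

Flipping the sign of a single coupling coefficient converts a runaway positive feedback loop into a self-limiting one, which is the structural point of the example.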

5.2.2 The Role of Feedback Loops in Driving Climate Dynamics

Similar feedback mechanisms play central roles in the regulation and evolution of the global climate system. Understanding the dynamics and influence of these mechanisms is essential to understanding the limitations of basic models of the sort considered in Chapter Four. Some of the most important positive feedback mechanisms are both obvious and troubling in their behavior. Consider, for instance, the relationship between planetary albedo and warming. Albedo, as you may recall from Chapter Four, is a value representing the reflectivity of a given surface; it ranges from 0 to 1, with higher values representing greater reflectivity. Albedo is associated with one of the most well-documented positive feedback mechanisms in the global climate. As the planet warms, the area of the planet covered by snow and ice tends to decrease.[30] Snow and ice, being white and highly reflective, have a fairly high albedo compared with either open water or bare land. As more ice melts, then, the planetary (and local) albedo decreases. This results in more radiation being absorbed, leading to further warming and further melting. It's easy to see that, left unchecked, this process could drive runaway warming, with each small increase in temperature encouraging further, larger increases. This positive feedback is left out of more basic climate models, which lack the formal structure needed to represent it.
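
To see how this feedback can be grafted onto the kind of zero-dimensional energy balance model discussed in Chapter Four, consider the following sketch. The albedo ramp, the effective emissivity, and the 1% solar perturbation are all illustrative assumptions rather than calibrated values; the point is only that the same forcing produces more warming when albedo is allowed to respond to temperature than when it is held fixed.

```python
# A minimal zero-dimensional energy balance sketch with a crude
# temperature-dependent albedo, illustrating the ice-albedo feedback.
# The albedo ramp and the effective emissivity are toy assumptions,
# not a calibrated parameterization.

SIGMA = 5.67e-8   # Stefan-Boltzmann constant (W m^-2 K^-4)
S0 = 1368.0       # solar constant (W m^-2)
EPS = 0.61        # effective emissivity (tuned so the baseline is ~288 K)

def albedo(temp_k, interactive=True):
    if not interactive:
        return 0.30
    # Toy ice-albedo ramp: warmer planet -> less ice -> lower albedo.
    a = 0.30 - 0.004 * (temp_k - 288.0)
    return min(max(a, 0.15), 0.65)

def equilibrium_temp(solar, interactive=True, temp_k=288.0):
    # Fixed-point iteration on (1 - albedo) * S/4 = EPS * SIGMA * T^4.
    for _ in range(500):
        absorbed = (1.0 - albedo(temp_k, interactive)) * solar / 4.0
        temp_k = (absorbed / (EPS * SIGMA)) ** 0.25
    return temp_k

base = equilibrium_temp(S0, interactive=True)
warm_fixed = equilibrium_temp(S0 * 1.01, interactive=False)
warm_interactive = equilibrium_temp(S0 * 1.01, interactive=True)

# The same 1% increase in forcing produces more warming when albedo
# responds to temperature than when it is held fixed: the feedback amplifies.
print(warm_fixed - equilibrium_temp(S0, interactive=False))
print(warm_interactive - base)
```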

Perhaps the most significant set of positive feedback mechanisms associated with the long-term behavior of the global climate are those that influence the capacity of the oceans to act as a carbon sink.[31] The planetary oceans are the largest carbon sinks and reservoirs in the global climate system, containing 93% of the planet's exchangeable[32] carbon. The ocean and the atmosphere exchange something on the order of 100 gigatonnes (Gt) of carbon (mostly as CO2) each year via diffusion (a mechanism known as the "solubility pump") and the exchange of organic biological matter (a mechanism known as the "biological pump"), with a net transfer of approximately 2 Gt of carbon (equivalent to about 7.5 Gt of CO2) to the ocean. Since the industrial revolution, the planet's oceans have absorbed roughly one-third of all anthropogenic carbon emissions.[33] Given the ocean's central role in the global carbon cycle, any feedback mechanism that degrades its ability to act as a carbon sink is likely to make an appreciable difference to the future of the climate in general. There are three primary positive warming feedbacks associated with a reduction in the oceans' ability to sequester carbon:

(1) As anyone who has ever left a bottle of soda in a car on a very hot day (and ended up with an expensive cleaning bill) knows, a liquid's ability to store dissolved carbon dioxide decreases as the liquid's temperature increases (see the brief sketch following this list). As increased CO2 levels in the atmosphere lead to increased air temperatures, the oceans too will warm. This will decrease their ability to "scrub" excess CO2 from the atmosphere, leading to still more warming.

(2) This increased oceanic temperature will also potentially disrupt the action of the Atlantic Thermohaline Circulation. The thermohaline transports a tremendous amount of water--something in the neighborhood of 100 times the amount moved by the Amazon River--and is the mechanism by which the cold, anoxic water of the deep oceans is circulated to the surface. This makes the thermohaline not only essential for deep ocean life (in virtue of oxygenating the depths), but also an important component of the carbon cycle, since the water carried up from the depths can absorb more CO2 than the warmer water near the surface. The thermohaline is driven primarily by differences in water density, which in turn is a function of temperature and salinity[34]. The heating and cooling of water as it is carried along by the thermohaline forms a kind of conveyor belt that keeps the oceans well mixed, through much the same mechanism responsible for the mesmerizing motion of the liquid in a lava lamp. However, the fact that the thermohaline's motion is driven primarily by differences in salinity and temperature means that it is extremely vulnerable to disruption by changes in those two factors. As CO2 concentration in the atmosphere increases and ocean temperatures rise accordingly, melting glaciers and other stores of freshwater ice along routes accessible to the ocean can produce significant influxes of cold, fresh water. This alters both the temperature and the salinity of the oceans, disrupting the thermohaline and inhibiting the ocean's ability to act as a carbon sink. Teller et al. (2002) argue that a similar large-scale influx of cold freshwater (released by the failure of an enormous ice dam at Lake Agassiz) was partially responsible for the massive global temperature instability seen 15,000 years ago during the last major deglaciation[35].

(3) Perhaps most simply, increased acidification of the oceans (i.e. increased carbonic acid concentration as a result of CO2 reacting with ocean water) means slower rates of new CO2 absorption, reducing the rate at which excess anthropogenic CO2 can be scrubbed from the atmosphere.
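
Point (1) above can be made quantitative with a back-of-the-envelope calculation. The sketch below uses Henry's law with a van 't Hoff-style temperature correction; the constants (a CO2 solubility of roughly 0.034 mol per litre per atmosphere at 25 degrees C and a temperature coefficient of roughly 2400 K) are standard textbook approximations, and the calculation deliberately ignores the full carbonate chemistry of seawater. It captures only the raw solubility trend, which is all the feedback argument requires.

```python
# Rough illustration of why warmer water holds less dissolved CO2.
# Henry's law: [CO2(aq)] = kH(T) * pCO2, with a van 't Hoff-style
# temperature dependence for kH. Constants are textbook approximations.
import math

KH_298 = 0.034      # mol L^-1 atm^-1 at 298.15 K (approximate)
VANT_HOFF_C = 2400  # K (approximate temperature coefficient for CO2)
P_CO2 = 400e-6      # partial pressure of CO2, ~400 ppm expressed in atm

def kh(temp_k):
    """Henry's law solubility constant for CO2 at temperature temp_k."""
    return KH_298 * math.exp(VANT_HOFF_C * (1.0 / temp_k - 1.0 / 298.15))

for celsius in (10, 15, 20, 25):
    conc = kh(celsius + 273.15) * P_CO2   # mol of CO2 per litre of water
    print(celsius, round(conc * 1e6, 2))  # micromol per litre

# The output shows dissolved CO2 falling steadily as the water warms,
# which is the basic physics behind feedback (1) above.
```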

Examples like these abound in the climatology literature. As we suggested above, though, perhaps the most important question with regard to climate feedbacks is whether their net influence on climate sensitivity is positive or negative. Climate sensitivity, recall, is the relationship between the change in the global concentration of greenhouse gases (given in units of CO2-equivalent impacts on radiative forcing) and the change in the annual mean surface air temperature (see Chapter Four). If the Earth were a simple system, free of feedbacks and other non-linearly interacting processes, this relationship would be straightforwardly linear: each doubling of CO2-e concentration would produce a fixed increase in radiative forcing which, multiplied by the reference sensitivity parameter λ0 ≈ 0.30 degrees C per W/m2 of additional forcing, would correspond to a mean surface temperature change of roughly 1.2 degrees C at equilibrium[36].

Unfortunately for climate modelers, things are not so simple. The net change in average surface air temperature following a doubling of CO2-e concentration in the atmosphere also depends on (for instance) how the change in radiative forcing caused by that doubling affects the global albedo. The change in the global albedo, in turn, impacts the climate sensitivity by altering the relationship between radiative flux and surface air temperature.

Just as with albedo, we can (following Roe & Baker [2007]) introduce a single parameter φ representing the net influence of feedbacks in the equation describing climate sensitivity:

ΔT = λ0(ΔRf + φΔT)     5(n)

where ΔT is the equilibrium change in mean surface air temperature, ΔRf is the change in radiative forcing, φ is the additional radiative flux contributed by feedbacks per degree of warming, and λ0 is the feedback-free reference sensitivity introduced above. In a feedback-free climate system, we can parameterize 5(n) such that φ = 0, and thus such that ΔT = λ0ΔRf. That is, we can assume that the net impact of positive and negative feedbacks on the total radiative flux is both constant and non-existent. However, just as with albedo, observations suggest that this simplification is inaccurate: φ ≠ 0. Discerning the value of φ is one of the most challenging (and important) tasks in contemporary climate modeling.
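
Rearranging 5(n) gives ΔT = λ0ΔRf / (1 - λ0φ), and even a few lines of code make the difficulty vivid. The sketch below uses the reference sensitivity cited above together with a forcing of roughly 4 W/m2 for a CO2 doubling (values in the neighborhood of 3.7-4 W/m2 are commonly cited); the range of φ values is purely hypothetical, chosen only to show how modest uncertainty in the net feedback parameter becomes large, asymmetric uncertainty in the equilibrium temperature response.

```python
# How uncertainty in the net feedback parameter phi propagates into climate
# sensitivity, using the rearranged form of 5(n):
#     dT = lambda0 * dRf / (1 - lambda0 * phi)
# lambda0 and dRf follow the reference values in the text; the phi values
# below (in W/m2 of extra flux per degree of warming) are hypothetical.

LAMBDA0 = 0.30   # degrees C per W/m2, feedback-free reference sensitivity
DRF_2X = 4.0     # W/m2, approximate forcing from a doubling of CO2

def equilibrium_warming(phi):
    """Equilibrium temperature change for a given net feedback parameter."""
    gain = 1.0 - LAMBDA0 * phi
    if gain <= 0:
        raise ValueError("runaway regime: feedbacks overwhelm the response")
    return LAMBDA0 * DRF_2X / gain

for phi in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
    print(phi, round(equilibrium_warming(phi), 2))

# phi = 0 recovers the ~1.2 degree feedback-free response; equal steps in
# phi yield ever-larger jumps in warming as lambda0 * phi approaches 1,
# which is why pinning down phi is so consequential.
```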

The presence of so many interacting feedback mechanisms is one of the features that makes climatology such a difficult science to get right. It is also characteristic of complex systems more generally. How are we to account for these features when building high-level models of the global climate? What novel challenges emerge from models designed to predict the behavior of systems like this? In Chapter Six, we shall examine Coupled General Circulation Models (CGCMs), which are built to deal with these problems.

  1. The question of how to interpret the formalism of non-relativistic quantum mechanics, for instance, still hasn’t been answered to the satisfaction of either philosophers or physicists. Philosophical attention to the measurement problem in the mid-20th century led directly to the overthrow of the Copenhagen Interpretation, and (more recently) to work on decoherence and einselection (e.g. Zurek [2003]). For an accessible survey of some of the ways in which philosophical thinking has contributed to physics in the 20th century, see Maudlin (2007). For examples of excellent current work in these areas, see Wallace (2011) and (2009), as well as Albert (2000).
  2. Whether or not either of these two features is a necessary feature of dynamically complex systems is a more complicated question. As we shall see, both non-linearity and chaos are best understood as properties of particular models rather than of systems themselves. Dynamically complex systems are (by definition) those which admit of sensible and useful consideration from a large variety of different perspectives; many interesting dynamically complex systems might exhibit chaotic behavior from some perspectives but not others. We should resist the temptation to even consider the question of whether systems like that are “really” chaotic or not in just the same way that we should resist the temptation to generally privilege one set of real patterns describing a system’s time-evolution over the others.
  3. Strictly speaking, differential equations are only applicable to systems in which the values in question can be modeled as varying continuously. In discrete-time systems, a separate (but related) mathematical tool called a difference equation must be used. For our purposes here, this distinction is not terribly important, and I will restrict the rest of the discussion to cases where continuous variation of quantities is present, and thus where differential equations are the appropriate tool.
  4. Of course, velocity too is a dynamical concept that describes the change in something’s position over time. The Newtonian equation of motion is thus a second order differential equation, as it describes not just a change in a basic quantity, but (so to speak) the change in the change in a basic quantity.
  5. This means that for a system like that, the space would have to have 6n dimensions, where n is the number of particles in the system. Why six? If each point in our space is to represent a complete state of the system, it needs to represent the x, y, and z coordinates of each particle’s position (three numbers), as well as the x, y, and z coordinates of each particle’s velocity (three more numbers). For each particle in the system, then, we must specify six numbers to get a complete representation from this perspective.
  6. In mathematical jargon, these two conditions are called “additivity” and “degree 1 homogeneity,” respectively. It can be shown that degree 1 homogeneity follows from additivity given some fairly (for our purposes) innocuous assumptions, but it is heuristically useful to consider the two notions separately.
  7. Ladyman, Lambert, & Wiesner (2011) quite appropriately note that “a lot of heat and very little light” has been generated in philosophical treatments of non-linearity. In particular, they worry about Mainzer (1994)’s claim that “[l]inear thinking and the belief that the whole is only the sum of its parts are evidently obsolete” (p. 1). Ladyman, Lambert, & Wiesner reasonably object that very little has been said about what non-linearity has to do with ontological reductionism, or what precisely is meant by “linear thinking.” It is precisely this sort of murkiness that I am at pains to dispel in the rest of this chapter.
  8. Fans of Wikipedia style guidelines might call “appreciable” here a “weasel-word.” What counts as an appreciable interaction is, of course, the really difficult question here. Suffice it to say that in practice we’ve found it to be the case that assuming no interaction between the molecules here gives us a model that works for certain purposes. A whole separate paper could be written on the DyST account of these ceteris paribus type hedges, but we shall have to set the issue aside for another time.
  9. For a nice case-study in the benefits of framing discussions of non-linearity in terms of convexity, see Al-Suwailem (2005)’s discussion of non-linearity in the context of economic theory and preference-ranking.
  10. Actually, this case satisfies both conditions. We’ve just seen how it satisfies (2), but we could also break the system apart and consider your “up arrow” presses and “down arrow” presses independently of one another and still calculate the speed of the belt. Treadmill speed control is a linear system, and this underscores the point that conditions (1) and (2) are not as independent as this presentation suggests.
  11. Consider, for instance, a circumstance in which the carrying capacity of an environment is partially a function of how much food is present in that environment, and in which the quantity of food available is a function of the present population of another species. This is often the case in predator-prey models; the number of wolves an environment can support partially depends on how many deer are around, and the size of the deer population depends both on how much vegetation is available for the deer to eat and on how likely an individual deer is to encounter a hungry wolf while foraging.
  12. Translation by Sommer (1954).
  13. Op. cit., pp. 158-159
  14. The principle of diminishing marginal utility was developed by a number of economists over the course of several decades, and continues to be refined to this day. See, for example, Menger (1950), Bohm-Bawerk (1955), and McCulloch (1977). While the originators of this principle (particularly Menger and Bohm-Bawerk) were associated with the Austrian school of economics, diminishing marginal utility has found its way into more mainstream neoclassical economic theories (Kahneman and Deaton, 2010).
  15. Kahneman and Deaton (2010), p. 16491
  16. Op. cit., p. 16492
  17. Lorenz (1963) never employs this poetic description of the effect, and the precise origin of the phrase is somewhat murky. In 1972, Lorenz delivered an address to the American Association for the Advancement of Science using the title “Does the Flap of a Butterfly’s Wings in Brazil Set Off a Tornado in Texas?” The resemblance between the Lorenz system’s state space graph (Figure 2) and a butterfly’s wings is likely not coincidental.
  18. This is a defining characteristic of dissipative systems. Conservative systems—such as undamped pendulums that don’t lose energy to friction—will feature trajectories that remain separated by a constant amount.
  19. Indeed, even our pendulum is like this! There is another possible qualitatively identical class of trajectories that’s not shown in Figure 1. Think about what would happen if we start things not by dropping the pendulum, but by giving it a big push. If we add in enough initial energy, the angular velocity will be high enough that, rather than coming to rest at the apex of its swing toward the other side and dropping back down, the pendulum will continue on and spin over the top, something most schoolchildren have tried to do on playground swings. Depending on the initial push given, this over-the-top spin may happen only once, or it may happen several times. Eventually though, the behavior of the pendulum will decay back down into the class of trajectories depicted here, an event known as a phase change.
  20. Precisely what this means, of course, depends on the system being modeled. In Lorenz’s original discussion, x represents the intensity of convective energy transfer, y represents the relative temperature of flows moving in opposite directions, and z represents the degree to which (and how) the vertical temperature profile of the fluid diverges from a smooth, linear profile.
  21. The Phillips curve in economics, which describes the relationship between inflation and unemployment, is a good real-world example of this. Trajectories through economic state space described by the Phillips curve can fall into chaotic regions under the right conditions, but there are also non-chaotic regions in the space.
  22. A number of authors have succeeded in identifying the appearance of a certain structure called a “period-doubling bifurcation” as one predictor of chaotic behavior, but it is unlikely that it is the only such indicator.
  23. Because of this variation—some pairs of trajectories may diverge more quickly than others—it is helpful to also define the maximal Lyapunov exponent (MLE) for the system. As the name suggests, this is just the largest Lyapunov exponent to be found in a particular system. Because the MLE represents, in a sense, the “worst-case” scenario for prediction, it is standard to play it safe and use the MLE whenever we need to make a general statement about the behavior of the system as a whole. In the discussion that follows, I am referring to the MLE unless otherwise specified.
  24. If we have some way of determining the largest Lyapunov exponent that appears in D, then that can stand in for the global MLE in our equations here. If not, then we must use the MLE for the system as a whole, as that is the only way of guaranteeing that the region at the later time will include all the trajectories.
  25. Attentive readers will note the use of what Wikipedia editors call a “weasel word” here. What counts as “significant” divergence? This is a very important question, and will be the object of our discussion for the next few pages. For now, it is enough to note that “significance” is clearly a goal-relative concept, a fact which ends up being a double-edged sword if we’re trying to predict the behavior of chaotic systems. We’ll see how very soon.
  26. If the Lyapunov exponent is negative, then the distance between two paths decreases exponentially with time. Intuitively, this represents the initial conditions all being “sucked” toward a single end-state. This is, for instance, the case with the damped pendulum discussed above—all initial conditions eventually converge on the rest state.
  27. The fact that a particular complex system exhibits interesting behavior at many scales of analysis implies this kind of inter-scale regulation: the features of a given pattern in the behavior of the system at one scale can be thought of a constraint on the features of the patterns at each of the other scales. After all, the choice of a state space in which to represent a system is just a choice of how to describe that system, and so to notice that a system’s behavior is constrained in one space is just to notice that the system’s behavior is constrained period, though the degree of constraint can vary.
  28. Forrester (1969), p. 9
  29. Under some conditions, the situation described here might fall into another class of attractors: the limit cycle. It is possible for some combinations of Romeo and Juliet’s initial interest in each other to combine with features of how they respond to one another to produce a situation where the two constantly oscillate back and forth, with Romeo’s interest in Juliet growing at precisely the right rate to put Juliet off, cooling his affections to the point where she once again finds him attractive, beginning the cycle all over again. In either case, however, the important feature is the attractor’s stability. Both of the fixed-point attractors described in the text (the termination of the courtship and the stabilization of mutual attraction) result in the values of the relevant differential equations “settling down” to predictable behavior. Similarly, the duo’s entrance into the less fortunate (but just as stable) limit cycle represents predictable long-term behavior.
  30. At least past a certain tipping point. Very small amounts of warming can produce (and have produced) expanding sea ice, especially in the Antarctic. The explanation for this involves the capacity of air of different temperatures to bear moisture. Antarctica, historically the coldest place on Earth, is often so cold that snowfall is limited by the temperature-related lack of humidity. As the Antarctic continent has warmed slightly, its capacity for storing moisture has increased, leading to higher levels of precipitation in some locations. This effect is, however, both highly localized and transient. Continued warming will rapidly undo the gains associated with this phenomenon.
  31. Feely et al. (2007)
  32. That is, 93% of the carbon that can be passed between the three active carbon reservoirs (land, ocean, and atmosphere), and thus is not sequestered (e.g. by being locked up in carbon-based minerals in the Earth’s mantle).
  33. Dawson and Spannagle (2007), pp. 303-304
  34. Vallis and Farnetti (2009)
  35. In this case, the temporary shutdown of the thermohaline was actually responsible for a brief decrease in average global temperature--a momentary reversal of the nascent warming trend as the climate entered an interglacial period. This was due to differences in atmospheric and oceanic carbon content; were a similar event to occur today, it would likely have the opposite effect.
  36. Roe & Baker (2007), p. 630