
recovery of a missing bit requires a more complex pattern to be reliable.

Shannon entropy formalizes this notion. In Shannon’s original formulation, the entropy (H) of a particular message source (my roommate’s speech, my cat’s vocalizations, Han Solo’s prevarications) is given by an equation,[1] the precise details of which are not essential for our purposes here, that specifies how unlikely a particular message is, given specifications about the algorithm encoding the message. A particular string of noises coming out of my cat is (in general) far more likely than any particular string of noises that comes out of my roommate; my roommate’s speech shows a good deal more variation between messages, and between pieces of a given message. A sentence uttered by him has far higher Shannon entropy than a series of meows from my cat. So far, then, this seems like a pretty good candidate for what our intuitive sense of complexity might be tracking: information about complex systems has far more Shannon entropy than information about simple systems. Have we found our answer? Is complexity just Shannon entropy? Alas, things are not quite that easy. Let’s look at a few problem cases.
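To make the contrast between the cat and the roommate concrete, here is a minimal sketch in Python. It is an illustration only, not the formulation used in the text: it assumes the simplest memoryless form of the entropy, H = -Σ p log2 p, estimated from symbol frequencies in a single message, and the function name `shannon_entropy` and both sample "messages" are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Empirical entropy in bits per symbol: H = -sum(p * log2(p)).

    Assumption: each symbol is treated as an independent draw, and the
    probabilities are estimated from frequencies within this one message.
    """
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical messages: the cat repeats itself, the roommate does not.
cat = ["meow"] * 7 + ["purr"]
roommate = ["the", "cat", "sat", "on", "my", "keyboard", "again", "today"]

print(shannon_entropy(cat))       # well under 1 bit per symbol
print(shannon_entropy(roommate))  # 3 bits per symbol: 8 equally likely words
```

Under these assumptions, the more evenly the probability is spread across possible symbols, the higher the entropy: the roommate's eight distinct words score higher than the cat's nearly uniform meowing.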

First, consider again the "toy science" from Section 1.3. We know that for each bit in a given string, there are two possibilities: the bit could be either a ‘1’ or a ‘0.’ In a truly random string in this language, knowing the state of a particular bit doesn’t tell us anything about the state of any other bits: there’s no pattern in the string, and the state of each bit is informationally independent of each of the others. What’s the entropy of a string like that—what’s the entropy of a


  1. $H = \sum_i P_i H_i$. This equation expresses the entropy in terms of a sum of probabilities $p_i(j)$ for producing various symbols $j$ such that the message in question is structured the way it is. Thus, the more variation you can expect in each bit of the message, the higher the entropy of the total message. For a more detailed discussion of the process by which this equation can be derived, see Shannon (1948) and Shannon & Weaver (1964).
