Friday, May 18, 2012

Confusion Theory

On Wednesday I went to the SFI public lecture by James Gleick (né Chaos and now The Information). Most amazingly, he dispensed with the PowerPlonk and actually did a lecture from notes. (The night before, I attended our regular VFD medical training. I got there early because the guy who is supposed to set up all the media crap whined about me hogging the station's notebook computer to do real work and demanded that I deliver it to the training site early. There was this very strong-handshake kind of older gentleman standing around wearing a shirt from one of our sister districts, so I introduced myself just to be friendly. He said something like, "I guess there will be a PowerPoint presentation and all that." And I said, "It's pretty much required these days, isn't it?" Turns out he was our presenter -- a retired Army flight surgeon -- and, yes, he had a PP of gory field-surgery photos ready to go.) Less amazingly, he (Gleick) spent the first 15 of his 30 minutes talking around Shannon Information Theory without actually coming out and admitting that Shannon Information is NOT what every layman in the world thinks it is: it has nothing to do with Meaning (see my attempted simplification here). He finally made a few passes at separating Information from Meaning, but I felt that the border stayed rather porous through the remainder of his talk.

While trying to formulate a post-lecture question, it occurred to me that the two (Information and Meaning) are orthogonal measures in much the same way that Entropy and Complexity are in the classic Crutchfield and Young (1989) paper, where complexity is plotted against entropy and falls off at both extremes.
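For the record -- my notation, not anything from the lecture -- the two standard quantities I'm leaning on below are Shannon entropy and mutual information:

\[
  H(X) = -\sum_{x} p(x)\,\log_2 p(x),
  \qquad
  I(X;Y) = H(X) + H(Y) - H(X,Y)
\]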
Since Information is just how many bits you have to play with and is measured as entropy, let's call the X-axis Information Entropy (which it actually is in the context of that paper). Then let's call the Y-axis -- hmm, not exactly Meaning... I haven't heard a name bandied about for this quantity, so something adjacent -- Data. By Data I "mean" self-correlation and/or perhaps mutual information among otherwise random bits of Information -- or maybe, Facts. If you have a noisy Information stream, you might be able to extract some actual Data from it, e.g., get a series of temperatures from a bunch of ice core compositions. And to beat the analogy a little harder, you don't get much Data from the entropy extremes: if entropy is low, the Information is a constant, and if it's high, it's completely random.
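To make that concrete, here is a toy sketch (mine, not Gleick's; the 80% fidelity and the four temperature bins are arbitrary illustration choices) that estimates the entropy of a constant stream, a coin-flip stream, and a noisy "ice core" stream, and then checks how much mutual information the noisy stream shares with the temperature series buried inside it:

    # Toy sketch: empirical Shannon entropy and mutual information.
    # H(X) = sum_x p(x) log2(1/p(x));  I(X;Y) = H(X) + H(Y) - H(X,Y)
    import math
    import random
    from collections import Counter

    def entropy(symbols):
        """Empirical Shannon entropy (bits per symbol) of a sequence."""
        n = len(symbols)
        return sum((c / n) * math.log2(n / c) for c in Counter(symbols).values())

    def mutual_information(xs, ys):
        """Empirical mutual information between two aligned sequences."""
        return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

    random.seed(0)
    n = 100_000
    constant = [0] * n                                       # low entropy: nothing to play with
    coin_flips = [random.randint(0, 1) for _ in range(n)]    # high entropy: pure noise
    temps = [random.randint(0, 3) for _ in range(n)]         # the underlying "facts"
    noisy = [t if random.random() < 0.8 else random.randint(0, 3) for t in temps]  # noisy stream carrying them

    print(f"H(constant)          = {entropy(constant):.3f} bits")        # ~0
    print(f"H(coin flips)        = {entropy(coin_flips):.3f} bits")      # ~1
    print(f"H(noisy stream)      = {entropy(noisy):.3f} bits")           # ~2
    print(f"I(noisy; temps)      = {mutual_information(noisy, temps):.3f} bits")       # > 1 bit: extractable Data
    print(f"I(coin flips; temps) = {mutual_information(coin_flips, temps):.3f} bits")  # ~0: no Data at all

The constant stream and the pure-noise stream sit at the entropy extremes; only the noisy-but-structured stream yields Data you can pull back out.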

But our Data doesn't really mean anything until it gets combined with other facts extracted from other streams and related back to the real world. So Meaning is yet a third axis to consider. That axis is Semiotics, which is exactly the study of how symbols take on meaning.

Unfortunately, my question window closed long before I could articulate this.

But in the course of re-thinking it, another thing occurred to me. The lecture was titled "How We Come to Be Deluged by Tweets". Twitter is a perfect example of increasing Information Entropy on the web. So, in "fact", using Shannon Information to describe the contents of the internet may not be so far off base.
