I spent today at a course on Presenting Data and Information, given by Edward Tufte. Tufte is one of those people who are selectively famous: just about everyone in my little professional world has heard of him, whereas just about no one else I know recognizes his name. I was delighted to get the chance to see his course, and as a bonus, I am now the owner of four of his books (they came with the class registration): The Visual Display of Quantitative Information, Envisioning Information, Visual Explanations, and Beautiful Evidence. I look forward to reading them all, but I think the class added a dimension I would not have gotten from just reading his books.
I won't attempt to summarize the entire course here; if you're curious, check out Tufte's website or one of his books. Besides, I am still assimilating what I learned. The older I get, the more I find that really interesting ideas take a while to digest and incorporate into my thinking. So right now, I just have a pastiche of ideas, and not a unified narrative. But I want to share a few thoughts from the day:
1. A lot of the discussion in Tufte's books and in the class is about how to create a truly great graphical display of data. (He points to Minard's graph showing Napoleon's march into and retreat out of Russia as a particularly effective graphic.) One of his fundamental points is that "the purpose of an information display is to assist thinking about the content." I found myself thinking back to the book Soundings, about Marie Tharp, the amazing maps she produced, and how the intellectual contribution of figuring out how to organize and present data so that its meaning can be understood is often under-appreciated.
2. I have just finished reading Distrust That Particular Flavor, a collection of essays by William Gibson. One theme Gibson returns to in several essays is the idea that the internet is a sort of collective external memory for humankind. That idea popped into my head a couple of times during today's class, but I haven't really sorted out what, exactly, I think the link is. Maybe just that if the internet is our collective memory, it would be good if we had better ways to organize and present the data in it. But maybe something more... I think this is an example of how it takes time for me to assimilate new ideas into my thinking.
3. Tufte did discuss the internet, and specifically Tim Berners-Lee's original proposal for an information "mesh" at CERN, which led eventually to the World Wide Web. Watching that part of the class, I was struck by how many times we have had to rediscover the fundamental unsuitability of hierarchical data models for representing most real world data. The first "discovery" of this of which I am aware was in Codd's work on relational databases. And then Berners-Lee created the web because he was dissatisfied with the limitations of the hierarchical information stores available to him at CERN. But then XML came around, and we all had the argument again. And now NoSQL databases have come around, and we're discussing it yet again. I don't know if this is sad, funny, or telling us something profound about how we humans like to think about data. (And for the record, I've made extensive use of XML, and think NoSQL databases are interesting and have advanced our capabilities in important ways, but there are fundamental limitations in representing data as a hierarchy, and some subset of the IT world seems to periodically forget that.)
I will certainly continue thinking about the course; after all, I have always liked to organize information. It is a unifying theme for a lot of my interests. I may or may not write more about Tufte's ideas in the future, but if you are at all interested in the topic of how to present data and you get a chance to attend his course, definitely take it.