Thursday, June 7, 2018

How much evidence do we need for a data visualization "rule"?

In a separate post, I laid out some of my arguments for why I think most line charts should start at zero. I posted some of my initial thoughts on that topic on Twitter, which generated some really thoughtful replies.

One of them, from Steve Haroz, noted that he knew of no evidence that people read non-zero-baseline bar charts any differently than non-zero-baseline line charts. He added that we should be careful in talking about data visualization "rules" when our evidence for them is weak or nonexistent.

This led to a quite spirited discussion about whether data-visualization "guidelines" or "rules of thumb" that don't have any empirical research to back them up can still be valuable, or if we should stick primarily to those things that we have solid evidence for.

Speaking personally, I didn't fully appreciate the gaps in data visualization research until I watched Robert Kosara's excellent talk at the University of Washington, "How Do We Know That?"

The talk is based on Kosara's paper, Empire of Sand, which I now assign to my students at the University of Florida.

As Kosara points out, many of the things we think we know about data visualization have little empirical evidence to back them up. And other well-accepted "rules" may actually be wrong (for example, "chartjunk" may not be so bad after all).

Some rules are based on nothing more than the strong opinions of influential early writers in the field (like Edward Tufte and Jacques Bertin) and have not actually been subject to peer-reviewed research.

So where does that leave us as data visualization practitioners and teachers?

It would seem obvious that we shouldn't teach "rules" that we know to be wrong. But what about the many areas for which there is little or no empirical evidence at all? Can theory replace research in some cases? Is a common practice worth teaching our students even if we don't know it to be true?

Below, I've tried to collect some of my own thoughts on the matter as well as those of others who took part in the Twitter discussion.

First, though, a big caveat about my own tweets: While I teach at a university and have (strong) opinions on how to teach data visualization, I'm an "instructor" not a "professor". I don't have a PhD and I'm not engaged in academic research myself.

Let's get to the tweets!

I was curious about the project Enrico mentioned, but Chen didn't appear to be on Twitter, so I sent him an email.

Chen sent me a very nice email back directing me to the Visualization Guidelines Repository.

The repository is still a work in progress, but an example on "chartjunk" suggests it could eventually be similar to what Ben Jones was suggesting: Links to where guidelines come from and studies that support or refute them.

There is also a related project, VisGuides, which is a platform for discussing visualization guidelines. (VisGuides was presented at EuroVis this week.)

Chen told me the two projects were set up by four visualization scientists: Alexandra Diehl, Alfie Abdul-Rahman, Menna El-Assady and Benjamin Bach.

It will be interesting to see how the Repository and VisGuides develop.

But I wonder if there isn't also a space for something more like the University of Chicago economists survey, but for data visualization: A place where people can see at a glance what leading practitioners in the field think about different guidelines.

I think this would provide useful information about which guidelines are universally accepted (e.g. "95% of practitioners think bar charts should start at zero") and which are more contested (e.g. "30% of practitioners think line charts should usually start at zero").

With sufficient buy-in, it could also provide a one-stop shop for people to check in with their favourite thinkers in the field when struggling with a chart decision. ("I want to make a pie chart with eight slices. What would Alberto Cairo think about that?" "Would Cole Nussbaumer Knaflic approve of me truncating this axis?")

If you've got thoughts on this topic, please post a comment below or hit me up on Twitter. Because of spam, my comments are moderated, so don't be alarmed if yours doesn't show up right away. It will appear within a few hours.
