Monday, December 18, 2017

Coming this January: A special "Extended Edition" of my Tableau training

I've been doing Tableau training workshops for a few years now and the feedback I get from participants is often the same: They loved it, but wished there was even more of it.

Many have expressed interest in a second, "Intermediate", training workshop that would cover more advanced Tableau skills. I hope to offer such a workshop at some point in the future.

But in the meantime, I thought I could easily expand my current workshop from three days to four to give participants even more Tableau goodness: to move beyond the basics into some of Tableau's more advanced (and really cool!) features like reshaping data, joining datasets and using calculations in your visualizations.

So next month, for the first time, I'm offering a special Extended Edition of my online Tableau training with four days of training instead of the usual three. For the math nerds, that's 33% more Tableau goodness.

The training will run for four hours on two Wednesdays and Thursdays in a row: Jan. 17-18 & Jan. 24-25, 2018.

You can buy tickets on Eventbrite right here:

If you can't make this workshop but would like to be alerted when the next one is scheduled, just add your name here.

And if you have several people at your organization who need training in Tableau, I'm also available for onsite training.

Tuesday, October 17, 2017

Infographic: How the way we vote makes B.C.'s "urban/rural divide" seem much worse than it really is

I made a little infographic in Tableau Public on B.C.'s "urban/rural divide" and electoral reform. The interactive version is below. You can view the static version here or download it from here.

I'm a bit of a B.C. politics junkie (if you are too, I highly recommend CHNL's "Inside Politics" podcast). And since the 2017 election, there's been a lot of talk about the province's "urban/rural divide". Specifically, how few seats the NDP got outside of Metro Vancouver and Vancouver Island.

There's also been some talk lately about electoral reform, as the new NDP/Green government is planning a referendum on proportional representation next fall.

Which, weirdly, got me thinking of the 1993 Canadian federal election. That's the one where the separatist Bloc Quebecois became the Official Opposition with 54 seats and Kim Campbell's Progressive Conservative government was completely humiliated, winning only 2 (!) seats.

But the really weird thing about the 1993 election was that the Progressive Conservatives actually got more votes (2.2 million) than the BQ (1.8 million) did.

But because the BQ's votes were geographically concentrated in Quebec, while the PCs' were scattered across the country, the BQ got way more seats. The way Canada voted rewarded a party with regional, rather than national, ambitions.
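To see how concentration alone can flip the result, here's a minimal Python sketch. The riding numbers and the party names ("Regional", "National", "Other") are made up for illustration and are not real election results; the only thing the sketch assumes is the first-past-the-post rule that whoever gets the most votes in a riding takes the seat.

```python
from collections import Counter

# Hypothetical toy numbers, not real election results.
# Each riding maps party -> votes. "Regional" piles its votes into
# three ridings; "National" spreads its votes evenly across all six.
ridings = [
    {"Regional": 600, "National": 350, "Other": 50},
    {"Regional": 600, "National": 350, "Other": 50},
    {"Regional": 600, "National": 350, "Other": 50},
    {"Regional": 0, "National": 350, "Other": 650},
    {"Regional": 0, "National": 350, "Other": 650},
    {"Regional": 0, "National": 350, "Other": 650},
]


def tally(ridings):
    """Return (total votes per party, seats per party) under first-past-the-post."""
    votes = Counter()
    seats = Counter()
    for riding in ridings:
        votes.update(riding)  # add this riding's votes to the running totals
        winner = max(riding, key=riding.get)  # most votes takes the seat
        seats[winner] += 1
    return votes, seats


votes, seats = tally(ridings)
for party in votes:
    print(f"{party}: {votes[party]} votes, {seats[party]} seats")
```

In this toy setup the "National" party gets more total votes than the "Regional" one but no seats at all, because it finishes second everywhere.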

I wondered if something similar might have happened in B.C.: Whether the way we elect MLAs might make our regional divisions seem more severe than they really are.

So I did a bit of number crunching using Elections BC data and map files.

The situation isn't anywhere near as severe as the 1993 election, but the First Past The Post electoral system does definitely exaggerate the BC Liberals' popularity in rural B.C. (and, to a lesser extent, the NDP's popularity in urban areas).

The biggest decision I had to make was where to draw the boundaries for the purpose of my analysis. I've seen some pundits refer to how the NDP didn't get many seats outside the "Lower Mainland" and "Southern Vancouver Island". But those are slightly fuzzy concepts. Where does one divide Vancouver Island between north and south? And while many people think of the Lower Mainland as the same thing as Metro Vancouver, it's not.

So, instead, I decided to lump Metro Vancouver (a well-defined regional district) together with all of Vancouver Island. That seemed to fit the electoral patterns of the province best. After all, the NDP is pretty popular up and down the Island and the Liberals are still pretty strong in Fraser Valley communities like Abbotsford and Chilliwack.
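For anyone curious how a grouping choice like this feeds into the numbers, here's a rough Python sketch of the comparison. It is not the actual Tableau workbook logic: the riding names, region labels and vote counts are all hypothetical, and the function name `shares_by_group` is just something I made up for the example. It lumps ridings into two groups and compares each party's share of the votes with its share of the seats.

```python
from collections import defaultdict

# Hypothetical riding-level results: (riding, region, {party: votes}).
# Riding names, region labels and vote counts are made up for illustration.
results = [
    ("Riding A", "Metro Vancouver", {"NDP": 520, "Liberal": 400, "Green": 80}),
    ("Riding B", "Vancouver Island", {"NDP": 450, "Liberal": 300, "Green": 250}),
    ("Riding C", "Fraser Valley", {"NDP": 380, "Liberal": 540, "Green": 80}),
    ("Riding D", "Interior", {"NDP": 420, "Liberal": 480, "Green": 100}),
]

# The grouping choice: which regions get lumped together.
LUMPED = {"Metro Vancouver", "Vancouver Island"}


def shares_by_group(results, lumped=LUMPED):
    """Return {group: {party: (vote_share, seat_share)}} under first-past-the-post."""
    votes = defaultdict(lambda: defaultdict(int))
    seats = defaultdict(lambda: defaultdict(int))
    for _riding, region, tally in results:
        group = "Metro Vancouver + Island" if region in lumped else "Rest of B.C."
        for party, count in tally.items():
            votes[group][party] += count
        # The party with the most votes in the riding wins its one seat.
        seats[group][max(tally, key=tally.get)] += 1
    shares = {}
    for group in votes:
        total_votes = sum(votes[group].values())
        total_seats = sum(seats[group].values())
        shares[group] = {
            party: (votes[group][party] / total_votes,
                    seats[group][party] / total_seats)
            for party in votes[group]
        }
    return shares


for group, parties in shares_by_group(results).items():
    for party, (vote_share, seat_share) in parties.items():
        print(f"{group} | {party}: {vote_share:.0%} of votes, {seat_share:.0%} of seats")
```

Changing the `LUMPED` set is the code equivalent of choosing different regions: the riding results stay the same, but the gap between vote share and seat share in each group can shift.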

If you don't agree with my choices, though, you're more than welcome to download the Tableau workbook yourself and choose different regions.

Thursday, October 12, 2017

My Missing Maps from The Vancouver Sun

I haven't been a data journalist at The Vancouver Sun for more than two years now. But still, every couple months or so, I get an email from someone saying they're having trouble finding an old map or chart from my Sun days and wondering if I could help them track it down.

That's because dozens of maps and charts that I created when I was at The Sun have disappeared from the site since I left.

My articles and blog posts are still there. But in many cases, when they refer to an interactive chart or map, there either isn't anything there at all, or there's a weird error message saying a file can't be found.

The problem of old data journalism gradually disappearing from the web is widespread.

In this case, the blame lies half with The Vancouver Sun and half with me.

First, The Sun's share of the blame. Or, more accurately, Postmedia's.

Every so often, Postmedia, the company that owns The Sun and most other major papers in Canada, updates its websites, often adopting a whole new platform (like WordPress). This never goes smoothly.

I still remember the first time The Sun transitioned to a new blogging platform. We were assured all the old posts would show up in the new system. That was technically true, but in the process many of the URLs changed, so none of the posts could be found in their old spot (and if someone else had linked to a post in the past, readers now got a 404 Error).

The latest update to The Sun's website seems to have kept all the blog posts in the same spot. But it stripped out all of the "embed codes" (the stuff that makes sure that maps and charts appear in the blog post properly). The result is that all of the blog posts I wrote that had maps or charts in them no longer do. (For some reason, embed codes in articles seem to have survived in most cases.)

Luckily, in most cases, this problem is easy to fix.

Above most of those embed codes, I put a little note that said something along the lines of: "For the mobile version, click here".

The "mobile versions" of all those maps and charts are actually not mobile versions at all, but rather the exact same charts, just in a different spot. This was because of issues we had with embed codes not working on our mobile site. The result, though, is that if you click on the "mobile version" of a chart, you'll be taken to a separate website where you can usually find the missing chart (but not the maps; more on that in a moment).

OK, now for the part that's my fault.

Without getting too much into the technical weeds, many of the maps I made while at The Sun were built using Google Fusion Tables. And in order to add extra features to those maps, like colour legends and search boxes, I needed to do a bit of basic HTML. That raised the problem of where to host those HTML files.

For a while, I hosted them on Postmedia's servers. But uploading files to those servers was a huge pain, requiring me to fill in IT request forms and other nonsense (not great on deadline) and at least once someone in IT, not knowing what the files were for, decided to go ahead and delete them.

What I should have done at that point is gotten some cheap hosting space on Amazon Web Services: something reliable that I knew would be there for the long term.

Instead, I used a slightly hacky technique to host the files for free on Google Drive. I actually had data journalism colleagues at other papers warn me that this was a dumb idea. "What happens if Google stops allowing free HTML hosting?" they'd say.

Which is exactly what Google did last year.

And the result is that dozens of maps I created for The Sun in Fusion Tables are no longer accessible online.

I actually alerted The Sun to this problem last year and, to its credit, the paper was working with me to figure out a way to get at least some of the missing content onto a Postmedia server and back on The Sun's website. Unfortunately, the two folks I was working with most closely to fix the problem have since left the paper, too, and so things went into limbo.

The problem is that restoring all these broken links and embed codes is a tedious, time-consuming job and most of it involves old stories that most people never see.

In the end, I figured I'd see if there was a way I could revive the content myself without bugging anyone else at the paper.

In preparing this post, I thought the best thing to do was to upload the old HTML files to GitHub Pages (another free, but arguably more trustworthy hosting solution). But when I tried that I got some weird JavaScript errors and the maps didn't load properly.

That said, for reasons I don't fully understand, the HTML files still seem to work fine when loaded up on your own computer. So what I've done is take the HTML files for several of the maps that are no longer on The Vancouver Sun site and put them in a single ZIP file, which you can download here.

Just open the ZIP file on your own computer and double click on any of the HTML files. The map should then open in your web browser (in most cases you'll still need an Internet connection as the map data is being pulled from a Fusion Table source online).

Most of the maps are just single HTML files and most of the HTML file names are pretty obvious so you should be able to find what you're looking for.

But there are a couple maps that are a bit more complex.

The Unsolved Homicide Map which accompanied a Sun series on the topic requires you to open the UnsolvedHomicide folder and then click on the index.html file. One thing that doesn't work with that map anymore is the photos of the victims (as they, too, were loaded from a Google Drive folder).

You need to follow the same process for one of the Auto Crime maps that accompanied a Sun series I wrote. Open the AutoCrime folder and then click on index.html. The other auto crime maps can be launched from the main directory.
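If you'd rather not dig through the archive by hand, here's a small Python sketch that lists the pages worth opening. It assumes the layout described above (single HTML files at the top level of the ZIP, plus an index.html inside folders like UnsolvedHomicide and AutoCrime). The function name `find_map_pages` is my own invention for this example, and you'd need to pass in the path to wherever you saved the ZIP.

```python
import zipfile
from pathlib import PurePosixPath


def find_map_pages(zip_path):
    """List top-level HTML files in the archive, plus any index.html one folder down.

    Assumes the maps sit at the top level of the ZIP rather than inside
    a wrapper folder; adjust the depth checks if your copy is laid out differently.
    """
    pages = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            path = PurePosixPath(name)  # ZIP entries always use forward slashes
            if path.suffix.lower() != ".html":
                continue
            depth = len(path.parts)
            if depth == 1 or (depth == 2 and path.name == "index.html"):
                pages.append(name)
    return sorted(pages)
```

Calling `find_map_pages` with the path to your downloaded ZIP returns a sorted list of entry names; once the archive is extracted, those are the files to double click.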

I haven't had time to grab everything I ever made at The Sun so I prioritized content that I know got a lot of traffic when it was first posted and which I think may still be interesting to folks today. My main method was looking through my old blog posts and digging into those that seemed to have interesting content.

To make things a bit easier, I've included below the titles of all the blog posts and articles for which I've added maps to the ZIP file, along with links to the original posts:

For what it's worth, I've since moved away from Google Fusion Tables for mapping and I don't bother teaching it to my students anymore.

Instead, if you've got data to map, I highly recommend Tableau Public, which now has robust support for spatial files like KML and SHP.

And if you're still having trouble finding something I did while at The Sun, I recommend you check out my Tableau Public profile. It contains more than 120 interactive charts I created both during my time at The Vancouver Sun and since I left. My favourites are in the "My Tableau Portfolio" workbook (the first link on the page).

Saturday, September 2, 2017

The case against tweetstorms

Credit: Wikimedia Commons

Proposed: Tweetstorms are a bad way to communicate complex ideas. Blog posts are much better and easier to discover in future.

As an example, Kevin Milligan is doing great analysis on tax reforms but reading his feed means some arguments appear backwards or disjointed.

On the other hand, only nerds like me still use RSS, so perhaps tweetstorms are the best way to reach a large audience?

I'm also curious: Why have tweetstorms become more popular for writers than blog posts? Even when arguments are really long? Are tweetstorms easier to dash off on a smartphone? Less work to write? Is there less expectation for writing to be polished? Are they more appropriate for "ideas in progress"?

Are tweetstorms more likely to go viral than blog posts? I doubt it. Lots of links to articles go viral on Twitter.

My biggest concern about tweetstorms is they're not easily discoverable. For example, someone Googling CCPC reforms won't find Milligan's tweets.

Twitter is also less popular by far than Facebook. Tweetstorms bypass Facebook's audience (blog posts, meanwhile, can be shared on Facebook and Twitter).

That said, if tweetstorms are a way to work out ideas for a later blog post/article, I'm less concerned.

My bigger worry is when people with great ideas share them only in a tweetstorm and never crystallize their ideas in an article or blog post. I think that both limits the audience for their ideas and makes those ideas harder to digest.

NOTE: This blog post was adapted from a tweetstorm about tweetstorms. Given the topic, I thought it was appropriate to adapt it into a blog post as well. The text above is almost identical except for cleaning up the language a bit and adding a conclusion that the tweetstorm lacked. This is also an experiment with writing shorter, less polished, blog posts as I think one reason some writers default to tweetstorms is because of the expectation they place on themselves when writing blog posts rather than tweets: both in terms of length and quality. Blogs should be a safe place to dash off rough ideas.

Kevin Milligan, whose tweetstorms on tax policy inspired my tweetstorm, wrote a thoughtful tweetstorm of his own on why sometimes he tweets rather than writing longer pieces.

Tuesday, July 18, 2017

Hands-on Tableau Training: Now available online!

I'm pleased to announce that after several years running in-person Tableau training workshops in Vancouver, I'm now offering the same acclaimed hands-on training online!

My first online training workshop is this September over three Thursdays: Sept. 14, 21 and 28. You can buy tickets and get more information here or by clicking the button below:

Eventbrite - Online Tableau Training: Telling stories with data

This is a great opportunity for those outside Vancouver to learn Tableau. And for those in Vancouver, spreading the training out over three weeks should make it easier to fit into your busy schedule and to absorb the information.

While this is my first online workshop available to the public, it's not my first experience teaching Tableau online. I've done private online training in the past and am currently teaching an online Data Storytelling and Visualization course at the University of Florida.

Here are some testimonials from people who've attended my earlier training sessions.

If you can't make this workshop but would like to be alerted when the next one is scheduled, just add your name here.

If you have several people at your organization who need training in Tableau, I'm also available for onsite training.

Thursday, June 22, 2017

Beyond the Basics: The Big Book of Dashboards

On the very first page of The Big Book of Dashboards, the authors go out of their way to give their readers a warning: "This book is not about the fundamentals of data visualization."

I agree. If you're brand new to data visualization, The Big Book of Dashboards is probably not the book for you.

Instead you should probably pick up Cole Nussbaumer Knaflic's Storytelling with Data or Alberto Cairo's The Functional Art. Two titles, incidentally, that the authors of the Big Book themselves list in a section where they offer suggestions for great books on data viz basics (they also include titles by Stephen Few and Colin Ware).

But let's say you've already read one of those books on data viz fundamentals. Let's say you already know that pie charts are dangerous and bar charts should start at zero. You've gotten the memo on how colour should be used sparingly and chart titles should be descriptive. What then?

Well, then you really owe it to yourself to pick up a copy of the Big Book of Dashboards.

I've read a number of books on data visualization and the Big Book is one of the best I've come across in that sweet spot between books for beginners and books for experts (or academics).

In particular, the book's focus on Dashboards means it has a lot of helpful advice about a topic many books for beginners largely ignore: interactivity.

A lot of the fundamental principles of data visualization are focused on how to create static charts.

But in the real world, people are increasingly being asked to make interactive Dashboards for their organization, which requires careful thought about things like how filters should behave, where dropdowns should be placed and how to make sure that your users understand how everything works.

The Dashboards featured in The Big Book of Dashboards are almost all interactive, and there is a constant discussion throughout the book on how to anticipate your user's needs through careful use of interactivity.

The Big Book is broken into three parts.

Part 1 is a primer on data visualization basics. This is a pretty good refresher on data visualization principles but, like the authors, I'd suggest that if this is what you really need, you read another book first.

Part 2, by far the largest section of the book, is a series of nearly 30 chapters: each one focused on a different Dashboard that solves a particular real-world problem.

Part 3 is a series of essays that cover interesting topics like how to personalize your Dashboards and different ways to visualize time.

I found the most enjoyable way to read the book was to read Parts 1 and 3 all the way through first, and then dip into the Dashboards in Part 2 a little bit at a time.

Each chapter in Part 2 follows more-or-less the same structure: An image of a Dashboard, a brief description of the real-world problem the Dashboard is trying to solve, a discussion of how people use the Dashboard and then a discussion of the Dashboard's strengths and weaknesses (including, in some cases, suggestions for alternative ways of visualizing the same data).

Not surprisingly, I found I was most interested in those chapters that featured Dashboards on topics that interested me or projects similar to those I've tackled myself as a consultant. I spent a lot of time poring over the chapter on how to visualize student satisfaction surveys, while largely skimming the chapters on sports statistics. That said, as the authors point out, solutions to one problem can often be applied to another (for example, product ratings can be visualized in a similar way to teacher ratings).

I suspect other readers will find the same thing I did: You'll be drawn first to the chapters most applicable to your day-to-day work, but will be surprised how, later on, you'll be inspired by examples of data visualization solutions from other subject areas. The nice thing about the way the book is structured is you can make your way through the scenario chapters in pretty much any order you like.

It's pretty easy to flip through the chapters to find the ones that interest you the most. But if I had a suggestion for the 2nd edition, I think a "visual Table of Contents" — showing thumbnail sketches of each Dashboard along with the chapter title — would make such skimming even easier.

While one of the book's authors (Andy Cotgreave) works at Tableau and the other two (Steve Wexler and Jeffrey Shaffer) are Tableau Zen Masters, the book is platform agnostic: Tableau is barely mentioned.

And, yet, I found one of the other major strengths of this book is that pretty much every Dashboard featured can be built using Tableau. Which means the solutions you find in The Big Book of Dashboards are ones you can put to use almost right away in your day-to-day work.

I think that could also make the Big Book a great resource for data visualization practitioners to share Dashboard ideas with others in their organization.

I suspect many data visualization practitioners live in fear of their boss coming to them one day and asking them to recreate some New York Times masterpiece like the 3D yield curve or floating map of Antarctica. Visualizations that, frankly, can't be built without D3 and some serious coding chops.

In contrast, any moderately skilled Tableau user could hand their boss a chapter or two from The Big Book of Dashboards as an example of what's possible, confident that if the boss said, "I want something like that!", they could build it. (A job made considerably easier by the fact the authors have posted Tableau workbook files for many of the featured Dashboards online.)

If you make interactive Dashboards in your day-to-day work, or often have to explain what a Dashboard is to others in your organization, I highly recommend The Big Book of Dashboards.

Disclosure: I know all three authors of The Big Book of Dashboards and, even worse, I like and respect all three of them. I also got a brief shout-out in the book for my Tapestry talk on personalizing data viz. And I got a free copy.

Wednesday, April 26, 2017

I'm on the PolicyViz podcast this week!

The PolicyViz podcast, hosted by Jon Schwabish, is one of my favourite podcasts: illuminating 30-minute conversations with various people in the data visualization field.

So it was a particular thrill when I loaded it up in Overcast this morning and saw my own name in the episode list.

Jon and I had a great chat about teaching data visualization and data storytelling. You can find the episode in your favourite podcast app or right here.

Also, at the risk of logrolling, I highly recommend you make the PolicyViz podcast part of your regular podcast lineup. Jon's a great interviewer and the episodes are always concise and focused. If this data visualization thing doesn't work out, Jon could switch careers and go into radio.