Getting down with data visualization

By Zoë Corbyn

Representing data graphically or on a map can help journalists spot a story or bring a piece to life for readers — and it isn't as difficult to do as you might think.


This is the message from the NASW meeting's DIY Data Visualization Workshop. The workshop, which built on last year's data visualization session, was a chance for participants not only to hear about the latest web-based tools for visualizing data but also to try some out themselves.

The workshop was led by Dianne Finch, manager of multimedia training at the Knight Science Journalism program at MIT, and data journalist David Kroodsma. Participants were first shown how to “scrape” a large data set from the web, convert it into a Microsoft Excel spreadsheet and visualize it using Google Fusion Tables and Google Maps.
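For readers curious what “scraping” can look like in code, here is a minimal sketch in Python, using only the standard library. It is not the tool used in the workshop: the class name `TableScraper`, the function `table_to_csv` and the sample page are all made up for illustration, and a real page would first need to be downloaded and would probably be messier.

```python
import csv
import io
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # row currently being read
        self._cell = None  # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    scraper = TableScraper()
    scraper.feed(html)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(scraper.rows)
    return out.getvalue()


# Made-up sample page standing in for a real web table.
SAMPLE_HTML = """
<table>
  <tr><th>Plant</th><th>Country</th></tr>
  <tr><td>Plant A</td><td>Country X</td></tr>
  <tr><td>Plant B</td><td>Country Y</td></tr>
</table>
"""

print(table_to_csv(SAMPLE_HTML))
```

The resulting CSV text can be saved to a file and opened directly in Excel or a similar spreadsheet program.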

Using two Excel spreadsheets as an example (the world's nuclear power plants by location and by owner), Finch demonstrated how they could be imported into Google Fusion Tables and combined into one data set, with the locations of the plants displayed as points on a globe.
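Combining two tables that share a column is the same idea whatever the tool. The sketch below shows it in Python rather than Google Fusion Tables; the column names (`plant`, `latitude`, `longitude`, `owner`), the helper `merge_on` and the rows themselves are invented for illustration and are not the workshop's data.

```python
import csv
import io

# Hypothetical stand-ins for the two spreadsheets, saved as CSV.
LOCATIONS_CSV = """plant,latitude,longitude
Plant A,10.5,20.25
Plant B,-33.0,151.0
"""

OWNERS_CSV = """plant,owner
Plant A,Utility One
Plant B,Utility Two
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_on(key, left_rows, right_rows):
    """Join two lists of row dicts on a shared column (an inner join)."""
    right_by_key = {row[key]: row for row in right_rows}
    merged = []
    for row in left_rows:
        match = right_by_key.get(row[key])
        if match is not None:
            # Keep the columns of both rows in one combined record.
            merged.append({**row, **match})
    return merged


combined = merge_on("plant", read_rows(LOCATIONS_CSV), read_rows(OWNERS_CSV))
for row in combined:
    print(row)
```

Rows that appear in only one of the two tables are silently dropped by this version, so it is worth comparing row counts before and after the merge.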

Finch has outlined her tips for converting data to Excel and importing it into Google Fusion Tables, but a basic start is to ensure the data are in “really good shape” before performing any kind of visualization. She recommended spot-checking the data to ensure they had been correctly converted into Excel and critically examining any outliers, which could indicate either a data entry error or a whole new story to investigate. It is also important to ensure Google Fusion Tables knows which data are geographical coordinates, she said.
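Some of these checks can be automated. The following Python sketch is one possible approach, not Finch's own method: it flags rows whose coordinates are missing or fall outside the valid ranges (latitude from -90 to 90, longitude from -180 to 180) and picks out numeric outliers using the common interquartile-range rule. The function names, column names and sample values are all assumptions made for illustration.

```python
import statistics


def bad_coordinates(rows):
    """Return rows whose latitude/longitude are missing, non-numeric or out of range."""
    bad = []
    for row in rows:
        try:
            lat = float(row["latitude"])
            lon = float(row["longitude"])
        except (KeyError, TypeError, ValueError):
            bad.append(row)
            continue
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            bad.append(row)
    return bad


def outliers(values, k=1.5):
    """Flag values more than k interquartile ranges beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    return [v for v in values if v < q1 - k * spread or v > q3 + k * spread]


# Made-up rows: Plant B has latitude and longitude swapped, Plant C has a blank.
rows = [
    {"plant": "Plant A", "latitude": "10.5", "longitude": "20.25"},
    {"plant": "Plant B", "latitude": "151.0", "longitude": "-33.0"},
    {"plant": "Plant C", "latitude": "", "longitude": "5.0"},
]
print(bad_coordinates(rows))

# Made-up capacities: the last value looks like a data entry slip.
print(outliers([900, 950, 1000, 1050, 1100, 98000]))
```

A flagged value is only a prompt to look more closely; as the workshop stressed, an outlier may be an error or may be the story.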

Because it can handle a large number of records, Excel is a better choice for big data sets than alternatives such as Google Spreadsheets, she added.

Kroodsma then demonstrated how journalists could hack computer code to produce data visualizations. A basic understanding of how code is structured and a rudimentary knowledge of HTML are enough to get started, he believes. “The most important thing is everything is a Google search away,” he said.

He showed how easy it could be to “steal” basic HTML code used to display interactive markers on a Google map and modify a few lines of it to add new markers or change their positions. There is usually no problem with using code that is already out there, though it is good practice to cite the source, he said.

But, the experts stressed, while displaying data in an infographic can help tell great stories and reveal patterns and trends, a visualization is not always appropriate. Nor does it replace the actual reporting or data analysis: the journalist still needs to tell the story. “I am not selling data visualization as a replacement for reporting,” Finch said.

So is it really this easy? Perhaps not as easy as the experts make it look, but there are plenty of ways to become more proficient, including online tutorials and courses. Recommended sources for more information included the website FlowingData and the book Visualize This by Nathan Yau.

And those needing a little encouragement to explore data visualization should look no further than the graphs and videos of the Swedish statistician and academic Hans Rosling, available at Gapminder. Guaranteed to inspire.

October 17, 2011
