Libraries, the universe and everything

Webinar: Communicating through infographics
August 28, 2014, 15:38
Filed under: CPD

Back on July 10th I listened to a very informative webinar, Communicating through infographics. Hosted by SLA’s DC Chapter and presented by Duke University’s Dr Christa Kelleher, the session provided an overview of recommendations and potential pitfalls for creating infographics. Looking back over my notes all these weeks later they seem rather stream-of-consciousness, so I’m almost reluctant to write them up in case I make the webinar sound a lot less coherent than it actually was! However, I’m moving house soon and I don’t want to risk losing them (they make sense to me!), so here goes…

Software recommendations:
for data analysis = Matlab, R, Python
for spatial analysis = ArcGIS, R Mapping
for fine tuning = Illustrator, Inkscape

How to create effective visualisations:

1. Choose an effective plot for your data & message (Stephen Few), eg. line graph for data, heat map for positional info.
Dot plots are cleaner than bar charts for lots of different entities in the same graphic.
Pie charts are considered ineffective for 4 or more entities – consider a % table instead.
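
To make the "% table instead of a pie chart" idea concrete, here is a minimal Python sketch using only the standard library. The function name `percent_table` and the loan figures are my own made-up illustration, not something shown in the webinar.

```python
def percent_table(counts):
    """Return text lines showing each entity's share of the total, largest first."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("counts must sum to a non-zero total")
    width = max(len(name) for name in counts)
    rows = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [f"{name:<{width}}  {100 * value / total:5.1f}%" for name, value in rows]


# Hypothetical data: loans by category (invented for this example)
loans = {"Fiction": 120, "Non-fiction": 60, "Journals": 15, "DVDs": 5}
for line in percent_table(loans):
    print(line)
```

Sorting by share does some of the work a pie chart is supposed to do: the biggest slices are obvious at a glance, and the small ones are still readable.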

2. Remove ‘chart junk’ (Tufte, the “grandfather of visualisation”). Stay away from redundant, superfluous or non-data information, and avoid 3D graphics (except in a few very specific cases). Take out excess lines and do all you can to make things cleaner. Don’t add unnecessary colour; use it sparingly to highlight key details.

3. Display the same number of dimensions as the dataset – this makes it easy to identify attributes that differ from the norm.

4. Consider the use of colour (or greyscale):
for sequential data keep to the same colour sequence
for diverging data depict the mean in a middle colour and have the others diverge outwards
for categorical data use random colours which are easily distinguished from one another (there’s also an R plugin for this)
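
As a rough sketch of the diverging-data advice, the function below maps a value to a colour that is neutral at the middle (for instance the mean) and diverges towards two different hues at the extremes. The name `diverging_colour` and the particular blue, white and red RGB values are assumptions I have picked for illustration; any pair of clearly different hues would do.

```python
def diverging_colour(value, low, mid, high):
    """Map value to an (r, g, b) tuple: blue at low, white at mid, red at high."""
    if not low < mid < high:
        raise ValueError("expected low < mid < high")
    # Illustrative end-point colours (an assumption, not from the webinar)
    blue, white, red = (33, 102, 172), (255, 255, 255), (178, 24, 43)
    value = min(max(value, low), high)  # clamp to the data range
    if value <= mid:
        start, end = blue, white
        t = (value - low) / (mid - low)
    else:
        start, end = white, red
        t = (value - mid) / (high - mid)
    # Linear interpolation between the two end-point colours
    return tuple(round(s + t * (e - s)) for s, e in zip(start, end))


values = [2, 4, 6, 8, 20]
mean = sum(values) / len(values)
for v in values:
    print(v, diverging_colour(v, min(values), mean, max(values)))
```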

5. Maintain axes when comparing subplots – may need to be manually changed (eg. don’t have one set of data grouped in 10s and the other in 100s, use same scale for both).
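
One way to keep axes consistent across subplots is to work out a single set of limits from all the data first and then apply it to every panel. This is a minimal sketch; `shared_limits` and the `pad` parameter are hypothetical names of my own.

```python
def shared_limits(*series, pad=0.05):
    """Return one (low, high) pair covering every series, with a little padding."""
    values = [v for s in series for v in s]
    if not values:
        raise ValueError("need at least one data point")
    low, high = min(values), max(values)
    margin = (high - low) * pad
    return low - margin, high + margin


# Hypothetical data for two subplots with very different ranges
branch_a = [12, 18, 25, 31]
branch_b = [140, 90, 210, 175]
print(shared_limits(branch_a, branch_b))
```

In matplotlib, for instance, the same pair could be passed to each axes' `set_ylim`, or the subplots could be created with `sharey=True` so the library keeps them aligned.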

What people look for in a visualisation, and good practice for different formats:

Bar charts = magnitude
Line plots = change
Dot plots = correlation
Frequency plots = distribution

Bar charts
Reference to zero (y axis) (Nathan Yau)
Rotate if more than 8-10 categories (so they run vertically down the left hand side)
Include a legend
Be aware of scaling (Gary Klass)
Stacked bar charts make sense if you’re comparing whatever’s on the bottom BUT make sure both sets of data are using the same scale (switch to % if numbers are unworkable)

Line charts
Consider the aspect ratio (William Cleveland) – ie. the ratio of width to height; an average line slope of around 45° is optimal
But… a smaller aspect ratio allows you to show long-term trends, whilst a higher ratio highlights short-term variations
Use a log scale for y axis if comparing 2 significantly different data sets, or use a different y axis for each (eg. one on the left side & the other on the right)
Horizon graphs can be good for compressing data into a small space
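
Cleveland's aspect ratio advice is often described as "banking to 45 degrees". The sketch below uses one simple heuristic for it, the median absolute slope: pick the width-to-height ratio at which the median line segment appears at 45 degrees on the page. This is my own illustration of that heuristic, not necessarily the exact method covered in the webinar, and `banked_aspect_ratio` is a made-up name.

```python
from statistics import median


def banked_aspect_ratio(xs, ys):
    """Width-to-height ratio at which the median absolute segment slope is 45 degrees."""
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    points = list(zip(xs, ys))
    # Absolute slope of each consecutive line segment, in data units
    slopes = [abs((y1 - y0) / (x1 - x0))
              for (x0, y0), (x1, y1) in zip(points, points[1:])
              if x1 != x0]
    if not slopes or x_range == 0 or y_range == 0:
        raise ValueError("need at least two distinct x values and a non-flat series")
    mid_slope = median(slopes)
    if mid_slope == 0:
        raise ValueError("median slope is zero, so no finite aspect ratio exists")
    # On the page a slope s appears as s * (height / width) * (x_range / y_range);
    # setting that to 1 for the median slope gives the ratio below.
    return mid_slope * x_range / y_range


# Hypothetical zig-zag series: every segment rises or falls by 1 per step
print(banked_aspect_ratio([0, 1, 2, 3, 4], [0, 1, 0, 1, 0]))
```

A series with steep short-term wiggles gives a large ratio (a wide, shallow plot), which fits the point above about aspect ratio changing what the eye picks up.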

Scatter plots
Use density, make points transparent
Matrices highlight relationships across data, colour can be used to plot 3rd dimension

Cycle plots
Combination of points and line graph – eg. monthly changes over a period of several years – used to show values and trends through time.
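
The data behind a cycle plot is really just a regrouping: instead of one long time series, you get one short series per month (its values across the years) plus a per-month average to draw as a reference line. A minimal sketch, assuming records arrive as `(year, month, value)` tuples; the function name and the figures are invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def cycle_plot_series(observations):
    """Group (year, month, value) records by month, ordered by year within each month."""
    by_month = defaultdict(list)
    for year, month, value in sorted(observations):
        by_month[month].append((year, value))
    return {
        month: {"points": points, "mean": mean(v for _, v in points)}
        for month, points in sorted(by_month.items())
    }


# Hypothetical monthly visitor counts over two years (January and February only)
records = [(2013, 1, 10), (2012, 1, 6), (2012, 2, 3), (2013, 2, 5)]
for month, series in cycle_plot_series(records).items():
    print(month, series["points"], series["mean"])
```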

Box-and-whisker plots
Allow a lot of info to be displayed in a small space, as they show
– overall distribution
– outliers
– skew (difference between quartiles and median)
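
To show what a box-and-whisker plot actually summarises, here is a small sketch that computes the numbers behind one. It assumes the common convention of flagging points more than 1.5 times the interquartile range beyond the quartiles as outliers; that convention, the function name `box_plot_summary` and the sample data are my additions rather than anything stated in the webinar.

```python
from statistics import quantiles


def box_plot_summary(data):
    """Return quartiles, whisker ends and outliers using the 1.5 x IQR convention."""
    values = sorted(data)
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers extend to the most extreme points that are still inside the fences
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in values if v < low_fence or v > high_fence],
    }


# Hypothetical data with one obvious outlier
print(box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```

Skew shows up as a median that sits closer to one quartile than the other.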

For frequency plots (histograms), consider the number of bins, as the result is very sensitive to both the number of bins and the density of the data.
Use kernel density estimators to achieve a smooth plot.
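
The points about bins and kernel density estimators can be sketched in a few lines. Below, `sturges_bins` applies Sturges' rule as one common starting point for the number of bins, and `gaussian_kde` builds a smooth density from Gaussian kernels with a rule-of-thumb bandwidth. Both choices (and both function names) are my assumptions for illustration; the webinar did not prescribe a particular rule.

```python
import math
from statistics import stdev


def sturges_bins(n):
    """Suggested number of histogram bins for n observations (Sturges' rule)."""
    return math.ceil(math.log2(n)) + 1


def gaussian_kde(data, bandwidth=None):
    """Return a function estimating the density of data with Gaussian kernels."""
    n = len(data)
    if bandwidth is None:
        # Rule-of-thumb bandwidth for roughly normal data (an assumption)
        bandwidth = 1.06 * stdev(data) * n ** (-1 / 5)
    norm = 1 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data)

    return density


# Hypothetical sample
sample = [1.0, 2.0, 2.5, 3.0, 4.0]
estimate = gaussian_kde(sample)
print(sturges_bins(len(sample)), estimate(2.5))
```

A smaller bandwidth behaves like using more bins (spikier, noisier), and a larger one like using fewer (smoother, but it can hide real structure).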

Sankey diagrams
A flow diagram where arrow width is proportional to the quantity of material flow

Circos plots
Originally for genomics data

Graphics in multiple dimensions, more for interaction than presentation

Other recommended resources (some from Christa, some by webinar participants):

Tableau: free public version (Duke library has a LibGuide)
R: = help
Piktochart
Gapminder Desktop
Google Public Data
StatSilk
Edward Tufte
Hans Rosling (TED talk)
Stephen Few
Nathan Yau

