As you can see
here, this graph, representing ten million points of data, plotted
logarithmically against seven million other points of data in a
counter-clockwise fashion, with a smoothing value of 3 and scaled by a function
of the distance from my elbow to my fingertips, designed by a particularly
gifted graphic artist at Bewilderment Inc., CLEARLY shows that eighteenth
century cattle had a strong preference for south facing barns.
Can't argue
with that. But I will anyway.
Over the past
two years I've been noticing a rise in what I like to call "shock and
awe" graphs in digital humanities, designed to overwhelm their audience
and perhaps even to evoke doubt in one’s own abilities to compete in the same
scholarly conversation. These graphics are both incredibly complex
representations of data, and incredibly beautiful. If we got rid of the axes,
we might even be tempted to hang them as art. A colleague of mine used the term
"poster graph" to describe these works. The idea behind that name was
that the graph looked nice enough to blow up and put on a poster. Implicitly,
this colleague suggested that represented in this manner, the data was likely
to impress and captivate. Great. But are complex graphs good for scholarship?
Scholarship shared between academics is not inherently meant to impress. It is meant for making discoveries. And so, while complex graphs are beautiful, they have a time and place.
Exploring data
is certainly one of those times. Complex representations of data are sometimes the
only way we can make some types of discoveries. Our eyes are, after all, great
at noticing patterns. In a recent example (of which I was quite openly
critical), trends in a set of data only became evident when it was plotted
logarithmically. This graph then led the researchers on the trail of some interesting
discoveries that would not have otherwise been possible. I have no issue with this.
I have no issue with quantitative analysis.
I also have no
issue with attempting to engage an audience who might not otherwise be
interested in the research. I'm always thrilled to see historians,
archaeologists, and mathematicians discussing their work on TV or on radio.
That's fantastic. And in those cases, a "shock and awe" graph is
probably appropriate. After all, we have to sell what we do if we hope to
compete with the Hollywood pros and the increasingly popular data journalists
in major news outlets for the scant attention of the masses.
But I do have
issue with shock and awe graphs sneaking into work intended for academic
colleagues – particularly in peer reviewed work, and particularly when the
complexity of the graph is not absolutely necessary to the conveyance of information.
I do have issue with the fact that many very intelligent people who are
responsible for evaluating the truth of these claims do not have the skills to
interrogate these complex visualizations. These graphs have seemingly come out
of nowhere for many who have spent their entire careers working almost
exclusively with text and perhaps only simple numbers. For interdisciplinary
work, there is a good chance that the first time many researchers will come
across a "shock and awe" graph is when they have been handed a paper
to review for a journal.
Understandably
it can be embarrassing to realise you do not have the skills to critically
assess the work in a field to which you have devoted your life. By handing
someone a graph you know they likely cannot appraise, you are deliberately
playing towards their sense of insecurity. It is easy to say the problem is
numerical literacy but we must remember these are extraordinarily complex
visualizations. It takes a lot of skill and a lot of learning before someone
can create these graphs. It takes a comparable amount of time to learn how best
to interpret them. And not everyone has had the luxury of focusing his or her
time on that skill. In some cases surely the reviewer passes the graph through
the filters unchecked. It’s less embarrassing that way.
I don’t believe
this is just a matter of numerical literacy levels. I’d go so far as to suggest
that these graphs are often intentionally overwhelming and unnecessary for
making the argument. But this is not my greatest worry. From the perspective of
good scholarship a shock and awe graph is impossible to test. And therein lies
the biggest problem. You plot tens of thousands of points on a complex
multi-coloured, multi-dimensional scatter plot. The reviewer gets a static
image. How do you test that exactly? How do you know there hasn’t been a
dramatic mistake in the way the information was put on the graph? How do you
know the data are even real?
You can't. You
don’t. And I believe too often their creators know this and hope that in an
effort not to expose one's own weaknesses, a reviewer will overlook parts he or
she does not fully understand. Shock and awe becomes one way to increase the
chances you will get a publication for your CV. I suppose we can’t blame people
for looking out for their own career development. But, one day someone will
take advantage of this knowledge and will cheat. That is, if they have not
already.
Cheating in
academia is not altogether unheard of. The humanities have long battled with
plagiarism. Famously, Saif Gaddafi was accused of having parts of his thesis
ghost-written while studying at the London School of Economics, leading to the
resignation of LSE's director Howard Davies shortly thereafter. Plagiarism is a war that may
always persist. But with the introduction of digital humanities in
collaborative efforts with more traditional humanist fields, we now have to
watch out for the kind of faked results that researchers like Jatinder Ahluwalia have been
accused of producing.
Ahluwalia
recently made headlines after allegedly faking research results during his PhD
work at Imperial College London and later during a Post-Doc at University
College London. The investigation into Ahluwalia's work led to the embarrassing
retractions of papers in the Journal of Neurochemistry and Nature, and to a parting of ways between Ahluwalia and his
employer, the University of East London.
We now need
safeguards to protect the integrity of the good work out there, and to allow
people to critically evaluate our results. One way to do that is to be
hyper-critical of the very graphs we love to look at so much. Do they convey
the data in the most straightforward way possible? Are they produced in a way
that allows the data to speak for themselves, or are colour, size, shape,
scale, orientation, or any other number of variables manipulated in a way that
seeks to draw the reader to a conclusion that may not be the correct or only interpretation?
Even something as simple as the order in which data points are put on a scatter
plot can drastically change how one interprets the results. Points that are put
on first may be covered up by later points, thus hiding or highlighting a trend
that may not exist.
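To make the drawing-order point concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the two groups "A" and "B", the 20 by 20 grid of positions, and the helper name visible_counts. It does not reproduce any real graph. It simply imitates what a plotting tool does when markers overlap: whichever point is painted last at a position is the one the reader sees.

```python
import random

def visible_counts(points):
    """Paint points in the given order and count what stays visible.

    Each point is (x, y, group). A later point at the same position
    overwrites an earlier one, as an opaque marker would on a scatter plot.
    """
    canvas = {}
    for x, y, group in points:
        canvas[(x, y)] = group  # the last point drawn at a position wins
    counts = {}
    for group in canvas.values():
        counts[group] = counts.get(group, 0) + 1
    return counts

rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Two hypothetical groups scattered over the same region of the plot.
group_a = [(rng.randrange(20), rng.randrange(20), "A") for _ in range(2000)]
group_b = [(rng.randrange(20), rng.randrange(20), "B") for _ in range(2000)]

a_then_b = visible_counts(group_a + group_b)  # B painted on top of A
b_then_a = visible_counts(group_b + group_a)  # A painted on top of B

mixed = group_a + group_b
rng.shuffle(mixed)  # one possible mitigation: randomise the drawing order
shuffled = visible_counts(mixed)

print("A first, then B:", a_then_b)
print("B first, then A:", b_then_a)
print("Shuffled order: ", shuffled)
```

The data are identical in all three runs; only the order of painting changes. A reviewer handed one static image has no way of knowing which of the three they are looking at, which is one more reason to ask for the underlying data and the steps used to draw it.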
There will
always be people who distrust numbers or who scoff at digital humanists as a
bunch of bean counters. That can be frustrating, but it is also invigorating to
know that there are those out there who will be sceptical of what we produce.
We need this scepticism and we need to meet it head on if our work is to be
accepted. We can either work towards quelling this type of scepticism by
ensuring our graphs present necessary information as transparently as possible,
or we can attempt to silence it through a policy of shock and awe, with
ever more complex representations of increasingly intricate datasets.
We'll likely
make more friends if we take the former approach.
So before you
publish a visualization, please take a moment and step back. As in the cult
classic, Office Space, ask yourself:
Is this Good for the Company?
Is this Good
for Scholarship?
Or am I just trying to overwhelm my reviewers and my audience?
photo credit: “Swirling a Mystery” by garlandcannon
10 comments:
Nice post. I continue to feel like visualizations need to justify their existence as either tools for communicating known things or tools for making discoveries.
The examples you are referring to seem to fall into the latter, which means they need to come with significant performance notes. That is, they need a good bit of explication, and we need to be told how, if this had turned out differently, they might have been wrong.
This is not just a problem for the humanities. I once was in a talk with an educational psychologist who told the audience that to visualize this you would need to think in 7-dimensional space. The implication there was that we should take their word for it.
I feel like the production of a visualization is always, effectively, the production of a new artifact that needs to be given the same kind of scrutiny that some other artifact would be given.
I sometimes draw an analogy between visual presentation and stylish writing—neither is necessary for academic communication per se, but it's still good that historians place some premium on stylishness for various reasons (accessibility, because it makes everyone's work more enjoyable, because it's something else we can teach).
That said, you're right that the shock and awe stuff can get over the top. (And 'shock and awe' is a great phrase for this.) I think a lot of the time we can make objections on aesthetic grounds alone, just as we can to the most overblown writing.
While I agree with your concern, my experience has been quite different. When presenting and discussing network visualizations in particular, and data visualization in general, I've found that even well-framed and low-variable count data visualization is casually dismissed as aesthetically pleasing but useless for the transmission of knowledge. It's always been my fear that we'd end up in the situation you describe, since data viz is seductive and impressive to the lay audience.
But so far, I see the heft of verbiage about data viz among digital humanities practitioners to be in criticism of it, and not in fawning support of it. As such, I'm actually a bit worried when I read well-written, insightful pieces like the one you've just posted, because you give those folks yet another reason to dismiss any data visualization that they don't comprehend as something that's obviously just incomprehensible.
What Ben said.
I'm enthusiastic about awesome visualizations when the awesomeness has a communicative function. But there are particular kinds of awesomeness that rarely do.
E.g., with a force-directed graph, it's very tempting to show the thing evolving and organizing itself as an animation. Because that looks really cool. But unless the time axis of that animation is related to some actual time in the domain being modeled, it's actually a bit misleading -- at least in the sense that the showiest part of the viz is not a meaning-bearing part. (The only meaning it conveys is, arguably, explaining to the audience how a force-directed graph works.)
That said, I honor and respect good viz craftsmanship. What Ben said -- once again. Form should follow function in this domain for aesthetic reasons as much as anything.
Thanks Trevor, Ben and Elijah for your comments.
Trevor, thanks for the link to your post. I think you're right that we need to look at visualizations for the roles they fulfill rather than lump them all into a single category. In terms of added performance notes, I agree, but I wonder if our footnotes and appendices will be able to keep up with the increasingly technical nature of our analyses.
Ben, you make a great point about stylish writing and that's a comparison I hadn't thought of before. I suppose it's not difficult, through style or through the omission of details that challenge your position, to be as deceptive through prose as through visualization. Though that's no excuse for letting our guard down on the visual elements of our research - not that you were suggesting we do!
Elijah, I appreciate your concern and I think it's a valid one. I certainly am not anti-visualization and I'd disagree with anyone who views visualization as seductive but useless for knowledge transmission. I use visualizations frequently in my presentations and my written work. But something in me always wants to provide the raw data as well as notes on how I came to produce the visualization, if only to be transparent that yes, I did actually do this properly. And yes, this is a valid result. The graph is still a black box in most cases. I am not suggesting we get rid of the graph. But I'd like translucent sides on the box.
Thanks again for your comments all three of you. You've given me more to think about, which is exactly what I had hoped for!
Maybe it's also worth pointing out that this isn't a general "viz" issue. I'd say it's specifically an issue with "graphs" in the mathematical (node-edge) sense of the word.
And where social network graphs are concerned, I think the problem isn't restricted to "shock and awe," either. The more basic problem is that we're often not entirely sure what kind of relationality those graphs are representing.
I think, before we start drawing nodes and edges, we should ask ourselves whether "network" is really the right sort of abstraction in the case we're confronting. It sometimes is: e.g. the travel networks in a project like ORBIS really are networks. Ditto for the hypertextual structure Elijah modeled when he tackled TV Tropes. But there's a growing tendency to use network graphs to represent kinds of domain space that are far more abstract, and not necessarily network-like.
Thanks for the comments Ted. I like your note that not all parts of graphs are meaning bearing. That's something I think many readers and interpreters of visualizations overlook or perhaps never thought of.
There have been a number of excellent responses to this post as well as articles on similar topics that have popped up around the web the last few days. As I'm sure some readers would like to read about the alternative perspective as I did, I thought I'd share the ones I've found here. If there are others please let me know as I'd love to hear more.
* Mark Ravina "In Praise of 'Shock and Awe'" (http://clioviz.wordpress.com/2012/05/29/in-praise-of-shock-and-awe/)
* John Theibault "Visualizations and Historical Arguments" (http://writinghistory.trincoll.edu/evidence/theibault-2012-spring/)
* Liz S. "What Are We Doing With Our Visualizations?" (http://ludicanalytics.wordpress.com/2012/05/31/what-are-we-doing-with-our-visualizations/)
Another great article to add to the discussion. This time by Carla Uriona
(http://www.viewtific.com/when-graphs-are-hard-to-understand/)
Thanks for taking the time to join in Carla!