Friday, January 13, 2012

Citation in Digital Humanities: Is the Old Bailey Online a Film, or a Science Paper?

Recently I was writing a paper for a journal and needed to cite the Old Bailey Online (OBO): not any particular piece of content within the project, but the project itself, as an outstanding example of digital humanities work. For those unfamiliar with the venture, it's a database containing 127 million words of historical trial transcripts marked up extensively with XML; in this author's opinion, it is still the flagship project of its kind. I found myself struggling to decide who the authors of the project were; that is, whose names I was bound by "good scholarship" to include in the citation. Who deserved public credit? I happen to meet regularly with one of the project's principal investigators, Tim Hitchcock of the University of Hertfordshire, and raised the issue with him over drinks at the pub. Incidentally, the pub is the most engaging place to discuss topics as dry as citation practices, and the discussion becomes increasingly engaging as the evening progresses. As it happens, the project had over 40 known contributors who actively participated in its creation. His initial response was that the team had decided not to include any names when citing the project, to avoid leaving people out and concentrating credit in the hands of only some team members. The resulting citation looks like this:

Old Bailey Proceedings Online. Version 6.0, March 2011.

This is a very noble position for the project leaders to take; however, I do not believe it is the right position. In an effort not to emphasize the contributions of some over those of others, this policy makes most contributors entirely invisible. This is particularly significant for people in the alternative academic (alt-ac) fields whose career progression and, in many cases, next meal depend upon the strength of their portfolios. These people have roles such as project management, database building, and web design, all of which are crucial to ensuring the projects themselves are world class. If we adopt the no-names policy across the board, these people will never be cited anywhere, whereas traditional academics may still have books and journal articles on top of their digital project work.

Though we brought our positions much closer together, the issue proved too much for a bottle of wine to solve. We parted ways and Hitchcock took the discussion to H-Albion, a list-serv for historians of Britain and Ireland, where many historians and librarians have contributed their opinions. Seth Denbo then brought the discussion to Humanist, a list-serv for digital humanities scholars, where a separate conversation has now begun. Rather than contribute to either or both of those conversations, I have decided to address the issue here in the hope that it can find new contributors who might not otherwise see it on the list-servs.

The most interesting question to arise so far is whether digital humanities projects like the OBO are films or science papers. Not literally of course, but in terms of the model of credit offered to contributors of the finished product. Both films and science journals have developed unique models of credit. In films, the credits run at the end. In science papers, everyone who made a meaningful contribution gets listed as an author and those who made minor contributions get an acknowledgement. I will argue that digital humanists would be doing their field and industry a great service by adopting both models simultaneously. The OBO and projects like it are both films and science papers.


One of the respondents to the list-serv discussion, retired librarian Malcolm Shifrin, suggested that the point of a citation is to retrieve the source, not to provide credit. In this sense, it does not matter whose names appear in the citation, as long as there is no ambiguity and the item can be identified. However, if that were the case, we could merely cite ISBNs, which would drastically cut down on the size of footnotes. Or, in nearly all cases, titles alone would suffice. For example, if I were to task you with finding a copy of the paper "An alternative definition of the scapular coordinate system for use with RSA" without any further information, I'm entirely confident you would make your way to a paper by my lovely wife, which appears in the Journal of Biomechanics. Citation is not merely about finding an item; it is also about credit. However, as Shifrin points out, it is not crucial that credit appears within a citation. An alternative model is the one used by the film industry, in which a portion of the finished product is dedicated to letting everyone know who was involved in its creation.

Most major website projects, including the OBO, already do this. The OBO's "About this Project" page lists 24 of the leading contributors along with their roles and effectively mimics the credits on a film. A listing of this sort is important because it offers an official "in-house" acknowledgement that's difficult to fake without breaking the law and hijacking the website to add your name. This allows everyone to direct potential employers to evidence of past work that can be independently verified. I would certainly argue that any collaborative digital humanities project should reserve a space on its website for such a page, which costs nothing but can be instrumental to the future career development of your team members. But I certainly do not think it's enough.

We do not know where the alt-ac world is going, and we would be wise to ensure that as many doors as possible remain open to those people who currently occupy this grey space in academia. Some members may aspire to a future tenure-track position and may find it difficult to convince more conservative senior faculty that film-style credits on a webpage are akin to hits on JSTOR. And because these conservative attitudes change slowly, it would be rash for digital humanists to abandon a well-established, if perhaps dated, model of credit just because we want to rebel in the name of progress. There's a baby in that bathwater.

Science Papers

This is where the model used by the academic science community is particularly helpful. In the humanities, if someone got paid to do the work as part of a grant or part-time role, we typically pretend they didn't exist. The work "was done" rather than "was done by so-and-so". We don't expect McDonald's to list the names of individual "team members" when they brag about how delicious their french fries are. It doesn't matter who made your fries. They were paid to do so and thereby gave up their right to credit.

In the sciences, everyone who makes a meaningful contribution is entitled to a share of the authorship of a paper. Assuming each of the 24 members of the OBO team met those criteria, a citation for the OBO might look like this:

Hitchcock T, Shoemaker R, Emsley C, Howard S, Hardman P, Bayman A, Garrett E, Lewis-Roylance C, Parkinson S, Simmons A, Smithson G, Wilcox N, Wright C, Clayton M, Bankhurst B, Lingwood D, MacKenzie E, Rogers K, McLaughlin J, Henson L, Black J, Newman E, O'Flaherty K, Smithson G. Old Bailey Proceedings Online. Version 6.0, March 2011.

It may be a bit more of an eyeful than the previous example, but at least it's a more accurate reflection of the work people put into the site's creation. The exact criteria for determining a "meaningful contribution" generally rest with the policies of individual journals. A typical example, from the International Committee of Medical Journal Editors, requires that each author must have made substantial contributions to all of the following:
  1. the conception and design of the study, or acquisition of data, or analysis and interpretation of data
  2. drafting the article or revising it critically for important intellectual content
  3. final approval of the version to be submitted

Obviously those criteria are designed specifically with a peer-reviewed journal article in mind. However, they can easily be adapted to the needs of a web-based digital humanities project, which is typically split into two parts: the project itself, and the digital infrastructure that allows the audience to interact with the project. A digital humanities "author" could be someone who has made substantial contributions to all of the following:

  1. the conception and design of the project or website; or acquisition of data or materials; or analysis, transformation and interpretation of data or materials
  2. drafting or creating any text, artwork, sound, video, workflow, interface, user experience, or code, that was integral to the success of the project and that would have been substantially different if it had been completed by someone else
  3. final approval of the finished product

In the case of the OBO, that may eliminate some people from the list of those credited with the project. As I am not one of those people, it is not my place to decide. But it is something I think as a community we should start discussing as soon as project teams are put together. What is the intended output, and how will each person's contribution be credited? It can be an awkward conversation at first, but it's a proactive solution to the elephant in the room for those in the alt-ac community.
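For teams that keep their contributor list in a structured form, the adapted criteria above can be written down as a simple check. What follows is only a rough sketch in Python, not anything the OBO team uses: the `Contributor` record, its field names, the two helper functions, and the placeholder names "Doe J" and "Roe A" are all assumptions made up for illustration. The judgement of what counts as a "substantial" contribution still has to be made by people; the code only records the outcome.

```python
from dataclasses import dataclass


@dataclass
class Contributor:
    """Hypothetical record of one team member (illustrative fields only)."""
    name: str                              # "Surname Initial", as in a science citation
    role: str                              # shown on the film-style credits page
    conceived_or_acquired: bool = False    # criterion 1: design, acquisition or analysis
    created_integral_work: bool = False    # criterion 2: text, code, interface, etc.
    approved_final: bool = False           # criterion 3: final approval of the product

    def is_author(self) -> bool:
        # An "author" must satisfy all three adapted criteria.
        return all((self.conceived_or_acquired,
                    self.created_integral_work,
                    self.approved_final))


def science_citation(team, title, version):
    """Science-style citation listing only those who meet every criterion."""
    authors = [c.name for c in team if c.is_author()]
    prefix = ", ".join(authors) + ". " if authors else ""
    return f"{prefix}{title}. {version}."


def film_credits(team):
    """Film-style credits: everyone is listed, each with a role."""
    return [f"{c.role}: {c.name}" for c in team]


# Placeholder team; the names and roles are invented for the example.
team = [
    Contributor("Doe J", "Project director", True, True, True),
    Contributor("Roe A", "Web designer", True, True, False),
]

citation = science_citation(team, "Example Project Online",
                            "Version 1.0, January 2012")
print(citation)
for line in film_credits(team):
    print(line)
```

In this sketch the web designer is left out of the citation because final approval was never recorded, yet still appears in the film-style credits, which is exactly why the conversation about who signs off on what is worth having early.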


The OBO is both a film and a science paper. Project leaders of web-based digital humanities projects would be doing their industry a favour by ensuring projects have both a page of film-style credits that outlines contributors and their roles, and a science-style listing of substantial contributors or authors that is prominently displayed for anyone wishing to cite.

This two-pronged approach can only help digital humanities find its place within the academic world. It's the model that keeps the most doors open for those alt-ac members of our project teams who are unsure of which path their career will take in the future. It acknowledges the tremendous teamwork that goes into producing world-class digital humanities work, setting it apart from single-authored papers. And it doesn't misrepresent or misconstrue the purposes of either model of credit. The citation may not mean much to a tenured professor, but it can help launch the career of someone in the alt-ac world. And so, the citation may be a bit clunkier if we use the science model, but at least it's an honest reflection.

Photo credit: "Steve Jobs rendered in Applesoft BASIC" by Blake Patterson.