An infrastructure wishlist

I have problems with the idea of infrastructure, particularly that of the e-research variety. It seems like we always end up talking about huge amounts of money and multi-institutional partnerships. It just doesn’t seem like a great model for innovation. As I’ve previously argued, I’d like to see something more like the funding schemes offered by the NEH Office for Digital Humanities. Encourage people with ideas; don’t just reward the good networkers. Build tools and APIs, not portals and platforms.

Of course I’d still like to see the digital humanities well represented in the list of Virtual Laboratories and eResearch Tools currently under consideration by NeCTAR. It’s time the digital research needs of the humanities were properly recognised. There are lots of possibilities, most of which we can’t yet envisage, but as I was asked what I would like to see as part of a Virtual Laboratory I had a go at setting down a few brief ideas. For what it’s worth, here’s my e-research infrastructure wishlist…

Grappling with abundance

Traditional historical research is often based on a presumed scarcity of resources: the skill is in tracking down the sources. But large digital collections, like the Trove newspapers database, change this. You now have to make sense of the sheer volume of material. Digital history, through techniques such as text-mining and visualisation, offers a way of using these new riches effectively. We need to ensure that investments in digitisation are accompanied by evolutions in scholarly practice.

Understanding what’s not online

At the same time, it must be recognised that large quantities of our cultural heritage are not available in digital form. For example, only about 10% of the holdings of the National Archives of Australia are described in their collection database, and only a small proportion of these are digitised. Easy online access could foster a certain circularity in historical research where only ‘known’ resources are consulted. We need to develop tools and visualisations that reveal the valleys as well as the mountaintops — identifying the holes in our research fabric.

Critical engagement

More generally, we need to foster critical engagement with the tools and assumptions of digital research. Federated searching sounds great, but as scholars we need to expose the assumptions implicit in any such tool. What is being federated, from where, and how is relevance being determined? Humanities e-research infrastructure should have built-in levels of reflexivity that enable scholars to understand the limits and assumptions of their digital research. Every algorithm contains an argument.

Documenting change

The resources we build our arguments with are subject to change. The Trove newspapers database, for example, is constantly adding new titles and articles, while users are improving the text transcriptions. Any analysis based on the holdings of this database needs to explicitly recognise this. At the very least, the tools we have need to be able to generate time-stamped citations. It would be even better if we could capture a snapshot of the data to accompany our analyses. Perhaps there are possibilities for using something like the Memento project to ensure that the temporal context of humanities research is adequately documented.
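
As a rough sketch of what a time-stamped citation might look like in practice, here is a small Python function. The function name, the citation layout and the example URL are all my own assumptions for illustration; nothing here is an existing Trove feature.

```python
from datetime import datetime, timezone


def timestamped_citation(title, source, url, retrieved=None):
    """Return a citation string that records when the resource was consulted."""
    # Default to the current moment if no retrieval time is supplied.
    retrieved = retrieved or datetime.now(timezone.utc)
    # Normalise to UTC so citations made in different time zones are comparable.
    stamp = retrieved.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{title}, {source}, <{url}> (retrieved {stamp})"


# Hypothetical article title and URL, for illustration only.
print(timestamped_citation(
    "Example article title",
    "Trove newspapers database",
    "https://example.org/article/123",
    datetime(2011, 11, 9, 3, 30, tzinfo=timezone.utc),
))
```

A recorded timestamp like this is also the sort of value a Memento-style request for a past version of a resource would need, though whether a given collection supports that is a separate question.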

Show your working out

Scholarly publication in history, and the humanities generally, tends to present a finished product. But as we delve further into digital research, the research processes themselves will be equally important, both for fostering critical engagement with tools and methods and for enabling others to reproduce or extend the research. We need easy ways for researchers to expose their working out (subject to whatever access controls they think appropriate). It should be possible to save a series of steps (search, analysis, visualisation and so on) as modules for sharing and re-use.
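
One way to picture saved steps is as a workflow stored as plain data. The Python sketch below is only an illustration under my own assumptions: the step names, the registry and the sample records are all made up, and a real tool would need much more (versioning, access controls, richer step types).

```python
import json

# Registry of named steps; each step is an ordinary function.
STEPS = {}


def step(func):
    """Register a function under its own name so a workflow can refer to it."""
    STEPS[func.__name__] = func
    return func


@step
def search(records, term):
    """Keep records whose text contains the term (case-insensitive)."""
    return [r for r in records if term.lower() in r["text"].lower()]


@step
def count_by_year(records):
    """Count the remaining records per year."""
    counts = {}
    for r in records:
        counts[r["year"]] = counts.get(r["year"], 0) + 1
    return counts


def run(workflow, data):
    """Apply each saved step in order, passing the result along."""
    for item in workflow:
        data = STEPS[item["step"]](data, **item.get("params", {}))
    return data


# The workflow is plain data, so it can be saved, shared and re-run.
workflow = [
    {"step": "search", "params": {"term": "drought"}},
    {"step": "count_by_year"},
]
saved = json.dumps(workflow)

# Made-up records, for illustration only.
records = [
    {"year": 1902, "text": "Drought relief meeting"},
    {"year": 1902, "text": "The drought breaks"},
    {"year": 1903, "text": "Wool prices"},
]
print(run(json.loads(saved), records))
```

Because the saved workflow is just JSON, another researcher could read exactly which steps were applied, change a parameter, and run it against their own records.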

Follow your nose

Search needs to be complemented by rich, exploratory environments that encourage browsing, enable you to follow relationships, and foster serendipitous discovery. The problem with many collections is knowing enough about what’s in them to frame a useful search. Browsing through a variety of interfaces (people, maps, events, record types, physical proximity) overcomes this problem. As more cultural institutions make use of Linked Open Data and shared identifiers, such as People Australia, Geonames or the Powerhouse Object Thesaurus, the possibilities for navigating this rich contextual space will increase.

Citation

We need to develop better models for embedding rich citations within scholarly research — citations that describe not only the resource in structured, machine-readable forms, but also relevant relationships. This will link research directly to resources, making scholarly outputs a means of resource discovery, and enabling resource databases to re-use the scholarly research to enhance their own descriptions and finding aids.
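
To make the idea a little more concrete, here is a hedged Python sketch of a citation expressed as structured data, plus the reverse lookup a resource database might build from it. I have borrowed schema.org terms (ScholarlyArticle, citation, CreativeWork, about) as one possible vocabulary; that choice, the function names and all the example values are my own assumptions.

```python
from collections import defaultdict


def rich_citation(article_title, resource_id, resource_title, about):
    """Describe an article's citation of a resource as a JSON-LD style dict."""
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "name": article_title,
        "citation": {
            "@type": "CreativeWork",
            "@id": resource_id,
            "name": resource_title,
            # Entities the cited resource relates to, as (type, name) pairs.
            "about": [{"@type": kind, "name": name} for kind, name in about],
        },
    }


def cited_by(articles):
    """Index article titles by the identifier of the resource they cite."""
    index = defaultdict(list)
    for article in articles:
        index[article["citation"]["@id"]].append(article["name"])
    return dict(index)


# Hypothetical articles and resource identifiers, for illustration only.
articles = [
    rich_citation("Article A", "https://example.org/item/1", "Letter",
                  [("Person", "Example Person")]),
    rich_citation("Article B", "https://example.org/item/1", "Letter", []),
    rich_citation("Article C", "https://example.org/item/2", "Photograph", []),
]
print(cited_by(articles))
```

The second function is the point: once citations carry identifiers, the institution holding the resource can list the research that draws on it.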

Constructing narratives

Moving beyond simple citation, we need better ways of exposing the structures of people, events, places and things that are referenced in our narratives. Linked Open Data provides a model, but we need tools to make it simple and examples to make it obvious.
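
As a very small example of the underlying structure, the Python sketch below represents a narrative's people, events and places as subject, predicate, object statements, which is the basic shape of Linked Data. The identifiers and predicate names are invented for illustration; real data would use shared identifiers and an agreed vocabulary.

```python
# Each statement is a (subject, predicate, object) triple.
# All identifiers here are made up for illustration.
triples = [
    ("person:alice", "attended", "event:meeting"),
    ("person:bob", "attended", "event:meeting"),
    ("event:meeting", "heldAt", "place:town-hall"),
]


def neighbours(entity, statements):
    """Return the entities one step away from the given entity, in either direction."""
    found = set()
    for subject, _predicate, obj in statements:
        if subject == entity:
            found.add(obj)
        elif obj == entity:
            found.add(subject)
    return found


print(sorted(neighbours("event:meeting", triples)))
```

Following neighbours from one entity to the next is the simplest version of navigating a narrative's structure rather than just reading its text.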

This work is licensed under a Creative Commons Attribution 4.0 International License.

Written by: Tim Sherratt

I'm a historian and hacker who researches the possibilities and politics of digital cultural collections.

4 Comments

  1. November 10, 2011

    Under constructing narratives, what do you think of Alan Liu’s Rose database? Do you know whether it is constructed in such a way as to allow for data harvesting? What about interoperability?

    I ask because when I heard him speak at Brown a few weeks ago, the tool struck me as being one that could be useful for historians, though of course it was constructed from the perspective of literary studies.

  2. November 13, 2011

    Some good points here Tim. But as you say, it is a wish list. There is an old saying, “an historian is as close to a man of action as a curator is to an artist”. How can we make these things happen?

  3. November 13, 2011

    Just reading Franco Moretti’s ‘Graphs, Maps, Trees’. I would be interested in your ideas here.

  4. November 13, 2011

    OK, one last point. On your blog theme you say ‘people over systems’. But eResearch infrastructures are largely about connections of people (the net-workers), not just connections of data; linked or otherwise. It is really important that the two work together. It is the 2 hands of the DH that Willard McCarty often talks about. We need to foster the conditions for collaboration first; the tacit knowledge as well as the applied.
