‘A map and some pins’: open data and unlimited horizons


This is the text of my keynote address to the Digisam conference on Open Heritage Data in the Nordic Region held in Malmö on 25 April 2013. You can also view the video and slides of my talk, or get the full interactive experience by playing around with my text/presentation in all its LOD-powered glory. (Use a decent browser.)


The Australian poet and writer Edwin James Brady and his family lived for many years in the isolation of far eastern Victoria — in a little town called Mallacoota.

Edwin James Brady (NLA: nla.pic-vn3704359)

Here, from time to time, Brady amused himself by taking a map of Australia down from the wall and sticking pins in it. The pins, Brady explained in 1938, included labels such as ‘Hydro-electric supply base’, ‘Irrigation area’, and ‘Area for tropical settlement’. The map and its pins were one expression of Brady’s life-long obsession with Australia’s potential for development — for progress.

Maps and pins are probably more familiar now than they were in Brady’s time. We use them routinely for sharing our location, for plotting our travel, for finding the nearest restaurant. Maps and pins are one way that we document, understand and express our relationship to space.

Brady, however, was interested in using his pins to highlight possibilities. In the late nineteenth and early twentieth centuries size mattered. With the nations of Europe jostling for land and colonial possessions, space became an index of power. When the Australian colonies came together in 1901 to form a nation, maps and spatial calculations abounded. Australia was big and so its future was filled with promise.

Australia was big with promise.

In his travels around Australia, EJ Brady started to catalogue ways in which its vast, ‘empty’ spaces might be turned to productive use. A hardy yeomanry armed with the latest science could transform these so-called ‘wastes’, and Brady was determined to bring these opportunities to the attention of the world.

This evangelical crusade reached its high point in 1918, with the publication of his huge compendium, Australia Unlimited — 1139 pages of ‘Romance History Facts & Figures’.

National Archives of Australia:  A659, 1943/1/3907, page 135

Space may no longer be invested with the same sense of swelling power, but our maps and pins still figure in calculations of progress. Now it is the data itself that awaits exploitation. Our carefully plotted movements, preferences and passions may well have value to governments, planners or advertisers. Data, according to some financial pundits, ‘is the new oil’.

Whereas Brady travelled the land documenting its untapped riches, we can take his work and life and mine it for data — looking for new patterns and possibilities for analysis.

Brady wasn’t the first to use the phrase ‘Australia Unlimited’, though he did much to make it familiar. By exploring the huge collection of digitised newspapers available through the National Library of Australia’s discovery service, Trove, we can track the phrase through time.

Graph: newspaper articles in Trove containing the phrase ‘Australia Unlimited’, by year

Brady was a skilled self-publicist and his research trips in 1912 were eagerly reported by the local press such as the Barrier Miner in Broken Hill and the Cairns Post.

In 1918 the book was published, receiving a generally positive reception — as an advertising leaflet showed, even King George V thought the book was ‘of special interest’. In 1926, a copy of the book was presented to a visiting delegation from Japan.

NAA: A659, 1943/1/3907, page 55

Over the years Brady sought to build on his modest successes, planning a variety of new editions and even a movie. But while his hopes were thwarted, the phrase itself lived on.

In 1938 and 1952, there are references to a radio program called ‘Australia Unlimited’, apparently featuring young musical artists. Also in 1952 came the news that the author of Australia Unlimited, EJ Brady, had died in Mallacoota at the age of 83.

Sydney Morning Herald, 23 July 1952

Unfortunately copyright restrictions bring this analysis to a rather unnatural end in 1954. If we were able to follow it through until the present, we could see that from 1957 the phrase was used by a leading daily newspaper as the title of an annual advertising supplement detailing Australia’s possibilities for development. In 1958, it was adopted as a campaign slogan by Australia’s main Conservative party. In 1999 it was resurrected again by a major media company for a political and business symposium. Even now it provides the focus for a ‘branding’ initiative supported by the federal government.

Graphs like this are pretty easy to interpret, but of course we should always ask how they were made. In this case I simply harvested the number of matching search results from the Trove newspaper database for each year. The tool I built to do this has been through several different versions and is now a freely-accessible web application called QueryPic. Anyone can quickly and easily create this sort of analysis just by entering a few keywords.
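The harvesting loop described above can be sketched roughly as follows. This is only an illustration, not the QueryPic code: `fetch_total` is a hypothetical callable standing in for whatever request returns the number of matching results for one year, and the stub figures are invented purely so the example runs without network access.

```python
from typing import Callable, Dict, Iterable


def harvest_totals(
    fetch_total: Callable[[str, int], int],
    query: str,
    years: Iterable[int],
) -> Dict[int, int]:
    """Ask for the number of matching results once per year."""
    return {year: fetch_total(query, year) for year in years}


def peak_year(totals: Dict[int, int]) -> int:
    """Return the year with the most matches (the earliest year wins ties)."""
    return max(sorted(totals), key=lambda year: totals[year])


# Hypothetical stand-in for a real search request; the numbers are made up.
def fake_fetch_total(query: str, year: int) -> int:
    made_up = {1912: 14, 1918: 52, 1926: 9}
    return made_up.get(year, 0)


totals = harvest_totals(fake_fetch_total, '"australia unlimited"', range(1910, 1930))
print(peak_year(totals))
```

Keeping the request behind a single function means the same loop works whether the totals come from an API or, where none exists, from a scraped results page.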

QueryPic is one product of my experiments with the newspaper database over the last couple of years. I’ve also looked at changes to the content of front pages, I’ve created a combination discovery interface and fridge poetry generator, and I’ve even built a simple game called Headline Roulette — which I’m told is strangely addictive.

All of these tools and experiments take advantage of Trove’s API. I think it’s important to note that the delivery of cultural heritage resources in a machine-readable form, whether through a custom API or as Linked Open Data, provides more than just improved access or possibilities for aggregation. It opens those resources to transformation. It empowers us to move beyond ‘discovery’ as a mode of interaction to analyse, extract, visualise and play.

Using freely available tools we can extract named entities from a text, we can look for topic clusters across a collection of documents, we can find places and pin them to a map. With a little bit of code I can take the newspaper reports of Brady’s travels in 1912 and map them. With a bit more time I could take another of Brady’s travel books, River Rovers, available in digitised form through the Internet Archive, and plot his journey along Australia’s longest river, the Murray.
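As a rough sketch of what 'finding places and pinning them to a map' can involve, the following Python looks up place names in a tiny hand-made gazetteer and emits GeoJSON points that most web mapping tools can display. The gazetteer, its approximate coordinates and the sample sentence are all assumptions made for illustration; a real pipeline would use a proper named-entity extractor and gazetteer service.

```python
import json
from typing import Dict, List, Tuple

# Hypothetical mini-gazetteer: place name -> (latitude, longitude), approximate.
GAZETTEER: Dict[str, Tuple[float, float]] = {
    "Broken Hill": (-31.95, 141.47),
    "Cairns": (-16.92, 145.77),
    "Mildura": (-34.19, 142.16),
}


def find_places(text: str) -> List[str]:
    """Return gazetteer names that occur in the text, in order of first appearance."""
    hits = [(text.find(name), name) for name in GAZETTEER if name in text]
    return [name for _, name in sorted(hits)]


def to_geojson(places: List[str]) -> dict:
    """Turn place names into a GeoJSON FeatureCollection of point 'pins'."""
    features = []
    for name in places:
        lat, lon = GAZETTEER[name]
        features.append({
            "type": "Feature",
            "properties": {"name": name},
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
        })
    return {"type": "FeatureCollection", "features": features}


# Invented sentence in the style of a newspaper report, for illustration only.
report = "Mr Brady left Cairns last week and is expected in Broken Hill shortly."
pins = to_geojson(find_places(report))
print(json.dumps(pins, indent=2))
```

Simple string matching like this misses spelling variants and ambiguous names, which is one small example of the lossiness discussed next.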

Such transformations help us see resources in different ways — we can find new patterns, new problems, new questions. But transformation is a lossy business. I can put a pin in a map to show that Brady stopped off in Mildura on his voyage along the Murray. What is much harder to represent are the emotions that surrounded that visit. While he was there Brady received news of a friend’s death. ‘Bad news makes hateful the most pleasant place of abiding’, he wrote mournfully, ‘I strained to open the gate to go forth again into a wilderness of salt bush and sere sand’. Travel can be a form of escape.

Digital humanist and designer Johanna Drucker has written about the problems of representing the human experience of time and space using existing digital tools. ‘If I am anxious’, she notes, ‘spatial and temporal dimensions are distinctly different than when I am not’. We do not experience our journeys in a straightforward linear fashion — as the accumulation of metres and minutes. We invest each footstep with associations and meanings, with hopes for the future and memories of the past. Drucker calls on humanities scholars to articulate these complexities and work towards the development of new techniques for modelling and visualising our data.1

‘If human beings matter, in their individual and collective existence, not as data points in the management of statistical information, but as persons living actual lives, then finding ways to represent them within the digital environment is important.’2

In a similar way, Australia Unlimited is not just a catalogue of potentialities, or a passionate plea for national progress. It’s also the story of a struggling poet trying to find some way of supporting his family. Proceeds from the book enabled Brady to buy a plot of land in Mallacoota and build a modest home — I suspect that it wasn’t the same home that was painted by his wife in the 1950s.

nla.pic-an2287718-v

But even this small success was undermined. Distribution of the book was beset with difficulties and disappointments and despite all his plans financial security remained elusive.

Brady’s youngest daughter, Edna June, was born to his third wife Florence in 1946. What could he leave her? A ‘modern edition’ of Australia Unlimited lay completed but unpublished. ‘If it fails to find a publisher’, he remarked wistfully, ‘the MSS will be a liberal education for her after she has outgrown her father’s nonsense rhymes’. It was, he pondered, ‘a sort of heritage’.

One of the things I love about being a historian is that the more we focus in on the past the more complicated it gets. People don’t always do what we expect them to, and that’s both infuriating and wonderful.

Likewise, while we often have to clean up or ‘normalise’ cultural heritage data in order to do things with it, we should value its intrinsic messiness as a reminder that it is shot through with history. Invested with the complexities of human experience it resists our attempts at reduction, and that too is both infuriating and wonderful.

The glories of messiness challenge the extractive metaphors that often characterise our use of digital data. We’re not merely digging or mining or drilling for oil, because each journey into the data offers new possibilities — our horizons are opened, because our categories refuse to be closed. These are journeys of enrichment, interpretation and creation, not extraction.

We’re putting stuff back, not taking it out.

Cultural institutions have an exciting opportunity to help us work with this messiness. The challenge is not just to pump out data; anyone can do that. The challenge is to enrich the contexts within which we meet this data — to help us embrace nuance and uncertainty; to prevent us from taking the categories we use for granted.

For all its exuberant optimism, a current of fear ran through Australia Unlimited. The publisher’s prospectus boldly announced that it was a ‘Book with a Mission’. ‘A mere handful of White People’, perched uncomfortably near Asia’s ‘teeming centres of population’, could not expect to maintain unchallenged ownership of the continent and its potential riches. Australia’s survival as a white nation depended upon ‘Effective Occupation’, secured by a dramatic increase in population and the development of its vast, empty lands — ‘The Hour of Action is Now!’.

National Archives of Australia:  A659, 1943/1/3907, page 208

In 1901, one of the first acts of the newly-established nation of Australia was to introduce legislation designed to keep the country ‘white’. Restrictions on immigration, administered through a complex bureaucratic system, formed the basis of what became known as the White Australia Policy.

While the legislation was designed to keep non-white people out, an increase of the white population was seen as essential to strengthen security and legitimise Australia’s claim to the continent. Australia Unlimited was an exercise in national advertising aimed at filling the unsettling emptiness with sturdy, white settlers.

But White Australia was always a myth. As well as the indigenous population there were, in 1901, many thousands of people classified as non-white living in Australia. They came from China, India, Indonesia, Turkey and elsewhere. A growing number had been born in Australia. They were building lives, making families and contributing to the community.

Here are some of them…

The real face of White Australia

I built this wall of faces using records held by the National Archives of Australia. If a non-white person resident in Australia wanted to travel overseas they needed to carry special documents. Without them they could be prevented from re-entering the country — from coming home. Many, many thousands of these documents are preserved within the National Archives.

Kate Bagnall, a historian of Chinese-Australia, and I are exploring ways of exposing these records through an online project called Invisible Australians.

To build the wall I downloaded about 12,000 images from the Archives’ website — representing just a fraction of one series relating to the administration of the White Australia Policy. Unfortunately there’s no machine-readable access to this data, so I had to resort to cruder means — reverse-engineering interfaces and screen-scraping.

Once I had the images I ran them through a facial detection script to find and crop out the portraits. What we ended up with was a different way of accessing those records — an interface that brings the people to the front; an interface which is compelling, discomfiting, and often moving.
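The cropping step can be illustrated with a short sketch. It assumes that some face detector (not shown, and not necessarily the one used for the wall) has already returned bounding boxes as `(x, y, width, height)`; the function name, the 25% margin and the sample boxes are all hypothetical choices.

```python
from typing import List, Tuple

Face = Tuple[int, int, int, int]   # (x, y, width, height) from a detector
Crop = Tuple[int, int, int, int]   # (left, top, right, bottom) to cut out


def crop_box(face: Face, image_size: Tuple[int, int], margin: float = 0.25) -> Crop:
    """Expand a face box by a margin on every side and clamp it to the image."""
    x, y, w, h = face
    img_w, img_h = image_size
    pad_x, pad_y = int(w * margin), int(h * margin)
    left = max(0, x - pad_x)
    top = max(0, y - pad_y)
    right = min(img_w, x + w + pad_x)
    bottom = min(img_h, y + h + pad_y)
    return left, top, right, bottom


def portrait_boxes(faces: List[Face], image_size: Tuple[int, int]) -> List[Crop]:
    """Compute one crop rectangle per detected face."""
    return [crop_box(face, image_size) for face in faces]


# Made-up detections on a 320 x 400 pixel page image.
detections = [(100, 80, 200, 200), (10, 10, 40, 40)]
print(portrait_boxes(detections, (320, 400)))
```

The margin matters for portraits: a tight detector box tends to cut off hair and chin, so a little padding makes the cropped faces read as photographs rather than fragments.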

The wall of faces also raises interesting questions about context. Some people might be concerned by the loss of context when images are presented in this way, although each portrait is linked back to the document it was derived from, and to the Archives’ own collection database. What is more important, I think, are the contexts that are gained.

If you’re viewing digitised files on the National Archives’ own website, you can only do so one page at a time. Each document is separate and isolated. What changes when you can see the whole of the file at once? I’ve built another tool that lets you do just that with any digitised file in the Archives’ collection. You see the whole as well as the parts. You have a sense of the physical and conceptual shape of the file.

National Archives of Australia:  ST84/1, 1908/471-480

In the case of the wall of faces, bringing the images together, from across files, helps us understand the scale of the White Australia Policy and how it impacted on the lives of individuals and communities. These were not isolated cases; these were thousands of ordinary people caught up in the workings of a vast bureaucratic system. The shift of context wrought by these digital manipulations allows us to see, and to feel, something quite different.

And we can go the other way. In another experiment I created a userscript to insert faces back into the Archives’ website. A userscript is just a little bit of code that rewrites web pages as they load in your browser. In this case the script grabs images relating to the files that you’re looking at from Invisible Australians.

So instead of this typical view of search results.

Before

You see something quite different.

After

Instead of just the record metadata for an individual item, you see that there are people inside.

We also have to remember that the original context of these records was the administration of a system of racial surveillance and exclusion. The Archives preserves not only the records, but the recordkeeping systems that were used to monitor people’s movements. The remnants of that power adhere to the descriptive framework. There is power in the definition of categories and the elaboration of significance.

Thinking about this I came across Wendy Duff and Verne Harris’s call to develop ‘liberatory standards’ for archival description. Standards, like categories, are useful. They enable us to share information and integrate systems. But standards also embody power. How can we take advantage of the cooperative utility of standards while remaining attuned to the workings of power?

A liberatory descriptive standard, Duff and Harris argue: ‘would posit the record as always in the process of being made, the record opening out of the future. Such a standard would not seek to affirm the keeping of something already made. It would seek to affirm a process of open-ended making and re-making’.3

‘Holes would be created to allow the power to pour out.’

‘Making and re-making’ — sounds a lot like the open data credo of ‘re-use and re-mix’ doesn’t it? I think it’s important to carry these sorts of discussions about power over into the broader realm of open data. After all, open data must always, to some extent, be closed. Categories have been determined, data has been normalised, decisions made about what is significant and why. There is power embedded in every CSV file, arguments in every API.

This is inevitable. There is no neutral position. All we can do is encourage re-use of the data, recognising that every such use represents an opening out into new contexts and meanings. Beyond questions of access or format, data starts to become open through its use. In Duff and Harris’s words, we should see open data ‘as always in the process of being made’.

What this means for cultural institutions is that the sharing of open data is not just about letting people create new apps or interfaces. It’s about letting people create new meanings. We should be encouraging them to use our APIs and LOD to poke holes in our assumptions to let the power pour out.

There’s no magic formula for this beyond, perhaps, building confidence and creating opportunities. But I do think that Linked Open Data offers interesting possibilities as a framework for collaboration and contestation — for making and challenging meanings.

We tend to think about Linked Open Data as a way of publishing — of pushing our data out. But in fact the production and consumption of Linked Open Data are closely entwined. The links in our data come from re-using identifiers and vocabularies that others have developed. The linked data cloud grows through a process of give and take, by many small acts of creation and consumption.

There’s no reason why that process should be confined to cultural institutions, government departments, business, or research organisations. Linked Open Data enables any individual to talk about what’s important to them, while embedding their thoughts, collections, passions or obsessions within a global conversation. By sharing identifiers and vocabularies we create a platform for communication. Anyone can join in.

So, if we want people to engage with our data, perhaps we need to encourage them to create their own.

I’ve just been working on a project with the Mosman public library in Sydney aimed at gathering information about the experiences of local servicepeople during World War One. There are many such projects happening around the world at the moment, but I think ours is interesting in a couple of ways. The first is a strong emphasis on linking things up.

There are records relating to Australian servicepeople available through the National Archives, the Australian War Memorial, and the Commonwealth War Graves Commission, but there are currently no links between these databases. I’ve created a series of screen scrapers that allow structured data to be easily extracted from these sources. That means that people can, for the first time, search across all these databases in one hit. It’s a very simple tool that I started coding to ease the boredom of a long bus trip — but it has proved remarkably popular with family historians.

Once you’ve found entries in these databases, you can just cut and paste the URL into a form on the Mosman website and a script will retrieve the relevant data and attach it to the record of the person you’re interested in. Linking to a service record in the National Archives, for example, will automatically create entries for the person’s birthplace and next-of-kin.
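A screen scraper of this kind can be very small. The sketch below uses only Python's standard library to pull label and value pairs out of a details table. The markup, the field labels and the person in the sample are invented for illustration; each real database has its own page structure, so the parsing rules here are assumptions rather than a description of the Mosman scripts.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class LabelValueParser(HTMLParser):
    """Collect <th>label</th><td>value</td> pairs from a page of record details."""

    def __init__(self) -> None:
        super().__init__()
        self.fields: Dict[str, str] = {}
        self._cell: Optional[str] = None   # "th" or "td" while inside a cell
        self._label = ""
        self._buffer: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            self._cell = tag
            self._buffer = []

    def handle_data(self, data):
        if self._cell:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._cell:
            return
        text = "".join(self._buffer).strip()
        if tag == "th":
            self._label = text
        elif self._label:
            self.fields[self._label] = text
            self._label = ""
        self._cell = None


def scrape(html: str) -> Dict[str, str]:
    """Return the label/value pairs found in an HTML fragment."""
    parser = LabelValueParser()
    parser.feed(html)
    return parser.fields


# Invented sample page; not a real record.
SAMPLE = """
<table>
  <tr><th>Name</th><td>Smith, John</td></tr>
  <tr><th>Place of birth</th><td>Mosman NSW</td></tr>
  <tr><th>Next of kin</th><td><a href="#">Smith, Mary</a></td></tr>
</table>
"""

record = scrape(SAMPLE)
print(record["Place of birth"], "|", record["Next of kin"])
```

Once the fields are in a dictionary, creating linked entries for a birthplace or next-of-kin is a matter of mapping each label to the kind of thing it describes.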

The process of linking builds structures, and these structures will themselves all be available as Linked Open Data. Even more exciting is that the links will not only be between the holdings of cultural institutions. The stories, memories, photographs and documents that people contribute will also be connected, providing personal annotations on the official record.

None of this is particularly hard, it’s just about getting the basics right. Remembering that structure matters and that links can have meaning. It’s also about recognising that ‘crowd sourcing’ or user-generated content can be made anywhere. Using Linked Open Data people can attach meanings to your stuff without visiting your website. Through the process of give and take, creation and consumption, we can build layers of description, elaboration, and significance across the web.

What excites me most about open cultural data is not the possibility of shiny new apps or collection visualisations, but the possibility of doing old things better. The possibility of reimagining the humble footnote, for example, as a re-usable container of structured contextual data — as a form of distributed collection description. The possibility of developing new forms of publication that immerse text and narrative within a rich contextual soup brewed from the holdings of cultural institutions.

I want every online book or article to be a portal. I want every blog or social media site to be a collection interface.

What might this look like? Perhaps something like this presentation.

My slides today are embedded within an HTML document that incorporates all sorts of extra goodies. The full text of my talk is here, and as you scroll through it you’ll see more information about the people, places and resources I mention pop up in the sidebar. Alternatively you can explore just the people or the resources, looking at the connections between them and the contexts in which they’re mentioned within my text.

This is part of an ongoing interest in exploring different ways of presenting historical narrative that build a relationship between the text and the structured data that underlies the story.

All of the structured data is available in machine-readable form as RDF — it is, in itself, a source of LOD. In fact the interface is built using an assortment of JavaScript libraries that read the RDF into a little temporary triplestore, and then query it to create the different views. So the whole thing is really powered by LOD.
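The idea of loading statements into a small temporary store and querying it to build different views can be sketched in a few lines. The presentation itself uses JavaScript libraries; this Python version is only an illustration of the pattern. The `ex:` identifiers are placeholders, and the property names are borrowed from schema.org as an assumption about vocabulary.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)


class TripleStore:
    """A minimal in-memory store that answers simple pattern queries."""

    def __init__(self) -> None:
        self.triples: List[Triple] = []

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self.triples.append((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        """Return triples matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return [
            triple
            for triple in self.triples
            if all(want is None or want == got for want, got in zip(pattern, triple))
        ]


store = TripleStore()
store.add("ex:brady", "rdf:type", "schema:Person")
store.add("ex:brady", "schema:name", "Edwin James Brady")
store.add("ex:australia_unlimited", "schema:author", "ex:brady")
store.add("ex:river_rovers", "schema:author", "ex:brady")

# One 'view': everything in the store that names Brady as its author.
works = [s for s, _, _ in store.match(predicate="schema:author", obj="ex:brady")]
print(works)
```

Each view in an interface like this is just a different query over the same statements, which is why the text and the data can stay in step.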

It’s still very much an experiment, but I think it raises some interesting possibilities for thinking about how we might consume and create LOD simply by doing what we’ve always done — telling stories.

EJ Brady’s dreams were never realised. Australia’s vast spaces remained largely empty, and the poet continued to wrestle with personal and financial disappointment. ‘After nearly eight decades association with the man’, Brady wrote of himself in 1949, ‘I have come to look upon him as the most successful failure in literary history’. This energetic booster of Australia’s potentialities was well aware of his own life’s mocking irony. ‘He has not… made the wages of a wharf laborer out of book writing yet he persists in asserting Australia is the best country in the world!’.

But still Brady continued to add pins to his map. ‘For half a century I’ve been heaping up notes, reports, clippings, pamphlets, etc. on… all phases of the country’s life and development’. ‘What in hell I accumulate such stuff for I don’t know’, he complained in 1947. As the elderly man surveyed the ‘bomb blasted pile of rubbish’ strewn about his writing tent, he admitted that ‘this collecting is a sort of mania’.

Brady’s map and pins told a complex story of hope and disappointment, of confidence and fear. A story that combined national progress with an individual’s attempts merely to live.

There are stories in our data too — complex and contradictory stories full of emotion and drama, disappointment and achievement, violence and love. Let’s find ways to tell those stories.

  1. Johanna Drucker, ‘Humanistic Theory and Digital Scholarship’, in Matthew K. Gold (ed.), Debates in the Digital Humanities, University of Minnesota Press, 2012.
  2. Johanna Drucker, ‘Representation and the digital environment: Essential challenges for humanists’, University of Minnesota Press Blog, http://www.uminnpressblog.com/2012/05/representation-and-digital-environment.html
  3. Wendy M. Duff and Verne Harris, ‘Stories and Names: Archival Description as Narrating Records and Constructing Meanings’, Archival Science, vol. 2, 2002, pp. 263–285.

This work is licensed under a Creative Commons Attribution 4.0 International License.

Written by: Tim Sherratt

I'm a historian and hacker who researches the possibilities and politics of digital cultural collections.

25 Comments

  1. Anne Dutlinger
    June 19, 2013

    “Through the process of give and take, creation and consumption, we can create layers of description, elaboration, and significance across the web.” Tim Sharatt’s “A Map and Some Pins: Open Data and Unlimited Possibilities” is brilliant and very moving. I love his refrigerator poetry game, too!

  2. Andrew Wilson
    June 20, 2013

    An outstanding inspirational talk, Tim. Brilliant work.
    Thanks.
