shoebox
archives

Emerging technologies and the need to experiment

About a month ago I posted a copy of my report Emerging technologies for the provision of access to archives on Scribd. It’s already edging up towards a thousand reads, so I thought it was time I put a link in from here.

The basic message is we need to experiment and find the spaces both within and between our institutions to foster such experimentation. Is that asking too much? Anyway… read, enjoy, use!

Doing it yourself

I was doing some research using the National Archives of Australia’s RecordSearch database the other day and became frustrated that there is no way of seeing how many pages are in a digitised file without clicking on the ‘Display digital copy’ link. So I fixed it.

As a userscript it’s hardly worthy of a blog post. All it does is find out how many pages are in the file and insert the number in the link text. It’s very simple. But I think it’s also a useful illustration of the changing balance of power between archives and their users.

William E Landis argued that archivists were ‘guilty as a profession of fetishising the outputs of our descriptive systems’. The design of finding aids has often been determined not by the needs of users but by a desire to faithfully represent the underlying archival architecture. But now users don’t have to just take what they’re given.

Technologies such as Greasemonkey are useful for sketching out alternatives. For organisations with IT systems that inhibit experimentation, Greasemonkey (or Mozilla’s Jetpack) provides a way of playing with interfaces without touching any of the underlying code. My rewrite of the way RecordSearch displays digitised files is an example of this.

But no one interface is ever going to meet the needs of all archive users. Fortunately, there are a growing number of ways in which archives can work in partnership with their users to help them create the interfaces they want and need.

Archives are starting to expose their data directly using APIs and linked open data. This gives users the power to create whole new applications. But I still think there’ll be a place for the little tweak – a simple hack that meets some small but specific need. I can imagine communities of interest building and sharing a range of tools, hacks, applications and interfaces specifically tailored to their research habits.

So if you don’t like it, fix it.

Some archives hacking

It’s great to see that the National Archives of Australia has released a large swag of data through the new data.australia.gov.au site. In the Commonwealth Agencies zip file you can find xml dumps of all the publicly accessible agency and series data in RecordSearch, as well as item data for series A1. This is the same data that Mitchell Whitelaw visualised so brilliantly in his Visible Archive project. There’s also item data and images from series A3560 – the Mildenhall photographs of early Canberra.

What’s even more exciting is that people are already using this data. At the recent GovHack event in Canberra the What The Federal Government Does team worked on visualising the activities of government by using functions data pulled from the agencies file. Another group has generated a really nice tag cloud and photo gallery from the Mildenhall data. With further GovHack sessions to follow and the MashupAustralia contest open until 13 November, let’s hope for some more inspired archives hacking.

Seeing RecordSearch data out in the world like this reminded me of a little project I started a while back and then set aside. It was a simple PHP script that scraped data from RecordSearch and spat it out either as XML or JSON. Mitchell used a version of this script in his A1 Explorer in order to find out the number of pages in each digitised file.

I’ve now expanded and improved the script so that it provides data on items, series, agencies and persons. The output includes all the basic fields as well as links between entities – such as related series, controlling agencies etc. As an added bonus you also get some useful totals (where they’re available): items include the number of pages, series include the number of items described on RecordSearch, and agencies include the number of series recorded. I’ve also fiddled with mod_rewrite to provide a more RESTful interface.

For XML output use the url http://discontents.com.au/shed/rs/xml/ followed by the appropriate identifier – a barcode for an item, a CA number for an agency, a CP number for a person or a series number.

Some examples:

As you might have guessed, to get JSON output you just substitute ‘json’ for ‘xml’ in the url.
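The URL pattern described above can be sketched in a few lines of Python. This is only an illustration: the base address is the one given in this post (the service may no longer be running), the function name `build_url` is made up for the example, and the identifiers in the usage note are placeholders. Percent-encoding the identifier is an assumption about how the script expects spaces to be passed.

```python
from urllib.parse import quote

# Base address as given in the post; the service may no longer be live.
BASE_URL = "http://discontents.com.au/shed/rs"


def build_url(identifier, fmt="xml"):
    """Return the address for an item barcode, CA number, CP number or series number.

    `fmt` is either "xml" or "json", matching the two outputs described above.
    """
    if fmt not in ("xml", "json"):
        raise ValueError("fmt must be 'xml' or 'json'")
    # Assumption: identifiers containing spaces are sent percent-encoded.
    return f"{BASE_URL}/{fmt}/{quote(identifier, safe='')}"
```

So `build_url("A1")` gives the XML address for series A1, and `build_url("A1", "json")` gives the JSON equivalent.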

Being dependent on screen scraping, it’s inherently a bit fragile, but I’m hoping it might be of some use. My intention was to use it to start exploring some new ways of using and interacting with the data. The code itself is available at BitBucket. It’s not very elegant, but I don’t want to spend much time cleaning it up at the moment. If it seems like it might be useful, I’ll probably rewrite the whole thing in python and publish it through Google’s AppEngine.

Playing with pipes

The ever-informative Twitter alerted me recently to the History Trust of South Australia’s object of the month. It made me think that it would be nice if there was some way of bringing together all those objects, photos and documents featured by our cultural institutions. Some sort of combined RSS feed perhaps?

Something like this…

Well, yes… I couldn’t resist having a go. My tool of choice for this was Yahoo Pipes which has various modules for manipulating and creating RSS feeds. Check out my script on the Yahoo Pipes site to create a badge like this, play some more or inspect its innards. If you’re feeling adventurous you can even clone the script and tinker away yourself – it’s the best way to learn. Continue reading »
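For anyone who would rather tinker in code than in Pipes, the same basic idea (pool the items from several RSS feeds and sort them newest first) can be sketched with Python’s standard library. This is not how the pipe itself is built; it is a rough equivalent that assumes plain RSS 2.0 feeds with `pubDate` values that include a time zone, and the function name `merge_feeds` is invented for the example.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def merge_feeds(feed_xml_strings):
    """Combine items from several RSS 2.0 documents, newest first."""
    items = []
    for xml_text in feed_xml_strings:
        channel = ET.fromstring(xml_text).find("channel")
        if channel is None:
            continue  # not an RSS 2.0 document, skip it
        source = channel.findtext("title", default="")
        for item in channel.findall("item"):
            pub_date = item.findtext("pubDate")
            if not pub_date:
                continue  # undated items cannot be placed in the timeline
            items.append({
                "source": source,
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                "published": parsedate_to_datetime(pub_date),
            })
    return sorted(items, key=lambda entry: entry["published"], reverse=True)
```

The function takes the feeds as strings, so fetching them (with `urllib.request`, say) is left as a separate step.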

Harvesting context #1: Flickr comments

Instead of idly waiting for visitors to stumble over their holdings on some lonely information by-way, archives are starting to push their content out into the bustling metropolis of the social web. They are going where the people are. Photographic collections, in particular, are gaining new lives and new audiences thanks to Flickr.

But that’s only part of the story. Released into the wild, these photos are slowly picking up the habits of the locals. They are making friends, building connections, even speaking with new accents and dialects. Commented, tagged, organised, linked – they are building new contexts for themselves outside of the cloying control of archival descriptive systems.

Unfortunately it seems there is often a chasm between the old lives of the photos, documented in databases and finding aids, and their new post-institutional careers. This is a pity because the new contexts they are gathering can help us both understand and find them. What can we do to overcome this divide? How could finding aids harvest and display the user-generated content that aggregates around collection items living in the outside world?

The good news is that the tools to start doing this already exist – Flickr has a powerful API that makes it easy to extract photo metadata. Time for a bit of experimenting… Continue reading »
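As a taste of what harvesting might look like, here is a minimal Python sketch that asks Flickr for the comments on a single photo using the `flickr.photos.comments.getList` method. It is an illustration rather than the code behind the full post: the function names are invented, you would need your own API key, and the response fields it reads (`authorname`, `_content`, `permalink`) are the ones I understand the method to return, so check them against the Flickr documentation before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

FLICKR_REST = "https://api.flickr.com/services/rest/"


def comments_url(api_key, photo_id):
    """Build the request address for the comments on one photo."""
    params = {
        "method": "flickr.photos.comments.getList",
        "api_key": api_key,
        "photo_id": photo_id,
        "format": "json",
        "nojsoncallback": 1,  # ask for plain JSON rather than a JSONP wrapper
    }
    return FLICKR_REST + "?" + urlencode(params)


def parse_comments(payload):
    """Reduce a decoded response to author, text and link for each comment."""
    if payload.get("stat") != "ok":
        return []
    comments = payload.get("comments", {}).get("comment", [])
    return [
        {
            "author": comment.get("authorname", ""),
            "text": comment.get("_content", ""),
            "link": comment.get("permalink", ""),
        }
        for comment in comments
    ]


def fetch_comments(api_key, photo_id):
    """Fetch and parse the comments for one photo (needs network access)."""
    with urlopen(comments_url(api_key, photo_id)) as response:
        return parse_comments(json.load(response))
```

The parsed comments could then be stored alongside the item description, or displayed next to it in a finding aid.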

MoA buttons galore

Mapping our Anzacs, in case you don’t know, provides a Google map interface to the 375,000+ WWI service records held by the National Archives of Australia. Amongst other things, you can add scrapbook posts to individual entries and create tributes. It’s meant to encourage exploration, so go on… explore!

If you do, you’ll notice that there are direct links into the National Archives’ database RecordSearch. However, there are currently no links going the other way. Why does this matter? Well perhaps you’d like to use NameSearch to find an individual record, but then add a scrapbook post in Mapping our Anzacs. Up until now you had to find them all over again. But not any more…

Introducing our new range of ‘View in Mapping our Anzacs’ buttons:

  • For the discerning Firefox devotee we have a Greasemonkey userscript which adds a button to the RecordSearch item details page.
  • For the fashion-challenged IE user we have a bookmarklet. Just right click on this link – View in Mapping our Anzacs – and save it as a favourite in your ‘Links’ folder (you may need to enable the ‘Links’ toolbar first by checking Tools > Toolbars > Links).

Yes, it’s true… you could use the Bookmarklet with Firefox (just drag it to your bookmarks toolbar), but Greasemonkey is so much more chic.

Once you’re fully button-enabled just head into RecordSearch, find an item in series B2455 (the WWI service records) and click! Hurrah! You will be instantly transported to Mapping our Anzacs.

You can test out your new button by heading here:

Archives in 3D

All dressed up – RecordSearch has a new look

The new version of my Greasemonkey userscript, RecordSearch Image Tools, gives RecordSearch’s digital image pages a rather new look. My previous version had done away with the tired ol’ ‘lemon-chiffon’ background colour, but I decided it was time to get a bit more adventurous, so I blitzed the old design and rebuilt the page from the beginning.

As you can see from the screenshot, I’ve tried to give the images as much of the screen as possible. I’ve also created a consistent set of navigation buttons, and improved the functionality in various ways. Continue reading »

Pathways to memory

[Contains many broken links – included for historical interest only!]

what is there to know about archives?

In this age of virtual wonders, it seems that our past is rushing towards us. New communication technologies promise greatly improved access to Australia’s cultural heritage. The previous government had hoped to lead us along the aisles of our own “Electronic Smithsonian”, according to its 1995 statement, Innovate Australia [HREF 2]:

…school children will be able over the Internet to read the diaries of Cook and Bligh, Burke and Wills, stories of the Royal Flying Doctor Service in outback Australia, and see the works of Rover Thomas and Arthur Boyd.

In rather less expansive terms, the current government plans a National Cultural Network [HREF 3] that will “simplify and enhance the communication and exchange of cultural and heritage resources, information and ideas”. But where will the material be coming from to fill the virtual display cases? Government statements often point to “libraries, museums and galleries”, but what about archives? Of course we’re meant to assume that archives are somewhere amongst the “cultural and heritage organisations”, and anyway the major libraries collect archival material like diaries, letters and manuscripts. But consigning archives to the ranks of fellow-travellers in the information putsch means that little attention is given to their specific needs and their unique potential. We will have no strategies for ensuring that appropriate forms of access are developed. Instead of delving deeply into our “vast cultural resources” we may simply skim the top, presenting only the familiar in a new digital guise. Instead of an “Electronic Smithsonian” we might end up with an “Electronic Disneyland”. This paper will examine how the World Wide Web might be used to avoid this by facilitating access to Australia’s archival resources – providing pathways for exploring our collective memory. Continue reading »

Mapping scientific memory

How do scientists document their research? As electronic means of communication become the norm, this question has taken on special urgency. If we don’t understand the process of record-keeping within the sciences, we are in danger of losing our scientific memory – with severe legal, financial and cultural consequences. This article introduces the connection between scientific practice and the record-keeping process, indicating how little we know of the technological, administrative and cultural dimensions of this relationship and how it has changed over time. Archival research that analyses this connection will enable the development of strategies to deal with current and future problems. But how can we fund this research?

READ ON ASAPWEB»

En-visioning ASAP

On behalf of ASAP I’d like to welcome you all here to help celebrate our 10th birthday. This is a milestone that, at times, it seemed we might never reach, but here we are, stronger than ever. If you haven’t already guessed, this is a night of rampant self-congratulation, mixed with some myth-making, and perhaps also a little reflection – just how did we make it this far? I believe it had a lot to do with the ‘V’ word – vision. Continue reading »

A world to win

What am I doing here? I work for a non-profit organisation attached to the University of Melbourne. What can I say about “Doing Business on the WWW”? You all know universities have it easy: large IT departments, huge bandwidth connections – it’s a different world! Continue reading »