the shed
hacks

Doing it yourself

I was doing some research using the National Archives of Australia’s RecordSearch database the other day and became frustrated that there is no way of seeing how many pages are in a digitised file without clicking on the ‘Display digital copy’ link. So I fixed it.

As a userscript it’s hardly worthy of a blog post. All it does is find out how many pages are in the file and insert the number in the link text. It’s very simple. But I think it’s also a useful illustration of the changing balance of power between archives and their users.
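To show just how small the tweak is, here is a rough sketch of the link-rewriting step in Python. It is not the userscript itself, which runs in the browser. The function name `add_page_count`, the ‘(12 pages)’ format and the sample markup are all assumptions made for illustration, and the page count is simply passed in rather than looked up.

```python
import re


def add_page_count(html: str, pages: int, label: str = "Display digital copy") -> str:
    """Append a page count to the text of any link whose text is `label`.

    Hypothetical helper: the markup and the output format are assumptions,
    not RecordSearch's actual HTML.
    """
    noun = "page" if pages == 1 else "pages"
    # Match an <a ...> tag whose whole text is the label.
    pattern = re.compile(r"(<a\b[^>]*>)\s*" + re.escape(label) + r"\s*(</a>)")
    return pattern.sub(
        lambda m: f"{m.group(1)}{label} ({pages} {noun}){m.group(2)}", html
    )


snippet = '<a href="#viewer">Display digital copy</a>'
print(add_page_count(snippet, 12))
# prints: <a href="#viewer">Display digital copy (12 pages)</a>
```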

William E Landis argued that archivists were ‘guilty as a profession of fetishising the outputs of our descriptive systems’. The design of finding aids has often been determined not by the needs of users but by a desire to faithfully represent the underlying archival architecture. But now users don’t have to just take what they’re given.

Technologies such as Greasemonkey are useful for sketching out alternatives. For organisations with IT systems that inhibit experimentation, Greasemonkey (or Mozilla’s Jetpack) provides a way of playing with interfaces without touching any of the underlying code. My rewrite of the way RecordSearch displays digitised files is an example of this.

But no one interface is ever going to meet the needs of all archive users. Fortunately, there are a growing number of ways in which archives can work in partnership with their users to help them create the interfaces they want and need.

Archives are starting to expose their data directly using APIs and linked open data. This gives users the power to create whole new applications. But I still think there’ll be a place for the little tweak – a simple hack that meets some small but specific need. I can imagine communities of interest building and sharing a range of tools, hacks, applications and interfaces specifically tailored to their research habits.

So if you don’t like it, fix it.

Some archives hacking

It’s great to see that the National Archives of Australia has released a large swag of data through the new data.australia.gov.au site. In the Commonwealth Agencies zip file you can find XML dumps of all the publicly accessible agency and series data in RecordSearch, as well as item data for series A1. This is the same data that Mitchell Whitelaw visualised so brilliantly in his Visible Archive project. There’s also item data and images from series A3560 – the Mildenhall photographs of early Canberra.

What’s even more exciting is that people are already using this data. At the recent GovHack event in Canberra the What The Federal Government Does team worked on visualising the activities of government by using functions data pulled from the agencies file. Another group has generated a really nice tag cloud and photo gallery from the Mildenhall data. With further GovHack sessions to follow and the MashupAustralia contest open until 13 November, let’s hope for some more inspired archives hacking.

Seeing RecordSearch data out in the world like this reminded me of a little project I started a while back and then set aside. It was a simple PHP script that scraped data from RecordSearch and spat it out either as XML or JSON. Mitchell used a version of this script in his A1 Explorer in order to find out the number of pages in each digitised file.

I’ve now expanded and improved the script so that it provides data on items, series, agencies and persons. The output includes all the basic fields as well as links between entities, such as related series and controlling agencies. As an added bonus you also get some useful totals (where they’re available): items include the number of pages, series include the number of items described on RecordSearch, and agencies include the number of series recorded. I’ve also fiddled with mod_rewrite to provide a more RESTful interface.

For XML output use the URL http://discontents.com.au/shed/rs/xml/ followed by the appropriate identifier – a barcode for an item, a CA number for an agency, a CP number for a person or a series number.

Some examples:

As you might have guessed, to get JSON output you just substitute ‘json’ for ‘xml’ in the URL.
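If you’d rather build those URLs in code, something like the following Python sketch would do it. The base URL and the xml/json switch come from the description above; the function name `record_url` is made up, and percent-encoding the identifier (for identifiers that contain spaces) is an assumption about what the service accepts. Series A1 is used as the example identifier.

```python
from urllib.parse import quote

BASE_URL = "http://discontents.com.au/shed/rs"
FORMATS = ("xml", "json")


def record_url(identifier: str, fmt: str = "xml") -> str:
    """Build a request URL for a barcode, CA number, CP number or series number."""
    if fmt not in FORMATS:
        raise ValueError(f"format must be one of {FORMATS}, got {fmt!r}")
    # Assumption: identifiers are percent-encoded before being appended.
    return f"{BASE_URL}/{fmt}/{quote(identifier.strip(), safe='')}"


print(record_url("A1"))          # http://discontents.com.au/shed/rs/xml/A1
print(record_url("A1", "json"))  # http://discontents.com.au/shed/rs/json/A1
```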

Being dependent on screen scraping, it’s inherently a bit fragile, but I’m hoping it might be of some use. My intention was to use it to start exploring some new ways of using and interacting with the data. The code itself is available at Bitbucket. It’s not very elegant, but I don’t want to spend much time cleaning it up at the moment. If it seems like it might be useful, I’ll probably rewrite the whole thing in Python and publish it through Google’s App Engine.

Playing with pipes

The ever-informative Twitter alerted me recently to the History Trust of South Australia’s object of the month. It made me think that it would be nice if there was some way of bringing together all those objects, photos and documents featured by our cultural institutions. Some sort of combined RSS feed perhaps?

Something like this…

Well, yes… I couldn’t resist having a go. My tool of choice for this was Yahoo Pipes, which has various modules for manipulating and creating RSS feeds. Check out my script on the Yahoo Pipes site to create a badge like this, play some more or inspect its innards. If you’re feeling adventurous you can even clone the script and tinker away yourself – it’s the best way to learn.

Cooliris-enabled scrapbook

There’s more 3D goodness for you to enjoy now that the Mapping our Anzacs scrapbook is Cooliris-enabled. If you have Cooliris installed, you’ll notice that the Cooliris icon on your browser toolbar lights up when you visit the site. Just click on the icon to browse all the photos posted to the scrapbook on a glorious 3D wall.

Scrapbook posts in 3D

(If you don’t have Cooliris then go and get it. It can be used both in Internet Explorer and Firefox, though you’ll probably need to have admin rights to install for IE.)

Having given the 3D treatment to digitised files from the National Archives of Australia and portrait images from the Australian Dictionary of Biography, it wasn’t too hard to do. The scrapbook is a Tumblr site and the API makes it easy to extract all the photos. So I created a PHP file to gather all the details and then write them to a Media RSS file. Then it was just a matter of inserting a link to it in the scrapbook.
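For the curious, here is a rough idea of what writing a Media RSS file involves, sketched in Python rather than the PHP used for the scrapbook. It skips the Tumblr step entirely and assumes the photo details have already been gathered into a list of dictionaries. The function name `build_media_rss`, the dictionary keys and the example.org addresses are all placeholders.

```python
import xml.etree.ElementTree as ET

MEDIA_NS = "http://search.yahoo.com/mrss/"  # Media RSS namespace
ET.register_namespace("media", MEDIA_NS)


def build_media_rss(title: str, link: str, photos: list) -> str:
    """Return a Media RSS document listing each photo as an <item>."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = f"Photos from {title}"
    for photo in photos:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = photo["title"]
        ET.SubElement(item, "link").text = photo["page_url"]
        # The thumbnail and the full-size image go in media: elements.
        ET.SubElement(item, f"{{{MEDIA_NS}}}thumbnail", {"url": photo["thumb_url"]})
        ET.SubElement(item, f"{{{MEDIA_NS}}}content", {"url": photo["image_url"]})
    return ET.tostring(rss, encoding="unicode")


# Placeholder data standing in for whatever the scrapbook returns.
photos = [
    {
        "title": "Example photo",
        "page_url": "http://example.org/post/1",
        "thumb_url": "http://example.org/thumb/1.jpg",
        "image_url": "http://example.org/full/1.jpg",
    }
]
print(build_media_rss("Example scrapbook", "http://example.org/", photos))
```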

ADB DIY RSS

So I was thinking, wouldn’t it be nice if the Australian Dictionary of Biography’s ‘born on this day’ feature could be made available as an RSS feed? Every morning you’d get a new list of biographies delivered direct to your feed reader. And so…

[sounds of xpath wrangling and PHP coding]

here it is.

It’s pretty simple – it harvests all the links of people born on the current day, then loops through the links to gather the first paragraph of each biography. Then it’s just a matter of writing everything to an RSS file.
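The last two steps can be sketched in a few lines of Python. This is not the PHP and XPath version described above. It uses the standard library’s `html.parser` instead, does no fetching, and assumes each biography page has already been downloaded. The names `first_paragraph` and `build_rss`, the dictionary keys and the sample data are invented for the example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class FirstParagraph(HTMLParser):
    """Collect the text of the first <p> element in a page."""

    def __init__(self):
        super().__init__()
        self.inside = False  # True while we are within the first <p>
        self.done = False    # True once the first <p> has closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "p" and not self.done:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "p" and self.inside:
            self.inside = False
            self.done = True

    def handle_data(self, data):
        if self.inside:
            self.parts.append(data)


def first_paragraph(html: str) -> str:
    parser = FirstParagraph()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace left over from the markup.
    return " ".join("".join(parser.parts).split())


def build_rss(feed_title: str, feed_link: str, entries: list) -> str:
    """entries: dicts with 'name', 'url' and 'html' (the downloaded page)."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_title
    for entry in entries:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["name"]
        ET.SubElement(item, "link").text = entry["url"]
        ET.SubElement(item, "description").text = first_paragraph(entry["html"])
    return ET.tostring(rss, encoding="unicode")


# Placeholder entry, not a real biography.
entries = [
    {
        "name": "Example Person",
        "url": "http://example.org/biography/1",
        "html": "<div><p>Born in <b>Sydney</b>.</p><p>Later life.</p></div>",
    }
]
print(build_rss("Born on this day", "http://example.org/", entries))
```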

MoA buttons galore

Mapping our Anzacs, in case you don’t know, provides a Google map interface to the 375,000+ WWI service records held by the National Archives of Australia. Amongst other things, you can add scrapbook posts to individual entries and create tributes. It’s meant to encourage exploration, so go on… explore!

If you do, you’ll notice that there are direct links into the National Archives’ database RecordSearch. However, there are currently no links going the other way. Why does this matter? Well, perhaps you’d like to use NameSearch to find an individual record, but then add a scrapbook post in Mapping our Anzacs. Up until now you had to find them all over again. But not any more…

Introducing our new range of ‘View in Mapping our Anzacs’ buttons:

  • For the discerning Firefox devotee we have a Greasemonkey userscript which adds a button to the RecordSearch item details page.
  • For the fashion-challenged IE user we have a bookmarklet. Just right click on this link – View in Mapping our Anzacs – and save it as a favourite in your ‘Links’ folder (you may need to enable the ‘Links’ toolbar first by checking Tools > Toolbars > Links).

Yes, it’s true… you could use the bookmarklet with Firefox (just drag it to your bookmarks toolbar), but Greasemonkey is so much more chic.

Once you’re fully button-enabled just head into RecordSearch, find an item in series B2455 (the WWI service records) and click! Hurrah! You will be instantly transported to Mapping our Anzacs.

You can test out your new button by heading here:

Archives in 3D

All dressed up – RecordSearch has a new look

The new version of my Greasemonkey userscript, RecordSearch Image Tools, gives RecordSearch’s digital image pages a rather new look. My previous version had done away with the tired ol’ ‘lemon-chiffon’ background colour, but I decided it was time to get a bit more adventurous, so I blitzed the old design and rebuilt the page from the beginning.

As you can see from the screenshot, I’ve tried to give the images as much of the screen as possible. I’ve also created a consistent set of navigation buttons, and improved the functionality in various ways.

RecordSearch tools broken!?

BREAKING NEWS (2.00pm, Monday, 8 December): RecordSearch seems to be back on the old subdomain, so now the userscript fix is not working! To be safe, I’ve updated the userscript again so that it will work on both the old and new subdomains. I’ll do the same with the Zotero translator, though for the time being it should be working. If you updated the userscript in the last few hours, you’d better do it again – sorry…