Have collection, will travel

A few years ago it seemed fashionable for cultural institutions to create a ‘My [insert institution name]’ space on their website where visitors could create their own online exhibits from collection materials. It bothered me at the time because it seemed to be a case of creating silos within silos. What could people do with their collections once they’d assembled them?

I was reminded of this recently as I undertook my Christmas-break mini project to think about Trove integration with Zotero. Some years ago I created a Zotero translator for the Trove newspaper zone, but I’d much rather we just exposed metadata within Trove pages that Zotero (and other tools like Hypothes.is and Mendeley) could pick up without any special code. More about that soon…

However. embedded metadata only addresses part of the problem — there’s also questions around tools and workflows. Trove includes a number of simple tools that enable users to annotate and organise resources — tags, comments, and lists. Tags and comments… well you know what they are. Lists are just collections of resources, and like tags, they can be public or private.

They may not be terribly exciting tools, but they are very heavily used. More than 2 million tags have been added by Trove users, but it’s lists that have shown the most growth in recent times. There are currently more than 47,000 lists and 30,000 of those are public. That’s a pretty impressive exercise in collection building. What are the lists about? A few months ago I harvested the titles of all public lists and threw them into Voyant for a quick word frequency check.

Word frequencies in the titles of Trove lists
Word frequencies in the titles of Trove lists

Given what we know about Trove users, it wasn’t surprising to see that many of the lists related to family history, but there are a few wonderful oddities buried in there as well. I love to be surprised by the passions of Trove users.

I suspect these tools are popular because they’re simple and open-ended. There are few constraints on what you can use for a tag or add to a list. Following some threads about game design recently I came upon a discussion of ‘underspecified’ tools — ‘the use of which you can never fully predict’. By underspecifying we leave open possibilities for innovation, experimentation and play. It seems like a pretty good design approach for the cultural heritage sector.

But wait a minute, you might be wondering, what sort of Trove Manager magic did I have to weave in order to extract those thousands of list titles? None, none at all. You could do exactly the same thing.

I’ve been talking a lot in recent months about Trove as a platform rather than a website — something to build on. One of our main construction tools is, of course, the Trove API. I suppose a good cultural heritage API is also underspecified — focused enough to be useful, but fuzzy enough to encourage a bit of screwing around. What you may not know about the Trove API is that as well as giving you access to around 300 million resources, it lets you access user comments, tags and lists.

I’m looking forward to researchers using the API to explore the various modes of meaning-making that occur around resources in Trove. But right now one thing it offers is portability — the collections people make can be moved. And that brings us back to Zotero.

Why should a Trove user have to decide up front whether they want to use Zotero or create a Trove list? Pursuing Europeana’s exciting vision of putting collections in our workflows we need to recognise that workflows change, projects grow, and new uses emerge. We should support and encourage this by making it as easy as possible for people to move their stuff around.

So of course I had to build something.

My Christmas project has resulted in some Python code that lets you export a Trove list or tag to a Zotero collection — API to API. Again it’s a simple idea, but I think it opens up some interesting possibilities for things like collaborative tagging projects — with a few lines of code hundreds of tagged items could be saved to Zotero for further organisation or annotation.

Along the way I ended up starting a more general Trove-Python library — it’s very incomplete, but it might be useful to someone. It’s all on GitHub — shared, as usual, not because I think the code is very good, but because I think it’s really important to share examples of what’s possible. Hopefully someone will find a slender spark of inspiration in my crappy code and build something better. Needless to say, this isn’t an official Trove product.

So what do you do if you want to export a list or tag?

First of all get yourself Python 2.7 and set up a virtualenv where you can play without messing anything up. Then install my library…

git clone https://github.com/Trove-Toolshed/trove-python.git
cd trove-python
python setup.py install

You’ll also need to install PyZotero. Once that’s done you can fire up Python and export a list from the command line like this…

from pyzotero import zotero
from trove_python.trove_core import trove
from trove_python.trove_zotero import export

zotero_api = zotero.Zotero('[Your Zotero user id]', 'user', '[Your Zotero API key]')
trove_api = trove.Trove('[Your Trove API key]')

export.export_list(list_id='[Your Trovelist id]', zotero_api=zotero_api, trove_api=trove_api)

Obviously you’ll also need to get yourself a Trove API key, and a Zotero key for your user account.

Exporting items with a particular tag is just as easy…

from pyzotero import zotero
from trove_python.trove_core import trove
from trove_python.trove_zotero import export

zotero_api = zotero.Zotero('[Your Zotero user id]', 'user', '[Your Zotero API key]')
trove_api = trove.Trove(['Your Trove API key'])
exporter = export.TagExporter(trove_api, zotero_api)

exporter.export('[Your tag]')

What do you end up with? Here’s my test list on Trove and the resulting Zotero collection. Here’s a set of resources tagged with ‘inigo’, and here’s the collection I created from them. You’ll notice that I added a few little extras, like attaching pdf copies where they’re available.

Sorry, no GUIs, no support, and not much documentation. Just a bit of rough code and some ideas to play with.

This work is licensed under a Creative Commons Attribution 4.0 International License.

Tim Sherratt Written by:

I'm a historian and hacker who researches the possibilities and politics of digital cultural collections.


Leave a Reply

Your email address will not be published. Required fields are marked *