<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>discontents</title>
	<atom:link href="http://discontents.com.au/feed" rel="self" type="application/rss+xml" />
	<link>http://discontents.com.au</link>
	<description>working for the triumph of content over form, ideas over control, people over systems</description>
	<lastBuildDate>Wed, 21 Jul 2010 23:24:54 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>THATCamp is coming to Australia</title>
		<link>http://discontents.com.au/shed/events/thatcamp-is-coming-to-australia</link>
		<comments>http://discontents.com.au/shed/events/thatcamp-is-coming-to-australia#comments</comments>
		<pubDate>Wed, 21 Jul 2010 23:21:33 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[events]]></category>
		<category><![CDATA[digital humanities]]></category>
		<category><![CDATA[thatcamp]]></category>
		<category><![CDATA[unconference]]></category>

		<guid isPermaLink="false">http://discontents.com.au/?p=960</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=THATCamp+is+coming+to+Australia&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=events&amp;rft.source=discontents&amp;rft.date=2010-07-22&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shed/events/thatcamp-is-coming-to-australia&amp;rft.language=English"></span>

One of the things that&#8217;s keeping me busy at the moment is THATCamp Canberra. Yes, I got sick of missing out on all the THATCamp fun happening elsewhere and decided we should have our own.

THATCamp Canberra is a user-generated unconference on the digital humanities. It&#8217;ll be held at the University of Canberra on 28–29 August. [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=THATCamp+is+coming+to+Australia&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=events&amp;rft.source=discontents&amp;rft.date=2010-07-22&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shed/events/thatcamp-is-coming-to-australia&amp;rft.language=English"></span>
<abbr class="unapi-id" title="http://discontents.com.au/?p=960"><!-- &nbsp; --></abbr>
<p>One of the things that&#8217;s keeping me busy at the moment is <a href="http://thatcampcanberra.org/">THATCamp Canberra</a>. Yes, I got sick of missing out on all the <a href="http://thatcamp.org/">THATCamp fun</a> happening elsewhere and decided we should have our own.</p>
<p><a href="http://thatcampcanberra.org"><img class="aligncenter size-medium wp-image-963" title="thatcamp_cbr_logo" src="http://discontents.com.au/wp-content/uploads/2010/07/thatcamp_cbr_logo-300x250.jpg" alt="" width="300" height="250" /></a></p>
<p>THATCamp Canberra is a user-generated unconference on the digital humanities. It&#8217;ll be held at the University of Canberra on 28–29 August. We&#8217;re getting a great mix of applications and I&#8217;m really looking forward to learning about what&#8217;s going on around Australia.</p>
<p>Applications close on 23 July, so get yours in soon!</p>
]]></content:encoded>
			<wfw:commentRss>http://discontents.com.au/shed/events/thatcamp-is-coming-to-australia/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Embedded archives</title>
		<link>http://discontents.com.au/shed/hacks/embedded-archives</link>
		<comments>http://discontents.com.au/shed/hacks/embedded-archives#comments</comments>
		<pubDate>Sun, 27 Jun 2010 12:00:17 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[hacks]]></category>
		<category><![CDATA[archives]]></category>
		<category><![CDATA[Cooliris]]></category>
		<category><![CDATA[recordsearch]]></category>

		<guid isPermaLink="false">http://discontents.com.au/?p=932</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Embedded+archives&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=hacks&amp;rft.source=discontents&amp;rft.date=2010-06-27&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shed/hacks/embedded-archives&amp;rft.language=English"></span>

Some of you may have noticed that my Hacking a research project post featured a file from the National Archives of Australia embedded as a Cooliris widget. Huh? To jog your memory, here it is again:
No, it&#8217;s not just an image, it&#8217;s a little 3D wall. You can pan and zoom to your heart&#8217;s content. [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Embedded+archives&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=hacks&amp;rft.source=discontents&amp;rft.date=2010-06-27&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shed/hacks/embedded-archives&amp;rft.language=English"></span>
<abbr class="unapi-id" title="http://discontents.com.au/?p=932"><!-- &nbsp; --></abbr>
<p>Some of you may have noticed that my <a href="http://discontents.com.au/shed/experiments/hacking-a-research-project">Hacking a research project</a> post featured a file from the <a href="http://naa.gov.au/">National Archives of Australia</a> embedded as a <a href="http://cooliris.com/">Cooliris</a> widget. Huh? To jog your memory, here it is again:</p>
<div class="wp-caption aligncenter" style="width: 470px">
<img style="visibility:hidden;width:0px;height:0px;" border=0 width=0 height=0 src="http://counters.gigya.com/wildfire/IMP/CXNID=2000002.11NXC/bT*xJmx*PTEyNzY3NzEwMDA5MjQmcHQ9MTI3Njc3MTAwNTYyOSZwPTkwMjA1MSZkPSZnPTEmb2Y9MA==.gif" /><object id="ci_10145_o" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" width="460" height="300"><param name="movie" value="http://apps.cooliris.com/embed/cooliris.swf"/><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><param name="bgColor" value="#121212" /><param name="flashvars" value="feed=http%3A%2F%2Fwraggelabs.com%2Frecordsearch%2Frss%2F7473965%2F%3Fpages%3D70%26ref%3DST84%2F1%2C%25201906%2F221-230&numrows=2" /><param name="wmode" value="opaque" /><embed id="ci_10145_e" type="application/x-shockwave-flash" src="http://apps.cooliris.com/embed/cooliris.swf" width="460" height="300" allowFullScreen="true" allowScriptAccess="always" bgColor="#121212" flashvars="feed=http%3A%2F%2Fwraggelabs.com%2Frecordsearch%2Frss%2F7473965%2F%3Fpages%3D70%26ref%3DST84%2F1%2C%25201906%2F221-230&numrows=2" wmode="opaque"></embed></object>
<p class="wp-caption-text">These certificates allowed non-white Australians travelling overseas to re-enter the country. NAA: ST84/1, 1906/21-30</p></div>
<p>No, it&#8217;s not just an image, it&#8217;s a little 3D wall. You can pan and zoom to your heart&#8217;s content. You can enlarge an image, view fullscreen &#8212; you can even share an image via Twitter. Fun for all the family!</p>
<p>Regular viewers will recall my previous encounters with CoolIris &#8212; <a href="http://discontents.com.au/shoebox/archives-shoebox/archives-in-3d">Archives in 3D</a> and <a href="http://discontents.com.au/shed/hacks/cooliris-enabled-scrapbook">CoolIris enabled scrapbook</a> &#8212; but these relied on having the CoolIris plugin installed. The embeddable Flash version wouldn&#8217;t work when the images were coming from the NAA because it upset Flash&#8217;s cross-domain settings.</p>
<p>So how did I get it to work? For various other projects I&#8217;ve been playing with simple image proxies using Python and Django, so I just applied the same principles. The image proxy makes it seem as if the images are coming from a local source, thus keeping Flash happy. Hurrah!</p>
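<p>To make the proxy idea concrete, here is a minimal sketch using only the Python standard library rather than Django. It is not the code running at Wragge Labs: the <code>/proxy/</code> prefix, the host names and the function names are all invented for illustration. The point is simply that the page hands Flash a local-looking URL, and the server maps it back to the real image and fetches it.</p>

```python
from urllib.parse import quote, unquote, urlsplit
from urllib.request import urlopen

# Hypothetical values for illustration; none of these come from the post.
PROXY_PREFIX = "http://example.org/proxy/"
ALLOWED_HOSTS = {"images.example.gov.au"}


def to_proxy_url(remote_url):
    """Rewrite a remote image URL so it appears to come from the local site."""
    return PROXY_PREFIX + quote(remote_url, safe="")


def from_proxy_url(proxy_url):
    """Recover the remote URL, refusing hosts that are not on the allow-list."""
    if not proxy_url.startswith(PROXY_PREFIX):
        raise ValueError("not a proxy URL")
    remote_url = unquote(proxy_url[len(PROXY_PREFIX):])
    if urlsplit(remote_url).hostname not in ALLOWED_HOSTS:
        raise ValueError("host not allowed")
    return remote_url


def fetch_image(proxy_url):
    """Fetch the remote image and return (content_type, bytes) to serve locally."""
    with urlopen(from_proxy_url(proxy_url), timeout=10) as response:
        return response.headers.get_content_type(), response.read()
```

<p>The allow-list is a design choice worth keeping in any real version: without it, the proxy would happily fetch any URL it was handed.</p>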
<p>I&#8217;ve added a few little tweaks, so you can now view any digitised file in the National Archives of Australia in a CoolIris wall. Just go to the <a href="http://wraggelabs.com/recordsearch/wall/">file browser page</a> and enter a barcode. Even better, you can install a bookmarklet. Just drag this link to your bookmarks bar (or save as a favourite) &#8212; <a href="javascript:(function(){window.location='http://wraggelabs.com/recordsearch/wall/'+document.evaluate('//td[b=&quot;Barcode&quot;]',document,null,XPathResult.FIRST_ORDERED_NODE_TYPE,null).singleNodeValue.lastChild.textContent})();">View on wall</a>. Then go to an item page in <a href="http://naa.gov.au/collection/recordsearch/index.aspx">RecordSearch</a> and click on the bookmarklet for 3D magic.</p>
<p>If you want to share a link to a file displayed in the 3D file browser, just use a url of the form:</p>
<p><code>http://wraggelabs.com/recordsearch/wall/[barcode]</code></p>
<p> &#8212; where [barcode] is fairly obviously the barcode of the file you want to view. For example:</p>
<ul>
<li><a href="http://wraggelabs.com/recordsearch/wall/3445411/">http://wraggelabs.com/recordsearch/wall/3445411/</a></li>
</ul>
<p>If you want to embed one of the mini-walls in your blog post it&#8217;s easy. Just go to the <a href="http://www.cooliris.com/yoursite/express/">CoolIris Express</a> site and create your own wall. When it asks you for content source, click on &#8216;Media RSS&#8217; and then in the &#8216;Feed URL&#8217; box put:</p>
<p><code>http://wraggelabs.com/recordsearch/rss/[barcode]</code></p>
<p>&#8211; where [barcode] is&#8230; well, you know&#8230;</p>
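<p>If you are generating these links in bulk, the two URL patterns above are easy to wrap in a couple of helpers. This is only a sketch: the function names are made up, and the check that a barcode is all digits is an assumption based on the examples in this post. The wall link follows the example above (with its trailing slash) and the feed link follows the pattern as written.</p>

```python
BASE = "http://wraggelabs.com/recordsearch"


def _clean(barcode):
    """Assumption: barcodes are numeric, as in the examples in the post."""
    barcode = str(barcode).strip()
    if not barcode.isdigit():
        raise ValueError("expected a numeric barcode")
    return barcode


def wall_url(barcode):
    """Link to the 3D file browser for one digitised file."""
    return "{}/wall/{}/".format(BASE, _clean(barcode))


def feed_url(barcode):
    """Media RSS feed to paste into the 'Feed URL' box."""
    return "{}/rss/{}".format(BASE, _clean(barcode))
```

<p>For example, <code>wall_url(3445411)</code> gives the link shown in the list above.</p>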
<p>I think this is a pretty interesting way to view, browse and navigate digitised files. Using Flash rather than a browser plugin makes it more accessible, but I&#8217;d still rather have something based on open software and standards. I think it won&#8217;t be too long before we see something similar using Canvas and JavaScript. That&#8217;ll be really exciting.</p>
]]></content:encoded>
			<wfw:commentRss>http://discontents.com.au/shed/hacks/embedded-archives/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hacking a research project</title>
		<link>http://discontents.com.au/shed/experiments/hacking-a-research-project</link>
		<comments>http://discontents.com.au/shed/experiments/hacking-a-research-project#comments</comments>
		<pubDate>Thu, 17 Jun 2010 13:49:22 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[experiments]]></category>
		<category><![CDATA[archives]]></category>
		<category><![CDATA[crowdsourcing]]></category>
		<category><![CDATA[invisibleaustralians]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[White Australia]]></category>

		<guid isPermaLink="false">http://discontents.com.au/?p=878</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Hacking+a+research+project&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=experiments&amp;rft.source=discontents&amp;rft.date=2010-06-17&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shed/experiments/hacking-a-research-project&amp;rft.language=English"></span>

Amongst the holdings of the National Archives of Australia are some of the most visually arresting documents you&#8217;ll see &#8212; thousands and thousands of forms from the early decades of the twentieth century, each with a portrait photograph and palm print, each documenting the movements of a non-white resident. Along with many other certificates, regulations, [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Hacking+a+research+project&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=experiments&amp;rft.source=discontents&amp;rft.date=2010-06-17&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shed/experiments/hacking-a-research-project&amp;rft.language=English"></span>
<abbr class="unapi-id" title="http://discontents.com.au/?p=878"><!-- &nbsp; --></abbr>
<p>Amongst the holdings of the National Archives of Australia are some of the most visually arresting documents you&#8217;ll see &#8212; thousands and thousands of forms from the early decades of the twentieth century, each with a portrait photograph and palm print, each documenting the movements of a non-white resident. Along with many other certificates, regulations, correspondence and case files, these forms are part of the massive bureaucratic legacy of the White Australia Policy.</p>
<div class="wp-caption aligncenter" style="width: 470px">
<img style="visibility:hidden;width:0px;height:0px;" border=0 width=0 height=0 src="http://counters.gigya.com/wildfire/IMP/CXNID=2000002.11NXC/bT*xJmx*PTEyNzY3NzEwMDA5MjQmcHQ9MTI3Njc3MTAwNTYyOSZwPTkwMjA1MSZkPSZnPTEmb2Y9MA==.gif" /><object id="ci_10145_o" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" width="460" height="300"><param name="movie" value="http://apps.cooliris.com/embed/cooliris.swf"/><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><param name="bgColor" value="#121212" /><param name="flashvars" value="feed=http%3A%2F%2Fwraggelabs.com%2Frecordsearch%2Frss%2F7473965%2F%3Fpages%3D70%26ref%3DST84%2F1%2C%25201906%2F221-230&numrows=2" /><param name="wmode" value="opaque" /><embed id="ci_10145_e" type="application/x-shockwave-flash" src="http://apps.cooliris.com/embed/cooliris.swf" width="460" height="300" allowFullScreen="true" allowScriptAccess="always" bgColor="#121212" flashvars="feed=http%3A%2F%2Fwraggelabs.com%2Frecordsearch%2Frss%2F7473965%2F%3Fpages%3D70%26ref%3DST84%2F1%2C%25201906%2F221-230&numrows=2" wmode="opaque"></embed></object>
<p class="wp-caption-text">These certificates allowed non-white Australians travelling overseas to re-enter the country. NAA: ST84/1, 1906/21-30</p></div>
<p>But these are more than just interesting looking pieces of paper, they are snapshots of people&#8217;s lives. The forms capture data about an individual&#8217;s place of birth, physical characteristics and more. Over time a person might have submitted several of these forms, so by bringing them together we could trace their history, we could map their journeys &#8212; we could even watch them age.</p>
<p>The system which sought to render non-whites invisible has captured and preserved the outlines of their lives. By extracting and linking this data we could build a picture of another Australia, an Australia in which non-white residents lived, loved, struggled and succeeded, despite the impositions of a repressive regime.</p>
<p>I talked about these records at the <a href="http://theaahc.org/conferences/2009conference/">AAHC conference</a> last year, inspired in part by Tim Hitchcock&#8217;s chapter in the <em>Virtual Representation of the Past</em>. Tim Hitchcock argues that technology can allow us to restructure archives, looking beyond institutional hierarchies to the lives of individuals contained within:</p>
<blockquote><p>What changes when we examine the world through the collected fragments of knowledge that we can recover about a single person, reorganised as a biographical narrative, rather than as part of an archival system?
</p></blockquote>
<p>I don&#8217;t know, but I&#8217;d like to find out.</p>
<p>During my AAHC talk, Dave Lester suggested that the extraction of data from these forms might make a good crowdsourcing project. It&#8217;s a great idea. As you can see, the data is generally well-structured and legible, so it should be possible to construct a simple series of forms that would allow volunteers to transcribe the data. The next stage would be to try and match identities across forms. That&#8217;s more complicated, but projects such as Tim Hitchcock&#8217;s <a href="http://www.londonlives.org/">London Lives</a> show how users can construct identities by connecting a range of historical documents.</p>
<p>Then there are connections to resources outside of the archives &#8212; photographs, local histories, newspapers, genealogies, cemetery registers and more. By keeping our system open and extensible, and by working with others to help them expose their information in standard ways, it should be possible to develop the framework for an evolving mesh of biographical data.</p>
<p>So, how do we get started? This is the point when you usually have to start thinking about money &#8212; how can I fund this? In Australia that generally means a journey into the arcane world of the Australian Research Council. The ARC suffers from all the problems of a peer-reviewed system, but added to this is a rather antiquated notion of what research is.</p>
<p>In the rules covering each of the main schemes it&#8217;s clearly stated that the &#8216;compilation of data&#8217; and the &#8216;development of research aids or tools&#8217; are not supported. I spend part of my life working for the <a href="http://ands.org.au/">Australian National Data Service</a>, an organisation that seeks to highlight how the sharing and reuse of data can open up new research possibilities. The ARC, however, seems to think that data has little value beyond its original research context.</p>
<p>Of course you can still mount a case for such activities. Applicants for a &#8216;Discovery&#8217; grant can argue that data creation is integral to their project and provide details of the &#8216;specific research questions to be addressed&#8217;. But what if you don&#8217;t yet know what the questions are? Part of the point of a project such as this is to try and find out what questions <em>we are able</em> to ask. Until we start to compile, link and explore the data, the &#8216;specific research questions&#8217; will be little more than convenient fictions, dreamt up to satisfy the prodding of peer reviewers.</p>
<p>Tom Scheinfeldt wrote a <a href="http://www.foundhistory.org/2010/05/12/wheres-the-beef-does-digital-humanities-have-to-answer-questions/">fantastic blog post</a> recently, responding to concerns about the failure of many digital humanities projects to make arguments or answer questions. Drawing examples from the history of science, Tom argues:</p>
<blockquote><p>we need to make room for both kinds of digital humanities, the kind that seeks to make arguments and answer questions now and the kind that builds tools and resources with questions in mind, but only in the back of its mind and only for later. We need time to experiment and even&#8230; time to play.</p></blockquote>
<p>The ARC does not fund play.</p>
<p>You might imagine that the ARC&#8217;s infrastructure funding scheme would offer more hope for a project such as this. And yes, there are many worthy projects involving databases and online tools that have been supported in this way (and I have benefited from some of them!). But it seems that in the minds of research funders infrastructure is always BIG. Grants start at $150,000, and applications are expected to involve multiple institutional partners. Projects have to be scaled up to fit the ARC&#8217;s definition of infrastructure, often resulting in complex, lumbering, long-term projects whose products are out of date by the time of their release.</p>
<p>There is no room in our current infrastructure models for agile, innovative, user-focused digital toolmakers seeking small amounts to experiment with apps, prototypes, datasets or visualisations. I often look with envy upon the US National Endowment for the Humanities <a href="http://www.neh.gov/grants/guidelines/digitalhumanitiesstartup.html">Digital Humanities Start-Up Grants</a>.</p>
<p>In any case, neither I nor my partner in this endeavour, Kate Bagnall (<a href="http://twitter.com/baibi">@baibi</a>), are currently in academic positions, so our chances of gaining any sort of research funding are next to none. We have the expertise &#8212; Kate has spent many years researching Australian-Chinese families and knows the records back-to-front, while I just can&#8217;t help playing with biographical data &#8212; but is that enough? How can you mount an ongoing research project without institutional support, research funding and the various badges and signifiers of academic authority?</p>
<p>I don&#8217;t know that either, but I have some ideas.</p>
<div id="attachment_918" class="wp-caption aligncenter" style="width: 222px"><a href="http://discontents.com.au/wp-content/uploads/2010/06/cedt.jpeg"><img src="http://discontents.com.au/wp-content/uploads/2010/06/cedt_photo-212x300.jpg" alt="Ah Yin Pak Chong" title="cedt_photo" width="212" height="300" class="size-medium wp-image-918" /></a><p class="wp-caption-text">Mrs Ah Yin Pak Chong. NAA: ST84/1, 1907/321-330</p></div>
<p>I didn&#8217;t manage to get a contribution together for Dan Cohen and Tom Scheinfeldt&#8217;s crowdsourced-in-a-week book, <a href="http://hackingtheacademy.org/">Hacking the Academy</a>, but watching the process from afar I did begin to wonder about how we might hack the way we build and run major research projects. This is what I have in mind:</p>
<ul>
<li>To strip down the large, lumbering beasts and design projects that are modular and opportunistic &#8212; able to grow quickly when resources allow, to bolt on related projects, to absorb existing tools.</li>
<li>To follow the data freely across technological and institutional boundaries, developing open networks that invite participation and use.</li>
<li>To develop a floating pool of collaborators, both inside and outside of academia, who are able to come and go, contributing whatever and whenever they can.</li>
<li>To make everything public, accessible and standards-compliant, so that even if the project stalls it could be picked up and developed by someone else.</li>
</ul>
<p>Most of all I just want to be able to do it. I don&#8217;t want to second-guess the ARC. I don&#8217;t want to spend months negotiating with potential partners or begging for an institutional home. I want to build, experiment and play. I want to make a start.</p>
<p>So that&#8217;s what we&#8217;re going to do.</p>
<p>We have a topic, plenty of raw materials, some basic principles and the beginnings of a plan. We even have a name &#8212; <em>Invisible Australians: Living under the White Australia Policy</em>. </p>
<p>As the project develops, I&#8217;ll be blogging here about some of the technical stuff, while Kate will be exploring the content over at <a href="http://chineseaustralia.org/">the tiger&#8217;s mouth</a>. I hope to have a prototype of the transcription tool ready to demo at <a href="http://thatcampcanberra.org/">THATCamp Canberra</a>, while Kate is already at work putting together guides on using the records and developing an <a href="http://omeka.org">Omeka</a> site that follows a number of Chinese-Australian families through the archives.</p>
<p>Can we hack together a major research project? Let&#8217;s find out. </p>
]]></content:encoded>
			<wfw:commentRss>http://discontents.com.au/shed/experiments/hacking-a-research-project/feed</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>(a not so) Quick catch up</title>
		<link>http://discontents.com.au/shed/a-not-so-quick-catch-up</link>
		<comments>http://discontents.com.au/shed/a-not-so-quick-catch-up#comments</comments>
		<pubDate>Fri, 07 May 2010 15:37:13 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[experiments]]></category>
		<category><![CDATA[the shed]]></category>
		<category><![CDATA[biographies]]></category>
		<category><![CDATA[Flickr]]></category>
		<category><![CDATA[games]]></category>
		<category><![CDATA[greasemonkey]]></category>
		<category><![CDATA[identities]]></category>
		<category><![CDATA[machine tags]]></category>
		<category><![CDATA[newspapers]]></category>
		<category><![CDATA[People Australia]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[userscripts]]></category>

		<guid isPermaLink="false">http://discontents.com.au/?p=843</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=%28a+not+so%29+Quick+catch+up&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=experiments&amp;rft.subject=the+shed&amp;rft.source=discontents&amp;rft.date=2010-05-08&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shed/a-not-so-quick-catch-up&amp;rft.language=English"></span>

The trained guinea pigs in the Wragge Labs bunker have been churning out all sorts of stuff in the last few months, and I&#8217;m way behind in my attempts to document their activities. So this is a bit of a catch-up post to try and commit a few pertinent details to the collective memory bank [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=%28a+not+so%29+Quick+catch+up&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=experiments&amp;rft.subject=the+shed&amp;rft.source=discontents&amp;rft.date=2010-05-08&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shed/a-not-so-quick-catch-up&amp;rft.language=English"></span>
<abbr class="unapi-id" title="http://discontents.com.au/?p=843"><!-- &nbsp; --></abbr>
<p>The trained guinea pigs in the Wragge Labs bunker have been churning out all sorts of stuff in the last few months, and I&#8217;m way behind in my attempts to document their activities. So this is a bit of a catch-up post to try and commit a few pertinent details to the collective memory bank before they are lost forever in the sleep-deprived fog of day-to-day existence.</p>
<h3>Identity upgrades</h3>
<p>There have been a number of major improvements to <a href="http://wraggelabs.com/identities/">Wragge&#8217;s Identity Browser</a>. Regular viewers will recall that the Identity Browser is built on top of the <a href="http://www.nla.gov.au/apps/srw/search/peopleaustralia">People Australia SRU interface</a>. You might not realise, however, that People Australia contains details of many organisations as well as people. We can only be thankful that it wasn&#8217;t called Entity Australia.</p>
<p>The first version of my Identity Browser only searched for people, but now all your corporate-entity-identification needs are also met, with only a few minor changes to the interface so-beloved by numerous generations of identity seekers. To be specific, through the wonders of drop-down technology you can choose whether you want to search for a person or an organisation. Or not. You can also just ignore that and search for everything and get back sensible results anyway. It&#8217;s your choice. Or not.</p>
<div id="attachment_864" class="wp-caption aligncenter" style="width: 310px"><a href="http://wraggelabs.com/identities/"><img class="size-medium wp-image-864" title="identities" src="http://discontents.com.au/wp-content/uploads/2010/05/identities-300x77.jpg" alt="" width="300" height="77" /></a><p class="wp-caption-text">Gaze in awe at the power of my dropdown</p></div>
<p>Ah pattern matching&#8230; there are few phrases so redolent of warm summer days, hidden pleasures, and the subtle delights of wildcard characters. The People Australia SRU interface was sadly lacking in the pattern matching department, but this has now been rectified. So now you can mix your stems and asterisks with wild abandon. Searching for &#8216;Curtin, J*&#8217; will now retrieve all those Curtins whose first names begin with &#8216;J&#8217;. Amazing, isn&#8217;t it?</p>
<p>Astonishing too is the fact that the accompanying &#8216;Identify me!&#8217; bookmarklet continues to function with nary a murmur of protest. There is, however, a little bit of cleverness built in to enhance your bookmarklet experience. If the text that you highlight has a comma in it, the Identity Browser will conclude that you&#8217;re feeding it the name of a person – ie Surname, Firstname – and will treat the Firstname as a stem. So if you highlight &#8216;Whitlam, G&#8217; and click on the bookmarklet, the Identity Browser will be kick-started into life, searching for everything that matches surname equals &#8216;Whitlam&#8217; and firstname is like &#8216;G*&#8217;. If there&#8217;s no comma – ie firstname secondname – then it heads off to look for either a person whose surname equals &#8216;secondname&#8217; and whose firstname is like &#8216;firstname*&#8217;, or an organisation whose name includes both &#8216;firstname&#8217; and &#8216;secondname&#8217;. Got all that?</p>
<p>Basically the idea was to try and provide some sensible defaults so you really don&#8217;t have to think about it too much.</p>
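<p>The defaults described above can be written down in a few lines. What follows is a sketch of the decision logic only, not the Identity Browser&#8217;s actual code: the function names and the dictionaries it returns are invented, nothing here talks to the People Australia SRU interface, and the handling of single words and of names with more than two words is an assumption the post does not cover.</p>

```python
from fnmatch import fnmatchcase


def stem(name):
    """Treat a name as a stem by adding a trailing wildcard if it lacks one."""
    return name if name.endswith("*") else name + "*"


def plan_search(highlighted):
    """Describe the searches to run for a piece of highlighted text."""
    text = " ".join(highlighted.split())
    if "," in text:
        # 'Surname, Firstname': search for a person, first name as a stem.
        surname, firstname = [part.strip() for part in text.split(",", 1)]
        return [{"type": "person", "surname": surname, "firstname": stem(firstname)}]
    words = text.split()
    if len(words) < 2:
        # Assumption: with a single word, just search everything.
        return [{"type": "any", "words": words}]
    # 'firstname secondname': try a person, or an organisation with both words.
    # Assumption: with more than two words, the last one is the surname.
    return [
        {"type": "person", "surname": words[-1], "firstname": stem(" ".join(words[:-1]))},
        {"type": "organisation", "words": words},
    ]


def is_like(value, pattern):
    """Case-insensitive wildcard match, so 'John' is like 'J*'."""
    return fnmatchcase(value.lower(), pattern.lower())
```

<p>Highlighting &#8216;Whitlam, G&#8217; yields a single person search with firstname &#8216;G*&#8217;, while &#8216;John Curtin&#8217; yields both a person search and an organisation search.</p>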
<p>I have it in my head to prepare a long and rapturous homage to the wonders of machine tags. With their sly semantic ways and easy-going nature, they offer some exciting possibilities not just for user-generated content, but user-generated meanings and user-generated relationships. But for the full, ripe pleasure of that post you will have to wait another day. For now I shall simply say that, as well as RDFa, the Identity Browser provides automagically-generated machine tags.</p>
<p>Where might you use them? Flickr&#8217;s a good place to start. Try identifying the subjects and creators of Flickr photos. At the NSW Reference and Information Services Group Seminar the other day I challenged those in attendance to go forth and machine tag. Already more than 100 machine tags have been added to Flickr using my Identity Browser. Expect to hear more about the Great Flickr Machine Tag Challenge soon&#8230;</p>
<p>One more thing&#8230; try adding &#8216;.rdf&#8217; on to the end of an identity record – eg <a href="http://wraggelabs.com/identities/person/612109.rdf">http://wraggelabs.com/identities/person/612109.rdf</a>. Just an experiment at the moment&#8230;</p>
<h3>More machine tag love</h3>
<p>One night on Twitter, <a href="http://twitter.com/lifeasdaddy">@lifeasdaddy</a> pointed out that someone had started using fragments of URLs from the <a href="http://trove.nla.gov.au/newspaper">NLA newspapers site</a> as tags in the <a href="http://www.powerhousemuseum.com/collection/database/?irn=244414">Powerhouse Museum&#8217;s collection database</a>. In the conversation that ensued with <a href="http://twitter.com/sebchan">@sebchan</a> and others, I suggested that the PHM could encourage this sort of rich tagging by supporting machine tags, with all their wonderful juicy semantic goodness. The guinea pigs got excited as well, and before I knew it, they&#8217;d constructed a little <a href="http://semweb-helper.appspot.com/">Semweb Helper app</a>.</p>
<p>The Semweb Helper comes with its very own custom-tailored bookmarklet. If you find an article on the NLA newspapers site that you&#8217;d like to point to, just click on the bookmarklet and marvel as a range of useful machine tags are automagically generated. Then you just pick the appropriate tag, copy and paste et voila – instant semantic gratification.</p>
<div id="attachment_861" class="wp-caption aligncenter" style="width: 310px"><a href="http://semweb-helper.appspot.com/"><img class="size-medium wp-image-861" title="semweb-helper" src="http://discontents.com.au/wp-content/uploads/2010/05/semweb-helper-300x147.jpg" alt="Screenshot" width="300" height="147" /></a><p class="wp-caption-text">Try out the Semweb Helper</p></div>
<p>It&#8217;s a very simple little app, and really just a demonstration of how semantic web technologies might be made available to the masses. It was also the first time the guinea pigs had been allowed to play with Google App Engine.</p>
<h3>Who am I?</h3>
<p>This short catch-up post has become something quite long and rambling. Did I mention that I&#8217;m sleep-deprived? Anyway, a recent addition to the Wragge Labs range of lifestyle accessories is <a href="http://wraggelabs.com/whoami/">&#8216;Who am I?&#8217; </a>– a simple little game that is something like a cross between hangman and Wheel of Fortune. Choosing a person at random from People Australia and the <em>Australian Dictionary of Biography</em>, &#8216;Who am I?&#8217; tests your powers of logic, stamina and historical guesstimation.</p>
<p>Your challenge is to figure out the surname of the mystery historical personage. To help you there are a series of clues, such as their birthplace and known associates. With each guess you also see a little bit more of their portrait. But beware! For ten wrong guesses are all that are permitted to any so brave as to enter upon this quest. Not eleven or twelve, but ten and ten only. To ignore this limit is to invite ridicule and disdain – do so at your peril.</p>
<div id="attachment_858" class="wp-caption aligncenter" style="width: 310px"><a href="http://wraggelabs.com/whoami/"><img class="size-medium wp-image-858" title="whoami" src="http://discontents.com.au/wp-content/uploads/2010/05/whoami-300x137.jpg" alt="Who am I screenshot" width="300" height="137" /></a><p class="wp-caption-text">Play Who am I?</p></div>
<p>&#8216;Who am I?&#8217; builds upon some work I&#8217;ve been doing for the National Museum of Australia – looking at ways of mashing together various types of date-identified data. As part of that project I&#8217;ve built a series of APIs and have scraped, pummelled and munged data from a variety of sources.</p>
<p>What&#8217;s the point? I wonder this myself sometimes, particularly after I fling such things off into the aethernet and hear naught but a rare retweet. I am, after all, only in it for the glory, oh and the money of course. (Hmmm, I must look again at that business plan.) The point is twofold: first to highlight possibilities for the re-use and remixing of cultural data; second, to play with game-based models for discovery and exploration of cultural resources; and&#8230; err&#8230; thirdly just to try building something a little different.</p>
<p>Of course, if you like &#8216;Who am I?&#8217; you will probably also want to try <a href="http://wraggelabs.com/newsroulette/">Headline Roulette</a>&#8230;</p>
<h3>Headline Roulette Reprieve</h3>
<p>At the end of <a href="http://discontents.com.au/shed/experiments/headline-roulette">our last instalment</a>, the future of <a href="http://wraggelabs.com/newsroulette/">Headline Roulette</a> seemed in dire peril. Changes to the National Library of Australia web site threatened its very existence. Did it have a future? Could it survive? And did anybody care?</p>
<p>As we pick up the story oblivion looms. The feared changes are confirmed, but just as all seems lost&#8230; is it? Could it be? Yes, an advanced search facility is added to the newspapers site within Trove. Sensing this may be their only opportunity, the guinea pigs leap into action, building <a href="http://bitbucket.org/wragge/nla-newspapers-scraper">a new screen-scraper</a>, saving Headline Roulette from doom, and setting the world upon the path to a safer, happier future.</p>
<p>In short, Headline Roulette will live on&#8230; so enjoy.</p>
<h3>Handing out some presents</h3>
<p>My head is easily turned by flattery and praise. Yes, I really am so shallow and so vain. But this means that if people say nice things to me, I&#8217;m inclined to give them presents.</p>
<p>As well as doing exciting things in the web 2.0 realm for the PROV, <a href="http://twitter.com/asaletourneau">@asaletourneau</a> leaves nice comments on this blog. So he earned himself a present. It&#8217;s not much, but I <a href="http://userscripts.org/scripts/show/71421">built a userscript</a> that displays photos from the PROV site in a neat little slideshow (it&#8217;s the non-3D javascript version of CoolIris). Install Greasemonkey, get the userscript and <a href="http://proarchives.imagineering.com.au/index_search.asp?searchid=41">try it out</a> (just do a search, then click on the &#8216;Browse as slideshow&#8217; button).</p>
<div id="attachment_852" class="wp-caption aligncenter" style="width: 310px"><a href="http://discontents.com.au/wp-content/uploads/2010/05/prov-slideshow.jpg"><img class="size-medium wp-image-852" title="prov-slideshow" src="http://discontents.com.au/wp-content/uploads/2010/05/prov-slideshow-300x187.jpg" alt="Screen capture of slideshow" width="300" height="187" /></a><p class="wp-caption-text">PROV transport photos in a pretty slideshow</p></div>
<p>The State Library of NSW, or more specifically <a href="http://www.twitter.com/ellenforsyth">@ellenforsyth</a>, also earned my favour by inviting me to rave on about Linked Data at the afore-mentioned NSW RISG seminar. As a result, I added support for the SLNSW photo collections to my <a href="http://discontents.com.au/shoebox/archives-shoebox/harvesting-context-1">Flickr Context Harvester</a> userscript. Well&#8230; it&#8217;s the thought that counts, right? Once again – install Greasemonkey, <a href="http://userscripts.org/scripts/show/56135">get the userscript</a> and then <a href="http://acms.sl.nsw.gov.au/item/itemDetailPaged.aspx?itemID=447435">try it out</a>.</p>
<div id="attachment_855" class="wp-caption aligncenter" style="width: 310px"><a href="http://discontents.com.au/wp-content/uploads/2010/05/slnsw-flickr.jpg"><img class="size-medium wp-image-855" title="slnsw-flickr" src="http://discontents.com.au/wp-content/uploads/2010/05/slnsw-flickr-300x181.jpg" alt="Flickr context harvester screenshot" width="300" height="181" /></a><p class="wp-caption-text">The Flickr Context Harvester in action</p></div>
<h3>And coming up&#8230;</h3>
<p>Stay tuned for more on the Great Flickr Machine Tag Challenge, screencasts demonstrating my Identity Browser, some playing with relationships, and much much more. But right now the squirming baby on my lap needs a nappy change&#8230;</p>
<p>Did I mention that I&#8217;m sleep deprived?</p>
]]></content:encoded>
			<wfw:commentRss>http://discontents.com.au/shed/a-not-so-quick-catch-up/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Headline roulette</title>
		<link>http://discontents.com.au/shed/experiments/headline-roulette</link>
		<comments>http://discontents.com.au/shed/experiments/headline-roulette#comments</comments>
		<pubDate>Tue, 23 Mar 2010 12:26:29 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[experiments]]></category>
		<category><![CDATA[API]]></category>
		<category><![CDATA[Django]]></category>
		<category><![CDATA[games]]></category>
		<category><![CDATA[newspapers]]></category>
		<category><![CDATA[NLA]]></category>
		<category><![CDATA[Piston]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[screen scraping]]></category>

		<guid isPermaLink="false">http://discontents.com.au/?p=834</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Headline+roulette&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=experiments&amp;rft.source=discontents&amp;rft.date=2010-03-23&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shed/experiments/headline-roulette&amp;rft.language=English"></span>

I&#8217;ve been doing a fair bit of coding in recent weeks and I thought I&#8217;d better write a few details down before I forget about them.
As previously noted, I&#8217;ve been gathering together various historical data sets for a project at the National Museum of Australia. One resource that I was keen on including was the [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Headline+roulette&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=experiments&amp;rft.source=discontents&amp;rft.date=2010-03-23&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shed/experiments/headline-roulette&amp;rft.language=English"></span>
<abbr class="unapi-id" title="http://discontents.com.au/?p=834"><!-- &nbsp; --></abbr>
<p>I&#8217;ve been doing a fair bit of coding in recent weeks and I thought I&#8217;d better write a few details down before I forget about them.</p>
<p>As previously noted, I&#8217;ve been gathering together various historical data sets for a project at the National Museum of Australia. One resource that I was keen on including was the fantastic <a href="http://newspapers.nla.gov.au/ndp/del/home">Australian Newspapers</a> project at the National Library of Australia. What I had in mind was being able to give a sense of context to any historical event by calling up the headlines for that particular time.</p>
<p>Unfortunately there&#8217;s no API for the newspapers project (or Trove in general), though apparently it&#8217;s in the works. So I had to reverse engineer the advanced search page to work out the various query options, and then build a screen scraper to harvest the results. I played around with the search options a bit to fine tune the results, finally deciding to limit them to &#8216;news&#8217; articles with more than 1000 words. Annoyingly, only 10 results are returned at a time.</p>
<p>I had hoped to parse the results as XML, but a rogue &lt;br&gt; tag broke the XHTML, so I fell back on <a href="http://www.crummy.com/software/BeautifulSoup/">Beautiful Soup</a> – a Python module that makes screen scraping considerably easier by tidying up HTML structures. After that it was pretty straightforward. Soon I had <a href="http://bitbucket.org/wragge/nla-newspapers/">my own Python module</a> to query the newspapers database and process the results.</p>
<p>The next step was to use the module to build a simple API that would let us quickly grab a set of headlines for a particular date and place. <a href="http://www.djangoproject.com/">Django</a> and <a href="http://bitbucket.org/jespern/django-piston/wiki/Home">Piston</a> made this easy. To see headlines from New South Wales on 1 January 1901, for example:</p>
<p><a href="http://wraggelabs.com/api/newspapers/1901-01-01/nsw/">http://wraggelabs.com/api/newspapers/1901-01-01/nsw/</a></p>
<p>That was pretty cool and it started me thinking about what else I might do with the data. At first I was planning some sort of browser, like my <a href="http://wraggelabs.com/abs/">Population Browser</a>, but that seemed a bit boring. So I decided to create a simple game that grabbed a random headline and asked you to try and guess the date. After further refinement I decided to impose a limit of 10 guesses, with &#8216;higher&#8217; or &#8216;lower&#8217; prompts to get you moving in the right direction. Yes, basically it was a rip-off of The Price is Right – but an interesting, ironic and historically engaged rip-off&#8230;</p>
<p>This required me to make a change to the API and Python module so that I could retrieve a random headline. Basically it just meant generating a query based on random values for the day, month, year and state. For the interface I once again delved into JQuery&#8217;s box of tricks. With all the kerfuffle about ChatRoulette in the media, the name seemed obvious – <a href="http://wraggelabs.com/newsroulette/">Wragge&#8217;s Headline Roulette</a> was born.</p>
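<p>As a rough illustration of the random query idea, here&#8217;s a sketch that picks a date and a state and builds an API url in the pattern shown above. It is not the code from my module. The list of state codes and the range of years are assumptions, and instead of rolling separate values for day, month and year it picks a random offset in days, which avoids impossible dates like 31 February.</p>

```python
import random
from datetime import date, timedelta

API_BASE = "http://wraggelabs.com/api/newspapers"

# Assumed values: the state codes and the date range are guesses for
# illustration, not the ones the real module uses.
STATES = ["nsw", "vic", "qld", "sa", "wa", "tas"]
FIRST_DAY = date(1850, 1, 1)
LAST_DAY = date(1950, 12, 31)


def random_day(rng):
    """Pick a valid calendar date between FIRST_DAY and LAST_DAY."""
    span = (LAST_DAY - FIRST_DAY).days
    return FIRST_DAY + timedelta(days=rng.randint(0, span))


def random_headline_url(rng=random):
    """Build an API url for a random date and state."""
    day = random_day(rng)
    state = rng.choice(STATES)
    return "{}/{}/{}/".format(API_BASE, day.isoformat(), state)


# Passing a seeded generator makes the choice repeatable for testing.
url = random_headline_url(random.Random(42))
```

<p>A random date and state won&#8217;t always have a matching article, so a real version would need to try again when a query comes back empty.</p>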
<div id="attachment_839" class="wp-caption aligncenter" style="width: 310px"><a href="http://wraggelabs.com/newsroulette/"><img class="size-medium wp-image-839" title="headline-roulette" src="http://discontents.com.au/wp-content/uploads/2010/03/headline-roulette-300x151.jpg" alt="Headline roulette screen capture" width="300" height="151" /></a><p class="wp-caption-text">Test your historical nous with Headline Roulette!</p></div>
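<p>The guessing logic is about as simple as games get. This sketch assumes the player guesses a year; the real game runs in the browser with JQuery, so treat the function names here as hypothetical.</p>

```python
MAX_GUESSES = 10


def check_guess(guess, actual):
    """Return the prompt for a single guess."""
    if guess == actual:
        return "correct"
    # A guess that is too early means the player should go higher.
    return "higher" if actual > guess else "lower"


def play(actual, guesses):
    """Work through the guesses, stopping at a correct answer or the limit."""
    prompts = []
    for guess in guesses[:MAX_GUESSES]:
        prompt = check_guess(guess, actual)
        prompts.append(prompt)
        if prompt == "correct":
            break
    return prompts


prompts = play(1901, [1850, 1925, 1901])
```

<p>Ten guesses is comfortable if you halve the range each time, which is rather the point of the &#8216;higher&#8217; and &#8216;lower&#8217; prompts.</p>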
<p>It&#8217;s a very simple little app, but a number of people have said how much fun it is. The bad news is that imminent changes to the NLA newspapers site are probably going to break it (at least in its current form). So enjoy it while you can. When the NLA makes an API available I might work on something a little more sophisticated.</p>
<p>Of course, the broader point is that there are a whole range of cultural materials out there waiting to be remixed and re-used in various forms. Get hacking&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://discontents.com.au/shed/experiments/headline-roulette/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Out of the cube</title>
		<link>http://discontents.com.au/shed/experiments/out-of-the-cube</link>
		<comments>http://discontents.com.au/shed/experiments/out-of-the-cube#comments</comments>
		<pubDate>Fri, 26 Feb 2010 05:57:44 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[experiments]]></category>
		<category><![CDATA[APIs]]></category>
		<category><![CDATA[datacubes]]></category>
		<category><![CDATA[Django]]></category>
		<category><![CDATA[Piston]]></category>
		<category><![CDATA[population]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[spreadsheets]]></category>
		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://discontents.com.au/?p=823</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Out+of+the+cube&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=experiments&amp;rft.source=discontents&amp;rft.date=2010-02-26&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shed/experiments/out-of-the-cube&amp;rft.language=English"></span>

For a project that I&#8217;m working on at the National Museum of Australia, I&#8217;ve started collecting various sources of date-identified data. Most recently I had a go at extracting historical population data from the Australian Bureau of Statistics.
The data can all be downloaded as .xls files, but they&#8217;re not simple, flat spreadsheets – they&#8217;re data [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Out+of+the+cube&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=experiments&amp;rft.source=discontents&amp;rft.date=2010-02-26&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shed/experiments/out-of-the-cube&amp;rft.language=English"></span>
<abbr class="unapi-id" title="http://discontents.com.au/?p=823"><!-- &nbsp; --></abbr>
<p>For a project that I&#8217;m working on at the National Museum of Australia, I&#8217;ve started collecting various sources of date-identified data. Most recently I had a go at extracting <a href="http://www.abs.gov.au/AUSSTATS/abs@.nsf/mf/3105.0.65.001">historical population data</a> from the Australian Bureau of Statistics.</p>
<p>The data can all be downloaded as .xls files, but they&#8217;re not simple, flat spreadsheets – they&#8217;re data cubes. As the name suggests, data cubes are organised along a number of dimensions. In the case of the population data it&#8217;s year, state and gender.</p>
<p>This means that you can&#8217;t just export the data to CSV and suck it into your database – first you&#8217;ve got to flatten the cube. No doubt there are other ways to do this, but I just wrote a simple Python script. It uses <a href="http://pypi.python.org/pypi/xlrd">xlrd</a> to read from the spreadsheet, does a bit of reorganisation, then writes the output to a CSV file. The code, for what it&#8217;s worth, is <a href="http://bitbucket.org/wragge/abs-data-cube-processor/">available at Bitbucket</a>.</p>
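<p>To give a feel for what &#8216;flattening&#8217; means, here&#8217;s a minimal sketch. It skips the spreadsheet-reading step entirely and starts from a nested dictionary standing in for the cube, so it is not the Bitbucket script, and the numbers are placeholders rather than real ABS figures.</p>

```python
import csv
import io

# Stand-in for values read from the spreadsheet: cube[year][state][gender].
# The figures are placeholders, not real ABS data.
cube = {
    1901: {
        "nsw": {"males": 10, "females": 20},
        "vic": {"males": 30, "females": 40},
    },
    1902: {
        "nsw": {"males": 50, "females": 60},
    },
}


def flatten(cube):
    """Yield one (year, state, gender, value) row per cell of the cube."""
    for year in sorted(cube):
        for state in sorted(cube[year]):
            for gender in sorted(cube[year][state]):
                yield year, state, gender, cube[year][state][gender]


def to_csv(rows):
    """Write the flattened rows to CSV text with a header line."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["year", "state", "gender", "value"])
    writer.writerows(rows)
    return out.getvalue()


csv_text = to_csv(flatten(cube))
```

<p>Each cell of the cube becomes its own row, with the dimensions repeated as ordinary columns. That&#8217;s the shape a database import wants.</p>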
<p>Once I had the CSV file I just imported it into MySQL and used Django and <a href="http://bitbucket.org/jespern/django-piston/wiki/Home">Piston</a> to build a basic API. So if you want to know the population of NSW in 1856, you just go to:</p>
<p><a href="http://wraggelabs.com/api/json/population/nsw/1856/">http://wraggelabs.com/api/json/population/nsw/1856/</a></p>
<p>The number of infant deaths in Tasmania in 1932:</p>
<p><a href="http://wraggelabs.com/api/json/infantdeaths/tas/1932/">http://wraggelabs.com/api/json/infantdeaths/tas/1932/</a></p>
<p>The number of female births in Australia in 1959:</p>
<p><a href="http://wraggelabs.com/api/json/births/australia/females/1959/">http://wraggelabs.com/api/json/births/australia/females/1959/</a></p>
<p>I&#8217;m sure you get the picture. You can change the &#8216;json&#8217; to &#8216;xml&#8217; if you&#8217;d like another flavour of data.</p>
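<p>If you&#8217;d rather build these urls in code than by hand, something like this would do. It only covers the patterns shown above, and the helper name is my own invention for this sketch.</p>

```python
API_ROOT = "http://wraggelabs.com/api"


def api_url(fmt, *parts):
    """Join a format ('json' or 'xml') and path segments into an API url."""
    if fmt not in ("json", "xml"):
        raise ValueError("Unknown format: {!r}".format(fmt))
    segments = [API_ROOT, fmt] + [str(part) for part in parts]
    return "/".join(segments) + "/"


population = api_url("json", "population", "nsw", 1856)
births_xml = api_url("xml", "births", "australia", "females", 1959)
```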
<div id="attachment_830" class="wp-caption aligncenter" style="width: 310px"><a href="http://wraggelabs.com/abs/"><img class="size-medium wp-image-830" title="pop_browser" src="http://discontents.com.au/wp-content/uploads/2010/02/pop_browser-300x140.png" alt="Screenshot of population browser" width="300" height="140" /></a><p class="wp-caption-text">The API in action - a simple population browser</p></div>
<p>With an API delivering JSON you can start playing around with all sorts of fun AJAX-y stuff. To demonstrate I built a <a href="http://wraggelabs.com/abs/">simple population browser</a> using JQuery. Just drag the slider!</p>
]]></content:encoded>
			<wfw:commentRss>http://discontents.com.au/shed/experiments/out-of-the-cube/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Emerging technologies and the need to experiment</title>
		<link>http://discontents.com.au/shoebox/archives-shoebox/emerging-technologies-and-the-need-to-experiment</link>
		<comments>http://discontents.com.au/shoebox/archives-shoebox/emerging-technologies-and-the-need-to-experiment#comments</comments>
		<pubDate>Wed, 03 Feb 2010 06:30:23 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[archives]]></category>
		<category><![CDATA[drafts]]></category>

		<guid isPermaLink="false">http://discontents.com.au/?p=814</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Emerging+technologies+and+the+need+to+experiment&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=archives&amp;rft.subject=drafts&amp;rft.source=discontents&amp;rft.date=2010-02-03&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shoebox/archives-shoebox/emerging-technologies-and-the-need-to-experiment&amp;rft.language=English"></span>

About a month ago I posted a copy of my report Emerging technologies for the provision of access to archives on Scribd. It&#8217;s already edging up towards a thousand reads, so I thought it was time I put a link in from here. 
The basic message is we need to experiment and find the spaces [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Emerging+technologies+and+the+need+to+experiment&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=archives&amp;rft.subject=drafts&amp;rft.source=discontents&amp;rft.date=2010-02-03&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shoebox/archives-shoebox/emerging-technologies-and-the-need-to-experiment&amp;rft.language=English"></span>
<abbr class="unapi-id" title="http://discontents.com.au/?p=814"><!-- &nbsp; --></abbr>
<p>About a month ago I posted a copy of my report <a href="http://www.scribd.com/doc/24402148/Emerging-technologies-for-the-provision-of-access-to-archives-issues-challenges-and-ideas">Emerging technologies for the provision of access to archives</a> on Scribd. It&#8217;s already edging up towards a thousand reads, so I thought it was time I put a link in from here. </p>
<p>The basic message is we need to experiment and find the spaces both within and between our institutions to foster such experimentation. Is that asking too much? Anyway&#8230; read, enjoy, use!</p>
<p><object id="doc_178960182596690" name="doc_178960182596690" height="600" width="100%" type="application/x-shockwave-flash" data="http://d1.scribdassets.com/ScribdViewer.swf" style="outline:none;" ><param name="movie" value="http://d1.scribdassets.com/ScribdViewer.swf"><param name="wmode" value="opaque"><param name="bgcolor" value="#ffffff"><param name="allowFullScreen" value="true"><param name="allowScriptAccess" value="always"><param name="FlashVars" value="document_id=24402148&#038;access_key=key-yzpzm303w8owl3o60ef&#038;page=1&#038;viewMode=list"></object></p>
]]></content:encoded>
			<wfw:commentRss>http://discontents.com.au/shoebox/archives-shoebox/emerging-technologies-and-the-need-to-experiment/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>I link therefore I am</title>
		<link>http://discontents.com.au/shed/experiments/i-link-therefore-i-am</link>
		<comments>http://discontents.com.au/shed/experiments/i-link-therefore-i-am#comments</comments>
		<pubDate>Wed, 20 Jan 2010 00:25:03 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[experiments]]></category>
		<category><![CDATA[identites]]></category>
		<category><![CDATA[name authorities]]></category>
		<category><![CDATA[People Australia]]></category>
		<category><![CDATA[RDFa]]></category>
		<category><![CDATA[semantic web]]></category>

		<guid isPermaLink="false">http://discontents.com.au/?p=761</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=I+link+therefore+I+am&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=experiments&amp;rft.source=discontents&amp;rft.date=2010-01-20&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shed/experiments/i-link-therefore-i-am&amp;rft.language=English"></span>

Let me be clear. I am not Tim Sherratt the sound engineer. Nor, indeed, am I Timothy Sherratt, author of Saints as Citizens: A Guide to Public Responsibilities for Christians. We are three different people, spread across three continents, locked in a deadly battle for global supremacy via Google search rankings. There can be only [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=I+link+therefore+I+am&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=experiments&amp;rft.source=discontents&amp;rft.date=2010-01-20&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shed/experiments/i-link-therefore-i-am&amp;rft.language=English"></span>
<abbr class="unapi-id" title="http://discontents.com.au/?p=761"><!-- &nbsp; --></abbr>
<p>Let me be clear. I am not Tim Sherratt the sound engineer. Nor, indeed, am I Timothy Sherratt, author of <em>Saints as Citizens: A Guide to Public Responsibilities for Christians</em>. We are three different people, spread across three continents, locked in a deadly battle for global supremacy via <a href="http://www.google.com/#hl=en&#038;source=hp&#038;q=tim+sherratt&#038;btnG=Google+Search&#038;aq=f&#038;aql=&#038;aqi=&#038;oq=tim+sherratt">Google search rankings</a>. There can be only one&#8230;</p>
<p>Of course you probably knew I wasn&#8217;t a British sound engineer or an American politics professor. There are plenty of contextual clues within this website, even on this page, to indicate that my interests lie elsewhere. But while we humans are pretty good at picking up such clues, it&#8217;s much harder for computers. When Google comes to index my site, how does it know I&#8217;m not a sound engineer who likes to dabble in history? Indeed, how does Google, or any computer know that the words &#8216;Tim Sherratt&#8217; are actually a person&#8217;s name? These are questions of both identity and semantics.</p>
<p>Librarians have been dealing with questions of identity for many, many years, developing detailed name authority records. Such records allow name variations to be cross-referenced and individuals to be uniquely identified. For example I have a control number of &#8216;n 2005043272&#8217; in the <a href="http://authorities.loc.gov/">Library of Congress authorities database</a>, while Timothy R Sherratt, the politics professor, has been assigned &#8216;n  94106739&#8217;.</p>
<p>The National Library of Australia has developed its own name authority file. However, the NLA has realised that reliable identity data has a much broader application than simply cataloguing, and is using its name authority data as the foundation of an exciting new resource – <a href="http://www.nla.gov.au/initiatives/peopleaustralia/">People Australia</a>. People Australia will mesh its own records with biographical data from a variety of outside sources, creating a rich collection of interlinked identities. Already entries from the Australian Dictionary of Biography have been ingested.</p>
<p>So now, thanks to People Australia, if I ever get confused about who I am I just have to remember one little url – my very own persistent identifier – <a href="http://nla.gov.au/nla.party-479364">http://nla.gov.au/nla.party-479364</a>. I&#8217;m going to get a t-shirt made up.</p>
<p>But that doesn&#8217;t help our new machine overlords very much. How can a computer tell that the words &#8216;Tim Sherratt&#8217; describe a person and that more information about that person can be found at http://nla.gov.au/nla.party-479364? This is the sort of problem that the semantic web hopes to solve. The semantic web aims to expose the structures that are buried in our documents and databases, to make explicit the contextual clues that humans pick up, but computers ignore. As the slogan goes, it represents a change from a &#8216;web of documents to a web of data&#8217;.</p>
<p>The semantic web uses a variety of tools and standards to encode information in a form that means something to computers. <a href="http://www.foaf-project.org/">FOAF</a> (Friend of a Friend) is, for example, a machine-readable ontology that describes people and their relationships. A computer visiting this page can in fact find out a fair bit about me, including my NLA persistent identifier, because there is a link to a small XML file in which my details are <a href="http://discontents.com.au/foaf.rdf">expressed using FOAF</a>.</p>
<p>But if this seems a little daunting, the semantic web offers another technology which is really just as easy as marking up a page in HTML – it&#8217;s called RDFa. This link – <a typeof="foaf:Person" property="foaf:name" content="Sherratt, Tim" rel="foaf:isPrimaryTopicOf" href="http://nla.gov.au/nla.party-479364">Tim Sherratt</a> – is more than it seems. Here is what a computer sees:</p>
<p><code>&lt;a typeof="foaf:Person" property="foaf:name" content="Sherratt, Tim" rel="foaf:isPrimaryTopicOf" href="http://nla.gov.au/nla.party-479364"&gt;Tim Sherratt&lt;/a&gt;</code></p>
<p>This says that Tim Sherratt is a person whose name has the standard form &#8216;Sherratt, Tim&#8217; and who is the primary topic of the page to be found at http://nla.gov.au/nla.party-479364. There&#8217;s a fair bit of semantic goodness in that one little link. If the NLA page also expressed its data in a machine-readable form, this link could send search engines and browsers into a whole new world of associations and inferences.</p>
<p>But I suppose you&#8217;re thinking that the code still looks a bit complicated. Well never fear, this long post is really just an introduction to a new project I&#8217;ve been working on – something that will help you generate markup like this with just a couple of clicks.</p>
<h3>Introducing Wragge&#8217;s identity browser</h3>
<p>I&#8217;ve been interested in publishing biographical data since way back in the early days of <a href="http://www.asap.unimelb.edu.au/bsparcs/">Bright Sparcs</a> and, sad as it may seem, I find the possibilities of People Australia pretty exciting. However, I don&#8217;t think we should expect the NLA to do all the work. People Australia provides a framework that we can all use to enrich our own documents, databases, finding aids, and applications.</p>
<p>You can easily access People Australia data through <a href="http://trove.nla.gov.au/">Trove</a>. But to get a better idea of what&#8217;s in the database, I&#8217;d suggest you spend some time playing with its <a href="http://www.nla.gov.au/apps/srw/search/peopleaustralia">SRU interface</a>. Using this you can query the database directly, retrieving results in XML – ready for your own application to suck up and use.</p>
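<p>An SRU request is just a url with a handful of standard parameters, so it&#8217;s easy to put one together yourself. This sketch builds a <code>searchRetrieve</code> url for the People Australia endpoint. The version number and the bare-term query are assumptions; check the service&#8217;s own explain record for the indexes and versions it actually supports.</p>

```python
from urllib.parse import urlencode

SRU_BASE = "http://www.nla.gov.au/apps/srw/search/peopleaustralia"


def sru_search_url(cql_query, maximum_records=10, start_record=1):
    """Build a searchRetrieve url from standard SRU parameters."""
    params = [
        ("operation", "searchRetrieve"),
        ("version", "1.1"),  # assumed; check what the server supports
        ("query", cql_query),
        ("startRecord", start_record),
        ("maximumRecords", maximum_records),
    ]
    # urlencode takes care of escaping spaces and quotes in the query.
    return SRU_BASE + "?" + urlencode(params)


url = sru_search_url("Wragge", maximum_records=5)
```

<p>Fetch the resulting url and you get XML back, which you can then pick apart with whatever parser you prefer.</p>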
<p>To make this even easier, I&#8217;ve written a <a href="http://bitbucket.org/wragge/people-australia-client/">People Australia client library</a> in Python. This enables you to quickly extract and use identity information. Using it, your own web application can talk to People Australia directly. I won&#8217;t go into the details here – the code is fairly heavily commented – but I welcome any feedback, suggestions or contributions. Copy it, change it, use it!</p>
<p>To try out my library and to provide a tool that might be of use to the average punter I&#8217;ve also built:</p>
<p>&lt;TA-DA&gt;<a href="http://wraggelabs.com/people/"><strong>Wragge&#8217;s identity browser</strong></a>!&lt;/TA-DA&gt;</p>
<p>It&#8217;s pretty simple. Search for a surname, pick a name from the result list, and view their identity details. For example, here&#8217;s <a href="http://wraggelabs.com/people/612109/">Clement Wragge&#8217;s details</a>.</p>
<p>But there are a couple of extra features that I am rather smugly pleased with. First of all, there&#8217;s an <a href="javascript:(function(){var%20selText;if%20(window.getSelection){if%20(document.activeElement%20&&%20(document.activeElement.tagName.toLowerCase()=='textarea'%20||%20document.activeElement.tagName.toLowerCase()=='input')){var%20text=document.activeElement.value;selText=text.substring(document.activeElement.selectionStart,document.activeElement.selectionEnd);}else{var%20selRange=window.getSelection();selText=selRange.toString();}}else{if%20(document.selection.createRange){var%20range=document.selection.createRange();selText=range.text;}}if%20(selText!=''){var%20url='http://wraggelabs.com/people/?context='+escape(selText);window.open(url);}else{alert('Select%20some%20text%20first!');}})();">'Identify me!'</a> bookmarklet. Just drag the link to your browser&#8217;s bookmarks or favourites toolbar (see below for some further notes).</p>
<p>Once you have the bookmarklet installed all you have to do to find the identity record for someone is to highlight their name on a webpage and click &#8216;Identify me!&#8217;. You could then grab the People Australia ID to store in your own application, allowing you (with the help of my client library) to automatically include links to relevant entries in the <em>Australian Dictionary of Biography</em>, for example.</p>
<p>Even better, Wragge&#8217;s identity browser will automagically generate the RDFa markup you need to semantically enrich your document. Whether you&#8217;re writing a blog post, publishing an article, drafting a caption, creating a database entry, or preparing a finding aid you can quickly and easily find an individual and then cut and paste the code you need.</p>
<p>To show this in action I used the bookmarklet to help me mark up many of the people named in one of my articles. We humans see a <a href="http://discontents.com.au/words/magazines-articles/looking-at-the-sun">normal page with a few extra links</a>. Computers, however, can extract the embedded RDFa to get at the <a href="http://www.w3.org/2007/08/pyRdfa/extract?uri=http%3A%2F%2Fdiscontents.com.au%2Fwords%2Fmagazines-articles%2Flooking-at-the-sun&#038;format=pretty-xml&#038;warnings=false&#038;parser=lax&#038;space-preserve=true&#038;submit=Go!&#038;text=">structured information that&#8217;s hidden in the page</a>.</p>
<p>Now I&#8217;ve got to go and semantify the rest of my articles&#8230;</p>
<p>Go forth and identify! And in the process help build a better web.</p>
<h4>Notes on the bookmarklet</h4>
<ul>
<li>Internet Explorer has &#8216;Favorites&#8217;, Firefox has &#8216;Bookmarks&#8217; – whatever you&#8217;re using, first make sure that your Bookmarks/Favourites toolbar is visible. Look under Tools->Toolbars in IE8, or View->Toolbars in Firefox.</li>
<li>Try dragging the &#8216;Identify me!&#8217; link to your Bookmarks/Favourites toolbar. If it doesn&#8217;t work, try right-clicking on the link and choosing &#8216;Bookmark this link&#8217; or &#8216;Add to Favourites&#8217;. Make sure you add it to the toolbar folder. IE will probably give you various warnings – ignore them.</li>
<li>You should now have a working bookmarklet – highlight a name and click on it, and a new window should open with results from Wragge&#8217;s identity browser. IE might complain about opening a pop-up – allow pop-ups and try again.</li>
<li>The bookmarklet is pretty clever about working out which part of the highlighted text is the surname, so you can highlight names in a number of formats including:
<ul>
<li>Surname</li>
<li>Surname&#8217;s</li>
<li>Surname, Othernames</li>
<li>Othernames Surname</li>
<li>Othernames Surname&#8217;s</li>
</ul>
</li>
<li><del datetime="2010-01-26T12:34:24+00:00">For the moment this only works with &#8216;straight&#8217;, ie non-curly, apostrophes – but I&#8217;ll fix this asap.</del> Fixed!</li>
</ul>
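<p>To make the list of supported name formats a bit more concrete, here is a minimal sketch of how highlighted text in those formats could be reduced to a surname. It is illustrative only and is not the identity browser&#8217;s actual code: the function name and the rules are assumptions, and it ignores trickier cases such as multi-word surnames.</p>

```python
def guess_surname(text):
    """Guess the surname from highlighted text (hypothetical helper)."""
    cleaned = text.strip()
    # Drop a trailing possessive, with a straight or a curly apostrophe.
    for suffix in ("'s", "\u2019s"):
        if cleaned.endswith(suffix):
            cleaned = cleaned[: -len(suffix)]
            break
    # 'Surname, Othernames': the surname comes before the comma.
    if "," in cleaned:
        return cleaned.split(",", 1)[0].strip()
    # 'Othernames Surname' or a bare 'Surname': take the last word.
    words = cleaned.split()
    return words[-1] if words else ""


for sample in ("Wragge", "Wragge's", "Wragge, Clement",
               "Clement Wragge", "Clement Wragge\u2019s"):
    print(sample, "=", guess_surname(sample))
```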
<h4>Notes on RDFa markup</h4>
<ul>
<li>You have a choice between visible (ie clickable) links or invisible ones. They look the same to computers, so it&#8217;s just a matter of whether you want your human visitors to see them. Click &#8216;change&#8217; to toggle between the two options.</li>
<li>You can just paste the RDFa markup straight into your document. If you&#8217;ve used the bookmarklet, the text you highlighted will be automatically inserted as the link text – so just copy and paste. If you haven&#8217;t used the bookmarklet you can insert the link text yourself.</li>
<li>Somewhere in your document you need to tell computers what the FOAF in your RDFa markup means. You do this by inserting the text:<br />
<code>xmlns:foaf="http://xmlns.com/foaf/0.1/"</code> inside a tag that contains your marked up text. If you can edit the raw html of your page, you can just insert it in the <code>&lt;html&gt;</code> tag itself, so it becomes <code>&lt;html xmlns:foaf="http://xmlns.com/foaf/0.1/" &gt;</code>. Otherwise you can wrap your marked up text in a <code>&lt;div&gt;</code> tag and put the extra code in there.
</li>
<li>If you&#8217;re using something like WordPress that strips out or converts any markup that it doesn&#8217;t expect, you need to be able to enter the RDFa as &#8216;raw&#8217; html. In WordPress you can do this using the <a href="http://wordpress.org/extend/plugins/raw-html/">Raw HTML plugin</a>.</li>
<li>For more on using RDFa have a look at: <a href="http://www.w3.org/MarkUp/2009/rdfa-for-html-authors">RDFa for HTML Authors</a>.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://discontents.com.au/shed/experiments/i-link-therefore-i-am/feed</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Doing it yourself</title>
		<link>http://discontents.com.au/shoebox/archives-shoebox/doing-it-yourself</link>
		<comments>http://discontents.com.au/shoebox/archives-shoebox/doing-it-yourself#comments</comments>
		<pubDate>Tue, 22 Dec 2009 11:21:31 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[archives]]></category>
		<category><![CDATA[hacks]]></category>
		<category><![CDATA[greasemonkey]]></category>
		<category><![CDATA[recordsearch]]></category>
		<category><![CDATA[userscript]]></category>

		<guid isPermaLink="false">http://discontents.com.au/?p=738</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Doing+it+yourself&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=archives&amp;rft.subject=hacks&amp;rft.source=discontents&amp;rft.date=2009-12-22&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shoebox/archives-shoebox/doing-it-yourself&amp;rft.language=English"></span>

I was doing some research using the National Archives of Australia&#8217;s RecordSearch database the other day and became frustrated that there is no way of seeing how many pages are in a digitised file without clicking on the &#8216;Display digital copy&#8217; link. So I fixed it.
As a userscript it&#8217;s hardly worthy of a blog post. [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Doing+it+yourself&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=archives&amp;rft.subject=hacks&amp;rft.source=discontents&amp;rft.date=2009-12-22&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shoebox/archives-shoebox/doing-it-yourself&amp;rft.language=English"></span>
<abbr class="unapi-id" title="http://discontents.com.au/?p=738"><!-- &nbsp; --></abbr>
<p>I was doing some research using the National Archives of Australia&#8217;s <a href="http://naa.gov.au/collection/recordsearch/index.aspx">RecordSearch</a> database the other day and became frustrated that there is no way of seeing how many pages are in a digitised file without clicking on the &#8216;Display digital copy&#8217; link. So <a href="http://userscripts.org/scripts/show/64722">I fixed it</a>.</p>
<p>As a userscript it&#8217;s hardly worthy of a blog post. All it does is find out how many pages are in the file and insert the number in the link text. It&#8217;s very simple. But I think it&#8217;s also a useful illustration of the changing balance of power between archives and their users.</p>
<p>William E Landis argued that archivists were &#8216;guilty as a profession of fetishising the outputs of our descriptive systems&#8217;. The design of finding aids has often been determined not by the needs of users but by a desire to faithfully represent the underlying archival architecture. But now users don&#8217;t have to just take what they&#8217;re given.</p>
<p>Technologies such as <a href="https://addons.mozilla.org/en-US/firefox/addon/748">Greasemonkey</a> are useful for sketching out alternatives. For organisations with IT systems that inhibit experimentation, Greasemonkey (or <a href="https://jetpack.mozillalabs.com/">Mozilla&#8217;s Jetpack</a>) provides a way of playing with interfaces without touching any of the underlying code. My rewrite of the way RecordSearch <a href="http://discontents.com.au/shoebox/archives-shoebox/archives-in-3d">displays digitised files</a> is an example of this.</p>
<p>But no one interface is ever going to meet the needs of all archive users. Fortunately, there are a growing number of ways in which archives can work in partnership with their users to help <em>them</em> create the interfaces they want and need.</p>
<p>Archives are starting to expose their data directly using APIs and linked open data. This gives users the power to create whole new applications. But I still think there&#8217;ll be a place for the little tweak – a simple hack that meets some small but specific need. I can imagine communities of interest building and sharing a range of tools, hacks, applications and interfaces specifically tailored to their research habits.</p>
<p>So if you don&#8217;t like it, fix it.</p>
]]></content:encoded>
			<wfw:commentRss>http://discontents.com.au/shoebox/archives-shoebox/doing-it-yourself/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Some archives hacking</title>
		<link>http://discontents.com.au/shoebox/archives-shoebox/some-archives-hacking</link>
		<comments>http://discontents.com.au/shoebox/archives-shoebox/some-archives-hacking#comments</comments>
		<pubDate>Thu, 05 Nov 2009 00:31:07 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[archives]]></category>
		<category><![CDATA[hacks]]></category>
		<category><![CDATA[govhack]]></category>
		<category><![CDATA[mashup]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[recordsearch]]></category>

		<guid isPermaLink="false">http://discontents.com.au/?p=727</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Some+archives+hacking&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=archives&amp;rft.subject=hacks&amp;rft.source=discontents&amp;rft.date=2009-11-05&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shoebox/archives-shoebox/some-archives-hacking&amp;rft.language=English"></span>

It&#8217;s great to see that the National Archives of Australia has released a large swag of data through the new data.australia.gov.au site. In the Commonwealth Agencies zip file you can find xml dumps of all the publicly accessible agency and series data in RecordSearch, as well as item data for series A1. This is the [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Some+archives+hacking&amp;rft.aulast=Sherratt&amp;rft.aufirst=Tim&amp;rft.subject=archives&amp;rft.subject=hacks&amp;rft.source=discontents&amp;rft.date=2009-11-05&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://discontents.com.au/shoebox/archives-shoebox/some-archives-hacking&amp;rft.language=English"></span>
<abbr class="unapi-id" title="http://discontents.com.au/?p=727"><!-- &nbsp; --></abbr>
<p>It&#8217;s great to see that the National Archives of Australia has released a large swag of data through the new <a href="http://data.australia.gov.au/">data.australia.gov.au</a> site. In the <a href="http://data.australia.gov.au/84">Commonwealth Agencies</a> zip file you can find xml dumps of all the publicly accessible agency and series data in RecordSearch, as well as item data for series A1. This is the same data that Mitchell Whitelaw visualised so brilliantly in his <a href="http://visiblearchive.blogspot.com/">Visible Archive</a> project. There&#8217;s also item data and images from series A3560 – the <a href="http://data.australia.gov.au/77">Mildenhall photographs of early Canberra</a>.</p>
<p>What&#8217;s even more exciting is that people are already using this data. At the recent GovHack event in Canberra the <a href="http://catherinestyles.com/2009/11/02/wtfgd-first-steps/">What The Federal Government Does</a> team worked on visualising the activities of government by using functions data pulled from the agencies file. Another group has generated a really nice <a href="http://mildenhall.creativepossums.net/">tag cloud and photo gallery</a> from the Mildenhall data. With further GovHack sessions to follow and the <a href="http://mashupaustralia.org/">MashupAustralia</a> contest open until 13 November, let&#8217;s hope for some more inspired archives hacking.</p>
<p>Seeing RecordSearch data out in the world like this reminded me of a little project I started a while back and then set aside. It was a simple PHP script that scraped data from RecordSearch and spat it out either as XML or JSON. Mitchell used a version of this script in his <a href="http://visiblearchive.blogspot.com/2009/08/exploring-a1-items-to-documents.html">A1 Explorer</a> in order to find out the number of pages in each digitised file.</p>
<p>I&#8217;ve now expanded and improved the script so that it provides data on items, series, agencies and persons. The output includes all the basic fields as well as links between entities – such as related series, controlling agencies etc. As an added bonus you also get some useful totals (where they&#8217;re available): items include the number of pages, series include the number of items described on RecordSearch, and agencies include the number of series recorded. I&#8217;ve also fiddled with mod_rewrite to provide a more RESTful interface.</p>
<p>For XML output use the url <strong>http://discontents.com.au/shed/rs/xml/ </strong>followed by the appropriate identifier – a barcode for an item, a CA number for an agency, a CP number for a person or a series number.</p>
<p>Some examples:</p>
<ul>
<li> Series A1 – <a href="http://discontents.com.au/shed/rs/xml/a1">http://discontents.com.au/shed/rs/xml/a1</a></li>
<li>Item B2455, WRAGGE C L E – <a href="http://discontents.com.au/shed/rs/xml/3445411">http://discontents.com.au/shed/rs/xml/3445411</a></li>
<li>CSIR Head Office – <a href="http://discontents.com.au/shed/rs/xml/CA+486">http://discontents.com.au/shed/rs/xml/CA+486</a></li>
<li>Alfred Deakin – <a href="http://discontents.com.au/shed/rs/xml/CP+9">http://discontents.com.au/shed/rs/xml/CP+9</a></li>
</ul>
<p>As you might have guessed, to get JSON output you just substitute &#8216;json&#8217; for &#8216;xml&#8217; in the url.</p>
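<p>If you want to build these urls from your own code, the pattern is easy to wrap up. Here&#8217;s a minimal Python sketch, assuming only the url scheme described above. The function name is made up for illustration, and I haven&#8217;t tried to cover identifiers containing characters other than letters, digits and spaces.</p>

```python
from urllib.parse import quote_plus

# Base url from the examples above; the function name is hypothetical.
BASE_URL = "http://discontents.com.au/shed/rs/"


def recordsearch_url(identifier, fmt="xml"):
    """Return the wrapper url for a barcode, series, CA or CP number."""
    if fmt not in ("xml", "json"):
        raise ValueError("fmt must be 'xml' or 'json'")
    # quote_plus turns the space in 'CA 486' into '+', as in the examples.
    return BASE_URL + fmt + "/" + quote_plus(str(identifier).strip())


for ident in ("a1", 3445411, "CA 486", "CP 9"):
    print(recordsearch_url(ident))
print(recordsearch_url("CP 9", fmt="json"))
```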
<p>Being dependent on screen scraping, it&#8217;s inherently a bit fragile, but I&#8217;m hoping it might be of some use. My intention was to use it to start exploring some new ways of using and interacting with the data. The code itself is <a href="http://bitbucket.org/wragge/rswrapper/">available at BitBucket</a>. It&#8217;s not very elegant, but I don&#8217;t want to spend much time cleaning it up at the moment. If it seems like it might be useful, I&#8217;ll probably rewrite the whole thing in Python and publish it through Google&#8217;s App Engine.</p>
]]></content:encoded>
			<wfw:commentRss>http://discontents.com.au/shoebox/archives-shoebox/some-archives-hacking/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
