August 07, 2012

The Wizard behind the curtain: Relational databases, Drupal websites, and humanities scholarship

Wizard of Oz

Imagine a world without Mendeley, Evernote, Bookends or Zotero, but with plenty of Microsoft Excel or Google spreadsheets. Don't panic, it's just a hypothetical situation. You are attempting to catalogue your books using MS Excel, but you also want to include biographical information about your authors. You could have columns in your table for biographical information, but every time you enter that author you are going to have to re-enter that information. There are only so many times that you can enter Wordsworth's date of birth before you lose the will to live (or at least to work). If you are like me, that 'number of times' is quite small. So, you create a separate spreadsheet with a table of authors and biographical information, and you simply reference that table from your original table of books. Obviously you have read and want to index every Wordsworth text ever published, so this is going to save you a lot of copying and pasting. What you have created is a basic relational database: you have split the information into two tables to avoid repetition. In database-ese this is called normalizing your data. I think of data normalization like this: if you have to enter the information more than once, is it easier to split it into a different table? The author's name will probably be the 'key' that you use to relate the two tables. However, if you were in Byron's personal hell, and there were several William Wordsworths (or 'Turdsworths' as Byron would call them), you could use an arbitrary, but unique, number to cross-reference the two tables without any confusion. You'd know which of the ever-replicating Agent-Smith-esque Wordsworths authored which Prelude.
Resisting the temptation to explore the deep philosophical questions about identity that this raises, I can tell you that the best tools for creating relational databases like this are, in fact, not spreadsheet tools, but relational database management systems like Microsoft Access (best as a personal, desktop-based database) or MySQL (which is the wizard behind the curtain of many, many websites and institutional HR systems). As I understand it, then, a relational database is a way of recording information in a structure that maximizes efficiency by separating information into different tables which are linked by reference keys (in relational database speak, foreign keys and primary keys).
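
To make the books-and-authors example concrete, here is a minimal sketch in Python, using the sqlite3 module from the standard library rather than Access or MySQL. The table names, column names and the second, nameless-in-all-but-name Wordsworth are my own inventions for illustration. Each author is entered once, each book points at its author through an arbitrary but unique number, and a join puts the two tables back together.

```python
import sqlite3


def build_catalogue():
    """Create an in-memory catalogue with separate author and book tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (
            author_id INTEGER PRIMARY KEY,   -- arbitrary but unique number
            name      TEXT NOT NULL,
            born      INTEGER
        );
        CREATE TABLE books (
            book_id   INTEGER PRIMARY KEY,
            title     TEXT NOT NULL,
            author_id INTEGER NOT NULL REFERENCES authors(author_id)
        );
    """)
    # Two authors share a name; the numeric key keeps them apart.
    conn.executemany("INSERT INTO authors VALUES (?, ?, ?)", [
        (1, "William Wordsworth", 1770),
        (2, "William Wordsworth", None),  # hypothetical namesake, details unknown
    ])
    # Biographical details are never repeated here, only the key.
    conn.executemany("INSERT INTO books (title, author_id) VALUES (?, ?)", [
        ("The Prelude", 1),
        ("The Excursion", 1),
        ("The Prelude", 2),
    ])
    return conn


def books_with_authors(conn):
    """Join the two tables so each book is listed with its author's details."""
    return conn.execute("""
        SELECT books.title, authors.name, authors.born
        FROM books
        JOIN authors ON books.author_id = authors.author_id
        ORDER BY books.book_id
    """).fetchall()


if __name__ == "__main__":
    for row in books_with_authors(build_catalogue()):
        print(row)
```

Wordsworth's date of birth is typed exactly once, however many of his books end up in the catalogue.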

One planned output of the project that I am working on (Networks of Improvement) is an online relational database of eighteenth-century literary clubs and societies. After a few months of experimentation we now have an online platform for this research. For now, the platform is private, but by the end of the project in 2014 it will be freely available.

I began developing this platform by experimenting with MS Access, and this was really helpful because I learnt about normalizing databases and thought carefully about the design of the database and how the information related. This made me think about efficiency and how to get my users to enter data easily, consistently, and accurately. In the end though, Access seemed the wrong way to go, given that we wanted to collaboratively populate the database and to publish it online. The solution I have ultimately used is to design a website powered by Drupal, a widely used, powerful, open-source content management system. I gather Drupal is really intended for people who are comfortable writing PHP code, but I have found you can do a great deal without this knowledge or even without knowing what PHP is (apart from a curiously vowelless noise). Using a cocktail of modules (plug-ins to the core Drupal management code which are contributed by a community of developers to extend its functionality) I've been able to set up a website that allows users with permissions to add and edit content using forms, as well as to search and sort existing content. They can do this simultaneously, and from anywhere that they have an internet connection.

The site feels like a relational database management system. Drupal does use a MySQL database, but I still do not know whether the MySQL database behind the scenes organises the information we enter into different tables (I should really look into this!). However, when I turned to Drupal, I took the lessons I learnt from MS Access and applied them to content management. The platform I have designed feels like a relational database for its users. Drupal allows you to define different content types, and to define 'fields' (think of these like columns in a spreadsheet) for each content type. It creates forms for users to populate those fields. I began by setting up content types which, in my mind, were the equivalent of each of the tables that I had originally made use of in MS Access. I added contributed modules to the core Drupal in order to acquire nice forms for fields; for instance, I added a location module and a date module which made it easy to add, yes, locations and dates. In my opinion, the module that really pushed this solution ahead of the competition from an academic point of view was 'biblio', a module that acts like a citation manager within the website.

In our case, we are trying to record clubs, club membership, and club venues. I have created Drupal content types for 'club', for 'membership records', for 'individuals', and for 'venues'. I separated membership records and individuals into two different content types or 'tables', as I think of them. This was a lesson learnt from normalizing my database structure in Access: one individual can have lots of membership records, so this way I don't have to enter the individual more than once. This also helps us investigate research questions about an individual's clubbish behaviour, allowing the discovery of individuals who belong to many clubs and form nodal points in the networks of improvement we're studying.
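
The one-to-many relationship between individuals and membership records can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how Drupal stores anything: the class names, field names and the helper `nodal_individuals` are all hypothetical, and the records in the usage lines are placeholders rather than real data from the project.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Individual:
    individual_id: int
    name: str


@dataclass(frozen=True)
class Club:
    club_id: int
    name: str


@dataclass(frozen=True)
class MembershipRecord:
    # A membership record holds only references, never the person's details.
    individual_id: int
    club_id: int


def nodal_individuals(memberships, minimum_clubs=2):
    """Return ids of individuals who belong to at least `minimum_clubs` distinct clubs."""
    # Collapse duplicate records so repeat entries for one club count once.
    distinct_pairs = {(m.individual_id, m.club_id) for m in memberships}
    clubs_per_person = Counter(person for person, _ in distinct_pairs)
    return sorted(p for p, n in clubs_per_person.items() if n >= minimum_clubs)


if __name__ == "__main__":
    records = [
        MembershipRecord(individual_id=1, club_id=10),
        MembershipRecord(individual_id=1, club_id=11),
        MembershipRecord(individual_id=2, club_id=10),
    ]
    print(nodal_individuals(records))
```

Because each individual exists once and is only pointed at by membership records, counting someone's clubs is a matter of counting the records that point at them.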

The crucial part of the process was how to link these 'tables'. I ended up using a tag team of contributed modules called entity_reference and entity_connect. These modules allowed me to add a field to my content types which was (at least from a user's point of view) equivalent to the foreign key in a conventional database. So, my membership records had fields which referenced a club and an individual, for example. To all intents and purposes, I have created a really usable online relational database management system. The data can even be queried and represented using Structured Query Language (SQL), the major benefit of relational databases as a methodology for answering research questions.
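
For readers who want to see what a foreign key and an SQL query look like in a conventional database, here is a hedged sketch using Python's built-in sqlite3 module. It is emphatically not Drupal's own storage layout, which I have not inspected. The table names, column names, helper functions and the placeholder clubs and people are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical schema: a membership row references one club and one individual.
SCHEMA = """
CREATE TABLE club (
    club_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE individual (
    individual_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE membership (
    membership_id INTEGER PRIMARY KEY,
    club_id       INTEGER NOT NULL REFERENCES club(club_id),
    individual_id INTEGER NOT NULL REFERENCES individual(individual_id)
);
"""


def open_database():
    """Build an in-memory database with a few placeholder records."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when asked to on each connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO club VALUES (?, ?)",
                     [(1, "Club X"), (2, "Club Y")])
    conn.executemany("INSERT INTO individual VALUES (?, ?)",
                     [(1, "Person A"), (2, "Person B")])
    conn.executemany(
        "INSERT INTO membership (club_id, individual_id) VALUES (?, ?)",
        [(1, 1), (2, 1), (1, 2)],
    )
    return conn


def clubs_per_individual(conn):
    """Count the distinct clubs each individual belongs to, busiest first."""
    return conn.execute("""
        SELECT individual.name, COUNT(DISTINCT membership.club_id) AS clubs
        FROM individual
        JOIN membership ON membership.individual_id = individual.individual_id
        GROUP BY individual.individual_id
        ORDER BY clubs DESC, individual.name
    """).fetchall()


def can_add_membership(conn, club_id, individual_id):
    """Try to add a membership; return False if a reference points at nothing."""
    try:
        conn.execute(
            "INSERT INTO membership (club_id, individual_id) VALUES (?, ?)",
            (club_id, individual_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = open_database()
    print(clubs_per_individual(db))
    print(can_add_membership(db, 99, 1))  # club 99 does not exist
```

The reference fields play the same role for my users as `club_id` and `individual_id` do here: a membership record cannot point at a club or person that has not been entered.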

We hope that this will ultimately produce a really valuable tool for scholars of the eighteenth century, as well as enriching people's sense of the history of their areas and of sociable or clubbish behaviour. I also hope that my experience of the technology behind all this will generate ideas for other applications of Drupal and relational databases in humanities research and give other non-developers the confidence to dabble in digital humanities. We're now considering the possibilities for wider collaboration that this platform might offer, but perhaps I'll save that for another post.


July 06, 2012

Remember to wind up the clock

My battered Penguin classics edition of Tristram Shandy

"[W]hen they are once set a-going, whether right or wrong, 'tis not a halfpenny matter, --away they go cluttering like hey-go-mad" (Laurence Sterne, The Life and Opinions of Tristram Shandy, Gentleman (London: Penguin, 1997), p. 5).

As Tristram insists, and as his life demonstrates, beginnings are both difficult to get right, and vitally important. The concept of a miscellany makes me less anxious than the fictional autobiographer, though: I hope my writing here does clutter away like hey-go-mad. Sterne's wonderful book and Wordsworth's Prelude were the two texts that set me a-going and that, whether right or wrong, made me want to spend further time studying the period in which they were written. Much as I love Wordsworth, he doesn't have a detectable sense of humour (I'd love to be corrected on this). For this reason, in this particular celebrity death match, Sterne wins the honour of blog godfather (blog midwife? I don't think I want to go there).

You can tell how much I love this book by the battered state my personal copy is in. I first read it as an undergraduate. I have a strong memory of reading some of it under an oak tree while 'watching' cricket. I think it must have been the Easter vacation. That reading is where all the coloured tabs come from. Sometime not too long after that, I went to visit one of my best friends at her parents' house in Sutton-on-the-Forest. We toured the village visiting various points of extreme significance in her childhood development, including the church. I was uncharacteristically observant, and scanned the list of past vicars. And there he was! Laurence Sterne. My friend's mum was so excited that finally someone appreciated the significance of this, having failed to awaken any enthusiasm in her own kids about Sutton-on-the-Forest's claim to fame.

My site's logos, as well as the title of this miscellany, 'Scrapeana', are openly but respectfully plagiarised from Scrapeana (1792), edited by John Croft. The genre of the miscellany isn't something I know a huge amount about. It's a form of print culture which I can see could be pretty fascinating. A Leverhulme-funded project led by Dr Abigail Williams at Oxford University is currently creating a database of eighteenth-century poetic miscellanies; the participants blog about it here.

Like me, the title page of the 1792 Scrapeana pays homage to Sterne:

Sterne epigraph

The Sterne quote combined with the picture of the monkey shaving (!) is intriguing. While at first I was just amused by this image as an example of a facile sense of humour (which I share), the more I think about it, the more interesting and serious it seems. Its general significance seems to relate to the idea of mimicry, and of the division between the human and the animal. Is man no more than a shaved monkey? Or is the monkey mimicking man, and is this attempt at mimicry futile: is he about to cut his own throat? I found a fable where the latter happens, printed in 1788, in another miscellany, The American museum: or, Repository of ancient and modern fugitive pieces, etc. prose and poetical, at vol. 3, page 279.

In my view, the fable reinforces the idea that certain parts of the population are capable of governing, and of electing, while others are not. As such, it participates in the contestation of the concept of popular sovereignty (the focus of my first book, currently under consideration as "The Majesty of the People: popular sovereignty and the role of the writer in the 1790s"). My sense is that underneath it all the monkey shaving is about a division between those capable of politics and those not. This argument is bolstered by another fable from another miscellany, The Hibernian magazine, or, Compendium of entertaining knowledge (1774), vol. 4, page 53:

I find it fascinating that the division between the political elite and the rest of the human race is reinforced by figuratively separating them into different species. The use of animals is, of course, a convention of the fable, but the politically unenfranchised don't always shake off this zoomorphism outside of the fable, even if the fable is the original source. The most famous eighteenth-century example of this is Burke's 'swinish multitude'. One of the 'logics' behind this seems to be that of the aristocratic discourse of civic humanism: the idea that the people at large are too preoccupied with the necessities of life to become truly 'political animals'. Here I am very much influenced by John Barrell's account of civic humanism in The Political Theory of Painting from Reynolds to Hazlitt: "The Body of the Public" (New Haven: Yale University Press, 1986), passim, but particularly 6-8.

Why not stop there? I have cluttered away, and the post has gone in directions I didn't expect but could have predicted (given my preoccupation with this stuff). And I thought the monkey shaving was just the eighteenth-century equivalent of a cat playing the piano.

sterne, digression

