All entries for Wednesday 08 October 2008
October 08, 2008
I felt the need to quickly put a thought down before I get swamped in material. I'm currently working through articles on Edward III by Thomas Merriam, a rather clever gentleman who has published several pieces outlining a case for Marlowe's involvement in scenes of that play, whether as writer, as collaborator, or as the original author of source material later adapted by Shakespeare.
These articles mark my first tentative steps out of the relatively safe and familiar world of historical investigation into the decidedly unfamiliar world of stylometric testing. Specifically, I'm just powering through an article detailing the creation of a particular neural network that Merriam and his colleague developed which can distinguish with (according to them!) a high level of accuracy between the writings of Marlowe and Shakespeare, using a select number of function-word ratios.
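To pin down for myself what a "function-word ratio" might look like in practice, here is a minimal Python sketch. Everything in it is an assumption for illustration: the word pairs are my own picks rather than the ones Merriam and his colleague used, the ratio is defined as one word's count over the combined count of the pair, and there is no neural network here at all, only the sort of numbers that could be fed into one.

```python
import re
from collections import Counter

# Hypothetical function-word pairs, chosen only for illustration;
# they are not taken from Merriam's articles.
PAIRS = [("the", "a"), ("of", "in"), ("but", "and")]

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def function_word_ratios(text, pairs=PAIRS):
    """Return one ratio per pair: count(first) / (count(first) + count(second)).

    A pair in which neither word occurs yields None rather than a guess.
    """
    counts = Counter(tokenize(text))
    ratios = []
    for first, second in pairs:
        total = counts[first] + counts[second]
        ratios.append(counts[first] / total if total else None)
    return ratios

# An invented sentence, not a quotation from either playwright.
sample = "The king of the north and a lord of the south met in the field."
print(function_word_ratios(sample))
```

Each text sample ends up as a short list of numbers between 0 and 1, and it would be lists like these, computed for passages of known authorship, that a classifier could be trained on.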
It's mind-boggling stuff for a literature boy. Yet it's also incredibly exciting. Throughout all of the studies I've read so far on the use of modern stylometrics, there's a recurring concern for the application of simple common sense to these tests, which have been misused to damaging effect by many literary scholars who simply didn't understand the basic scientific/mathematical rules which need to be followed in order to provide meaningful results. It's the kind of basic error which leads people to, for example, calculate ratios based on the number of function words per line in a given play without taking into account the length of a line, whether a line is verse or prose, whether the texts are standardised with each other and so on.
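The per-line problem is easy to see with a toy example. The sketch below is only an illustration under my own assumptions: the function-word list is a handful of common words I picked myself, and the two "passages" are invented, containing exactly the same words but broken into lines differently.

```python
import re

# Illustrative function-word list only; a real study would justify its choice.
FUNCTION_WORDS = {"the", "of", "and", "to", "in"}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def rate_per_line(lines):
    """Function words divided by the number of lines (the flawed measure)."""
    hits = sum(1 for line in lines for w in tokenize(line) if w in FUNCTION_WORDS)
    return hits / len(lines)

def rate_per_thousand_words(lines):
    """Function words per 1,000 words, which ignores where the lines break."""
    words = [w for line in lines for w in tokenize(line)]
    hits = sum(1 for w in words if w in FUNCTION_WORDS)
    return 1000 * hits / len(words)

# The same invented words, lineated as two short lines and as one long line.
verse = ["the king of the north", "and the lord of the south"]
prose = ["the king of the north and the lord of the south"]

print(rate_per_line(verse), rate_per_line(prose))
print(rate_per_thousand_words(verse), rate_per_thousand_words(prose))
```

The per-line figure doubles simply because the same words sit on one line instead of two, while the per-thousand-words figure is identical for both. Normalising by word count doesn't solve the other problems (spelling, standardisation between texts), but it does remove this particular one.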
I'm excited because, so far, I get it. Not just the common sense bits, but the technical data. I'm not saying I could design these tests myself (give me time), but I'm picking up how to read them and, more importantly, how to interrogate them. One of the main problems with this field, as Merriam himself points out in a 2002 article, is that the level of detail needed to make a thorough case in any particular investigation is so massive that it renders the work unreadable to anyone who isn't a specialist. By contrast, if you don't put in the detail, clarity comes at the expense of accuracy and devalues the research. Therefore, for anyone seriously considering studying authorship, it's imperative to gain a solid understanding of how to read this stuff, and how to interpret and respond to it. Otherwise, you just have to take people's word for it or, alternatively, dismiss it out of hand, as many scholars do.
This is a massive challenge for me, but one I'm really excited about. It's nice to be doing something interdisciplinary (though I won't be at a stage where I can hold meaningful conversations about this with Computer Science PhDs for quite some time), and it's nice to be taking on an area which puts off so many literary academics. Bring it on!