Oxford University Press's
Academic Insights for the Thinking World

Devising data structures for scholarly works

For over 100 years, Oxford University Press has been publishing scholarly editions of major works. Prominent scholars reviewed and delivered authoritative versions of authors’ work with notes on citations, textual variations, references, and commentary added line by line, from alternate titles for John Donne’s poetry to biographical information on recipients of Adam Smith’s correspondence. In an effort to move these works online in an interlinked fashion, we were faced with an interesting challenge: how to structure the content digitally so that it can be viewed, searched, and navigated to best effect.

The Oxford Scholarly Editions Online (OSEO) website needs to deal seamlessly and consistently with all manner of works, from letters and diary entries, through poetry and plays, to large works such as Burton’s Anatomy of Melancholy. As well as the variety of content type, there is also a wide variety in editorial style. The editions in OSEO have been published over a period of more than a century, and different editors have found different ways of dealing with the source materials available to them.

The main challenge in devising a data structure for the scholarly editions was to find a model that would be flexible enough to accommodate the variety of the content, but rich enough to allow the content to work in a useful way in its new digital environment. Describing all the challenges of creating a data model for OUP’s scholarly editions would be the subject of a book in itself. But we can illustrate the challenges by looking at a specific feature that is common to all scholarly editions, the editorial commentary.

In the printed book, notes on the text can appear in various locations: at the foot of the page, on the page facing the text of a work, at the back of the book, or sometimes in a separate volume.

Here’s an example from the printed Hamlet, showing the editorial notes on textual variants at the foot of the page:

[Image: OSEO hamlet]

On the website, we wanted all these notes to appear in a panel next to the text, and to “march in step” so that a reader could immediately tell when a line of text had an accompanying note. To enable each note to appear alongside the corresponding line of text, we needed to create a link from note to text, one that works no matter where the note is located in the printed book. Each link needs a fixed target in the text, which in this case is the line number. In print, typically only every fifth or tenth line is numbered, but in the digital format we need to number every line (exposing the occasional instance of an editor losing count!). So instead of the “127” in the print footnote, we must create a line number object, placed alongside the text at the start of the line.

Here’s what a line number object looks like in the digital XML format:

<milestone unit="line" num="129" id="9780198129103-milestone-469"/>

The line number object, called an “element” in XML, has several components:

  • The element name (milestone, borrowed from the Text Encoding Initiative) tells us what kind of object it is. Dozens of different elements are used in OSEO to mark up the text.
  • The value of unit tells us what kind of milestone it is. In this case it marks a line of text, but not every work has line numbering, and so other values are also used.
  • The value of num tells us the line number.
  • The value of id is a unique identifier that acts as the target of the hyperlink created from a note.
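As a rough illustration of the components listed above, the following Python sketch reads the sample milestone element and pulls out each part. It uses only the standard library's xml.etree.ElementTree module; the function name describe_milestone is invented for this example and is not part of the OSEO codebase.

```python
import xml.etree.ElementTree as ET

# The sample milestone element from the article, written with plain ASCII quotes.
MILESTONE_XML = '<milestone unit="line" num="129" id="9780198129103-milestone-469"/>'


def describe_milestone(xml_fragment):
    """Return the element name and attribute values of one milestone.

    Hypothetical helper for illustration only.
    """
    element = ET.fromstring(xml_fragment)
    return {
        "name": element.tag,          # what kind of object it is
        "unit": element.get("unit"),  # what kind of milestone it is
        "num": element.get("num"),    # the line number, kept as a string
        "id": element.get("id"),      # unique identifier used as a link target
    }


parts = describe_milestone(MILESTONE_XML)
for key, value in parts.items():
    print(f"{key}: {value}")
```

The line number is left as a string here, since the article notes that milestones can mark units other than lines and their numbering may not always be a plain integer.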

By putting these milestones in the text, we create targets for links from the editorial notes and textual variants. The website code can then pull all the relevant notes in alongside the text, no matter where they appeared in the print format.
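The idea of pulling notes in alongside the text can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual OSEO markup or website code: the edition, text, notes, and note elements, the type and target attributes, and the shortened ids are all hypothetical. Only the milestone element and its unit, num, and id attributes come from the article.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature edition. Only <milestone> and its attributes follow
# the article; every other element and attribute name is an assumption.
SAMPLE = """
<edition>
  <text>
    <milestone unit="line" num="128" id="m-128"/>first line of text
    <milestone unit="line" num="129" id="m-129"/>second line of text
  </text>
  <notes>
    <note type="variant" target="m-129">hypothetical textual variant</note>
    <note type="commentary" target="m-129">hypothetical commentary</note>
  </notes>
</edition>
"""


def notes_by_line(xml_text):
    """Group notes under the line number of the milestone they point to."""
    root = ET.fromstring(xml_text)

    # Map each line milestone's id to its line number.
    line_for_id = {
        m.get("id"): m.get("num")
        for m in root.iter("milestone")
        if m.get("unit") == "line"
    }

    grouped = {}
    for note in root.iter("note"):
        line = line_for_id.get(note.get("target"))
        if line is None:
            continue  # skip notes whose target is not a known line milestone
        grouped.setdefault(line, []).append((note.get("type"), note.text))
    return grouped


grouped = notes_by_line(SAMPLE)
for line, notes in grouped.items():
    for note_type, content in notes:
        print(f"line {line} [{note_type}]: {content}")
```

Because each note is resolved through the milestone's id rather than its position in the file, the lookup works the same way wherever the note sat in the printed book.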

[Image: OSEO hamlet 2]

The presence of a diamond or circle next to a line of text indicates that there is a note attached to it. The two different symbols are for the two kinds of note, textual variants and editorial commentary. Clicking on the symbol brings the note into line with the text, by making the text in the note pane scroll to the right place. The text is also highlighted with a yellow “flash.” Notes aren’t permanently fixed alongside the text because the notes often take up more space than the original work. The click and flash offer a neat digital workaround not possible with ink and paper.

Of course, this is just one small fragment of the OSEO data model. Every display style and piece of functionality in OSEO has an XML element behind it. Close collaboration between content experts, data architects, and web developers was essential to getting the best possible experience of navigating scholarly editions online. And now, no longer faced with a pile of books, researchers can delve into these works in a new manner, creating opportunities to gain further insight into the humanities.

Image Credit: “Research Data Management” by janneke staaks. CC BY NC 2.0 via Flickr.
