Using Timeliner with ContentDM

This tutorial is based on some experimenting I did recently linking a ContentDM collection of maps to a Timeliner in order to plot the collection items on a map and a timeline. There are multiple methods to make this happen, including using the ContentDM API and Google Spreadsheets to bring the collection metadata into Timeliner.

Background

Timeliner is a hosted application that generates timelines and geo-spatial mappings of a given digital collection. The service is free and can be embedded into any webpage using an iFrame.

Timeliner provides a ready-to-use data template for Google Spreadsheets. An institution need only enter the appropriate metadata from a given digital collection, from ContentDM for example, into predefined columns in the template and then publish that spreadsheet. After entering the URL of the spreadsheet, Timeliner constructs an interactive timeline and map feature.

Timeliner is also open-source and can be installed and developed locally.

Harvesting the Metadata from ContentDM

There are two methods for bringing the data into Timeliner from ContentDM:

  1. via XML export
  2. via TSV export

XML Method

The XML method is preferred, but would require an institution to add specific fields to its collections that Timeliner can use. For example, a place field that provides a human-readable placename for a given location, or a date field. In other words, if the data in ContentDM is structured in a Timeliner-ready manner, creating Timeliner interfaces for collections can be automated and rather simple once basic spreadsheets with ImportXML queries are entered into the appropriate Timeliner columns.

Special Note About Errors

For undetermined reasons, it is possible that ImportXML queries using the ContentDM API noted below will not retrieve data. There are a few possible explanations:

    1. Google limits the number of cells for a given spreadsheet and, importantly, there are limits on the complexity of spreadsheets, such as references to other cells. More information can be found on the Google Spreadsheets Size and Complexity Limits help page.
    2. ContentDM does time out from time to time

An alternative solution, not covered in this document, would be to export the full XML of a ContentDM collection and store it remotely and then have an XSLT construct a spreadsheet that could then be uploaded to Google Spreadsheets (or generated with ImportData calls within the spreadsheet). The one drawback to this solution is that this method will not dynamically update as new items are added to a collection. Thus, an institution would need to run this process each time an update was made to a collection.

As an example of using the dynamic XML method, a query to retrieve date field data might resemble something as simple as:

=ImportXML(XMLPATH,XPATHSELECTION)

for example…

=ImportXML("https://server16106.contentdm.oclc.org/dmwebservices/index.php?q=dmQuery/lpnc1/CISOSEARCHALL/title!creato!subjec!date!descri/title/1000/1/0/0/0/0/0/0/xml", "//date")

Adding similar queries to each Timeliner column will dynamically retrieve the data without any post-ContentDM publication intervention. Again, using the above example, the “date” field would need to entered by catalogers specifically for Timeliner (i.e. using a yyyy-mm-dd format).

Location: Geocoding through open web services

One Timeliner field that might be best handled directly in Google Spreadsheets post cataloging, however, would be Location as it can be automated and save catalogers significant time.

The Location field requires machine-readable latitudinal and longitudinal coordinates for a given place. Fortunately, open-source web services can be queried in a Google Spreadsheet to retrieve these coordinates.

To spare the author of such a spreadsheet from having to write incredibly complicated formulas, it is recommended to carry out this automation in stages:

  1. create a new spreadsheet with multiple sheets:

    1. the first sheet will be your Timeliner Template

    2. the second sheet will be your Geocoding spreadsheet.

  2. Populate the Timeliner Template with metadata using the above ImportXML method. This will include the Place column which contains human-readable place names.

  3. In the Geocoding sheet, create four columns:

  4. Column A will contain a formula that retrieves the data from the Place Column (Column H2) in the Timeliner Template (Sheet 1). For example:

=’Sheet 1′!H2

  1. Column B of the Geocoding sheet will query a geocoding web service to obtain the latitude. We will use the MapQuest Nominatim-based Open Geocoding API: http://developer.mapquest.com/web/products/open/geocoding-service

In Column B, you can query this service using the following XPath query, where A2 is the first row of data in Column A (assuming your columnar labels are in the first row):

=ImportXML("http://open.mapquestapi.com/nominatim/v1/?format=xml&q=" & A2 ; "//place[1]/@lat")
  1. Column C follows exactly the same XPATH statement, but replaces the latitude attribute @lat with the longitude attribute @lon.

=ImportXML("http://open.mapquestapi.com/nominatim/v1/?format=xml&q=" & A2 ; "//place[1]/@lon")
  1. Column D simply needs a comma character entered. This will be used as a separator to separate the latitude and longitude values in the format required by Timeliner.
  2. Remember to copy all of these formulas down the columns. Google Spreadsheets should calculate the correct values as you do so.
  3. Finally, back in the Timeliner Template, under the Location column, add a concatenation formula to combine the last three columns of your Geocoding sheet. The structure is:
=CONCATENATE(LATITUDE,COMMA,LONGITUDE)

your actual Google formula might look like this…

=CONCATENATE(Geocoding!B2,Geocoding!C2,Geocoding!D2)

Generating Thumbnails and other complicated fields

ContentDM generates a thumbnail image for each item. To create this, simply construct the following URL:

SITE/utils/getthumbnail/collection/COLLECTIONNAME/id/POINTER

For example:

http://cdm16106.contentdm.oclc.org/utils/getthumbnail/collection/p16106coll1/id/2

The above example can be broken down like this:

    • SITE = cdm16106.contentdm.oclc.org
    • COLLECTIONNAME = p16106coll1
    • POINTER = 2

The pointer is available as an element in the XML output of a given collection. For example, in blue:

<record>
<collection>
<![CDATA[ /p16106coll1 ]]>
</collection>
<pointer>
<![CDATA[ 2 ]]>
</pointer>

And so, to construct an IMG tag reference for Timeliner to generate a thumbnail, you would create a field in your Spreadsheet with the following formula:

=CONCAT("http://cdm16106.contentdm.oclc.org/utils/getthumbnail/collection/p16106coll1/id/","2")

Often, when constructing these kinds of concatenations you may want to create a third sheet in your spreadsheet called, for example, “Build” or something along those lines. This is a intermediary spreadsheet to begin massaging complicated data that may need to pass through a few ImportXML and Contanetation steps  before it is ready for Timeliner.

For example, in order to generate the above concatenation, you would first want two columns to pull from. For example:

  1. one column would have the URL stem:

http://cdm16106.contentdm.oclc.org/utils/getthumbnail/collection/p16106coll1/id/
  1. the second column would have the pointer, drawn from an ImportXML statement:

ImportXML("https://server16106.contentdm.oclc.org/dmwebservices/index.php?q=dmQuery/lpnc1/CISOSEARCHALL/title!creato!date!descri/title/1000/1/0/0/0/0/0/0/xml","//id")

TSV Method

For those who are not comfortable with XML, it is possible to export Tab-Separated Value (TSV) files of ContentDM metadata. This method is not unlike the XML method, except that the TSV file will be imported directly into Google Spreadsheets and the appropriate fields will then be massaged until the data is suitable for use in Timeliner. This can increase the number of interrelated sheets one might need to lead up the completed Timeliner template.

For example, your spreadsheet might be constructed in the following way:

Sheet 1: Timeliner Template

Sheet 2: Geocoding Template

Sheet 3: Concatenating TSV values (for example, multiple Place fields)

When your data is not pre-structured for Timeliner

Often, ContentDM collections do not have the required fields for Timeliner. In these cases, significant manual intervention will be required. For example, you may have dates combined within the publication field, requiring that a person go through each row and clean up the data so that Timeliner has a simple date it can understand.

Examples

Published Google Spreadsheet ready for Timeliner:

https://docs.google.com/spreadsheet/ccc?key=0AjJ_C_koXVI6dE9fcVlPSzBJWkc3TnRtZnJjTGEtWkE&usp=sharing

Timeliner view of ContentDM data:

http://digicol.lib.depaul.edu/cdm/timeline/collection/p16106coll1

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s