This tutorial is based on some experimenting I did recently linking a ContentDM collection of maps to a Timeliner in order to plot the collection items on a map and a timeline. There are multiple methods to make this happen, including using the ContentDM API and Google Spreadsheets to bring the collection metadata into Timeliner.
Timeliner is a hosted application that generates timelines and geo-spatial mappings of a given digital collection. The service is free and can be embedded into any webpage using an iFrame.
Timeliner provides a ready-to-use data template for Google Spreadsheets. An institution need only enter the appropriate metadata from a given digital collection, from ContentDM for example, into predefined columns in the template and then publish that spreadsheet. After entering the URL of the spreadsheet, Timeliner constructs an interactive timeline and map feature.
Timeliner is also open-source and can be installed and developed locally.
Harvesting the Metadata from ContentDM
There are two methods for bringing the data into Timeliner from ContentDM:
- via XML export
- via TSV export
The XML method is preferred, but would require an institution to add specific fields to its collections that Timeliner can use. For example, a place field that provides a human-readable placename for a given location, or a date field. In other words, if the data in ContentDM is structured in a Timeliner-ready manner, creating Timeliner interfaces for collections can be automated and rather simple once basic spreadsheets with ImportXML queries are entered into the appropriate Timeliner columns.
Special Note About Errors
For undetermined reasons, it is possible that ImportXML queries using the ContentDM API noted below will not retrieve data. There are a few possible explanations:
- Google limits the number of cells for a given spreadsheet and, importantly, there are limits on the complexity of spreadsheets, such as references to other cells. More information can be found on the Google Spreadsheets Size and Complexity Limits help page.
- ContentDM does time out from time to time
An alternative solution, not covered in this document, would be to export the full XML of a ContentDM collection and store it remotely and then have an XSLT construct a spreadsheet that could then be uploaded to Google Spreadsheets (or generated with ImportData calls within the spreadsheet). The one drawback to this solution is that this method will not dynamically update as new items are added to a collection. Thus, an institution would need to run this process each time an update was made to a collection.
As an example of using the dynamic XML method, a query to retrieve date field data might resemble something as simple as:
Adding similar queries to each Timeliner column will dynamically retrieve the data without any post-ContentDM publication intervention. Again, using the above example, the “date” field would need to entered by catalogers specifically for Timeliner (i.e. using a yyyy-mm-dd format).
Location: Geocoding through open web services
One Timeliner field that might be best handled directly in Google Spreadsheets post cataloging, however, would be Location as it can be automated and save catalogers significant time.
The Location field requires machine-readable latitudinal and longitudinal coordinates for a given place. Fortunately, open-source web services can be queried in a Google Spreadsheet to retrieve these coordinates.
To spare the author of such a spreadsheet from having to write incredibly complicated formulas, it is recommended to carry out this automation in stages:
create a new spreadsheet with multiple sheets:
the first sheet will be your Timeliner Template
the second sheet will be your Geocoding spreadsheet.
Populate the Timeliner Template with metadata using the above ImportXML method. This will include the Place column which contains human-readable place names.
In the Geocoding sheet, create four columns:
Column A will contain a formula that retrieves the data from the Place Column (Column H2) in the Timeliner Template (Sheet 1). For example:
Column B of the Geocoding sheet will query a geocoding web service to obtain the latitude. We will use the MapQuest Nominatim-based Open Geocoding API: http://developer.mapquest.com/web/products/open/geocoding-service
In Column B, you can query this service using the following XPath query, where A2 is the first row of data in Column A (assuming your columnar labels are in the first row):
=ImportXML("http://open.mapquestapi.com/nominatim/v1/?format=xml&q=" & A2 ; "//place/@lat")
Column C follows exactly the same XPATH statement, but replaces the latitude attribute @lat with the longitude attribute @lon.
=ImportXML("http://open.mapquestapi.com/nominatim/v1/?format=xml&q=" & A2 ; "//place/@lon")
- Column D simply needs a comma character entered. This will be used as a separator to separate the latitude and longitude values in the format required by Timeliner.
- Remember to copy all of these formulas down the columns. Google Spreadsheets should calculate the correct values as you do so.
- Finally, back in the Timeliner Template, under the Location column, add a concatenation formula to combine the last three columns of your Geocoding sheet. The structure is:
your actual Google formula might look like this…
Generating Thumbnails and other complicated fields
ContentDM generates a thumbnail image for each item. To create this, simply construct the following URL:
The above example can be broken down like this:
- SITE = cdm16106.contentdm.oclc.org
- COLLECTIONNAME = p16106coll1
- POINTER = 2
The pointer is available as an element in the XML output of a given collection. For example, in blue:
<![CDATA[ /p16106coll1 ]]>
<![CDATA[ 2 ]]>
And so, to construct an IMG tag reference for Timeliner to generate a thumbnail, you would create a field in your Spreadsheet with the following formula:
Often, when constructing these kinds of concatenations you may want to create a third sheet in your spreadsheet called, for example, “Build” or something along those lines. This is a intermediary spreadsheet to begin massaging complicated data that may need to pass through a few ImportXML and Contanetation steps before it is ready for Timeliner.
For example, in order to generate the above concatenation, you would first want two columns to pull from. For example:
one column would have the URL stem:
the second column would have the pointer, drawn from an ImportXML statement:
For those who are not comfortable with XML, it is possible to export Tab-Separated Value (TSV) files of ContentDM metadata. This method is not unlike the XML method, except that the TSV file will be imported directly into Google Spreadsheets and the appropriate fields will then be massaged until the data is suitable for use in Timeliner. This can increase the number of interrelated sheets one might need to lead up the completed Timeliner template.
For example, your spreadsheet might be constructed in the following way:
Sheet 1: Timeliner Template
Sheet 2: Geocoding Template
Sheet 3: Concatenating TSV values (for example, multiple Place fields)
When your data is not pre-structured for Timeliner
Often, ContentDM collections do not have the required fields for Timeliner. In these cases, significant manual intervention will be required. For example, you may have dates combined within the publication field, requiring that a person go through each row and clean up the data so that Timeliner has a simple date it can understand.
Published Google Spreadsheet ready for Timeliner:
Timeliner view of ContentDM data: