30.1. Data sources

Documentation

These sections of the Trove Data Guide explain how to access maps and place data from different parts of Trove:

Pre-harvested datasets

The GLAM Workbench provides a number of datasets containing map and place data harvested from Trove.

Trove digitised maps metadata

This dataset contains metadata describing digitised maps in Trove, harvested from the Trove API and other sources.

Trove digitised maps – coordinates

This dataset was generated from the harvest of digitised maps metadata. Coordinate strings in the metadata (points and bounding boxes) were parsed and converted to decimal values.

MARC Geographic Areas – Wikidata mappings

Trove uses codes from the MARC Geographic Areas list to identify locations in metadata records. I couldn’t find any mappings of these codes to other sources of geospatial information, so I fired up OpenRefine and reconciled the geographic area names against Wikidata. Once I’d linked as many as possible, I copied additional information from Wikidata, such as ISO country codes, GeoNames identifiers, and geographic coordinates.

Geolocated newspaper titles

This dataset links newspaper titles in Trove to their places of publication and circulation.

Creating datasets

Exploring digitised maps in Trove

I knew there were lots of great maps you could download from Trove, but how many? And how big were the files? I thought I’d try to quantify this a bit by harvesting and analysing the metadata.

Parse map coordinates from metadata

The harvest of digitised maps metadata includes a coordinates column that provides a string representation of either a point or a bounding box. This notebook attempts to parse the coordinate string and convert the values to decimals. It then uses the decimal values to explore the geographical context of Trove’s digitised map collection.
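To give a sense of what this kind of parsing involves, here is a minimal sketch, not the notebook's actual code. It assumes coordinate strings look something like (E 144°50ʹ--E 145°10ʹ/S 37°40ʹ--S 37°55ʹ), with a hemisphere letter followed by degrees and optional minutes and seconds; the real strings in the harvested metadata may vary. The names DMS, dms_to_decimal and parse_coordinates are hypothetical.

```python
import re

# Assumed format: hemisphere letter, degrees, optional minutes and seconds.
# The prime characters accepted here are a guess at what the metadata uses.
DMS = re.compile(
    r"""([NSEW])\s*(\d{1,3})\s*°      # hemisphere and degrees
        (?:\s*(\d{1,2})\s*[ʹ'′])?     # optional minutes
        (?:\s*(\d{1,2})\s*[ʺ"″])?     # optional seconds
    """,
    re.VERBOSE,
)


def dms_to_decimal(hemisphere, degrees, minutes="", seconds=""):
    """Convert degrees/minutes/seconds to a signed decimal value."""
    value = int(degrees) + int(minutes or 0) / 60 + int(seconds or 0) / 3600
    # South and west are negative in decimal notation.
    return -value if hemisphere in "SW" else value


def parse_coordinates(text):
    """Return a bounding box dict, or None if no usable values are found.

    A single point yields a box whose west/east and south/north are equal.
    """
    lons, lats = [], []
    for hemisphere, degrees, minutes, seconds in DMS.findall(text):
        value = dms_to_decimal(hemisphere, degrees, minutes, seconds)
        (lons if hemisphere in "EW" else lats).append(value)
    if not lons or not lats:
        return None
    return {
        "west": min(lons),
        "east": max(lons),
        "south": min(lats),
        "north": max(lats),
    }


box = parse_coordinates("(E 144°50ʹ--E 145°10ʹ/S 37°40ʹ--S 37°55ʹ)")
point = parse_coordinates("(E 145°33ʹ/S 37°42ʹ)")
```

Taking the minimum and maximum of the parsed values is a simplification: it will give the wrong box for a map that crosses the 180° meridian, and it does nothing to catch values that were entered incorrectly in the original record.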

Create a subset of digitised maps by searching for coordinates

This notebook helps you create subsets of digitised maps by searching for maps whose centre points fall within a specified bounding box. You can download the results as CSV and GeoJSON files.
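The centre-point test is simple enough to sketch with the standard library. The example below is an illustration under stated assumptions, not the notebook's implementation: it assumes each map record is a dict with decimal west, east, south and north values plus a title, and the sample records and search box are made up. The function names centre, within, subset_by_bbox and to_geojson are hypothetical.

```python
import json


def centre(record):
    """Midpoint of a record's bounding box as (longitude, latitude)."""
    lon = (record["west"] + record["east"]) / 2
    lat = (record["south"] + record["north"]) / 2
    return lon, lat


def within(point, bbox):
    """True if the point falls inside bbox = (west, south, east, north)."""
    lon, lat = point
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north


def subset_by_bbox(records, bbox):
    """Keep only records whose centre point falls inside the search box."""
    return [r for r in records if within(centre(r), bbox)]


def to_geojson(records):
    """Build a GeoJSON FeatureCollection of centre points.

    GeoJSON positions are ordered [longitude, latitude].
    """
    features = [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": list(centre(r))},
            "properties": {"title": r["title"]},
        }
        for r in records
    ]
    return {"type": "FeatureCollection", "features": features}


# Made-up records for illustration only.
maps = [
    {"title": "Hypothetical map A", "west": 144.5, "east": 145.5,
     "south": -38.5, "north": -37.5},
    {"title": "Hypothetical map B", "west": 115.5, "east": 116.5,
     "south": -32.5, "north": -31.5},
]

# A rough, illustrative search box over south-eastern Australia.
search_box = (140.0, -40.0, 150.0, -34.0)

selected = subset_by_bbox(maps, search_box)
geojson = to_geojson(selected)
print(json.dumps(geojson, indent=2))
```

Testing the centre point rather than the full extent keeps the search simple, but it means a large map that only partly overlaps the search box can be missed if its centre falls outside.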