30.1. Data sources#
Documentation#
These sections of the Trove Data Guide explain how to access maps and place data from different parts of Trove:
Pre-harvested datasets#
The GLAM Workbench provides a number of datasets containing map and place data harvested from Trove.
This dataset contains metadata describing digitised maps in Trove, harvested from the Trove API and other sources.
This dataset was generated from the harvest of digitised maps metadata. Coordinate strings in the metadata (points and bounding boxes) were parsed and converted to decimal values.
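The conversion described above can be sketched roughly as follows. This is a minimal illustration, not the code used to build the dataset. It assumes coordinate strings follow the common cartographic pattern of a hemisphere letter followed by degrees, optional minutes, and optional seconds (for example `E 145°30'--E 146°00'/S 37°15'--S 38°00'`); the actual strings in the harvested metadata may vary. The names `COORD_RE`, `dms_to_decimal`, and `parse_coordinates` are hypothetical.

```python
import re

# Assumed pattern: hemisphere letter, degrees, optional minutes, optional seconds.
# Several prime-like characters are accepted because metadata is rarely consistent.
COORD_RE = re.compile(
    r"""
    ([NSEW])\s*                                   # hemisphere letter
    (\d{1,3})\s*\u00b0\s*                         # degrees and degree sign
    (?:(\d{1,2})\s*[\u02b9\u2032']\s*)?           # optional minutes
    (?:(\d{1,2}(?:\.\d+)?)\s*[\u02ba\u2033"])?    # optional seconds
    """,
    re.VERBOSE,
)


def dms_to_decimal(hemisphere, degrees, minutes="", seconds=""):
    """Convert degrees/minutes/seconds to a signed decimal value."""
    value = float(degrees) + float(minutes or 0) / 60 + float(seconds or 0) / 3600
    # South and west are negative by convention.
    return -value if hemisphere in "SW" else value


def parse_coordinates(text):
    """Return a bounding box dict, or None if no usable coordinates are found.

    A point is returned as a box whose west/east and south/north values match.
    Taking min/max is a simplification: it will be wrong for boxes that
    cross the antimeridian.
    """
    lons, lats = [], []
    for hemisphere, degrees, minutes, seconds in COORD_RE.findall(text):
        value = dms_to_decimal(hemisphere, degrees, minutes, seconds)
        (lons if hemisphere in "EW" else lats).append(value)
    if not lons or not lats:
        return None
    return {
        "west": min(lons),
        "east": max(lons),
        "south": min(lats),
        "north": max(lats),
    }


box = parse_coordinates("(E 145°30'--E 146°00'/S 37°15'--S 38°00')")
print(box)
```

Returning points and bounding boxes in the same shape keeps later steps, such as calculating centre points, simple. Strings that do not match the assumed pattern return `None`, so they can be counted and checked by hand.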
Trove uses codes from the MARC Geographic Areas list to identify locations in metadata records. I couldn’t find any mappings of these codes to other sources of geospatial information, so I fired up OpenRefine and reconciled the geographic area names against Wikidata. Once I’d linked as many as possible, I copied additional information from Wikidata, such as ISO country codes, GeoNames identifiers, and geographic coordinates.
This dataset links newspaper titles in Trove to their places of publication and circulation.
Creating datasets#
- Exploring digitised maps in Trove
I knew there were lots of great maps you could download from Trove, but how many? And how big were the files? I thought I’d try to quantify this a bit by harvesting and analysing the metadata.
- Parse map coordinates from metadata
The harvest of digitised maps metadata includes a coordinates column that provides a string representation of either a point or a bounding box. This notebook attempts to parse the coordinate string and convert the values to decimals. It then uses the decimal values to explore the geographical context of Trove’s digitised map collection.
- Create a subset of digitised maps by searching for coordinates
This notebook helps you create subsets of digitised maps by searching for maps whose centre points fall within a specified bounding box. You can download the results as CSV and GeoJSON files.
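The subsetting approach in the last notebook, selecting maps whose centre points fall within a bounding box, can be sketched with the standard library alone. This is an illustrative outline, not the notebook's code. The record fields (`title`, `west`, `south`, `east`, `north`), the sample values, and the function names are all hypothetical; the real harvested metadata uses its own column names.

```python
import csv
import io
import json

# Hypothetical records with bounding boxes already converted to decimal degrees.
sample_maps = [
    {"title": "Map A", "west": 144.5, "south": -38.5, "east": 145.5, "north": -37.5},
    {"title": "Map B", "west": 115.5, "south": -32.5, "east": 116.5, "north": -31.5},
]


def centre(record):
    """Return the (longitude, latitude) centre of a record's bounding box."""
    lon = (record["west"] + record["east"]) / 2
    lat = (record["south"] + record["north"]) / 2
    return lon, lat


def in_bbox(point, bbox):
    """Check whether a point falls inside bbox = (west, south, east, north)."""
    lon, lat = point
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north


def subset_by_bbox(records, bbox):
    """Keep only the records whose centre point falls within bbox."""
    return [r for r in records if in_bbox(centre(r), bbox)]


def to_geojson(records):
    """Build a GeoJSON FeatureCollection of centre points."""
    features = []
    for r in records:
        lon, lat = centre(r)
        features.append(
            {
                "type": "Feature",
                # GeoJSON positions are ordered longitude, then latitude.
                "geometry": {"type": "Point", "coordinates": [lon, lat]},
                "properties": {"title": r["title"]},
            }
        )
    return {"type": "FeatureCollection", "features": features}


def to_csv(records):
    """Serialise records to CSV text."""
    buffer = io.StringIO()
    fields = ["title", "west", "south", "east", "north"]
    writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Search box given as (west, south, east, north); values chosen for illustration.
search_box = (140.0, -39.5, 150.0, -34.0)
subset = subset_by_bbox(sample_maps, search_box)
print(json.dumps(to_geojson(subset), indent=2))
print(to_csv(subset))
```

Testing centre points rather than whole bounding boxes means a very large map is matched only when its middle lies in the search area, which avoids world or continent maps appearing in every result.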