Contents

  • 5.1. Identifying identifiers
  • 5.2. Digitised newspapers
  • 5.3. Other digitised resources
  • 5.4. Work and version records
  • 5.5. People and organisations
  • 5.6. Web archives
  • 5.7. Transforming links

5. Links and identifiers

What are the different types of urls you’ll find when you’re using Trove? Learn what they do and why it matters.

5.1. Identifying identifiers

As you navigate around Trove you’ll find a range of url patterns pointing to different types of resources. Some examples are:

  • https://trove.nla.gov.au/newspaper/article/61389505

  • http://nla.gov.au/nla.news-page5417618

  • https://trove.nla.gov.au/work/1144040

  • http://nla.gov.au/nla.news-title246

Some of these urls are ‘identifiers’, maintained by the NLA as persistent links to resources. These identifiers are independent of the platform used to deliver content, so should persist across site redesigns and technology upgrades. This built-in persistence is why identifiers are recommended for use in citations. When you ‘resolve’ a persistent identifier by plugging it into your web browser you often end up at a different url. This is because the identifier redirects you to the appropriate page in the current site structure.
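To see the redirection described above in action, you can ask Python to follow a persistent identifier and report where it lands. This is only a minimal sketch using the standard library: the function name resolve_identifier and the User-Agent string are made up for illustration, and the NLA's servers may treat scripted requests differently from a web browser.

```python
from urllib.request import Request, urlopen


def resolve_identifier(identifier_url, timeout=30):
    """Follow any redirects from a persistent identifier and return the final url."""
    # Sending a descriptive User-Agent is an assumption, not a documented requirement.
    request = Request(identifier_url, headers={"User-Agent": "identifier-resolver-example"})
    with urlopen(request, timeout=timeout) as response:
        # urlopen follows HTTP redirects by default; geturl() reports the url
        # that was actually retrieved after any redirects.
        return response.geturl()
```

Calling resolve_identifier("http://nla.gov.au/nla.news-title246") needs a network connection and, if the request succeeds, should return the title's current url in Trove rather than the identifier itself.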

Use the Cite tab, Luke!

When you want to save a link it’s tempting just to copy the url in your browser’s address bar, but it’s better to use the persistent identifier if one is available. Most of the time you can find the persistent identifier by clicking on the Cite tab in the Trove web interface. If you’re using the API (since version 3), persistent identifiers will be automatically included in results.

NLA identifiers generally start with https://nla.gov.au/nla. (the trailing full stop is part of the prefix). For example, a newspaper article identifier looks like this: http://nla.gov.au/nla.news-article61389505. Notice that this identifier includes a numeric id as well as information about the type of thing this is – news-article.

The identifiers used for other digitised resources have a more generic form, starting with http://nla.gov.au/nla.obj. Digitised books, individual pages in a book, photos, periodicals, periodical issues, and finding aids all share the same basic pattern. You can’t tell by looking at one of these identifiers what it actually points to – you have to follow it and find out!

Not everything in Trove has a persistent identifier. Works, for example, only have a url of the form https://trove.nla.gov.au/work/1144040. This identifies a work within the context of Trove, but there’s no guarantee of persistence. Work records are aggregated from a range of sources, and can be withdrawn or deleted by the contributing organisation.

Some identifiers lead outside of Trove to other NLA systems such as Libraries Australia and the main catalogue.

5.2. Digitised newspapers

Digitised newspapers have the most highly-structured and consistent identifier scheme.

Table 5.1 Newspaper identifiers

| Entity type | Identifier format | Example | Resolves to url |
| --- | --- | --- | --- |
| title | nla.news-title[NUMERIC_ID] | http://nla.gov.au/nla.news-title246 | https://trove.nla.gov.au/newspaper/title/246 |
| issue | nla.news-issue[NUMERIC_ID] | https://nla.gov.au/nla.news-issue120169 | https://trove.nla.gov.au/newspaper/page/1216627 |
| page | nla.news-page[NUMERIC_ID] | http://nla.gov.au/nla.news-page8164936 | https://trove.nla.gov.au/newspaper/page/8164936 |
| article | nla.news-article[NUMERIC_ID] | http://nla.gov.au/nla.news-article89701669 | https://trove.nla.gov.au/newspaper/article/89701669/ |

You’ll probably only find issue identifiers in results from the API’s /newspaper/title endpoint. They resolve to the first page of the issue, as issues have no separate landing page.

The numeric parts of the article and title identifiers can be used with the Trove API’s /newspaper and /newspaper/title endpoints to retrieve metadata about them.
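Because the newspaper identifiers are so regular, they can be pulled apart with a simple pattern. The sketch below is illustrative only: the function names are invented, and the API base url (https://api.trove.nla.gov.au/v3) is an assumption that isn't stated on this page. Issue identifiers are handled separately because they resolve to the issue's first page, whose id can't be derived from the issue id.

```python
import re

# Matches the entity type and numeric id in a newspaper identifier.
NEWSPAPER_ID = re.compile(r"nla\.news-(title|issue|page|article)(\d+)")

WEB_BASE = "https://trove.nla.gov.au/newspaper"
API_BASE = "https://api.trove.nla.gov.au/v3"  # assumed base url for version 3 of the API


def parse_newspaper_identifier(identifier):
    """Return (entity_type, numeric_id) from a newspaper identifier."""
    match = NEWSPAPER_ID.search(identifier)
    if match is None:
        raise ValueError(f"Not a newspaper identifier: {identifier}")
    return match.group(1), match.group(2)


def web_url(identifier):
    """Build the Trove web url, or return None for issues (which must be resolved)."""
    entity, numeric_id = parse_newspaper_identifier(identifier)
    if entity == "issue":
        return None
    return f"{WEB_BASE}/{entity}/{numeric_id}"


def api_url(identifier):
    """Build an API url for articles and titles; other entity types return None."""
    entity, numeric_id = parse_newspaper_identifier(identifier)
    if entity == "article":
        return f"{API_BASE}/newspaper/{numeric_id}"
    if entity == "title":
        return f"{API_BASE}/newspaper/title/{numeric_id}"
    return None
```

For example, web_url("http://nla.gov.au/nla.news-page8164936") rebuilds the page url listed in Table 5.1 without making a request. Requests to the API itself also need an API key, which this sketch doesn't cover.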

You might also find a few newspaper links in the wild that were generated by older versions of the digitised newspapers platform, for example things like: http://trove.nla.gov.au/ndp/del/article/19983475. These should be redirected to the current urls.

5.3. Other digitised resources

Beyond newspapers, things get a bit more complicated. As noted above, the rest of the NLA’s digitised resources share a single identifier pattern starting with http://nla.gov.au/nla.obj. This applies to all formats, and all the physical and logical components that combine to display the resource online. For example, three volumes of The Mammals of Australia by John Gould have been digitised and are available online in Trove. Here are the different types of identifiers used to organise and deliver this one publication:

Table 5.2 Example of book identifiers

| Entity type | Identifier | Note |
| --- | --- | --- |
| Collection | https://nla.gov.au/nla.obj-55392912 | The three volumes are organised as a collection with its own identifier. |
| Volume | https://nla.gov.au/nla.obj-55392920 | Each volume has its own identifier. |
| Page | http://nla.gov.au/nla.obj-2334456661 | Each page has its own identifier; these are listed as ‘image identifier’ in the ‘Cite’ tab. |
| Chapter or section | http://nla.gov.au/nla.obj-2685532114 | If a resource has logical divisions, like articles or chapters, they each have their own identifiers. |

Fig. 5.1 These platypuses have two identifiers: nla.obj-2334463531 points to the page, while nla.obj-2685532114 points to the section headed ‘ORNITHORHYNCHUS ANATINUS.’, but they both end up at the same place.

The page and section identifiers are redirected to the volume that contains them, and are used as parameters in the digitised book viewer to land you at the expected location. For example, if you resolve the page identifier, you end up at a url that looks like this:

https://nla.gov.au/nla.obj-55392920/view?partId=nla.obj-2334456661

The first identifier points to the volume, then the partId parameter specifies the page to load. Similarly, the section identifier resolves to:

https://nla.gov.au/nla.obj-55392920/view?sectionId=nla.obj-2685532114&partId=nla.obj-2334463531

The first identifier points to the volume, then the sectionId parameter specifies the section, and partId specifies the page on which the section begins.
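If you have a resolved viewer url, the volume, page, and section identifiers can be separated with Python's standard url tools. This is a small sketch rather than an official method: the function name and the dictionary keys are invented for illustration, and it assumes the url follows the pattern shown above.

```python
from urllib.parse import parse_qs, urlparse


def parse_viewer_url(url):
    """Split a digitised resource viewer url into its component identifiers."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    # The first path segment is the identifier of the containing volume or issue.
    container = parsed.path.strip("/").split("/")[0]
    return {
        "container": container,
        # partId is the page to load; sectionId is only present for sections.
        "page": params.get("partId", [None])[0],
        "section": params.get("sectionId", [None])[0],
    }


section_url = (
    "https://nla.gov.au/nla.obj-55392920/view"
    "?sectionId=nla.obj-2685532114&partId=nla.obj-2334463531"
)
parts = parse_viewer_url(section_url)
```

Here parts["container"] holds the volume identifier, parts["page"] the page on which the section begins, and parts["section"] the section itself. For a url with only a partId parameter, the "section" value is None.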

You can use digitised resources without grappling with these complexities, but it’s useful to understand the differences. For example, the Magazines & Newsletters category contains mostly links to articles in periodicals. These links are section identifiers which resolve to a particular periodical issue and use the sectionId parameter to deliver the requested article.

The differences can be important when you’re trying to access data from a particular component. There are examples of this in the ‘Other digitised resources’ section of this guide.

5.4. Work and version records

As noted above, works don’t have persistent identifiers, but they do use a standard url format.

Table 5.3 Work urls

| Entity type | URL format | Example |
| --- | --- | --- |
| work | work/[NUMERIC_ID] | https://trove.nla.gov.au/work/1144040 |
| version/edition of a work | work/[WORK_ID]/version/[VERSION_ID] | https://trove.nla.gov.au/work/1144040/version/25729065 |

The numeric part of the work url, 1144040 in the example above, can be used with the API’s /work endpoint to retrieve metadata describing the work.

Tip

Some work records include numeric identifiers from Libraries Australia and the Australian National Bibliographic Database (ANBD). These numeric values can be used to construct persistent identifiers for the linked records, which then resolve to the Trove work page. For example, this version includes a Libraries Australia numeric identifier with the value 2767186. Using this you can construct a persistent link of the form: https://nla.gov.au/anbd.bib-an2767186. This link will take you to the version in Trove.
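The work url format and the Libraries Australia link pattern can both be handled with a few lines of Python. This is a sketch under the assumptions stated on this page; the function names are hypothetical, and it only builds the link without checking that a record with that identifier exists.

```python
import re

# Captures the work id and, if present, the version id from a Trove work url.
WORK_URL = re.compile(r"/work/(\d+)(?:/version/(\d+))?")


def parse_work_url(url):
    """Return (work_id, version_id); version_id is None for plain work urls."""
    match = WORK_URL.search(url)
    if match is None:
        raise ValueError(f"Not a Trove work url: {url}")
    return match.group(1), match.group(2)


def anbd_link(numeric_id):
    """Build a persistent link from a Libraries Australia numeric identifier."""
    return f"https://nla.gov.au/anbd.bib-an{numeric_id}"


work_id, version_id = parse_work_url(
    "https://trove.nla.gov.au/work/1144040/version/25729065"
)
```

The work_id value extracted here is the number you would pass to the API's /work endpoint.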

5.5. People and organisations

People and organisation records have persistent identifiers of the form:

https://nla.gov.au/nla.party-[NUMERIC ID]

For example, here’s the record for John Gould:

https://nla.gov.au/nla.party-478003

This identifier resolves to the url: https://trove.nla.gov.au/people/478003

The NLA’s ‘party’ identifiers are sometimes used as identifiers for the people and organisations themselves. For example, the Wikidata entry for John Gould includes a ‘NLA Trove People ID’ property set to 478003. Using the identifiers in this way links together related resources.

5.6. Web archives

See also

The Australian Web Archive supports the Memento Protocol which endeavours to provide a consistent way of exploring the past web. For more examples, see Timegates, Timemaps, and Mementos in the GLAM Workbench.

The Australian Web Archive doesn’t use formal identifiers, but its links do have a specific format:

https://webarchive.nla.gov.au/awa/[CAPTURE DATETIME]/[CAPTURED URL]

For example:

https://webarchive.nla.gov.au/awa/20140212214143/http://wraggelabs.com/shed/trove/graphs/coffee_tea.html

In this case, some of my early Trove visualisation experiments were captured on 12 February 2014 at 43 seconds past 9:41pm (20140212214143).

If you don’t know the exact date and time a page was captured, you can just use an approximate date and Trove will return the closest possible match. For example, to find a version of the NLA home page from 2015 you could use:

https://webarchive.nla.gov.au/awa/20150101000000/http://www.nla.gov.au/

This redirects to:

https://webarchive.nla.gov.au/awa/20150227205316/http://www.nla.gov.au/

If you want to see the calendar view of all the available captures, replace the date with an asterisk (*):

https://webarchive.nla.gov.au/awa/*/http://www.nla.gov.au/
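The link format above is easy to build and take apart in code. The following is a minimal sketch; the function names are made up for illustration, and it assumes the capture datetime is always a 14-digit value in the form YYYYMMDDHHMMSS, as in the examples.

```python
import re
from datetime import datetime

AWA_BASE = "https://webarchive.nla.gov.au/awa"
# A web archive link has either a 14-digit datetime or an asterisk before the captured url.
AWA_URL = re.compile(r"^https://webarchive\.nla\.gov\.au/awa/(\d{14}|\*)/(.+)$")
TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"


def awa_url(captured_url, when=None):
    """Build a web archive link; with no datetime, link to the calendar view."""
    timestamp = "*" if when is None else when.strftime(TIMESTAMP_FORMAT)
    return f"{AWA_BASE}/{timestamp}/{captured_url}"


def parse_awa_url(url):
    """Return (capture_datetime, captured_url); the datetime is None for calendar views."""
    match = AWA_URL.match(url)
    if match is None:
        raise ValueError(f"Not an Australian Web Archive link: {url}")
    timestamp, captured_url = match.groups()
    when = None if timestamp == "*" else datetime.strptime(timestamp, TIMESTAMP_FORMAT)
    return when, captured_url


approximate = awa_url("http://www.nla.gov.au/", datetime(2015, 1, 1))
calendar = awa_url("http://www.nla.gov.au/")
```

The approximate and calendar values reproduce the two NLA home page links shown above. Finding the closest actual capture still requires requesting the approximate link and following the redirect.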

5.7. Transforming links

Understanding the links and identifiers used by Trove helps you find, access, and transform data. The numeric components of some identifiers can be used to retrieve data from the Trove API. The identifiers of digitised pages can be used to download high-resolution images. By resolving the identifiers of newspaper issues, you can find all the front pages. There are examples of these sorts of transformations throughout the Trove Data Guide.

By Tim Sherratt

© Copyright 2024 Australian Research Data Commons.

This work is licensed under a Creative Commons Attribution 4.0 International License.

Version: v1.0-beta.16 (07 November 2024)

The Trove Data Guide received investment from the Australian Research Data Commons (ARDC). The ARDC is funded by the National Collaborative Research Infrastructure Strategy (NCRIS).