Tutorials and examples

27.3. Tutorials and examples

This page includes information on tutorials and examples to help you work with text from Trove.

Tutorials

Analysing keywords in Trove’s digitised newspapers

You want to explore differences in language use across a collection of digitised newspaper articles. The Australian Text Analytics Platform provides a Keywords Analysis tool that helps you examine whether particular words are over- or under-represented across collections of text. But how do you get data from Trove’s newspapers to the keyword analysis tool?

Get started

Examples from the GLAM Workbench

Exploring text files harvested with the Trove Harvester: This notebook suggests some ways in which you can aggregate and analyse the individual OCRd text files for each article, such as looking at word frequencies and calculating TF-IDF values.
Finding non-English newspapers in Trove: A growing number of non-English newspapers have been digitised in Trove. However, if you only search using English keywords, you might never know that they’re there. This notebook analyses the language of a sample of articles from each newspaper to create a list of non-English newspapers.
Counting words and phrases in digitised books: This notebook provides a simple example of extracting word and ngram frequencies from the OCRd text of a digitised book using TextBlob and Wordcloud.
Recipe generator: In this notebook we use TextBlob to extract nouns, verbs, and sentences from the OCRd text of a 19th century cookery book. We try to clean things up a bit, using regular expressions to discard likely OCR errors. Then we recombine the various parts in random combinations to create delicious recipes for all occasions. Enjoy!
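
As a rough illustration of the kind of analysis the first notebook above describes, the sketch below counts word frequencies and calculates TF-IDF values for a few short strings. It is not the notebook's own code: the sample texts, the `tokenise` and `tf_idf` helper names, and the particular weighting (relative term frequency multiplied by log(N / document frequency)) are all assumptions made for illustration, and it uses only the Python standard library rather than the tools the notebooks use.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for the OCRd text of three harvested articles.
articles = {
    "article_1": "The flood waters rose and the river flood broke the bank.",
    "article_2": "The council met to discuss the new bridge over the river.",
    "article_3": "Cricket scores: the home team won the match.",
}


def tokenise(text):
    """Lowercase the text and return a list of alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(docs):
    """Return {doc_id: {word: score}} using (count / doc length) * log(N / df)."""
    counts = {doc_id: Counter(tokenise(text)) for doc_id, text in docs.items()}
    n_docs = len(counts)
    # Document frequency: the number of documents that contain each word.
    df = Counter(word for c in counts.values() for word in c)
    return {
        doc_id: {
            word: (n / sum(c.values())) * math.log(n_docs / df[word])
            for word, n in c.items()
        }
        for doc_id, c in counts.items()
    }


# Word frequencies across the whole (assumed) collection.
word_counts = Counter(w for text in articles.values() for w in tokenise(text))
print(word_counts.most_common(3))

# The highest-scoring word in one article under this weighting.
scores = tf_idf(articles)
top_word = max(scores["article_1"], key=scores["article_1"].get)
print(top_word)
```

Words that appear in every document, such as "the" here, score zero under this weighting, which is why TF-IDF tends to surface words that are distinctive to a particular article. With real OCRd text you would probably also want to filter out likely OCR errors before counting.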

Other examples

Topic modelling of Trove Books (Adel Rahmani)
Topic modelling of Australian parliamentary press releases (Adel Rahmani)
