23.1. Overview of Parliamentary Papers

23.1.1. What are Parliamentary Papers?

Parliamentary Papers are documents presented to the Australian Parliament. Sometimes this is required by law. Other times it’s just for information. The Parliament of Australia website notes:

Documents presented include the annual reports of all government agencies, reports of royal commissions and other government inquiries, parliamentary committee reports, and a wide variety of other material.

As well as in Trove, Parliamentary Papers can be found in ParlInfo, Parliament’s own online database.

Here are a few randomly selected examples:

title: Data-Matching Program (Assistance and Tax) Act 1990 : report on progress ...
contributor: NaN
date: 1992
fulltext_url: https://nla.gov.au/nla.obj-2141188459

title: PP no. 2 of 1956|Special Report of the Auditor-General (under Section 54 of the Audit Acty 1901-1955) relating to the accounts of the Australian Aluminium Production Commission year 1955-1956.
contributor: NaN
date: 1956
fulltext_url: https://nla.gov.au/nla.obj-2720246436

title: Australian Security Intelligence Organisation Legislation Amendment (Terrorism) Bill 2002 and related matters|Australian Security Intelligence Organisation Legislation Amendment (Terrorism) Bill 2002 and related matters / The Senate Legal and Constitutional References Committee.|PP no. 577 of 2002
contributor: Australia. Parliament. Senate. Legal and Constitutional References Committee|Australia. Parliament. Senate. Legal and Constitutional References Committee.|Australia. Parliament. Senate. Legal and Constitutional References Committee. 46885 f44357c6-62da-5f0b-89d9-4fd9b657d0af|Bolkus, Nick.
date: 2002
fulltext_url: https://nla.gov.au/nla.obj-958844291

23.1.2. How many Parliamentary Papers are digitised in Trove?

Many Commonwealth Parliamentary Papers have been digitised and made available through Trove. But, because of the way they’re arranged and described, it’s difficult to know exactly how many there are. I’ve attempted to harvest details of all the Parliamentary Papers in Trove using a combination of techniques. Based on this dataset, it seems there are currently 24,991 digitised Parliamentary Papers in Trove. Here are some more statistics from this dataset:

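import pandas as pd

# Load the harvested dataset; keep_default_na=False keeps empty fields as empty strings
df = pd.read_csv(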
    "https://github.com/GLAM-Workbench/trove-parliamentary-papers-data/raw/main/trove-parliamentary-papers.csv",
    keep_default_na=False,
)

stats = [
    ["Number of digitised Parliamentary Papers", df.shape[0]],
    ["Total number of pages", df["pages"].sum()],
    ["Median number of pages per publication", df["pages"].median()],
]

stats_df = pd.DataFrame(stats)
stats_df.style.format(thousands=",", precision=0).hide().hide(axis=1).set_properties(
    **{"text-align": "left"}
)
Number of digitised Parliamentary Papers 24,991
Total number of pages 2,448,576
Median number of pages per publication 60

Most of the Parliamentary Papers in Trove were published before 2013. If you search in ParlInfo for Parliamentary Papers published before 2013 the total number of results is 25,853 – close, but not exactly the same. There could be publications missing from Trove, or duplicates in the ParlInfo results.
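To put that gap in perspective, here is the arithmetic as a quick sketch. Both totals are the ones quoted above; the variable names are only for illustration. Keep in mind that the Trove total also includes some papers published after 2013, so the comparison is approximate.

```python
# Totals quoted in the text above
trove_total = 24_991  # digitised Parliamentary Papers harvested from Trove
parlinfo_total = 25_853  # ParlInfo results for papers published before 2013

difference = parlinfo_total - trove_total
share = difference / parlinfo_total

print(f"ParlInfo lists {difference:,} more results ({share:.1%} of its total)")
```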

23.1.3. When were the Parliamentary Papers published?

The date metadata is not always accurate, but it seems good enough to explore the distribution of Trove’s Parliamentary Papers over time.
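One rough way to gauge how far the date metadata can be trusted is to count the values that don't end in a four-digit year. The sketch below applies the same pattern used in this section's chart code to a handful of date strings. The sample values are invented for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical date strings, invented for illustration (not taken from the dataset)
sample = pd.DataFrame({"date": ["1956", "12 March 1902", "1990-1991", "", "n.d."]})

# Capture a four-digit year only when it sits at the very end of the string
sample["year"] = sample["date"].str.extract(r"\b(\d{4})$", expand=False)

# Rows where no trailing year was found
unmatched = sample[sample["year"].isna()]

print(sample)
print(f"{len(unmatched)} of {len(sample)} values have no trailing year")
```

Note that for a range such as "1990-1991" only the final year is captured, so a publication spanning several years would be counted once, under its last year.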

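import altair as alt
from IPython.display import HTML, display

# Pull a four-digit year from the end of each date string
df["year"] = df["date"].str.extract(r"\b(\d{4})$")
# Count publications per year (pandas 2.x names the columns "year" and "count")
years = df["year"].value_counts().to_frame().reset_index()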

chart_dates = (
    alt.Chart(years)
    .mark_bar(size=3)
    .encode(
        x="year:T", y="count:Q", tooltip=[alt.Tooltip("year:T", format="%Y"), "count:Q"]
    )
    .properties(width="container")
)

display(chart_dates)

Fig. 23.1 Publication dates of digitised Parliamentary Papers in Trove

From the chart above it looks like the earliest Parliamentary Paper pre-dates the Commonwealth Parliament. What is it?

df["year"] = df["year"].astype("Int64")
earliest = df.loc[df["year"].idxmin()]
display(
    HTML(
        f"<a href='{earliest['fulltext_url']}'>{earliest['title']} / {earliest['alternative_title']}</a>"
    )
)

23.1.4. Titles and topics of Parliamentary Papers

What are all these Parliamentary Papers about? You can use the title, subject, and contributor fields to explore their content.

Here, for example, is a word cloud generated from the title field. There are a lot of annual reports, and many of the titles include the abbreviation “PP”, so I’ve excluded the words “report”, “annual”, “PP”, and “AR”.

from wordcloud import STOPWORDS, WordCloud

# Add to the list of standard stopwords
stopwords = ["report", "annual", "pp", "AR"] + list(STOPWORDS)

titles = " ".join(df["title"].to_list())
wc = WordCloud(stopwords=stopwords, width=800, height=300)
wc.generate(titles).to_image()
[Image: word cloud generated from the titles of Parliamentary Papers]
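A word cloud gives an impression rather than exact numbers. If counts are needed, a plain frequency table is one alternative. This is a minimal sketch using `collections.Counter`. The titles are invented for illustration, and both the short stopword list and the letters-only tokenising pattern are assumptions of this sketch, so the results won't match the way `WordCloud` splits text.

```python
import re
from collections import Counter

# Hypothetical titles, invented for illustration
titles = [
    "Annual report / Tariff Board",
    "Tariff Board report on woollen piece goods",
    "PP no. 2 of 1956|Special report of the Auditor-General",
]

# A small stopword list for this sketch; the word cloud code uses a longer one
stopwords = {"report", "annual", "pp", "ar", "no", "of", "on", "the"}

words = []
for title in titles:
    # Lowercase, then keep runs of letters only (drops numbers and punctuation)
    words.extend(w for w in re.findall(r"[a-z]+", title.lower()) if w not in stopwords)

word_counts = Counter(words)
print(word_counts.most_common(5))
```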

The subject field contains a list of standard(ish) subject headings. Here are the top twenty values:

import re


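def split_and_clean(value):
    """Split a pipe-delimited subject string into a list of unique, tidied headings."""
    values = value.split("|")
    # Put spaces around "--", strip leading/trailing full stops, skip empty values,
    # and use a set to drop duplicates within a single record
    return list(
        set([re.sub(r"(\w)--(\w)", r"\1 -- \2", v).strip(".") for v in values if v])
    )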


df["subject"] = df["subject"].apply(split_and_clean)

subjects = df["subject"].explode().to_frame()
# Remove trailing full stops
subjects["subject"] = subjects["subject"].str.strip(".")
subjects["subject"].value_counts().to_frame().reset_index()[:20].style.format(
    thousands=","
).hide()
subject count
Australian 7,321
Australia 6,833
Tariff -- Australia 1,575
Finance, Public -- Australia -- Accounting -- Periodicals 1,568
Federal issue 1,265
Administrative agencies -- Australia -- Auditing -- Periodicals 1,165
Finance, Public -- Australia -- Auditing 1,150
Finance, Public -- Auditing 1,140
Executive departments -- Australia -- Auditing -- Periodicals 1,135
Tariff Australia 1,111
Legislative auditing -- Australia -- Periodicals 1,106
Australia -- Appropriations and expenditures -- Periodicals 1,035
Public works -- Australia -- Periodicals 947
Public buildings -- Australia -- Periodicals 862
Finance, Public -- Australia -- Periodicals 765
Industries -- Australia -- Periodicals 760
Australia -- Industries -- Periodicals 686
Periodicals 553
Tariff -- Australia -- Periodicals 551
Key item 501
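The table includes both "Tariff -- Australia" and "Tariff Australia", so variant forms of the same heading are counted separately. To see what the cleaning step does and doesn't catch, here is `split_and_clean` applied to a single made-up value. The function is repeated so the example runs on its own, and the input string is an illustration, not a real record.

```python
import re


def split_and_clean(value):
    # Same logic as the function above, repeated so this example is self-contained
    values = value.split("|")
    return list(
        set([re.sub(r"(\w)--(\w)", r"\1 -- \2", v).strip(".") for v in values if v])
    )


# A made-up subject string for illustration, not a real record
raw = "Tariff--Australia.|Tariff -- Australia|Tariff Australia||Periodicals."

# Sort the result so the output order is predictable
cleaned = sorted(split_and_clean(raw))
print(cleaned)
```

The unspaced "Tariff--Australia." is merged with "Tariff -- Australia", and the empty value is dropped, but "Tariff Australia" survives as a separate heading because it has no "--" to normalise.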

The name of the agency that created a particular publication can also give an indication of its content. Here are the top twenty contributing organisations:

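def clean_contributor(value):
    """Remove trailing identifiers and full stops from a contributor name."""
    # Some names end with a numeric id and a UUID-like string, e.g.
    # "... References Committee. 46885 f44357c6-62da-5f0b-89d9-4fd9b657d0af"
    if cleaned := re.search(r"(.*?) [0-9]+ [0-9a-z\-]+$", str(value)):
        return cleaned.group(1).strip(".")
    else:
        return str(value).strip(".")


# Contributors are pipe-delimited, so split them into one row per name
contributors = df["contributor"].str.split("|").explode().to_frame()
contributors["cleaned name"] = contributors["contributor"].apply(clean_contributor)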
contributors.dropna()["cleaned name"].value_counts().to_frame().reset_index()[
    :20
].style.format(thousands=",").hide()
cleaned name count
Australia. Tariff Board 3,802
Australia. Parliament 3,284
Australian National Audit Office 3,111
Australia. Parliament. Standing Committee on Public Works 2,059
(blank) 1,560
Australia. Industries Assistance Commission 1,053
Australia. Parliament. Joint Committee of Public Accounts 825
Australia. Parliament. issuing body 787
Australia 415
Australia. Parliament. Senate. Committee of Privileges 399
Australia. Parliament. Joint Standing Committee on Treaties 348
Australia. Parliament. House of Representatives, issuing body 305
Australia. Parliament, 294
Australia. Royal Commission into Aboriginal Deaths in Custody 282
Australia. Inter-State Commission 276
Australia. Special Advisory Authority 240
Australia. Inter-state Commission 239
Australia. Treasury 236
Australia. Parliament. Senate. Standing Committee on Regulations and Ordinances 219
Australia. Parliament. The Senate, issuing body 212
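Some names appear more than once in the table because of small differences in capitalisation or punctuation, such as "Inter-State Commission" and "Inter-state Commission". One possible way to merge them is sketched below, using counts copied from the table. The `normalise_name` helper is hypothetical and not part of the original analysis, and whether two variants really refer to the same body remains a judgement call.

```python
import pandas as pd

# Counts copied from the table above
counts = pd.Series(
    {
        "Australia. Inter-State Commission": 276,
        "Australia. Inter-state Commission": 239,
        "Australia. Parliament": 3284,
        "Australia. Parliament,": 294,
    }
)


def normalise_name(name):
    # Hypothetical helper: ignore case and trailing punctuation when grouping
    return name.rstrip(" .,").casefold()


# Group on the normalised form of each index label and sum the counts
merged = counts.groupby(normalise_name).sum().sort_values(ascending=False)
print(merged)
```

Grouping on a normalised key leaves the original names untouched, so it is easy to check which variants were combined before relying on the merged totals.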