23.1. Overview of Parliamentary Papers

23.1.1. What are Parliamentary Papers?

Parliamentary Papers are documents presented to the Australian Parliament. Sometimes this is required by law. Other times it’s just for information. The Parliament of Australia website notes:

“Documents presented include the annual reports of all government agencies, reports of royal commissions and other government inquiries, parliamentary committee reports, and a wide variety of other material.”

As well as Trove, Parliamentary Papers can be found through ParlInfo, Parliament’s own online database.

Here are a few randomly selected examples:

title: PP no. 219 of 1980, Vol. 4|Technological change in Australia
contributor: Committee of Inquiry into Technological Change in Australia.|Committee of Inquiry into Technological Change in Australia. 12109 4d6f85c2-0873-5994-8f8d-371247d19ae5|Myers, Rupert H. (Rupert Horace)
date: 1980
fulltext_url: https://nla.gov.au/nla.obj-1473897559

title: Cash matters : cash management in the Commonwealth|PP no. 282 of 1995, Report no. 340
contributor: Australia. Parliament. Joint Committee of Public Accounts.|Australia. Parliament. Joint Committee of Public Accounts. 6365 178b78f3-4c12-546f-b78d-4548df63ad05
date: 1995
fulltext_url: https://nla.gov.au/nla.obj-2510757504

title: PP no. 169 of 1935|Tariff Board's report on perambulators and go-carts and bodies therefor; wheels and parts (excepting parts of malleable cast iron) of wheels for perambulators and go-carts - tariff item 357.
contributor: Australia. Parliament|Australia. Parliament.|Australia. Tariff Board|Australia. Tariff Board 280 7743b2ec-b86b-5480-8aed-8460efac9864
date: 1935
fulltext_url: https://nla.gov.au/nla.obj-2657275417

23.1.2. How many Parliamentary Papers are digitised in Trove?

Many Commonwealth Parliamentary Papers have been digitised and made available through Trove. But, because of the way they’re arranged and described, it’s difficult to know exactly how many there are. I’ve attempted to harvest details of all the Parliamentary Papers in Trove using a combination of techniques. Based on this dataset, it seems there are currently 24,991 digitised Parliamentary Papers in Trove. Here are some more statistics from this dataset:

import pandas as pd

df = pd.read_csv(
    "https://github.com/GLAM-Workbench/trove-parliamentary-papers-data/raw/main/trove-parliamentary-papers.csv",
    keep_default_na=False,
)

stats = [
    ["Number of digitised Parliamentary Papers", df.shape[0]],
    ["Total number of pages", df["pages"].sum()],
    ["Median number of pages per publication", df["pages"].median()],
]

stats_df = pd.DataFrame(stats)
stats_df.style.format(thousands=",", precision=0).hide().hide(axis=1).set_properties(
    **{"text-align": "left"}
)
Number of digitised Parliamentary Papers 24,991
Total number of pages 2,448,576
Median number of pages per publication 60

Most of the Parliamentary Papers in Trove were published before 2013. If you search ParlInfo for Parliamentary Papers published before 2013, the total number of results is 25,853 – close, but not exactly the same. There could be publications missing from Trove, or duplicates in the ParlInfo results.
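
To put the gap between the two totals in perspective, here is a minimal sketch of the arithmetic using only the counts quoted above. The variable names are illustrative, and the comparison is approximate because the Trove total is not restricted to papers published before 2013.

```python
# Counts quoted in the text above
trove_total = 24_991     # digitised Parliamentary Papers in the Trove dataset
parlinfo_total = 25_853  # ParlInfo results for papers published before 2013

# Absolute gap, and the gap as a share of the ParlInfo total
difference = parlinfo_total - trove_total
share = difference / parlinfo_total

print(f"Difference: {difference:,} ({share:.1%} of the ParlInfo total)")
```

So the two sources differ by 862 records, or roughly 3.3% of the ParlInfo total.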

23.1.3. When were the Parliamentary Papers published?

The date metadata is not always accurate, but it seems good enough to explore the distribution of Trove’s Parliamentary Papers over time.

import altair as alt

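# Extract a four-digit year from the end of each date value
df["year"] = df["date"].str.extract(r"\b(\d{4})$")
# Count the number of publications per year
years = df["year"].value_counts().to_frame().reset_index()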

chart_dates = (
    alt.Chart(years)
    .mark_bar(size=3)
    .encode(
        x="year:T", y="count:Q", tooltip=[alt.Tooltip("year:T", format="%Y"), "count:Q"]
    )
    .properties(width="container")
)

display(chart_dates)

Fig. 23.1 Publication dates of digitised Parliamentary Papers in Trove
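
The year extraction behind this chart can be checked on a few sample values. This is a minimal sketch using the same regular expression as the chart code. The date strings are made up for illustration, and the real `date` field may contain other formats.

```python
import pandas as pd

# Hypothetical date values, invented for illustration (not from the dataset)
dates = pd.Series(["1980", "1901-1902", "12 May 1935", "", "n.d."])

# Same pattern as the chart code: capture four digits at the very end
years = dates.str.extract(r"\b(\d{4})$")[0]

# Values with no trailing year come back as missing
missing = int(years.isna().sum())

print(years.tolist())
print(f"{missing} of {len(dates)} values have no extractable year")
```

Because the pattern is anchored to the end of the string, a range such as "1901-1902" is counted under its final year, and values without a trailing year drop out of the chart.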

From the chart above it looks like the earliest Parliamentary Paper pre-dates the Commonwealth Parliament. What is it?

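from IPython.display import HTML, display

# Convert years to nullable integers so the minimum can be found
df["year"] = df["year"].astype("Int64")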
earliest = df.loc[df["year"].idxmin()]
display(
    HTML(
        f"<a href='{earliest['fulltext_url']}'>{earliest['title']} / {earliest['alternative_title']}</a>"
    )
)

23.1.4. Titles and topics of Parliamentary Papers

What are all these Parliamentary Papers about? You can use the title, subject, and contributor fields to explore their content.

Here, for example, is a word cloud generated from the title field. There are a lot of annual reports, and many of the titles include the abbreviation “PP”, so I’ve excluded the words “report”, “annual”, “PP”, and “AR”.

from wordcloud import STOPWORDS, WordCloud

# Add to the list of standard stopwords
stopwords = ["report", "annual", "pp", "AR"] + list(STOPWORDS)

titles = " ".join(df["title"].to_list())
wc = WordCloud(stopwords=stopwords, width=800, height=300)
wc.generate(titles).to_image()
[Image: word cloud generated from the titles of Parliamentary Papers]
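
If you want numbers rather than a picture, the same idea can be expressed as a plain word count. This is a simplified sketch, not how the `wordcloud` package tokenises text internally. The titles and the short stopword list are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical titles, invented for illustration (not taken from the dataset)
titles = [
    "PP no. 1 of 1950|Annual report of the Example Board",
    "Report on example tariffs",
    "Example Board annual report 1951",
]

# A short illustrative stopword list, including the words excluded above
stopwords = {"report", "annual", "pp", "ar", "of", "the", "on", "no"}

# Lowercase everything, keep runs of letters, then drop the stopwords
words = re.findall(r"[a-z]+", " ".join(titles).lower())
counts = Counter(w for w in words if w not in stopwords)

print(counts.most_common(3))
```

With the real data you could replace `titles` with `df["title"].to_list()` and extend `stopwords` with the list used for the word cloud.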

The subject field contains a list of standard(ish) subject headings. Here are the top twenty values:

import re


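def split_and_clean(value):
    # Split the pipe-delimited field into individual headings
    values = value.split("|")
    # Normalise "a--b" to "a -- b", strip full stops, and drop empty values and duplicates
    return list(
        set([re.sub(r"(\w)--(\w)", r"\1 -- \2", v).strip(".") for v in values if v])
    )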


df["subject"] = df["subject"].apply(split_and_clean)

subjects = df["subject"].explode().to_frame()
# Remove trailing full stops
subjects["subject"] = subjects["subject"].str.strip(".")
subjects["subject"].value_counts().to_frame().reset_index()[:20].style.format(
    thousands=","
).hide()
subject count
Australian 7,321
Australia 6,833
Tariff -- Australia 1,575
Finance, Public -- Australia -- Accounting -- Periodicals 1,568
Federal issue 1,265
Administrative agencies -- Australia -- Auditing -- Periodicals 1,165
Finance, Public -- Australia -- Auditing 1,150
Finance, Public -- Auditing 1,140
Executive departments -- Australia -- Auditing -- Periodicals 1,135
Tariff Australia 1,111
Legislative auditing -- Australia -- Periodicals 1,106
Australia -- Appropriations and expenditures -- Periodicals 1,035
Public works -- Australia -- Periodicals 947
Public buildings -- Australia -- Periodicals 862
Finance, Public -- Australia -- Periodicals 765
Industries -- Australia -- Periodicals 760
Australia -- Industries -- Periodicals 686
Periodicals 553
Tariff -- Australia -- Periodicals 551
Key item 501
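
To see what the cleaning step does to a single record, here is a minimal sketch that applies the same `split_and_clean` logic to one pipe-delimited value. The raw string is invented for illustration.

```python
import re


def split_and_clean(value):
    # Split on pipes, normalise "a--b" to "a -- b", strip full stops,
    # and drop empty values and duplicates
    values = value.split("|")
    return list(
        set(re.sub(r"(\w)--(\w)", r"\1 -- \2", v).strip(".") for v in values if v)
    )


# Hypothetical raw subject value, invented for illustration
raw = "Tariff--Australia.|Tariff -- Australia|Finance, Public--Australia--Auditing"

cleaned = sorted(split_and_clean(raw))
print(cleaned)
```

The two spellings of the tariff heading collapse into one, so a record is counted once per heading. Variants that differ in other ways, such as "Tariff -- Australia" and "Tariff Australia" in the table above, are not merged by this step.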

The name of the agency that created a particular publication can also give an indication of its content. Here are the top twenty contributing organisations:

def clean_contributor(value):
    # Some names end with a numeric id and a UUID-like string; keep only the name
    if cleaned := re.search(r"(.*?) [0-9]+ [0-9a-z\-]+$", str(value)):
        return cleaned.group(1).strip(".")
    else:
        return str(value).strip(".")


# Split multiple contributors into separate rows
contributors = df["contributor"].str.split("|").explode().to_frame()
# Strip trailing identifiers and full stops from each name
contributors["cleaned name"] = contributors["contributor"].apply(clean_contributor)
contributors.dropna()["cleaned name"].value_counts().to_frame().reset_index()[
    :20
].style.format(thousands=",").hide()
cleaned name count
Australia. Tariff Board 3,802
Australia. Parliament 3,284
Australian National Audit Office 3,111
Australia. Parliament. Standing Committee on Public Works 2,059
(blank) 1,560
Australia. Industries Assistance Commission 1,053
Australia. Parliament. Joint Committee of Public Accounts 825
Australia. Parliament. issuing body 787
Australia 415
Australia. Parliament. Senate. Committee of Privileges 399
Australia. Parliament. Joint Standing Committee on Treaties 348
Australia. Parliament. House of Representatives, issuing body 305
Australia. Parliament, 294
Australia. Royal Commission into Aboriginal Deaths in Custody 282
Australia. Inter-State Commission 276
Australia. Special Advisory Authority 240
Australia. Inter-state Commission 239
Australia. Treasury 236
Australia. Parliament. Senate. Standing Committee on Regulations and Ordinances 219
Australia. Parliament. The Senate, issuing body 212
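
The regular expression in `clean_contributor` is easier to follow with concrete input. This is a minimal sketch that applies the same logic to contributor values from the example records at the top of this section, plus an empty string to stand in for a record with no contributor.

```python
import re


def clean_contributor(value):
    # Drop a trailing "<number> <identifier>" suffix if present, then strip full stops
    if cleaned := re.search(r"(.*?) [0-9]+ [0-9a-z\-]+$", str(value)):
        return cleaned.group(1).strip(".")
    return str(value).strip(".")


# Contributor values copied from the example records above, plus an empty string
samples = [
    "Committee of Inquiry into Technological Change in Australia. 12109 4d6f85c2-0873-5994-8f8d-371247d19ae5",
    "Australia. Tariff Board 280 7743b2ec-b86b-5480-8aed-8460efac9864",
    "Myers, Rupert H. (Rupert Horace)",
    "",
]

cleaned = [clean_contributor(s) for s in samples]
for name in cleaned:
    print(repr(name))
```

Names without a trailing identifier pass through unchanged, and empty values stay empty, which would explain the unnamed row in the table above. Variants that differ only in capitalisation or punctuation, such as "Australia. Inter-State Commission" and "Australia. Inter-state Commission", are still counted separately.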