23.1. Overview of Parliamentary Papers
23.1.1. What are Parliamentary Papers?
Parliamentary Papers are documents presented to the Australian Parliament. Sometimes this is required by law. Other times it’s just for information. The Parliament of Australia website notes:
Documents presented include the annual reports of all government agencies, reports of royal commissions and other government inquiries, parliamentary committee reports, and a wide variety of other material.
As well as Trove, Parliamentary Papers can be found through ParlInfo, Parliament’s own online database.
Here are a few randomly selected examples:
title | contributor | date | fulltext_url |
---|---|---|---|
Data-Matching Program (Assistance and Tax) Act 1990 : report on progress ... | NaN | 1992 | https://nla.gov.au/nla.obj-2141188459 |
PP no. 2 of 1956 \| Special Report of the Auditor-General (under Section 54 of the Audit Acty 1901-1955) relating to the accounts of the Australian Aluminium Production Commission year 1955-1956. | NaN | 1956 | https://nla.gov.au/nla.obj-2720246436 |
Australian Security Intelligence Organisation Legislation Amendment (Terrorism) Bill 2002 and related matters \| Australian Security Intelligence Organisation Legislation Amendment (Terrorism) Bill 2002 and related matters / The Senate Legal and Constitutional References Committee. \| PP no. 577 of 2002 | Australia. Parliament. Senate. Legal and Constitutional References Committee \| Australia. Parliament. Senate. Legal and Constitutional References Committee. \| Australia. Parliament. Senate. Legal and Constitutional References Committee. 46885 f44357c6-62da-5f0b-89d9-4fd9b657d0af \| Bolkus, Nick. | 2002 | https://nla.gov.au/nla.obj-958844291 |
23.1.2. How many Parliamentary Papers are digitised in Trove?
Many Commonwealth Parliamentary Papers have been digitised and made available through Trove. But, because of the way they’re arranged and described, it’s difficult to know exactly how many there are. I’ve attempted to harvest details of all the Parliamentary Papers in Trove using a combination of techniques. Based on this dataset, it seems there are currently 24,991 digitised Parliamentary Papers in Trove. Here are some more statistics from this dataset:
import pandas as pd

# Load the harvested dataset, keeping empty cells as empty strings rather than NaN
df = pd.read_csv(
    "https://github.com/GLAM-Workbench/trove-parliamentary-papers-data/raw/main/trove-parliamentary-papers.csv",
    keep_default_na=False,
)
# Summary statistics to display
stats = [
    ["Number of digitised Parliamentary Papers", df.shape[0]],
    ["Total number of pages", df["pages"].sum()],
    ["Median number of pages per publication", df["pages"].median()],
]
stats_df = pd.DataFrame(stats)
# Hide the index and column headers, and left-align the text
stats_df.style.format(thousands=",", precision=0).hide().hide(axis=1).set_properties(
    **{"text-align": "left"}
)
Number of digitised Parliamentary Papers | 24,991 |
Total number of pages | 2,448,576 |
Median number of pages per publication | 60 |
Most of the Parliamentary Papers in Trove were published before 2013. If you search in ParlInfo for Parliamentary Papers published before 2013 the total number of results is 25,853 – close, but not exactly the same. There could be publications missing from Trove, or duplicates in the ParlInfo results.
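To put the gap between the two counts in perspective, here is a rough calculation using only the figures quoted above. It is a sketch rather than a like-for-like comparison: the Trove total includes some papers published in or after 2013, so the real shortfall for the pre-2013 period may be somewhat larger.

```python
# Totals quoted in the text above
trove_count = 24_991  # digitised Parliamentary Papers in the harvested dataset
parlinfo_count = 25_853  # ParlInfo results for papers published before 2013

difference = parlinfo_count - trove_count
share = difference / parlinfo_count

print(f"Difference: {difference:,} ({share:.1%} of the ParlInfo total)")
```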
23.1.3. When were the Parliamentary Papers published?
The date metadata is not always accurate, but it seems good enough to explore the distribution of Trove’s Parliamentary Papers over time.
import altair as alt

# Extract a four-digit year from the end of each date string
df["year"] = df["date"].str.extract(r"\b(\d{4})$")
# Count the number of publications per year
years = df["year"].value_counts().to_frame().reset_index()
chart_dates = (
    alt.Chart(years)
    .mark_bar(size=3)
    .encode(
        x="year:T", y="count:Q", tooltip=[alt.Tooltip("year:T", format="%Y"), "count:Q"]
    )
    .properties(width="container")
)
display(chart_dates)
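The year extraction relies on a single regular expression, so it helps to see what it does and does not match. The sketch below applies the same pattern with Python's `re` module, on the assumption that `str.extract()` behaves like `re.search()` on each value. The sample date strings and the `extract_year` helper are made up for illustration and are not taken from the dataset.

```python
import re

# Same pattern used with str.extract() above
YEAR_PATTERN = re.compile(r"\b(\d{4})$")


def extract_year(date_string):
    """Return the trailing four-digit year, or None if there isn't one."""
    match = YEAR_PATTERN.search(date_string)
    return match.group(1) if match else None


# Hypothetical date strings, for illustration only
samples = ["1956", "12 March 1992", "1901-1902", "1992-03", ""]
for sample in samples:
    print(repr(sample), "->", extract_year(sample))
```

Only values that end in a four-digit number yield a year. A range such as "1901-1902" is assigned to its final year, and anything else is left empty, which is why some records drop out of the chart.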
From the chart above it looks like the earliest Parliamentary Paper pre-dates the Commonwealth Parliament. What is it?
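from IPython.display import HTML, display

# Convert years to nullable integers so that missing values are ignored by idxmin()
df["year"] = df["year"].astype("Int64")
# Select the record with the earliest year
earliest = df.loc[df["year"].idxmin()]
display(
    HTML(
        f"<a href='{earliest['fulltext_url']}'>{earliest['title']} / {earliest['alternative_title']}</a>"
    )
)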
23.1.4. Titles and topics of Parliamentary Papers
What are all these Parliamentary Papers about? You can use the title, subject, and contributor fields to explore their content.
Here, for example, is a word cloud generated from the title field. There are a lot of annual reports, and many of the titles include the abbreviation “PP”, so I’ve excluded the words “report”, “annual”, “PP”, and “AR”.
from wordcloud import STOPWORDS, WordCloud
# Add to the list of standard stopwords
stopwords = ["report", "annual", "pp", "AR"] + list(STOPWORDS)
titles = " ".join(df["title"].to_list())
wc = WordCloud(stopwords=stopwords, width=800, height=300)
wc.generate(titles).to_image()
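A word cloud is an image, so it can be handy to have the underlying word counts as well. The following is a minimal sketch using only the standard library. The stopword list and the sample titles are hypothetical stand-ins for wordcloud's larger `STOPWORDS` set and the `title` column, and the tokenisation is simpler than the one wordcloud uses.

```python
import re
from collections import Counter

# A small, hypothetical stopword list (the word cloud above uses a much larger one)
stopwords = {"report", "annual", "pp", "ar", "of", "the", "on", "and", "no"}

# Hypothetical titles, for illustration only
sample_titles = [
    "Annual report of the Tariff Board",
    "Tariff Board report on wool",
    "PP no. 2 of 1956",
]

words = []
for title in sample_titles:
    # Lowercase the title, then keep alphabetic tokens only
    words.extend(re.findall(r"[a-z]+", title.lower()))

# Count every word that is not in the stopword list
counts = Counter(word for word in words if word not in stopwords)
print(counts.most_common(3))
```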
The subject field contains a list of standard(ish) subject headings. Here are the top twenty values:
import re


def split_and_clean(value):
    # Split the pipe-delimited string into individual headings
    values = value.split("|")
    # Add spaces around "--", strip full stops, and remove duplicates and empty values
    return list(
        set([re.sub(r"(\w)--(\w)", r"\1 -- \2", v).strip(".") for v in values if v])
    )


df["subject"] = df["subject"].apply(split_and_clean)
# One row per subject heading
subjects = df["subject"].explode().to_frame()
# Remove trailing full stops
subjects["subject"] = subjects["subject"].str.strip(".")
subjects["subject"].value_counts().to_frame().reset_index()[:20].style.format(
    thousands=","
).hide()
subject | count |
---|---|
Australian | 7,321 |
Australia | 6,833 |
Tariff -- Australia | 1,575 |
Finance, Public -- Australia -- Accounting -- Periodicals | 1,568 |
Federal issue | 1,265 |
Administrative agencies -- Australia -- Auditing -- Periodicals | 1,165 |
Finance, Public -- Australia -- Auditing | 1,150 |
Finance, Public -- Auditing | 1,140 |
Executive departments -- Australia -- Auditing -- Periodicals | 1,135 |
Tariff Australia | 1,111 |
Legislative auditing -- Australia -- Periodicals | 1,106 |
Australia -- Appropriations and expenditures -- Periodicals | 1,035 |
Public works -- Australia -- Periodicals | 947 |
Public buildings -- Australia -- Periodicals | 862 |
Finance, Public -- Australia -- Periodicals | 765 |
Industries -- Australia -- Periodicals | 760 |
Australia -- Industries -- Periodicals | 686 |
Periodicals | 553 |
Tariff -- Australia -- Periodicals | 551 |
Key item | 501 |
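To see why some near-duplicate headings are merged while others, such as "Tariff -- Australia" and "Tariff Australia", still appear separately in the table, the cleaning step can be tried on a single value. This sketch reuses the same substitution as `split_and_clean`, but sorts the result so the output is predictable. The pipe-delimited sample string is hypothetical, though it follows the pattern of values in the subject field.

```python
import re


def split_and_clean(value):
    """Split a pipe-delimited string and normalise each subject heading."""
    values = value.split("|")
    # Sorted here (unlike the original) so that the output order is predictable
    return sorted(
        {re.sub(r"(\w)--(\w)", r"\1 -- \2", v).strip(".") for v in values if v}
    )


# Hypothetical value, for illustration only
sample = "Tariff--Australia|Tariff -- Australia.|Tariff Australia|Finance, Public--Australia--Periodicals|"
cleaned_subjects = split_and_clean(sample)
for heading in cleaned_subjects:
    print(heading)
```

Variations in the spacing around "--" and in trailing full stops collapse into one heading, but a heading with no "--" at all is left as a distinct value.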
The name of the agency that created a particular publication can also give an indication of its content. Here are the top twenty contributing organisations:
def clean_contributor(value):
    # Remove a trailing numeric id and identifier string, if present
    if cleaned := re.search(r"(.*?) [0-9]+ [0-9a-z\-]+$", str(value)):
        return cleaned.group(1).strip(".")
    else:
        return str(value).strip(".")


# One row per contributor
contributors = df["contributor"].str.split("|").explode().to_frame()
contributors["cleaned name"] = contributors["contributor"].apply(clean_contributor)
contributors.dropna()["cleaned name"].value_counts().to_frame().reset_index()[
    :20
].style.format(thousands=",").hide()
cleaned name | count |
---|---|
Australia. Tariff Board | 3,802 |
Australia. Parliament | 3,284 |
Australian National Audit Office | 3,111 |
Australia. Parliament. Standing Committee on Public Works | 2,059 |
(blank) | 1,560 |
Australia. Industries Assistance Commission | 1,053 |
Australia. Parliament. Joint Committee of Public Accounts | 825 |
Australia. Parliament. issuing body | 787 |
Australia | 415 |
Australia. Parliament. Senate. Committee of Privileges | 399 |
Australia. Parliament. Joint Standing Committee on Treaties | 348 |
Australia. Parliament. House of Representatives, issuing body | 305 |
Australia. Parliament, | 294 |
Australia. Royal Commission into Aboriginal Deaths in Custody | 282 |
Australia. Inter-State Commission | 276 |
Australia. Special Advisory Authority | 240 |
Australia. Inter-state Commission | 239 |
Australia. Treasury | 236 |
Australia. Parliament. Senate. Standing Committee on Regulations and Ordinances | 219 |
Australia. Parliament. The Senate, issuing body | 212 |
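The table still contains near-duplicates, such as "Australia. Inter-State Commission" and "Australia. Inter-state Commission", or "Australia. Parliament" and "Australia. Parliament,". The sketch below first shows what `clean_contributor` removes, using a value from the example table earlier in this section, and then applies one possible extra normalisation. The `normalise_name` helper is hypothetical and not part of the original code, and ignoring case and trailing punctuation could merge names that ought to stay distinct, so treat it as a starting point only.

```python
import re
from collections import Counter


def clean_contributor(value):
    # Same logic as above: drop a trailing "<digits> <identifier>" suffix
    if cleaned := re.search(r"(.*?) [0-9]+ [0-9a-z\-]+$", str(value)):
        return cleaned.group(1).strip(".")
    return str(value).strip(".")


def normalise_name(name):
    # Hypothetical extra step: ignore case and trailing punctuation
    return name.strip(" .,").lower()


samples = [
    # Taken from the example table earlier in this section
    "Australia. Parliament. Senate. Legal and Constitutional References Committee. 46885 f44357c6-62da-5f0b-89d9-4fd9b657d0af",
    # Names as they appear in the table above
    "Australia. Inter-State Commission",
    "Australia. Inter-state Commission",
    "Australia. Parliament,",
    "Australia. Parliament",
]

cleaned = [clean_contributor(sample) for sample in samples]
grouped = Counter(normalise_name(name) for name in cleaned)

print(cleaned[0])
for name, count in grouped.most_common():
    print(count, name)
```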