[Linux-aus] Update on the Australian Newspapers service - 1 millionth page, 10 millionth article now available

Donna Benjamin donna at cc.com.au
Wed Dec 16 10:13:14 EST 2009

With all the doom and gloom around at present... clean feeds, climate change,
this development brought a smile to my face.

I wonder what it means for the various collections of microfiche around the country?
I really need to find myself a geeknest of librarians to discuss this with.

Thought others might enjoy this news.

-------- Original Message --------
Subject: [andp-announce] Update on the Australian Newspapers service 14
December 2009. 1 millionth page
Date: Mon, 14 Dec 2009 17:11:57 +1100
From: Rose Holley <RHOLLEY at nla.gov.au>
Reply-To: andp-announce at nla.gov.au
To: ANDP Announce-l <andp-announce at listserver.nla.gov.au>,
"anplan at listserver.nla.gov.au" <anplan at listserver.nla.gov.au>

Dear Australian Newspaper users

I enclose the latest information for you on the service:

1. Progress - 1 millionth page now available.

The millionth page was made publicly available on 14 December 2009,
marking a project milestone. The millionth page contained the 10
millionth article. This was a 1901 edition of the Sydney Morning Herald.
There will be 40 million articles available by 2011.
Digitisation started in 2007 and 4.4 million pages were targeted for
digitisation over 4 years to be complete and publicly accessible as full
text articles by June 2010. 3 million of the identified 4.4 million
pages have been scanned from microfilm into digital images so far. Of
the 3 million scanned pages 1 million have been converted into full text
articles by the OCR process and are publicly available. The remaining
pages will be made available between now and June 2010. Visual
progress chart: http://www.nla.gov.au/ndp/selected_newspapers/

The 1 million pages publicly available amounts to 10 million articles
with coverage dates of 1803-1954.

Public users have enhanced the data significantly since August 2008 by
correcting 8.13 million lines of text in 368,390 articles. This really
improves the searching. Also 5061 comments and 230,384 tags have been
added to articles, which will be used for search and retrieval in the
2010 version of Trove.

2. Sydney Morning Herald

The first 70 years of the Sydney Morning Herald are now publicly
available. 1831-1901.

Please be aware that some issues of the Sydney Morning Herald are
missing.  These are being sourced in hard copy from locations in
Australia and will be added to the public service in 2010.  So don't
worry if you spot a missing issue, we know about it and it will appear
in the service soon.

3. Public search interfaces and integration into TROVE

There are currently 2 search interfaces available as below:

*       Standalone service [BLUE INTERFACE] - Australian Newspapers v1.0 :
*       Newspapers integrated into Trove [GREEN INTERFACE]
http://trove.nla.gov.au/ndp/del/home (still under development in
December 2009; expected to be completed by the end of January 2010).

About the integration of Newspapers into Trove:
Public users gave positive feedback in 2009 about integrated searching
of newspapers with other resources, when, as a trial, newspapers could
be searched alongside other content. The new Trove
service integrates searching of many different resources at once (the
Australian National Bibliographic Database, Australian Newspapers,
Picture Australia, Australian Research Online, PANDORA, OAIster, Open
Library, the Hathi Trust, the Internet Archive and the Library of
Congress tables of contents, publishers' descriptions and sample book
In December 2009 work began to integrate the standalone newspapers
service into the wider Trove service. From December 2009 to the end of
January 2010 work will be undertaken in Trove so that the full
functionality of Australian Newspapers v1.0 is duplicated. The public will be invited
to give feedback before the standalone version is removed and replaced
with the Trove version.

As at 14 December 2009:

*       Text correction, tagging and commenting histories and rankings are
identical in both services and will remain so (though history and
ranking tables look different in Trove). These functions can be done in
either service.
*       Search results in Trove differ from those in the standalone version
(because tags and comments are included in the relevancy ranking in Trove).
*       Content is identical in both services.

If you have issues please clearly report which version you are in (Green
or Blue?)
Read more about the Australian Newspapers - Trove integration here:
ANDP_AN and Trove integration

4. Enhancements to Australian Newspapers service

In early 2010 enhancements suggested by the public in 2008 (beta
for the newspapers service will be implemented in Trove. This will
include establishing a public forum, review of tagging functionality,
enhancement to public profiles, and RSS feeds for new content.

5. ANDP website updated

The ANDP website has been updated with project information, project
reports, title selection lists etc.  Available here:

Donna Benjamin          Libre Graphics Day miniconf LCA2010
Co-organiser      Wellington Convention Centre, New Zealand
http://libregraphicsday.org       Monday, 18th January 2010
