Archive for Digital_Collections

Building legal digital collections on the cheap

Today, while surfing the web, I found this YouTube video of Eric Gilson and John Joergensen of Rutgers Camden Law Library presenting at the 2005 CALI (Center for Computer-Assisted Legal Instruction) conference. In it, Gilson and Joergensen discuss how to digitize congressional documents and build a digital library at low cost using open-source software. Unfortunately, you cannot really see the screen output from the projector, but the concepts are still relevant.

For more specifics about full-text indexing, here is Joergensen at CALI 2010 explaining the Swish-e search engine, which Rutgers uses to index the congressional documents:



Some of the processing details have changed since 2005, but the Digital Library is still running today, with over 13K congressional documents processed in its U.S. Congressional Documents collection.

If you are interested in legal informatics, I highly recommend CALI’s YouTube channel.

Rutgers Law Library (Camden, NJ) Project Featured in Philadelphia Inquirer


John Joergensen of Rutgers Law Library Camden NJ (Image by Kevin Riordan of the Philadelphia Inquirer)


Last week, my supervisor, John Joergensen, was profiled in the Philadelphia Inquirer. Check out the article to see the kinds of projects we do at Rutgers Camden and why we do them. The tall bookshelves featured in the photographs are right next to the desk where I sit in Room 510, aka the “scanning room”, the hub of the Rutgers Law Camden Digital Library.

Way to go, John!

If you are interested in John’s work, check out his blog at A Hacked Librarian, or follow him on Twitter: @jjoerg42

GSP: Presentation on Newspaper Digitization Projects (8/6/11)

On Saturday, I attended a local presentation on large-scale historical newspaper digitization projects, sponsored by the Genealogical Society of Pennsylvania. The first speaker was Sue Kellerman of the Penn State University Libraries Digitization and Preservation unit. She discussed her role in the Pennsylvania Newspaper Project from 1983 to 1990 and how she traveled around Pennsylvania finding and cataloging historical local newspapers held by libraries, archives, historical societies, and individuals. She even rescued some old newspapers by purchasing them from antique dealers. What a fascinating, albeit daunting, job to be tasked with!

The second part of the presentation centered on the PA Newspaper Digitization Project, managed by Karen Morrow of Penn State University. Five historical newspapers from different counties were selected for digitization from microfilm masters. During Phase I, the project received NEH funding to digitize approximately 103,000 pages from papers dating from 1888 to 1922. In Phase II, they received a second grant to digitize another 100,000 pages dating from 1836 to 1922. These papers are now included in the Library of Congress searchable Chronicling America database. Because of the extensive metadata provided and the full-text keyword search capabilities, these newspapers are wonderful free resources for genealogists. The database contains almost 4 million digitized pages from newspapers dating from 1836 to 1922 (public domain) covering about 25 U.S. states. In addition to full-text searching, users can also search the U.S. Newspaper Directory, which contains listings from 1690 of all newspapers available in digital formats (via free or pay sources) from all U.S. states.

Scans were produced using the LC NDNP technical specifications. Technical information about the database API can be found here.
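For readers who would like to try the API from a script, here is a minimal Python sketch that builds a full-text search URL. It only constructs the URL and does not contact the server. The endpoint path and the parameter names (`andtext`, `format`, `state`, `dateFilterType`, `date1`, `date2`) are my assumptions based on the publicly documented Chronicling America search interface and may have changed, and the helper name `build_search_url` is purely illustrative, so please check the LC documentation before relying on it.

```python
from urllib.parse import urlencode

# Assumed base URL of the Chronicling America page-search endpoint.
BASE_URL = "https://chroniclingamerica.loc.gov/search/pages/results/"


def build_search_url(text, state=None, year_from=None, year_to=None):
    """Build a full-text search URL that asks for JSON results.

    Hypothetical helper: the parameter names below are assumptions
    about the public search interface, not confirmed by this post.
    """
    params = {"andtext": text, "format": "json"}
    if state:
        params["state"] = state
    if year_from is not None and year_to is not None:
        # Restrict the search to a range of publication years.
        params["dateFilterType"] = "yearRange"
        params["date1"] = str(year_from)
        params["date2"] = str(year_to)
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Example: pages mentioning "coal strike" in Pennsylvania papers, 1888-1922.
    print(build_search_url("coal strike", state="Pennsylvania",
                           year_from=1888, year_to=1922))
```

Pasting the printed URL into a browser is an easy way to see what comes back before writing any code that downloads results.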

I was interested to hear that using the LC technical specifications, scans from microfilm to digital file cost the project approximately $1.00 per page.
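To put that figure in perspective, here is a rough back-of-the-envelope calculation in Python using the page counts mentioned above. It assumes the roughly $1.00 per page rate applied uniformly to both phases, which the presenters did not state, so treat the totals as ballpark numbers only. The function name `estimate_scanning_cost` is just for illustration.

```python
# Approximate per-page cost quoted at the presentation (USD).
COST_PER_PAGE_USD = 1.00

# Approximate page counts for each grant phase, as given at the presentation.
PHASE_PAGES = {"Phase I": 103_000, "Phase II": 100_000}


def estimate_scanning_cost(pages, cost_per_page=COST_PER_PAGE_USD):
    """Return a rough microfilm-to-digital scanning cost in USD."""
    return pages * cost_per_page


total_pages = sum(PHASE_PAGES.values())

if __name__ == "__main__":
    for phase, pages in PHASE_PAGES.items():
        print(f"{phase}: about ${estimate_scanning_cost(pages):,.0f}")
    print(f"Both phases ({total_pages:,} pages): "
          f"about ${estimate_scanning_cost(total_pages):,.0f}")
```

A small historical society could swap in its own page count to get a first fundraising target.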

Morrow and her team provided the audience with detailed reference lists of Pennsylvania newspapers which are already digitized and can be accessed for free or via paid subscriptions. Resource lists are available on the Digitized Titles page of the PADNP Phase II Blog.

Both Kellerman and Morrow mentioned that, currently, the LC is only able to accept newspapers in English, French, Spanish and Italian due to font issues. In the future, they hope that LC will be able to accept German fonts so that the large number of historical German-language newspapers found in Pennsylvania can be digitized. The team will be writing a grant for Phase III of the limited three-phase grants from the National Endowment for the Humanities to continue their work beyond 2012. I sincerely hope that they receive their grant, and future ones, so that they can preserve our state’s wonderful historic periodicals.

In addition, there are many more titles that could not be chosen due to time and financial constraints. Many counties in the state of Pennsylvania are still underrepresented in the digitization of historic newspaper content. (See the PADNP map tracking digitization rates across counties in the state.) The field of digital collection building is ripe for greater work in this area, especially now that Google has decided not to add new titles to its newspaper digitization endeavors.

What should small libraries or historical societies do if they want to preserve their historic newspapers but lack the funds to do so? The presenters suggest that small institutions focus their limited resources on microfilming, for long-term preservation, the issues or runs that get the most reference requests or patron usage and are therefore most likely to suffer wear and damage. The key is to preserve as much as you can, even in partial runs or in stages over time, instead of waiting for funding and not preserving anything. Small-scale fundraising can be very effective if done in stages.

At the end of the presentation, Kellerman focused on a special collection sponsored by PSU Libraries, the Pennsylvania Civil War Era Newspaper Collection. Did you know that Pennsylvania is currently the only state to have published online a digital collection of newspapers dating specifically from the Civil War era? (I did not. Kudos to PSU!)

Thank you to the Genealogical Society of Pennsylvania for sponsoring this informative presentation.