Tracking Digital Collections at the Library of Congress, from Donor to Repository

When Kathleen O’Neill talks about digital collections, she slips effortlessly into the info-tech language that software engineers, librarians, archivists and other information technology professionals use to communicate with each other. O’Neill, a senior archives specialist in the Library of Congress’s Manuscript Division, speaks with authority about topics such as file signatures, hex editors and checksums even though she has a traditional paper-centric Master of Library Science degree. She picked up her technology expertise on the job, through years of rescuing digital content from erratic computers, troublesome files and unstable storage media.

The Library often acquires a collection at the end of someone’s career, which means that many of the digital files that O’Neill sees in collections may have been created decades ago. And chances are good that some of the storage devices, or the files they contain, will be obsolete and will require the Manuscript Division to process them with special digital forensics resources.

When a collection is first received by the Manuscript Division, a staff member reviews the contents and, if digital media devices are found, transfers them to O’Neill, the digital collections registrar. Archivists might also find digital storage devices among the paper documents later, when they are processing the collection. In either case, O’Neill records receipt of the materials in a local database. The record includes the collection name, collection number, a registration number and any additional notes about the materials. Said O’Neill, “If I get digital material, I give it a registration ID and that forms the beginning of what will become a unique ID for each piece of media.” This begins the tracking information, or what O’Neill calls the “chain of custody.”
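
As a rough illustration of the kind of record described here, the following Python sketch models a registration entry whose ID becomes the prefix of each media item's unique ID. It is not the Manuscript Division's actual database or numbering scheme: the class name, field names, ID format and sample values are all hypothetical, chosen only to show how a registration ID might seed per-media identifiers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RegistrationRecord:
    """Hypothetical registration entry; field names are assumptions."""
    collection_name: str
    collection_number: str
    registration_id: str
    notes: str = ""
    media_ids: List[str] = field(default_factory=list)

    def register_media(self) -> str:
        """Assign the next media ID, built from the registration ID.

        The "<registration_id>_<sequence>" format is an assumed scheme,
        not one documented in the article.
        """
        media_id = f"{self.registration_id}_{len(self.media_ids) + 1:03d}"
        self.media_ids.append(media_id)
        return media_id


# Placeholder values for illustration only.
record = RegistrationRecord(
    collection_name="Example Papers",
    collection_number="MSS00000",
    registration_id="REG0001",
    notes="Two floppy disks found among correspondence.",
)
first_disk = record.register_media()
second_disk = record.register_media()
print(first_disk, second_disk)
```

Because every media ID starts with the registration ID, any single disk or drive can be traced back to the record of its receipt, which is the essence of a chain of custody.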