Digital Preservation in the Government Sector

from a Book of same title by Heather Brooke

At this ARK-hosted event, held in Melbourne on 23-24 October 2013, Rosemary Kaczynski chaired an eclectic array of speakers with a variety of backgrounds and experiences to share. She and many of the presenters were keen to convey that we are all information managers, though our roles and duties vary within our own sectors. Information management is what brings us together to discuss the challenges of preservation, management and retrieval. It is what we do.

I’ve shared some highlights from my perspective and all papers are now available:

Suresh Hungenahally, from the Department of State Development, Business and Innovation, brought vast experience and knowledge of systems and asset security, including the security of conventional forms of storage versus cloud architecture. His first point was that getting your metadata model right, and fitted to your needs whether that means a “deep” or a “shallow” model, is critical. How you define and classify your assets informs your metadata model at the highest level, and may be the best guide to how you organise, secure and access them: Sensitive, Commercial, Resource, Collections, Public, Legal, etc. Function should not necessarily inform this high level, though it can be incorporated within a strong taxonomic classification to underpin asset management. Strong classification and controlled vocabularies assist browsing, retrieval and data migration.

Preservation means that as well as securing an asset you can reproduce it, play it, run it, open it; storage and retrieval are just one part of an overall preservation strategy. Deploying mandatory system enforcement where appropriate increases asset security, which contributes to preservation. Tip: the cloud is a dollars-plus-pragmatism decision. Your cloud may not be your cloud, and expenses can be hidden, so cost it well, particularly in download terms. On-site solutions are arguably still the most secure and cost effective for most. (I asked a question I always ask: why is there no “Arts” sector cloud? Wry agreement and nods from the audience. We will be waiting a while.)

Your people must be incorporated into the workflows they will drive, contributing to systems development, adoption and adaption. Identifying what preservation means for the business is crucial: what to preserve, and for how long? Risk profiling of the organisation’s digital asset management health can be the strongest upwards-management tool for gaining executive support and champions for strengthening the digital preservation model you have.

Ultimately, performance remains a key technical issue that characteristically reflects on adoption. The analogy of pushing a grapefruit through a straw-sized pipe was conveyed. Suresh was kind to a broad and largely non-technical audience. His conclusion was that any DAMS and preservation roll-out still carries a heavy operational commitment. Get your executive on board, get stakeholder buy-in, be customer driven, and allow content personalisation. Prepare to commit to 36 months from strategic directive to deployment, and then to ongoing work: it “is a project that never ends”.
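
To make the “cost it well, particularly in download terms” tip and the grapefruit-through-a-straw analogy concrete, here is a rough back-of-envelope sketch in Python. It was not part of the talk. The function names, the 80% link efficiency default and the figures in the usage note are all my own assumptions for illustration; real vendor pricing and real network throughput will differ.

```python
def transfer_hours(size_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate hours needed to move size_gb over a link rated at link_mbps.

    efficiency is an assumed fraction of the rated speed actually achieved.
    """
    if size_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, link speed > 0, efficiency in (0, 1]")
    megabits = size_gb * 8 * 1000  # decimal gigabytes -> megabits
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


def download_cost(size_gb: float, price_per_gb: float) -> float:
    """Estimate the charge for pulling size_gb back out of a cloud store.

    price_per_gb is a hypothetical per-gigabyte download fee.
    """
    if size_gb < 0 or price_per_gb < 0:
        raise ValueError("size and price must be non-negative")
    return size_gb * price_per_gb
```

For example, `transfer_hours(1000, 100, 1.0)` comes to a little over 22 hours for a 1 TB collection on a perfectly used 100 Mbps link, before any per-gigabyte fee from `download_cost` is added on top.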

Martin Rennhackkamp, from PBT Group, reinforced the advantages and disadvantages of using “The Cloud” to support your digital preservation model. Martin profiled private cloud storage behind the firewall, hybrid cloud and public cloud solutions, identifying the risks and advantages of each. Which solution fits is determined by business need, costs and risks. Where “big data” is concerned, the cloud offers advantages for the non-conventional data packs that often result from special data, census data, social media (Twitter) and/or research work. Increasingly, the big data concept also applies to digital document environments, email, and images/media. Risks remain: the cloud tends to present a larger attack surface, and there are questions over the long-term viability of the vendor, vendor lock-in, the cost of uploads and downloads over the internet, and the protection of sensitive data.

Check out - the Internet Archive’s “Wayback Machine” project
Open source - Hadoop
Europeana – runs on Cassandra cloud
LOCKSS - runs on Amazon cloud

Ken Mould, from Deakin University, outlined the key challenges and successes in deploying an effective records disposal management solution. The goals were customer and governance driven: improve the student and staff experience, and align the institution with its legislative compliance obligations.
The challenges were also what identified the need for an efficient solution:
- geographical spread – we do it this way in Wollongong, etc
- demographic spread – wide age range/experience with/acceptance of technology
- state and federal government compliance - legal
- inconsistent tools and access to IT support, duplication of services - infrastructure

Wide-ranging though inexpensive improvements to infrastructure were identified to align tools, standards and systems support.
Grouping records into the broader categories of Student, Staff, Research, and Corporate Governance made sense for records management in terms of workflows, metadata structure and systems deployment, and it made sense to the users engaging with the solution.

Main recommendations were:
- Implement processes to stop anti-systematic ways of doing things
- Understand the difference between a document (can be modified) and a record (cannot be modified), and educate users in it; the latter is archived
- Quality control and testing – be prepared to experiment
- Apply lossless formats
- Match the metadata model to your business requirements
- Deploy common standards to produce a common result
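
The document-versus-record distinction in the recommendations can be shown in a few lines of Python. This is only my own illustration, not anything Deakin presented: the class names, fields and the `declare_record` and `try_edit` helpers are hypothetical, and a real records system would enforce immutability at the storage layer rather than in application code.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass
class Document:
    """A working document: its content can still be changed."""
    title: str
    body: str


@dataclass(frozen=True)
class Record:
    """A declared record: frozen, so its fields cannot be reassigned."""
    title: str
    body: str


def declare_record(doc: Document) -> Record:
    """Capture the document's current state as an unmodifiable record."""
    return Record(title=doc.title, body=doc.body)


def try_edit(item, new_body: str) -> bool:
    """Attempt to change the body; return True if the edit was accepted."""
    try:
        item.body = new_body
        return True
    except FrozenInstanceError:
        return False
```

Calling `try_edit` on a `Document` succeeds, while calling it on the `Record` returned by `declare_record` is refused and the archived content stays as it was.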

Successful outcomes are measurable: staff and student experience has improved dramatically, efficiencies have dramatically reduced processing and batching workloads, adoption and adaption by staff is increasing and being requested, and the institution is meeting its compliance obligations.

Sally Vermaaten of Statistics New Zealand, David Fowler of Public Record Office Victoria, and Rosemary McLaughlan of the Department of Family and Community Services presented case studies from their very different projects, which collectively yielded some underpinning recommendations and interesting observations.

There is a trend toward digital preservation as a service, but an in-house solution can be achieved using simpler and inexpensive tools, open source or otherwise, that are available online. All emphasised the need to build the solution into existing systems. Identify a metadata model that matches your requirements and facilitates use, reuse, rights management, and content discovery. Ask whether the preservation strategy ensures authenticity, accessibility, operability, and persistent identification. Think about the collection strategy of the organisation, and create a preservation and disposal statement that matches it.
Identify the approach behind your organisation’s preservation model: encapsulation, emulation, or migration/normalisation. Preserving context as well as content is important, and just preserving the 0s and 1s may not achieve this. Challenges remain: over the long term, preservation formats change. Building in adaptability and flexibility whilst locking into standards may involve some compromise of the archival model. Identify the whole of the relationship, from object and asset to historic and current time frames. Provide easy access and well documented processes, enhance working partnerships, and integrate or rationalise all convergences and duplications of systems, policies and procedures. Provide training and support, and manage your provider relationships well.
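
On the “0s and 1s” point, a fixity check is one common way to confirm that the bits at least have not changed, and a short sketch shows how little it covers. The Python below is my own illustration rather than any presenter’s method: the function names and the in-memory dictionary of objects are assumptions, standing in for real files and a real checksum manifest.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 checksum of some content as a hex string."""
    return hashlib.sha256(data).hexdigest()


def make_manifest(objects: dict[str, bytes]) -> dict[str, str]:
    """Record a checksum for every object at the time of ingest."""
    return {name: sha256_of(data) for name, data in objects.items()}


def audit(objects: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """List the names in the manifest that are now missing or altered."""
    failures = []
    for name, expected in manifest.items():
        data = objects.get(name)
        if data is None or sha256_of(data) != expected:
            failures.append(name)
    return failures
```

A clean audit only says the bits are intact. It says nothing about whether the format can still be opened, or whether the surrounding context survived, which is why the strategy choices above still matter.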

We finished the day with a legal wind-up. Emphasising the need for good information management was Maureen Duffy, from Herbert Smith Freehills, who covered how the trend toward eDiscovery adds to the challenges organisations face with regard to documentary evidence, compliance and the law. The key tip here is that documentation supports your version of events. Disputes occur around unsupported versions or recollections of events. Document unavailability is presumed suspect and can go against a party, as the conclusion can be that the document was destroyed. Expense for an organisation can result from holding more documentation while managing it less, and as Maureen noted, with humour and a touch of significance, that is not legally defensible. She shared a quote connected with practising good EDRM standards: “we don’t do this for fun”.

I chose to attend Workshop ‘A’ the following morning with Leisa Gibbons, from Rhizome Digital. It was a brief, perhaps too brief for some, focus group on getting a digitisation project in motion, in terms of the broad brushstrokes. The to-dos: develop a project scope, perform business analysis of requirements, identify organisational objectives, identify regulatory requirements, identify stakeholders, and develop the requirements matrix. Ultimately, document the progress and record the processes. In the main it was a chance to share professional experience, which was valuable to most. I’d recommend this type of session be run in-house to help those newly embarking upon project scoping, business analysis and a project plan for an organisation-led digitisation project. Sorting the stakeholders into “internal, external and distant” groups was a valuable exercise. Leisa also shared her ideas and documentation tool for a “requirements matrix”, which was handy.

From my perspective the forum emphasised that the agencies all face familiar-sounding challenges. Overall it was not quite what I was expecting: quite a bit pertained to business and planning concepts, with case studies focused on digitisation projects whose goal was digital preservation through digitisation (not a criticism, just an observation). There was much on project management tips and tricks, which is always handy to have reinforced. It was an interesting mix of agencies, and I concluded that we are all getting only part of what we want to, or have to, achieve done, but are doing well given the challenges. We could all point to successes in the field that inspired us and helped to demonstrate we are on the right track. No one agency has everything it needs, but each is achieving quite impressive outcomes in its own sphere, including my own.