November 29, 2018
Daniel Greenberg, Rosetta Product Manager, Ex Libris
In honor of World Digital Preservation Day, I would like to share a few thoughts about principles you should take into account when selecting a preservation system.
3 Principles for Selecting a Digital Preservation Solution
1. Interoperability – Every institution manages different types of data for various departments, from research data, through archival content, to creative image editing. The digital preservation system plays a specific role in this institutional ecosystem and, as in any other team effort, it has to integrate seamlessly with the other “players.” This is done by:
- Supporting common protocols for harvesting, publishing and searching content, such as OAI-PMH and SRU.
- Supporting ingest of content through multiple methods and structures; e.g., BagIt, METS, CSV, and XML.
- Providing external APIs for as many modules as possible in the system. These must be comprehensive and well-documented.
- Providing out-of-the-box integrations with different types of leading content management systems, such as for research and archival data.
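To make the harvesting point concrete, here is a minimal sketch of what an OAI-PMH client does: it builds a `ListRecords` request and reads record identifiers from the response. The base URL, the sample response, and the helper names are hypothetical and for illustration only. They are not taken from Rosetta or any specific product. Only the `verb`, `metadataPrefix`, and `set` parameters and the XML namespace come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL (helper name is hypothetical)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def extract_identifiers(response_xml):
    """Return the record identifiers found in an OAI-PMH response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]


# Made-up response, trimmed to the elements this sketch reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # Hypothetical endpoint; no network request is made here.
    print(build_list_records_url("https://repository.example.org/oai", set_spec="theses"))
    print(extract_identifiers(SAMPLE_RESPONSE))
```

A real harvester would also need to send the request, follow resumption tokens for large result sets, and handle protocol errors, all of which this sketch leaves out.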
2. Standards – Discipline is not only helpful in raising kids! As Taiichi Ohno, father of the Toyota Production System, once put it: “Without standards, there can be no improvement.” Using standard metadata schemas and communication protocols contributes to system efficiency, by:
- Providing interoperability between new and existing services and applications.
- Enabling compliance with policies and regulations.
- Speeding up introduction of innovative features.
- Laying the foundation for a robust exit strategy, in case the vendor goes out of business.
3. Scalability – As the institutional manager, you want to know that files will continue flying through the system queues, even when you reach those crazy peak seasons with one thousand ingests a day. But to be honest, you have no idea how many servers you’ll need a year from now. So, how many should you install initially? To put your institution at ease, the system must prove scalability in multiple dimensions:
- Architectural scalability: Start small and grow big. The system should allow institutions to expand the throughput over time without compromising performance. It should also be able to dedicate servers for particular roles, and process millions of files and TBs per day, while serving multiple end users accessing stored content simultaneously.
- Operational scalability: Preservation planning poses endless challenges: you have to juggle multiple formats, standards, tools, policies, risks, and the list goes on. The system must include enough open, extensible points to let institutions tailor it to their ongoing, complex decision-making processes.
- Informational scalability: Managing preservation is not an easy task! It requires constant research of the latest strategies, practices, tools and policies. The system must be able to incorporate and integrate the results of research by an active and collaborative user community.
- Organizational scalability: Let’s say that after several years of working with a preservation system, you’ve become a proud expert! Word has reached the smaller libraries in your state, which do not have the resources to reach your level of expertise, and they’re turning to you for help. How can you administer multiple institutions with a single installation? Well, that’s a crucial aspect of a preservation system: it should support a flexible consortium model, allowing larger institutions to provide services to smaller ones.
If you are interested in learning more, feel free to reach out and contact us directly from the Rosetta page.
Happy World Digital Preservation Day!
Daniel & the Rosetta Team