Digital Preservation

This guide provides basic information on preserving emails from the Microsoft Outlook system on a campus server. It also provides guidelines for long-term storage of digital documents in various formats. This is intended for materials with enduring value.

Preferred Formats

System dependence is a major concern for digital objects created with proprietary software. Not only is that software necessary to access and use the materials, but software obsolescence also increases preservation workloads, which may involve repeatedly migrating files to new versions. Recommended formats exist for various types of digital media and are listed below. If proprietary software was used to create an object, it is advisable to also create a secondary copy in one of these more stable formats. Maintaining both the original and the secondary format provides some assurance that the content will remain available in the future, even if some of the structure or functionality cannot be replicated.

Text-based objects - PDF/UA-1. UA stands for Universal Accessibility and means these documents support assistive technology such as screen readers. If PDF/UA is not available, select the highest-quality PDF output with features such as searchable text, embedded fonts, lossless compression, high-resolution images, and content tagging.

Digital Photographs/Graphics - TIFF, JPG, or PNG. Select the highest resolution (300 DPI preferred) and bit depth (16 bits per channel preferred) available. Uncompressed and unlayered versions are preferred.

Audio - WAVE. The production version, rather than a pre-production version, is preferred, as are uncompressed files. Files must contain no measures (such as digital rights management technologies or encryption) that control access to or prevent use of the digital work. If the production copy was issued on physical media (CD), save two hard copies.

Video - MXF, MOV (QuickTime), or MPEG-2. The final production version, with the original production resolution and frame rate, is preferred. Files must contain no measures (such as digital rights management technologies or encryption) that control access to or prevent use of the digital work. If the production copy was issued on physical media (DVD), save two hard copies.
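
As a rough illustration of how the format preferences above could be applied when reviewing a set of files, the following Python sketch flags files whose extensions fall outside the recommended formats. The extension lists and the function name needs_secondary_copy are assumptions made for this example, not part of any campus system. An extension is only a hint: it does not confirm a file's actual format, its PDF/UA conformance, or the absence of compression or DRM.

```python
from pathlib import Path

# Assumed mapping of the media types above to common file extensions.
# Extensions are only a hint; they do not verify the actual file format.
PREFERRED_EXTENSIONS = {
    "text": {".pdf"},
    "image": {".tif", ".tiff", ".jpg", ".jpeg", ".png"},
    "audio": {".wav"},
    "video": {".mxf", ".mov", ".mpg", ".mpeg"},
}


def needs_secondary_copy(filename: str) -> bool:
    """Return True if the extension is not in any preferred-format list."""
    ext = Path(filename).suffix.lower()
    return not any(ext in exts for exts in PREFERRED_EXTENSIONS.values())


if __name__ == "__main__":
    # Hypothetical file names, used only to show the check in action.
    for name in ["report.docx", "minutes.pdf", "interview.mp3", "portrait.TIF"]:
        if needs_secondary_copy(name):
            print(f"{name}: consider creating a secondary copy")
        else:
            print(f"{name}: extension matches a preferred format")
```

A flagged file would keep its original and gain a secondary copy in a preferred format, in line with the guidance on maintaining both.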

Storage Environment

Storage media are a major factor in the successful long-term preservation of digital objects. They include everything from magnetic tape used for A/V records, floppy disks, CDs and DVDs, and external hard drives and flash drives to servers and virtual machines. Predicting the life expectancy of the more portable forms of storage media, such as DVDs and external hard drives, is problematic. Storage conditions strongly affect their life expectancy, and because environmental controls and other safety measures are absent for much of their life cycle, using these media as the only storage environment is not recommended.

Digital objects should be stored on servers or virtual machines with redundant storage. Redundancy ensures that if one copy of a digital object is lost or damaged, another copy is available to replace it. Backups are probably the most familiar form of redundancy. Backups are static copies of the data (snapshots), usually used for data recovery. While they are good for business continuity, they are not considered authentic copies, because backup procedures often compress files and, as snapshots, they may not hold the most current version of a file. Replication differs from backup in that it dynamically updates the secondary storage location, so the secondary copy more closely matches the original. The drawback is that if data is accidentally changed in the original and replication happens automatically, the same change is made to the secondary copy, and the authentic data preserved in that copy is lost. It is best not to set up automatic replication without some form of file integrity check in place. The production servers and the backup servers should reside in separate geographical locations to protect against catastrophic loss.
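
One common way to implement a file integrity check of the kind mentioned above is to record a checksum for each file and compare it again before replication runs. The following Python sketch shows the idea under stated assumptions: the choice of SHA-256, the in-memory manifest, and the function names sha256_of, build_manifest, and changed_files are illustrative only and do not describe any particular campus system.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record a checksum for every file under root, keyed by relative path."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return recorded files that are now missing or whose checksum differs."""
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems
```

In this sketch, a manifest would be built when files are accepted for preservation, and changed_files would be run before each replication. If it returns any paths, replication could be held until the change is confirmed as intentional, so that an accidental alteration is not copied over the secondary copy.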

