Self-contained information retention format for future semantic interoperability
Abstract
Long-term preservation of digital information, including large machine-generated data sets, is a growing necessity in many domains. A key challenge in meeting this need is the creation of vendor-neutral storage containers that can be interpreted over time. We describe SIRF, the Self-contained Information Retention Format, which is being developed by the Storage Networking Industry Association (SNIA) to address this challenge. We define the SIRF components and its metadata categories and elements, along with some security guidelines. SIRF metadata includes the semantic information as well as the schema and ontological information needed to preserve the physical integrity and logical meaning of preservation objects. We also describe how the SIRF logical format is serialized for storage containers in the cloud and for tape-based containers. Aspects of SIRF serialization for the cloud are being tested with OpenStack Swift object storage in the EU ForgetIT project.