The 100-year Archive and the Data Preservation Explosion, Part One: The Compounding Storage Growth Rate and Long-term Data Preservation Demand a Next-Generation Archive

By David Morris. Originally posted February 24, 2020.

The technology market is always changing and innovating across the computing, networking, and storage elements, which are the holy trinity of the Von Neumann architecture. Innovation in one area spurs changes in the other two, and as the cycle progresses each element alternates between being the leader and being the bottleneck within the overall architecture. Today, new requirements are emerging for the storage, retention, and reinstatement of electronically generated data assets, and they demand new approaches and new levels of optimization across all three elements.

To understand the emerging storage challenges, it is necessary to segment the storage market. Segmented by active versus passive data access patterns, the market splits into two distinct sections: Operational Storage and Long-term Archival Storage. The two have drastically different requirements. Operational Storage is the high-access, high-performance, low-latency, and very dynamic storage that is actively used every day for business operations, as well as for the short-term backups and snapshots used for quick recovery. Long-term Archival Storage is the cost-efficient, deduplicated, and compressed storage used for information preservation. It holds data that has traditionally been accessed infrequently. Both capabilities are essential for businesses. However, the demands and requirements for Long-term Archival Storage are undergoing significant evolution.

IDC’s Datasphere highlights archival data volume, which currently accounts for 60% of the total Datasphere. IDC’s growth estimates for archive data volumes have consistently increased over the last few years as the implications of new mandates and regulations have become apparent. The growth of archival data volume will compound as more information falls under lengthy data retention and preservation mandates, and we believe it will quickly eclipse the other categories in the Datasphere.
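To make the compounding effect concrete, here is a minimal Python sketch of the standard compound-growth formula, V(n) = V0 * (1 + r)^n. The function names, the starting volume, and the 25% annual growth rate are hypothetical values chosen only for illustration. They are not IDC figures and do not come from the Datasphere report.

```python
import math


def projected_volume(initial_volume: float, annual_growth_rate: float, years: int) -> float:
    """Return the volume after `years` of compounding: V0 * (1 + r) ** years."""
    return initial_volume * (1.0 + annual_growth_rate) ** years


def doubling_time_years(annual_growth_rate: float) -> float:
    """Return the years needed for a compounding volume to double: ln(2) / ln(1 + r).

    Assumes a strictly positive growth rate.
    """
    return math.log(2.0) / math.log(1.0 + annual_growth_rate)


if __name__ == "__main__":
    # Hypothetical inputs for illustration only (not taken from IDC data).
    start_petabytes = 1.0
    assumed_rate = 0.25  # 25% per year, an assumption

    for horizon in (10, 25, 50, 100):
        volume = projected_volume(start_petabytes, assumed_rate, horizon)
        print(f"after {horizon:>3} years: {volume:,.1f} PB")

    print(f"doubling time: {doubling_time_years(assumed_rate):.1f} years")
```

Even a modest assumed rate produces very large multiples over a 100-year horizon, which is why the retention period matters as much as the annual growth rate when estimating archive cost.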

Today’s Data Storage Volumes

As more informational assets become subject to preservation, organizations will have to bear the ever-growing expense of retained data volumes, which will negatively impact profitability. Traditional archive solutions, with their monolithic architectures, cannot scale efficiently to the expected data volumes or meet the new technical requirements for preservation. Today’s solutions become cost-prohibitive as preservation volume grows.

Long-term archives with 100-year preservation lifecycles carry many future implications and demands. Data portability between storage systems is a concern, as storage systems age and are replaced by newer technologies. Over 100 years, the data would need to be migrated to new systems approximately ten times, or roughly once a decade. How does an organization maintain and prove data integrity throughout ten migrations? The Cloud will have similar problems with technology upgrades and multi-Cloud interoperability. Virtualization could abstract the stored information in a data center or Cloud, making it easier to migrate and reducing the impact of system-level technology changes. However, this implies that the abstraction would need to remain backward compatible for up to 100 years, which is “Infinity Plus” from a development perspective and a rather complicated issue. Another concern surrounds the application that generated the data. Fifty to one hundred years from today, will the application still be able to process information created 10, 25, 50, or 100 years earlier, as it transitions from Version 1.0 to Version 100 over time? Will the company that developed the application still be in business to provide application access, licensing, and backward compatibility? These and other questions must be addressed to mitigate the impact of the Long-term Archival data tsunami.
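One common way to approach the integrity question is a fixity check: record a cryptographic digest when the data is first archived, then recompute and compare it after every migration. The Python sketch below illustrates that idea only. The `FixityRecord` class, its fields, and the system names are hypothetical, and SHA-256 is an assumption, since a 100-year archive would likely also need a plan for replacing the hash algorithm itself.

```python
import hashlib
from dataclasses import dataclass, field


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class FixityRecord:
    """Hypothetical record pairing an archived object with its ingest-time digest."""

    object_id: str
    digest: str
    # Audit trail of (system name, check passed) tuples, one per migration.
    history: list = field(default_factory=list)

    def verify_after_migration(self, migrated_data: bytes, system_name: str) -> bool:
        """Recompute the digest on the new system and log whether it still matches."""
        matches = sha256_hex(migrated_data) == self.digest
        self.history.append((system_name, matches))
        return matches


if __name__ == "__main__":
    payload = b"contract-1985.pdf contents"
    record = FixityRecord(object_id="doc-001", digest=sha256_hex(payload))

    # Simulate ten migrations, roughly one per decade; names are illustrative.
    for generation in range(1, 11):
        record.verify_after_migration(payload, f"storage-gen-{generation}")

    # A single corrupted byte after a further migration is detected.
    corrupted = payload[:-1] + b"X"
    print(record.verify_after_migration(corrupted, "storage-gen-11"))
    print(record.history[-1])
```

The audit trail is what lets an organization show, rather than merely assert, that the object survived each migration unchanged. In practice the digest and history would need to be stored independently of the data they protect.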

This blog series will focus on the changes in regulations and mandates, extended retention periods, data integrity inquiries, expense management, and other technical challenges facing the Long-term Archival Storage segment. Fundamentally, we believe a new approach is needed to meet the Long-term Archival requirements.

Published by morrisjd1

David Morris is a technology and business executive with 20+ years of management & high-growth experience in both startup & public companies. His experience spans technology development & innovation, business strategy & management, corporate & business development, engineering, & marketing roles. Recognized for his ability to identify new emerging markets, develop targeted solutions, and create accretive strategic imperatives, David has worked with and advised private equity backed and public companies to position them into high-growth markets, including Kazeon, acquired by EMC, and Cetas, acquired by VMware. With a reputation as a technology thought leader and evangelist through blogs, articles, and speaking engagements, he has advised numerous companies on emerging technology market trends and the impact of disruptive technologies on existing business models. David has founded two companies, launched six (6) companies, and had two (2) successful public turnarounds. His technology experience is across compute, networking, storage, compliance, eDiscovery, SaaS, IoT, cybersecurity, Linux containers for DevOps & Storage, & AI solutions. David holds graduate degrees in Marketing from the University of California, Berkeley-Haas, in Finance from Columbia University in the City of New York, and in Engineering from George Washington University, as well as a bachelor's degree in Physics from Auburn University. He currently advises Aerwave, a next-gen security company, Loop, and Brite Discovery, a GDPR compliance and eDiscovery company. He is active in and is a longtime supporter of Compass Family Services, which serves homeless and at-risk families in San Francisco, The Tech Interactive in San Jose, CA, and The American Indian Science and Engineering Society. In his off time, David enjoys cycling, weightlifting, and scuba diving (especially in Belize).
