Monday, 2 December 2013

Harnessing the Strategic Value of Your Content Begins with Content Inventory and Cleanup


The proliferation of digital content continues unabated. A recent IDC study estimates that “…from 2005 to 2020, the digital universe will grow by a factor of 300, from 130 exabytes to 40,000 exabytes, or 40 trillion gigabytes (more than 5,200 gigabytes for every man, woman, and child in 2020). From now until 2020, the digital universe will about double every two years…” At the same time, investment in the IT infrastructure and services needed to manage this unprecedented growth in digital content is anticipated to grow by only 40%. A report by Oxford Economics titled “The New Digital Economy” estimates that in 2013 the “total size of the digital economy is about $20.4 trillion, equivalent to roughly 13.8% of all sales flowing through the world economy…”.
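As a quick sanity check, the IDC figures quoted above can be cross-checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes decimal units (1 exabyte = 10^9 gigabytes), and the variable names are illustrative rather than taken from the study.

```python
import math

# Figures quoted from the IDC study above.
start_eb = 130          # size of the digital universe in 2005, in exabytes
end_eb = 40_000         # projected size in 2020, in exabytes
years = 2020 - 2005
gb_per_person = 5_200   # quoted per-capita figure for 2020

# Assumption: decimal units, so 1 exabyte = 10**9 gigabytes.
GB_PER_EB = 10**9

growth_factor = end_eb / start_eb              # overall growth, 2005 to 2020
total_gb = end_eb * GB_PER_EB                  # 2020 total in gigabytes
implied_population = total_gb / gb_per_person  # head count implied by the per-capita figure
doubling_years = years / math.log2(growth_factor)  # average doubling time over the period

print(f"growth factor: {growth_factor:.0f}x")
print(f"total in 2020: {total_gb:.1e} GB")
print(f"implied population: {implied_population / 1e9:.1f} billion")
print(f"average doubling time: {doubling_years:.2f} years")
```

The numbers hang together: the growth factor comes out slightly above 300, the 2020 total is 40 trillion gigabytes, and the average doubling time over the fifteen years is a little under two years, consistent with the study's "about double every two years".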

Given the accelerated growth of the digital economy, the widening gap between the unmanaged growth of digital content and the investment necessary to harness it is becoming unsustainable, and it is creating significant challenges for both private and public sector organizations. These challenges span security, privacy and operational risks. IDC noted that “much of the digital universe is unprotected”, which leaves organizations poorly placed to meet increasingly complex data privacy regimes. The cost of remedying data breaches and complying with e-discovery requests can be prohibitive, and such events may also damage an organization’s reputation, brand and competitive advantage. The cost of remedying a data breach is estimated at $200 per record, and the cost of collecting, reviewing and producing documents pursuant to an e-discovery request can run into the millions of dollars, particularly in highly litigated industries.

While most organizations have well-defined content lifecycles, records management systems and policies in place, they continue to lack clear insight into what content they have accumulated over time. A particular challenge is managing legacy data in file systems, older versions of document repositories and email systems. A recent AIIM survey found that 61% of respondents indicated that “organizational assets are not leveraged to maximum effect” and 46% “consider that storage media and IT infrastructure will be swamped with uncontrolled content if no actions are taken…” A study by Haystac Associates, a software and services company focused on information governance best practices, found that “most organizations don’t know where all their data is and lack tools to systematically filter it. The amount of time spent on searching for content is estimated at 24% which may be considerably reduced if the data is properly cleansed, organized and well identified. Understanding where the data is located is a necessary starting point for a digital landfill clean-up…”

The Haystac analysis is particularly instructive in that it provides a systematic foundation for the content inventory and cleanup process. That process begins with a content identification phase, which uses advanced tools to crawl, index and classify content repositories against organizational taxonomies that may be based on subject, function, hybrid or faceted classification schemes.

The second phase of a digital landfill cleanup project is the content analysis phase, the objective of which is to determine the value and relevance of the documents identified in the initial content identification phase. The analysis may encompass a number of variables: the age of the document; its organizational value; the authors who created it and for what purpose; the application in which it was created (this is particularly relevant from the perspective of long-term preservation and longevity standards); the version level and number of versions; and its business and archival value, consistent with organizational retention and archival policies.

The third and final phase of the digital landfill project is the content cleanup phase, the objective of which is to determine what should be kept: what should be retained because of its business value, what should be migrated to a system of record as part of a managed repository, and what should be preserved and archived in compliance with record retention and archival policies and regulatory mandates. The outcomes of the content cleanup phase may be illustrated in the following diagram:
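As a rough complement, the three phases described above (identification, analysis, cleanup) can be sketched in Python. This is a minimal illustration under stated assumptions, not the Haystac method or any product's API: the keyword taxonomy, the age thresholds, the `Document` fields and the "flag for manual review" outcome are all hypothetical, and a real project would crawl actual repositories and apply the organization's own taxonomy and retention schedule.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Disposition(Enum):
    RETAIN = "retain in place for its business value"
    MIGRATE = "migrate to a system of record"
    ARCHIVE = "preserve and archive"
    REVIEW = "flag for manual review"  # hypothetical extra outcome


@dataclass
class Document:
    path: str
    last_modified: date
    is_latest_version: bool = True


# Hypothetical function-based taxonomy: category -> keywords found in the path.
TAXONOMY = {
    "finance": ("invoice", "budget"),
    "hr": ("resume", "payroll"),
    "legal": ("contract", "agreement"),
}

# Hypothetical thresholds; real values come from retention and archival policy.
ACTIVE_WITHIN_YEARS = 1
ARCHIVE_AFTER_YEARS = 7


def classify(doc):
    """Phase 1 (identification): match the document against the taxonomy."""
    lowered = doc.path.lower()
    for category, keywords in TAXONOMY.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return None


def age_in_years(doc, today):
    """Phase 2 (analysis): one of several possible variables, document age."""
    return (today - doc.last_modified).days / 365.25


def decide(doc, today):
    """Phase 3 (cleanup): map the earlier results to an outcome."""
    if classify(doc) is None or not doc.is_latest_version:
        return Disposition.REVIEW
    age = age_in_years(doc, today)
    if age > ARCHIVE_AFTER_YEARS:
        return Disposition.ARCHIVE
    if age > ACTIVE_WITHIN_YEARS:
        return Disposition.MIGRATE
    return Disposition.RETAIN


today = date(2013, 12, 2)
inventory = [
    Document("shares/finance/budget_2012.xls", date(2012, 6, 1)),
    Document("old/contract_acme.doc", date(2001, 1, 1)),
    Document("hr/payroll_nov.xls", date(2013, 11, 1)),
    Document("tmp/notes.txt", date(2013, 11, 1)),
]
results = {doc.path: decide(doc, today) for doc in inventory}
for path, outcome in results.items():
    print(f"{path}: {outcome.value}")
```

Keeping classification, analysis and disposition in separate functions mirrors the phased approach: each step can be replaced with a more capable tool (a real crawler, a richer set of analysis variables) without changing the others.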
Organizations often focus on the go-forward strategy for harnessing the value of their knowledge assets and defer the tough decisions about how to address their legacy content, their digital landfill. The confluence of IT consolidation, budget cutbacks and the changing composition of the workforce means that content inventory and cleanup ought to rank much higher among IT/IM project priorities. The downstream costs associated with the continued growth of unmanaged content repositories far outweigh the investment required for content cleanup initiatives.
