From potential to real-world impact: solving the unstructured data challenge

By Steve Leeper, VP of Product Management, Datadobi.

Take a look at the tech news headlines on any given day, and you’ll see stories covering the value organisations are attaching to unstructured data. Representing between 80% and 90% of the world’s data, this vast pool of information can be anything from video files and audio recordings to emails, social media posts and the enormous amounts of data now collected from device sensors.

This kind of data is considered unstructured because it typically lacks a predefined format, resists categorisation and is context-dependent. As a result, it doesn’t lend itself to easy organisation, analysis or management, and extracting meaningful insights from it requires advanced tools and techniques.

Traditionally, many organisations have approached unstructured data collection and management by adding more storage. However, the exponential rise in data accumulation rates seen in recent years, particularly for machine-generated data, is making this an expensive proposition, and storage strategies can spiral out of control. Another problem is overreliance on an unstructured data solution designed for a single storage technology when the existing infrastructure actually includes products from multiple vendors, spread across various locations and storing multiple data types.

As a result, most organisations are sitting on large volumes of digital information with the potential to improve business insight, processes and outcomes. Many are trying to realise that potential, with some collecting unstructured data from as many sources as possible in the hope that it can be monetised in some way down the line. The problem most face, however, is translating that latent value into something tangible: businesses everywhere are overwhelmed by the challenge and run the risk of putting money into strategies that don’t deliver.

All this adds up to organisations that don’t know how much data they have, let alone where it is in its lifecycle, what risk levels to attach to it and whether there is any value in paying to keep it. The rollout of GenAI workloads has added further data management complexity to the mix, particularly given the strategic importance of these initiatives.

Squaring the circle

So, organisations everywhere understand that they’re sitting on an asset with transformational potential, but they either don’t know how to capitalise on it or have tried and failed. How can they square this circle? A good starting point is to adopt a mindset whereby unstructured data is seen as a data management issue rather than a storage technology problem.

In this context, one of the most important issues to address is data integration, the process of organising, managing, and optimising unstructured data across diverse storage environments to enhance visibility, reduce risks, and improve cost efficiency. Here, there are some crucial interdependencies to address. Firstly, unstructured data assets need to exist in harmony, be compatible, and be readily available for the types of detailed access and analysis that can drive the outcomes businesses are looking for.

Effective data integration also depends on having access to data that is accurate, consistent, complete and relevant – a requirement that drives the need for highly effective quality management processes. All of this depends on adherence to comprehensive data governance policies and procedures to ensure that all data within an organisation is properly documented, stored, and maintained. This includes conducting regular data audits, assigning clear ownership and responsibility and establishing guidelines for data creation and storage. Effective data governance is also essential for mitigating risks associated with unstructured data, such as security vulnerabilities, compliance issues and operational inefficiencies.

Of course, a key part of this picture is adopting vendor-agnostic data management technologies that seamlessly integrate unstructured data across diverse storage systems, applications and cloud platforms. For organisations with complex, multi-environment architectures (and there are many), getting this right can help deliver the transformational impact at the heart of unstructured data narratives.

Indeed, organisations that successfully bring these strategic and technological elements together can do the data equivalent of ‘crossing the chasm’ to integrate unstructured data into their mainstream activities with enormously beneficial results.
