I read with interest the recent announcement by storage vendor HGST, a subsidiary of Western Digital, of its new Active Archive System array. I can certainly see the need for more solutions of this type, given the vast amounts of unstructured data that data centres and other institutions now have to deal with. This is fuelling the need for new data and object storage approaches that can retain it all whilst also making it accessible.
Emerging mandates for retaining more data are also driving demand. For example, the Engineering and Physical Sciences Research Council (EPSRC), the UK's main agency for funding research in Higher Education institutions, has set out clear expectations for how data from funded research is managed, curated and preserved, requiring that data is made accessible for reuse by other researchers with as few restrictions as possible. Its new policy framework comes into force on 1st May 2015, and those conducting funded research projects need to be aware that it’s their responsibility to do the data management planning under the auspices of their institutional Research Data Management policies.
These new mandates will affect all UK universities and other research institutions conducting funded research. Add the petabyte-scale requirements on data centres nowadays to archive data beyond the ‘create and modify’ stage and you have two major contributing factors that are radically changing the storage technology sector.
Going back to the new HGST offering, this system is all about what we call ‘active’ data archiving, in that the data is always there on spinning disk, which is great for real-time data analytics and ‘big data’ activities, and great for fast access times. That said, the new HGST system doesn’t come cheap. You get 4.7 raw petabytes per rack at a cost of $850,000. If you do the maths, that equates to around $180 per terabyte as a one-off cost for the rack itself. But this doesn’t take into consideration the cost of maintenance and upgrades or, crucially, the resources needed to manage it and the utility costs to run it. HGST claims the drives consume one watt per terabyte, which is low for spinning disk. So while the system offers lower power and running costs than traditional disk arrays, they are still much higher than tape.
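For anyone who wants to check the maths, here is a minimal sketch of the sums in Python. It uses only the figures quoted above (4.7 raw petabytes, $850,000 per rack, one watt per terabyte) and assumes decimal units, so 1 PB = 1,000 TB. The function names are purely illustrative, and the power figure covers the drives alone under HGST's claim, not controllers, networking or cooling.

```python
# Figures quoted in the article; everything else here is an illustrative assumption.
RACK_CAPACITY_PB = 4.7     # raw capacity per rack
RACK_PRICE_USD = 850_000   # one-off purchase price per rack
WATTS_PER_TB = 1.0         # HGST's claimed drive power consumption
TB_PER_PB = 1000           # assumption: decimal (SI) units
HOURS_PER_YEAR = 8760      # 365 days x 24 hours


def cost_per_tb(price_usd: float, capacity_pb: float) -> float:
    """One-off purchase cost per raw terabyte."""
    return price_usd / (capacity_pb * TB_PER_PB)


def drive_power_watts(capacity_pb: float, watts_per_tb: float) -> float:
    """Continuous power drawn by the drives in one rack."""
    return capacity_pb * TB_PER_PB * watts_per_tb


def annual_energy_kwh(power_watts: float) -> float:
    """Energy used over a year of always-on operation, in kilowatt-hours."""
    return power_watts * HOURS_PER_YEAR / 1000


price = cost_per_tb(RACK_PRICE_USD, RACK_CAPACITY_PB)
power = drive_power_watts(RACK_CAPACITY_PB, WATTS_PER_TB)
energy = annual_energy_kwh(power)

print(f"Purchase cost per raw TB: ${price:,.2f}")
print(f"Drive power per rack:     {power:,.0f} W")
print(f"Drive energy per year:    {energy:,.0f} kWh")
```

Run as written, this gives a purchase cost of roughly $181 per raw terabyte, in line with the "around $180" figure, and a continuous drive load of about 4.7 kW per rack. Multiply the annual energy figure by your own electricity tariff to get a first estimate of the running cost that the headline price leaves out.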
This leads me on to point out that there are different types of archiving for different data requirements, and it is worth thinking about that before you opt for a particular solution. The one aspect of the HGST solution that customers need to be aware of is that the integrity of the data is not guaranteed, i.e. you cannot guarantee that you will get back what you archived. There are other solutions, like Arkivum/100 for example, that guarantee that you get back all the data you archive. We provide large-scale, long-term and cost-effective digital archiving services with a unique 100% data integrity guarantee. For some organisations and industries this might not matter, in the same way that speed won’t matter as much for others. For example, we work closely with UK Higher Education institutions that have to meet the curation and preservation expectations of EPSRC. Here data integrity matters big time. Also, when I talk about long-term archiving, long term means anything up to 25 years and large scale means petabytes of data. Our solutions meet the particular needs of organisations that have very large data volumes, long retention times on that data and a high cost of data loss.
All of these factors need to be taken into consideration when you are looking for the right archiving solution. One thing is for sure: volumes of data are only set to continue to increase, and many of the traditional storage offerings are struggling to keep up with not only the volume of data but also the increased demand for high-speed data processing. Customers are also finding it increasingly difficult to buy solutions at an affordable price. My advice is that organisations rethink their storage needs, now and into the future, and consider in context the potential benefits of new technologies, whilst also understanding that if they handle their storage requirements badly they may be putting their business at risk.