IF YOU LOOK AT THE FEES CHARGED by the architect who designs a bridge and the engineer who puts together the functional spec, they are much higher than those charged by the people who build and maintain it. The materials used represent a large part of the cost of the build, so they need to be kept to a minimum: a strong, light bridge is the most efficient, and this is also true for data centres. I’m sure the man who regularly repaints the Forth Bridge considers himself a vital part of the project, and indeed he is, as maintenance must be kept up to date, but in terms of cost he is a small element.
Setting up a new data centre, or improving an existing one, is similar, in a way, to building a bridge. The skill is in the design: a poorly designed system will consume more resources, cost more to maintain, cause more aggravation and potentially fail.
The data centre’s job is to run applications efficiently for the business; we all know this, but it is sometimes forgotten in discussions with technology suppliers. In an ideal world the IT infrastructure would come with a Service Level Agreement (SLA) for the performance of its applications. The world is not, however, ideal: hardware manufacturers will not guarantee application performance because they are not responsible for the whole system; the best they offer is an availability guarantee for their own solution. Similarly, if you outsource your application to a hosting provider, most will not give you a performance SLA, and you will find that the responsibility still sits with you, not them.
In the good old days an application ran in a physical environment: you knew which servers, switches and storage it used. The system was designed to cope with peak loads but operated most of the time at well below half capacity, and if there was concern over performance peaks the hardware was overprovisioned to provide a cushion. This approach ensured that the application ran well, that problems could be identified quickly and that upgrades could be planned in, but it was wasteful of resources. So along came virtualisation to improve hardware utilisation: servers, switches and storage could now be managed more efficiently and cost was reduced, but an additional layer of complexity made management more difficult.
This management issue meant that large, business-critical applications were rarely virtualised to any great extent; virtualisation still delivered a cost benefit for less critical applications, so costs were reduced. The next wave of new technology was the private cloud. Cloud computing provided yet more efficiency improvements and cost reduction, but added an even further degree of complexity; the result was that the designers of cloud systems over-engineered and over-provisioned hardware capacity to ensure application performance, and we were back to square one.
Why is this happening? The additional complexity of virtualisation and cloud means that you can no longer see what is happening to the application end to end. Each element has tools that show you how it is working, but you lack visibility into the system as a whole, and certainly into how the user sees the application perform. What is needed is Infrastructure Performance Management (IPM): the ability to continuously capture, correlate and analyse the system-wide performance of applications in real time, which allows you to set a performance SLA. Importantly, it also shows the utilisation and health of heterogeneous physical, virtual and cloud computing environments, letting you optimise the infrastructure supporting the applications and reduce underutilisation and overprovisioning.
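To make the “capture and correlate” idea concrete, here is a minimal, hypothetical sketch in Python. Nothing in it comes from a real IPM product: the TransactionTrace structure, the tier names and the SLA_MS threshold are all illustrative assumptions, but they show how per-tier measurements roll up into an end-to-end figure that an SLA can be set against.

```python
from dataclasses import dataclass

# Hypothetical, simplified model of what an IPM tool correlates: per-tier
# timings for a single application transaction across shared infrastructure.
@dataclass
class TransactionTrace:
    txn_id: str
    server_ms: float   # time spent on the (physical or virtual) server
    network_ms: float  # time spent traversing the switches
    storage_ms: float  # time spent waiting on the storage array

    @property
    def end_to_end_ms(self) -> float:
        return self.server_ms + self.network_ms + self.storage_ms


SLA_MS = 50.0  # illustrative end-to-end performance SLA, not a real figure


def check_sla(traces):
    """Flag any transaction that breaches the SLA and name the dominant tier."""
    for t in traces:
        if t.end_to_end_ms > SLA_MS:
            tiers = {"server": t.server_ms,
                     "network": t.network_ms,
                     "storage": t.storage_ms}
            slowest = max(tiers, key=tiers.get)
            print(f"{t.txn_id}: {t.end_to_end_ms:.1f} ms breaches the "
                  f"{SLA_MS:.0f} ms SLA, dominated by {slowest} "
                  f"({tiers[slowest]:.1f} ms)")


check_sla([
    TransactionTrace("txn-001", server_ms=8.0, network_ms=2.0, storage_ms=55.0),
    TransactionTrace("txn-002", server_ms=6.0, network_ms=1.5, storage_ms=4.0),
])
```

The value of the correlation step is that a breach can be attributed to a specific tier rather than merely observed, which is what lets you optimise the infrastructure instead of simply adding more of it.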
To implement a private cloud you also need to implement IPM; cloud-based infrastructures cannot be successfully implemented without it. To come back to the bridge analogy, you wouldn’t try to build a complex bridge blindfolded, would you? Similarly, a well-designed private cloud will incorporate continuous, real-time monitoring, applied in an unbiased way across the whole heterogeneous system.
The real-time aspect is crucial to the data centre build. By real-time I mean being able to monitor the performance of every application transaction at line speed across the shared infrastructure. Latency in applications is mainly caused by many small issues combining to create a big one.
The higher the granularity and frequency of measurement, the better the data, and the better your ability to predict and avoid catastrophic problems. Traditional tools that manage the individual elements tend to average results over minutes, because they rely on polling technologies that hit performance if used too frequently. These averaged results, however, give a false impression of what is going on. Think of cars going over a bridge: average their speed over five minutes and they all appear to be travelling at the same pace; in reality each car moves at its own speed, and if one slows down it will not show up in the average. The same is true of application traffic. An IPM solution shows more depth, gives better data, and gives you more knowledge, enabling a greater Return on Investment (ROI).
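The cars-on-a-bridge point is easy to demonstrate with a few lines of Python. This is a toy illustration with synthetic numbers, not output from any monitoring tool: most transactions complete in around 5 ms, ten of them stall at 500 ms, and the five-minute average barely moves while the per-transaction view exposes the stalls immediately.

```python
import random
import statistics

# Toy example with synthetic numbers: five minutes of per-transaction
# latencies on a shared link. Most transactions take roughly 5 ms, but ten
# of them stall at 500 ms -- the one car on the bridge that has slowed down.
random.seed(42)
latencies_ms = [random.gauss(5, 1) for _ in range(2990)] + [500.0] * 10
random.shuffle(latencies_ms)

# A polling tool that averages over the whole window sees almost nothing...
print(f"Five-minute average     : {statistics.mean(latencies_ms):.1f} ms")

# ...while capturing every transaction exposes the stalls immediately.
stalls = [ms for ms in latencies_ms if ms > 100]  # 100 ms is an arbitrary cut-off
print(f"Worst transaction       : {max(latencies_ms):.1f} ms")
print(f"Transactions over 100 ms: {len(stalls)}")
```

The average comes out at around 6–7 ms, which looks perfectly healthy, while the per-transaction data shows ten users waiting half a second.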
Measurement of the application should be performance-based. If this approach is taken, and there really isn’t another way to run an efficient physical, virtual and cloud infrastructure, then business-aligned SLAs can be set. System-wide optimisation becomes possible because you can see how everything is utilised and performing, and reporting can show the business the reduced CAPEX and OPEX and the new efficiency.
Building data centres, like building bridges, is complex, but with the right information it is a far more efficient process with predictable results. It also helps bridge the gap between IT and the business, and IPM enables you to actually achieve the benefits of virtualisation and the cloud.
“More depth = Better data = More knowledge = Greater ROI”