What to measure, and understand, about the cloud

A Compuware-sponsored survey has shown some user dissatisfaction with the cloud, especially in terms of provision and reliability of services, but it can also be said to show that many users’ expectations of cloud service delivery need to be tempered with greater understanding.


A recent survey undertaken on behalf of Compuware by Thomas Mendel, the founder of analyst firm Research in Action, can be taken as showing that many business users’ perception of cloud service provision is not good.

But it could also be showing a more understandable trend – at least in the short term – though one that could become worrisome if it continues. In short, the survey shows that many business and IT decision makers have a skewed view of what is achievable with the cloud, and perhaps need to rethink which cloud service metrics actually tell them something useful.

For example, the survey shows that expectations on cloud performance are not being met and that this is having a negative effect on users’ perceptions about the efficacy of cloud in real coal-face service.

This showed up in three areas of performance assessment.

Perhaps the most interesting result was that some 64 percent of the respondents said they had a poor cloud experience due to performance bottlenecks. Another majority, 51 percent, said they were left with a negative impact on brand perception and customer loyalty, while close to a majority, 44 percent, said they had suffered a loss of revenue due to poor availability, performance, or troubleshooting of cloud services.

Taken at face value, these show the provision of cloud services in a less than flattering light. It seems that cloud service providers, be they SaaS shops, Cloud Service Providers (CSPs) and managed hosters, or a company’s internal IT department, fail to deliver what the majority of users expect of them.

There is a flip-side to this, of course, which is that the users themselves may not just be expecting far too much, but also expecting things that are in practice impossible to deliver: a point made by Michael Allen, Compuware’s Vice President of EMEA Strategic Partners and Business Development.

“There is a need to understand what factors are affecting performance,” he said. “It won’t always be the application or the specifics of the server, it could be network factors, or the internet itself.”

One of the points most commonly missed by end users as they form their perception of cloud services is that much – probably most – of what contributes to the delivery of the service is outside the direct control of both the user, as consumer of the service, and the CSP, in whatever form that takes.

The biggest problem, of course, is the internet itself – the original ‘cloud’. The number of possible routings between two points is, in all practical terms, infinite. It has to be to ensure that a problem with one node of the network cannot shut down whole sections of it. So a message sent from Leatherhead to Basingstoke will get through, but it may find itself routed via Japan and Australia.

That can appear like poor service and a bottleneck, but when the numbers involved are considered, it actually works rather well. Microsoft, for example, is reckoned to have over 1 million servers running its services, and it would claim to be no better than number two in size to Google. Then there is Amazon to throw into the pot. Those three must account for at least three million servers on the internet alone.

And no matter how reliable they are, there will be unplanned downtime and failures. The big names will try for the best possible reliability ratings, say ‘five-nines’ or 99.999 percent uptime per server. On Microsoft’s 1 million servers alone, that means each server will be out of commission for about 5 minutes 15 seconds a year. Added up across the fleet, and counting only the servers, Microsoft will be unable to deliver a cumulative 10 server-years of service each year. And that is on a good reliability rating.
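The arithmetic behind those five-nines figures can be checked with a short back-of-envelope sketch. The function names are illustrative only, and the sum assumes a 365-day year and a fleet of exactly 1 million servers, as in the example above.

```python
# Back-of-envelope check of the five-nines figures quoted in the article.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_minutes_per_year(availability: float) -> float:
    """Expected minutes of downtime per year at the given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def fleet_downtime_server_years(availability: float, servers: int) -> float:
    """Cumulative yearly downtime across a fleet, in server-years."""
    return (1.0 - availability) * servers


per_server = downtime_minutes_per_year(0.99999)
minutes, seconds = divmod(round(per_server * 60), 60)
fleet = fleet_downtime_server_years(0.99999, 1_000_000)

print(f"Per server: {minutes} min {seconds} s of downtime a year")
print(f"Fleet of 1,000,000 servers: {fleet:.1f} server-years lost a year")
```

Run as written, this gives 5 minutes 15 seconds per server and 10.0 server-years across the fleet, which matches the figures in the text.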

Now add in all the other systems and services present and work out the availability on a similar reliability rating. Then add Google and Amazon to the pot, and then all the other CSPs and the private cloud systems that might be involved.

Then look at a different issue – the number of different service providers that can contribute to a single service – even the delivery of a single web page to an individual consumer. Elements of it may come from half a dozen or more different service providers, anywhere in the world, any of which may be hosted with an unreliable ISP or CSP.
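A rough way to see how multiple providers compound is to multiply their availabilities together. The sketch below assumes every provider must be up for the page to load and that their failures are independent, which is a simplification. The six providers and the 99.9 percent rating are hypothetical figures, not survey data.

```python
from math import prod


def chained_availability(availabilities):
    """Availability of a service that needs every component up at once.

    Assumes component failures are independent of one another.
    """
    return prod(availabilities)


# Hypothetical web page assembled from six providers, each at 99.9 percent.
page = chained_availability([0.999] * 6)
hours_down = (1.0 - page) * 365 * 24

print(f"Page availability: {page:.3%}")
print(f"Expected downtime: {hours_down:.0f} hours a year")
```

Under those assumptions the page as a whole comes out at roughly 99.4 percent, noticeably worse than any single provider contributing to it.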

From that it is possible to see that the perception of service delivery that has built up around the cloud needs to be tempered by a touch more understanding from users than is often the case. It would be fairer to turn it round and marvel at how often it matches, or even betters, expectations.

“There are a few things users can look at, especially with cloud,” Allen said. “For example, is it the performance of the job itself, are the response times as expected? Second, if it’s running on a flexible architecture, is it scaling out evenly across the nodes? Third, is this scaling using the capacity of the nodes to their maximum, or is it only using 50 percent of the performance of each node? That is the last point, is it using resources efficiently? To my mind, when people look at the cloud they look at an elastic environment, so they stop looking at efficiency and just think in terms of spooling up more elasticity.”
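Allen’s three questions could be turned into simple checks along the following lines. This is only a sketch: the function name, the sample readings, and the 250 ms target are all hypothetical, and a real monitoring tool would gather far richer data.

```python
from statistics import mean, pstdev


def scaling_report(node_utilisation, response_ms, target_ms):
    """Summarise response time, evenness of scaling, and node efficiency.

    node_utilisation: fraction of each node's capacity in use (0.0 to 1.0)
    response_ms: sampled response times in milliseconds
    target_ms: the response time users have been told to expect
    """
    avg = mean(node_utilisation)
    return {
        # 1. Are the response times as expected?
        "meets_response_target": max(response_ms) <= target_ms,
        # 2. Is load spread evenly? Coefficient of variation, 0 = perfectly even.
        "imbalance": pstdev(node_utilisation) / avg if avg else 0.0,
        # 3. How much of each node's capacity is actually used, on average?
        "mean_utilisation": avg,
    }


# Hypothetical readings from a four-node deployment.
report = scaling_report(
    node_utilisation=[0.52, 0.48, 0.50, 0.50],
    response_ms=[180, 220, 205],
    target_ms=250,
)
print(report)
```

In this made-up case the job meets its response target and scales evenly, but each node runs at only about half its capacity, which is the inefficiency Allen warns about.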

Such observations raise the obvious parallel between the cloud and Microsoft Windows. That created a strange balance between the (relative) ease of producing and using applications, and the vast amounts of bloatware that followed.

Allen’s response to the suggestion was both oblique and pertinent: “I have just been reading a tweet from a guy suggesting Hadoop is a brilliant system for running inefficient code at large scale.”

Mendel joined Allen in obliquely agreeing with the ‘bloatware’ analogy, though acknowledging that in the cloud it arises from different causes.

He suggested that most traditional IT had been able to put everything into the equivalent of a Change and Configuration Management system. This had always allowed users to derive a true end-to-end measurement of any business transaction, and be confident that everything was under control, because all the component parts of the process were known, understood, and measurable.

“Using that Microsoft example, the big change now is that on one level the world has become more complex and we don’t have all the end-points under control any more,” he said. “So we need to be more clever at doing approximations and measurements of various areas. It is becoming too complex to instrument every application and device. That would create more network traffic than the applications. Now there is a real need to be more clever at sorting out what gets measured and why; which measurements are really important.”

Customers are, however, becoming wiser in the ways they use new technology. When Mendel conducted the same survey last year, the big issue for users was implementing private clouds on their internal IT. This year the top cloud-specific investment objective is using the cloud for test and backup.

“What this shows is that IT departments don’t now deploy their most critical applications in the cloud. They use the cloud, especially the public cloud, where it fits,” he said. “And service providers are helping here because the biggest growth area today is the hybrid cloud, so people have learned.”

Mendel also pointed to the third area to consider with the cloud, the elephant in the room. This is users’ expectation of perfection: they still seem to think that the cloud means instant everything.

“There is still a need for education here on what is and what is not possible. This in turn raises some important issues such as users understanding what is within their control and what is not, and what is possible to migrate to the cloud and what is not.”
