The increase in businesses moving their data into the cloud has created several new challenges for data center managers. First, many departments and employees are increasingly using the public cloud to host unpredictable applications with unknown resource requirements. Second, ongoing operational management considerations must be fleshed out for the period after the move. Both challenges are further complicated by information privacy and security requirements.
It is intuitively understood that multi-tenancy, self-service portals, and virtualization (shared resources) can affect information privacy and security. Static multi-tenant and self-service environments could be managed with relatively simple policies and procedures, and the security management approaches for them were well understood. It is the additional complexity of today's highly dynamic configurations for virtualized “everything” and cloud environments, in which IaaS, PaaS, and even AaaS are made available, used, reconfigured, and returned, that has upset the complacent security management applecart.
These additional dynamic complexities also significantly impair IT's ability to analyze, report on, and plan the resources needed to deliver expected levels of service cost-effectively. Combining performance data not only from end-user experience (response times, throughput of business transactions, and so on) but also from the dynamically changing, shared IT resources underpinning the services themselves is hugely challenging.
With that said, data center managers are now trying to navigate these challenges. According to a recent IDC whitepaper, “Cloud computing represents a new set of challenges for performance and capacity management professionals. Cloud infrastructures are typically based on shared, pooled, highly virtualized hardware and operating environments. Clouds require extensive management software for enabling such functions as resource allocation, self-service catalogs, automated provisioning, service-level management, usage-based metering and billing, and support for dynamic expansion and contraction of resources or "elasticity." As such, clouds represent technology evolution for the delivery of business services and workloads, but cloud infrastructures still require the same performance and capacity management functions as more conventional infrastructures.”
Planning for the Move
Like anything in IT, planning and transparency are necessary when moving data into the cloud. Provisioning in the cloud too quickly can lead to many pitfalls, such as losing control of your IT processes, overestimating your cloud capabilities, and dramatically increasing the cost of your cloud initiative, all with no guarantee of acceptable service delivery. Planning will also speed up the release of new IT services into the cloud. As cloud environments become more dynamic and more abstracted, the key to success is going to be transparency.
Moving the data itself requires a security "audit" up front, that is, organizational approval as to what data is and is not allowed to be placed in a public cloud. This is critical if the cloud strategy is public or hybrid; private-only clouds have fewer issues because the data has not been externalized from the organization, though there may still be concerns within it (compartmentalization of data privacy, for example).
Independent of security aspects, and whether you are in a private cloud or using a hosting provider, several key requirements exist:
· You will have to monitor and report on your agreed service levels back to your constituencies. Therefore, you must have some visibility into the IT service or application, or at least have information about response time from a user’s perspective. If you do not measure these, you are not demonstrating business value, because you are not aligning with what your constituencies care about.
· Because service performance requires timely access to acceptably performing IT resources, you need to understand your applications and their computing resources in the cloud. Collecting performance data will enable you to understand resource consumption and provide an initial sizing estimate for your cloud environment. Toolsets to monitor end-user experience will be a must, not only to monitor SLA adherence but also to provide a baseline for performance and capacity optimization. You can automate the measurement of end-user response time with a spectrum of approaches, ranging from less costly, easier representative (sample) transactions at low granularity (for example, every 5 minutes) to high-granularity, high-fidelity transaction-decomposition tools that measure each and every individual real transaction (see the sketch after this list). The price for such accuracy is a typically complex, invasive, and potentially expensive solution.
· Application workloads need their performance continuously analyzed against actual resource requirements to ensure optimal use of the IT resources provided by IaaS and PaaS cloud platforms. On a private cloud you can monitor these resources and create alerts that speed restoration by making it easy to drill down and pinpoint the causes of incidents. Currently, access to these kinds of metrics in public cloud environments is typically not possible, because such cloud providers do not expose that level of detail about the underlying infrastructure. After all, there is a conflict of interest here: providers make their margins by effectively having you pay for more than you are using. Sometimes this will matter to you; sometimes it will not.
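As a minimal illustration of the low-cost end of that measurement spectrum, the sketch below (in Python) times a representative synthetic transaction every few minutes and flags samples that exceed a response-time objective. The endpoint URL, threshold, and interval are hypothetical placeholders, and the probe stands in for the class of sampling approach described above, not for any particular vendor's toolset.

```python
import time
import urllib.request

# Hypothetical values; substitute your own service endpoint and SLA target.
SERVICE_URL = "https://example.com/app/health"
SLA_THRESHOLD_SECONDS = 2.0      # assumed response-time objective
SAMPLE_INTERVAL_SECONDS = 300    # one representative sample every 5 minutes

def measure_response_time(url: str) -> float:
    """Time a single representative (synthetic) transaction."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=30) as response:
        response.read()          # drain the body so the full transfer is timed
    return time.monotonic() - start

def run_probe() -> None:
    """Sample the service at a fixed interval and flag SLA breaches."""
    while True:
        try:
            elapsed = measure_response_time(SERVICE_URL)
            status = "OK" if elapsed <= SLA_THRESHOLD_SECONDS else "SLA BREACH"
            print(f"{time.strftime('%Y-%m-%dT%H:%M:%S')} {status} {elapsed:.3f}s")
        except Exception as exc:  # a failed sample is itself an availability signal
            print(f"{time.strftime('%Y-%m-%dT%H:%M:%S')} ERROR {exc}")
        time.sleep(SAMPLE_INTERVAL_SECONDS)

if __name__ == "__main__":
    run_probe()
```

Real monitoring toolsets add persistence, baselining, and alert routing on top of this; the point is simply that even a basic sampling probe yields the user-perspective response-time numbers the bullets above call for.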
Measuring Business Value & Performance in the Cloud
Measuring business value means you need SLAs (service-level agreements) with your cloud provider that set expectations and responsibilities for both parties and negotiate any penalties for failing to meet those expectations. But you cannot necessarily rely on the hosting partner to provide these metrics for you. The other half of business value is the cost of delivering the services, so you will also need financial information over time, including both CapEx and OpEx.
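To make the two halves concrete, here is a rough sketch, with made-up figures, of how amortized CapEx and monthly OpEx can be combined into a monthly cost of delivery and set alongside measured SLA attainment. Every number, and the straight-line amortization assumption, is illustrative rather than a prescribed model.

```python
# Illustrative only: made-up figures for pairing delivery cost with SLA attainment.
capex_total = 120_000.0          # hypothetical up-front spend (hardware, migration)
capex_amortization_months = 36   # assumed straight-line amortization period
opex_per_month = 4_500.0         # hypothetical hosting, licensing, and staff costs

measured_availability = 0.9982   # fraction of the month the SLA was met
sla_target = 0.9990              # contractual availability objective

monthly_cost = capex_total / capex_amortization_months + opex_per_month
attainment_gap = sla_target - measured_availability

print(f"Monthly cost to deliver the service: ${monthly_cost:,.2f}")
print(f"SLA attainment: {measured_availability:.4%} (target {sla_target:.4%})")
if attainment_gap > 0:
    print(f"Shortfall of {attainment_gap:.4%}: check whether penalty clauses apply.")
```

Tracked month over month, this kind of pairing is what lets you talk about business value in the same breath as service levels.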
An Ever-changing Environment
Change is a constant in IT, and the cloud in its varying forms is no exception. In fact, because of the complex nature of the cloud, with its mix of virtual and physical resources, applications and services, public and private deployments, and the agility of elastic scaling, change is rapid and continuous; it is built in.
In order to compete continuously, your cloud initiatives should continually improve and reinvent your processes and service offerings. Done well and transparently, this lets you begin to drive the resource demand curve, making it easier to provision the appropriate resources at the right time and cost while ensuring that service levels meeting business goals are achieved. Service improvement should start immediately after your cloud implementation is done, not in nine months. Measuring SLAs, conducting gap analyses, and identifying risks to the efficiency of your services and processes must be ongoing activities. This continuous process is how IT maturity is gained.
Finally, it is about Organization-Wide Transparency
Customers using the cloud in any way, shape, or form, whether private, hybrid, or fully public, should demand accountability from their provider with respect to service-level agreements. Where this is impossible, they should strongly consider moving only workloads that are NOT critical with respect to SLAs (including response time/latency and throughput). Customers should also demand that any cloud provider of IaaS or PaaS deliver ongoing reporting of performance, capacity, and cost against commitments.
The educated cloud consumer should align with the educated cloud provider. Where both sides of the contract communicate with the same understanding of resources, performance, cost, and service levels, the opportunity for expensive outages and slowdowns, and for the misunderstandings that only lengthen mean time to resolution, is minimized.