The Private Cloud: Who Will Stop the Rain?
Post by Chris Curran on January 27, 2011
Guest post by Nick Macey
As some companies move towards a virtualized, private cloud infrastructure, they get caught in a downpour of problems. The additional layer of virtualization requires significant expertise to manage and support, can substantially decrease overall stability and makes diagnosing and resolving incidents more complex. Operations teams are faced with an incredible challenge, as they often are not responsible for building the cloud, but are ultimately responsible for maintaining and supporting business customers on the private cloud.
In a typical private cloud buildout, a systems integrator spends significant time gathering requirements, defining the architecture and planning the project. Once execution begins, a variety of technical decisions are made which can impact stability and support options at a later date. Most operations organizations are new to virtualization and may not be informed of these decisions, or may not understand their impact. Virtualization brings a new level of complexity to infrastructure operations in which even seemingly minor decisions, such as selecting a time zone, can cause significant issues once the environment goes live.
Regardless of the decisions made during execution, it is critical for an operations transition team to be deployed during the execution phase of a virtualized private cloud buildout. This transition team should be composed primarily of operations employees dedicated to covering the five key areas critical for transitioning a virtualized private cloud. By focusing time and attention on the transition, operations teams are in a position to be more successful once the virtual environment becomes their support and maintenance responsibility.
A colleague recently described to me a situation that shows the importance of properly transitioning a virtualized environment to an operations team. As his client’s virtual private cloud began to gain traction within the organization, system instability and downtime grew to unacceptable levels. This company had a typical transition from a systems integration team: documentation was in place, alerting had been configured, and their employees had been briefed on the environment. However, their failure to ensure the proper skillsets were in place within their organization, as well as a failure to adapt organizational processes, led to a series of cascading failures. Within six weeks of taking over the environment, the operations team had lost customer data by removing a misidentified storage device, had failed to properly manage the environment, resulting in multi-day outages, and had over-provisioned the environment to the point where server infrastructure refused to respond. This resulted in a large unexpected expense to repair and grow their infrastructure, as well as to train their people. During this three-month recovery period, their business customers continued to suffer through an unstable and problem-ridden environment.
These types of problems can be addressed if a transition is properly planned and executed. Transition teams should focus on five key areas to reduce the likelihood of problems with a virtualized private cloud environment down the road. The five focus areas are:
- Ensuring skillset readiness
- Understanding the environment
- Investing in instrumentation to monitor the private cloud
- Adapting operational processes
- Exercising failure scenarios
In subsequent posts, I will examine each of these areas and the issues surrounding transition of a new virtual infrastructure to a technology operations organization.
What other areas have you seen operations organizations struggle with when taking ownership of a private cloud? Are there lessons learned in these areas that can help organizations avoid a “raining cloud”?