This vendor-written tech primer has been edited by Network World to eliminate product promotion, but readers should note it will likely favor the submitter’s approach.

Virtualization is a mature technology, but if you don’t have a virtualization wizard on staff, managing the environment can be a challenge. Benefits such as flexibility, scalability and cost savings can quickly give way to security risks, resource waste and infrastructure performance degradation, so it is important to understand common virtual environment problems and how to solve them.

The issues tend to fall into three main areas: virtual machine (VM) sprawl, capacity planning and change management. Here’s a deeper look at the problems and what you can do to address them:

* VM Sprawl. Anyone who has been involved in virtualization administration is likely no stranger to VM sprawl—the unchecked growth of virtual machines in a virtual environment. While sprawl can result from the unplanned creation of VMs, it’s also frequently attributed to rogue virtual machines: VMs that should have followed a specific deployment lifecycle but were ultimately lost in daily IT operations rather than properly retired.

This problem largely stems from the ease of automating VM creation and provisioning, which has made it much more difficult for administrators to keep track of every VM in the course of day-to-day management. This is especially true in large enterprises, where administrators are responsible for monitoring extensive virtual environments with hundreds to thousands of VMs spread across clusters, data centers and even geographic locations. To make matters worse, sprawl typically builds up gradually, making it difficult to recognize as it occurs.

Unresolved, VM sprawl can cost an organization time, money and resources, not to mention lead to security breaches and compliance concerns with old, unpatched VMs sitting ripe for attack. However, it is possible to get a grip on sprawl. Here are some tips to address sprawl:

  • Create a formal process for requesting and approving VMs.
  • Document the lifecycle plan for each VM with details that include the who, what, why, when and where.
  • Monitor VM resource utilization and establish baseline utilization trend lines to identify abandoned or inactive VMs. This can also help identify potential security anomalies.
  • Control access based on roles.
  • Simplify these tasks through the use of a virtualization management tool.
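The utilization-baseline tip above can be sketched in a few lines. The snippet below is a minimal illustration, not a product feature: the VM names, sample values and the 1% threshold are all hypothetical, and it assumes per-VM utilization history has already been exported from whatever management tool is in use.

```python
from statistics import mean

# Hypothetical sample data: average daily CPU utilization (%) per VM,
# as it might be exported from a virtualization management tool.
cpu_history = {
    "web-01":   [42, 55, 48, 61, 50],
    "db-01":    [70, 68, 75, 72, 71],
    "test-old": [0.2, 0.1, 0.0, 0.1, 0.0],
}

def flag_inactive(history, threshold=1.0):
    """Return VMs whose mean utilization falls below the threshold --
    a simple proxy for abandoned or forgotten machines."""
    return sorted(vm for vm, samples in history.items()
                  if mean(samples) < threshold)

print(flag_inactive(cpu_history))  # -> ['test-old']
```

A real tool would look at memory, disk and network activity too, but even a crude CPU baseline like this surfaces candidates for retirement.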

Even with these measures in place, however, given the current threat landscape it’s better to be safe than sorry. Thus, administrators can fortify their virtual environments specifically against the security risks of VM sprawl by following these additional best practices:

  • Segregate access to your virtual environment resources via role-based access control.
  • Log and monitor VM-to-VM traffic. Regularly check traffic for anomalies in the logs.
  • Lock down and monitor the VM file folders. This goes a long way toward preventing a rootkit attack, in which an attacker gains control of the root of a specific system and makes changes that allow, among other things, the interception of traffic within your environment.
  • Establish and maintain a log of activities and events on the host servers, which can be accomplished in just a few steps for VMware or using Event Viewer for Microsoft.
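The traffic-anomaly check suggested above can be as simple as comparing each new sample against a historical baseline. The sketch below assumes VM-to-VM traffic volumes have already been parsed out of the logs; the numbers and the three-sigma cutoff are illustrative assumptions, not a recommendation from the original article.

```python
from statistics import mean, stdev

# Hypothetical per-interval VM-to-VM traffic samples in MB,
# e.g. parsed from virtual switch or flow logs.
baseline = [12, 15, 11, 14, 13, 12, 16, 14]

def is_anomalous(sample, history, sigmas=3.0):
    """Flag a traffic sample more than `sigmas` standard deviations
    above the historical mean -- a crude statistical anomaly check."""
    mu, sd = mean(history), stdev(history)
    return sample > mu + sigmas * sd

print(is_anomalous(90, baseline))  # large spike -> True
print(is_anomalous(15, baseline))  # within normal range -> False
```

Production monitoring tools use far more sophisticated models, but the principle is the same: establish what normal looks like, then alert on deviations.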


* Capacity planning and right-sizing. Capacity planning in virtual environments is the proper allocation of resources to a VM to meet application quality-of-service (QoS) requirements while not overcommitting resources. This is critical because if VMs are not properly sized the entire value proposition of virtualization—making the most efficient use of resources—can be undermined, reducing the return on investment (ROI).

In many cases, problems with capacity planning stem from people who perceive virtual resources as limitless. Thus, requests for “max capacity VMs”—think 2 vCPUs, 16GB of RAM and 1TB of disk on the flash array for a single VM that’s only being used to surf the Web and check email—aren’t uncommon. As a result, all too often VMs are provisioned with more resources—storage, memory, vCPUs—than they actually need to support their workloads.

While that results in wasted resources and reduced ROI, the penalties can be even more severe. For example, outgrowing physical storage space can lead to data integrity issues as well as VM performance degradation. Incorrectly sized virtual resources can also lead business units to turn to other providers—public cloud services or IT outsourcers—to meet their infrastructure needs.

So taking the time to get capacity planning right is critical, but how do you do it properly?

The reality is that proper capacity planning and right sizing of VMs takes time, experience, and skills like performance analysis and performance modeling. The learning curve can be reduced with a tool that has the ability to automatically analyze an environment’s historical data to report on how it has grown over time and then predict how it will look in the future based on algorithms that factor in today’s utilization pattern, historical growth spurts, etc. Such a tool should:

  • Identify over- or under-allocated VMs.
  • Identify capacity issues before they occur by monitoring resource usage in real time, allowing administrators to detect and remove potential bottlenecks early.
  • Alert on pending or predicted resource shortages in cluster shared resources, data stores and VMs so proactive steps can be taken to prevent them.
  • Project when resources will be maxed out based on historical trends.
  • Enable what-if projections to see what would happen when adding new resource capacity to an environment.
  • Show capacity usage from an application/workload perspective, which enables capacity decisions to be made consistent with business priorities.
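The trend-projection capability described above can be illustrated with a basic least-squares fit. This is a deliberately simplified sketch: the weekly usage samples and the 1TB capacity figure are made up, and real tools factor in growth spurts and utilization patterns rather than assuming linear growth.

```python
# Hypothetical weekly datastore usage samples: (week number, GB used).
samples = [(0, 400), (1, 430), (2, 455), (3, 490), (4, 520)]
capacity_gb = 1000

def weeks_until_full(points, capacity):
    """Fit a least-squares line to usage history and project when
    usage will reach the given capacity. Returns None if usage is
    flat or shrinking."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    if slope <= 0:
        return None
    return (capacity - intercept) / slope

print(round(weeks_until_full(samples, capacity_gb), 1))  # -> 20.0
```

The same projection run against “what-if” inputs—say, capacity after adding a new datastore—gives the what-if modeling mentioned in the list above.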

* Change Management. As virtualization has matured, advanced features such as dynamic resource scheduling (DRS) and hypervisor memory management, along with other hypervisor enhancements, have greatly improved application QoS—but in turn they have created an incredibly complex change management scenario for administrators.

Specifically, the challenge with these features and several other hypervisor-centric technologies is that VMs can constantly migrate from host to host within the overall cluster depending on dynamic resource scheduling or load-balancing schemes. That’s fine when an application is performing well, but as soon as an end user complains about overall performance or quality of service, the administrator is left scrambling to identify where the application is in the virtual data center and what happened to it that caused the performance degradation.

To better manage change in virtual environments, consider the following suggestions:

  • Schedule periodic discovery of virtual assets. Know where your VMs are and how they’re connected.
  • Schedule periodic health checks of your virtual environment. Establish a time-stamped baseline of your data center’s health and risks.

These measures will allow you to stay ahead of the change curve in your virtual environment.

In closing, VM sprawl, poor capacity planning and ineffective change management in virtual environments can create a vicious cycle of data center contention. By following the best practices outlined here that are aimed at solving these challenges, a secure, more efficient virtual environment is entirely possible.