Capacity Planning and Monitoring for VMware Virtual Infrastructure

If you have a growing virtual infrastructure, it has probably outgrown the monitoring tools integrated into VirtualCenter, as most have. VC has very limited capability to help resource managers and NOCs monitor and manage virtual machines, and this post identifies a few of the shortfalls.

This post could be broken into three categories, with sub-categories under each. I will stick to the three main areas where, in my opinion, we need VMware to provide an enterprise solution:

One – standard resource monitoring which will allow a NOC to monitor CPU, memory, drive space, etc. Yes, I know the VC can send SNMP traps and email alerts, but the options fall short for those NOCs looking for more data. I heard a network administrator say that 40% of network traffic coming from the VMware hosts is SNMP traffic.  That makes sense when you have a host with 40 VMs which are being monitored sending their SNMP traffic through the same host’s physical NICs. Get the picture?

Two – performance monitoring for the host, guest VMs and data stores. I made the comment to a VMware project manager the other day that in VirtualCenter, when a VM is having issues, the VM icon turns red to tell someone, "hey, I'm having problems." My suggestion was to also turn a data store red when high latency is impacting the LUN. This is just one suggestion, but from an administrator's perspective it would really help to be able to double-click the red alert on a data store and have the window change to show all the VMs on that LUN that are affected. Then the VM causing the latency could be moved off the LUN with Storage VMotion. Another view could be a window in the VC that shows the performance of the VMs as a group, so heavy hitters are identified and dealt with before they cause a problem for other VMs. Yes, I am aware of the performance monitor tool in the VC and I have used it, but it falls short of what I would like to see.
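To make the "red data store" idea concrete, here is a minimal sketch in Python of what I have in mind. It is not VMware code and it does not call any VirtualCenter API; the `VmSample` record, the `heavy_hitters` function, the 20 ms latency limit and the sample figures are all my own assumptions for illustration. It flags any data store where at least one VM sees latency at or above the limit, then lists the busiest VMs on it by IOPS as Storage VMotion candidates.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class VmSample:
    """Hypothetical per-VM reading; field names are assumptions, not a VMware schema."""
    vm: str
    datastore: str
    iops: float
    latency_ms: float


def heavy_hitters(samples: Iterable[VmSample],
                  latency_limit_ms: float = 20.0,
                  top_n: int = 3) -> Dict[str, List[str]]:
    """Return {datastore: [busiest VMs]} for data stores that should 'turn red'."""
    by_datastore: Dict[str, List[VmSample]] = defaultdict(list)
    for sample in samples:
        by_datastore[sample.datastore].append(sample)

    flagged: Dict[str, List[str]] = {}
    for datastore, vms in by_datastore.items():
        worst_latency = max(v.latency_ms for v in vms)
        if worst_latency >= latency_limit_ms:
            # Rank by IOPS: the busiest VMs are the likeliest cause of the latency.
            ranked = sorted(vms, key=lambda v: v.iops, reverse=True)
            flagged[datastore] = [v.vm for v in ranked[:top_n]]
    return flagged


# Made-up readings purely for demonstration.
readings = [
    VmSample("web01", "LUN_A", iops=150, latency_ms=8),
    VmSample("sql01", "LUN_A", iops=2200, latency_ms=35),
    VmSample("file01", "LUN_A", iops=400, latency_ms=31),
    VmSample("dev01", "LUN_B", iops=90, latency_ms=5),
]
print(heavy_hitters(readings, top_n=2))
```

Ranking by IOPS is only one way to pick the culprit; a real tool would probably also look at throughput and queue depth over time.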

Three – capacity planning is a big one that is completely left out of VirtualCenter. Yes, if you click and peck through enough windows, the information on how many hosts and VMs you have is there, but no forecasting is available and there is no way to set your own thresholds for growth. Yes, I'm aware of Lifecycle Manager and other third-party products, but all are expensive and, from my evaluation of at least three of the leading products, they don't live up to their own marketing.
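The kind of forecasting I am asking for does not have to be fancy. As a rough sketch, assuming you have exported periodic usage figures yourself (VM counts, GB of data store used, whatever you track), a straight-line trend against a threshold you choose already tells you roughly how long you have. The function names and the sample numbers below are mine and purely illustrative; this is not how any VMware product does it.

```python
from typing import Optional, Sequence, Tuple


def fit_linear_trend(samples: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line through (0, s0), (1, s1), ...; returns (slope, intercept)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until_threshold(samples: Sequence[float],
                            threshold: float) -> Optional[float]:
    """Periods from the latest sample until the trend line reaches the threshold.

    Returns 0.0 if the trend is already at or above the threshold, and None
    if usage is flat or shrinking (the threshold is never reached).
    """
    slope, intercept = fit_linear_trend(samples)
    last_x = len(samples) - 1
    if slope * last_x + intercept >= threshold:
        return 0.0
    if slope <= 0:
        return None
    return (threshold - intercept) / slope - last_x


# Made-up monthly data store usage in GB, with a self-chosen 160 GB threshold.
monthly_gb = [100, 110, 120, 130]
print(periods_until_threshold(monthly_gb, threshold=160))
```

A straight line is a deliberate simplification; real growth is often lumpy, so treat the answer as an early warning, not a date.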

For the record!

For the record, VMware is aware of these issues and they are working on them. The PM and sales staff who were in the meeting I attended got an earful over these shortfalls. One engineer went as far as to say, “VMware provides an infrastructure with no way to correctly monitor it unless you want to spend more money to buy more VMware Products”, end quote. The PM from VMware was very open to all that was said and explained what VMware is doing to fill the gaps in these three areas: resource monitoring, performance monitoring, and capacity planning. New features in ESX 4.0 and VC 3.0 will help resolve some of these pain points. I won’t go into everything VMware is doing, but they are doing a lot to fix these issues in future releases. One new product scheduled to be released soon is CapacityIQ. It’s a capacity planning and growth forecasting tool that will integrate into VirtualCenter, and there will be a new button for it on the main VC menu. I’ve included a few screenshots of the product from my Webex.


Let me conclude with one final point: dollar savings. ROI, return on investment, is what is driving many data centers towards virtualization. Every demo and meeting I have attended has had a smooth talker using slides to show the dollar savings if you buy their product. For the record, when it costs more to maintain a data center of virtual servers than one of physical servers, the party is over, and at the rate costs are currently rising, that will be soon. Based on my experience and on listening to administrators and engineers, the cost of the tools needed to properly monitor and maintain virtual data centers will soon drive the ROI down to the point where virtual data centers cost more to maintain than the physical data centers they are supposed to replace.

