VMware Health Check (Improve Uptime & Efficiency)

Managers! Is it time for a VMware health check to make sure your virtualization investment is not wasted?

Improve Uptime by Keeping vSphere in Shape

I know infrastructure budgets are tight, and I also know how difficult it can be to balance CAPEX and OPEX from month to month.

Over the years, I’ve had the pleasure of creating SLA uptime reports as well as calculating monthly staff and environment utilization down to the kilowatt.

That experience is why I recommend a regular VMware health check: it keeps the cost of owning VMware low by exposing waste and giving you opportunities to act proactively.

Barring some of the finer details, here’s what I look for:

1. Start by going vZombie Hunting!

The biggest culprit of a poorly utilized vSphere environment is VM sprawl.

What is sprawl?

Sprawl is when virtual machines get deployed for a project and are then left unused, or when old services get upgraded onto new virtual servers…

…but the old VMs never get decommissioned. It also happens when a service gets shut down but its servers never get turned off.

VMware Health Check for zombies

Stop VM Sprawl Before It’s Too Late!

When sprawl happens, you end up with valuable storage, server, and network resources burning cycles for nothing. These VMs are vZombies (BTW – I started using this term years ago) and perfect candidates for decommissioning.

Hunting vZombie servers isn’t easy unless you have a tool such as VMware OpsManager or vKernel.

The other way is to write a custom VMware health check PowerShell script that checks for, and logs, VMs showing no CPU, memory, or network activity.

Normally a flatline is a good indicator of a zombie.
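The flatline test can be sketched in a few lines. The post suggests PowerShell for collecting the numbers; the sketch below only shows the decision logic, in Python, on samples you have already exported from vCenter. The metric names (`cpu_mhz`, `mem_active_pct`, `net_kbps`) and the thresholds are assumptions for illustration, not values from any VMware tool.

```python
# Hypothetical "flatline" floors; tune these for your own environment.
CPU_MHZ_FLOOR = 50.0
MEM_ACTIVE_PCT_FLOOR = 2.0
NET_KBPS_FLOOR = 1.0


def is_flatlined(samples, floor):
    """Return True when there are samples and none of them rises above the floor."""
    return bool(samples) and max(samples) <= floor


def find_zombie_candidates(vm_stats):
    """vm_stats maps a VM name to a dict of metric name -> list of samples.

    A VM is a candidate only when CPU, memory and network are all flat.
    """
    candidates = []
    for name, stats in vm_stats.items():
        if (is_flatlined(stats["cpu_mhz"], CPU_MHZ_FLOOR)
                and is_flatlined(stats["mem_active_pct"], MEM_ACTIVE_PCT_FLOOR)
                and is_flatlined(stats["net_kbps"], NET_KBPS_FLOOR)):
            candidates.append(name)
    return sorted(candidates)


# Made-up sample data: one busy VM and one that looks abandoned.
sample_stats = {
    "web01": {"cpu_mhz": [800, 950, 700],
              "mem_active_pct": [35, 40, 38],
              "net_kbps": [120, 300, 90]},
    "old-app02": {"cpu_mhz": [12, 9, 15],
                  "mem_active_pct": [1, 1, 1],
                  "net_kbps": [0, 0.2, 0]},
}
print(find_zombie_candidates(sample_stats))
```

A candidate list is only a starting point. Some servers are legitimately idle for weeks (month-end batch jobs, DR standbys), which is why checking with the service owner comes next.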

Once you track down the vZombies and check with the service owner to get the go-ahead, turn them off, back them up, and delete these VMs from your vCenter inventory. (Note: follow the standard decommissioning process)

Also, don’t forget that some server IP addresses have firewall rules and VIPs associated with them, so clean those up, too!

2. Retire Old Server Hardware that is OOW and EOL

No VMware health check would be complete without retiring Out of Warranty (OOW) and End of Life (EOL) server hardware.

Read my lips: this hardware is wasting your ESXi licenses, because you cannot fit enough memory or CPU cores into these systems to use your “per socket” ESXi licenses efficiently.

Old servers are inefficient for VMware, and server hardware that is OOW and EOL should be retired ASAP!

For example: one fully loaded HP DL380 G8 or Dell R420 can handle more memory and CPU cores than 4 to 6 old servers and will still use only 2 ESXi licenses. Consolidating on new servers also reduces rack U, uses less power and cooling, lowers the port count on switches, cuts warranty renewals, reduces management overhead by leaving fewer physical servers, and lowers downtime from failures of tired junk.
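To put numbers on the licensing side of that example, here is a rough sketch of the arithmetic. It assumes the old servers are dual-socket like the new one and that licenses are counted one per socket; adjust `sockets_per_host` if your boneyard looks different.

```python
def socket_licenses(host_count, sockets_per_host=2):
    """Per-socket licenses consumed by a group of identical hosts."""
    return host_count * sockets_per_host


def licenses_freed(old_hosts, new_hosts=1, sockets_per_host=2):
    """Licenses released by consolidating old hosts onto new ones.

    Assumes old and new hosts have the same socket count.
    """
    return (socket_licenses(old_hosts, sockets_per_host)
            - socket_licenses(new_hosts, sockets_per_host))


for old_hosts in (4, 6):
    print(old_hosts, "old hosts use", socket_licenses(old_hosts),
          "licenses; consolidation frees", licenses_freed(old_hosts))
```

Under those assumptions, 4 old hosts tie up 8 licenses and 6 old hosts tie up 12, so moving to one new dual-socket server frees 6 to 10 licenses.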

Another best practice is to physically get rid of OOW and EOL servers so your admins can’t put them back into service.

I’ve seen too many junk servers pulled from the bone yard and put back into service because they were available.

I repeat…Old servers are inefficient for VMware!

If it’s not on the current VMware HCL (Hardware Compatibility List), it should be disposed of…

3. Standardize Configuration for Good VMware Health

A little bit of HP EVA, and a little bit of NetApp, and a little bit of local disk might make for a good song lyric, but they add up to a vSphere that is hard to manage, optimize and keep efficient.

And the same goes for mix-and-match servers of all makes and models. Mixing memory configurations and CPU types in the same cluster is a no-no!

Some servers with 32GB, and others with 64GB, and even others with 192GB all in the same vSphere ESX cluster… this is a recipe for data loss and poor uptime.

A best practice to follow is to take inventory of your equipment and enforce standard hardware configurations.

This is key to optimizing your VMware investment because one-off environments are trouble and need to be on the VMware health check report so they can be dealt with!
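Once you have an inventory, spotting the one-off clusters is easy to automate. The sketch below assumes the inventory has been exported as a list of records with `cluster`, `name`, `cpu_model` and `memory_gb` fields (hypothetical field names and made-up hosts) and reports every cluster that contains more than one hardware configuration.

```python
from collections import defaultdict


def find_mixed_clusters(hosts):
    """Return {cluster: sorted list of (cpu_model, memory_gb)} for every
    cluster that contains more than one distinct hardware configuration."""
    configs = defaultdict(set)
    for host in hosts:
        configs[host["cluster"]].add((host["cpu_model"], host["memory_gb"]))
    return {cluster: sorted(found)
            for cluster, found in configs.items()
            if len(found) > 1}


# Made-up inventory: "prod" mixes three configurations, "dev" is uniform.
inventory = [
    {"cluster": "prod", "name": "esx01", "cpu_model": "X5650", "memory_gb": 32},
    {"cluster": "prod", "name": "esx02", "cpu_model": "E5-2670", "memory_gb": 64},
    {"cluster": "prod", "name": "esx03", "cpu_model": "E5-2670", "memory_gb": 192},
    {"cluster": "dev", "name": "esx10", "cpu_model": "E5-2670", "memory_gb": 64},
    {"cluster": "dev", "name": "esx11", "cpu_model": "E5-2670", "memory_gb": 64},
]
for cluster, found in find_mixed_clusters(inventory).items():
    print(cluster, "has", len(found), "different configurations:", found)
```

Every cluster this prints belongs on the health check report.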

4. Report Bloated Virtual Servers with Too Much Memory, CPU and Disk

Finally, a thorough report will include resource waste such as VMs that were created with too much memory, CPU and disk space (aka over-provisioned).

Overusing valuable vSphere resources is common in some vSphere environments because engineers and developers are used to ordering servers based on physical criteria. Often they have never been shown proof that their servers are using only 20% of the resources provisioned for them (another reason for a good tool).

The unfortunate thing here – depending on the scale – is you may not be able to clean up existing systems because too much work may be involved. But you can start to trim back resources on newly deployed VMs.
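If you don’t have a tool, the over-provisioning check above can be approximated from exported peak-usage numbers. This is a minimal sketch: the field names are hypothetical, the sample VMs are made up, and the 20% cutoff simply mirrors the figure mentioned above.

```python
def utilization(used, provisioned):
    """Fraction of a provisioned resource that is actually used."""
    return used / provisioned if provisioned else 0.0


def find_oversized(vms, threshold=0.20):
    """Return (vm_name, [resources]) for each VM whose peak use of a
    resource is at or below `threshold` of what was provisioned."""
    report = []
    for vm in vms:
        ratios = {
            "cpu": utilization(vm["peak_cpu_mhz"], vm["cpu_mhz"]),
            "memory": utilization(vm["peak_memory_gb"], vm["memory_gb"]),
            "disk": utilization(vm["used_disk_gb"], vm["disk_gb"]),
        }
        wasted = [name for name, ratio in ratios.items() if ratio <= threshold]
        if wasted:
            report.append((vm["name"], wasted))
    return report


# Made-up sample: build01 peaks at 15% of its provisioned CPU.
sample_vms = [
    {"name": "build01", "cpu_mhz": 8000, "peak_cpu_mhz": 1200,
     "memory_gb": 16, "peak_memory_gb": 12,
     "disk_gb": 200, "used_disk_gb": 150},
    {"name": "db01", "cpu_mhz": 8000, "peak_cpu_mhz": 6000,
     "memory_gb": 64, "peak_memory_gb": 58,
     "disk_gb": 500, "used_disk_gb": 400},
]
for name, wasted in find_oversized(sample_vms):
    print(name, "is over-provisioned on:", ", ".join(wasted))
```

Using peak rather than average usage keeps the report conservative, so a VM is flagged only if it never gets near what it was given.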

Small, Medium, Large Virtual Servers

A best practice is to come up with standard configurations for VM sizes such as small, medium, and large, each with a fixed memory, CPU and disk size.

This will also make capacity management easier since now you have a set block that you can calculate capacity from.

This is not uncommon; most cloud services use standard sizes for their VMs.
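Here is a rough sketch of how fixed sizes turn capacity planning into simple division. The size catalog and the free-capacity figures are made up for illustration; substitute your own standards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSize:
    vcpus: int
    memory_gb: int
    disk_gb: int


# Hypothetical size catalog; define your own small/medium/large.
SIZES = {
    "small": VmSize(vcpus=1, memory_gb=2, disk_gb=40),
    "medium": VmSize(vcpus=2, memory_gb=4, disk_gb=80),
    "large": VmSize(vcpus=4, memory_gb=8, disk_gb=160),
}


def vms_that_fit(size, free_vcpus, free_memory_gb, free_disk_gb):
    """How many VMs of one size fit in the remaining capacity.

    The scarcest resource sets the limit.
    """
    return min(free_vcpus // size.vcpus,
               free_memory_gb // size.memory_gb,
               free_disk_gb // size.disk_gb)


# Made-up free capacity for one cluster.
for label, size in SIZES.items():
    count = vms_that_fit(size, free_vcpus=64, free_memory_gb=256,
                         free_disk_gb=4000)
    print(label, "VMs that still fit:", count)
```

With these made-up numbers the cluster has room for 64 small, 32 medium, or 16 large VMs, and in every case vCPUs run out first.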

5. Take Action ASAP…

A good VMware health check documents and lists all offenders of these best practices. Once the report is completed you will want an action plan that road maps:

  • How zombie VMs are to be decommissioned to get rid of sprawl and reclaim resources.
  • How old hardware will be replaced and disposed of to get rid of inefficient server hardware and improve TCO and uptime.
  • How storage, servers, and network systems will be standardized and consolidated to reduce overall CAPEX and OPEX and to make them easier to manage.
  • How VM configurations will be standardized and made more efficient.

In Conclusion:

Overall, this VMware health check focuses on cleanup tasks, standardization, and consolidation that will make your vSphere environment more efficient, increase the return on your investment, and reduce downtime.

There are other tasks you can add to the check such as:

  • VMware Tools and firmware updating
  • ESXi upgrading

But I listed key areas that should be included in your standard VMware health check report.

Do you have a recommendation to add?
