Author |
Message |
doodleman99
Joined: 02 Sep 2008 Posts: 38
|
Posted: Tue Sep 24, 2013 4:54 am Post subject: VMWare - ESXi monitoring |
|
|
I'm a great fan of HostMonitor and have implemented it at several client sites, but with VMWare dominating every infrastructure I come across, I'm struggling to pitch it as a solution to my boss due to the lack of hardware monitoring.
I've asked for support in the forum before and have read a few other threads, but it seems you resign yourselves to defeat every time, stating that you are not VMWare experts and that there's nothing that can be done.
I really don't believe that you couldn't spend a little time looking into this and produce a couple of health check tests. Any hardware failure must be detectable, and although it's easy for me to say, it can't be that hard, can it?
All the best,
JV |
|
|
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12806 Location: USA
|
Posted: Tue Sep 24, 2013 5:41 am Post subject: |
|
|
We were able to check VMWare using the SOAP test method.
Which parameters exactly do you want to monitor?
Regards
Alex |
|
|
|
doodleman99
Joined: 02 Sep 2008 Posts: 38
|
Posted: Tue Sep 24, 2013 5:49 am Post subject: |
|
|
To check for hardware failures/alarms would be nice.
Secondary to that, standard CPU & Memory resource checks to monitor stress levels. |
|
|
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12806 Location: USA
|
Posted: Tue Sep 24, 2013 1:03 pm Post subject: |
|
|
There are hundreds of classes, but not all of them are actually available; it may depend on your server hardware.
http://www.vmware.com/support/developer/cim-sdk/smash/u2/ga/apirefdoc/
At least CIM_Processor and CIM_Memory provide HealthState on our system.
Quote: | Indicates the current health of the element. This attribute expresses the health of this element but not necessarily that of its subcomponents. The possible values are 0 to 30, where 5 means the element is entirely healthy and 30 means the element is completely non-functional. The following continuum is defined: "Non-recoverable Error" (30) - The element has completely failed, and recovery is not possible. All functionality provided by this element has been lost. "Critical Failure" (25) - The element is non-functional and recovery might not be possible. "Major Failure" (20) - The element is failing. It is possible that some or all of the functionality of this component is degraded or not working. "Minor Failure" (15) - All functionality is available but some might be degraded. "Degraded/Warning" (10) - The element is in working order and all functionality is provided. However, the element is not working to the best of its abilities. For example, the element might not be operating at optimal performance or it might be reporting recoverable errors. "OK" (5) - The element is fully functional and is operating within normal operational parameters and without error. "Unknown" (0) - The implementation cannot report on HealthState at this time. DMTF has reserved the unused portion of the continuum for additional HealthStates in the future. |
>Secondary to that, standard CPU & Memory resource checks to monitor stress levels.
HostMonitor offers CPU Usage and Memory test methods. You may check host or virtual systems.
Regards
Alex |
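As a rough illustration of the HealthState values quoted above, the sketch below maps the numeric value returned by a check to its label. The function names and the "alert on anything other than 5" rule are assumptions made for this example; they are not part of HostMonitor or of the VMWare API.

```python
# HealthState values as listed in the quoted CIM documentation.
HEALTH_STATES = {
    0: "Unknown",
    5: "OK",
    10: "Degraded/Warning",
    15: "Minor Failure",
    20: "Major Failure",
    25: "Critical Failure",
    30: "Non-recoverable Error",
}


def describe_health_state(value: int) -> str:
    """Return a readable label for a CIM HealthState value."""
    # Values between the defined steps are reserved by DMTF, so flag them as such.
    return HEALTH_STATES.get(value, f"Reserved ({value})")


def is_alert(value: int) -> bool:
    """Hypothetical policy: treat anything other than OK (5) as an alert."""
    return value != 5
```

With a mapping like this, a reply of 20 can be reported as "Major Failure" instead of just "not 5", although it still does not say which component failed.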
|
|
|
doodleman99
Joined: 02 Sep 2008 Posts: 38
|
Posted: Thu Sep 26, 2013 5:05 am Post subject: |
|
|
I have managed to get a few of those working, although, as you said, not all of them.
Quote: | HostMonitor offers CPU Usage and Memory test methods. You may check host or virtual systems |
Yes, checking the virtual machines is fine, but the ESXi boxes aren't running Windows, so your CPU and Memory tests won't work. |
|
|
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12806 Location: USA
|
Posted: Thu Sep 26, 2013 12:11 pm Post subject: |
|
|
Well, VMWare says the latest ESXi does not support SNMP, but unofficially you can still enable the SNMP agent (edit the /etc/vmware/snmp.xml file) and use the Memory test to check Physical Memory.
But you are right, we should increase the priority of VMWare-related tasks because of ESXi...
Regards
Alex |
|
|
|
mrw
Joined: 08 Oct 2012 Posts: 182
|
Posted: Wed Oct 02, 2013 7:43 am Post subject: |
|
|
Hi,
May I ask which SNMP OID I can use to get free/used/total physical memory using SNMP on an ESXi host?
I have not managed to find that, so if you have found it, please let me know which OID it is.
And on our ESXi server the only thing I can monitor is "Overall HealthState" using SOAP. And the test itself sucks because all I can check for is whether the Reply is not 5. And when that happens I don't get any information at all on what the exact problem is.
So VMware should really implement SNMP to allow more standard tests like CPU/disks/vdisks/RAIDs and other more specific tests. |
|
|
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12806 Location: USA
|
Posted: Wed Oct 02, 2013 8:48 am Post subject: |
|
|
Quote: | May I ask which snmp OID I can use to get free/used/total physical memory using snmp on an ESXi? |
There is the Memory test method; it will find the OID itself (it may use different counters depending on the OS).
Quote: | So VMware should really implement snmp to allow more standard tests like CPU/disks/vdisks/raids and other more specific tests. |
They implemented SNMP in old versions of VMWare; later they dropped SNMP support and implemented an API that does not provide useful information.
Regards
Alex |
|
|
|
mrw
Joined: 08 Oct 2012 Posts: 182
|
Posted: Thu Oct 03, 2013 2:05 am Post subject: |
|
|
The "Memory Test" doesn't work against my ESXi hosts.
SNMP is on and working and I can get a few values from it, but the "Memory Test" doesn't get any reply.
Any ideas why? It's set to use SNMP and the same credentials as my other SNMP queries. |
|
|
|
KS-Soft Europe
Joined: 16 May 2006 Posts: 2832
|
Posted: Thu Oct 03, 2013 4:35 am Post subject: |
|
|
What ESXi version do you use? |
|
|
|
mrw
Joined: 08 Oct 2012 Posts: 182
|
Posted: Thu Oct 03, 2013 4:49 am Post subject: |
|
|
Several different ones, but at least 5.0.0 and 5.1.0 |
|
|
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12806 Location: USA
|
Posted: Thu Oct 03, 2013 9:12 am Post subject: |
|
|
The physical memory check works on our ESXi 5.1.0 Build 799733.
Can you check the following OIDs?
1.3.6.1.4.1.2021.4.5.0
1.3.6.1.4.1.2021.4.6.0
1.3.6.1.2.1.25.2.3.1.4.6
1.3.6.1.2.1.25.2.3.1.5.6
1.3.6.1.2.1.25.2.3.1.6.6
Regards
Alex |
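For context, the three 1.3.6.1.2.1.25.2.3.1.x OIDs listed above belong to the HOST-RESOURCES-MIB hrStorage table: .4 is hrStorageAllocationUnits, .5 is hrStorageSize and .6 is hrStorageUsed, with the trailing .6 being the row index on that particular system. The first two OIDs are memTotalReal and memAvailReal from UCD-SNMP-MIB. Below is a minimal sketch of how the hrStorage counters combine into a usage figure; it assumes the three values were already fetched by some SNMP tool, and the function name and sample numbers are invented for illustration.

```python
def storage_usage(allocation_units: int, size: int, used: int) -> dict:
    """Convert hrStorage counters for one table row into bytes and percent used.

    allocation_units: hrStorageAllocationUnits (bytes per unit)
    size:             hrStorageSize (total, in allocation units)
    used:             hrStorageUsed (used, in allocation units)
    """
    total_bytes = size * allocation_units
    used_bytes = used * allocation_units
    # Guard against an empty or unreported entry to avoid dividing by zero.
    used_percent = (used / size * 100.0) if size > 0 else 0.0
    return {
        "total_bytes": total_bytes,
        "used_bytes": used_bytes,
        "free_bytes": total_bytes - used_bytes,
        "used_percent": used_percent,
    }


# Made-up sample: 1024-byte units, 8 GiB total, 2 GiB used.
sample = storage_usage(1024, 8388608, 2097152)
```

Whether row 6 describes memory or a disk volume depends on the host, so the row's hrStorageDescr value should be checked before trusting the result.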
|
|
|
mrw
Joined: 08 Oct 2012 Posts: 182
|
Posted: Fri Oct 04, 2013 3:01 am Post subject: |
|
|
These OIDs don't work on any of the ESXi servers:
1.3.6.1.4.1.2021.4.5.0
1.3.6.1.4.1.2021.4.6.0
But all of these work on all ESXi servers:
1.3.6.1.2.1.25.2.3.1.4.6
1.3.6.1.2.1.25.2.3.1.5.6
1.3.6.1.2.1.25.2.3.1.6.6
But I already use those OIDs to get "Disk Space Usage" on all available datastores that the ESXi host can use, and I can't find a "disk" that would represent physical memory when I parse all the OIDs that give me a reply. Or how would you use those?
And I have got the "Memory Test" to work on 1 ESXi host of the 5 I have, but it uses the same VMware build as the rest. The only thing different about that specific host is the hardware. It's not a real "server" but more of a workstation. But I hope that the values that test gives me are correct and not "Disk space usage"? |
|
|
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12806 Location: USA
|
Posted: Fri Oct 04, 2013 6:43 am Post subject: |
|
|
Quote: | And I have got the "Memory Test" to work on 1 ESXi host of the 5 I have, but it uses the same VMware build as the rest. The only thing different about that specific host is the hardware. It's not a real "server" but more of a workstation. |
Sorry, we have not found out how to enable the memory counters. Try asking the VMWare support team...
Quote: | But I hope that the values that test gives me is correct and not "Disk space usage"? |
HostMonitor checks the description of each counter within the hrStorageDescr branch; it will not use counters related to disk volumes. It checks for 'physical memory', 'real memory', 'memory buffers' counters...
So it should report correct information.
Regards
Alex |
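The description-based selection mentioned above could look roughly like the sketch below. It is only an illustration of the idea, not HostMonitor's actual code: it assumes the hrStorageDescr column (1.3.6.1.2.1.25.2.3.1.3) has already been walked into a dictionary of row index to description, and the function name and sample descriptions are made up.

```python
from typing import Dict, List

# Keywords named in the reply above; matching is case-insensitive here by assumption.
MEMORY_KEYWORDS = ("physical memory", "real memory", "memory buffers")


def find_memory_indexes(descriptions: Dict[int, str]) -> List[int]:
    """Return hrStorage row indexes whose description looks like a memory counter."""
    return sorted(
        index
        for index, descr in descriptions.items()
        if any(keyword in descr.lower() for keyword in MEMORY_KEYWORDS)
    )


# Hypothetical walk result: two datastores and one memory entry.
walked = {
    1: "/vmfs/volumes/datastore1",
    6: "Real Memory",
    7: "/vmfs/volumes/backup",
}
memory_rows = find_memory_indexes(walked)
```

If no row matches on a given host, that would be consistent with the Memory test returning no reply there, since there would be no memory entry in the table to read.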
|
|
|
|