Hardware / Hard disk failures

All questions related to installation, configuration and maintenance of Advanced Host Monitor (including additional tools such as RMA for Windows, RMA Manager, Web Service, RCC).
doodleman99
Posts: 38
Joined: Tue Sep 02, 2008 5:45 am

Post by doodleman99 »

I have had a little snoop around the forums and looked into the documentation, but I still need a little help.
I would like to be able to detect hard disk failures on our servers.
With the majority of them being VMware ESXi boxes, I can't use WMI tests, so I assume that leaves me with SNMP Traps/Gets?
I have looked into this, but I am falling at the first hurdle as I have no experience with these.

Not sure if it's relevant, but we use mostly HP ProLiant & StorageWorks boxes

Many thanks for your help!!!!!
JV
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

The MIB Browser that is included in the Advanced Host Monitor package comes with some compiled MIB files from HP.
You can probably use the following counters:
1.3.6.1.4.1.11.2.3.9.4.2.3.3.2.1.5
iso.org.dod.internet.private.enterprises.hp.nm.hpsystem.net-peripheral.netdm.dm.hrm.hrDevice.hrDeviceTable.hrDeviceEntry.hrDeviceStatus
The current operational state of the device described by this row of the table. A value unknown(1) indicates that the current state of the device is unknown. running(2) indicates that the device is up and running and that no unusual error conditions are known. The warning(3) state indicates that agent has been informed of an unusual error condition by the operational software (e.g., a disk device driver) but that the device is still 'operational'. An example would be high number of soft errors on a disk. A value of testing(4), indicates that the device is not available for use because it is in the testing state. The state of down(5) is used only when the agent has been informed that the device is not available for any use.
1.3.6.1.4.1.11.2.3.9.4.2.3.3.2.1.6
iso.org.dod.internet.private.enterprises.hp.nm.hpsystem.net-peripheral.netdm.dm.hrm.hrDevice.hrDeviceTable.hrDeviceEntry.hrDeviceErrors
The number of errors detected on this device. It should be noted that as this object has a SYNTAX of Counter, that it does not have a defined initial value. However, it is recommended that this object be initialized to zero.
On the other hand, I think it's better to use the MIB files that come with your server (not just any MIB files) and to read the manual that comes with your server, because different versions of the SNMP agent software from HP may support new, useful sets of counters.
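
As a rough illustration of how the two counters above could be interpreted once their values have been read, here is a minimal Python sketch. The status names and numbers come from the hrDeviceStatus description above; the mapping of those states to "ok", "warning" and "bad", and the function names, are assumptions made for this example and are not anything HostMonitor defines. Because hrDeviceErrors is a Counter with no defined initial value, the sketch compares two readings rather than looking at the absolute number.

```python
# hrDeviceStatus values as listed in the MIB description above.
HR_DEVICE_STATUS = {
    1: "unknown",
    2: "running",
    3: "warning",
    4: "testing",
    5: "down",
}

COUNTER32_MODULUS = 2 ** 32  # an SNMP Counter wraps around at 2^32


def classify_device_status(value: int) -> str:
    """Map an hrDeviceStatus integer to an alert level (assumed policy)."""
    name = HR_DEVICE_STATUS.get(value)
    if name == "running":
        return "ok"
    if name in ("warning", "testing"):
        return "warning"
    if name == "down":
        return "bad"
    # unknown(1) and any value outside the table
    return "unknown"


def errors_since(previous: int, current: int) -> int:
    """Number of new errors between two hrDeviceErrors readings.

    The modulo handles the case where the counter wrapped past 2^32
    between the two polls.
    """
    return (current - previous) % COUNTER32_MODULUS


if __name__ == "__main__":
    print(classify_device_status(3))
    print(errors_since(100, 104))
```

Alerting on a non-zero difference between polls, rather than on a non-zero counter, avoids a permanent alarm on a device that logged a few errors long ago.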

According to HP you should have the following MIB files
CPQAPLI
CPQAPPG80
CPQCLUS
CPQCMC
CPQCR
CPQDMII
CPQFCA
CPQFIX
CPQGEN
CPQHLTH
CPQHOST
CPQHOSTB
CPQHSV110V3
CPQICA
CPQIDA
CPQIDE
CPQINFO
CPQSCSI
CPQSINFO
CPQSM2
CPQSRVMN
CPQSTAT
CPQSTDEQ
CPQSTSYS
CPQSWCC
CPQTHRSH
CPQUPS
CPQWINOS
http://www.hp.com/wwsolutions/misc/hpsi ... imsnmp.pdf

Regards
Alex
doodleman99
Posts: 38
Joined: Tue Sep 02, 2008 5:45 am

Post by doodleman99 »

as always, a super fast and useful response!
love you guys!!!
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

You are welcome :)

Regards
Alex
doodleman99
Posts: 38
Joined: Tue Sep 02, 2008 5:45 am

Post by doodleman99 »

KS-Soft wrote: According to HP you should have the following MIB files
Excuse my dumb question; as mentioned before, this is my first exposure to SNMP, so I need a little hand-holding.

As mentioned, you said I should already have the HP MIB files... I'm not sure how/where to find them, or how to add them to the current MIB list in HostMonitor.

When I try to "Get Value" on any of the suggested OIDs, it either does nothing at all or throws an error: "an existing connection was forcibly closed by the remote host" or "Cannot connect to remote SNMP agent". Is this because I need to load the MIB as mentioned above?

many thanks,
J
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

doodleman99 wrote: As mentioned, you said I should already have the HP MIB files... I'm not sure how/where to find them
Please send this question to the Hewlett-Packard support team.
You probably already have these files on your server. If you do not, I think you may use the Insight Management MIB update kit for HP Systems Insight Manager:
http://h18002.www1.hp.com/products/serv ... ibkit.html
But it would be better to consult the manufacturer. They know for sure which files best suit your hardware and software.
doodleman99 wrote: how to add them to the current MIB list in HostMonitor
You should use MIB Browser. MIB Browser allows you to view the hierarchy of SNMP MIB variables in the form of a tree and provides you with additional information about each node.
Just start MIB Browser as a regular application (do not call MIB Browser from HostMonitor when you want to modify the MIB database), then use the menu File->Append MIB file to compile the MIB files one by one.
Please read the manual or visit our web site for more information:
http://www.ks-soft.net/hostmon.eng/mibbrowser/index.htm

Actually, you can perform SNMP Get tests without using any MIB files; you just need to know the OID of the counter. MIB Browser simply lets you find such counters easily. Also, if you compile MIB files, HostMonitor will allow you to use variables like %EnterpriseName%, %EnterpriseNameShort%, %MibName%, etc. These variables are pretty useful when you set up the SNMP Trap test method to check for various events.
doodleman99 wrote: When I try to "Get Value" on any of the suggested OIDs, it either does nothing at all or throws an error: "an existing connection was forcibly closed by the remote host"
Maybe the SNMP agent is not started? Maybe you specified the wrong UDP port? You should use the same port that the SNMP agent is configured to listen on. Normally this is port 161, but maybe your SNMP agent was configured to use a different port?
Also, please check the firewall settings; maybe the firewall blocks the requests?
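
The checks above can be roughly automated. The following Python sketch sends one fixed SNMPv1 GetRequest for sysDescr.0 and reports which of three things happened. The function name, the result labels, and the "public" community string baked into the probe packet are assumptions for this example. The interpretation is a rule of thumb, not a guarantee: over UDP, a "forcibly closed" or "connection refused" error usually means the host answered with ICMP port unreachable (nothing is listening on that port, so the agent is probably not started or uses another port), while a silent timeout points more towards a firewall dropping the packets or an agent that ignores the community string.

```python
import socket

# SNMPv1 GetRequest for sysDescr.0 (1.3.6.1.2.1.1.1.0), community "public".
PROBE = bytes.fromhex(
    "302602010004067075626c6963a019020101020100020100"
    "300e300c06082b060102010101000500"
)


def probe_snmp(host: str, port: int = 161, timeout: float = 2.0,
               payload: bytes = PROBE) -> str:
    """Return 'reply', 'port-unreachable' or 'timeout' for one UDP probe."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        # connect() on a UDP socket lets the OS report ICMP errors to us.
        sock.connect((host, port))
        try:
            sock.send(payload)
            sock.recv(65535)
        except socket.timeout:
            # No answer at all: firewall, wrong community, or host down.
            return "timeout"
        except (ConnectionResetError, ConnectionRefusedError):
            # ICMP port unreachable: nothing listens on this UDP port.
            return "port-unreachable"
        return "reply"


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address; replace it with your server.
    print(probe_snmp("192.0.2.10", 161, timeout=1.0))
```

If the probe reports a reply but HostMonitor still fails, the problem is more likely in the test settings (community string, SNMP version) than in the network path.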

Regards
Alex