HostMon v7.78 - Unresponsive service after ~1 hr

All questions related to installation, configuration and maintenance of Advanced Host Monitor (including additional tools such as RMA for Windows, RMA Manager, Web Service, RCC).
gwindsor
Posts: 9
Joined: Wed Feb 17, 2010 7:59 am

Post by gwindsor »

Hello KS-Soft,

I am running HostMon v7.78 as a Windows service and, like many others, have had issues with monitoring ceasing and HostMon itself becoming unresponsive.

I'll start with a list of facts about the environment:
  - ODBC tests seem to be the cause. I stopped all ODBC tests and didn't encounter this problem after letting it run for over 48 hours.
  - I have updated the ODBC driver on the server running HostMon to the 10203 drivers. This is the same version I have running on a far more intensive HostMon setup on a completely different system. That system also runs ODBC tests, and far more of them.
  - Initial troubleshooting led me to the auditing tool. With it I discovered that this particular HostMon setup did not have a high enough value for "Number of tests to start per second" on the Behavior tab of Options. It has since been increased past the recommended number.
  - The auditing tool currently reports no problems, with an average of 4 tests/sec on this system.
  - There are 54 ODBC query tests running at 5.7/min. 13 of these tests are currently disabled.
From reading around this forum, ODBC tests seem to be a common denominator with freezing problems. Has there been any officially suggested ODBC driver version to use?

What doesn't make sense to me is that, as mentioned above, I have another HostMon environment setup which is far more intense in terms of tests per second than this one. It has the same ODBC driver as well, 10203.

Thanks in advance,
Greg

Post by gwindsor »

I should also mention that there is no Symantec AV running on the machine...
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

Well, sometimes third-party software leads to problems. We cannot fix such problems because we are not the owners or developers of that software.
gwindsor wrote:From reading around this forum, ODBC tests seem to be a common denominator with freezing problems. Has there been any officially suggested ODBC driver version to use?
ODBC drivers? It depends on the SQL server you are using.
What server do you use? MS SQL? Oracle? MySQL? PostgreSQL? Interbase?

As far as we know, Oracle client versions 8 and 10 may lead to problems, while version 9 works fine.
PostgreSQL and MS SQL clients work fine.
MySQL software often has bugs as well.
gwindsor wrote:What doesn't make sense to me is that, as mentioned above, I have another HostMon environment setup which is far more intense in terms of tests per second than this one. It has the same ODBC driver as well, 10203.
That's a good question for the ODBC driver developers.
To tell the truth, I don't know what "driver 10203" is.
Do you mean Oracle ODBC driver version 10.02.00.03?

Regards
Alex

Post by gwindsor »

KS-Soft wrote:ODBC drivers? It depends on the SQL server you are using.
What server do you use? MS SQL? Oracle? MySQL? PostgreSQL? Interbase?
Sorry, Oracle ODBC. And to answer your other question below: yes, I did mean 10.02.00.03. Initially, this particular setup had 10.02.00.01; after reading about various ODBC issues, and knowing that our other HostMon setup is using 10.02.00.03, I patched to that.
KS-Soft wrote:As far as we know, Oracle client versions 8 and 10 may lead to problems, while version 9 works fine.
PostgreSQL and MS SQL clients work fine.
MySQL software often has bugs as well.
So you've had luck with 9i ODBC drivers for Oracle? I'll keep that in mind as something to try if I can't make sense of the other things I'm seeing.

Something else I meant to add was that I see a high number of threads under the hostmon process and tried to dig deeper into what exactly they were doing. I can get the thread stacks for a few of them if you're familiar with them and it would help.

I've seen it reach around 480 threads before it ceases to function entirely.

Thanks Alex,
Greg

Post by KS-Soft »

gwindsor wrote:So you've had luck with 9i ODBC drivers for Oracle? I'll keep that in mind as something to try if I can't make sense of the other things I'm seeing.
We are not using version 9 on our systems, but this replacement (version 10 -> 9) has helped our customers.
gwindsor wrote:Something else I meant to add was that I see a high number of threads under the hostmon process and tried to dig deeper into what exactly they were doing. I can get the thread stacks for a few of them if you're familiar with them and it would help.
I've seen it reach around 480 threads before it ceases to function entirely.
480 threads? Yes, something is wrong. If you disable the ODBC tests and restart HostMonitor, will it work normally?
HostMonitor creates a new thread for each test probe. When everything works properly, the thread is terminated after the test executes.
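
The thread-per-probe model described above can be illustrated with a short Python sketch. It is not HostMonitor's code: `run_probes`, `healthy_probe`, and `hung_probe` are hypothetical names, and the hung probe simply waits on an event that is never set, standing in for a driver call that never returns. The only point is that each such call leaves one thread behind, so the thread count climbs as tests keep firing.

```python
import threading
import time


def run_probes(probes, settle_seconds=0.2):
    """Start one daemon thread per probe and return the threads that are
    still alive after a short settle period."""
    threads = [threading.Thread(target=probe, daemon=True) for probe in probes]
    for t in threads:
        t.start()
    time.sleep(settle_seconds)
    return [t for t in threads if t.is_alive()]


def healthy_probe():
    # Stands in for a query that returns promptly.
    time.sleep(0.01)


# Never set: simulates a call into a driver that never returns (assumption).
_never = threading.Event()


def hung_probe():
    _never.wait()


if __name__ == "__main__":
    leaked = run_probes([healthy_probe] * 5 + [hung_probe] * 3)
    print(f"{len(leaked)} probe threads never finished")
```

In this toy model the healthy probes exit on their own, while every hung probe stays alive for as long as the process does.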

Regards
Alex

Post by gwindsor »

KS-Soft wrote:We are not using version 9 on our systems, but this replacement (version 10 -> 9) has helped our customers.
Gotcha - so are you guys also using the 10.02.00.03 drivers or have you moved to 10.02.00.04?

KS-Soft wrote:480 threads? Yes, something is wrong. If you disable the ODBC tests and restart HostMonitor, will it work normally?
HostMonitor creates a new thread for each test probe. When everything works properly, the thread is terminated after the test executes.
Aye, around 480. Sometimes it stops lower (upper 300s), but once it gets anywhere close to that point I know it's just a matter of time until it stops. Interestingly, using a process explorer to view the threads in hostmon.exe, I manually killed a few threads in the "Wait:UserRequest" state to see what impact, if any, it had. I didn't see any impact other than the tray icon disappearing once I finally killed the thread responsible for managing it.

It appears that there is a thread limit that it hits each time and can't go past; however, that limit has changed each time.

When I disabled all ODBC tests to try and narrow down the problem, I was able to have the rest of the monitors run for approximately 48 hours with no interruption at all. I'm confident it is not the tests themselves that are poorly constructed and/or requiring too much time to come back from their respective sources because I run identically structured tests from the other hostmon environment I manage and the ODBC tests there are more intensive (they query bigger databases).

If I can cause it to grind to a halt and get some individual thread information, would it help at all? I noticed that when it reached that point, 99% of the threads were very similar, but I couldn't really draw any conclusions from what I was seeing.

Thanks again,
Greg

Post by KS-Soft »

gwindsor wrote:Gotcha - so are you guys also using the 10.02.00.03 drivers or have you moved to 10.02.00.04?
Currently we are using 10.02.00.01 and it works fine for us.
gwindsor wrote:If I can cause it to grind to a halt and get some individual thread information, would it help at all?
Maybe it can help the Oracle developers, but I don't see how we can fix this problem :roll:

Regards
Alex

Post by gwindsor »

Seeing as I'm not a developer myself, I'm not quite understanding this. Why would the ODBC drivers be the problem when they work fine in one system but not another?

Is it not the hostmon process that opens the calls to the ODBC drivers during the tests, and subsequently the threads which hang? How is this not something you can assist with?

Post by KS-Soft »

gwindsor wrote:Seeing as I'm not a developer myself, I'm not quite understanding this. Why would the ODBC drivers be the problem when they work fine in one system but not another?
You should send this question to Oracle.
gwindsor wrote:Is it not the hostmon process that opens the calls to the ODBC drivers during the tests, and subsequently the threads which hang?
Yes, HostMonitor calls the ODBC manager, which in turn calls the ODBC driver. These DLLs (ODBC manager and ODBC driver) work in the address space of the process and use the resources of the process (hostmon.exe). This means any problem caused by a DLL (the ODBC driver in our case) will cause problems for the entire process. If you close or restart the process, Windows releases all resources, regardless of which module allocated them (HostMonitor itself, the ODBC manager, or some other DLL).
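
One general way to see why an in-process DLL matters is to compare it with running the risky call in a separate process, which can be killed on a timeout without affecting the caller. The Python sketch below only illustrates that idea; it is not how HostMonitor runs ODBC tests, and `run_isolated` is a hypothetical helper. The child here is a plain Python snippet standing in for a query.

```python
import subprocess
import sys


def run_isolated(code, timeout_seconds):
    """Run `code` in a child Python process and return (status, stdout).

    If the child exceeds the timeout, subprocess.run kills it and the
    caller gets ("timeout", "") instead of hanging with it.
    """
    try:
        done = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return ("timeout", "")
    status = "ok" if done.returncode == 0 else "error"
    return (status, done.stdout.strip())


if __name__ == "__main__":
    # A child that answers promptly, and one that simulates a hang.
    print(run_isolated("print('up')", 10))
    print(run_isolated("import time; time.sleep(30)", 0.5))
```

When the child is killed, the operating system reclaims everything it allocated, which is the same cleanup described above for restarting hostmon.exe, just on a smaller scale.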
gwindsor wrote:How is this not something you can assist with?
Our software is Advanced Host Monitor. We can fix mistakes in our code.
We cannot fix mistakes in Windows, an ODBC driver, or some antivirus. They are not our property.

Regards
Alex

Post by gwindsor »

Understood. I appreciate the information, Alex. I will investigate further on my end and follow up with you if I get any information that may help others in the future.

Post by KS-Soft »

I would recommend trying Oracle client version 9.

Regards
Alex

Post by gwindsor »

No luck with the 9 ODBC drivers... they caused all tests to fail.

Post by KS-Soft »

Fail? What is the status of the test? Unknown? Bad? Is there any message in the Reply field of the test?

Regards
Alex

Post by gwindsor »

Status is unknown with the following as its reply:

: Specified driver could not be loaded due to system error 126 (Oracle in 10gclient)

My guess is it's because I didn't install a new 9 client and was hoping the 9 ODBC drivers from Oracle would be backwards compatible with a 10g client install. It appears that's not the case.
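
Windows system error 126 is ERROR_MOD_NOT_FOUND ("The specified module could not be found"), which is consistent with the guess above: the driver DLL, or a client library it depends on, could not be loaded. A rough first check is whether the relevant DLLs can be found on PATH at all. The sketch below assumes the driver file is `sqora32.dll` and that it needs `oci.dll` from the Oracle client; both names are assumptions to adjust for the actual install, and `find_module` is a hypothetical helper. Windows also searches locations other than PATH, so a miss here is a hint, not proof.

```python
import os


def find_module(dll_name, search_dirs):
    """Return the first path under search_dirs where dll_name exists, else None."""
    for directory in search_dirs:
        candidate = os.path.join(directory, dll_name)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    path_dirs = os.environ.get("PATH", "").split(os.pathsep)
    # DLL names below are assumptions; replace them with the files the
    # ODBC data source actually points at.
    for name in ("sqora32.dll", "oci.dll"):
        print(name, "->", find_module(name, path_dirs) or "not found on PATH")
```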