Author |
Message |
sista71
Joined: 06 Aug 2003 Posts: 11
|
Posted: Tue Apr 12, 2005 12:00 am Post subject: Hostmonitor 5.12 is constantly crashing |
|
|
Hi all,
we have a brand-new installation of HostMonitor 5.12 on a 2-processor/2GB RAM Windows 2003 Standard Server, so there are plenty of resources available. HostMonitor is running a bit more than 1300 tests on machines all over the world. Is 1300 too many for one system? The new "Estimate load..." feature says no. Anyhow, the HostMonitor service is constantly crashing, with no indication of the cause in the syslog. The Application Event Log only once in a while shows the following error (not every time HostMonitor crashes):
Faulting application hostmon.exe, version 0.0.0.0, faulting module kernel32.dll, version 5.2.3790.0, fault address 0x000249d3.
Do you have any idea why this might be happening? Any known issues? Is there any logging I can turn up to get more information on this?
Thanks in advance for your help
sista71 |
|
|
|
timn
Joined: 20 Nov 2003 Posts: 184 Location: United States
|
Posted: Tue Apr 12, 2005 6:43 am Post subject: |
|
|
I can't answer your main question, but I can tell you that we are running approx. 3,000 tests (19 tests/sec) on a similar 2-CPU, 2GB machine. Our OS is Win 2000 Server. We are also running HM 5.12.
On rare occasions (once every 2-4 weeks), we get "out of memory" dialog boxes popping up. (We are exploring this issue on another thread -- see below.) We are running HM as an application -- my understanding is that you don't see these dialog boxes when HM is running as a service.
Sounds like you may be seeing a similar problem. How frequently is this occurring?
You may also want to read this thread...
http://www.ks-soft.net/cgi-bin/phpBB/viewtopic.php?t=1964&highlight= |
|
|
|
sista71
Joined: 06 Aug 2003 Posts: 11
|
Posted: Tue Apr 12, 2005 9:25 am Post subject: If it was only once every 2 weeks... |
|
|
Hi timn,
first of all, thanks a lot for your reply. We get this failure about every 10-60 minutes. Yes, we are running HostMonitor as a service, since this gives me the advantage that the service can restart automatically whenever it fails. We cannot afford to be without this monitoring for longer than a few minutes.
Regarding the thread you mentioned: we are not running any reports at all. Also no ODBC logging (yet). I wanted this to work OK before I turn on ODBC. We have HostMonitor set to send out mails whenever a test fails.
We do have a file virus scanner running on the system (TrendMicro ServerProtect 5.58). Should we exclude any directories from scanning?
We also have the newest version of Compaq Insight Manager running on the server, but it is not really doing anything yet. This is on hold because of the HostMonitor issues we are experiencing.
My answers to the resource questions:
GDI Objects=229, User Objects=160, Memory=14.664 K, Handles=500, Threads=10
It is hard to get a snapshot of the moment when it fails because it is happening out of the blue.
Hope that clears up some questions for Alex in advance. But, if I understand correctly, the issue in the other thread
http://www.ks-soft.net/cgi-bin/phpBB/viewtopic.php?t=1964&highlight=
is not solved yet, is it?
Regards
sista71 |
|
|
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12801 Location: USA
|
Posted: Tue Apr 12, 2005 1:24 pm Post subject: |
|
|
Do you have an antivirus monitor installed, such as Norton Antivirus or McAfee?
Often the Norton Antivirus monitor leads to problems like this: a crash without an error message.
An antivirus scanner, on the other hand, does not cause any problems.
Regarding the other thread: it's not solved yet, but I was able to reproduce the resource leakage using James' settings. It looks like some test method works incorrectly under some circumstances... Hope we will find a solution soon.
Regards
Alex |
|
|
|
sista71
Joined: 06 Aug 2003 Posts: 11
|
Posted: Wed Apr 13, 2005 3:39 am Post subject: Antivirus Software |
|
|
Hi Alex,
no, we do not have an antivirus monitor installed, just regular antivirus file scanning. So I guess we can rule this out. Could a memory leak crash hostmonitor every few minutes?
Regards
Silke |
|
|
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12801 Location: USA
|
Posted: Wed Apr 13, 2005 11:01 am Post subject: |
|
|
Quote: | Could a memory leak crash hostmonitor every few minutes |
I don't think so.
Are you using the SNMP or Traffic Monitor test methods? Windows 2003 has a bug in mgmtapi.dll that often generates errors and may cause the application to crash. Please read this article: http://www.ks-soft.net/cgi-bin/phpBB/viewtopic.php?t=1301
Also, could you try setting up HostMonitor on a different system? Just for testing.
Regards
Alex |
|
|
|
sista71
Joined: 06 Aug 2003 Posts: 11
|
Posted: Thu Apr 14, 2005 7:08 am Post subject: |
|
|
Hi Alex,
we are not using any SNMP or Traffic Monitor tests at all. We also do not use RMA agents. We are getting ready to put this on another system. What would be your suggestion as to which OS version would be the most stable? 2000 Server or even Workstation?
Here is an overview of the tests we are performing:
Service 644
UNC resources 382
Ping 74
TCP 103
SMTP 73
POP3 28
Count files 18
URL request 2
Total 1324
I've also noticed a lot of Win32 1722 and 1726 errors at times, but I cannot necessarily make a connection between these errors and the crashes. And I have extensively checked the network performance and there are no errors or packet loss whatsoever. Is there a chance HostMonitor might be overloaded by making a lot of RPC calls at the same time?
Regards
Sista71 |
|
|
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12801 Location: USA
|
Posted: Thu Apr 14, 2005 10:07 am Post subject: |
|
|
Yes, I would recommend Windows 2000 SP4 + security patches. The Server edition is probably better optimized for performance, but Workstation works well too.
Quote: | And I have extensively checked the network performance and there are no errors or packet loss whatsoever. Is there a chance HostMonitor might be overloaded by making a lot of RPC calls at the same time? |
HostMonitor just sends requests to Windows... How often does HM perform the UNC and Service tests? Perhaps you could increase the test intervals and decrease the "Do not start more than N tests per second" option?
Regards
Alex |
|
|
|
sista71
Joined: 06 Aug 2003 Posts: 11
|
Posted: Fri Apr 15, 2005 4:46 am Post subject: tested w2k professional SP4 |
|
|
Hi Alex,
we just installed hostmonitor on another machine (w2K professional SP4). I am afraid it has exactly the same problems as the w2003 Server. The frequency of the tests is:
Services 1,1/sec
UNC 24,7/min
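As a rough cross-check, the two rates above can be turned into an average interval per test. This is only a back-of-the-envelope sketch: it assumes the test counts from the earlier overview (644 Service, 382 UNC resources), and it assumes every test of a given type runs at the same interval, which the thread does not confirm. The function name is made up for illustration.

```python
def average_interval_seconds(test_count, tests_per_second):
    """Average time between two runs of the same test, in seconds.

    Assumes all `test_count` tests share one interval, so that together
    they produce the overall rate `tests_per_second`.
    """
    return test_count / tests_per_second


# Counts taken from the test overview earlier in the thread.
service_interval = average_interval_seconds(644, 1.1)       # rate quoted per second
unc_interval = average_interval_seconds(382, 24.7 / 60)     # rate quoted per minute

print(f"Service: one run roughly every {service_interval / 60:.1f} min")
print(f"UNC:     one run roughly every {unc_interval / 60:.1f} min")
```

Under those assumptions the Service tests repeat about every 10 minutes and the UNC tests about every 15 minutes.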
I cannot really increase the test interval, since we need it set like that. I've tried several different settings for the "Do not start more than N tests per second" option: I've tried lowering it to 16 and raising it to 60. It makes no difference. I am depressed. Is there any way to turn up logging on hostmonitor to see what the problem is?
Regards
Sista71 |
|
|
|
sista71
Joined: 06 Aug 2003 Posts: 11
|
Posted: Fri Apr 15, 2005 7:20 am Post subject: Ok, I think I have to give up on this one... |
|
|
...but that brings up my next question: I have everything correctly formatted in 5.12. Now I've installed 4.86 and want to migrate/downgrade all my tests to this version. I have copied all *.lst and *.ini files to the new server, as well as the hml and ~hm files. Everything looks very good except for one (very important) thing: the action profiles are not accepted by the older version. I have even tried recreating the action profiles manually and then adding the other *.lst, ini and hml files again. It doesn't work. Please Alex, tell me there is a way that will save me from having to add the action profiles to each single test again?
Thanks
Sista71 |
|
|
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12801 Location: USA
|
Posted: Fri Apr 15, 2005 12:24 pm Post subject: |
|
|
Quote: | Please Alex, tell me there is a way that will save me from having to add the action profiles to each single test again |
If your action profiles do not contain "Record HM log" actions (implemented in version 5.10), you may change the 1st byte of the actions.lst file from 08 to 07; then you will be able to use this file with HostMonitor 4.86.
If your action profiles contain "Record HM log" actions, you should remove all these actions, save the profiles and then modify the 1st byte of the file.
If you do not have a utility for editing binary files, send the actions.lst file to support@ks-soft.net.
When the action profiles are ready for the old version of HostMonitor, copy the HML file with the tests from version 5.12, start version 4.86 and load the tests.
If you load the tests before modifying actions.lst, HostMonitor will not be able to keep the "test->actions" links.
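For anyone without a binary editor, the one-byte change described above could be scripted. The following is a minimal sketch, not a KS-Soft tool: the function name and the `.bak` backup are made up for illustration, and it only applies when the profiles contain no "Record HM log" actions, as stated above. It refuses to touch a file whose first byte is not 08.

```python
import shutil
from pathlib import Path


def downgrade_actions_file(path, old=0x08, new=0x07):
    """Change the first byte of the file from `old` to `new`, keeping a backup.

    Returns the path of the backup copy. Raises ValueError if the first
    byte is not `old`, so an unexpected file is never modified.
    """
    path = Path(path)
    data = path.read_bytes()
    if not data or data[0] != old:
        raise ValueError(f"first byte is not 0x{old:02X}; file left untouched")
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)                   # keep the unmodified original
    path.write_bytes(bytes([new]) + data[1:])    # only the first byte changes
    return backup
```

Calling `downgrade_actions_file("actions.lst")` in the HostMonitor folder would patch the file in place and leave `actions.lst.bak` next to it; run it before loading the tests in 4.86.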
Quote: | we just installed hostmonitor on another machine (w2K professional SP4). I am afraid it has exactly the same problems as the w2003 Server. I am depressed. |
Me too
Quote: | Is there any way to turn up logging on hostmonitor to see what the problem is? |
Windows closes HostMonitor without any error message, right? That means HostMonitor does not get any chance to report the problem.
So: no ODBC logging, no antivirus monitors, and it crashes on both Windows 2000 and Windows 2003.
Could you send your settings (all *.LST, *.INI and *.HML files)? HostMonitor cannot successfully perform your tests from our network, but maybe I will be lucky enough to reproduce the problem.
Regards
Alex |
|
|
|
sista71
Joined: 06 Aug 2003 Posts: 11
|
Posted: Tue Jul 26, 2005 2:40 am Post subject: Found the culprit!!! |
|
|
Hi Alex,
I know it's been a while, but I was quite busy. I finally found some time to do some more testing. I disabled one folder at a time and let HostMonitor run for a while, then enabled the folder again and went on to the next one. The crashing stopped when I disabled the "Count all files" and "Count old files" tests we have running on 8 machines. I understand that the "Count all files" tests might be quite a load, since they have to go through about 400 folders altogether containing about 300 files. Are there any time-outs associated with this test? Is there any config file that I can modify? The "Count old files" tests, on the other hand, only have to check one folder that usually has no files in it, and they still make HostMonitor crash all the time. Is there possibly a test setting I could have gotten wrong? I would be thankful for any suggestions, since we do need those tests.
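To see how heavy such a folder walk is outside of HostMonitor, the same count can be timed independently. This is a rough sketch and not how HostMonitor implements its "Count files" test; the function name and the `older_than_days` parameter are made up for illustration, and "old" is assumed here to mean the file's modification time.

```python
import os
import time


def count_files(root, older_than_days=None):
    """Count files under `root`; optionally only those older than N days.

    Returns (file_count, folder_count, elapsed_seconds). Folders that
    cannot be listed are skipped, which is os.walk's default behaviour.
    """
    cutoff = None
    if older_than_days is not None:
        cutoff = time.time() - older_than_days * 86400
    files = folders = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        folders += 1
        for name in filenames:
            if cutoff is not None:
                try:
                    if os.path.getmtime(os.path.join(dirpath, name)) >= cutoff:
                        continue  # too recent to count as "old"
                except OSError:
                    continue      # file vanished or is inaccessible
            files += 1
    return files, folders, time.perf_counter() - start
```

Pointing it at the same UNC path the test uses (for example a hypothetical `r"\\server\share\folder"`) from the HostMonitor machine would show how long one pass takes and whether the walk itself runs into errors.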
Thanks
Sista71 |
|
|
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12801 Location: USA
|
Posted: Tue Jul 26, 2005 11:58 am Post subject: |
|
|
H'm, the "Count files" test causes the error....
We have checked our code and I am 99.999% sure there is no mistake that could cause HostMonitor to crash. Perhaps some bug in the network client???
You are checking a remote system, right? Could you try installing RMA on the remote system and performing these tests using the agent?
Regards
Alex |
|
|
|
mpriess
Joined: 02 Jul 2002 Posts: 112 Location: Arizona, USA
|
Posted: Thu Aug 04, 2005 1:48 pm Post subject: Hostmonitor WAS crashing on us a lot... |
|
|
We consistently had problems with HostMonitor hanging every few days with the last couple of releases, and we couldn't find the cause; however, since we upgraded to 5.38, all those issues have gone away. The app has been running well for two weeks straight with no crash\memory leak\etc. We are running Windows 2003 Server with ~2000 tests; load: 4 per second. |
|
|
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12801 Location: USA
|
Posted: Thu Aug 04, 2005 2:03 pm Post subject: |
|
|
Good news
But... we did not fix any bugs in version 5.38. We fixed some possible problems in version 5.34. Did you have problems with version 5.34?
Regards
Alex |
|
|
|
|