jbarrellon
Joined: 08 Oct 2008 Posts: 23
Posted: Thu Aug 22, 2013 3:28 am Post subject: test does not log (primary and backup)
Hello KS Soft team,
We have a problem with some tests that seem to be logged neither in the primary log (database) nor in the backup log (file).
Example:
; ------- Test #01 -------
Method = ShellScript
;--- Common properties ---
;DestFolder = Root\SURVEILLANCE\BERNER-St_Julien_du_Sault\ORACLE\LMPROD\
RMAgent = BERNER-ST_JULIEN_DU_SAULT.FRNT05
Title = LMPROD temps de reponse
Comment = 236
RelatedURL =
ScheduleMode= OneTestPerDay
ScheduleTime= 08:01:00
Alerts = Recheck
ReverseAlert= No
UnknownIsBad= No
WarningIsBad= No
UseCommonLog= Yes
PrivLogMode = Default
CommLogMode = Default
;--- Test specific properties ---
Script = Scripts Oracle||Windows
Params = "C:\asis\RMA-Win\Scripts\temps_reponse.pl" "LMPROD" "manager"
Timeout = 15
UseMacros = No
This does not happen every day: on some days the test is logged correctly in the database, and on other days it is not (no entry in either the primary log or the backup log). HM is running at those times (other tests are logged in the same second), and we don't see any errors in the HTML system log file.
When we look at the Test Info window for this test, the total number of checks matches the number of records for this test in our database (select count(*) where testid='XXX'), so it really seems that this test (and some other tests in the same situation) is not executed on some days. Is this possible?
This is pretty annoying for us since we use this data to make reports and statistics.
HM v9.40 running on Windows Server 2008 R2.
Please let me know if you need additional information.
Thank you for your help
Julien
KS-Soft Europe
Joined: 16 May 2006 Posts: 2832
Posted: Thu Aug 22, 2013 7:06 am
Quote: | When we take a look at test info window for this test, the number of total checks is the same that the number of occurences of this test in our database (select count(*) where testid='XXX') |
This doesn't look like a logging problem: the number of checks equals the number of records in the DB.
I assume you are using FULL mode for primary and backup logging?
Quote: | so it really seems that this test (and some other tests in the same case) is not executed some days. Is this possible ? |
It's possible if the test has been disabled or paused, or if all monitoring has been stopped or paused, etc.
Could you try to set up a FULL mode private log for this test?
jbarrellon
Joined: 08 Oct 2008 Posts: 23
Posted: Thu Aug 22, 2013 7:52 am
Yes, we're using FULL mode for primary and backup logging.
The test has not been disabled, paused or anything else (otherwise we would see this information in the Quick Log window). As I said, monitoring was running correctly, since other tests were logged in the database at the same time (before, after and also in the exact same second).
I have set up a FULL private log for this test (in addition to the common log). I'll let you know if the problem appears again and whether an entry is made in this new log file.
KS-Soft
Joined: 03 Apr 2002 Posts: 12806 Location: USA
Posted: Thu Aug 22, 2013 9:01 am
Maybe the test returns a Reply string that is too long and the database returns an error.
Please check the Auditing Tool (menu View->Auditing Tool) and the system log file (HostMonitor log file, default name syslog.htm) for errors.
Regards
Alex
jbarrellon
Joined: 08 Oct 2008 Posts: 23
Posted: Fri Aug 23, 2013 7:47 am
I don't think the test returns too long a reply string, since it's configured to return just XXX ms. Also, we can't see any errors in the database logs.
As already said, there are no errors in the HostMonitor system log file syslog.htm.
In the Auditing Tool there are no errors either. However, we can see that HM is running 4.1 tests/s. Would it be possible that some tests are not performed because HM is overloaded (even though it says "Conclusion: system is able to perform given tests without significant load")?
Anyway, the problem did not appear this morning. I'll let you know the next time it appears, and I'll check:
- whether there is a corresponding entry in the private log file
- whether the Last test time field in the Test Info window is OK
KS-Soft
Joined: 03 Apr 2002 Posts: 12806 Location: USA
Posted: Fri Aug 23, 2013 10:34 am
Quote: | In Auditing Tools, there is also no error. However, we can see that HM is running 4.1 tests/s, would it be possible that some test are not done because HM is overloaded ? (even if it's saying Conclusion: system is able to perform given tests without significant load) |
4 tests per second? Overloaded? No way.
You can easily check whether the test was performed: just look at the Last test time field.
You can also set up a private log file for this test. If you see all records in the file log, some records missing from the database, and no errors in the system log, then we can assume that the ODBC driver or the database is not working correctly. Which ODBC driver exactly do you use?
Regards
Alex
jbarrellon
Joined: 08 Oct 2008 Posts: 23
Posted: Tue Aug 27, 2013 2:56 am
We use the "Oracle dans OraClient11g_home1_32bit" driver, version 11.02.00.03.
The problem appeared again. Here is the private log file:
[08/23/2013 8:02:47] GENTRAN temps de reponse Ok 625 ms Shell Script 64573
[08/24/2013 8:02:58] GENTRAN temps de reponse Ok 391 ms Shell Script 64573
[08/27/2013 8:01:11] GENTRAN temps de reponse Ok 375 ms Shell Script 64573
As you can see, the tests for 08/25/2013 and 08/26/2013 are missing.
They are scheduled to run every day.
We have the same entries in our database.
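For a longer private log, the same gap check can be automated. The following is a minimal sketch, not part of HostMonitor: it assumes every record starts with a bracketed `[MM/DD/YYYY H:MM:SS]` timestamp, as in the three lines quoted above, and the function names `logged_days` and `missing_days` are made up for illustration.

```python
import re
from datetime import datetime, timedelta

# Timestamp layout assumed from the three private log lines quoted above.
STAMP = re.compile(r"^\[(\d{2}/\d{2}/\d{4} \d{1,2}:\d{2}:\d{2})\]")


def logged_days(lines):
    """Return the set of calendar dates that have at least one log record."""
    days = set()
    for line in lines:
        match = STAMP.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%m/%d/%Y %H:%M:%S")
            days.add(stamp.date())
    return days


def missing_days(lines):
    """List the dates between the first and last record that have no record."""
    days = logged_days(lines)
    if not days:
        return []
    first, last = min(days), max(days)
    candidates = (first + timedelta(days=i) for i in range((last - first).days + 1))
    return [day for day in candidates if day not in days]


# The three records from the post above.
SAMPLE = [
    "[08/23/2013 8:02:47] GENTRAN temps de reponse Ok 625 ms Shell Script 64573",
    "[08/24/2013 8:02:58] GENTRAN temps de reponse Ok 391 ms Shell Script 64573",
    "[08/27/2013 8:01:11] GENTRAN temps de reponse Ok 375 ms Shell Script 64573",
]

if __name__ == "__main__":
    for day in missing_days(SAMPLE):
        print(day.isoformat())  # prints 2013-08-25, then 2013-08-26
```

Only gaps between the first and last record are reported, so a test that stopped running altogether would need a separate check against today's date.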
KS-Soft
Joined: 03 Apr 2002 Posts: 12806 Location: USA
Posted: Tue Aug 27, 2013 7:51 am
Could you please send your configuration files to support@ks-soft.net?
We need the HML file with the tests, the *.LST files, and the *.INI files (you may skip connlist.lst, which contains passwords).
Regards
Alex
KS-Soft
Joined: 03 Apr 2002 Posts: 12806 Location: USA
Posted: Wed Aug 28, 2013 8:30 pm
You have scheduled 262 test items to be performed at 08:01, and you have set a limit of 10 tests per second.
There are also some limitations on simultaneous test execution by agents...
If some scripts require a lot of time, HostMonitor may not be able to perform all tests within the specified time frame and will not start them. Maybe this is the reason for the problem.
Try scheduling some items at a different time.
If you need to start a test once a day but the execution time does not have to be limited to a few minutes, there is a better solution: you can use a regular schedule for the test items. Just set a long test interval (e.g. 2 hours) and assign a schedule with a shorter window (e.g. from 8:00 till 9:00). In this case HostMonitor will perform the tests once a day, within the hour between 8:00 and 9:00...
Regards
Alex
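A rough back-of-the-envelope check of the numbers in this reply can be sketched as follows. This is not HostMonitor's scheduler: the 262 items and the 10 tests per second limit come from the post, the 15 is the `Timeout` value from the test definition (assumed here to be seconds), and the figure of 4 concurrent scripts per agent is a purely hypothetical stand-in for the unspecified agent limitation.

```python
import math


def min_start_seconds(test_count, tests_per_second):
    """Lower bound on the time needed just to start every test."""
    return math.ceil(test_count / tests_per_second)


def batch_seconds(test_count, concurrent_slots, seconds_per_test):
    """Time to finish all tests when only `concurrent_slots` run at once."""
    return math.ceil(test_count / concurrent_slots) * seconds_per_test


if __name__ == "__main__":
    # 262 items at 10 tests per second: at least 27 seconds to start them all.
    print(min_start_seconds(262, 10))

    # Hypothetical worst case: 4 scripts at a time, each using the full
    # 15-unit timeout (assumed to be seconds): 990 seconds, about 16.5 minutes.
    print(batch_seconds(262, 4, 15))
```

Even the optimistic lower bound shows that not every item can start in the first seconds after 08:01, and slow scripts push the total well past a window of a few minutes, which is consistent with the suggestion to spread the items out or use a wider schedule window.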
jbarrellon
Joined: 08 Oct 2008 Posts: 23
Posted: Mon Sep 09, 2013 9:17 am
We have followed your advice. So far the problem has not appeared again.
Thank you for your help.
KS-Soft
Joined: 03 Apr 2002 Posts: 12806 Location: USA
Posted: Mon Sep 09, 2013 9:47 am
You are welcome.
Regards
Alex