Active RMA stops responding and service won't stop

russionix
Posts: 15
Joined: Thu Dec 27, 2007 3:00 pm


Post by russionix »

I have the Active RMA 3.04 Beta installed on several machines (all Windows 2003 R2 Server SP2). Everything works fine for several hours, then some agents start to drop off.

For example, out of 40 agents I had running yesterday, when I came in this morning 11 of the agents were in the "RMA not connected" state. When I go to the remote machine and try to stop the service, it hangs.

The only way to restart is to use Task Manager to kill rma_active.exe and then start the service again. Then everything returns to normal.

I am communicating with HM 7.08, but plan to upgrade to 7.10. Since the problem appears to be the RMA, I doubt this will help. Is there a new RMA I should be using? What else can I do to help troubleshoot this problem?

Thank you.
KS-Soft Europe
Posts: 2832
Joined: Tue May 16, 2006 4:41 am

Post by KS-Soft Europe »

Could you provide more information, please?
- What version of Windows (and which Service Pack) is installed on the machine where HostMonitor is running?
- Do you have an antivirus monitor installed? A personal firewall? Content monitoring software? Non-standard Winsock components? A network packet analyzer?
- Do you see any suspicious error messages in Event Viewer (Start > Settings > Control Panel > Administrative Tools > Event Viewer)?
- Do you see any error messages in the System Log (the file specified under "Options" -> "System Log")? You can open the System Log from the menu "View" -> "System Log".
- Do you see any error messages in RMA's log files (the successful and failure audit logs specified with the rma_cfg.exe utility)?

Regards,
Max
russionix
Posts: 15
Joined: Thu Dec 27, 2007 3:00 pm

Post by russionix »

All machines including the Host Monitor machine are running Windows 2003 R2 Server with Service Pack 2. Some of the RMA machines are running the 64 bit version, but the problem occurs with both 64 and 32 bit machines.

All machines have McAfee VirusScan Enterprise Server 8.5.0.781 ePolicy Orchestrator Agent 3.6.0.574 running. This is true for machines where the RMA stops communicating as well as machines where there is not a problem. I am having our AV admin take a look to make sure McAfee is not interfering.

I do not see anything unusual in the syslog or log for HM. In one check the machine responds, in the next check it fails. There is nothing in between to indicate a problem.

I do not see anything unusual in the Event Viewer for the Host Monitor machine. On the RMA machine's Event log I see "The KS Active Remote Monitoring Agent service terminated unexpectedly." That is probably when I killed the process using Task Manager. Otherwise nothing unusual.
KS-Soft Europe
Posts: 2832
Joined: Tue May 16, 2006 4:41 am

Post by KS-Soft Europe »

russionix wrote: I am having our AV admin take a look to make sure McAfee is not interfering.
In fact, we recommend installing HostMonitor/RMA on a clean system. However, if you are unable to disable the antivirus, we recommend at least disabling the real-time protection module or adding HostMonitor/RMA to the exclusions list.
russionix wrote: On the RMA machine's Event log I see "The KS Active Remote Monitoring Agent service terminated unexpectedly." That is probably when I killed the process using Task Manager. Otherwise nothing unusual.
Correct. This message is related to killing the process.
What about RMA's log files? Each RMA writes information to log files. To find the log file names, start the rma_cfg.exe utility from the folder where RMA is running. You can also open the relevant rma.ini file and look for the "[Logging]" section. Could you send the log files from one of the failed RMAs to support@ks-soft.net?
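
If you would rather pull the "[Logging]" section out of rma.ini with a script than open each file by hand, a minimal Python sketch could look like the one below. It only assumes that rma.ini is a standard INI file. The key names in the sample text are made up for illustration and are not the real rma.ini keys; the function simply returns whatever keys it finds.

```python
import configparser

def logging_section(ini_text):
    """Return the key/value pairs of the [Logging] section, or {} if absent."""
    parser = configparser.ConfigParser(
        interpolation=None,   # Windows paths may contain '%', do not expand it
        strict=False,         # tolerate duplicate keys or sections
        allow_no_value=True,  # tolerate keys written without '='
    )
    parser.optionxform = str  # keep the original case of key names
    parser.read_string(ini_text)
    if not parser.has_section("Logging"):
        return {}
    return dict(parser.items("Logging"))

# Hypothetical sample; the real key names in rma.ini may differ.
SAMPLE_INI = r"""
[Logging]
ExampleSuccessLog=C:\example\rma_ok.log
ExampleFailureLog=C:\example\rma_fail.log
"""

for key, value in logging_section(SAMPLE_INI).items():
    print(key, "=", value)
```

To use it on a real agent, read the file's text first (for example with `open(path).read()`) and pass that text to `logging_section`.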

Regards,
Max
russionix
Posts: 15
Joined: Thu Dec 27, 2007 3:00 pm

Log from failing RMA

Post by russionix »

This is the tail of the log. Before this, there are only "Wrong reply packet" errors from when I was setting things up. If you still want the entire log, let me know.

[2/11/2008 12:22 PM] active-name2 Connection error: Wrong reply packet received
[2/15/2008 5:57 PM] active-name2 Decode error: Cannot read data
[2/15/2008 5:57 PM] active-name2 Connection error
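
With 40 agents, summarizing these logs by hand gets tedious. Below is a rough Python sketch that parses lines in the format shown above and counts them by error type. The layout "[timestamp] agent-name message" and the month/day/year date order are assumptions inferred from these three lines only, not from any RMA documentation.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout: [M/D/YYYY H:MM AM|PM] <agent name> <message>
LINE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<agent>\S+)\s+(?P<msg>.+)$")

def parse_line(line):
    """Return (timestamp, agent, kind, message), or None if the line does not match."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("ts"), "%m/%d/%Y %I:%M %p")
    message = match.group("msg")
    kind = message.split(":", 1)[0].strip()  # text before the first colon
    return timestamp, match.group("agent"), kind, message

SAMPLE = """\
[2/11/2008 12:22 PM] active-name2 Connection error: Wrong reply packet received
[2/15/2008 5:57 PM] active-name2 Decode error: Cannot read data
[2/15/2008 5:57 PM] active-name2 Connection error
"""

entries = [entry for entry in map(parse_line, SAMPLE.splitlines()) if entry]
kinds = Counter(kind for _, _, kind, _ in entries)
for kind, count in kinds.items():
    print(kind, count)
```

Sorting the parsed entries by timestamp across several agents would show whether the hangs cluster around the same time.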
KS-Soft Europe
Posts: 2832
Joined: Tue May 16, 2006 4:41 am

Post by KS-Soft Europe »

Could you provide more information, please?
- Do these RMAs perform tests and actions, or tests only?
- What is the estimated load on HostMonitor (tests per second)? You can find this information under the menu "View" -> "Estimate Load".
- What exact value is specified in the "Don't start more than [N] tests per second" box on the "Options" -> "Behavior" tab?

Regards,
Max
russionix
Posts: 15
Joined: Thu Dec 27, 2007 3:00 pm

Post by russionix »

- Do these RMAs perform tests and actions, or tests only?

Ping, two shell scripts to test disk usage, CPU Usage, NTP check on each machine. All tests are dependent on Ping.

- What is the estimated load on HostMonitor (tests per second)?

Load: 2 test/sec "System is able to perform given tests without significant load"

- What exact value is specified in "Don't start more than [N] tests per second" box in "Options" -> "Behavior" tab?

"Don't start more than 32 tests per second"
russionix
Posts: 15
Joined: Thu Dec 27, 2007 3:00 pm

Still seeing 50% of Active RMAs failing

Post by russionix »

I am still experiencing the same problem with Active RMA. After running for a while, the agents stop communicating. The process is still running, but it does not communicate. When I try to stop the service, it will not stop. I must kill the rma_active.exe process and restart the service. Then it works for a while, and the cycle starts all over again.

Currently half the hosts I am monitoring are not responding, which means I am not really monitoring them.
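
Until there is a fix, the manual workaround (kill the process, then start the service) could be scripted. The Python sketch below is only a stopgap idea. It assumes the service can be started by the display name shown in the Event log message ("KS Active Remote Monitoring Agent"); if the service is registered under a different name on your machines, change SERVICE_NAME. `taskkill` and `net start` are the standard Windows commands.

```python
import subprocess

PROCESS_IMAGE = "rma_active.exe"
# Assumption: display name taken from the Event log message in this thread.
SERVICE_NAME = "KS Active Remote Monitoring Agent"

def recovery_commands(image=PROCESS_IMAGE, service=SERVICE_NAME):
    """Build the two commands: force-kill the hung process, then start the service."""
    return [
        ["taskkill", "/F", "/IM", image],
        ["net", "start", service],
    ]

def recover(run=subprocess.run):
    """Run both commands in order and return (command, exit code) pairs.

    The service start is attempted even if taskkill reports that no
    process was found. `run` is injectable so the logic can be tried
    without touching a real machine.
    """
    results = []
    for command in recovery_commands():
        completed = run(command, capture_output=True, text=True)
        results.append((command, completed.returncode))
    return results
```

Calling `recover()` on the affected machine, with administrator rights, performs the same two steps as Task Manager plus the Services console. It does not detect the hang; that still has to come from HostMonitor reporting "RMA not connected".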
KS-Soft
Posts: 12821
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

We understand the problem, but we cannot reproduce it :(
We are checking our code and trying to find the mistake...

Regards
Alex
KS-Soft
Posts: 12821
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

Could you please try the following update: www.ks-soft.net/download/test/actrma305t.zip ?
Do not apply this module on all systems. This is a test version with some "configuration limitations", so you will need to replace it with the normal version again later. Just install it on several systems and check how it works.

If my guess is right and some functions that are documented as thread-safe are not really thread-safe, this version should be stable or almost stable (hanging much more rarely). If it is, we will know how to fix the problem.
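
For readers wondering what such a test build might do: one common way to check a thread-safety suspicion is to funnel every call to the suspect routine through a single lock and see whether the hangs disappear. The Python sketch below illustrates that general idea only. It is not the actual RMA code or the actual change in the test version, and `UnsafeCounter` is a hypothetical stand-in for a routine that is not thread-safe.

```python
import threading

class UnsafeCounter:
    """Hypothetical stand-in for a routine that is NOT thread-safe."""
    def __init__(self):
        self.value = 0

    def bump(self):
        current = self.value      # read
        self.value = current + 1  # write; another thread could interleave here

class Serialized:
    """Wrapper that lets only one thread at a time into the wrapped object."""
    def __init__(self, target):
        self._target = target
        self._lock = threading.Lock()

    def bump(self):
        with self._lock:
            self._target.bump()

def hammer(worker, threads=8, calls=1000):
    """Call worker.bump() from several threads at once and wait for all of them."""
    def loop():
        for _ in range(calls):
            worker.bump()
    pool = [threading.Thread(target=loop) for _ in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()

counter = UnsafeCounter()
hammer(Serialized(counter))
print(counter.value)  # 8000: no updates are lost when calls are serialized
```

The trade-off is throughput: serialized calls cannot overlap, which may be one reason a build like this is suitable for diagnosis but not for general use.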

Regards
Alex
Eggers2
Posts: 19
Joined: Tue Dec 25, 2007 8:25 am

active rma

Post by Eggers2 »

Hi Alex,

Remember me? I had the same problem with Active RMA, and it is still not solved.

The only good thing now is that I'm not the only one with this problem. I was already going crazy, because no one except me had this problem...

I will also try the beta download from your previous post and will report back...

Bye,
Alex
russionix
Posts: 15
Joined: Thu Dec 27, 2007 3:00 pm

Installed new rma_active.exe

Post by russionix »

I just installed the new RMA on 4 machines and will watch them over the next day. I'll let you know if they continue to run or if they hang.

Thanks for continuing to work on this.

Russ
russionix
Posts: 15
Joined: Thu Dec 27, 2007 3:00 pm

Active RMA still running

Post by russionix »

Good news. All 4 test Active RMA agents are still running this morning. I think this means you may have identified the problem. I'll let you know if the status changes.
KS-Soft
Posts: 12821
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

Good news :) We will modify our code a little. Please keep testing these RMA...

Regards
Alex
russionix
Posts: 15
Joined: Thu Dec 27, 2007 3:00 pm

Active RMA still running

Post by russionix »

More good news. The Active RMA survived the weekend. This is the longest the RMA has ever run, so it appears you have found the problem. I just wanted to let you know that you are on the right track. Let me know when the new Active RMA client is ready and I will deploy it on all our systems and report back to you.