Problem triggering alert when agent is down

Posted by dave on October 3, 2007 at 21:53

Hi!

I have been trying for a while now to get the following scenario to work, without much success. I am using 1.3 Beta 2.

I have a Windows agent running several data and proc modules. The monitoring side works flawlessly: all the data and proc modules report perfectly.

Now I want to test alerting on a failure.

When the agent stops reporting (I stop the service for testing), the agent is flagged as “agent down” in purple, which is correct, but I find it impossible to trigger an alert.

The data modules report “unknown”. The last contact time for the proc module increments correctly, but the module never turns RED as down and keeps showing GREEN (as if it ignored the last contact time). In my opinion it should turn RED once it is “out of limits”, that is, after interval time * 2.

I suspect that is why the agent_keepalive alert never triggers.

There should be a way to alert when an agent is “out of limits” (for example by adding that to the alerting modules), and proc modules should take the last contact time into account and turn RED when out of limits.

Thanks

Dave

dave

Member
October 12, 2007 at 20:32

Hi!

I wonder whether anybody else has the same problem, or whether this is a bug that should be reported to the tracker. Any suggestion or solution is welcome.

Thanks

Dave

manu

Member
October 12, 2007 at 20:53

Hi dave,

You are not experiencing a bug; that is the expected behaviour.
Those monitors show the status from the last contact, but you also get the time since the last contact was made (shown in red). If that time is red, it correctly means that the agent is down.

If you feel this should change, please feel free to join the mailing list and send an email with your points, and let’s discuss what would be better 🙂

Cheers
Manuel

Sancho

Administrator
October 12, 2007 at 20:59

Hi!

dave wrote:

> The data modules report “unknown”. The last contact time for the proc module increments correctly, but the module never turns RED as down and keeps showing GREEN (as if it ignored the last contact time). In my opinion it should turn RED once it is “out of limits”, that is, after interval time * 2.
>
> I suspect that is why the agent_keepalive alert never triggers.
>
> There should be a way to alert when an agent is “out of limits” (for example by adding that to the alerting modules), and proc modules should take the last contact time into account and turn RED when out of limits.

The agent_keepalive special module is exactly for that. If you have an agent (with only agent-based modules) that does not report within interval x 2, agent_keepalive should get a value of 0 and fire an alert, if you have set one up for it.

If you have an external check (a network ICMP check, for example), agent_keepalive never gets a value of 0, because an ICMP network check always reports something.
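
The behaviour described above can be modelled roughly as follows. This is only an illustrative sketch in Python, not Pandora FMS code: the `Module` class, the `keepalive_value` function and the module names are all hypothetical, and the only rule taken from this thread is "no report within interval x 2 means a value of 0".

```python
from dataclasses import dataclass


@dataclass
class Module:
    """Hypothetical stand-in for a monitored module (not a Pandora FMS type)."""
    name: str
    interval: int        # expected reporting interval, in seconds
    last_contact: float  # time of the last report, as a Unix timestamp


def keepalive_value(modules, now):
    """Return 1 while at least one module has reported within 2 * its interval.

    Returns 0 only when every module is overdue, which is the condition
    under which an alert on the keepalive could fire.
    """
    for module in modules:
        if now - module.last_contact <= 2 * module.interval:
            return 1
    return 0


now = 10_000.0

# Agent stopped 1000 s ago; both modules are agent-based with a 300 s interval.
agent_only = [
    Module("cpu_load", 300, now - 1000),
    Module("proc_service", 300, now - 1000),
]

# Same agent, plus a network ICMP check that the server ran 60 s ago.
with_icmp = agent_only + [Module("icmp_ping", 300, now - 60)]

print(keepalive_value(agent_only, now))
print(keepalive_value(with_icmp, now))
```

In this model the agent with only agent-based modules drops to 0 once every module is overdue, while the agent that also has the ICMP check stays at 1, because that check keeps producing fresh data even though the software agent is stopped.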

Welcome to Pandora FMS Community!
