Problem triggering alert when agent is down
Posted by dave on October 3, 2007 at 21:53
Hi!
I have been trying for a while now to make the following scenario work, without much success. I am using 1.3 Beta 2.
I have a Windows agent running several data and proc modules. Monitoring works flawlessly for this part: all the data and proc modules report perfectly.
Now I want to test that a failure triggers alerts.
When the agent stops reporting (I stop the service for testing), the agent is flagged as “agent down” in purple, which is correct, but I cannot get an alert to trigger.
The data modules report “unknown”. The last contact time for the proc module is incremented correctly, but the module never turns RED (down) and keeps showing GREEN, as if the last contact time were ignored. In my opinion it should turn RED once the last contact is older than interval time * 2, that is, when it is “out of limits”.
I suspect that is why the agent_keepalive alert never triggers.
There should be a way to alert when an agent is “out of limits” (for example by adding that to the alerting modules), and proc modules should take the last contact time into account and turn RED when out of limits.
Thanks
Dave
Sancho replied 17 years, 4 months ago 3 Members · 3 Replies -
3 Replies
::
Hi dave,
You’re not experiencing a bug; that’s the expected behaviour.
The monitors show the status from their last contact, but you also have the time since the last contact was made, shown in red. If it’s red, it correctly means that the agent is down.
If you feel this should change, please feel free to join the mailing list and send an email with your points, and let’s discuss what would be better 🙂
Cheers
Manuel -
::
Hi!
The data modules report “unknown”. The last contact time for the proc module is incremented correctly, but the module never turns RED (down) and keeps showing GREEN, as if the last contact time were ignored. In my opinion it should turn RED once the last contact is older than interval time * 2, that is, when it is “out of limits”.
I suspect that is why the agent_keepalive alert never triggers.
There should be a way to alert when an agent is “out of limits” (for example by adding that to the alerting modules), and proc modules should take the last contact time into account and turn RED when out of limits.
Dave
The agent_keepalive special module is exactly for that. If you have an agent (with only agent-based modules) that does not report within interval x 2, agent_keepalive should get the value 0 and fire an alert, if you have set one up for it.
If you also have an external check (a network ICMP check, for example), agent_keepalive never gets the value 0, because an ICMP network check always reports something.
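To make the rule above concrete, here is a minimal Python sketch of the logic as described in this reply. It is only an illustration, not Pandora FMS code: the function name `keepalive_value`, the dictionary of last contact times, and the module names are all hypothetical, and the only thing taken from the thread is the "interval x 2" limit.

```python
def keepalive_value(last_contacts, interval, now):
    """Return 1 if any module reported within 2 * interval seconds, else 0.

    last_contacts: hypothetical mapping of module name -> last contact time
                   (seconds), covering agent-based and network modules alike.
    interval:      the agent interval in seconds.
    now:           the current time in seconds.
    """
    limit = 2 * interval  # the "interval x 2" rule from the reply above
    # A single fresh report from any module keeps the value at 1.
    fresh = any(now - seen <= limit for seen in last_contacts.values())
    return 1 if fresh else 0


# Agent with only agent-based modules, silent for 700 s with a 300 s interval:
# every last contact is older than 600 s, so the value drops to 0.
only_agent = {"cpu": 0.0, "proc": 0.0}
print(keepalive_value(only_agent, interval=300, now=700.0))

# Same agent plus a server-side ICMP check that answered 10 s ago:
# the fresh network report keeps the value at 1, so no alert fires.
with_icmp = {"cpu": 0.0, "icmp": 690.0}
print(keepalive_value(with_icmp, interval=300, now=700.0))
```

Under these assumptions, the second case shows why the alert never fires when a network module is attached to the same agent: the agent itself is silent, but something is still reporting for it.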