
  • Problem triggering alert when agent is down

    Posted by dave on October 3, 2007 at 21:53

    Hi!

    I have been trying for a while now to get the following scenario to work, without much success. I am using 1.3 Beta 2.

    I have a Windows agent running several data and proc modules. The monitoring side works flawlessly: all the data and proc values are reported correctly.

    Now I want to test that a failure raises an alert.

    When the agent stops reporting (I stop the service for testing), the agent is flagged as “agent down” in purple, which is correct, but I cannot get an alert to trigger.

    The data modules report “unknown”. The last contact time for the proc module keeps incrementing correctly, but the module never turns RED (down); it keeps showing GREEN, as if the last contact time were ignored. In my opinion it should turn RED once it is “out of limits”, that is, after interval time * 2.

    I suspect that is why the agent_keepalive alert never triggers.

    There should be a way to alert when an agent is “out of limits” (for example, by adding that condition to the alerting modules), and proc modules should take the last contact time into account and turn RED when out of limits.

    Thanks

    Dave

  • 3 Replies
  • dave

    Member
    October 12, 2007 at 20:32

    Hi!

    I wonder if anybody has the same problem, or if this is a bug that should be reported to the tracker. Any suggestions or solutions are welcome.

    Thanks

    Dave

  • manu

    Member
    October 12, 2007 at 20:53

    Hi dave,

    You’re not experiencing a bug; that’s the expected behaviour.
    You get the last known status for those monitors, but you also have (in red) the time since the last contact was made. So if that time is red, it correctly means that the agent is down.

    If you feel this should change, please feel free to join the mailing list and send an email with your points, and let’s discuss what would be better 🙂

    Cheers
    Manuel

  • Sancho

    Administrator
    October 12, 2007 at 20:59

    Hi!
    The data modules report “unknown”. The last contact time for the proc module keeps incrementing correctly, but the module never turns RED (down); it keeps showing GREEN, as if the last contact time were ignored. In my opinion it should turn RED once it is “out of limits”, that is, after interval time * 2.

    I suspect that is why the agent_keepalive alert never triggers.

    There should be a way to alert when an agent is “out of limits” (for example, by adding that condition to the alerting modules), and proc modules should take the last contact time into account and turn RED when out of limits.

    (quoted from Dave’s post)

    The agent_keepalive special module is meant for exactly that. If you have an agent (with only agent-based modules) that does not report within interval x 2, agent_keepalive should get a value of 0 and fire an alert, if you have set one up for it.

    If you have an external check (a network ICMP check, for example), agent_keepalive never gets a value of 0, because an ICMP network check always reports something.
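
    To make the rule above a bit more concrete, here is a rough sketch in Python. This is not Pandora FMS code: the function name keepalive_value, its parameters, and the idea of passing in a list of last-contact timestamps are all made up for illustration. It only assumes what is said above, namely that the keepalive goes to 0 when nothing on the agent has reported within interval x 2, and that any module that keeps reporting (such as a network ICMP check) keeps it at 1.

```python
from datetime import datetime, timedelta


def keepalive_value(last_contacts, interval_seconds, now):
    """Hypothetical helper, not part of Pandora FMS.

    Returns 1 if at least one module reported within 2 x interval,
    otherwise 0 (the value an alert would be attached to).
    """
    limit = timedelta(seconds=2 * interval_seconds)
    return 1 if any(now - seen <= limit for seen in last_contacts) else 0


now = datetime(2007, 10, 12, 21, 0, 0)
interval = 300  # assumed agent interval, in seconds

# Agent with only agent-based modules; last report was 15 minutes ago.
agent_only = [now - timedelta(seconds=900)]

# Same agent plus an external ICMP check that reported one minute ago.
with_icmp = agent_only + [now - timedelta(seconds=60)]

print(keepalive_value(agent_only, interval, now))  # 0
print(keepalive_value(with_icmp, interval, now))   # 1
```

    In the first case the alert on agent_keepalive would have something to fire on; in the second, the ICMP check alone is enough to keep the value at 1, which may be why an alert never triggers on agents that also have network modules.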