
  • agent_keepalive alerts not happening

    Posted by steveybaby on January 8, 2008 at 07:32

    Hello guys – 1st of all want to say thanks for a great bit of software – this does just what I need, and does it well.

    The only unsolved problem in my installation is the ability to send an alert when the agent or machine running the agent shuts down. I saw the earlier post about the “bug” where keep alive alerts were being broken by ping alerts – and this is not my situation.

    All I have is a handful of proc and data modules assigned to my agent. I created the agent_keepalive alert, and assigned it 0/0 for max/min.

    I’ve turned off the agent and the server and hoped to receive the alert, but it doesn’t happen.

    Any ideas what I can look at?

  • Sancho

    Administrator
    January 8, 2008 at 15:19

    Thanks! :) San Francisco is a nice place; I was there on holiday!

    There is a bug there. I’m working on it now, because it was an unfinished bug for release 1.3 and another source of problems. Don’t worry, I think this should be easy to fix. I’ll post the solution here in a few hours.

  • steveybaby

    Member
    January 8, 2008 at 21:37

    Thanks for the quick response – I’m excited to get the fix!

  • Sancho

    Administrator
    January 8, 2008 at 22:54

    I’ve just uploaded new code for this issue. It affects the SQL data, the data server and the console as well, so it is neither a minor change nor a stable one yet. It works for me now: keepalive is now a “standard” module that takes its input from the agent contact data (from any source, data server or network server), so any server can detect a keepalive that has passed its time limit and raise the alerts assigned to this module. The module can also be deleted from the console.
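
To make the idea concrete, here is a minimal sketch of a keepalive check driven by the agent's last contact time. It is not Pandora's actual server code: the function name and the 10-minute limit are hypothetical, and the only thing taken from this thread is that the module reports 1 while the agent is up and 0 once it is considered down.

```python
from datetime import datetime, timedelta


def keepalive_value(last_contact: datetime, now: datetime, limit: timedelta) -> int:
    """Return 1 if the agent was heard from within `limit`, otherwise 0."""
    return 1 if now - last_contact <= limit else 0


# Hypothetical values, for illustration only.
now = datetime(2008, 1, 8, 23, 0, 0)
limit = timedelta(minutes=10)

print(keepalive_value(datetime(2008, 1, 8, 22, 55), now, limit))  # 1 (agent up)
print(keepalive_value(datetime(2008, 1, 8, 22, 30), now, limit))  # 0 (agent down)
```

Because the check only needs the last contact timestamp, it does not matter which server recorded that contact.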

    If you want to try it, I will be pleased to assist you in any way I can. You need to access the SVN code and apply the changes manually. That is not easy if you haven’t worked with SVN before, but there are no technical difficulties. This is the official code for 1.3.1, a minor fix release scheduled for February.

  • steveybaby

    Member
    January 8, 2008 at 23:52

    Yes please – can you give the steps involved? I’m no svn expert, but if you give me some pointers I should be fine.

  • steveybaby

    Member
    January 11, 2008 at 00:25

    Well, I did an svn checkout and replaced the console and server code with the new version. It’s still not working, so I guess I missed something. What are the “SQL data” and “data server” changes I need to make?

    I did notice that the icon for the keepalive monitor is now broken and I had to delete it from my configured agents, so something had changed.

  • Sancho

    Administrator
    January 11, 2008 at 14:10

    I’m working on a tarball for 1.3.1 beta. Only a minimal (very simple) upgrade is needed, and no coding skills are required. It will take me one more day. I want to fix several problems detected in 1.3.0 and add some minimal new features.

  • Sancho

    Administrator
    January 11, 2008 at 21:35

    The last commit on the 1.3.1 branch solves all the problems with keepalive, and it’s very stable (tested for several days in several different environments, some of them under heavy load). Take a look at the project website:

    http://pandora.sourceforge.net/en/index.php?sec=downloads

    To “migrate” to 1.3.1 you only need to insert a new row into the ttipo_modulo table, as follows:

    INSERT INTO `ttipo_modulo` VALUES (100,'keep_alive',-1,'KeepAlive','mod_keepalive.png');
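
If the migration might be run more than once, it can help to check for the row first. The sketch below illustrates that idea with Python's built-in sqlite3 module and an in-memory database rather than Pandora's real MySQL database. The column names are hypothetical (the real ttipo_modulo schema may differ); only the five inserted values come from the statement above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical column names; the real ttipo_modulo schema may differ.
conn.execute(
    "CREATE TABLE ttipo_modulo ("
    "id_tipo INTEGER PRIMARY KEY, nombre TEXT, categoria INTEGER, "
    "descripcion TEXT, icon TEXT)"
)


def add_keepalive_type(conn: sqlite3.Connection) -> bool:
    """Insert module type 100 unless a row with that id already exists."""
    exists = conn.execute(
        "SELECT 1 FROM ttipo_modulo WHERE id_tipo = 100"
    ).fetchone()
    if exists:
        return False
    conn.execute(
        "INSERT INTO ttipo_modulo VALUES (?, ?, ?, ?, ?)",
        (100, "keep_alive", -1, "KeepAlive", "mod_keepalive.png"),
    )
    conn.commit()
    return True


print(add_keepalive_type(conn))  # True: row inserted
print(add_keepalive_type(conn))  # False: row was already there
```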

  • steveybaby

    Member
    January 15, 2008 at 11:52

    Thanks – that worked great!

  • Sancho

    Administrator
    January 20, 2008 at 03:42

    Fantastic ! 🙂

  • alcsmith

    Member
    January 30, 2008 at 18:05

    Hey Nil,
    I am having a similar problem with the 1.3.1 build. I too am a new user; I have been tooling around with the app for about a week now, learning the ins and outs. I have the latest SVN, upgraded from 1.3. Everything dropped into place with no problem, but I am having a problem with it sending alerts:

    1 – In Manage Alerts I had to change the email alert from “echo _field3_ | sendmail -s _field2_ _field1_”, because sendmail doesn’t recognize -s. I changed it to “echo _field3_ | mail -s _field2_ _field1_”; mailx works as well.

    2 – I have an alert set up to check whether the agent is down. The min/max value is set to 1: if the value is anything other than 1, send an alert. Min Alerts is 1, Max Alerts is 5, and the time threshold is 5 min. It was working fine until I restarted the agent in question; the alerts stopped, but I did not get the [RECOVERED] message that is supposed to be delivered.

    I took the agent down today and the alerts did not fire.

    Any idea what could be the problem?

    -Al

    BTW Well done on a great product!
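
As an illustration of the email alert command in item 1, the sketch below expands the _field1_, _field2_ and _field3_ macros into the mail command line. It assumes plain textual substitution, which may not be exactly how Pandora expands them. The function name and sample values are hypothetical, and the shell quoting is an addition of this sketch, included because a subject containing spaces would otherwise be split into several arguments.

```python
import shlex


def build_alert_command(template: str, fields: dict) -> str:
    """Replace each macro in the template with its shell-quoted value."""
    command = template
    for macro, value in fields.items():
        command = command.replace(macro, shlex.quote(value))
    return command


template = "echo _field3_ | mail -s _field2_ _field1_"
fields = {
    "_field1_": "ops@example.com",        # recipient (hypothetical address)
    "_field2_": "Agent down",             # subject
    "_field3_": "No contact from agent",  # message body
}

print(build_alert_command(template, fields))
# echo 'No contact from agent' | mail -s 'Agent down' ops@example.com
```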

  • raul

    Member
    January 30, 2008 at 21:03

    Please take a look at these topics in the Wiki:

    Example of adding a new alert

    Assigning Alerts to modules

    Regards,

    Raúl

  • alcsmith

    Member
    January 30, 2008 at 23:51

    Raul,
    I don’t think my issue is with the config because it worked previously in this manner. It just stops working when I cycle the agent.

    I will however change the following settings:

    time threshold: 30 min
    Min # of Alerts: 1
    Max # of Alerts: 1

    Let’s see what happens.

  • alcsmith

    Member
    January 31, 2008 at 00:12

    I made the change and the alert still isn’t firing. I don’t see any errors in the logs, and the alert and the module both seem to be configured correctly.

  • alcsmith

    Member
    January 31, 2008 at 00:37

    This time I recreated the alert, putting a 1 in both the Min and Max columns. After I save the alert, it shows Min as Text and Max as empty.

    Again, no errors are shown; the alert just doesn’t fire.

  • alcsmith

    Member
    January 31, 2008 at 17:08

    I think I may have found the problem. I have had the agent down for 15.6 hours and no alert has fired. I checked the data tab for the agent_keepalive module and it says the value is currently 0, which tells me that it thinks the agent is down.

    My alert specifically says to fire if the value is anything other than 1 (agent up), yet the alert does not fire. It seems like this is a bug.
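
The behaviour expected here can be written down as a small check. This is only a sketch of the rule as described in this thread (the alert fires when the module value falls outside the min/max range), not Pandora's actual implementation, and the function name is hypothetical.

```python
def alert_should_fire(value: float, min_value: float, max_value: float) -> bool:
    """Fire when the value lies outside the inclusive [min_value, max_value] range."""
    return not (min_value <= value <= max_value)


# With min = max = 1, only the "agent up" value 1 is inside the range.
print(alert_should_fire(0, 1, 1))  # True: agent down, alert expected
print(alert_should_fire(1, 1, 1))  # False: agent up, no alert
```

Under this rule a keepalive value of 0 with min/max set to 1/1 should raise the alert, which is why a silent alert looks like a bug.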

  • Sancho

    Administrator
    February 3, 2008 at 22:23

    Are you using the current development 1.3.1 version? If so, could you send us a screenshot? This has been fixed in 1.3.1 and tested by several users 😕

    Grab a tarball from current development of 1.3.x on:

    http://artica.homelinux.com/pandora_tarball/1.3.x/?C=M;O=D

    If not, take a look at the page below, and don’t forget to run the small migration script (.sql) to upgrade from 1.3.0 to 1.3.1; it’s very easy.

    http://www.openideas.info/wiki/index.php?title=Pandora_1.3:Documentation_en:1.3_to_1.3.1

  • Royal TS

    Member
    February 4, 2008 at 14:04

    Hi guys,

    I also have the problem that Pandora doesn’t fire any alert if the agent is down. I have updated my Pandora system to version 1.3.1 and also applied the SQL update.
    Are there any other mistakes you can make when configuring an alert?
    Other alerts for modules like “check host alive” over ICMP are working fine.

    Best regards,

    Royal TS

  • Sancho

    Administrator
    February 4, 2008 at 14:59

    Could you describe the full setup process in more detail? I suppose you have activated alert recovery notification in the pandora_server.conf file, haven’t you?

  • Royal TS

    Member
    February 4, 2008 at 16:42

    Hi Nil,

    The recovery notification in pandora_server.conf is active and working.

    I’ve the following setup in use:

    Alert type: email
    Alert Status: enabled
    Min Value: 1
    Max Value: 1
    Alert text: Agent down
    Field #1: email Address
    Field #2: _Agent_ down
    Field #3: Text
    Time From: 0:00, Time to 0:00
    Time threshold: 30 min
    min Number of Alerts: 1
    max Number of Alerts: 3
    Assigned Module: agent_Keepalive

    I hope I’ve just made a small mistake in my thinking 😕

    greetz
