Alert and/or Agent relationships


Posted by godzone-nz on July 9, 2008 at 15:31

One of the features I am missing in Pandora, which I have used in other monitoring solutions, is the ability to form relationships between various objects.

The normal use for these relationships is to prevent being flooded with events. I cannot see such a mechanism in Pandora, so I would like to propose the following as an enhancement request. I would do it myself, but it involves changing the database schema, so I thought I’d float it here first.

I appreciate that what I am proposing might not suit everyone, so I would appreciate bouncing the idea around a bit first.

Phase 1 – Establishing relationships between alerts on the same agent.
Every alert is assigned a priority. When checking the monitors, if an alert is about to fire, check whether any higher-priority alert is already in the ‘fired’ state for that agent. If so, don’t perform the alert action, but still set the alert to fired.

Say I have an ‘icmp_proc’, some ‘snmp_inc’s and a couple of ‘tcp_proc’ modules associated with an agent, and I make the icmp alert the highest priority and the alerts for the others all lower. If the agent is down, the icmp_proc alert will fire and I will get told about it. The others will also all fire because the agent is down, but I already know that, so I don’t need those alerts telling me so.

I haven’t thought through what to do at the recovery end, but one idea is that when an alert ceases, it also ceases all lower-priority ones for the same agent without sending the emails. If any module is still in a fault state, its alert will fire again shortly, when the next test is done.
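
To make the Phase 1 idea concrete, here is a rough sketch in Python (not Pandora code). The `Alert` and `Agent` classes, the numeric `priority` field (larger means higher) and the `notifications` list standing in for emails are all made up for illustration. It also assumes that the alert that ceases still sends its own recovery notice, which the proposal leaves open.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    name: str
    priority: int          # hypothetical field: larger number = higher priority
    fired: bool = False


@dataclass
class Agent:
    name: str
    alerts: list = field(default_factory=list)
    notifications: list = field(default_factory=list)  # stands in for emails sent

    def fire(self, alert):
        """Mark the alert fired; notify only if no higher-priority alert is already fired."""
        if alert.fired:
            return
        suppressed = any(a.fired and a.priority > alert.priority for a in self.alerts)
        alert.fired = True
        if not suppressed:
            self.notifications.append(f"FIRED {alert.name}")

    def cease(self, alert):
        """Clear the alert and silently clear every lower-priority alert on this agent."""
        if not alert.fired:
            return
        alert.fired = False
        # Assumption: the ceasing alert itself still reports its recovery.
        self.notifications.append(f"RECOVERED {alert.name}")
        for a in self.alerts:
            if a.priority < alert.priority:
                a.fired = False  # no email; it fires again on the next check if still faulty


icmp = Alert("icmp_proc", priority=10)
tcp = Alert("tcp_proc", priority=5)
snmp = Alert("snmp_inc", priority=5)
agent = Agent("server1", alerts=[icmp, tcp, snmp])

agent.fire(icmp)   # notifies
agent.fire(tcp)    # suppressed, but marked fired
agent.fire(snmp)   # suppressed, but marked fired
agent.cease(icmp)  # one recovery notice; tcp and snmp cleared silently
print(agent.notifications)  # ['FIRED icmp_proc', 'RECOVERED icmp_proc']
```

Alerts with equal priority do not suppress each other in this sketch; whether they should is another open question.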

Phase 2 – Establishing relationships between agents.
This one can get a bit tricky, but I like the Nagios approach of establishing trees, i.e. each agent can have a parent. The idea here is similar to that in Phase 1, but I am still thinking about exactly how to use it. The aim is to prevent getting alerts for agents that have a common root. For example, if a router is down, then all the network-based alerts for things on the other side of it are going to fire. I don’t want to get hundreds of alerts when I already know the router is down.
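
One possible reading of the Phase 2 idea, again as a Python sketch rather than anything that exists in Pandora or Nagios: each agent holds an optional `parent`, and a down agent is only reported if none of its ancestors is already down. The `Agent` class, the `down` flag and `should_notify` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Agent:
    name: str
    parent: Optional["Agent"] = None
    down: bool = False


def should_notify(agent):
    """Report a down agent only if no ancestor is already known to be down."""
    seen = set()
    node = agent.parent
    while node is not None and id(node) not in seen:  # guard against accidental cycles
        if node.down:
            return False
        seen.add(id(node))
        node = node.parent
    return True


router = Agent("router")
switch = Agent("switch", parent=router)
web = Agent("web01", parent=switch)

# The router fails, so everything behind it looks down as well.
router.down = switch.down = web.down = True
to_notify = [a.name for a in (router, switch, web) if a.down and should_notify(a)]
print(to_notify)  # ['router']
```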

Anyway, I would be interested in what opinions others might have.


Sancho

Administrator
July 10, 2008 at 01:21


This is really interesting stuff. We have different ideas here, and each user has different needs, because a monitoring system like Pandora can be used to monitor a local network, a wider WAN, or even physical quantities like temperature using sensors, so the concept of “alerting” is very diffuse.

The first one, based on priority, is a really nice idea, because it is easy to prioritize alerts. I also think it is not very difficult to implement, but not everybody has more than one alert on the same agent. So the problem is to trigger the “ONE” real alert.

I don’t know if you have tested version 2.0 (currently in SVN). It has a new alert system called “Combined” or complex alerts, which is really a logical multi-alert system. You can define an alert as the logical result of several others (simple or even complex). You can define a number of “single non-firing alerts” that together can fire a real one. I think it is the better solution because you can define exactly the combination of alerts you want.

This could be used to fire a very refined alert on real problems only, and/or to fire individual ones as low-noise notifications (like a Pandora event) and big ones by sending an SMS to the admin.
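
As a rough illustration of the concept only (this is not the 2.0 code or its schema), a combined alert can be thought of as a logical AND/OR over other alerts, which may themselves be combined. The class names, the `op` field and the example alert names below are invented for the sketch.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class SimpleAlert:
    name: str
    active: bool = False   # condition met; a "non-firing" alert takes no action itself

    def is_active(self) -> bool:
        return self.active


@dataclass
class CombinedAlert:
    name: str
    children: List[Union["SimpleAlert", "CombinedAlert"]]
    op: str = "and"        # "and" = all children active, anything else = any child active

    def is_active(self) -> bool:
        states = [child.is_active() for child in self.children]
        return all(states) if self.op == "and" else any(states)


disk_full = SimpleAlert("disk_full")
db_slow = SimpleAlert("db_slow")
ping_lost = SimpleAlert("ping_lost")

degraded = CombinedAlert("service_degraded", [disk_full, db_slow], op="and")
page_admin = CombinedAlert("page_admin", [ping_lost, degraded], op="or")

disk_full.active = True
before = page_admin.is_active()   # only one half of the AND is true
db_slow.active = True
after = page_admin.is_active()    # the nested combined alert is now active
print(before, after)  # False True
```

Only `page_admin` would be wired to a real action such as an SMS; the simple alerts underneath just report their state.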
godzone-nz

Member
July 10, 2008 at 16:37

OK, it seems it would be worth my while having a look at version 2. No point in redesigning the wheel and it looks like you guys are already on top of this one.
