Difference between revisions of "Pandora: QuickGuides EN: Fast deployment"

From Pandora FMS Wiki
(Introduction)
 
(18 intermediate revisions by 3 users not shown)
Line 5: Line 5:
 
==Introduction==
 
==Introduction==
  
This guide wants to show the user how to administrate in a quick an efficient way a high number of machines (5, 10..500) using the different features of Pandora FMS designed for this purpose. We are going to divide the document into four parts:  
+
This guide aims to show the user how to quickly and efficiently administrate a high number of machines (5, 10... or 500) using the different features of Pandora FMS designed for this purpose. We are going to divide the document into four parts:  
  
  
Line 16: Line 16:
 
* Remote monitoring with customized scripts, using an agent generator via XML.
 
* Remote monitoring with customized scripts, using an agent generator via XML.
  
==Network devices Monitoring, using Recon Server and templates==
+
==Network device Monitoring, using Recon Server and templates==
  
 
'''Situation'''
 
'''Situation'''
We have to monitor two hundred servers, twenty switches and ten router, and we can't configure one by one. The "general" monitoring is very easy, but we have not much time neither the possibility of installing agents in the machines.
+
To monitor two hundred servers, twenty switches and ten routers, which can't be configured one by one. The "general" monitoring is very easy, but there isn't much time or the possibility to install agents on the machines.
  
 
'''Solution'''
 
'''Solution'''
  
Pandora FMS will detect the systems and will apply them different templates depending on it's a switch, a router or a server. The templates would have remote checks that could be apply just when detecting the kind of machine it is.
+
Pandora FMS will detect the systems and will apply different templates depending if it's a switch, a router or a server. The templates have remote checks that can be applied when detecting the kind of machine it is.
  
  
'''How long will it takes ?'''
+
'''How long will it take ?'''
  
A network class C (255 hosts) is scanned in less than one minute with the 4.0 version. To apply to the detected machines a monitoring standard is almost immediate, so you could have these 230 machines in less than ten minutes, completely configured.
+
A class C network (255 hosts) can be scanned in less than one minute using version 6.0. Applying a monitoring standard to the detected machines is almost immediate, so you could have those 230 machines completely configured in less than ten minutes.
  
  
  
=== Step 1. Defining the monitoring profiles ===
+
=== Step 1. Defining monitoring profiles ===
  
First we are going to define one monitoring template that in Pandora FMS is called "Module Template". For it we go to the following menu:
+
First we are going to define the monitoring template that is called "Module Template" in Pandora FMS. Go to the following menu:
  
<br>
 
 
<center>
 
<center>
 
[[image:quick_mon_1.png]]
 
[[image:quick_mon_1.png]]
 
</center>
 
</center>
<br>
 
  
Here we see some profiles already defined, that have some generic checks. We are going to edit one of them (Linux Server) that refers to one profile that is useful to monitor generic Linux servers in a remote way.
+
Here we see some pre-defined profiles, that have some generic checks. We are going to edit one of them (on a Linux Server) that refers to a profile that is useful to monitor generic Linux servers remotely.
  
 
<br>
 
 
<center>
 
<center>
 
[[image:quick_mon_2.png|600px]]
 
[[image:quick_mon_2.png|600px]]
 
</center>
 
</center>
<br>
 
  
<br>
 
 
<center>
 
<center>
 
[[image:quick_mon_3.png|600px]]
 
[[image:quick_mon_3.png|600px]]
 
</center>
 
</center>
<br>
 
  
As you can see in the shootscreen at the top, this profile has several modules, between them, some basic TCP checks, as for example "Check SSH Server", an ICMP basic check ("Host Alive") and different SNMP modules that uses the Linux MIB, that are the rest of checks.
+
As you can see in the screenshot, this profile has some basic TCP checks, for example "Check SSH Server", a basic ICMP check ("Host Alive") and different SNMP modules that use the Linux MIB, which make up the rest of the checks.
  
  
These "template" checks are defined in the Pandora FMS basic module library that comes with the simple installation.
+
These "template" checks are defined in the Pandora FMS basic module library that comes with the simple installation and which contains generic module definitions.
  
The basic modules are called "Network components". They are generic definitions of modules. They could be edited and see at:
 
Administration -> Modules -> Network components.Here we can see one generic module look called "Sysname" that gets the system value via SNMP.
 
  
<br>
+
The IP value doesn't exist in this module, because it is auto-assigned from the agent IP when this module is applied. The rest of the fields are "default", i.e. thresholds, SNMP community; and are applied to all the agents that have a template with this module. If you want to modify it (i.e. changing the community) you need to change them one by one on the agents, or with the massive change tool.
<center>
 
[[image:quick_mon_4.png]]
 
</center>
 
<br>
 
  
The IP value doesn't exist in this module, because it with self assign from de agent Ip when this module would be apply. The rest of fields are "by default" for example:thresolds, snmp community, and they will be aply to all the agents that have a template with this module. If we want to customize it (for example: to change the community) then we should have to change it in the agents one by one or in a general way with the massive change tool.
+
Now that we know what monitoring templates, generic modules for templates or network components are, we can look at some of the other templates, specifically those involved with WMI generic monitoring and those corresponding to basic monitoring.
  
 +
The first has three WMI modules for Windows. These modules need to be customized, editing the original component or the generated modules, so they need a username and password with permissions to perform remote queries on WMI.
  
Well, now that we know what  a monitoring template, a generic module for template or network component are, we can see some of the other templates, specifically that of WMI generic monitoring and that one of basic monitoring.
+
The second one only has a basic check for ICMP connectivity. We can add other basic checks, as you can see in the following screenshot:
  
The first one has got three WMI modules for Windows. These modules need to be customized, editing the original component or the generated modules, so they need user and password with permissions to do WMI remote queries.
 
 
the second one has got only a ICMP connectivity basic check. We can add other basic checks, such as service verifications  HTTP, FTP, SMTP, etc service verifications, as you can see in the following screen shot:
 
 
 
<br>
 
 
<center>
 
<center>
 
[[image:quick_mon_5.png]]
 
[[image:quick_mon_5.png]]
 
</center>
 
</center>
<br>
 
  
=== Step 2. Using a Network task with Recon Server ===
 
  
Now that we have three monitoring basic profiles: Linux, Windows and network.
+
=== Step 2. Using a Network task with a Recon Server ===
 +
 
 +
Now we have three basic monitoring profiles: Linux, Windows and network.
  
 
Supposing that we have to monitor all the machines in a network group, for example:
 
Supposing that we have to monitor all the machines in a network group, for example:
Line 96: Line 78:
 
* 192.168.50.0/24,192.168.1.0/24 for communications.
 
* 192.168.50.0/24,192.168.1.0/24 for communications.
  
And we want that it identify all the machines of that networks and depending on its OS applies a template or another. Other way of doing it, is the one in which the switches could be of several brands and models. Here we have to "identify" them through an standard based on having or not an open port. I.e. that those machines with the port 23 (telnet) open, identifies them as generic machines (switches, routers).
+
We want it to identify all the machines on that network, and, depending on their OS, apply one template or another. Another way of doing it when/if the switches are of several different brands and models is to "identify" them through a standard procedure based on having an open port or not. i.e. that those machines with the port 23 (telnet) open, are identified as generic machines (switches, routers).
  
 
Go to the recon servers section to create a new one:
 
Go to the recon servers section to create a new one:
  
<br>
 
 
<center>
 
<center>
 
[[image:quick_mon_6.png]]
 
[[image:quick_mon_6.png]]
 
</center>
 
</center>
<br>
+
 
 
:
 
:
We are going to create one to look for and register Windows servers,applying the Widows machine monitoring standard:
+
We're going to create one to look for, and register, Windows servers by applying the Windows machine's monitoring standard:
  
<br>
 
 
<center>
 
<center>
 
[[image:quick_mon_7.png|520px]]
 
[[image:quick_mon_7.png|520px]]
 
</center>
 
</center>
<br>
 
 
Here we can see, how in the "OS" field (kind of operative system) we have selected Windows. That's to say, it'll only apply this monitoring profile to those machines that it finds that are kind Windows, if not, it will ignore them.
 
  
As the way of detect automatically the kind of OS is not 100% reliable, it would be possible to select other method, as for example to specify an specific port.
+
Here we can see, how in the "OS" field Windows is selected. That means it will only apply this monitoring profile to those machines that it finds running a Windows OS, if not, it will ignore them.
  
This way, all the machines with this port open, would enter in the template application. This example could be seen here, where we've created another task. but using a filter by port instead of OS, to apply it the generic network device monitoring:
+
Since the OS detection isn't 100% reliable (it depends on the machine's own services), it is possible to select another method, like for example singling out a specific port.
  
 +
This way, all the machines with this port open, would fit under the template's application. This example could be seen here, where we've created another task. This time we're using a port filter instead of an OS filter, to apply it to the generic network device monitoring:
  
<br>
 
 
<center>
 
<center>
 
[[image:quick_mon_8.png|520px]]
 
[[image:quick_mon_8.png|520px]]
 
</center>
 
</center>
<br>
 
  
It's important to know  how specify two networks, I put them together, separated by one space 192.168.50.0/24,192.168.1.0/24
+
To specify two networks, separate them by commas: 192.168.50.0/24,192.168.1.0/24
  
Finally, I'll configure the one of Linux in a similar way. When finishing to define the three groups it will be like this:
+
Finally, we'll configure the Linux one in a similar way. When you are finished defining the three groups it should look like this:
  
<br>
 
 
<center>
 
<center>
 
[[image:quick_mon_9.png]]
 
[[image:quick_mon_9.png]]
 
</center>
 
</center>
<br>
 
  
Once we have defined the recon task, these could start alone, but, let's see their status and force them if it'd be necessary. To do this, click on the eye icon, to go to the Recon server operation view.
+
Once we have defined the recon task, these can start alone, but let's see their status and force them to start if necessary. To do this, click on the eye icon and go to the Recon server operation view.
  
<br>
 
 
<center>
 
<center>
 
[[image:quick_mon_10.png]]
 
[[image:quick_mon_10.png]]
 
</center>
 
</center>
<br>
 
 
By default, the recon server has one thread, so you'll only execute one task at the same time. The rest will wait to that it will end the active exploration task. We can force the exploration task pressing the circular green icon at the left of the task.
 
  
This will do that the Recon server search new machines that are doesn't exist in the active monitoring. If it finds them, it will register them automatically (trying to resolve the name, if we've activated this option) and assigning it all the modules that were contained in the profile.
+
By default, the recon server has one execution thread, so you'll only be able to execute one task at a time. The rest will wait for the active exploration task to end. However, the server configuration file (pandora_server.conf) can be modified. You can force exploration tasks by pressing the circular green icon at the left of the task.
  
 +
This will make the Recon server search for new machines that don't exist in the active monitoring scheme. If it finds them, it will register them automatically (trying to resolve the name, if this option is activated) and assign all the modules that were contained in the profile to it.
  
We should be aware of many of the modules assigned in one profile could have no sense or not be correctly configured for specifically one agent. In this agent, we have detected a Linux system correctly, but this server hasn't got SNMP, so not all the SNMP modules are reporting. Given that not even in the first time they could got data, they are in a mode know as "Non-init status" (not initialized). The next time you pass the DDBB manteinance script, they will be deleted automatically:
+
Be aware that many of the modules assigned to one profile could make no sense or not be correctly configured for that specific agent. On this agent, we've correctly detected a Linux system, but this server hasn't got SNMP, so not all the SNMP modules are reporting. Given that they couldn't retrieve data on the first attempt, they are in a mode known as "Non-init status" (not initialized). The next time you pass the database maintenance script, they will be deleted automatically:
 
 
  
 
<center>
 
<center>
 
[[image:quick_mon_11.png|650px]]
 
[[image:quick_mon_11.png|650px]]
 
</center>
 
</center>
<br>
 
  
== SNMP Network Devices Monitoring, using SNMP Recon Script==
 
  
It's almost the same that with the previous case. In this case, we consider the necessity of monitoring in an "automatic" way and in depth, an SNMP device with lot of interfaces, needing to get the state of each interface, the traffic in each entry, the rate of mistakes, etc.
+
== SNMP Network Device Monitoring, using SNMP Recon Script==
  
For it, we're going to use a system known as Recon Scrip. It's a modular system that allows to execute complex actions in one script. Pandora FMS has one script already created to detect this kind of SNMP devices.
+
In this case, we need to monitor an SNMP device with many interfaces "automatically" and in depth, needing to retrieve the status of each interface, the traffic on each entry, the error rate, etc.
  
For it, we create a network task, like this:
+
To do this, we're going to use a system known as Recon Script. It's a modular system that allows you to execute complex actions on one script. Pandora FMS has a script to detect this kind of SNMP device.
 +
 
 +
To execute it, create a network task, like this:
  
<br>
 
 
<center>
 
<center>
 
[[image:Quick_mon_12.png]]
 
[[image:Quick_mon_12.png]]
 
</center>
 
</center>
<br>
 
  
In the "first field" we write the network or the destination network.
+
In the "first field" write the network or the destined network.
In the "second field" we write the SNMP community that we are going to use when exploring these devices.
+
In the "second field" write the SNMP community that we are going to use when exploring these devices.
In the "third field" we write some optional parameters. In this case -n is for it register also the interfaces that are down, so by default it only register the active interfaces.
+
In the "third field" write some optional parameters. In this case -n is to register the interfaces that are also down, this means that by default it only registers active interfaces.
  
This script will register the interfaces that didn't exist previously and that now are active in each machine, in each execution. So if new interfaces are started up, it will detect it. We can leave the network tasks to they would be executed once a day, or once at every hour.
+
This script will register the interfaces that didn't exist previously and that now are active on each machine, in each execution. So if new interfaces are started up, it detects them. We can program the network tasks so they are executed once a day, once an hour, etc....
  
 
This is the way that the Recon Script Task looks once it has been created:
 
This is the way that the Recon Script Task looks once it has been created:
  
 
<br>
 
 
<center>
 
<center>
 
[[image:Quick_mon_14.png]]
 
[[image:Quick_mon_14.png]]
 
</center>
 
</center>
<br>
 
  
And this is the look that the Recon Script Task in execution has:
+
And this is how the Recon Script Task looks in execution:
  
 
<br>
 
 
<center>
 
<center>
 
[[image:Quick_mon_13.png]]
 
[[image:Quick_mon_13.png]]
 
</center>
 
</center>
<br>
+
 
  
 
== Agent Monitoring through monitoring templates and massive operations==
 
== Agent Monitoring through monitoring templates and massive operations==
Line 202: Line 165:
 
==Agent Monitoring through Policies==
 
==Agent Monitoring through Policies==
  
'''Not written yet'''
+
 
 +
To manage monitoring on a massive scale with software agents installed, we can make use of policies. This is an '''Enterprise''' feature.
 +
 
 +
Firstly, the software agents must be installed with ''remote_config'' enabled, otherwise execution modules cannot be created.
 +
 
 +
  remote_config 1
 +
 
 +
Next, navigate to the ''Add policy'' section and create a new policy, filling out some of the informative parameters, such as name, group and description;
 +
 
 +
[[image:policy1.JPG|800px]]
 +
 
 +
From here, navigate to the section for creating new modules in the policies and create a new local module (''dataserver module''):
 +
 
 +
[[image:policy2.JPG|800px]]
 +
 
 +
Once you have created the modules you need, which can be of local (''dataserver module'') or remote execution, you can start adding as many agents as you need to the policy. To do this, navigate to the corresponding tab in your policy and move the agents to the section "Agents in policy":
 +
 
 +
 
 +
[[image:policy3.JPG|800px]]
 +
 
 +
Once the agents are added, apply the changes made in ''Queue''. Apply all the changes and wait for the progress bar to be completed.
 +
 
 +
 
 +
[[image:policy4.JPG|800px]]
 +
 
 +
Once it's done, all the modules created in the policy are deployed on the selected agents.
 +
 
 +
Policies allow us to not only add modules to groups of agents, but to also include other kinds of elements as alerts, archive collections, plugins, etc.Furthermore, any modifications you make to the policies, like modifying thresholds on a module, are automatically inherited by all the agents included in that policy once it is applied.
  
 
==Agent Monitoring using Customized Scripts==
 
==Agent Monitoring using Customized Scripts==
  
This is an ''advanced'' way to monitor high sistem volumes, similar between them, in a way completely "ad-hoc". To do this you should have tools that already exist that give you information about your systems. Some examples can be:
+
This is an ''advanced'' way to monitor large numbers of similar systems in a completely "ad-hoc" way. To do this you need pre-existing tools that give you information about your systems. Some examples are:
  
* Scripts that it has that gives information about remote systems.
+
* Scripts you already have that give information about remote systems.
  
* Other monitoring systems already working that generate data that could be reused.
+
* Other monitoring systems already in use that generate data that could be reused.
  
* Small checks that are similar for a group of XXX machines, but that don't return a single data but several simultaneously. If they would return data one by one, it could reuse them as plugins for the remote server.
+
* Small checks that are similar for a group of X machines, but that return several values at once instead of a single one. If they returned values one at a time, they could be reused as plugins for the remote server.
  
The philosophy is simple: it uses an script to generate the agent XML headers, writing the agent name that you want and filling in the module data through an script, external, that it executes as argument. This external script should generate the correct data with the Pandora XML format (extremely simple!). The main script will close the XML and moves it to the standard path to process the SML data files (/var/spool/pandora/data_in).Program the script through CRON.You have more information about the XML format that Pandora FMS uses to report the data. Please, check our technical annexes.
+
The idea is simple: a main script generates the agent's XML headers, writing the agent name that you want, and fills in the module data by running an external script that it receives as an argument. This external script should generate the data in the Pandora XML format (extremely simple!). The main script then closes the XML and moves it to the standard path where XML data files are processed (/var/spool/pandora/data_in). Schedule the script with cron. For more information about the XML format that Pandora FMS uses to report data, check our technical annexes.
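As an illustration of what the external script could print, here is a minimal sketch. It assumes the usual module layout of the Pandora XML format (a name, a type and a data value inside a module element); the helper name print_module and the sample module are hypothetical, so check the technical annexes for the authoritative field list.

```shell
#!/bin/sh
# Hypothetical helper: prints one module block in the assumed Pandora XML layout.
# $1 = module name, $2 = module type (e.g. generic_data), $3 = value
print_module() {
    printf '<module>\n'
    printf '<name><![CDATA[%s]]></name>\n' "$1"
    printf '<type>%s</type>\n' "$2"
    printf '<data><![CDATA[%s]]></data>\n' "$3"
    printf '</module>\n'
}

# Example: report the number of logged-in sessions as a numeric module.
print_module "Logged_Users" "generic_data" "$(who | wc -l | tr -d ' ')"
```

The main script only has to wrap this output with the agent header and footer, so the external script never needs to know the agent name.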
  
  
 
'''Remote agent Script'''
 
'''Remote agent Script'''
  
There is an small script at ''/usr/share/pandora_server/util/pandora_remote_agent.sh'' that has two parameters
+
There is a small script at ''/usr/share/pandora_server/util/pandora_remote_agent.sh'' that takes two parameters:
  
  
Line 225: Line 215:
 
  -f <script file it'll execute>
 
  -f <script file it'll execute>
  
This way if you have an script as ''/tmp/sample_remote.sh'' that contains:
+
This way, if you have a script such as ''/tmp/sample_remote.sh'' that contains:
  
 
<pre>
 
<pre>
Line 247: Line 237:
 
echo "</module>"
 
echo "</module>"
  
# Another script with returns XML
+
# Another script that returns XML
 
EXT_FILE=/tmp/myscript.sh
 
EXT_FILE=/tmp/myscript.sh
  
Line 257: Line 247:
 
</pre>
 
</pre>
  
It could generates a complete XML with the agent name "agent_test" executing the remote agent script in the following way:
+
You can generate a complete XML for the agent named "agent_test" by executing the remote agent script in the following way:
  
 
  /usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test -f /tmp/sample_remote.sh
 
  /usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test -f /tmp/sample_remote.sh
  
  
Supposing you want to execute the same script against XX machines.You should pass some data, like: user, IP, password to the same script:
+
Suppose you want to execute the same script against X machines. You should pass some data, e.g. user, IP and password, to the script:
  
 
  /usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test -f "/tmp/sample_remote.sh 192.168.50.1"
 
  /usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test -f "/tmp/sample_remote.sh 192.168.50.1"
  
You should have to parametrize the script /tmp/sample_remote.sh to get the command line parameters and use them well.
+
You have to parametrize the script /tmp/sample_remote.sh so that it reads the command line parameters and uses them correctly.
  
'''Programming the script with cron'''
+
'''Programming the script with Cron'''
  
 
Imagine that you have 10 machines monitored like this:
 
Imagine that you have 10 machines monitored like this:
Line 289: Line 279:
 
  -*/5 * * * *  root /tmp/my_remote_mon.sh
 
  -*/5 * * * *  root /tmp/my_remote_mon.sh
  
This will do that this script will execute in the system each 5 minutes. You can start adding machines to the script.
+
This makes the script run on the system every 5 minutes. You can start adding machines to the script.
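One possible way to keep such a wrapper easy to extend is to read "agent name, IP" pairs and build one command per machine. The sketch below is hypothetical and is not the original /tmp/my_remote_mon.sh; the function name build_commands and the host list are assumptions, and it only prints the commands (a dry run) instead of executing them.

```shell
#!/bin/sh
# Hypothetical wrapper sketch (not the original /tmp/my_remote_mon.sh).
# Paths default to the ones used in this guide and can be overridden.
REMOTE_AGENT="${REMOTE_AGENT:-/usr/share/pandora_server/util/pandora_remote_agent.sh}"
CHECK_SCRIPT="${CHECK_SCRIPT:-/tmp/sample_remote.sh}"

# Reads "agent_name ip" pairs from stdin and prints one command line per pair.
build_commands() {
    while read -r agent ip; do
        # Skip blank lines and comment lines.
        case "$agent" in
            ''|'#'*) continue ;;
        esac
        printf '%s -a %s -f "%s %s"\n' "$REMOTE_AGENT" "$agent" "$CHECK_SCRIPT" "$ip"
    done
}

# Dry run with two illustrative machines.
build_commands <<EOF
agent_test 192.168.50.1
agent_test2 192.168.50.2
EOF
```

Piping the output to sh would run the commands; adding a machine then means adding one line to the list.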
 +
 
 +
 
 +
 
 +
{{Tip|If you want more information about system monitoring and its advantages and the processes to follow for correct monitoring, visit our
 +
[https://blog.pandorafms.org/computer-system-monitoring/ system monitoring blog]}}
 +
 
  
  
[[Pandora:QuickGuides_ES|Volver a Índice de Guías Rápidas de Pandora FMS]]
+
[[Pandora:QuickGuides_ES|Return to the Pandora FMS Quick Guides index]]
  
 
[[Category:Pandora FMS]]
 
[[Category:Pandora FMS]]
 
[[Category:English]]
 
[[Category:English]]

Latest revision as of 11:04, 14 July 2017


This work is under development (not translated!)

 


1 Introduction

This guide aims to show how to quickly and efficiently manage a large number of machines (5, 10... or 500) using the Pandora FMS features designed for this purpose. The document is divided into four parts:


  • Network device monitoring, using Recon Server and templates.
  • SNMP network device monitoring, using Recon Script SNMP.
  • Agent monitoring, using policies (only Enterprise).
  • Remote monitoring with customized scripts, using an agent generator via XML.

2 Network device Monitoring, using Recon Server and templates

Situation

We need to monitor two hundred servers, twenty switches and ten routers, and we can't configure them one by one. The "general" monitoring is very easy, but there is neither much time nor the possibility of installing agents on the machines.

Solution

Pandora FMS will detect the systems and apply a different template to each one depending on whether it is a switch, a router or a server. The templates contain remote checks that are applied as soon as the kind of machine is detected.


How long will it take?

A class C network (up to 254 hosts) can be scanned in less than one minute using version 6.0. Applying a monitoring standard to the detected machines is almost immediate, so you could have those 230 machines completely configured in less than ten minutes.


2.1 Step 1. Defining monitoring profiles

First we are going to define a monitoring template, which is called a "Module Template" in Pandora FMS. Go to the following menu:

Quick mon 1.png

Here we see some pre-defined profiles that contain generic checks. We are going to edit one of them ("Linux Server"), a profile that is useful for monitoring generic Linux servers remotely.

Quick mon 2.png

Quick mon 3.png

As you can see in the screenshot, this profile has some basic TCP checks, for example "Check SSH Server", a basic ICMP check ("Host Alive") and different SNMP modules that use the Linux MIB, which make up the rest of the checks.


These "template" checks are defined in the Pandora FMS basic module library that comes with the simple installation and which contains generic module definitions.


A generic module has no IP value of its own, because the IP is auto-assigned from the agent's IP when the module is applied. The rest of the fields (e.g. thresholds, SNMP community) are defaults and are applied to all the agents that have a template with this module. If you want to modify one of them (e.g. to change the community), you need to change it on the agents one by one, or use the massive change tool.

Now that we know what monitoring templates and generic modules for templates (network components) are, we can look at some of the other templates, specifically the one for generic WMI monitoring and the one for basic monitoring.

The first has three WMI modules for Windows. These modules need to be customized, by editing either the original component or the generated modules, because they need a username and password with permissions to perform remote WMI queries.

The second one only has a basic check for ICMP connectivity. We can add other basic checks, as you can see in the following screenshot:

Quick mon 5.png


2.2 Step 2. Using a Network task with a Recon Server

Now we have three basic monitoring profiles: Linux, Windows and network.

Supposing that we have to monitor all the machines in a network group, for example:

  • 192.168.50.0/24 for servers.
  • 192.168.50.0/24,192.168.1.0/24 for communications.

We want Pandora FMS to identify all the machines on those networks and, depending on their OS, apply one template or another. When the switches are of several different brands and models, another way of doing it is to "identify" them through a standard rule based on whether a given port is open. For example, machines with port 23 (telnet) open are identified as generic network devices (switches, routers).
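The selection rule described above can be sketched as a small shell function. This is only an illustration of the decision logic, not how the Recon Server is implemented; the function name choose_template and the labels it prints are hypothetical.

```shell
#!/bin/sh
# Hypothetical decision helper mirroring the rule described in the text.
# $1 = detected OS ("windows", "linux" or anything else)
# $2 = space-separated list of open TCP ports on the host
choose_template() {
    os="$1"
    open_ports="$2"
    case "$os" in
        windows) echo "windows"; return 0 ;;
        linux)   echo "linux";   return 0 ;;
    esac
    # OS not recognised: fall back to the open-port rule (23 = telnet).
    for port in $open_ports; do
        if [ "$port" = "23" ]; then
            echo "network"
            return 0
        fi
    done
    # No rule matched, so the host is ignored.
    echo "none"
}

choose_template linux "22 80"      # prints "linux"
choose_template unknown "23 161"   # prints "network"
```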

Go to the recon servers section to create a new one:

Quick mon 6.png

We're going to create one that looks for Windows servers and registers them, applying the Windows monitoring template:

Quick mon 7.png

Here we can see that Windows is selected in the "OS" field. That means this monitoring profile is only applied to the machines detected as running Windows; all others are ignored.

Since OS detection isn't 100% reliable (it depends on the services running on the machine), it is possible to select another method, for example filtering by a specific port.

This way, the template is applied to all the machines that have this port open. The following example shows another task that uses a port filter instead of an OS filter to apply the generic network device monitoring:

Quick mon 8.png

To specify two networks, separate them by commas: 192.168.50.0/24,192.168.1.0/24
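If you build the network list from a script, a quick sanity check can catch a missing prefix length before you paste the value into the task. The following sketch is hypothetical (the function name split_networks is not part of Pandora FMS); it splits the comma-separated value and prints one network per line.

```shell
#!/bin/sh
# Hypothetical helper: splits a comma-separated network list, one per line,
# and fails if an entry has no "/prefix" part.
split_networks() {
    old_ifs="$IFS"
    IFS=','
    # Left unquoted on purpose so the shell splits the value on commas.
    set -- $1
    IFS="$old_ifs"
    for net in "$@"; do
        case "$net" in
            */*) printf '%s\n' "$net" ;;
            *)   printf 'missing prefix length: %s\n' "$net" >&2
                 return 1 ;;
        esac
    done
}

split_networks "192.168.50.0/24,192.168.1.0/24"
```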

Finally, we'll configure the Linux one in a similar way. When you are finished defining the three groups it should look like this:

Quick mon 9.png

Once the recon tasks are defined, they start on their own, but let's check their status and force them to start if necessary. To do this, click on the eye icon to go to the Recon server operation view.

Quick mon 10.png

By default, the recon server has one execution thread, so it only executes one task at a time. The rest wait for the active exploration task to end. This can be changed in the server configuration file (pandora_server.conf). You can force an exploration task to start by pressing the circular green icon at the left of the task.

This makes the Recon server search for new machines that don't yet exist in the active monitoring. If it finds any, it registers them automatically (trying to resolve the name, if this option is activated) and assigns them all the modules contained in the profile.

Be aware that some of the modules assigned through a profile may make no sense, or may not be correctly configured, for a specific agent. On this agent, a Linux system has been detected correctly, but the server doesn't have SNMP, so none of the SNMP modules are reporting. Since they never retrieved any data, they are in a state known as "Non-init" (not initialized). The next time the database maintenance script runs, they will be deleted automatically:

Quick mon 11.png


3 SNMP Network Device Monitoring, using SNMP Recon Script

In this case, we need to monitor an SNMP device with many interfaces automatically and in depth, retrieving the status of each interface, its traffic, its error rate, etc.

To do this, we're going to use a system known as Recon Script. It's a modular system that allows you to execute complex actions in a single script. Pandora FMS includes a script to detect this kind of SNMP device.

To execute it, create a network task, like this:

Quick mon 12.png

In the "first field", write the target network or networks. In the "second field", write the SNMP community to use when exploring these devices. In the "third field", write any optional parameters. In this case, -n makes the script also register the interfaces that are down; by default it only registers active interfaces.

This script will register the interfaces that didn't exist previously and that now are active on each machine, in each execution. So if new interfaces are started up, it detects them. We can program the network tasks so they are executed once a day, once an hour, etc....

This is the way that the Recon Script Task looks once it has been created:

Quick mon 14.png

And this is how the Recon Script Task looks in execution:

Quick mon 13.png


4 Agent Monitoring through monitoring templates and massive operations

Not written yet

5 Agent Monitoring through Policies

To manage monitoring on a massive scale with software agents installed, we can make use of policies. This is an Enterprise feature.

First, the software agents must be installed with remote_config enabled; otherwise, execution modules cannot be created.

  remote_config 1
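
Before adding agents to a policy, it may help to confirm that the token is really active on each machine. The following is a minimal sketch, assuming the agent configuration file is /etc/pandora/pandora_agent.conf (adjust the path to your installation); the function name remote_config_enabled is hypothetical and not part of Pandora FMS:

```shell
#!/bin/bash

# Return 0 if the given agent configuration file contains an
# uncommented "remote_config 1" line, non-zero otherwise.
remote_config_enabled() {
    local conf_file="$1"
    grep -Eq '^[[:space:]]*remote_config[[:space:]]+1[[:space:]]*$' "$conf_file"
}

# Assumed default location of the agent configuration file.
CONF="${CONF:-/etc/pandora/pandora_agent.conf}"

# Report the state only if the file can be read on this machine.
if [ -r "$CONF" ]; then
    if remote_config_enabled "$CONF"; then
        echo "remote_config is enabled in $CONF"
    else
        echo "remote_config is NOT enabled in $CONF"
    fi
fi
```

A commented line such as "# remote_config 1" does not match, so an agent that only has the example line from the default file is reported as not enabled.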

Next, navigate to the Add policy section and create a new policy, filling out some of the informative parameters, such as name, group and description:

Policy1.JPG

From here, navigate to the section for creating new modules in the policies and create a new local module (dataserver module):

Policy2.JPG

Once you have created the modules you need, which can be of local (dataserver module) or remote execution, you can start adding as many agents as you need to the policy. To do this, navigate to the corresponding tab in your policy and move the agents to the section "Agents in policy":


Policy3.JPG

Once the agents are added, go to the Queue section, apply all the pending changes and wait for the progress bar to complete.


Policy4.JPG

Once it's done, all the modules created in the policy are deployed on the selected agents.

Policies allow us not only to add modules to groups of agents, but also to include other kinds of elements such as alerts, archive collections, plugins, etc. Furthermore, any modifications you make to a policy, like changing the thresholds of a module, are automatically inherited by all the agents included in that policy once it is applied.

6 Agent Monitoring using Customized Scripts

This is an advanced way to monitor large numbers of systems that are similar to each other, in a completely "ad-hoc" way. To do this you should have pre-existing tools that give you information about your systems. Some examples are:

  • Scripts already installed which give information about remote systems.
  • Other monitoring systems already in use which generate data that could be recycled.
  • Small checks that are similar for a group of X machines and that return several pieces of data at once rather than a single value. Checks that return data one piece at a time could instead be reused as plugins for the remote server.

The idea is simple: a main script generates the agent's XML header with the agent name you want, and fills in the module data by running an external script that it receives as an argument. This external script must print its data in the Pandora XML format, which is extremely simple. The main script then closes the XML and moves it to the standard path where XML data files are processed (/var/spool/pandora/data_in). Schedule the script through cron. For more information about the XML format that Pandora FMS uses to report data, check our technical annexes.
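
The flow described above can be sketched roughly as follows. This is an illustrative sketch only, not the code of the script shipped with Pandora FMS: the function names build_agent_xml and write_agent_xml are hypothetical, and the attributes of the agent_data element, the timestamp format and the ".data" file name pattern are assumptions that should be checked against the XML format documentation for your version.

```shell
#!/bin/bash

# Print a complete agent XML document on standard output.
#   $1     agent name
#   $2...  command (and its arguments) that prints the <module> blocks
build_agent_xml() {
    local agent_name="$1"
    shift
    local timestamp
    timestamp=$(date '+%Y/%m/%d %H:%M:%S')

    # Header: attribute names and values are assumptions (see above).
    echo "<?xml version='1.0' encoding='UTF-8'?>"
    echo "<agent_data agent_name='$agent_name' timestamp='$timestamp' version='1.0' os_name='Other' interval='300'>"
    # Module data printed by the external command.
    "$@"
    # Close the XML.
    echo "</agent_data>"
}

# Build the XML in a temporary file, then move it into the incoming
# directory once it is complete.
#   $1     incoming directory (e.g. /var/spool/pandora/data_in)
#   $2     agent name
#   $3...  command that prints the <module> blocks
write_agent_xml() {
    local data_in="$1" agent_name="$2"
    shift 2
    local tmp_file
    tmp_file=$(mktemp) || return 1
    build_agent_xml "$agent_name" "$@" > "$tmp_file"
    # mktemp creates the file with mode 600; make it readable by the server.
    chmod 644 "$tmp_file"
    mv "$tmp_file" "$data_in/$agent_name.$(date +%s).data"
}
```

With these functions loaded, a call such as write_agent_xml /var/spool/pandora/data_in agent_test /tmp/myscript.sh would write one XML file for the agent "agent_test", using whatever module blocks /tmp/myscript.sh prints.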


Remote agent Script

There is a small script at /usr/share/pandora_server/util/pandora_remote_agent.sh that takes two parameters:


-a <agent name>
-f <script file it'll execute>

So, if you have a script such as /tmp/sample_remote.sh that contains:

#!/bin/bash

# 1 if the host answers a single ping without packet loss, 0 otherwise
PING=$(ping -c 1 192.168.50.1 | grep -c " 0% packet loss")

echo "<module>"
echo "<name>Status</name>"
echo "<type>generic_proc</type>"
echo "<data>$PING</data>"
echo "</module>"

# 1 if the uptime (in hundredths of a second) is at least 24 hours, 0 otherwise
ALIVE=$(snmpget -Ot -v 1 -c artica06 192.168.70.100 DISMAN-EVENT-MIB::sysUpTimeInstance | awk '{ print ($3 >= 8640000) }')

echo "<module>"
echo "<name>Alive_More_than_24Hr</name>"
echo "<type>generic_proc</type>"
echo "<data>$ALIVE</data>"
echo "</module>"

# Another script that prints more XML modules
EXT_FILE=/tmp/myscript.sh

if [ -x "$EXT_FILE" ]
then
	"$EXT_FILE"
fi

You can generate a complete XML file for the agent "agent_test" by executing the remote agent script in the following way:

/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test -f /tmp/sample_remote.sh


Suppose you want to execute the same script against X machines. You can pass the data it needs (e.g. user, IP and password) as arguments to the script:

/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test -f "/tmp/sample_remote.sh 192.168.50.1"

You have to parametrize the script /tmp/sample_remote.sh to get the command line parameters and use them correctly.
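
As a rough illustration of that parametrization, the script could read the target IP from its first command line argument. This is a minimal sketch that only covers the ping check; the helper names emit_module and ping_status are hypothetical:

```shell
#!/bin/bash
# Hypothetical parametrized version of /tmp/sample_remote.sh.
# Usage: sample_remote.sh <target IP>

# Print one module block with the given name, type and data.
emit_module() {
    local name="$1" type="$2" data="$3"
    echo "<module>"
    echo "<name>$name</name>"
    echo "<type>$type</type>"
    echo "<data>$data</data>"
    echo "</module>"
}

# Print 1 if the host answers a single ping without packet loss, 0 otherwise.
ping_status() {
    local target="$1"
    ping -c 1 "$target" 2>/dev/null | grep -c " 0% packet loss"
}

# Run the check only when a target IP is given on the command line.
if [ "$#" -ge 1 ]; then
    emit_module "Status" "generic_proc" "$(ping_status "$1")"
else
    echo "Usage: sample_remote.sh <target IP>" >&2
fi
```

Further arguments, such as a user or an SNMP community, could be read in the same way from "$2", "$3" and so on.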

Scheduling the script with Cron

Imagine that you have 10 machines monitored like this:


/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test1 -f "/tmp/sample_remote.sh 192.168.50.1"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test2 -f "/tmp/sample_remote.sh 192.168.50.2"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test3 -f "/tmp/sample_remote.sh 192.168.50.3"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test4 -f "/tmp/sample_remote.sh 192.168.50.4"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test5 -f "/tmp/sample_remote.sh 192.168.50.5"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test6 -f "/tmp/sample_remote.sh 192.168.50.6"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test7 -f "/tmp/sample_remote.sh 192.168.50.7"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test8 -f "/tmp/sample_remote.sh 192.168.50.8"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test9 -f "/tmp/sample_remote.sh 192.168.50.9"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test10 -f "/tmp/sample_remote.sh 192.168.50.10"


Put all these lines in a new script, e.g. "/tmp/my_remote_mon.sh", give it execution permissions and add the following line to the system crontab (/etc/crontab), which is the crontab format that includes the user field:

*/5 * * * *   root /tmp/my_remote_mon.sh

This will make the script run on the system every 5 minutes. To monitor more machines, add more lines to the script.
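
When the machines follow a regular naming and addressing pattern, as in the example above, the repeated lines could be replaced by a loop. The following is a minimal sketch under that assumption; the function name monitor_range and the REMOTE_AGENT variable are hypothetical:

```shell
#!/bin/bash

# Path of the remote agent script; can be overridden from the environment.
REMOTE_AGENT="${REMOTE_AGENT:-/usr/share/pandora_server/util/pandora_remote_agent.sh}"

# Run the remote agent script once per machine in a numeric range.
#   $1  network prefix, e.g. 192.168.50
#   $2  first host number
#   $3  last host number
monitor_range() {
    local prefix="$1" first="$2" last="$3" i
    for ((i = first; i <= last; i++)); do
        "$REMOTE_AGENT" -a "agent_test$i" -f "/tmp/sample_remote.sh $prefix.$i"
    done
}
```

Adding the call monitor_range 192.168.50 1 10 at the end of the script would run the same ten commands listed above.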



If you want more information about system monitoring and its advantages and the processes to follow for correct monitoring, visit our system monitoring blog

 


