Pandora: QuickGuides EN: Fast deployment

From Pandora FMS Wiki
Revision as of 11:46, 22 February 2016 by Asier (talk | contribs) (Network devices Monitoring, using Recon Server and templates)



This work is under development (not translated!)


1 Introduction

This guide shows how to administer a large number of machines (5, 10... 500) quickly and efficiently, using the Pandora FMS features designed for this purpose. The document is divided into four parts:

  • Network device monitoring, using Recon Server and templates.
  • SNMP network device monitoring, using Recon Script SNMP.
  • Agent monitoring, using policies (only Enterprise).
  • Remote monitoring with customized scripts, using an agent generator via XML.

2 Network device Monitoring, using Recon Server and templates

Situation: We have to monitor two hundred servers, twenty switches and ten routers, and we can't configure them one by one. "General" monitoring is easy to set up, but we have neither much time nor the possibility of installing agents on the machines.


Pandora FMS will detect the systems and apply a different template depending on whether each one is a switch, a router or a server. The templates contain remote checks that are applied once the kind of machine has been detected.

How long will it take?

A class C network (254 usable host addresses) is scanned in less than one minute using version 4.0. Applying a monitoring standard to the detected machines is almost immediate, so you could have those 230 machines completely configured in less than ten minutes.
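As a rough sanity check on the size of a scan, the number of usable addresses in an IPv4 network follows from its prefix length: a class C sized network (/24) has 256 addresses, minus the network and broadcast addresses. The sketch below is only illustrative; `usable_hosts` is a hypothetical helper and not part of Pandora FMS.

```shell
#!/bin/sh
# usable_hosts PREFIX: print the number of usable host addresses in an
# IPv4 network with the given prefix length (meaningful for 1 to 30).
usable_hosts() {
    prefix=$1
    # Total addresses minus the network and broadcast addresses.
    echo $(( (1 << (32 - prefix)) - 2 ))
}

usable_hosts 24    # class C sized network
```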

2.1 Step 1. Defining monitoring profiles

First we define the monitoring template, which is called a "Module Template" in Pandora FMS. To do so, go to the following menu:

Quick mon 1.png

Here we see some predefined profiles that contain generic checks. We are going to edit one of them, the Linux Server one, which is a profile for monitoring generic Linux servers remotely.

Quick mon 2.png

Quick mon 3.png

As you can see in the screenshot above, this profile has several modules: some basic TCP checks such as "Check SSH Server", a basic ICMP check ("Host Alive"), and several SNMP modules that use the Linux MIB and make up the rest of the checks.

These "template" checks are defined in the basic Pandora FMS module library that comes with a standard installation.

The basic modules are called "Network components". They are generic module definitions, which can be viewed and edited at Administration -> Modules -> Network components. Here we can see a generic module called "Sysname" that retrieves the system name via SNMP.

Quick mon 4.png

The IP address isn't defined in this module, because it is filled in automatically from the agent's IP address when the module is applied. The rest of the fields (e.g. thresholds, SNMP community) are defaults and will be applied to all the agents that have a template with this module. If we want to modify them (e.g. to change the community) we'll need to change them one by one on the agents, or use the massive change tool.
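To make the idea of a generic definition concrete, here is a minimal sketch of the manual equivalent of such a check: the community and SNMP version are fixed defaults, and only the agent address changes. The default values, the `sysname_command` helper and the 192.0.2.x documentation addresses are assumptions for illustration; this is not how the server itself is implemented.

```shell
#!/bin/sh
# Defaults that play the role of the component's "default" fields
# (assumed values, chosen only for this example).
SNMP_VERSION=1
SNMP_COMMUNITY=public

# sysname_command AGENT_IP: print the snmpget command line that an
# equivalent manual check would run against the agent at AGENT_IP.
sysname_command() {
    echo "snmpget -v $SNMP_VERSION -c $SNMP_COMMUNITY $1 SNMPv2-MIB::sysName.0"
}

# The same definition applied to two agents: only the address differs.
for ip in 192.0.2.10 192.0.2.11; do
    sysname_command "$ip"
done
```

Changing `SNMP_COMMUNITY` here changes the command for every agent, which mirrors why a per-agent exception has to be edited on the agent itself.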

Now that we know what a monitoring template and a network component (a generic module used by templates) are, we can look at two of the other templates: the one for generic WMI monitoring and the one for basic monitoring.

The first one has three WMI modules for Windows. These modules need to be customized, either by editing the original component or by editing the generated modules, because they need a username and password with permission to perform remote WMI queries.

The second one has only a basic ICMP connectivity check. We can add other basic checks, such as service verifications for HTTP, FTP, SMTP, etc., as you can see in the following screenshot:

Quick mon 5.png

2.2 Step 2. Using a Network task with a Recon Server

Now we have three basic monitoring profiles: Linux, Windows and network.

Supposing that we have to monitor all the machines in a network group, for example:

  • for servers.
  •, for communications.

We want Pandora FMS to identify all the machines on those networks and apply one template or another depending on their OS. The switches are a different case, because they could be of several brands and models. Here we have to "identify" them through a standard procedure based on whether or not a given port is open, e.g. machines with port 23 (telnet) open are identified as generic network devices (switches, routers).
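The port-based identification can be pictured with a small sketch. It assumes a netcat binary that supports `-z` (connect without sending data) and `-w` (timeout in seconds); the function names and the template names "network" and "other" are hypothetical, and the Recon Server does its own detection internally.

```shell
#!/bin/sh
# port_open HOST PORT: succeed if a TCP connection to HOST:PORT can be opened.
# Assumes a netcat that supports -z and -w.
port_open() {
    nc -z -w 2 "$1" "$2" >/dev/null 2>&1
}

# classify_ports "PORT PORT ...": choose a template from a list of open ports.
# Only the port 23 rule comes from the text; the names are hypothetical.
classify_ports() {
    case " $1 " in
        *" 23 "*) echo "network" ;;   # telnet open: generic switch or router
        *)        echo "other"   ;;
    esac
}

# open_ports HOST: print which ports from a short list are open on HOST.
open_ports() {
    found=""
    for port in 22 23 80; do
        if port_open "$1" "$port"; then
            found="$found $port"
        fi
    done
    echo "$found"
}
```

A host would then be classified with `classify_ports "$(open_ports 192.0.2.1)"`, where 192.0.2.1 is a documentation address.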

Go to the Recon Server section to create a new task:

Quick mon 6.png

We're going to create one that finds and registers Windows servers and applies the Windows monitoring profile to them:

Quick mon 7.png

Here we can see that Windows is selected in the "OS" (operating system) field. That means this monitoring profile will only be applied to machines detected as running Windows; any other machine is ignored.

Since OS detection isn't 100% reliable, another method can be selected instead, for example filtering by a specific port.

This way, the template is applied to all the machines that have this port open. The following example shows another task that uses a port filter instead of an OS filter to apply the generic network device monitoring:

Quick mon 8.png

Note how two networks are specified: they are written together in the same field, separated by a single space.

Finally, we'll configure the Linux task in a similar way. Once the three tasks are defined, the list should look like this:

Quick mon 9.png

Once the recon tasks are defined they will start on their own, but let's check their status and force them to start if necessary. To do this, click on the eye icon to go to the Recon Server operation view.

Quick mon 10.png

By default, the recon server has one thread, so it can only execute one task at a time. The remaining tasks wait for the active exploration task to end. We can force an exploration task by pressing the circular green icon to the left of the task.

This will make the Recon server search for new machines that don't exist in the active monitoring scheme. If it finds them, it will register them automatically (trying to resolve the name, if we've activated this option) and assign all the modules that were contained in the profile to it.

We should be aware that some of the modules assigned by a profile may make no sense, or may not be correctly configured, for a specific agent. On this agent, a Linux system was correctly detected, but the server has no SNMP, so the SNMP modules are not reporting. Since they could not retrieve data even on their first execution, they remain in a state known as "non-init" (not initialized). The next time the database maintenance script runs, they will be deleted automatically:

Quick mon 11.png

3 SNMP Network Devices Monitoring, using SNMP Recon Script

This case is almost the same as the previous one. Here we need to monitor, automatically and in depth, an SNMP device with a lot of interfaces: the state of each interface, the traffic on each one, the error rate, etc.

To do this, we're going to use a system known as Recon Script. It's a modular system that allows complex actions to be executed from one script. Pandora FMS ships with a script that detects this kind of SNMP device.

We create a network task like this:

Quick mon 12.png

In the first field we write the destination network or networks. In the second field we write the SNMP community to use when exploring these devices. In the third field we write optional parameters. In this case, -n makes it also register interfaces that are down; by default it only registers active interfaces.

On each execution, this script registers the interfaces that are now active on each machine and were not registered previously, so interfaces that are brought up later will be detected. We can schedule the network task to run once a day or once an hour.
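The selection of interfaces can be illustrated with a short sketch that filters the output of a standard `snmpwalk` over IF-MIB::ifOperStatus. This is not the recon script shipped with Pandora FMS; `list_interfaces` is a hypothetical helper, and its `-n` flag only imitates the behaviour described above.

```shell
#!/bin/sh
# list_interfaces [-n]: read "snmpwalk ... IF-MIB::ifOperStatus" output on
# stdin and print the interface indexes to register, one per line.
# Without -n only interfaces reported as up(1) are printed; with -n all are.
list_interfaces() {
    include_down=0
    if [ "${1:-}" = "-n" ]; then
        include_down=1
    fi
    awk -v all="$include_down" '
        /ifOperStatus\./ {
            idx = $1
            sub(/.*\./, "", idx)     # keep only the trailing interface index
            if (all == 1 || $0 ~ /up\(1\)/) print idx
        }'
}
```

A typical call would be `snmpwalk -v 1 -c public 192.0.2.1 IF-MIB::ifOperStatus | list_interfaces`, where the community and address are placeholders. Running it periodically and comparing against the already registered indexes gives the "new interfaces only" behaviour.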

This is how the Recon Script task looks once it has been created:

Quick mon 14.png

And this is how the Recon Script task looks while it is running:

Quick mon 13.png

4 Agent Monitoring through monitoring templates and massive operations

Not written yet

5 Agent Monitoring through Policies

Not written yet

6 Agent Monitoring using Customized Scripts

This is an advanced way to monitor a large number of similar systems in a completely ad hoc way. To do this you need existing tools that already give you information about your systems. Some examples:

  • Scripts you already have that give information about remote systems.
  • Other monitoring systems already in operation that generate data which could be reused.
  • Small checks that are similar for a group of XXX machines, but that return several values at once rather than a single one. If they returned one value each, they could be reused as plugins for the remote server.

The philosophy is simple: a main script generates the agent XML header, with the agent name that you choose, and fills in the module data by running an external script that it receives as an argument. This external script must generate its data in the Pandora XML format (extremely simple!). The main script then closes the XML and moves it to the standard path where XML data files are processed (/var/spool/pandora/data_in). Schedule the script with cron. For more information about the XML format that Pandora FMS uses to report data, check our technical annexes.
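The flow described above can be sketched as follows. This is a simplified illustration, not the script shipped with Pandora FMS: the function names, the `PANDORA_DATA_IN` variable, the attribute set of the `<agent_data>` element and the `.data` file naming are assumptions that should be checked against the technical annexes.

```shell
#!/bin/sh
# build_agent_xml AGENT_NAME MODULE_SCRIPT [ARGS...]
# Print a complete data file on stdout: XML header, the module blocks
# written by MODULE_SCRIPT, and the closing tag.
# The attributes used here are assumed; check the technical annexes.
build_agent_xml() {
    agent=$1
    shift
    timestamp=$(date '+%Y/%m/%d %H:%M:%S')
    echo "<?xml version='1.0' encoding='UTF-8'?>"
    echo "<agent_data agent_name='$agent' timestamp='$timestamp' version='1.0' os_name='Other' interval='300'>"
    "$@"        # the external script writes the <module> blocks
    echo "</agent_data>"
}

# write_agent_xml AGENT_NAME MODULE_SCRIPT [ARGS...]
# Build the file under a temporary name, then rename it, on the assumption
# that the server only picks up files ending in .data.
write_agent_xml() {
    spool=${PANDORA_DATA_IN:-/var/spool/pandora/data_in}
    tmp=$(mktemp "$spool/$1.XXXXXX") || return 1
    build_agent_xml "$@" > "$tmp" && mv "$tmp" "$spool/$1.$(date +%s).data"
}
```

Renaming within the same directory means the server never sees a half-written file, which is the reason for the two-step write.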

Remote agent Script

There is a small script at /usr/share/pandora_server/util/ that takes two parameters:

-a <agent name>
-f <script file it'll execute>

So if you have a script at /tmp/ that contains:


PING=`ping -c 1 | grep " 0% packet loss" | wc -l`

echo "<module>"
echo "<name>Status</name>"
echo "<type>generic_proc</type>"
echo "<data>$PING</data>"
echo "</module>"

ALIVE=`snmpget -Ot -v 1 -c artica06 DISMAN-EVENT-MIB::sysUpTimeInstance | awk '{ print $3>=8640000 }'`

echo "<module>"
echo "<name>Alive_More_than_24Hr</name>"
echo "<type>generic_proc</type>"
echo "<data>$ALIVE</data>"
echo "</module>"

# Another script that returns XML

if [ -e "$EXT_FILE" ]
then
    "$EXT_FILE"   # assumed completion: run the external script if it exists
fi

This generates a complete XML with the agent name "agent_test" when the remote agent script is executed in the following way:

/usr/share/pandora_server/util/ -a agent_test -f /tmp/

Suppose you want to execute the same script against XX machines. You then need to pass some data, such as the user, IP address and password, to the script:

/usr/share/pandora_server/util/ -a agent_test -f "/tmp/"

You will have to parametrize the script /tmp/ so that it reads the command-line parameters and uses them correctly.
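One possible way to parametrize such a module script is shown below: the target host arrives as the first command-line argument instead of being hard-coded. The helper names `print_module`, `check_host` and `main` are hypothetical, and the ping check uses the command's exit status rather than parsing its output.

```shell
#!/bin/sh
# print_module NAME TYPE DATA: emit one module block in Pandora XML format.
print_module() {
    echo "<module>"
    echo "<name>$1</name>"
    echo "<type>$2</type>"
    echo "<data>$3</data>"
    echo "</module>"
}

# check_host HOST: print 1 if HOST answers a single ping, 0 otherwise.
check_host() {
    if ping -c 1 "$1" >/dev/null 2>&1; then
        echo 1
    else
        echo 0
    fi
}

# main HOST: build the modules for the host given on the command line.
main() {
    host=${1:?usage: script HOST}
    print_module "Status" "generic_proc" "$(check_host "$host")"
}

# In a real script the last line would be:  main "$@"
```

Further parameters (user, password, SNMP community) would be read the same way, as `$2`, `$3` and so on inside `main`.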

Scheduling the script with cron

Imagine that you have 10 machines monitored like this:

/usr/share/pandora_server/util/ -a agent_test1 -f "/tmp/"
/usr/share/pandora_server/util/ -a agent_test2 -f "/tmp/"
/usr/share/pandora_server/util/ -a agent_test3 -f "/tmp/"
/usr/share/pandora_server/util/ -a agent_test4 -f "/tmp/"
/usr/share/pandora_server/util/ -a agent_test5 -f "/tmp/"
/usr/share/pandora_server/util/ -a agent_test6 -f "/tmp/"
/usr/share/pandora_server/util/ -a agent_test7 -f "/tmp/"
/usr/share/pandora_server/util/ -a agent_test8 -f "/tmp/"
/usr/share/pandora_server/util/ -a agent_test9 -f "/tmp/"
/usr/share/pandora_server/util/ -a agent_test10 -f "/tmp/"

Put all these lines in a new script, e.g. "/tmp/", give it execution permissions and add the following line to the system crontab (/etc/crontab), where the sixth field is the user that runs the command:

-*/5 * * * *   root /tmp/

This makes the script run every 5 minutes. You can then keep adding machines to the script.
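As the list of machines grows, the repeated lines can be generated from a plain text file instead of being written by hand. The sketch below prints one command per machine; `REMOTE_AGENT` and `MODULE_SCRIPT` are placeholder paths that must be replaced with the real ones, and the list file format ("agent_name ip_address" per line) is an assumption made for this example.

```shell
#!/bin/sh
# Placeholder paths: replace both with the real script locations.
REMOTE_AGENT=${REMOTE_AGENT:-/usr/share/pandora_server/util/REMOTE_AGENT_SCRIPT}
MODULE_SCRIPT=${MODULE_SCRIPT:-/tmp/MODULE_SCRIPT}

# print_commands LIST_FILE: LIST_FILE holds one "agent_name ip_address"
# pair per line; empty lines and lines starting with # are skipped.
# One command line per machine is printed on stdout.
print_commands() {
    while read -r agent ip; do
        case $agent in
            ''|'#'*) continue ;;
        esac
        echo "$REMOTE_AGENT -a $agent -f \"$MODULE_SCRIPT $ip\""
    done < "$1"
}
```

The output can be reviewed first and then executed with `print_commands machines.txt | sh`, so adding a machine only means adding a line to the list file.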
