Pandora: QuickGuides EN: Fast deployment

From Pandora FMS Wiki
Revision as of 12:21, 22 February 2016 by Asier (talk | contribs) (Agent Monitoring using Customized Scripts)



This work is under development (not translated!)

 


1 Introduction

This guide shows how to administer a large number of machines (5, 10... 500) quickly and efficiently, using the Pandora FMS features designed for this purpose. The document is divided into four parts:


  • Network device monitoring, using Recon Server and templates.
  • SNMP network device monitoring, using Recon Script SNMP.
  • Agent monitoring, using policies (only Enterprise).
  • Remote monitoring with customized scripts, using an agent generator via XML.

2 Network device Monitoring, using Recon Server and templates

Situation

We have to monitor two hundred servers, twenty switches and ten routers, and we can't configure them one by one. The "general" monitoring is very easy to set up, but we have neither much time nor the possibility of installing agents on the machines.

Solution

Pandora FMS will detect the systems and apply a different template depending on whether each one is a switch, a router or a server. The templates contain remote checks that are applied once the kind of machine has been detected.


How long will it take?

A class C network (up to 254 hosts) is scanned in less than one minute using version 4.0. Applying a monitoring template to the detected machines is almost immediate, so you could have those 230 machines completely configured in less than ten minutes.


2.1 Step 1. Defining monitoring profiles

First we are going to define the monitoring template, which is called a "Module Template" in Pandora FMS. To do so, go to the following menu:


Quick mon 1.png


Here we see some predefined profiles that contain generic checks. We are going to edit one of them (Linux Server), a profile intended for monitoring generic Linux servers remotely.



Quick mon 2.png



Quick mon 3.png


As you can see in the screenshot above, this profile has several modules: some basic TCP checks (for example "Check SSH Server"), a basic ICMP check ("Host Alive") and several SNMP modules that use the Linux MIB, which make up the rest of the checks.


These "template" checks are defined in the basic module library that comes with a standard Pandora FMS installation.

The basic modules are called "Network components". They are generic module definitions, and they can be viewed and edited at: Administration -> Modules -> Network components. Here we can see a generic module called "Sysname" that retrieves the system name via SNMP.


Quick mon 4.png


The IP value doesn't exist in this module, because it'll be auto-assigned from the agent IP when the module is applied. The rest of the fields (e.g. thresholds, SNMP community) are defaults and will be applied to all the agents that receive a template with this module. If we want to modify them afterwards (e.g. to change the community), we'll need to change them one by one on the agents, or use the massive change tool.

Now that we know what a monitoring template and a network component (a generic module used by templates) are, we can look at two other templates: the one for generic WMI monitoring and the one for basic monitoring.

The first one has three WMI modules for Windows. These modules need to be customized, either by editing the original component or the generated modules, because they need a username and password with permission to perform remote WMI queries.

The second one has only a basic check for ICMP connectivity. We can add other basic checks, such as service verifications for HTTP, FTP, SMTP, etc., as you can see in the following screenshot:



Quick mon 5.png


2.2 Step 2. Using a Network task with a Recon Server

Now we have three basic monitoring profiles: Linux, Windows and network.

Suppose we have to monitor all the machines in a group of networks, for example:

  • 192.168.50.0/24 for servers.
  • 192.168.50.0/24,192.168.1.0/24 for communications.

We want the Recon Server to identify every machine on those networks and apply one template or another depending on its OS. The switches, however, may be of several brands and models, so we "identify" them through a simpler, standard criterion: whether or not a given port is open. For example, machines with port 23 (telnet) open are identified as generic network devices (switches, routers).
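The port-based identification described above can be tried by hand before creating the recon task. The following is only a rough sketch, not what the Recon Server runs internally: it assumes bash (for the /dev/tcp pseudo-device) and the coreutils timeout command, and the function names port_open and classify_host are made up for this example.

```bash
#!/bin/bash
# Sketch only: classify a host by whether TCP port 23 (telnet) answers.
# port_open and classify_host are hypothetical helper names.

# Succeeds if a TCP connection to host $1, port $2 opens within 2 seconds.
port_open() {
    timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Prints "network" for hosts with telnet open, "other" otherwise.
classify_host() {
    if port_open "$1" 23; then
        echo "network"
    else
        echo "other"
    fi
}

# Classify every host given on the command line, for example:
#   ./classify.sh 192.168.1.1 192.168.1.2
for host in "$@"; do
    echo "$host $(classify_host "$host")"
done
```

Running it against a handful of known switches and servers is a quick way to check that the chosen port really separates the two groups on your network.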

Go to the recon servers section to create a new task:


Quick mon 6.png


We're going to create one that looks for Windows servers and registers them, applying the Windows monitoring template:


Quick mon 7.png


Here we can see that in the "OS" field (the kind of operating system) we've selected Windows. This means the monitoring profile will only be applied to machines found to be running a Windows OS; any other machine is ignored.

Since OS detection isn't 100% reliable, it is possible to select another method, for example filtering on a specific port.

This way, the template is applied to every machine that has this port open. An example can be seen here, where we've created another task. This time we use a port filter instead of an OS filter, to apply the generic network device monitoring:



Quick mon 8.png


It's important to know how to specify two networks: write them together, separated by a comma, as in 192.168.50.0/24,192.168.1.0/24

Finally, we'll configure the Linux one in a similar way. Once the three tasks are defined, the list should look like this:


Quick mon 9.png


Once the recon tasks are defined, they can start on their own, but let's check their status and force them to start if necessary. To do this, click on the eye icon to go to the Recon server operation view.


Quick mon 10.png


By default, the recon server has one thread, so you'll only be able to execute one task at a time. The rest will wait for the active exploration task to end. We can force exploration tasks by pressing the circular green icon at the left of the task.


This will make the Recon server search for new machines that don't exist in the active monitoring scheme. If it finds them, it will register them automatically (trying to resolve the name, if we've activated this option) and assign all the modules that were contained in the profile to it.


Bear in mind that some of the modules assigned by a profile may make no sense, or may not be correctly configured, for a specific agent. On this agent we correctly detected a Linux system, but the server has no SNMP, so the SNMP modules are not reporting. Since they could not retrieve data even on their first execution, they remain in a state known as "Non-init" (not initialized). The next time the database maintenance script runs, they will be deleted automatically:


Quick mon 11.png


3 SNMP Network Device Monitoring, using SNMP Recon Script

This case is almost the same as the previous one. Here we need to monitor an SNMP device with many interfaces "automatically" and in depth, retrieving the status of each interface, the traffic on each one, the error rate, etc.

For this, we're going to use a system known as Recon Script. It's a modular system that allows complex actions to be executed from a single script. Pandora FMS already ships with a script that detects this kind of SNMP device.

To execute it, we create a network task, like this:


Quick mon 12.png


In the first field we write the target network or networks. In the second field we write the SNMP community to use when exploring these devices. In the third field we write optional parameters. In this case, -n makes the script also register interfaces that are down; by default it registers only active interfaces.

On each execution, this script registers the interfaces that are now active on each machine and did not exist previously, so if new interfaces are brought up, it'll detect them. We can schedule the network tasks to run once a day, or once an hour if we wish.
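To preview which interfaces would count as active on a device, the interface status table can be queried by hand. This is a minimal sketch, assuming the Net-SNMP snmpwalk tool is installed. The default community public and the helper name list_up_interfaces are placeholders, and this is not the Recon Script's own code.

```bash
#!/bin/bash
# Sketch only: list the ifIndex of every interface that reports "up".

# Reads snmpwalk output on stdin, e.g.
#   IF-MIB::ifOperStatus.2 = INTEGER: up(1)
# and prints the interface index of each line whose status is up(1).
list_up_interfaces() {
    awk '/ifOperStatus\./ && /up\(1\)/ { sub(/.*\./, "", $1); print $1 }'
}

# Usage: ./up_interfaces.sh <device IP> [community]
if [ "$#" -ge 1 ]; then
    snmpwalk -v 1 -c "${2:-public}" "$1" IF-MIB::ifOperStatus | list_up_interfaces
fi
```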

This is the way that the Recon Script Task looks once it has been created:



Quick mon 14.png


And this is how the Recon Script task looks while it is running:



Quick mon 13.png


4 Agent Monitoring through monitoring templates and massive operations

Not written yet

5 Agent Monitoring through Policies

Not written yet

6 Agent Monitoring using Customized Scripts

This is an advanced way to monitor large numbers of systems that are similar to one another, in a completely "ad-hoc" way. To do this you need existing tools that already give you information about your systems. Some examples are:

  • Scripts you already have that give information about remote systems.
  • Other monitoring systems already in place that generate data which can be reused.
  • Small checks that are similar for a group of X machines, but that return several values at once instead of a single one. If they returned a single value, they could be reused as plugins for the remote server.

The philosophy is simple: a main script generates the agent's XML header with the agent name you choose, then fills in the module data by running an external script that it receives as an argument. This external script must print its data in the Pandora XML format (which is extremely simple). The main script then closes the XML and moves the file to the standard path where XML data files are processed (/var/spool/pandora/data_in). Schedule the script with cron. For more information about the XML format that Pandora FMS uses to report data, please check our technical annexes.
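The flow described above can be sketched as follows. This is not the source of the script shipped with Pandora FMS: the function names, the agent_data attributes and the file naming are assumptions for illustration, so check the technical annexes for the exact XML format.

```bash
#!/bin/bash
# Sketch only: wrap the modules printed by an external command in an
# agent XML document and drop the file into the incoming data directory.
# The attribute set below is an assumption; see the XML format annex.

DATA_IN="${DATA_IN:-/var/spool/pandora/data_in}"

# build_agent_xml <agent name> <module command>
# Prints the complete XML document on stdout.
build_agent_xml() {
    local agent="$1" module_cmd="$2"
    local now
    now=$(date '+%Y/%m/%d %H:%M:%S')
    echo "<?xml version='1.0' encoding='UTF-8'?>"
    echo "<agent_data agent_name='$agent' timestamp='$now' interval='300'>"
    # The external command prints one or more <module> blocks.
    bash -c "$module_cmd"
    echo "</agent_data>"
}

# write_agent_xml <agent name> <module command>
# Writes to a temporary file first, then renames it, so a half-written
# file is never visible under its final name.
write_agent_xml() {
    local agent="$1" tmp
    tmp=$(mktemp "$DATA_IN/.$agent.XXXXXX") || return 1
    build_agent_xml "$agent" "$2" > "$tmp"
    mv "$tmp" "$DATA_IN/$agent.$(date +%s).data"
}

# Usage: ./wrapper.sh <agent name> "<module command>"
if [ "$#" -eq 2 ]; then
    write_agent_xml "$1" "$2"
fi
```

Setting the DATA_IN variable to a scratch directory makes it possible to inspect the generated file before pointing the sketch at the real spool directory.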


Remote agent Script

There is a small script at /usr/share/pandora_server/util/pandora_remote_agent.sh that takes two parameters:


-a <agent name>
-f <script file it'll execute>

So, if you have a script at /tmp/sample_remote.sh that contains:

#!/bin/bash

# 1 if a single ping to the host is answered without packet loss, 0 otherwise
PING=$(ping -c 1 192.168.50.1 | grep -c " 0% packet loss")

echo "<module>"
echo "<name>Status</name>"
echo "<type>generic_proc</type>"
echo "<data>$PING</data>"
echo "</module>"

# 1 if the SNMP uptime is at least 24 hours (8640000 hundredths of a second), 0 otherwise
ALIVE=$(snmpget -Ot -v 1 -c artica06 192.168.70.100 DISMAN-EVENT-MIB::sysUpTimeInstance | awk '{ print ($3 >= 8640000) }')

echo "<module>"
echo "<name>Alive_More_than_24Hr</name>"
echo "<type>generic_proc</type>"
echo "<data>$ALIVE</data>"
echo "</module>"

# Optionally run another script that prints more XML modules
EXT_FILE=/tmp/myscript.sh

if [ -x "$EXT_FILE" ]
then
	"$EXT_FILE"
fi

You can generate a complete XML file for the agent name "agent_test" by executing the remote agent script in the following way:

/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test -f /tmp/sample_remote.sh


Suppose you want to execute the same script against X machines. You then need to pass some data, such as the user, IP and password, to that script:

/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test -f "/tmp/sample_remote.sh 192.168.50.1"

You will have to parametrize the script /tmp/sample_remote.sh so that it reads the command line parameters and uses them correctly.
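One possible way to parametrize it is shown here, only as a sketch: the target IP is read from the first argument, and the small helper emit_module (a name invented for this example) prints each module block. Only the ping check is shown; the SNMP check would be adapted the same way.

```bash
#!/bin/bash
# Sketch only: a version of /tmp/sample_remote.sh that takes the target
# IP as its first argument. emit_module is a hypothetical helper.

# emit_module <name> <type> <data>: print one module block.
emit_module() {
    echo "<module>"
    echo "<name>$1</name>"
    echo "<type>$2</type>"
    echo "<data>$3</data>"
    echo "</module>"
}

# Run the checks only when a target was given on the command line.
if [ "$#" -ge 1 ]; then
    TARGET="$1"
    # 1 if a single ping to the target gets an answer, 0 otherwise.
    if ping -c 1 "$TARGET" > /dev/null 2>&1; then
        emit_module "Status" "generic_proc" 1
    else
        emit_module "Status" "generic_proc" 0
    fi
fi
```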

Programming the script with Cron

Imagine that you have 10 machines monitored like this:


/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test1 -f "/tmp/sample_remote.sh 192.168.50.1"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test2 -f "/tmp/sample_remote.sh 192.168.50.2"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test3 -f "/tmp/sample_remote.sh 192.168.50.3"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test4 -f "/tmp/sample_remote.sh 192.168.50.4"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test5 -f "/tmp/sample_remote.sh 192.168.50.5"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test6 -f "/tmp/sample_remote.sh 192.168.50.6"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test7 -f "/tmp/sample_remote.sh 192.168.50.7"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test8 -f "/tmp/sample_remote.sh 192.168.50.8"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test9 -f "/tmp/sample_remote.sh 192.168.50.9"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test10 -f "/tmp/sample_remote.sh 192.168.50.10"


Put all these lines in a new script, e.g. "/tmp/my_remote_mon.sh", give it execution permissions and add the following line to the system crontab (/etc/crontab), whose entries include the user to run as:

*/5 * * * *   root /tmp/my_remote_mon.sh

This will make the script run every 5 minutes. From then on, you can keep adding machines to the script.
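Since the ten lines differ only in the agent number and the last octet of the address, the script can also be written as a loop. This is a sketch under that assumption; the function name monitor_range and the REMOTE_AGENT variable are invented for this example.

```bash
#!/bin/bash
# Sketch only: loop version of /tmp/my_remote_mon.sh.
# Assumes agent_testN always maps to 192.168.50.N.

REMOTE_AGENT="${REMOTE_AGENT:-/usr/share/pandora_server/util/pandora_remote_agent.sh}"

# monitor_range <first> <last>: run the remote agent script once per host.
monitor_range() {
    local i
    for i in $(seq "$1" "$2"); do
        "$REMOTE_AGENT" -a "agent_test$i" -f "/tmp/sample_remote.sh 192.168.50.$i"
    done
}

# Usage: /tmp/my_remote_mon.sh 1 10
if [ "$#" -eq 2 ]; then
    monitor_range "$1" "$2"
fi
```

With this variant, the crontab entry would call the script as /tmp/my_remote_mon.sh 1 10, and adding machines only means raising the second number.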

