Pandora: QuickGuides EN: Fast deployment

From Pandora FMS Wiki
Revision as of 10:56, 22 February 2016 by Asier (talk | contribs) (Introduction)


This work is under development (not translated!)

 


1 Introduction

This guide shows how to administer a large number of machines (5, 10 ... 500) quickly and efficiently, using the Pandora FMS features designed for that purpose. The document is divided into four parts:


  • Network device monitoring, using Recon Server and templates.
  • SNMP network device monitoring, using Recon Script SNMP.
  • Agent monitoring, using policies (only Enterprise).
  • Remote monitoring with customized scripts, using an agent generator via XML.

2 Network devices Monitoring, using Recon Server and templates

Situation

We have to monitor two hundred servers, twenty switches and ten routers, and we cannot configure them one by one. "General" monitoring is very easy, but we have little time and no possibility of installing agents on the machines.

Solution

Pandora FMS will detect the systems and apply a different template to each one, depending on whether it is a switch, a router or a server. The templates contain remote checks that are applied as soon as the type of machine has been detected.


How long will it take?

A class C network (up to 254 hosts) is scanned in less than one minute with version 4.0. Applying a standard monitoring profile to the detected machines is almost immediate, so you could have these 230 machines completely configured in less than ten minutes.


2.1 Step 1. Defining the monitoring profiles

First we define a monitoring template, which Pandora FMS calls a "Module Template". To do so, go to the following menu:


Quick mon 1.png


Here we see some predefined profiles that contain generic checks. We are going to edit one of them (Linux Server), a profile for monitoring generic Linux servers remotely.



Quick mon 2.png



Quick mon 3.png


As you can see in the screenshots above, this profile has several modules: some basic TCP checks such as "Check SSH Server", a basic ICMP check ("Host Alive") and, for the rest of the checks, several SNMP modules that use the Linux MIB.


These "template" checks are defined in the basic module library that ships with a basic Pandora FMS installation.

The basic modules are called "Network components". They are generic module definitions, and they can be viewed and edited at Administration -> Modules -> Network components. Here we can see what a generic module called "Sysname", which gets the system name via SNMP, looks like:


Quick mon 4.png


This module has no IP value, because the IP is assigned automatically from the agent's IP address when the module is applied. The rest of the fields (for example, thresholds and SNMP community) hold default values, which are applied to every agent that receives a template containing this module. To customize a value (for example, to change the community), change it in the agents one by one, or for many agents at once with the massive change tool.


Now that we know what a monitoring template and a network component (a generic module used in templates) are, we can look at some of the other templates, specifically the generic WMI monitoring template and the basic monitoring template.

The first one has three WMI modules for Windows. These modules need to be customized, by editing either the original component or the generated modules, because they require a user and password with permission to run remote WMI queries.

The second one has only a basic ICMP connectivity check. We can add other basic checks, such as HTTP, FTP or SMTP service verifications, as you can see in the following screenshot:



Quick mon 5.png


2.2 Step 2. Using a Network task with Recon Server

We now have three basic monitoring profiles: Linux, Windows and network.

Suppose that we have to monitor all the machines in a group of networks, for example:

  • 192.168.50.0/24 for servers.
  • 192.168.50.0/24,192.168.1.0/24 for communications.

We want the task to identify all the machines in those networks and apply one template or another depending on each machine's OS. For devices that cannot be told apart this way, such as switches of several brands and models, we can instead "identify" them by whether or not a given port is open. For example, machines with port 23 (telnet) open are identified as generic network devices (switches, routers).
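
To make the port-based criterion concrete, here is a minimal shell sketch of such a check. It is an illustration only, not the Recon server's own detection code. It assumes bash (for its /dev/tcp pseudo-device) and the coreutils timeout command; the function names port_is_open and classify_host and the two labels are hypothetical.

```shell
#!/bin/bash
# Illustrative sketch only: NOT how the Recon server probes hosts internally.

# Succeeds (exit status 0) if a TCP connection to host:port opens within 2 seconds.
port_is_open() {
    local host=$1 port=$2
    # In the child shell, $0 is the host and $1 is the port.
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Prints "network-device" when the telnet port (23) is open, "other" otherwise.
classify_host() {
    if port_is_open "$1" 23; then
        echo "network-device"
    else
        echo "other"
    fi
}

# Example (placeholder address): classify_host 192.168.1.1
```

A port filter of this kind is coarse: any machine that happens to run a telnet service will be classified as a network device too.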

Go to the Recon server section to create a new task:


Quick mon 6.png


We are going to create one that looks for and registers Windows servers, applying the Windows machine monitoring template:


Quick mon 7.png


Here we can see that Windows has been selected in the "OS" field (type of operating system). That is, the task applies this monitoring profile only to the machines it identifies as Windows and ignores the rest.

Since automatic OS detection is not 100% reliable, it is possible to select another method, for example specifying a port.

This way, the template is applied to every machine that has this port open. The following example shows another task that filters by port instead of OS in order to apply the generic network device monitoring:



Quick mon 8.png


It is important to know how to specify two networks: put them together, separated by a comma: 192.168.50.0/24,192.168.1.0/24
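
As a small illustration of that list format, the following bash sketch splits such a value into one network per line, for example to review it before typing it into the task. The function name split_networks is hypothetical and is not part of Pandora FMS.

```shell
#!/bin/bash
# Splits a comma-separated network list (the format shown above)
# into one network per line.
split_networks() {
    local list=$1 net
    local -a nets
    IFS=',' read -r -a nets <<< "$list"
    for net in "${nets[@]}"; do
        echo "$net"
    done
}

split_networks "192.168.50.0/24,192.168.1.0/24"
```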

Finally, configure the Linux task in a similar way. Once the three tasks are defined, the list looks like this:


Quick mon 9.png


Once the recon tasks are defined, they start on their own, but let's check their status and force them if necessary. To do this, click on the eye icon to go to the Recon server operation view.


Quick mon 10.png


By default, the recon server has one thread, so it executes only one task at a time. The rest wait until the active exploration task ends. We can force an exploration task by pressing the circular green icon to the left of the task.

This makes the Recon server search for new machines that do not yet exist in the active monitoring. If it finds any, it registers them automatically (trying to resolve the name, if that option is enabled) and assigns them all the modules contained in the profile.


Be aware that many of the modules assigned through a profile may make no sense, or may not be correctly configured, for one particular agent. In the agent below, a Linux system was detected correctly, but the server has no SNMP, so the SNMP modules are not reporting. Since they could not get data even the first time, they stay in a state known as "non-init" (not initialized). The next time the database maintenance script runs, they will be deleted automatically:


Quick mon 11.png


3 SNMP Network Devices Monitoring, using SNMP Recon Script

This case is almost the same as the previous one. Here we need to monitor, "automatically" and in depth, an SNMP device with a lot of interfaces, getting the state of each interface, the traffic on each one, the error rate, etc.

For this we are going to use a system known as Recon Script. It is a modular system that allows complex actions to be executed from one script. Pandora FMS ships with a script for detecting this kind of SNMP device.

To use it, create a network task like this:


Quick mon 12.png


In the first field, write the destination network or networks. In the second field, write the SNMP community to use when exploring these devices. In the third field, write any optional parameters. In this case, -n makes the script also register the interfaces that are down; by default it registers only the active interfaces.

On each execution, this script registers the interfaces that are active on each machine and did not exist previously, so if new interfaces are brought up, it will detect them. The network tasks can be left to run once a day or once an hour.
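
To preview what such a task would find on a single device, the interface states can be listed by hand with Net-SNMP's snmpwalk. The sketch below is an illustration and not part of the Recon Script: the helper name count_up_interfaces is hypothetical, the address and community are placeholders, and the parsing assumes snmpwalk's default output with IF-MIB loaded (lines such as "IF-MIB::ifOperStatus.1 = INTEGER: up(1)").

```shell
#!/bin/bash
# Counts the interfaces reported as up(1) in snmpwalk output read from stdin.
count_up_interfaces() {
    grep -cF 'INTEGER: up(1)'
}

# Example against a live device (placeholder address and community):
#   snmpwalk -v 1 -c public 192.168.1.1 IF-MIB::ifOperStatus | count_up_interfaces
```

Comparing this count with the total number of lines gives a rough idea of how many extra interfaces the -n option would register.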

This is how the Recon Script task looks once it has been created:



Quick mon 14.png


And this is how the Recon Script task looks while it is executing:



Quick mon 13.png


4 Agent Monitoring through monitoring templates and massive operations

Not written yet

5 Agent Monitoring through Policies

Not written yet

6 Agent Monitoring using Customized Scripts

This is an advanced way to monitor large numbers of systems that are similar to one another, in a completely "ad-hoc" way. It requires existing tools that already give you information about your systems. Some examples:

  • Scripts you already have that give information about remote systems.
  • Other monitoring systems, already working, that generate data that could be reused.
  • Small checks that are similar for a group of machines but return several values at once rather than a single one. If they returned values one by one, they could be reused as plugins for the remote server.

The philosophy is simple: a main script generates the agent XML header, writing the agent name that you want, and fills in the module data by running an external script that it receives as an argument. This external script must generate its data in the Pandora XML format (extremely simple!). The main script then closes the XML and moves it to the standard path where XML data files are processed (/var/spool/pandora/data_in). Schedule the script through cron. For more information about the XML format that Pandora FMS uses to report data, please check our technical annexes.
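
The flow just described can be sketched as follows. This is not the real pandora_remote_agent.sh, only a minimal illustration of the idea. The function name write_agent_xml is hypothetical, and the header attributes (agent_name, timestamp, version, os_name, interval) and the ".data" file suffix are assumptions that should be checked against the technical annexes.

```shell
#!/bin/bash
# Sketch of a wrapper: NOT the real pandora_remote_agent.sh.
# Usage: write_agent_xml <agent_name> <module_command> [output_dir]
write_agent_xml() {
    local agent=$1 module_cmd=$2 out_dir=${3:-/var/spool/pandora/data_in}
    local tmp stamp
    tmp=$(mktemp) || return 1
    stamp=$(date '+%Y/%m/%d %H:%M:%S')

    {
        echo "<?xml version='1.0' encoding='UTF-8'?>"
        # Header attributes are assumptions; verify them in the annexes.
        echo "<agent_data agent_name='$agent' timestamp='$stamp' version='4.0' os_name='Other' interval='300'>"
        # The external command prints the <module> blocks.
        bash -c "$module_cmd"
        echo "</agent_data>"
    } > "$tmp"

    # Make the file readable by the server, then move it only once it is complete.
    chmod 644 "$tmp"
    mv "$tmp" "$out_dir/$agent.$(date +%s).data"
}

# Example: write_agent_xml agent_test /tmp/sample_remote.sh
```

Writing to a temporary file first and moving it afterwards keeps the server from picking up a file that is still being written.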


Remote agent Script

There is a small script at /usr/share/pandora_server/util/pandora_remote_agent.sh that takes two parameters:


-a <agent name>
-f <script file it'll execute>

Suppose you have a script, /tmp/sample_remote.sh, that contains:

#!/bin/bash

# Status: 1 if the host answers a single ping, 0 otherwise
PING=$(ping -c 1 192.168.50.1 | grep -c " 0% packet loss")

echo "<module>"
echo "<name>Status</name>"
echo "<type>generic_proc</type>"
echo "<data>$PING</data>"
echo "</module>"

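# Alive_More_than_24Hr: 1 if the SNMP uptime (in hundredths of a second) is at least 24 hours
# The comparison is parenthesized so that awk cannot read ">" as an output redirection
ALIVE=$(snmpget -Ot -v 1 -c artica06 192.168.70.100 DISMAN-EVENT-MIB::sysUpTimeInstance | awk '{ print ($3 >= 8640000) }')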

echo "<module>"
echo "<name>Alive_More_than_24Hr</name>"
echo "<type>generic_proc</type>"
echo "<data>$ALIVE</data>"
echo "</module>"

# Another script that returns XML; run it only if it exists and is executable
EXT_FILE=/tmp/myscript.sh

if [ -x "$EXT_FILE" ]
then
	"$EXT_FILE"
fi

You can generate a complete XML file with the agent name "agent_test" by executing the remote agent script in the following way:

/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test -f /tmp/sample_remote.sh


Suppose you want to execute the same script against several machines. You would pass some data, such as the user, IP or password, to the script:

/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test -f "/tmp/sample_remote.sh 192.168.50.1"

You will have to parametrize the script /tmp/sample_remote.sh so that it reads the command line parameters and uses them.
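
One way to parametrize it is sketched below, assuming the target IP arrives as the first argument, as in the command above. Only the ping check is shown, and the helper names emit_module and main are hypothetical.

```shell
#!/bin/bash
# Parametrized variant of /tmp/sample_remote.sh (sketch):
# the target IP is taken from the first command line argument.

# Prints one module block: emit_module <name> <type> <data>
emit_module() {
    echo "<module>"
    echo "<name>$1</name>"
    echo "<type>$2</type>"
    echo "<data>$3</data>"
    echo "</module>"
}

main() {
    local target=$1 ping_ok
    # 1 if the single ping got a reply, 0 otherwise
    ping_ok=$(ping -c 1 "$target" | grep -c " 0% packet loss")
    emit_module "Status" "generic_proc" "$ping_ok"
}

# Run the check only when a target was given on the command line.
if [ "$#" -gt 0 ]; then
    main "$@"
fi
```

Further values such as a user or an SNMP community could be read in the same way from the second and third arguments.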

Scheduling the script with cron

Imagine that you have 10 machines monitored like this:


/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test1 -f "/tmp/sample_remote.sh 192.168.50.1"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test2 -f "/tmp/sample_remote.sh 192.168.50.2"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test3 -f "/tmp/sample_remote.sh 192.168.50.3"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test4 -f "/tmp/sample_remote.sh 192.168.50.4"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test5 -f "/tmp/sample_remote.sh 192.168.50.5"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test6 -f "/tmp/sample_remote.sh 192.168.50.6"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test7 -f "/tmp/sample_remote.sh 192.168.50.7"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test8 -f "/tmp/sample_remote.sh 192.168.50.8"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test9 -f "/tmp/sample_remote.sh 192.168.50.9"
/usr/share/pandora_server/util/pandora_remote_agent.sh -a agent_test10 -f "/tmp/sample_remote.sh 192.168.50.10"


Put all these lines in a new script, e.g. /tmp/my_remote_mon.sh, give it execution permissions, and add the following line to the system crontab (/etc/crontab, whose entries include a user field):

-*/5 * * * *   root /tmp/my_remote_mon.sh

This runs the script every 5 minutes. You can then keep adding machines to the script.
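
When the agent names and addresses follow a regular pattern, as in the ten lines above, the monitoring script can also be written as a loop. This is a minimal sketch under that assumption; the function name monitor_range is hypothetical.

```shell
#!/bin/bash
# Loop-based variant of /tmp/my_remote_mon.sh (sketch).
REMOTE_AGENT=/usr/share/pandora_server/util/pandora_remote_agent.sh

# Runs the remote agent script for machines FIRST..LAST,
# using the agent_testN / 192.168.50.N naming from the example above.
monitor_range() {
    local i
    for ((i = $1; i <= $2; i++)); do
        "$REMOTE_AGENT" -a "agent_test$i" -f "/tmp/sample_remote.sh 192.168.50.$i"
    done
}

# Only run when the remote agent script is actually installed.
if [ -x "$REMOTE_AGENT" ]; then
    monitor_range 1 10
fi
```

Adding machines then only requires changing the range, at the cost of tying the agent names to the last octet of the address.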

