Pandora: QuickGuides EN: Remote Monitoring
The purpose of this document is to show the reader what Pandora FMS offers, in terms of configuration and usability, as a monitoring tool for all kinds of systems and applications.
This guide gives a set of specific instructions for building a network environment that incorporates Pandora FMS.
These instructions include a few short steps covering installation and initial configuration, followed by practical examples of real use of the program, with their corresponding screenshots.
Suppose we want to implement a solution that integrates Pandora FMS as the monitoring tool in a network environment, mainly to run remote checks against the critical elements of this network (servers, routers...) and to fire an alert that sends an email whenever the status of any of these elements becomes critical.
We also want to keep a history of all these events, presented as a list, together with graphs of the traffic on a router's interface.
For this implementation to work properly, the following requirements must be met:
- Install a Pandora Server and a Pandora Console on a server with access to all the machines to be monitored.
- Make sure that all the ports needed for the remote checks are open and listening.
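Before defining any modules, the second requirement can be verified by hand from the machine that will run the Pandora Server. The following is a minimal Python sketch, not part of Pandora FMS; the function names `port_is_open` and `closed_ports` are made up for illustration. It only covers TCP ports: ICMP and SNMP (which normally uses UDP port 161) cannot be tested with a TCP connection.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def closed_ports(targets):
    """Return the (host, port) pairs from targets that could not be reached."""
    return [(host, port) for host, port in targets if not port_is_open(host, port)]
```

Calling `closed_ports` with a list of (host, port) pairs returns the ones that still need attention; an empty list means every TCP port answered.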
To learn how to install all the elements of Pandora FMS, check the Pandora FMS installation manual. It is also possible to work from a virtual image with all the necessary elements already installed and configured.
5 Monitoring our network with Pandora FMS
The software agents of Pandora FMS aren't used in this guide: we're only going to explain data collection through the Network Server, with remote agents, so software agents are left for another guide.
It is recommended to read the Pandora FMS operation manual in order to obtain more information and a better understanding of the following processes:
5.1 ICMP checks
The first thing we're going to do is to define modules to check the availability and latency of a remote element from Pandora Console.
In order to monitor a server or one of its services (FTP, SSH, etc.) remotely, we first have to create the agent that will hold the corresponding checks, so let's start there. This agent is going to represent our router, so we need to enter the router's main IP address when creating the agent. This way, all the remote checks will be performed against this IP by default.
In the management section of the Pandora FMS console, click on Manage agents:
In the next screen, click on the button Create agent:
Fill in all the data for your new agent and click on the Create agent button:
Once the agent has been created, click on the modules tab in the upper right. In this section, choose to create a new Network Server module and click on Create:
In the next form, select a network module component and, when the menu on the right is displayed, select the check you want to perform. In this example we will select Host Alive, which represents a ping check against the machine, a simple check to know whether or not the machine is connected to the network.
Boolean modules (those checking a service's availability, for example), which have an xxxx_proc type in Pandora, return 0 when the result is bad and a value above 0 when it is good. These two states are automatically shown in red and green respectively, so there is no need to define any range for the status to change.
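The rule for boolean modules can be written down in a couple of lines. This is only an illustration of the rule as stated here, not Pandora FMS code; the function name `proc_status` is hypothetical, and treating negative values as bad is an assumption.

```python
def proc_status(value: float) -> str:
    """Map the value of a boolean (xxxx_proc) module to its display colour.

    0 means the check failed (red); any value above 0 means it passed (green).
    Negative values are treated as failures here, which is an assumption.
    """
    return "green" if value > 0 else "red"
```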
We'll leave the advanced options for another time. Note that the module has picked up the IP address of the agent; you can enter a different address in this field if you wish. Once you're done with the module definition, click on the Create button.
The following screen shows all the modules defined in the agent, in this case the Host Alive module we've just created:
As you can see, there's a warning icon over the modules. This warning only means that no data has been received in the module yet, since it's just been added. Once the data begins to be received, this warning will disappear.
However, in the case of the Host Latency module, which returns the time in milliseconds that the server takes to establish contact with the remote machine, we can define value ranges for the warning and critical statuses of the module.
For example, let's configure the module to go into warning status from 50 to 100 ms and into critical status for any value above 100 ms.
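To make the ranges concrete, here is a small Python sketch of the thresholds in this example. It is not how Pandora FMS evaluates modules internally; the function name `latency_status` is hypothetical, and the choice to treat exactly 50 ms and exactly 100 ms as warning is an assumption about the boundaries.

```python
def latency_status(latency_ms: float,
                   warning_min: float = 50.0,
                   critical_min: float = 100.0) -> str:
    """Classify a latency sample using the example thresholds.

    Assumed boundaries: 50 ms up to and including 100 ms is "warning",
    anything above 100 ms is "critical", everything else is "normal".
    """
    if latency_ms > critical_min:
        return "critical"
    if latency_ms >= warning_min:
        return "warning"
    return "normal"
```

With these defaults, a sample of 35 ms is normal, 72 ms is warning and 140 ms is critical.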
Once we've finished adding modules, click on the "View" tab in the upper right and go to the bottom of the new section, where the data will be shown once it is received:
This has been an example of ICMP monitoring, using the simplest checks, which still give us important and precise information about the status of our monitored machines. There are two kinds of ICMP checks:
- icmp_proc, or host check (ping), which allows us to know whether an IP address is available or not.
- icmp_data, or latency check. Basically, it tells us the time in milliseconds that the machine at that IP address takes to respond to a basic ICMP query.
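The two kinds of check above can be reproduced outside Pandora FMS for troubleshooting. The sketch below is an approximation, not the Network Server's implementation: it shells out to the system `ping` command and assumes the Linux iputils flags `-c` (packet count) and `-W` (reply timeout in seconds), which differ on other platforms. The names `parse_latency_ms` and `icmp_check` are hypothetical.

```python
import re
import subprocess
from typing import Optional, Tuple

# Matches the "time=0.345 ms" or "time<1 ms" part of a ping reply line.
_TIME_RE = re.compile(r"time[=<]\s*([\d.]+)\s*ms")


def parse_latency_ms(ping_output: str) -> Optional[float]:
    """Extract the first round-trip time in milliseconds, if any."""
    match = _TIME_RE.search(ping_output)
    return float(match.group(1)) if match else None


def icmp_check(host: str, timeout_s: int = 2) -> Tuple[int, Optional[float]]:
    """Return (alive, latency_ms): alive mirrors icmp_proc, latency icmp_data."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],  # Linux flags (assumed)
            capture_output=True, text=True, timeout=timeout_s + 3,
        )
    except (OSError, subprocess.TimeoutExpired):
        return 0, None
    if result.returncode != 0:
        return 0, None
    return 1, parse_latency_ms(result.stdout)
```

The first element of the result plays the role of the boolean host check and the second that of the latency value.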
5.2 SNMP checks
Now let's define two remote SNMP modules, following the same procedure, to measure the ingoing and outgoing traffic of the 11th interface of a router.
In order to achieve this, we first need to look up the OIDs our router model exposes and find the ones that match the data we want to obtain.
The easiest way is to use the SNMP Explorer tool, which lets us run an SNMP Walk against the IP of the router we want to monitor.
Let's go to the management view of the desired agent and click on the SNMP Explorer tab, on the upper right section of the screen:
To run the SNMP exploration, we need the router's IP address, its port if it's not the default one, and valid authentication data. In our case we're going to use SNMP v1 with an SNMP community that has read privileges.
Once we perform the SNMP Walk, we will be able to see a list with all the router interfaces. Select the desired one, and then select the modules we want to create as well. In our case:
Another way to define SNMP modules, if we know their numeric OIDs, is to define each module the same way we did with the ICMP ones.
- Ingoing traffic (interface 11): .1.3.6.1.2.1.2.2.1.10.11
- Outgoing traffic (interface 11): .1.3.6.1.2.1.2.2.1.16.11
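The two OIDs listed above end in .10.11 and .16.11. In the standard IF-MIB these are columns 10 (ifInOctets, at 1.3.6.1.2.1.2.2.1.10) and 16 (ifOutOctets, at 1.3.6.1.2.1.2.2.1.16) of the interface table, followed by the interface index. Both are cumulative byte counters, so a traffic rate has to be derived from two readings. The sketch below illustrates both ideas in Python; the helper names `interface_oid` and `octets_per_second` are hypothetical, and the code shows the arithmetic only, not how Pandora FMS stores or processes the values.

```python
IF_IN_OCTETS = "1.3.6.1.2.1.2.2.1.10"   # IF-MIB ifInOctets (bytes received)
IF_OUT_OCTETS = "1.3.6.1.2.1.2.2.1.16"  # IF-MIB ifOutOctets (bytes sent)


def interface_oid(base_oid: str, if_index: int) -> str:
    """Build the numeric OID for one interface, with a leading dot."""
    return f".{base_oid}.{if_index}"


def octets_per_second(previous: int, current: int, interval_s: float,
                      counter_bits: int = 32) -> float:
    """Average byte rate between two counter readings.

    The modulo handles a single wrap of the counter (32 bits by default)
    between the two readings.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = (current - previous) % (2 ** counter_bits)
    return delta / interval_s
```

For instance, `interface_oid(IF_IN_OCTETS, 11)` yields the ingoing-traffic OID for interface 11, and two readings of 1000 and 4000 bytes taken 300 seconds apart give an average of 10 bytes per second.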
If we wanted to add all the modules found by the SNMP Explorer, we would see something like this:
5.3 TCP checks
TCP checks let us verify the state of a port or a TCP service.
There are two specific fields for TCP tests: TCP send and TCP receive.
By default, a TCP check simply tests whether the destination port is open. Optionally, you can send a text string and wait for a response, which is then processed directly as data.
It is possible to send a text string (using the «^M» string in place of the CR character) and wait for a substring in the answer to confirm that the communication is working. This allows simple protocol checks to be implemented. For example, we could check whether a server is alive by sending the string:
GET / HTTP/1.0^M^M
And waiting to receive the string
These strings are entered in the TCP send and TCP receive fields.
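The send/receive behaviour described in this section can be sketched in Python as follows. This is an approximation for experimenting by hand, not the Network Server's code: the function name `tcp_check` and its parameters are hypothetical, and expanding «^M» to a carriage return plus line feed (the line ending HTTP expects) is an assumption of this sketch.

```python
import socket
from typing import Optional


def tcp_check(host: str, port: int, send: Optional[str] = None,
              expect: Optional[str] = None, timeout: float = 5.0) -> int:
    """Return 1 if the check passes, 0 otherwise.

    With no `send` string the check only tests that the port accepts a
    connection. With `send` and `expect`, it sends the string and looks
    for `expect` as a substring of the reply.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            if send is None:
                return 1
            # Assumption: "^M" stands for a CR+LF line ending.
            sock.sendall(send.replace("^M", "\r\n").encode("ascii"))
            if expect is None:
                return 1
            wanted = expect.encode("ascii")
            received = b""
            # Read until the substring appears, the peer closes, or 64 KiB.
            while wanted not in received and len(received) < 65536:
                chunk = sock.recv(4096)
                if not chunk:
                    break
                received += chunk
            return 1 if wanted in received else 0
    except OSError:
        # Connection refused, timeout, unreachable host, etc.
        return 0
```

A call such as `tcp_check("192.0.2.10", 80, "GET / HTTP/1.0^M^M", "HTTP/")` would test a web server at a placeholder address; the expected substring `"HTTP/"` is only an illustrative choice.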
Now we can use a couple of predefined module components for the Network Server to create two modules that check the status of a web server and an SMTP server, respectively.
Note that the module for the web server performs a TCP send/receive query, whereas for the SMTP server we only want to check whether the corresponding port is open.
Once we are done, we can check the status of these servers in the agent's monitor view.
5.4 Module detail in graphs
If we want to check the data history of one of the SNMP modules we've defined previously, for example the one measuring the ingoing traffic of one of the router's interfaces, we only have to go to the agent's module view and click on the graphs icon of the desired module.
This way we would display a graph with the module data of the last 24 hours by default. In our case we've chosen to display the data of the last 6 hours. In order to change the graph format, just click on the gray bar located to the left.
5.5 Event listing
Besides all the features described so far, we can also see all the events that have occurred in our system, from modules changing their status to alerts being fired.
In order to access the event list, simply enter the Event section in the operation menu:
Once inside, we can see the list of unvalidated events during the last 8 hours by default.
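The default view can be thought of as a simple filter over event records: keep the events that have not been validated and that occurred within the last 8 hours. The Python sketch below illustrates that logic only; the record layout (dictionaries with `timestamp` and `validated` keys) and the function name `default_event_view` are hypothetical and do not reflect Pandora FMS's internal data model.

```python
from datetime import datetime, timedelta


def default_event_view(events, now: datetime, hours: int = 8):
    """Return the unvalidated events from the last `hours` hours.

    Each event is assumed to be a dict with a `timestamp` (datetime)
    and a `validated` (bool) key; this layout is made up for the example.
    """
    cutoff = now - timedelta(hours=hours)
    return [
        event for event in events
        if not event["validated"] and event["timestamp"] >= cutoff
    ]
```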
To choose which events are shown, we can use the filter in the upper section of the console: