Pandora: Documentation en: Architecture
- 1 Pandora FMS Architecture
- 1.1 Pandora FMS’ Servers
- 1.1.1 Data server
- 1.1.2 Network server
- 1.1.3 SNMP server (also known as Traps Console SNMP)
- 1.1.4 WMI server
- 1.1.5 Recognition server
- 1.1.6 Accessories server (Plugins)
- 1.1.7 Prediction server
- 1.1.8 WEB Test server (Goliat)
- 1.1.9 Export server
- 1.1.10 Inventory server
- 1.1.11 Event correlation server
- 1.1.12 Enterprise Network server for SNMP & ICMP
- 1.2 Pandora FMS’ Web console
- 1.3 Pandora FMS’ Database
- 1.4 Software Agents of Pandora FMS
- 1.5 Topologies, schemes and monitoring models
1 Pandora FMS Architecture
This chapter gives a general description of the components of Pandora FMS, how they relate to one another, and how to use the Pandora FMS architecture to meet different challenges depending on the topology of your infrastructure.
Pandora FMS is extremely modular and decentralized. The most important component is the database, where everything is stored (at present MySQL is the database supported for production systems, although PostgreSQL and Oracle are also supported). Each Pandora FMS component can be replicated and can run in a pure HA environment, either active/passive, or in a clustered environment (active/active, with load balancing). There are also documented methods for building a high availability SQL backend.
Pandora FMS is made up of several elements. The servers are in charge of collecting and processing data, and of storing the collected and processed data in the database. The console is in charge of displaying the data held in the database and of interacting with the end user. The software agents are the applications that run on the monitored systems (usually servers) and collect the information to send it to the Pandora FMS servers.
1.1 Pandora FMS’ Servers
The Pandora FMS servers are the elements in charge of running the defined checks. They verify the checks and change their status according to the results. They are also in charge of firing the alerts established to control the status of the data.
Pandora FMS’ data server can work with high availability and/or load balancing. In a very large architecture, it is possible to employ several servers simultaneously, to be able to handle large volumes of information distributed by geographic or functional zones.
Pandora FMS servers run continuously, permanently checking whether any element has a problem and whether an alert is defined for it. When a problem is detected, the server executes the response defined in the alert, such as sending an SMS or an email, or running a script.
Several servers can run simultaneously, one of them being the master server and the rest slave servers. Although there are master and slave servers, they all work at the same time. The difference between them is that when a server goes down (e.g. a network server), the master server takes over the processing of all the data associated with the server that is down.
The server that receives the data file from the agent, or that processes the information (for remote modules), is the one that fires the alerts associated with that data once it has been processed.
Pandora FMS automatically manages the status of each server, its load level and other parameters. The user can monitor the state of each server through the server status section of the web console.
In Pandora FMS 3.x there are ten different servers in total, each specialized in and responsible for one of the tasks described above. The ten servers are integrated in a single application, under the general name 'Pandora Server', which is a multi-threaded application that runs each instance or specialized server of Pandora FMS in separate sub-processes (threads).
1.1.1 Data server
It processes the information sent by the software agents. The software agents send XML data to the server in different ways (FTP, SSH or Tentacle), and the server periodically checks whether any data files are waiting to be processed. This process uses a directory on disk as a queue for the items to be processed. Different data servers can be installed on different systems or on the same host (as different virtual servers). Several servers can work together in very large environments, making the most of the available hardware (e.g. multi-CPU environments).
The data server, like the rest of the servers, accesses the Pandora FMS database, which it shares with the web server and which stores the processed data packets. The server runs as a daemon or service and processes the packets gathered in its file system. In spite of its simplicity and modest use of resources, the data server is one of the critical elements of the system, as it processes all the agents' information and generates system alerts and events according to that data. The data server works only with the XML data from the software agents and does not perform any remote checks.
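The directory-as-queue behaviour described above can be illustrated with a minimal Python sketch. It is not the Pandora FMS implementation: the function names, the `*.data` file pattern as the selection rule and the delete-after-processing policy are assumptions made for the example.

```python
from pathlib import Path


def pending_data_files(spool_dir: Path) -> list[Path]:
    """Return the *.data files waiting in the spool directory, oldest first."""
    return sorted(spool_dir.glob("*.data"), key=lambda p: (p.stat().st_mtime, p.name))


def process_spool(spool_dir: Path, handle) -> int:
    """Run one polling pass: hand each file's content to `handle`, then delete the file."""
    processed = 0
    for data_file in pending_data_files(spool_dir):
        handle(data_file.read_text(encoding="utf-8"))
        data_file.unlink()  # assumption: a processed file is removed from the queue
        processed += 1
    return processed
```

A real server would call something like `process_spool` in a loop with a pause between passes; `handle` stands in for the parsing and database insertion steps.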
1.1.2 Network server
The network server executes remote monitoring tasks over the network: ICMP tests (ping, latency times), TCP requests and SNMP requests. When an agent is assigned to a server, it is always assigned to a network server, not to a data server. That is why it is very important that the machines running the network servers have network visibility of the systems they monitor, so that they can carry out the monitoring tasks assigned to them. For instance, if we create a ping module for 192.168.1.1 and the agent/module is assigned to a server on 192.168.2.0/24 that has no network access to 192.168.1.0/24, the module will always report DOWN, because the server cannot reach the target.
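The visibility problem in the example above can be checked on paper with a short Python sketch. It only models which networks a server is assumed to reach (the `reachable_networks` list is hypothetical input, not something Pandora FMS exposes) and does not send any traffic.

```python
import ipaddress


def can_reach(target_ip: str, reachable_networks: list[str]) -> bool:
    """Return True if the target address falls inside one of the listed networks."""
    target = ipaddress.ip_address(target_ip)
    return any(target in ipaddress.ip_network(net) for net in reachable_networks)


# The server from the example only sees its own /24.
server_networks = ["192.168.2.0/24"]
print(can_reach("192.168.1.1", server_networks))   # False: the ping module would report DOWN
print(can_reach("192.168.2.15", server_networks))  # True
```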
1.1.3 SNMP server (also known as Traps Console SNMP)
This server uses snmptrapd, the standard trap collection daemon. The daemon receives SNMP traps, and the Pandora FMS server processes them and stores them in the database. While processing and analyzing them, it can also fire the alerts defined in the Pandora FMS SNMP console.
1.1.4 WMI server
WMI is a Microsoft standard for obtaining information from the operating system and from applications in Microsoft Windows environments. Pandora FMS has a dedicated server that makes native WMI calls in a centralized way. Thanks to that server it is possible to collect data from Windows systems remotely and without an agent.
1.1.5 Recognition server
The recognition server (Recon Server) is used to explore the network regularly and detect new systems in operation. The recon server can also assign a monitoring template to the newly detected systems and automatically apply the default modules defined in that template, so that the new system can be monitored immediately. Using the nmap, xprobe and traceroute system applications, it can also identify systems by their operating system, based on their open ports, and establish the network topology based on the systems it already knows.
1.1.6 Accessories server (Plugins)
It executes complex user-defined tests, written in any language, integrated in the Pandora FMS interface and centrally managed. This allows advanced users to define their own complex tests and integrate them in the application, so that they can be used easily from Pandora FMS.
1.1.7 Prediction server
It is a small artificial intelligence component that statistically implements a data forecast based on past data (covering up to 30 days, using four temporal references). It predicts the value of a data item at 10 to 15 minute intervals and tells whether the data currently shows an anomaly with respect to its historical behavior. Essentially, this amounts to building a dynamic baseline with a weekly profile. Since version 5.0 of Pandora FMS, this server also handles the service monitoring (BPM) calculations.
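One possible reading of this baseline idea is sketched below in Python. It assumes that the four temporal references are the values observed at the same moment in each of the four previous weeks, that the prediction is their mean, and that an anomaly is a deviation larger than a multiple of their standard deviation. None of these choices is confirmed by this chapter; they only illustrate the technique.

```python
from statistics import mean, pstdev


def predict(references: list[float]) -> float:
    """Predict the next value as the mean of the reference values (assumed rule)."""
    return mean(references)


def is_anomalous(current: float, references: list[float], tolerance: float = 2.0) -> bool:
    """Flag the current value if it is further than `tolerance` standard
    deviations from the mean of the references (assumed rule)."""
    spread = pstdev(references)
    return abs(current - mean(references)) > tolerance * spread


# Hypothetical values seen at the same weekday and time in the last four weeks.
last_four_weeks = [100.0, 110.0, 90.0, 100.0]
print(predict(last_four_weeks))              # 100.0
print(is_anomalous(105.0, last_four_weeks))  # False
print(is_anomalous(150.0, last_four_weeks))  # True
```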
1.1.8 WEB Test server (Goliat)
(Only Enterprise version)
The Web test server is used to do load testing. It performs synthetic web testing, that is, complete web testing including the user identification process, parameters for data transfer, content verification, menu navigation, etc. It is used for verification tests (does it work or not) and to obtain latency times for the complete web navigation experience (including the resources linked to the page: images, full texts, etc.).
1.1.9 Export server
(Enterprise version only)
The Pandora FMS export server allows data from a monitored device to be transferred from one Pandora FMS installation to another, making it possible to keep a replica of the data. This is particularly useful in large deployments with several Pandora FMS installations, when we want to have some critical information centralized in just one of them.
1.1.10 Inventory server
(Enterprise version only)
The inventory server obtains and displays inventory information about the systems: installed software, installed patches, hardware memory chips, hard disks, services running on the system, etc. It can obtain this information either remotely or locally, through the software agents.
1.1.11 Event correlation server
(Enterprise version only)
This special server is used to correlate events and generate alerts. It does not monitor anything and, like the other servers, it can be enabled or left disabled in the start-up configuration. Unlike the rest, this server has no thread configuration and no high availability.
1.1.12 Enterprise Network server for SNMP & ICMP
(Enterprise version only)
There are two additional servers that use advanced strategies to process ICMP (ping) and SNMP (polling) requests, offering better performance than the open-source version, at the cost of some restrictions on the requests (especially SNMP), as they work with OIDs previously validated by the open-source server.
They use low-level binary tools to access the server's TCP/IP stack in a much more efficient way, polling in blocks.
1.2 Pandora FMS’ Web console
It is the user interface of Pandora FMS. This administration and operation console allows different users, with different privileges, to check the status of the agents, view statistical information, generate graphs and data tables, and manage incidents with its integrated system. It can also produce reports and define new modules, agents, alerts, and other users and profiles, all in a centralized way.
The web console is written in PHP and does not require the end user to install any additional software, neither Java nor ActiveX. Graphs are, however, also available in Flash, and to view them in that format you need the Flash plugin in your browser. The console can be accessed from any modern platform with HTML and CSS support. We recommend Firefox 2.x or IE 7.x. The user experience with browsers like IE6 is very poor, and most of the features implemented in the Pandora FMS 3.0 console are lost.
The web console can operate with multiple servers, and we can add as many web consoles as we want, either for load distribution or to make access easier in complex setups (large networks, several groups with different users, geographical and administrative differences, etc.). Its only requirement is access to the data store where Pandora FMS keeps everything: the database and, in the case of the Enterprise version, the agent configuration repository (via NFS), kept in sync.
1.3 Pandora FMS’ Database
Pandora FMS uses a MySQL database. It maintains an asynchronous database with all the information received, giving temporal coherence to and normalizing the data received from the various sources. Each agent data module generates a data entry for each packet, which means a real production system can hold on the order of ten million data items, or atoms of information.
This data is managed automatically by Pandora FMS, which carries out periodic, automatic maintenance of the database, so that no manual database administration by an operator or administrator is needed. The maintenance consists of periodically purging data older than a certain age (90 days by default) and compacting data older than a configurable number of days (30 days by default).
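The purge and compaction rules can be pictured with the following Python sketch, which works on an in-memory list instead of a real database. The 90 and 30 day limits come from the text above; compacting by averaging into one-hour buckets is an assumption made for the example, since the chapter does not say how compaction is done.

```python
from collections import defaultdict
from statistics import mean

DAY = 86400  # seconds in a day


def maintain(samples, now, purge_days=90, compact_days=30, bucket=3600):
    """Apply purge and compaction to a list of (unix_timestamp, value) pairs."""
    purge_before = now - purge_days * DAY
    compact_before = now - compact_days * DAY
    recent = []
    buckets = defaultdict(list)
    for ts, value in samples:
        if ts < purge_before:
            continue  # older than the purge limit: dropped
        if ts < compact_before:
            buckets[ts - ts % bucket].append(value)  # grouped by bucket start time
        else:
            recent.append((ts, value))  # recent data is kept untouched
    # Assumed compaction rule: one averaged sample per bucket.
    compacted = [(start, mean(values)) for start, values in sorted(buckets.items())]
    return compacted + sorted(recent)
```

With `now` at day 100, a sample from day 5 is purged, two samples from day 60 collapse into one averaged sample, and a sample from day 99 is left as it is.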
1.4 Software Agents of Pandora FMS
When we refer to an agent in Pandora FMS, we can distinguish three essential components in the collection of data:
- Agent
- Software Agent (software application, the Pandora FMS agent running on a machine)
- Physical Agent (hardware)
The Pandora FMS agent itself is basically an organizational element created in the Pandora FMS web console and associated with a group of modules (individual monitoring elements). It can optionally have one or more IP addresses associated with it.
The agent can have remote modules associated with it, whose data is obtained through the network, WMI or plugin servers, etc. For example:
- Checking whether the machine is connected or online (ping).
- Checking whether a given port is open or closed.
- Checking whether a web service hosted on a specific port of the machine is responding correctly.
- Checking whether a web service hosted on a specific port of the machine has the desired content.
- Hardware checks through SNMP (knowing the MIB).
- Checking the latency between the machine and the Pandora FMS servers.
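As an illustration of the simplest of the remote checks listed above, the port check, here is a minimal Python sketch using only the standard library. It is not how the Pandora FMS servers implement the check; the function name and the two-second timeout are choices made for the example.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable or timed out: treated as closed.
        return False
```

A remote module would typically map the boolean to an up/down status for the agent it belongs to.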
The agent can also have local modules associated with it. Local modules are those defined in the software agent's configuration, and they must also be defined in the agent in the web console. If the agent is in auto-learning mode (the default setting), these local modules are created automatically in the web console when a data packet arrives from the agent for the first time. Therefore, an 'Agent' can contain modules of both the remote and local types. Remote modules are executed by the servers that obtain information remotely (prediction included), and the data of local modules is processed by the Data Server.
1.4.2 Software Agent
A software agent is installed on a remote machine, completely different from the one running the server or the Pandora FMS web console. The software agent gathers 'local' information from the machine where it is running, through commands that obtain information about the system.
Pandora FMS software agents are based on the native languages of each platform: shell scripting for Unix (including GNU/Linux, Solaris, AIX, HP-UX and BSD, as well as IPSO for Nokia, the operating system of Check Point firewalls).
Pandora FMS agents can be developed in practically any language, as long as they comply with the data exchange API of the Pandora FMS data server (defined by the XML data exchange format). The Windows agents are developed in a free C++ environment and use the same interface and modularity as the UNIX agents, although with several characteristics of their own.
These scripts are built from sub-modules, each of which collects a portion of information.
Each agent collects several 'portions' of information. These are compiled into one packet and stored in a single file, which we call the 'data packet'.
The data packet is copied from agent to server synchronously, that is, at regular intervals defined in the agent, which can be modified so as not to clutter the database with superfluous information, overload the network or harm the system's performance.
The default interval is 300 seconds (5 minutes). Values below 100 seconds are not recommended, as they can affect the performance of the host system and overload the database and the central processing system.
It is important to remember that Pandora FMS is not a 'real time' system, but a general-purpose monitoring system for systems and applications in environments where real time is not a critical factor. It can, nonetheless, be adapted to operate in environments with response times of between 3 and 5 seconds.
Packet transfers are done through the Tentacle protocol, but packets can also be transferred using SSH or FTP.
With either SSH or Tentacle the process can be made secure, since neither passwords nor unencrypted confidential data travel over the network, which ensures the confidentiality, integrity and authentication of the connections between agent and server. The key generation process needed to perform the SCP (SSH) transfer automatically, and the transfer through the Tentacle protocol, are detailed in the documentation on the installation and configuration of the agents and the server.
The transfer can also be done through FTP or any other file transfer system, although we chose Tentacle for the security it offers, its ease of use and its many options.
Please check the annexes to the documentation to configure the transfer with each protocol.
Pandora FMS agents are designed to run on the machine from which they collect data, although an agent can also collect information from other machines reachable from the host where it is installed; this is known as a 'satellite agent'.
It is also possible to configure a machine to run several Pandora FMS agents simultaneously. This situation is quite rare. It occurs, for instance, when we have a software agent and a satellite agent. The standard software agent monitors the machine where it runs, while the satellite agents installed (there can be several) monitor remote systems through Telnet, SNMP or other proprietary commands.
1.4.3 Data File XML
The data file is named as follows:
<hostname>.<serial number>.data
The data file is an XML structure. Its name is a combination of the name of the host where the agent runs, a serial number that is different for each packet, and the extension .data, which indicates that it is a data packet.
<hostname>.<serial number>.checksum
The data file is the file with the extension '.data'. The verification file, with the extension '.checksum', contains an MD5 hash of the data file. This allows a final verification to ensure the data has not been altered in any way before it is processed.
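The naming scheme and the MD5 verification can be sketched in Python as follows. The helper names are hypothetical, and the example assumes the checksum file holds the hexadecimal MD5 digest as plain text, which this chapter does not specify.

```python
import hashlib
from pathlib import Path


def packet_names(hostname: str, serial: int) -> tuple[str, str]:
    """Build the data and checksum file names for one packet."""
    base = f"{hostname}.{serial}"
    return f"{base}.data", f"{base}.checksum"


def write_packet(directory: Path, hostname: str, serial: int, xml_text: str) -> Path:
    """Write the data file and its checksum file; return the data file path."""
    data_name, checksum_name = packet_names(hostname, serial)
    payload = xml_text.encode("utf-8")
    (directory / data_name).write_bytes(payload)
    # Assumption: the checksum file stores the hex MD5 digest as text.
    (directory / checksum_name).write_text(hashlib.md5(payload).hexdigest())
    return directory / data_name


def is_intact(data_file: Path) -> bool:
    """Recompute the MD5 of the data file and compare it with the stored one."""
    expected = data_file.with_suffix(".checksum").read_text().strip().lower()
    return hashlib.md5(data_file.read_bytes()).hexdigest() == expected
```

A receiver would call `is_intact` before parsing and discard any packet whose digest does not match.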
The XML data file generated by the agent is at the heart of Pandora FMS. It contains a data packet with the information gathered by the agent. The packet has a compact, light and flexible design that allows any user to use the Pandora FMS agents or to generate, by their own means, information to be processed by Pandora FMS. The data file is an XML document similar to the following one:
<agent_data os_name="SunOS" os_version="5.8" timestamp="300" agent_name="pdges01" version="1.0">
  <module>
    <name>FTP Daemon</name>
    <type>generic_proc</type>
    <data>0</data>
  </module>
  <module>
    <name>DiskFree</name>
    <type>generic_data</type>
    <data>5200000</data>
  </module>
  <module>
    <name>UsersConnected</name>
    <type>generic_data_inc</type>
    <data>119</data>
  </module>
  <module>
    <name>LastLogin</name>
    <type>generic_data_string</type>
    <data>slerena</data>
  </module>
</agent_data>
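To show how such a packet can be read, the following Python sketch parses a shortened copy of the sample with the standard library. It is only an illustration of the XML structure; the `parse_packet` function and the shape of the dictionary it returns are choices made for this example, not part of Pandora FMS.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample packet, with plain ASCII quotes.
SAMPLE = """<agent_data os_name="SunOS" os_version="5.8" timestamp="300" agent_name="pdges01" version="1.0">
  <module><name>FTP Daemon</name><type>generic_proc</type><data>0</data></module>
  <module><name>DiskFree</name><type>generic_data</type><data>5200000</data></module>
</agent_data>"""


def parse_packet(xml_text: str) -> dict:
    """Return the agent attributes and a list of module dictionaries."""
    root = ET.fromstring(xml_text)
    modules = [
        {child.tag: (child.text or "").strip() for child in module}
        for module in root.findall("module")
    ]
    return {"agent": dict(root.attrib), "modules": modules}


packet = parse_packet(SAMPLE)
print(packet["agent"]["agent_name"])  # pdges01
print(packet["modules"][1]["data"])   # 5200000
```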
1.4.4 Physical Agent
Pandora FMS has a physical agent built on an Asus router and an Arduino controller. This tandem, together with the connected sensors, currently allows the following environmental features to be monitored:
- Ambient lighting
The sensors, being electrical, are easily calibrated, and their values can be processed by Pandora FMS without any difficulty. The fact that the sensor unit is a wireless router opens up a world of possibilities for this type of sensor, which is already present in the data centers of some companies.
1.5 Topologies, schemes and monitoring models
There are different models to address the monitoring process, both local and remote. We enumerate the following common examples for different topologies in order to familiarize the reader with the possible problems and the solutions Pandora has to offer. Each of the solutions is described in successive chapters.
1.5.1 Accessible networks
This is the norm in small, simple networks, but also in very centralized and well organized ones. This is the easiest model to implement.
- Network access for centralized remote monitoring. This means that every machine can be reached from the Pandora server in order to probe it remotely.
- Network access for agent based monitoring. In this kind of network the Pandora server can be reached from the Pandora agents installed on the monitored machines.
1.5.2 Limited access networks
- Remote network. This is a network that Pandora cannot reach for remote testing. To that end we use a software agent as a remote collector to test other systems. We call these operating modes 'satellite agent mode' (when all the testing is carried out within the same agent) and 'broker agent mode' (when it impersonates several agents, but every test is actually run on the same physical machine).
Deployment model for inaccessible remote networks in broker mode
- Software agents without access to the Pandora server. In this case we use the proxy feature of the software agents, which allows an agent without access to the server to connect through an agent that does have access.
Deployment model for remote networks using the proxy agent mode
- Need for remote monitoring of different networks. In this situation we set up several Pandora servers. Feeding the same database, one server runs one battery of predefined tests and another server runs a different one. Both servers operate in the same environment and are managed together from the console.
1.5.3 Special organizational characteristics
- A need to monitor several sites, each with its own monitoring equipment and configuration. In this case we use the Export Server to duplicate part of the monitoring in an independent, segregated Pandora environment.
Hierarchical export model with the Export Server
- Dual reporting. We can configure agents to report to two different Pandora servers, although only one of them will be able to manage the agent.
- Fragmented management. Useful when you need to delegate the administration of part of the equipment to different personnel with different access levels. This is more a management issue than an architectural one. It can be resolved through the permissions assigned in management policies.
1.5.4 Large environments
- Large network, with thousands of network tests that have to be distributed across different "remote monitoring probes". Given their large number (over 50,000), they cannot be centralized on a single server. To handle this we use different servers in broker mode, which distribute the remote testing load.
Model for distributing remote tests with agents in broker mode
- Need to set up an HA server for security reasons, in case of a failure in the primary hardware. We will study how to set up two servers, one of them 'passive', waiting on standby for the active one to stop responding so that it can take over. There are several ways to do it.
- Need to monitor a large volume of systems and manage them in a centralized way (more than 2500 agents). To do that we configure different Pandora servers, coordinated by a single system we call the 'metaconsole'. In this way they can be scaled linearly.
Metaconsole model