Pandora: Documentation en: Services
1 Service Monitoring
1.1.1 The concept of service monitoring
A service is a way to group your IT resources based on their functionality. For example, a service could be your official website, your CRM system, your support application, or even your printers. Services are logical groups which can include hosts, routers, switches, firewalls, CRMs, ERPs, websites and, of course, other services. The following example shows more clearly what a service is.
Chip Company sells computers through its website all around the world, and it has three big departments: Online Shop, Support and Management.
As you can see, three services are offered to customers: Online Shop, Support and, indirectly, Management. All services are crucial for the business because if one fails the others can be affected, and the company could lose a lot of money and even customers. And in the end, as you know, happy customers can bring more customers to your company.
To monitor the services of Chip Company we need to know each service in more depth.
The Online Shop department is responsible for guaranteeing that the shop website is online and that all product prices are right, for creating the product categories and, overall, for ensuring that all information about products, delivery and payment methods is right on the website to make shopping easy. From this service we want to monitor the following parameters:
The Support department has to solve all the problems customers have with the computers they have bought. Some tasks of this department are: helping customers to configure their computers, managing the replacement of computer parts and managing the return of delivered products. Together with Online Shop, this department forms the customer-facing services, so both are very important for the company to be perceived as a high-quality one. From the Support service we want to monitor the following parameters:
The third department is Management, which includes Marketing, Commercial, Human Resources and other departments focused on internal management. Their main job is to ensure that all processes inside the organization run correctly. The services of this department are crucial because it coordinates all the other departments. The most interesting parameters for the Management services are:
To monitor our services we built some maps with the Pandora FMS Visual Console, based on the pictures we have of the service hierarchy of Chip Company. These maps are calculated in real time, so you will always know the status of your services. First of all, we made a map of each service.
The next picture shows the map of the Online Shop service with the status of each parameter. As you can see, the parameter called Content Updated has a red dot, which means there is a problem with it. The other parameters are fine because they have green dots. With the green arrow you can go to the general view map, which you will see in the next steps.
If you want to see what the problem is, you can click on the red dot to open the technical view, which tells you more about the problem. This technical view shows the data gathered by Pandora FMS from many sources, such as CRM, ERP, SAP servers and databases (MySQL, Oracle, etc.), and even from devices like PCs, servers and routers.
We also made a map for the Support service, which you can see in the picture below. As you can see, all the important parameters of the Support service are OK because all of them have green dots.
To finish with the service maps, we made the map for the Management service, which you can see in the next picture. Again it shows all the important parameters with their dots; in this case all the dots are green, which means the parameters of the service are fine.
Furthermore, we made a general map with all the services, which you can see in the next picture. This map shows the service hierarchy of Chip Company with the status of each service. Also, if you click on a dot you will see the specific map of that service. With all these maps we have created a full navigation map of all the services of Chip Company. The status of each service is the same as the one shown in its specific map: the Management and Support services are OK but the Online Shop service has problems. Note how the status of a service propagates up the hierarchy to the top.
1.2 Services in Pandora FMS
1.2.1 How services work in Pandora FMS
Unlike "specific" monitoring, where specific values are kept from specific indicators, service monitoring with Pandora FMS is designed to monitor "groups" of elements of different kinds, with a certain "margin of error" based on the accumulation of failures.
To better understand what service monitoring consists of, here is an example.
We want to monitor whether the service that we provide through a WEB cluster is "OK". This cluster consists of the following elements:
- Two routers in HA.
- Two switches in HA.
- Twenty Apache WEB servers.
- Four WebLogic application servers.
- One MySQL cluster of two storage nodes and two SQL processing nodes.
It is possible to monitor each element individually and, in fact, that is the first thing we need in order to activate the "global" service monitoring. Each element included in the service should be a "standard" monitor, one of those already monitored with Pandora FMS; that is, it is a PREREQUISITE for service monitoring.
The need to monitor services as something "abstract" appears when we ask ourselves this question: what happens when an element that is not initially critical fails, such as one of the twenty Apache servers? In principle, we may not want to be warned. Such a server could even fail frequently and, since there are twenty nodes, the failure of a single node should not trigger a warning (imagine that this warning wakes up someone who is sleeping). In fact, a service with so much redundancy is meant to give us more peace of mind, not more work. It should only warn us if a more critical element is down (such as a router) or if "several" WEB servers are down, for example four or five of them.
With this in mind, we assign "weights" to each element of our example:
- Switches and routers: 5 points for each one in critical, and 3 points for each one in warning.
- WEB servers: 1.2 points for each one in critical. We don't consider the warning status.
- WebLogic Servers: 2 points for each one in critical.
- MySQL cluster: 5 points for each node in critical, 3 points in warning.
We set a warning threshold of 4 for the service, and a critical threshold of 6. Thus, if everything is going well, all the monitored elements are OK and the service is "OK".
Now, suppose that ONE Apache WEB server is down:
- 1 x Apache server in CRITICAL x 1.2 points = 1.2, and 1.2 < 4 (warning threshold), so the service is still in OK status.
See what happens if a WEB server and a WebLogic server are down:
- 1 x Apache server in CRITICAL x 1.2 points = 1.2
- 1 x WebLogic server in CRITICAL x 2 = 2
In total, 3.2 is still < 4, so the service is still in OK status and the operator does not have to get out of bed.
See what happens if two WEB servers and one WebLogic server are down:
- 2 x Apache server in CRITICAL x 1.2 points = 2.4
- 1 x WebLogic server in CRITICAL x 2 = 2
Now 4.4 is > 4, so the service goes into WARNING status. The operator may not have received an urgent SMS yet, but at least someone will certainly receive an email. Let's continue with the example.
Suppose that, in addition to the previous failures, one router goes down:
- 2 x Apache server in CRITICAL x 1.2 points = 2.4
- 1 x WebLogic server in CRITICAL x 2 = 2
- 1 x Router in CRITICAL x 5 = 5
We now have 9.4, which is higher than the critical threshold of 6, so the service is in CRITICAL status and our operator has no option but to wake up.
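The arithmetic of this example can be reproduced with a short Python sketch. It is only an illustration of the weighted-sum idea, not Pandora FMS code: the names `WEIGHTS`, `service_value` and `service_status` are hypothetical, and treating a threshold as reached when the value is greater than or equal to it is an assumption (the example only shows values strictly above a threshold).

```python
# Weights per element type and state, copied from the example above.
# A state with no entry (e.g. a WEB server in warning) contributes nothing.
WEIGHTS = {
    "router":   {"critical": 5.0, "warning": 3.0},
    "switch":   {"critical": 5.0, "warning": 3.0},
    "apache":   {"critical": 1.2},
    "weblogic": {"critical": 2.0},
    "mysql":    {"critical": 5.0, "warning": 3.0},
}

WARNING_THRESHOLD = 4.0
CRITICAL_THRESHOLD = 6.0


def service_value(failures):
    """Sum the weights of every (element_type, state) pair in `failures`."""
    return sum(WEIGHTS[kind].get(state, 0.0) for kind, state in failures)


def service_status(value):
    """Map a service value to a status.

    Assumption: a threshold counts as reached when value >= threshold.
    """
    if value >= CRITICAL_THRESHOLD:
        return "CRITICAL"
    if value >= WARNING_THRESHOLD:
        return "WARNING"
    return "OK"


# The four situations walked through in the text.
scenarios = [
    [("apache", "critical")],
    [("apache", "critical"), ("weblogic", "critical")],
    [("apache", "critical")] * 2 + [("weblogic", "critical")],
    [("apache", "critical")] * 2 + [("weblogic", "critical"), ("router", "critical")],
]

for failures in scenarios:
    value = service_value(failures)
    print(f"{value:.1f} -> {service_status(value)}")
```

Running the sketch prints one line per situation, with the service value followed by the resulting status.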
Service monitoring is a feature available only in the Pandora FMS Enterprise version.
1.2.2 Creating a new service
A service represents an association of agent modules, and its value is calculated in real time. Because of that, you first need to have all the devices that make up the service monitored, with their module values normalized to three statuses: Normal, Warning and Critical. You can learn more about this in the wiki sections Monitoring with Pandora FMS and Monitoring with policies.
Once all the devices are monitored, you can group them into a service. Inside each service you can add all the modules you need to monitor it. For example, if you want to monitor the Online Shop service you need a module that monitors the content, another that monitors the communication status, and so on. The next steps show how to create a service with Pandora FMS.
To create a new service, click on the service tab of the Operation menu.
The list of services will appear. The image below shows the list without any services.
To create a new service, just click on the Create button. Then fill in the fields of the form below.
At this point we have a service without items, so we have to add items to it. To add a new item, click on the orange tool at the top right of the Service Management tab and then on the Create button. The form below will appear. In this form you must select the agent module to add. You must also fill in the fields for the weight of this module within the service for the Normal, Warning and Critical statuses. The heavier a module, the more important it is within the service.
When all fields are filled in, click on the Create button and the next picture will appear with the success message.
You can add all the items you need to monitor your service. For example, we have added the elements of this service with the proper weights, and the result looks like the next picture.
Once your service is created, you can go to the Service Operation tab.
The service operation list will then appear, as in the image below. This view is calculated in real time and shows the following parameters:
- Name: name of the service.
- Description: description of the service.
- Group: group the service belongs to.
- Critical: limit value from which the service is in critical state.
- Warning: limit value from which the service is in warning state.
- Value: value of the service. It's calculated in real time.
- Status: state of the service depending on its value and the critical and warning limits.
If you click on a service name you will see the specific service view. As you know, the value of a service is calculated as the sum of the weights associated with the state of each module. Services, like modules, have a state that depends on their value. This view shows the status of each service item with the following parameters:
- Agent Name: name of the agent the module belongs to.
- Module Name: name of the module.
- Description: free description.
- Weight Critical: weight when the module is in a critical state.
- Weight Warning: weight when the module is in warning state.
- Weight Ok: weight when the module is in normal state.
- Data: value of the module.
- Status: state of the module.
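In symbols, the calculation behind this view can be written as follows. The notation is introduced here only for illustration: $V$ is the service value, $w_i(s_i)$ is the weight configured for item $i$ in its current state $s_i$, and $W$ and $C$ are the warning and critical limits of the service. Using "greater than or equal" at the limits is an assumption based on the phrase "limit value from which".

```latex
\[
V = \sum_{i=1}^{n} w_i(s_i), \qquad s_i \in \{\text{ok}, \text{warning}, \text{critical}\}
\]
\[
\text{status}(V) =
\begin{cases}
\text{critical} & \text{if } V \ge C \\
\text{warning}  & \text{if } W \le V < C \\
\text{ok}       & \text{if } V < W
\end{cases}
\]
```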
It is also possible to create modules associated with services, with the advantages that this implies (calculation periodicity, integration with the alert system, etc.). To associate a module with a service, follow these steps:
- Create the individual monitors that make up the service and make sure that they work well.
- Fix the individual thresholds for each monitor to define CRITICAL and/or WARNING states.
- Create a service with the monitors that we want, and define thresholds for the service and weights for each monitor included in the service.
- Go to the agent where we want to "locate" the monitor associated with the service.
- Create a new module of the "prediction" kind in this agent, using the module editor of the Prediction server, and associate it with one of the services in the list.
- If we want to associate alerts with the service, we should do it on the module that is associated with the service. The service itself cannot have alerts, graphs or reports attached to it. All of this has to be done through the monitor that is linked to the service, as described above.
Once all your services are created, you can create Visual Maps to show the status of your services visually at any time. You can find more information about Pandora FMS Visual Maps in the wiki section Data display and reporting.
With this tool we made the maps you saw in the introduction, which represent the services of Chip Company. Below you can see the map of the Chip Company services.
Furthermore, if you need a more technical map, you can create more detailed maps with the Pandora FMS Visual Console. You can add icons, graphs, status dots, tags and simple data. In the picture below you can see the technical map of the Online Shop service with the status of all devices.
1.2.3 Services groups
Services are logical groups which are part of the company's business structure. Because of that, it may be necessary to group services, because sometimes services alone do not have a complete meaning. To create a service group you add each service to an existing agent, so that each service becomes a module of the agent. With these groups we can create a new logical structure: a group of services.
These groups can help us to create visual maps, configure alerts, apply monitoring policies, etc. For example, we can create alerts that fire when the status of the company is critical because our salespeople cannot do their job properly, or when one of our headquarters cannot work properly because of technical problems with its ERP.
To make this clearer, below are two examples of service groups.
1.2.3.1 Several services inside the same company
Following the previous example, suppose that we also have salespeople who sell the WEB service to our clients and who need to access a CRM to manage their clients.
Our CRM service is made up of:
- Two routers in HA.
- Two Apache WEB servers.
- One MySQL cluster with two data nodes and two SQL processing nodes.
For this example, suppose that the CRM architecture is already monitored and that the CRM service has been created.
At this moment we have two services:
- WEB Cluster Service (for our clients)
- CRM Service (for our sales department)
To use service groups, the best option is to create a new agent called, for example, "Company", which has the WEB Cluster Service and the CRM Service as modules. The services are then grouped under this agent.
1.2.3.2 Different services through several headquarters
Another example would be monitoring the different headquarters of a company, where each headquarters has its own services.
For this example, suppose that there are three headquarters with the following services: CRM, ERP and Internal Web. The services are configured for the specific needs of each headquarters. At this point, we have all the services of each headquarters monitored, as in the following picture.
However, we may need another logical representation that groups the services of each headquarters together, to get a structure inside Pandora FMS that is closer to the real company. To do that, we can create an agent for each headquarters with a module for each of its services. With this approach we obtain the following groups.
With this way of grouping services, we can recreate inside Pandora FMS the logical structure present in the real world. Furthermore, we obtain full monitoring of all services.
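As a rough illustration of this grouping, the following Python sketch models one agent per headquarters with one module per service. It does not use any Pandora FMS API. The agent names, the statuses and the helper functions are hypothetical, and reporting a group as being as bad as its worst service is an assumption made for the example.

```python
# Hypothetical layout: one agent per headquarters, one module per service.
# Agent names and statuses are illustrative only.
SEVERITY = {"OK": 0, "WARNING": 1, "CRITICAL": 2}

headquarters = {
    "Headquarters A": {"CRM": "OK", "ERP": "WARNING", "Internal Web": "OK"},
    "Headquarters B": {"CRM": "OK", "ERP": "OK", "Internal Web": "OK"},
    "Headquarters C": {"CRM": "CRITICAL", "ERP": "OK", "Internal Web": "WARNING"},
}


def group_status(services):
    """Return the most severe status among a group's services.

    Assumption: a group is reported as being as bad as its worst service.
    """
    return max(services.values(), key=SEVERITY.__getitem__)


def failing_services(services):
    """List the services of a group that are not OK, sorted by name."""
    return sorted(name for name, status in services.items() if status != "OK")


for agent, services in headquarters.items():
    print(agent, group_status(services), failing_services(services))
```

A structure like this is what makes per-headquarters alerts or maps possible: each agent gives a single place to ask which of its services are failing.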