Pandora: Documentation en: Web Monitoring
- 1 Monitoring User Browsing Behavior
- 1.1 Introduction
- 1.2 How to Create Web Surveillance Modules
- 1.3 String Check on a Website
- 1.4 Checking the Latency of a Website
- 1.5 Checking of Websites by a Proxy
- 1.6 Retrieving web content
- 1.7 Form Checking on a Website
- 1.8 Simple HTTP Authentication
- 1.9 HTTPS (HTTP with SSL) Monitoring
- 1.10 Monitoring of Website-Services
- 1.11 IPv6 Support
- 1.12 Advanced Options
- 2 Distributed Transactional WEB Monitoring by Selenium
1 Monitoring User Browsing Behavior
1.1 Introduction
This Enterprise Version feature allows you to monitor a user's browsing behavior on a website. It is based on transactions: each check reproduces the steps of a complete browsing session in detail. These steps can include authenticating, filling out a form, clicking on menu options and verifying whether each step returns a specific text string. A failure at any step causes the whole check to fail. A complete transaction can also download all the resources (images, animations, etc.) that real browsing involves. In addition to monitoring availability and latency, you can retrieve values from these websites.
'Goliath' is the name of the Pandora FMS Web Surveillance Server. Goliath can monitor both HTTP and HTTPS connections transparently, provided the OpenSSL libraries are installed on your system.
1.2 How to Create Web Surveillance Modules
To monitor a web page remotely, you first have to create an agent for the service.
To do so, click on 'Manage agents' in the 'Resources' menu.
On the following screen, click on 'Create agent':
Fill out the form for your new agent and click on 'Create':
Once you have created the agent, click on the modules tab at the top. Select 'Create a new webserver module' and click on 'Create':
After you click 'Create', a form appears in which you fill out the fields required to monitor a web page.
This is an explanation of the form fields:
The name of the check.
There are various check types to choose from:
- Remote HTTP module to check latency: Obtains the total time from the first request until the last one is checked (a WEB test consists of one or several intermediate requests which complete the transaction). If the check definition contains several requests, the average time per request is used.
- Remote HTTP module to check server response: Obtains a value of '1' (OK) or '0' (FAILED) as a result of checking all the transactions. If there are several attempts and all of them happen to fail, we consider the test as a total failure.
- Remote HTTP module to retrieve numeric data: Retrieves a numeric value from an HTTP response using a regular expression.
- Remote HTTP module to retrieve string data: Retrieves a string from an HTTP response using a regular expression.
The entirety of the web checks to perform (only one by default).
The WEB check is either defined by several steps or as a simple request.
Simple requests have to be written in a special format in the 'Web checks' field. Each check starts with the 'task_begin' tag and ends with 'task_end'.
It's also possible to check whether a string appears on a web page. The 'check_string' variable exists for this purpose. Note that this variable doesn't allow you to check the HTML code itself. For example, to verify that the 'http://www.example.com' web page contains the string 'Section 3', configure the variable as follows:
check_string Section 3
If you want to make sure a string is not on a web page, you can use the 'check_not_string' variable instead:
check_not_string Section 3
There are several extra variables to check forms:
- resource ('1' or '0'): Download all the web resources (images, videos, etc).
- cookie ('1' or '0'): Keeps a cookie or an open session for later checks.
- variable_name: The name of a variable in a form.
- variable_value: The value of the previous variable on the form.
With these variables, it's possible to send data to forms and check whether they work properly.
In some specific cases, domain redirection does not work. To work around this, create a module which uses the final domain address reached after all redirections.
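If you generate web checks from a script, the structure described above can be sketched as follows. The helper `render_task` is hypothetical (it is not part of Pandora FMS), the URL and form field names are placeholders, and the layout assumes that each token goes on its own line between 'task_begin' and 'task_end'.

```python
from typing import Iterable, Tuple


def render_task(directives: Iterable[Tuple[str, str]]) -> str:
    """Render one web check as a task_begin ... task_end block.

    Hypothetical helper: each (token, value) pair becomes one line.
    """
    lines = ["task_begin"]
    for token, value in directives:
        lines.append(f"{token} {value}")
    lines.append("task_end")
    return "\n".join(lines)


# Placeholder URL and form fields, for illustration only.
login_task = render_task([
    ("post", "http://www.example.com/login"),
    ("variable_name", "user"),
    ("variable_value", "demo"),
    ("cookie", "1"),
    ("resource", "0"),
])
print(login_task)
```

The resulting text is what you would paste into the 'Web checks' field.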
1.3 String Check on a Website
The check to look up the 'Section 3' string on the 'http://www.example.com' website would have to look like this:
task_begin
get http://www.example.com
check_string Section 3
task_end
The complete form under Pandora FMS is going to look like this:
Once the check has been executed, it is shown in the 'View' menu under the corresponding tab. The data appears at the bottom of that page as soon as the module starts receiving it.
In this example, the monitor turns critical because no string on the web page matches "Section 3".
The 'check_string' parameter is not a plain string but a regular expression ('regexp'), a special notation for describing searches. For example, to search for 'Pandora FMS (4.0)', the expression should be something like 'Pandora FMS \(4.0\)'. This allows both plain-word and complex searches, but any non-alphanumeric character that should be matched literally has to be escaped with the '\' character.
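The effect of escaping can be tried out locally before saving a check. The sketch below uses Python's `re` module purely for illustration; the regular expression dialect used by the server may differ in details, and the sample page text is made up.

```python
import re

# Made-up page text for illustration.
page_text = "Welcome to Pandora FMS (4.0) demo"

# Unescaped: the parentheses form a group, so the pattern looks for
# "Pandora FMS 4.0" and does not match the literal "(4.0)".
unescaped_match = re.search(r"Pandora FMS (4.0)", page_text)

# Escaped: the parentheses are matched literally.
escaped_match = re.search(r"Pandora FMS \(4.0\)", page_text)

# re.escape() escapes every special character of a literal string.
auto_escaped = re.escape("Pandora FMS (4.0)")
auto_match = re.search(auto_escaped, page_text)

print(unescaped_match is None, escaped_match is not None, auto_match is not None)
```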
1.4 Checking the Latency of a Website
To check the latency of a website, just select the module type named 'Remote HTTP module to check latency'.
For example, to measure the latency of the website 'http://pandorafms.com', the check would look like the one shown below.
task_begin
get http://pandorafms.com
task_end
The complete form under Pandora FMS is going to look like this:
The result of this module, showing the latency, looks like this:
1.5 Checking of Websites by a Proxy
You can also run website checks through a proxy. To configure it, enter the proxy URL in the 'Proxy URL' field, located under 'Advanced options':
An example of the URL could be:
If the proxy requires authentication, you can use a URL like this:
1.6 Retrieving web content
Example: retrieving Google's Stock Quote
To retrieve Google's stock quote, create a 'Remote HTTP module to retrieve numeric data' module with the appropriate regular expression, as shown below.
task_begin
get http://finance.google.com/finance/info?client=ig&q=NASDAQ%3aGOOG
get_content \d+\.\d+
task_end
The output is going to be something like this:
As of Pandora FMS 4.1, you can specify which part of the regular expression is returned, in order to retrieve data from more complex HTTP responses:
task_begin
get http://finance.yahoo.com/q?s=GOOG
get_content_advanced <span id="yfs_l84_goog">([\d\.]+)</span>
task_end
The part of the regular expression to be returned (which is defined in 'get_content_advanced') has to be enclosed in parentheses.
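The difference between the two tokens can be illustrated with a small sketch. The Python functions below are hypothetical stand-ins named after the tokens, not the server's implementation; the sketch assumes that 'get_content' returns the first text matched by the whole expression, and the response body is made up.

```python
import re
from typing import Optional

# Made-up response body for illustration.
response_body = 'Version 2.0 <span id="yfs_l84_goog">1234.56</span>'


def get_content(pattern: str, text: str) -> Optional[str]:
    """Return the first text matched by the whole expression (assumed behavior)."""
    match = re.search(pattern, text)
    return match.group(0) if match else None


def get_content_advanced(pattern: str, text: str) -> Optional[str]:
    """Return only the part of the match enclosed in parentheses."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


simple = get_content(r"\d+\.\d+", response_body)
advanced = get_content_advanced(
    r'<span id="yfs_l84_goog">([\d\.]+)</span>', response_body
)
print(simple, advanced)
```

Here the plain expression picks up the first number on the page, while the expression with parentheses pins the value to its surrounding markup.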
1.7 Form Checking on a Website
A more interesting check is a website form check, although it is more complex than a simple text check. The example check uses a Pandora FMS public demo page: it starts a session and checks whether the login succeeded.
To conduct this kind of check, you need valid credentials to start the session. It's also advisable to open the page and look at its HTML code to find the form's variable names.
The URL of the website is 'http://firefly.artica.es/pandora_demo/index.php?login=1'. Once you are there, you can see that the variables are the following:
- nick: user name
- pass: user password
Use the 'variable_name' and 'variable_value' variables to fill in the form. The complete example is shown below.
task_begin
post http://firefly.artica.es/pandora_demo/index.php?login=1
variable_name nick
variable_value demo
variable_name pass
variable_value demo
cookie 1
resource 1
task_end
The previous task logs in to the website by submitting these values. Next, it's recommended to check that you are really logged in by searching for something which is only visible to a logged-in user:
task_begin
get http://firefly.artica.es/pandora_demo/index.php?sec=messages&sec2=operation/messages/message
cookie 1
resource 1
check_string Read messages
task_end
A further check can end the session on the page and log out:
task_begin
get http://firefly.artica.es/pandora_demo/index.php?bye=bye
cookie 1
resource 1
check_string Logged Out
task_end
The complete check under Pandora FMS looks like this:
Once all the checks are added, they appear on the module's list:
To see the status of the check, go to the 'View' menu, click on the corresponding tab and look at the bottom of the page, where the data is shown once the check starts receiving it.
You can also see more data for each module. To do so, click on the 'Data' tab. A list like the one shown below appears:
In this image, you can see both checks, their names, the interval at which each of them is executed (which may differ from the agent interval), and the data. For web checks, the Data column shows the total time the check has taken.
The following screen shows the advanced options for website monitoring (which differ in part from those of other module types):
The 'advanced features' fields are quite similar to the ones from other types of modules, but there are some differences which are specific to website checks:
It's the expiration time for the request. If the specified time elapses, the request is discarded.
Agent Browser ID
It's the web browser identifier to use, because some pages only accept certain web browsers. Please see zytrax.com for more information.
Pandora FMS repeats the check the number of times given by this parameter. If any of the checks fails, the whole check is considered a failure. The number of pages downloaded depends on the number of checks in the module: e.g. if the module has three checks, three pages are downloaded. If the 'Request' field holds a value, the number of downloads is multiplied by it. This is important to keep in mind when estimating the total time the module takes to complete its operations.
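The arithmetic above can be written down as a short sketch. The function name and the average page time are hypothetical, and the time estimate is only a rough guide that ignores additional resource downloads.

```python
def total_downloads(checks_in_module: int, requests: int = 1) -> int:
    """Pages downloaded per module run: checks multiplied by the request count."""
    return checks_in_module * requests


# A module with three checks, repeated four times.
downloads = total_downloads(checks_in_module=3, requests=4)

# Rough estimate, assuming a hypothetical average of 0.5 seconds per page.
average_seconds_per_page = 0.5
estimated_seconds = downloads * average_seconds_per_page

print(downloads, estimated_seconds)
```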
1.8 Simple HTTP Authentication
Some websites require HTTP authentication. This is not the 'normal' form-based user / password login: it is the kind of authentication where, on entering a site, you get a popup window asking for a user name and password for a realm or domain.
This type of authentication is configured in the web task itself, using a few additional tokens like the ones shown below.
http_auth_serverport artica.es:80
http_auth_realm Private area
http_auth_user admin
http_auth_pass xxxxxx
- http_auth_serverport - The domain and HTTP port of the server to authenticate against.
- http_auth_realm - The realm's name.
- http_auth_user - The user.
- http_auth_pass - The password.
This is a full example:
task_begin
get http://artica.es/pandoraupdate4/ui/
cookie 1
resource 1
check_string Pandora FMS Update Manager \(4.0\)
http_auth_serverport artica.es:80
http_auth_realm Private area
http_auth_user admin
http_auth_pass xxxx
task_end
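For background, the sketch below shows what a client sends once it has a user name and password, assuming the server uses the HTTP 'Basic' scheme (the documentation above does not state which scheme is involved). The function name and the credentials are placeholders.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of the Authorization header for the Basic scheme."""
    credentials = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


# Placeholder credentials, for illustration only.
header_value = basic_auth_header("admin", "secret")
print("Authorization:", header_value)
```

Because the credentials are only encoded and not encrypted, this kind of check is best combined with HTTPS.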
1.9 HTTPS (HTTP with SSL) Monitoring
Goliath is able to check both HTTP and HTTPS. To check secured websites which use HTTPS, you only have to use that protocol in the URL, e.g.:
task_begin
get https://www.google.com/accounts/ServiceLogin?service=mail&passive=true&rm=false&continue=https%3A%2F%2Fmail.google.com%2Fmail%2F%3Fui%3Dhtml%26zy%3Dl&bsv=zpwhtygjntrz&ss=1&scc=1&ltmpl=default&ltmplcache=2
cookie 1
resource 0
check_string Google
task_end
1.10 Monitoring of Website-Services
With Pandora FMS and Goliath's Web Surveillance features, you can also monitor web services and APIs which are based on the REST specification, but not SOAP or XML-RPC based web services.
Suppose you want to check a web API with a specific call which returns a number (from '0' to 'n') if it works properly. If it doesn't, it returns nothing, which Pandora FMS treats as a failure:
task_begin
get http://artica.es/integria/include/api.php?user=slerena&pass=xxxx&op=get_stats&params=opened,,1
check_string \n[0-9]+
task_end
This is going to return a reply like this:
HTTP/1.1 200 OK
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Connection: close
Date: Mon, 13 May 2013 15:39:27 GMT
Pragma: no-cache
Server: Apache
Vary: Accept-Encoding
Content-Type: text/html
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Client-Date: Mon, 13 May 2013 15:39:27 GMT
Client-Peer: 18.104.22.168:80
Client-Response-Num: 1
Client-Transfer-Encoding: chunked
Set-Cookie: a81d4c5e530ad73e256b7729246d3d2c=pcasWqI6pZzT2x2AuWo602; path=/

0
This produces an 'OK' (Green) module status, because the regular expression ('regexp') matches the '0' right after a line break. Note that the check is run against the entire response data, not just the 'data' section, so you can match the HTTP headers, too. For other responses, a different regular expression is required.
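The matching can be reproduced on a shortened copy of such a reply. In the sketch below, the function `check_string` is a hypothetical Python stand-in for the token, and the response text is abbreviated for illustration.

```python
import re


def check_string(pattern: str, response_text: str) -> int:
    """Return 1 (OK) if the pattern matches anywhere in the response, else 0."""
    return 1 if re.search(pattern, response_text) else 0


# Abbreviated reply: headers, an empty line, then the body "0".
working_reply = (
    "HTTP/1.1 200 OK\n"
    "Server: Apache\n"
    "Content-Type: text/html\n"
    "\n"
    "0"
)
# The same reply with an empty body, as returned when the call fails.
failing_reply = (
    "HTTP/1.1 200 OK\n"
    "Server: Apache\n"
    "Content-Type: text/html\n"
    "\n"
)

print(check_string(r"\n[0-9]+", working_reply))    # body starts with a digit
print(check_string(r"\n[0-9]+", failing_reply))    # no digit after a line break
print(check_string(r"Server: Apache", working_reply))  # headers can be matched too
```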
1.11 IPv6 Support
As of version 4.0.3, Goliath (the library which supports the Pandora FMS Transactional WEB Monitoring) supports IPv6. IPv6 websites have to be addressed by their FQDN (Fully Qualified Domain Name).
That is, the URL has to contain a host name (e.g. 'ipv6.google.com'). IPv6 address representations (e.g. [::1], [2404:6800:4004:803::1014] etc.) are not allowed. This limitation originates from LWP ('libwww-perl').
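If you maintain many checks, a small script can flag URLs that would hit this limitation. This is a sketch using the Python standard library; the function name is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def uses_ipv6_literal(url: str) -> bool:
    """Return True if the host part of the URL is a literal IPv6 address."""
    host = urlsplit(url).hostname  # brackets are stripped, e.g. "::1"
    if host is None:
        return False
    try:
        ipaddress.IPv6Address(host)
    except ValueError:
        return False
    return True


for url in (
    "http://ipv6.google.com/",
    "http://[::1]/",
    "http://[2404:6800:4004:803::1014]/",
):
    print(url, "-> not allowed" if uses_ipv6_literal(url) else "-> ok")
```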
1.12 Advanced Options
1.12.1 Modifying HTTP Headers (available from versions 4.0.2 and above)
With the 'header' option, you can modify HTTP headers or create your own. The example below changes the 'Host' HTTP header:
task_begin
get http://192.168.1.5/index.php
header Host 192.168.1.1
task_end
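To make the effect concrete, the sketch below builds an equivalent request with Python's standard library: the connection would go to one address while the 'Host' header names another. It only constructs the request object and sends nothing; it is an illustration, not how the server performs the check.

```python
from urllib.request import Request

# The request targets 192.168.1.5, but announces 192.168.1.1 in its Host header.
request = Request(
    "http://192.168.1.5/index.php",
    headers={"Host": "192.168.1.1"},
)

print("connects to:", request.host)
print("Host header:", request.get_header("Host"))
```

This is the usual way to test a name-based virtual host on a specific server.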
1.12.2 Debugging Web Surveillance (available from versions 4.0.2 and above)
To debug a website check, add the 'debug <log_file>' option. The files '<log_file>.req' and '<log_file>.res' are created, containing the HTTP request and the HTTP response respectively:
task_begin
get http://192.168.1.5/index.php
debug /tmp/request.log
task_end
The website check above is going to generate the files '/tmp/request.log.req' and '/tmp/request.log.res'.
1.12.3 Using CURL instead of LWP
LWP sometimes crashes when multiple threads issue HTTPS requests simultaneously. To solve this problem, you just have to edit the file named '/etc/pandora/pandora_server.conf' and to add the below mentioned line to it:
After you've restarted the Pandora FMS Server, the CURL binary is going to be used to perform website checks instead of LWP.
2 Distributed Transactional WEB Monitoring by Selenium
In addition to the features of Goliath, which is an integral part of all Pandora FMS Enterprise versions, there is another way to perform transactional monitoring with Pandora FMS. This method uses an agent-like approach instead of a centralized system. It allows you to distribute the load and to use servers in remote networks to monitor different websites or applications.
The Selenium plug-in documentation is very extensive and specific. It can be found in the Pandora FMS Module Library, along with the Enterprise Selenium plug-in.