Web Monitoring

Classic Web Monitoring

Introduction

In Pandora FMS, web monitoring is carried out by the Network Server. This system operates under the web transaction principle: each complete transaction against one or several web pages is defined by one or more consecutive steps, all of which must conclude successfully for the transaction to be considered successful.

  • The Network Server has important limitations, such as not handling dynamic JavaScript at runtime.
  • For more complex web transactions, Pandora FMS has another much more powerful (and complex) component called WUX Monitoring (Web User Experience).

Installation and configuration

The Network Server is active by default. Depending on the number of requests, you may need to increase the number of threads and the default timeout:

network_threads 4
web_timeout 60

Each module can also use a different timeout.

Pandora FMS has CSRF protection, and when web checks are debugged the following message may be obtained:
Cannot verify the origin of the request
Take this protection into account and, where it gets in the way, consider using “WUX Monitoring”.

/etc/pandora/pandora_server.conf
# Uncomment to perform web checks with LWP instead of CURL.
#web_engine lwp

Upon restarting the Pandora FMS Server, LWP will be used to carry out web checks instead of the Curl binary.

For checks over IPv6, FQDN addresses must be used. This means that URLs must contain full host names, such as ipv6.google.com.
Numerical representations of IPv6 addresses, such as ::1 or 2606:4700:3108::ac42:2896, are not valid for performing IPv6 checks with PFMS web modules.

Creating web modules

Menu Management → Resources → Manage agents.

To monitor a web page remotely, select the agent that will contain the new module.
In that agent, click on the corresponding Modules shortcut, click the Create module button, choose Web module from the list, and press the Create button.

After naming the module, you must choose the type of module to use:

  1. Remote HTTP module to check latency: Obtains the total load time elapsed from the first request until the last one is checked.

  2. Remote HTTP module to check server response: This module type works similarly to the previous one, with the only difference that it lacks thresholds. The only values it obtains are 1 for normal status and 0 for critical status.

  3. Remote HTTP module to retrieve numeric data: Using a regular expression, this module applies it to the HTTP response to obtain a numeric value.

  4. Remote HTTP module to retrieve string data: Using a regular expression, this module compares it against the HTTP response to obtain a text value.

  5. Remote HTTP module to check server status code: The basic check of a web page consists of quickly obtaining its status code and knowing in advance if more checks can be performed.

The Web checks text box is accompanied by three practical buttons:

  • Button Load basic.

Always places the following code regardless of whether it works or is compatible with the selected module type:

task_begin
get https://demoweb.com/page/
check_string text string or HTML code to search (regexp)
task_end

The above code is merely illustrative; each type of web module has its specific and/or appropriate instructions for it.

  • Button Check.

When pressed, the syntax of the inserted commands is verified; if it is not valid, a yellow icon is shown, with a pop-up message when the pointer is placed over it. Essentially, it ensures that the code starts with task_begin and that at least one task_end tag appears (this must always appear last). The appropriate parameters for each web query should still be reviewed, depending on the module type.

  • Button Debug.

It is often necessary to determine in detail whether the web query is receiving the expected response. To use this button, the module must be in a status other than unknown, and the module must be saved each time before the button is used.

With this button, the first thing checked is whether a response other than HTTP 200 is received. This is done by running curl for the get command with the following parameter:

-w '%{http_code}'

In the benign case of HTTP 3xx redirects, it will be enough to place the final address after the redirect. The curl command can use the --location parameter to follow the received redirect.
Bear in mind that using --location-trusted will force all established information to be delivered to the redirected URL.

Receiving an HTTP 200 code does not necessarily mean the page is actually available since it can later be redirected via JavaScript.

If the -s, -o and -w parameters (and their values) are removed, leaving only the URL, the HTML code itself can be analyzed, which can be useful to rule out a JavaScript redirect, a protection CAPTCHA to determine if it is a human querying, etc.

Some web servers are configured to only serve requests from certain web browsers. For debugging, the curl software allows using the --user-agent parameter to simulate being the desired browser (at the module level, see Agent browser ID):

--user-agent 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.84 Safari/537.36' \
  'https://pandorafms.com/en/'
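
The debugging flags discussed in this section can be assembled into one command line. The following Python sketch only builds the argument list and does not run curl. The helper name build_curl_debug_args and the use of /dev/null as the -o target are assumptions for illustration; the exact command executed by the Debug button is not reproduced here.

```python
import shlex

def build_curl_debug_args(url, follow_redirects=False, user_agent=None):
    """Build a curl argument list that prints only the HTTP status code.

    Hypothetical helper: the flags are standard curl options, but this is
    not the command PFMS itself runs.
    """
    # -s silences progress output, -o discards the body (assumed target:
    # /dev/null), -w prints the status code after the transfer.
    args = ["curl", "-s", "-o", "/dev/null", "-w", "%{http_code}"]
    if follow_redirects:
        args.append("--location")  # follow HTTP 3xx redirects
    if user_agent:
        args += ["--user-agent", user_agent]
    args.append(url)
    return args

args = build_curl_debug_args(
    "https://pandorafms.com/en/",
    follow_redirects=True,
    user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(args))
```

Dropping the -s, -o and -w entries from the list leaves a command that prints the HTML itself, which matches the manual inspection described above.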

If this traditional web monitoring becomes too complicated, you can consider using WUX Monitoring, which can execute JavaScript code and many other functionalities.

Advanced options

Modifying HTTP headers

With the header option, you can modify HTTP header fields or create custom fields. To change the Host field of the HTTP header:

task_begin
get http://192.168.1.5/index.php
header Host 192.168.1.1
task_end

Proxy use

In each of the web check module types, the Basic tab shows the fields necessary for proxy use.

Monitoring web services and APIs

REST APIs can be monitored, except for more complex types of APIs based on protocols such as SOAP or XMLRPC.

By checking the output with a regular expression and using a web module to obtain text responses, everything can be verified as correct:

task_begin
get http://127.0.0.1/pandora_console/include/api.php?info=version
get_content 786
debug /tmp/api
task_end

In the previous case, PFMS API 1.0 is expected to return version 786. For any other value to be reported as critical status, place 786 in the Critical threshold (Min / Max) and check Inverse interval: any value other than 786 will then change the module to critical status.

For more complex responses, other regular expressions and the get_content_advanced token must be used.

  • When making API calls, it is important to keep in mind that the destination API must have permissions to be queried.

In the following case with a response in JSON format, the license_mode field is searched for and then its content is extracted with a regular expression:

task_begin
get http://127.0.0.1/pandora_console/include/api.php?op=get&op2=license&return_type=json&apipass=1234&user=admin&pass=pandora
get_content_advanced "license_mode":"([A-Za-z ]+)"
debug /tmp/api_license
task_end

For the above, Perpetual is placed in the critical threshold with Inverse interval activated: any other type of license will cause the module status to change.
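
The two steps above (extract a field with a capture group, then flag anything other than the expected text) can be modelled in a few lines of Python. This is a simplified sketch and not PFMS code: the sample JSON response and the helper names extract and is_critical are hypothetical, and Python's re module stands in for whatever regular expression engine the server uses.

```python
import re

def extract(pattern, body):
    """Return the first capture group of the first match, or None."""
    match = re.search(pattern, body)
    return match.group(1) if match else None

def is_critical(value, expected, inverse=True):
    """Simplified model of a text threshold with Inverse interval.

    With inverse=True the result is critical when the value does NOT
    match the expected text (assumption-based illustration).
    """
    matches = re.search(expected, value) is not None
    return not matches if inverse else matches

# Hypothetical API response, shaped after the example above.
sample = '{"license_mode":"Perpetual","expiry":"n/a"}'
mode = extract(r'"license_mode":"([A-Za-z ]+)"', sample)

print(mode)                            # Perpetual
print(is_critical(mode, "Perpetual"))  # False
print(is_critical("Trial", "Perpetual"))  # True
```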

Using Simple HTTP authentication

By default, web checks in PFMS Server are performed without any user authorization (Anyauth in the Check type field). Some pages may require Simple HTTP authentication (or other historically standardized methods). It is generally used as a quick check, a minimum security greeting that allows access to more advanced security checks (encryption, data persistence, etc.).

  • Using quotes in the password for http_auth_pass is not supported.
  • Avoid using single quotes.

Consider the following file with PHP code hosted in the root of a web server called https://example.com/:

your_file_name.php
<?php
# BASIC authentication
if (!isset($_SERVER['PHP_AUTH_USER'])) {
    header('WWW-Authenticate: Basic realm="My Realm"');
    header('HTTP/1.0 401 Unauthorized');
    exit;
} else {
    echo "<p>Hello {$_SERVER['PHP_AUTH_USER']}.</p>";
    echo "<p>You entered {$_SERVER['PHP_AUTH_PW']} as your password.</p>";
}
?>
  1. The HTML code is the minimum for the test.
  2. No identity check is performed. It is simply expected that a username and password be provided according to the BASIC authentication standard.
  3. In the PFMS Server, a Remote HTTP module to check server response module will be added to an agent (verify that this module type has been chosen) in this way:
task_begin
# BASIC authentication
get https://example.com/your_file_name.php
check_string Pandor4!
task_end
  • In the Check type field, BASIC must be selected.
  • In the HTTP auth (login) field, place any user.
  • In the HTTP auth (password) field, place Pandor4!.
  • Save the module and force the check.

The check_string command will verify that the dummy test passwords match and will set the module to normal. To test and verify the critical status, change the HTTP auth (password) field to any other value and repeat the procedure.
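
For reference, BASIC authentication sends the user and password joined by a colon and encoded in Base64 inside the Authorization header. The Python sketch below reproduces that encoding with the dummy credentials from this example; the function name basic_auth_header is hypothetical and the snippet does not contact any server.

```python
import base64

def basic_auth_header(user, password):
    """Return the Authorization header value for HTTP Basic authentication."""
    # User and password are joined with a colon, then Base64-encoded.
    raw = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")

header = basic_auth_header("any", "Pandor4!")
print(header)

# Decoding the token recovers the credentials: Base64 is an encoding,
# not encryption, which is why the test PHP file can echo the password.
decoded = base64.b64decode(header.split(" ", 1)[1]).decode("utf-8")
print(decoded)
```

Because the credentials are trivially recoverable, this kind of check is best used over HTTPS, as in the example URL.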

Similarly, the test PHP file can be adapted to other authentication standards.

  • DIGEST:
your_file_name.php
<?php
# DIGEST authentication
if (!isset($_SERVER['PHP_AUTH_DIGEST'])) {
    header('HTTP/1.1 401 Unauthorized');
    // The header value is built on one line: header() rejects embedded newlines.
    header('WWW-Authenticate: Digest realm="My Realm",'
         . 'qop="auth",nonce="' . uniqid() . '",'
         . 'opaque="' . md5('My Realm') . '"');
    exit;
}

// Process the digest authentication
echo $_SERVER['PHP_AUTH_DIGEST'];
?>
task_begin
# DIGEST authentication
get https://example.com/your_file_name.php
check_string admin
task_end

Form check on a web page

A form check is much more complex than simple text checking on a web page. To be able to perform these types of checks (POST), you must have the necessary credentials and/or variables. Additionally, it is necessary to go to the page and obtain the HTML code to get the variable names, and then minimum knowledge of HTML is required to enter the query for the Network Server.

The practical method for designing a multi-step web transactional test is to test each block of code, adding them one by one together with the debug command. The Debug button is disabled for these types of cases.

Consider the following file with PHP code hosted at the root of a web server called https://example.com/:

your_file_name.php
<?php
   // Test file only: values are escaped before being echoed, but not validated.
   if (!empty($_POST["name"]) || !empty($_POST["age"])) {
      echo "Welcome <b>" . htmlspecialchars($_POST["name"] ?? "") . "</b><br />";
      echo "You are <i>" . htmlspecialchars($_POST["age"] ?? "") . "</i> years old.<br />";
      echo 'Now: ' . time();
      exit();
   }
?>
<form action="<?php echo htmlspecialchars($_SERVER["PHP_SELF"]); ?>" method="POST">
   Name: <input type="text" name="name" />
   Age: <input type="text" name="age" />
   <input type="submit" />
</form>

  • For educational reasons, strict HTML code is omitted.
  • The first visit shows a simple form to enter name and age; at least one of them must be entered. When the data is sent via POST, a message displays the entered value or values.
  • $_SERVER["PHP_SELF"] makes the form post back to the same file, so any valid filename can be used.
  • The time() value (UNIX time) helps tell the responses apart in the debug (log) files, which can be followed in a separate terminal window.

In the PFMS Server, an agent will be chosen to add a Remote HTTP module to check server response module with the following script:

task_begin
debug /tmp/post_variable
variable_name name
variable_value JIMMY
variable_name age
variable_value 99
post https://example.com/your_file_name.php
# Verify that "<b>Jimmy</b>" exists:
check_string \<b\>Jimmy\<\/b\>
task_end

The post command is in charge of performing the web query. Note that the check_string command then matches regardless of uppercase or lowercase. If the name is found in the response, the module will report normal status; otherwise it will go into critical status. This last part can be tested by making the following change and then forcing the check on the agent:

variable_name name
variable_value Kevin
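
To see what this transaction amounts to, the following Python sketch encodes the two form fields as a standard form body and applies a case-insensitive search to the text the test PHP file would return. It is an offline illustration only: the response string is written by hand, and the order in which PFMS actually sends the variables may differ from the one shown.

```python
import re
from urllib.parse import urlencode

# Pairs as declared with variable_name / variable_value in the script above.
fields = [("name", "JIMMY"), ("age", "99")]

# Standard application/x-www-form-urlencoded body.
body = urlencode(fields)
print(body)  # name=JIMMY&age=99

# Text the test PHP file would produce for these fields (the time()
# line is left out because its value changes on every request).
response = "Welcome <b>JIMMY</b><br />You are <i>99</i> years old.<br />"

# Case-insensitive search, mirroring the behavior described for check_string.
pattern = r"\<b\>Jimmy\<\/b\>"
found = re.search(pattern, response, re.IGNORECASE) is not None
print(found)  # True
```

Replacing JIMMY with Kevin in the response makes the search fail, which corresponds to the critical status test described above.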

Number of requests and number of retries

In modules for web checks, the Requests (Basic tab) and Retries (Data tab) fields are included via the Web Console.

Requests: Pandora FMS will repeat the check the number of times indicated in this parameter. If one of the checks fails, the check will be considered failed.
Depending on the number of checks in the module, a certain number of pages will be obtained. It is important to keep this in mind to calculate the total time it will take the module to complete operations.

Retries: The number of times a request is performed until a successful result is obtained.
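
As a rough aid for that calculation, the Python sketch below multiplies the Requests value by the number of URLs in the transaction. The worst-case figure assumes that pages are fetched one after another and that each fetch is limited by the timeout; both are assumptions of this sketch, and retries are not included.

```python
def pages_fetched(requests, urls_per_transaction):
    """Number of pages downloaded in one module execution."""
    return requests * urls_per_transaction

def worst_case_seconds(requests, urls_per_transaction, timeout=60):
    """Upper bound, assuming sequential fetches each limited by the timeout."""
    return pages_fetched(requests, urls_per_transaction) * timeout

# Example: Requests = 3, a transaction with 2 get/post commands,
# and the default web_timeout of 60 seconds.
print(pages_fetched(3, 2))       # 6
print(worst_case_seconds(3, 2))  # 360
```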

Web browser identifier

Web check timeout

The timeout is defined by default in the web_timeout token as 60 seconds.

If a different timeout is needed for a specific module, it can be configured in the Timeout field of the Data tab.

Available parameters

task_begin
[resource <1|0>]
[cookie <1|0>]
[post url]
[get url]
[head url]
[check_string text]
[check_not_string text]
[variable_name variable]
[variable_value value]
[get_content regular_expression]
[get_content_advanced html(regular_expression)html]
[header html_header]
[debug path_to_log]
task_end

task_begin

Marks the start of a block of code which must end with task_end.

Each web module must contain at least one block of code, with at least one instruction to execute:

task_begin
head https://example.com/
task_end

Several blocks of code can be added when forms need to be checked or, in general, when several additional steps have to be performed to reach a specific result:

  • The most common case is form checking, whether for user authentication processes or filling out a survey with one or more fields to fill in.
  • Another case would be querying a specific web page and, according to its response, continuing and performing a second query. In this scenario, the result of the first and second code blocks must be successful for the complete check to be given as positive and move on to comparing with the module's thresholds to determine if its status should change.
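
The block structure can be checked before a script is pasted into a module. The Python sketch below is a hypothetical offline helper, not part of PFMS: it splits a script into task_begin … task_end blocks, skips comment lines, and rejects commands that are not in the parameter list above.

```python
KNOWN_COMMANDS = {
    "resource", "cookie", "post", "get", "head", "check_string",
    "check_not_string", "variable_name", "variable_value",
    "get_content", "get_content_advanced", "header", "debug",
}

def parse_blocks(script):
    """Split a web check script into blocks of (command, argument) pairs.

    Raises ValueError when the script is malformed.
    """
    blocks, current = [], None
    for raw in script.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "task_begin":
            if current is not None:
                raise ValueError("nested task_begin")
            current = []
        elif line == "task_end":
            if current is None:
                raise ValueError("task_end without task_begin")
            if not current:
                raise ValueError("empty block")
            blocks.append(current)
            current = None
        else:
            if current is None:
                raise ValueError("command outside a block: " + line)
            command, _, argument = line.partition(" ")
            if command not in KNOWN_COMMANDS:
                raise ValueError("unknown command: " + command)
            current.append((command, argument.strip()))
    if current is not None:
        raise ValueError("missing task_end")
    if not blocks:
        raise ValueError("no code block found")
    return blocks

script = """
task_begin
head https://example.com/
task_end
"""
print(parse_blocks(script))  # [[('head', 'https://example.com/')]]
```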

resource

Specifies (resource 1) if multimedia elements such as images, sound, etc. will be downloaded in a code block.

If this command does not appear in a code block, resource 0 is implicitly used.

cookie

Specifies (cookie 1) whether the cookies obtained in a code block will be stored for subsequent use in other code blocks.

If this command does not appear in a code block, cookie 0 is implicitly used.

post

Command that, using variable_name and variable_value, specifies the web check URL. If it is used more than once in a code block, the URL to be examined will be the one specified in its last appearance.

get

Command that specifies the web check URL. If it is used more than once in a code block, all URLs will be examined; however, the URL specified in its last appearance will be the one providing the data for the check.

head

Special command that returns the HTTP status code line of the response in a web check.

The response format is HTTP/V XXX where V is the version and XXX the code. Some responses that can be obtained: HTTP/2 403, HTTP/1.1 200, HTTP/1 302, etc.

Generally used as the only command in a single code block, see “Server status code check” for more information.
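
The HTTP/V XXX format is easy to take apart when the values need to be compared outside PFMS. The Python sketch below is illustrative only; the name parse_status and the regular expression are assumptions of this example.

```python
import re

# Version is one or more digits with an optional ".digits" part,
# followed by a space and a three-digit status code.
STATUS_LINE = re.compile(r"^HTTP/(\d+(?:\.\d+)?) (\d{3})")

def parse_status(line):
    """Return (version, code) from a status line such as 'HTTP/1.1 200'."""
    match = STATUS_LINE.match(line)
    if match is None:
        raise ValueError("not an HTTP status line: " + line)
    return match.group(1), int(match.group(2))

for line in ("HTTP/2 403", "HTTP/1.1 200", "HTTP/1 302"):
    print(parse_status(line))
```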

check_string

Checks that a specific text string exists in the web check being performed. The returned result is a boolean variable (0 false, any other value is true) and must be stored in a server response check module type (web_proc type).

The arguments taken by the check_string syntax are not normal text strings, they are regular expressions. Any character that is not a letter or a number will have to be escaped with \:

task_begin
get https://apache.org/
# Comment: Search "/images/oakleaf.svg" (including quotation marks)
check_string \"\/images\/oakleaf\.svg\"
debug /tmp/apache
task_end

Note that the period would also match without being escaped, since an unescaped period matches any character; in the previous code it is escaped to emphasize regular expression syntax.
A debug instruction is included in case queries and responses need to be analyzed. Once this work is done, and to save storage space, it is recommended to remove it from routine monitoring.

The following “clean” code is equally valid even though the get command now appears after check_string (and even if the Check button indicates an error) within the same task_begin … task_end code block:

task_begin
check_string \"\/images\/oakleaf.svg\"
get https://apache.org/
task_end

See also “Form check on a web page”.
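
Escaping a literal string by hand is error-prone, so it can help to generate the pattern. The Python sketch below applies the rule stated above (a backslash before every character that is not a letter or a number) and then tests the result against a hand-written HTML fragment. The function name is hypothetical, and Python's re module is used only for illustration; the regular expression engine used by the server may differ in details.

```python
import re

def escape_for_check_string(text):
    """Backslash-escape every character that is not an ASCII letter or digit."""
    return re.sub(r"[^A-Za-z0-9]", lambda m: "\\" + m.group(0), text)

literal = '"/images/oakleaf.svg"'
pattern = escape_for_check_string(literal)
print(pattern)

# Hypothetical HTML fragments: the escaped pattern matches the literal
# text, and the escaped period no longer matches an arbitrary character.
html = '<img src="/images/oakleaf.svg" alt="Apache">'
print(re.search(pattern, html) is not None)
print(re.search(pattern, '<img src="/images/oakleafXsvg">') is not None)
```

The first line printed is the same pattern used in the check_string example of this section.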

check_not_string

Checks that the web check being performed lacks a specific text string. In the rest of its operation it is the same as check_string.

variable_name

Declares a variable whose value will be set by the next variable_value instruction. For greater code clarity, it is recommended to use them in ordered pairs of contiguous lines.

The debug button omits syntax verification of this command: if a value is omitted and/or it is not paired with a variable_value, the syntax will still be shown as correct.

This pair (or pairs) of instructions must be accompanied by a post command. In the web query, the order of appearance of the variables will be the last one declared first, the second to last declared second, and so on.

variable_value

Sets the value of a variable declared with variable_name.

For greater code clarity (and to facilitate debugging tasks, if applicable) it is recommended to use them in ordered pairs of contiguous lines, first variable_name and then variable_value.

The debug button omits syntax verification of this command: if a value is omitted and/or it is not paired with a variable_name, the syntax will still be shown as correct.

While these instructions can be placed in any order, this can cause unpredictable results if odd or orphan commands are accidentally established.

get_content

Unlike the check_string and check_not_string commands, which only return a true or false result, the get_content command is used when a numeric value or a specific text string needs to be obtained.

Similarly, these three commands use regular expressions to perform their work with the fundamental difference that get_content delivers the full result of its search.

To obtain a numeric value from an API-type query, get_content \d+ can be used in a task_begin … task_end code block:

task_begin
get http://127.0.0.1/pandora_console/include/api.php?apipass=1234&user=admin&pass=pandora&op=get&op2=total_modules&id=0
get_content \d+
debug /tmp/num_of_modules
task_end
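
The effect of a pattern such as \d+ is that the first run of digits found in the response is returned. The Python sketch below shows this on hand-written response bodies; it models the matching only, the helper name get_content simply mirrors the command, and the sample strings are hypothetical.

```python
import re

def get_content(pattern, body):
    """Return the text of the first match of pattern in body, or None."""
    match = re.search(pattern, body)
    return match.group(0) if match else None

print(get_content(r"\d+", "1274"))                   # 1274
print(get_content(r"\d+", "Error 404: 12 modules"))  # 404, not 12
print(get_content(r"\d+", "no digits here"))         # None
```

The second case is worth keeping in mind: if the response can contain other numbers before the one of interest, a more specific expression is needed.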

get_content_advanced

Allows extracting a text string in an HTML element. “Numeric” values can also be extracted, depending on the regular expression used. Syntax:

task_begin
get <URL>
get_content_advanced <html_code>(regex)</html_code>
task_end

It must be used in conjunction with get to set the URL to check. Once its HTML content is obtained, the entire page will be searched until an HTML element match is found and its content will be returned. See “Obtaining text data from a web page”.

header

Allows adding HTTP headers to the web query.

By default, Curl is used to perform web checks and two elements are always included:

  1. -H 'Pragma: no-cache': Used to skip CDN cache. Pragma is compatible with HTTP 1.0 (the most used HTTP versions currently are 1.1 and 2.0).
  2. -H 'Accept: text/html': This indicates that the HTML code is requested as the first response.

With header included in a task_begin … task_end block, additional name and value pairs can be specified. In a PFMS API 1.0 query, this is very useful for authenticating with a user's private Bearer token and then checking an expected response value with check_string:

task_begin
header Authorization Bearer 05c43863bf76c54456837ea7c3008e56
get http://127.0.0.1/pandora_console/include/api.php?op=get&op2=test
debug /tmp/api
check_string 786
task_end

It could also be useful for explicitly requesting a response in a specific language:
header Accept-Language en-US.

With header, it is unnecessary to use double or single quotes, or colons, since when executing the check these characters are inserted in the appropriate way:

  • The language header Accept-Language es-ES is queried with Curl as follows:
    -H 'Accept-Language:es-ES'
  • The referrer (indicates the web page visited previously) header Referer pandorafms.com:
    -H 'Referer:pandorafms.com'
  • If the queried web server requires custom headers, these usually start with X-, followed by the header name (which may contain more hyphens) and then its value.
    For example, the customer identifier header X-Customer-ID “Ñ123” is queried as:
    -H 'X-Customer-ID:"Ñ123"' (note the use of double quotes to wrap Ñ123).
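
The mapping from a header line to a curl argument shown in these examples can be sketched in Python as follows. This is an illustration based only on the examples above: the function name header_to_curl is hypothetical, and it assumes the header name ends at the first space and everything after it is the value.

```python
def header_to_curl(line):
    """Turn a 'header <name> <value>' line into a curl -H argument."""
    # Split into at most three parts so the value may contain spaces.
    keyword, name, value = line.split(" ", 2)
    if keyword != "header":
        raise ValueError("not a header line: " + line)
    return f"-H '{name}:{value}'"

print(header_to_curl("header Accept-Language es-ES"))
print(header_to_curl("header Referer pandorafms.com"))
print(header_to_curl("header Authorization Bearer 05c43863bf76c54456837ea7c3008e56"))
```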


debug

Web checks can be debugged by adding the debug <path>/<file_name> option.
Two files, <path>/<file_name>.req and <path>/<file_name>.res, will be created with the contents of the HTTP request and response, respectively:

task_begin
get http://192.168.1.5/index.php
debug /tmp/debug_file
task_end

When forcing checks to speed up debugging work, you must always wait for the response from the server or device being monitored. The content of these files can be followed in real time from the command line, for example:

tail -f <path>/<file_name>.res --retry

Once created with the given name, these debugging files have every subsequent query and response appended to them, so their size keeps growing. Once debugging is finished, it is recommended to remove all debug instructions.

task_end

Marks the end of a block of code. See task_begin.

Web monitoring examples

Checking a web page load time

To check the response time in seconds (latency) of a web page, select the Remote HTTP module to check latency module type.

The download time of a page's HTML code is different from the time it takes to display that page in a browser. The latter usually depends on the loading time of multimedia elements and on the loading and execution of the embedded JavaScript code.

A web test may include one or several intermediate requests that complete the transaction. The module obtains the total time elapsed from the first request until the last one is checked.

task_begin
get https://pandorafms.com
task_end
  • If the check definition states that the transaction is performed more than once (get command), the average time of each request will be used.
  • Thresholds for warning and critical states must be set by the user.
  • For the download time to include all resources (JavaScript, CSS, images, etc.), resource 1 must be added in a line before task_end.
  • Web checks also support the use of proxy servers; for this, the fields named Proxy that are necessary must be completed.
  • The debug parameter is used to keep an accumulated record in two files of both the query and the response of each check.

Checking a web page response

With a Remote HTTP module to check server response type module, a 1 (OK) or a 0 (CRITICAL) is obtained as a result of checking the entire transaction.

The module type (web_proc) used here ignores any values set in the thresholds: it only accepts false (0) and true values, which are automatically associated with the critical (0) and normal states, respectively.

If there are several attempts but at least one of them fails, the test as a whole is also considered to have failed. Specifically, the number of attempts is sometimes used to avoid false positives; for this, use the Requests field in advanced options.

In its basic syntax, it is enough to add the following in the Web Checks box:

task_begin
get <URL>
task_end

Unlike the HTTP status check, the previous check brings all the HTML code, which can be used for additional purposes such as checking that a text string exists on said website with check_string:

task_begin
get <URL>
check_string <text>
task_end
  • If the requested text string exists, it will go into normal status, otherwise it will go into critical.
  • check_not_string <text> can be used for the opposite case: If the indicated text is not found, it will go into normal status, otherwise it will go into critical.
  • Although other parameters such as resource can be used to download all resources (images, etc.), this will only overload the PFMS Server with unnecessary work.
    The addition of other parameters must be weighed carefully; only those really necessary should be used, especially if it is a simple query.
  • The same applies to downloading and saving cookies; unless other web check steps are going to be performed, you can explicitly declare to skip them:
task_begin
get https://apache.org/
# Do not save cookies
cookie 0
# Do not download images, etc
resource 0
check_string Jimmy
debug /tmp/apache
task_end

Note the use of # at the beginning of a line to add comments.

Obtaining numeric data from a web page

To check that a web page responds with a numeric value, or contains a specific value, the Remote HTTP module to retrieve numeric data module type is used.

For this, the basic query get_content \d+ can be used (the added regular expression filters only digits). The mechanism is to analyze the entire HTTP response from the beginning and start comparing until it matches the given regular expression.

  • Thresholds for warning and critical states must be set by the user. By default, these thresholds are zero, so any valid numeric check obtained will be classified as normal status.

  • Proxy servers can be used by completing the necessary Proxy fields.
  • Adding the debug instruction keeps the queries and responses in two files to be debugged later.
  • To force a direct query with Curl to the web server (thus skipping CDN cache), PFMS includes the command header Pragma: no-cache by default.
    Pragma was chosen instead of Cache-Control to maintain compatibility with HTTP 1.0 (the most used versions currently are 1.1 and 2.0).
  • For an advanced query and to extract a numeric value from any HTML element on the page, such as a level 3 heading or the first paragraph that appears with a specific CSS Attribute, see the get_content_advanced parameter.

If no figure is obtained from the last check, or the web server is unable to respond or is unreachable, the last value and therefore the last recorded state of the module will be kept.

Obtaining text data from a web page

The Remote HTTP module to retrieve string data module type works similarly to its numeric counterpart without the inconvenient filtering process to guarantee only figures.

task_begin
get https://academy.pandorafms.com/
get_content_advanced <h3 class="h2-b">(.*)</h3>
task_end

Here get_content_advanced is the key parameter: it allows choosing an HTML element (in this case a level three heading h3 with class h2-b) and then, according to a regular expression (.*), extracting the character string located there.

Since values cannot be compared for the normal condition, the warning threshold or the critical threshold is used, with Inverse interval set and the expected text:

Terms &amp; Important Notes (note the use of entities in the received HTML code).
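
The remark about entities can be reproduced offline. In the Python sketch below, the HTML fragment is hypothetical and written to match the expected text above; the capture group returns the string exactly as the server sent it, so the threshold must contain &amp; rather than a plain ampersand.

```python
import html
import re

# Hypothetical fragment of the received HTML.
page = '<h3 class="h2-b">Terms &amp; Important Notes</h3>'

match = re.search(r'<h3 class="h2-b">(.*)</h3>', page)
extracted = match.group(1)

print(extracted)                 # Terms &amp; Important Notes
print(html.unescape(extracted))  # Terms & Important Notes
```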

Server status code check

To check the status code of a web page, select the Remote HTTP module to check server status code module type (web_server_status_code_string type) with the following task_begin … task_end code block:

task_begin
head https://pandorafms.com
task_end
  • In the server configuration, Curl must be used.
  • It is important to use the head parameter to obtain the status code.
  • To configure the critical threshold, place HTTP/X 200 OK (where X is the protocol version) and mark the inverse interval.

This way, upon receiving any different response, the module will change to critical status.

Advanced transactional monitoring

In addition to the functionality offered by PFMS Web server, there is another way to perform transactional monitoring: Web User Experience Monitoring (WUX).
