AWS RDS

This document describes the AWS RDS functionality of Pandora FMS Discovery.

Introduction

The purpose of this plugin is to monitor RDS instances and AWS regions using key CPU, network, IOPS and disk metrics. These metrics are essential to control and monitor the instances, guarantee optimal performance, solve problems, plan scaling, comply with SLAs and improve security.

The plugin connects to the AWS API and monitors zones and instances using these metrics. For each zone and instance it generates an agent as an XML file, which is sent to the Pandora FMS server.
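To make the "one agent per zone or instance, delivered as XML" idea concrete, here is a minimal sketch of building such a document. It is not the plugin's own code. The function name build_agent_xml, the sample module list and the choice of attributes are assumptions for illustration, and the layout is a simplified form of the Pandora FMS agent XML (an agent_data root with module children).

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def build_agent_xml(agent_name, group, interval, modules, timestamp=None):
    """Return a simplified Pandora FMS style agent XML document as a string.

    `modules` is a list of (name, module_type, value) tuples.
    This helper is hypothetical; the real plugin may emit more fields.
    """
    timestamp = timestamp or datetime.now().strftime("%Y/%m/%d %H:%M:%S")
    root = ET.Element(
        "agent_data",
        {
            "agent_name": agent_name,
            "group": group,
            "interval": str(interval),
            "timestamp": timestamp,
        },
    )
    for name, module_type, value in modules:
        module = ET.SubElement(root, "module")
        ET.SubElement(module, "name").text = name
        ET.SubElement(module, "type").text = module_type
        ET.SubElement(module, "data").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Sample values only: one agent for a hypothetical instance "database-1".
xml_doc = build_agent_xml(
    "database-1",
    "aws",
    300,
    [
        ("CPUUtilization", "generic_data", 12.5),
        ("Instance State (bool)", "generic_proc", 1),
    ],
    timestamp="2023/06/28 10:00:00",
)
print(xml_doc)
```

One such document would be produced per zone and per instance, then handed to the configured transfer method.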

Compatibility matrix

Systems where it has been tested: Rocky Linux, Fedora 34
Systems where it works: Any Linux system

Prerequisites

 

Permission Assignment

Create a policy in JSON like the following:

{
   "Version": "2012-10-17",
   "Statement": [
       {
           "Sid": "VisualEditor0",
           "Effect": "Allow",
           "Action": [
               "rds:DescribeDBInstances",
               "cloudwatch:Get*"
           ],
           "Resource": "*"
       }
   ]
}

The previous policy must be assigned to a new user.
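Since a malformed policy document is rejected by AWS, it can help to generate and check the JSON programmatically before pasting it into the console. The following is a minimal sketch using only the Python standard library. It reproduces the statement shown in this section and does not call any AWS API.

```python
import json

# Same statement as the policy in this section: read-only access to
# RDS instance descriptions and CloudWatch metrics.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": ["rds:DescribeDBInstances", "cloudwatch:Get*"],
            "Resource": "*",
        }
    ],
}

policy_json = json.dumps(policy, indent=4)

# json.loads raises ValueError on malformed JSON (for example, a trailing
# comma), so a successful round trip confirms the text is well formed.
round_trip = json.loads(policy_json)
print(policy_json)
```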

image.png

Parameters and configuration

Parameters

--conf Path to configuration file

Configuration file (--conf)

agents_group_name = < Name of the target group for the created agents >
interval = < Interval in seconds for agents and for metric analysis >
threads = < Number of execution threads; zones and instances are distributed evenly across the threads >
transfer_mode = < Transfer mode, tentacle or local >
tentacle_ip = < IP of the target machine for the created agents >
tentacle_port = < Tentacle port, default: 41121 >
tentacle_opts = < Tentacle client additional options >
data_dir = < Only used when transfer_mode is local. Destination path for the XML of each agent, by default "/var/spool/pandora/data_in/" >

advance_monitoring = < Set to 1 to enable advanced monitoring (these modules are only created in the agents of running instances) >
cpu_summary = < Set to 1 to enable CPU monitoring >
iops_summary = < Set to 1 to enable IOPS monitoring >
disk_summary = < Set to 1 to enable disk monitoring >
network_summary = < Set to 1 to enable network monitoring >

stats_agent = < Set to 1 to enable a global agent that reports on the task created and the parameters used >
stats_agent_name = < Name for the agent enabled with the "stats_agent" parameter. If it is not set and "stats_agent" is enabled, the agent is called "Aws.rds" by default >

aws_regions = < List of regions to monitor (when a region is listed, all instances found within that region are monitored automatically) >
aws_instances = < List of IDs of the RDS instances to monitor >

creds_b64 = < Credentials used to authenticate, as a Base64-encoded JSON document >
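The creds_b64 value is a JSON document encoded in Base64. The sketch below shows one way such a value could be produced and checked. The key names "access_key_id" and "secret_access_key" are hypothetical placeholders, not confirmed by this document, so use the key names the plugin actually expects.

```python
import base64
import json


def encode_credentials(credentials):
    """Serialize a dict to JSON and return it as a Base64 string."""
    raw = json.dumps(credentials).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_credentials(creds_b64):
    """Reverse of encode_credentials; useful to verify a value by hand."""
    return json.loads(base64.b64decode(creds_b64).decode("utf-8"))


# Hypothetical key names and dummy values, for illustration only.
example = {
    "access_key_id": "AKIAEXAMPLE",
    "secret_access_key": "not-a-real-secret",
}
encoded = encode_credentials(example)
print(encoded)
```

Decoding the value again with decode_credentials is a quick way to confirm that nothing was truncated when it was copied into the configuration file.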

Example

agents_group_name  = "aws"
interval           = 300
threads            = 4
transfer_mode      = tentacle 
tentacle_ip        = 172.42.42.101
tentacle_port      = 41121
data_dir           = /var/spool/pandora/data_in/

advance_monitoring = 1
cpu_summary        = 1
iops_summary       = 1
disk_summary       = 1
network_summary    = 1

stats_agent        = 1
stats_agent_name   = "Rds"

aws_instances = ["database-1","database-2"]
aws_regions   = ["us-east-1","ap-northeast-1","ap-southeast-1"]

creds_b64          =  ewdhBDJDdvb2tleV9pZGdhjDNDHDhbdjdKKDNDbdBiwKInNlY3JldFSHSHHDGJCJChfDHCNCNHCdjdghDMDBGBkxlSLiIKfQ==
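Before launching the plugin, a configuration file like the one above can be given a quick sanity check. The following is a minimal sketch, assuming the file consists of simple "key = value" lines. It is not the plugin's own parser; the helper names parse_conf and check_conf are hypothetical, and skipping lines that start with "#" is an assumption.

```python
def parse_conf(text):
    """Parse `key = value` lines into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or (assumed) comment
        # Split on the first "=" only, so Base64 padding in values survives.
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key = value line
        settings[key.strip()] = value.strip().strip('"')
    return settings


def check_conf(settings):
    """Return a list of problems found; an empty list means no issue detected."""
    problems = []
    if settings.get("transfer_mode") not in ("tentacle", "local"):
        problems.append("transfer_mode must be tentacle or local")
    for key in ("interval", "threads"):
        if not settings.get(key, "").isdigit():
            problems.append(f"{key} must be a non-negative integer")
    return problems


sample = """
agents_group_name  = "aws"
interval           = 300
threads            = 4
transfer_mode      = tentacle
creds_b64          = ZHVtbXk=
"""
settings = parse_conf(sample)
print(check_conf(settings))
```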

Manual execution

The plugin execution format is as follows:

./pandora_aws_rds --conf < path to configuration file >

For example:

./pandora_aws_rds --conf /usr/share/pandora_server/util/plugin/aws_rds.conf

The execution will return an output in JSON format with information about the execution, and will generate an XML file for each monitored agent that will be sent to the Pandora FMS server by the transfer method indicated in the configuration.

For example:

{"summary": {"Total agents": 5, "Zones agents": 6, "Instances agents": 18}}
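Because the output is JSON, it is easy to consume from a wrapper script. The sketch below parses the sample output shown above. The function name parse_summary is hypothetical, and the text is supplied as a literal here rather than captured from a real run.

```python
import json


def parse_summary(stdout_text):
    """Return the "summary" object from the plugin's JSON output."""
    data = json.loads(stdout_text)
    return data["summary"]


# Sample output copied from this section. In practice the text could come
# from subprocess.run([...], capture_output=True, text=True).stdout.
output = '{"summary": {"Total agents": 5, "Zones agents": 6, "Instances agents": 18}}'
summary = parse_summary(output)
for name, value in summary.items():
    print(f"{name}: {value}")
```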

 

Discovery

This plugin can be integrated with Pandora FMS Discovery.

To do this, you must load the ".disco" package that you can download from the Pandora FMS library:

https://pandorafms.com/library/

image-1687944725901.png

Once loaded, Amazon RDS environments can be monitored by creating Discovery tasks from the Management > Discovery > Cloud section.

image-1687944806299.png

For each task, the following minimum data will be requested:

image-1687787846261.png

If the credentials provided are correct and the Pandora FMS server is able to connect to the AWS API, you will be able to see a tree with AWS RDS zones and instances, which can be marked for monitoring.

If a zone is selected, in addition to the zone itself, all the instances it contains will be monitored (both at the time of configuring the task and later if new instances are included).

If specific instances are selected, they will be monitored even if their zones have not been selected.
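The selection rules above can be sketched as a small set computation. This is an illustration of the described behaviour, not the plugin's implementation. The function name resolve_monitored_instances and the inventory structure are assumptions.

```python
def resolve_monitored_instances(inventory, selected_zones, selected_instances):
    """Return the set of instance IDs to monitor.

    `inventory` maps each zone name to the instance IDs it currently contains.
    """
    # Explicitly selected instances are always monitored.
    monitored = set(selected_instances)
    # A selected zone contributes every instance it contains right now.
    for zone in selected_zones:
        monitored.update(inventory.get(zone, []))
    return monitored


# Sample inventory with made-up instance IDs.
inventory = {
    "us-east-1": ["database-1", "database-2"],
    "ap-northeast-1": ["database-3"],
}
first_run = resolve_monitored_instances(inventory, ["us-east-1"], ["database-3"])

# A new instance appears later in a selected zone and is picked up
# the next time the selection is resolved.
inventory["us-east-1"].append("database-4")
second_run = resolve_monitored_instances(inventory, ["us-east-1"], ["database-3"])
print(sorted(first_run), sorted(second_run))
```

Resolving the selection on every execution, rather than once when the task is configured, is what lets new instances in a selected zone be monitored automatically.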

image-1687787872807.png

Next you can adjust the monitoring you want to obtain for each agent:

image-1687787892172.png

Tasks that are successfully completed will have an execution summary with the following information:

image-1687787919142.png

The tasks that are not completed successfully will have an execution summary recording the errors produced.

Agents and modules generated by the plugin

Running the plugin will create the following agents and modules:

< Name set with the "stats_agent_name" parameter or, failing that, "Aws.rds" >

Modules

AWS RDS Instances count: Total instances registered in AWS

<Region name>

Modules

summary.aws.rds.DBconnections: Summary of the number of connections for each instance in this zone
summary.aws.rds.CPUUtilization: Average CPU percentage used for instances in this zone
summary.aws.rds.CPUCreditBalance: Summary of the number of CPU credits earned from each instance in this zone
summary.aws.rds.CPUCreditUsage: Summary of the number of CPU credits spent for each instance in this zone
summary.aws.rds.CPUSurplusCreditBalance: Summary of the number of surplus CPU credits available for each instance in this zone
summary.aws.rds.CPUSurplusCreditsCharged: Summary of the number of CPU surplus credits used for each instance in this zone
summary.aws.rds.DiskReadBytes: Summary of the number of bytes read from disk for each instance of this zone
summary.aws.rds.DiskReadOps: Summary of the number of read operations performed on the disk of each instance of this zone
summary.aws.rds.diskWriteBytes: Summary of the number of bytes written to disk for each instance of this zone
summary.aws.rds.DiskWriteOps: Summary of the number of write operations performed on the disk for each instance in this zone
summary.aws.rds.BinLogDiskUsage: Summary of the amount of disk space occupied by the binary logs of each instance in this zone
summary.aws.rds.LVMReadIOPS: Summary of the number of read operations performed per second on an LVM-based storage system for each instance in this zone
summary.aws.rds.LVMWriteIOPS: Summary of the number of writes performed per second on an LVM-based storage system for each instance in this zone
summary.aws.rds.instances: Number of instances monitored in this zone
summary.aws.rds.NetworkReceiveThroughput: Summary of incoming network traffic for each instance in this zone
summary.aws.rds.NetworkTransmitThroughput: Summary of outbound network traffic for each instance in this zone

<instance ID>

 

Modules

State: Machine status, in string format
Instance State (bool): State of the machine, 1 if it is running, 0 otherwise
DatabaseConnections: The number of client network connections to the database instance
CPUUtilization: Percentage of CPU utilization used
CPUCreditBalance: The number of earned CPU credits that an instance has accumulated since it was launched or started
CPUCreditUsage: The number of CPU credits spent by the instance for CPU utilization
CPUSurplusCreditBalance: The number of surplus CPU credits available for an Amazon RDS instance
CPUSurplusCreditsCharged: The number of surplus CPU credits used by an Amazon RDS instance
DiskReadBytes: Number of bytes read from disk
DiskReadOps: The number of read operations performed on the disk
DiskWriteBytes: Number of bytes written to disk
DiskWriteOps: Number of write operations performed on the disk
BinLogDiskUsage: The amount of disk space occupied by binary logs
LVMReadIOPS: The number of read operations performed per second on an LVM-based storage system
LVMWriteIOPS: The number of write operations performed per second on an LVM-based storage system
NetworkReceiveThroughput: The incoming (receiving) network traffic on the database instance
NetworkTransmitThroughput: Outbound (transfer) network traffic on the database instance