HA Environments with Percona XtraDB Cluster


Introduction

Pandora FMS relies on a MySQL database to store its configuration and data. To scale ("grow") horizontally, the Pandora FMS Metaconsole can monitor all the necessary devices. However, some scenarios are too small to justify a Metaconsole, given the management overhead it entails, yet too large to store all the data in a single database.

To solve this problem, Percona XtraDB Cluster can be used in multi-master mode (several main servers) for both reading and writing. The load is distributed among the servers hosting the cluster, giving the database more processing capacity than a standalone (single) node can offer, and therefore allowing a single Pandora FMS node to handle a larger number of monitored metrics.

This scheme makes it possible either to install a Percona XtraDB Cluster replication node on the same server as the Pandora FMS server and Web Console, or to install them separately. It is enough to point each Pandora FMS server or Web Console to the node where it needs to write or read, distributing the load as is most convenient in each environment.

The cluster needs a minimum of 3 nodes to work, and at least 2 of them must remain active for the database to stay operational.
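The 3-node minimum follows from Galera's majority-based quorum: a partition must hold more than half of the nodes to keep serving queries. A quick sketch of the arithmetic (the node counts are illustrative):

```shell
# Quorum (strict majority) for an n-node cluster: floor(n/2) + 1.
# With 3 nodes, 2 must stay up; a 2-node cluster cannot lose any
# node without risking split-brain, which is why 3 is the minimum.
for nodes in 2 3 5; do
  echo "$nodes nodes -> quorum $(( nodes / 2 + 1 ))"
done
```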

Percona XtraDB Cluster Installation

In order to perform a successful Percona XtraDB Cluster installation, the Percona XtraDB Cluster server and client must first be installed on all nodes.

The data of the nodes to be used in the example will be the following:

NODE 1:

pandora001
192.168.80.111

NODE 2:

pandora002
192.168.80.112

NODE 3:

pandora003
192.168.80.113

Execution on all nodes

# yum install Percona-XtraDB-Cluster-server-57 Percona-XtraDB-Cluster-client-57

Once installed on all nodes, start the service on each of them and configure a common password for the root user, as well as a user (in this case sstuser) that will be used for replication between all nodes. Run:

# systemctl start mysqld

Obtain the temporary root password and log into the MySQL server with it (remember this must be done on all nodes):

# mysql -uroot -p$(grep "temporary password" /var/log/mysqld.log | rev | cut -d' ' -f1 | rev)
mysql> SET PASSWORD FOR 'root'@'localhost' = PASSWORD('Pandor4!');
mysql> UNINSTALL PLUGIN validate_password;
mysql> SET PASSWORD FOR 'root'@'localhost' = PASSWORD('pandora');
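The rev | cut | rev pipeline above simply extracts the last space-separated field of the log line, which is where mysqld prints the generated password. A standalone sketch against a sample log line (the line and password are made up for illustration):

```shell
# Sample mysqld.log line (illustrative; the real password differs).
line='2024-01-01T00:00:00.0Z 1 [Note] A temporary password is generated for root@localhost: Xy/2abCdEf!'
# Reverse the line, take the first space-separated field, reverse it
# back: the net effect is "take the last field".
echo "$line" | rev | cut -d' ' -f1 | rev
```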

Now proceed to create the user sstuser:

mysql> create user sstuser@'%' identified by 'pandora';
mysql> grant all on *.* to sstuser@'%';
mysql> flush privileges;
mysql> quit

Stop the mysql service on all nodes after completing this configuration:

# systemctl stop mysqld

Once all the servers are stopped, start one of them in bootstrap mode, so that it acts as the main server and the other two nodes replicate from it.

To do this, you have to add a specific configuration in the node you choose as main, in this case the pandora001 node.

Modify the /etc/my.cnf file of the pandora001 node, adding the parameters shown between the two # Modified for PFMS marker lines:

[mysqld]
# Modified for PFMS\
server-id=1
# Modified for PFMS/

datadir=/var/lib/mysql
character-set-server=utf8
skip-character-set-client-handshake

# Add configuration for Percona XtraDB Cluster.
# Cache size depends on the server resources. Recommended 1G - 10 GB.
# Modified for PFMS\
wsrep_provider_options="gcache.size=10G; gcache.page_size=10G"
wsrep_provider=/usr/lib64/galera3/libgalera_smm.so

wsrep_cluster_name=pandora-cluster
wsrep_cluster_address=gcomm://
wsrep_node_name=pandora001
wsrep_node_address=192.168.80.111

wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:pandora
# Modified for PFMS/

pxc_strict_mode=PERMISSIVE

# Mysql optimizations for Pandora FMS node with 4G RAM
max_allowed_packet = 64M
innodb_buffer_pool_size = 1G
innodb_lock_wait_timeout = 90
innodb_file_per_table
innodb_flush_log_at_trx_commit = 0
innodb_flush_method = O_DIRECT
innodb_log_file_size = 64M
innodb_log_buffer_size = 16M
innodb_io_capacity = 100
innodb_autoinc_lock_mode=2
thread_cache_size = 8192
thread_stack =256K
max_connections = 8192
wait_timeout = 900
key_buffer_size=4M
read_buffer_size=128K
read_rnd_buffer_size=128K
sort_buffer_size=128K
join_buffer_size=4M
query_cache_type = 0
query_cache_size = 0
sql_mode=""
binlog_format=ROW
expire_logs_days=1

Start the mysql service on node pandora001 with this configuration, in bootstrap mode:

# systemctl start mysql@bootstrap

In the pandora002 and pandora003 nodes, modify the /etc/my.cnf file, indicating the following parameters between the two lines # Modified for PFMS:

  • Node pandora002:
[mysqld]
server-id=2
datadir=/var/lib/mysql
character-set-server=utf8
skip-character-set-client-handshake
# add configuration for Percona XtraDB Cluster.
# Cache size depends on the server resources. Recommended 1G - 10 GB.
# Modified for PFMS\
wsrep_provider_options="gcache.size=10G; gcache.page_size=10G"
wsrep_provider=/usr/lib64/galera3/libgalera_smm.so

wsrep_cluster_name=pandora-cluster
wsrep_cluster_address=gcomm://pandora001,pandora003

wsrep_node_name=pandora002
wsrep_node_address=192.168.80.112

wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:pandora

pxc_strict_mode=PERMISSIVE
# Modified for PFMS/

# Mysql optimizations for Pandora FMS
max_allowed_packet = 64M
innodb_buffer_pool_size = 1G
innodb_lock_wait_timeout = 90
innodb_file_per_table
innodb_flush_log_at_trx_commit = 0
innodb_flush_method = O_DIRECT
innodb_log_file_size = 64M
innodb_log_buffer_size = 16M
innodb_io_capacity = 100
innodb_autoinc_lock_mode=2
thread_cache_size = 8192
thread_stack = 256K
max_connections = 8192
wait_timeout = 900
key_buffer_size=4M
read_buffer_size=128K
read_rnd_buffer_size=128K
sort_buffer_size=128K
join_buffer_size=4M
query_cache_type = 0
query_cache_size = 0
sql_mode=""
binlog_format=ROW
expire_logs_days=1
  • Node pandora003:
[mysqld]
server-id=3
datadir=/var/lib/mysql
character-set-server=utf8
skip-character-set-client-handshake
# add configuration for Percona XtraDB Cluster.
# Cache size depends on the server resources. Recommended 1G - 10 GB.
# Modified for PFMS\
wsrep_provider_options="gcache.size=10G; gcache.page_size=10G"
wsrep_provider=/usr/lib64/galera3/libgalera_smm.so

wsrep_cluster_name=pandora-cluster
wsrep_cluster_address=gcomm://pandora001,pandora002

wsrep_node_name=pandora003
wsrep_node_address=192.168.80.113

wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:pandora

pxc_strict_mode=PERMISSIVE
# Modified for PFMS/

# Mysql optimizations for Pandora FMS
max_allowed_packet = 64M
innodb_buffer_pool_size = 1G
innodb_lock_wait_timeout = 90
innodb_file_per_table
innodb_flush_log_at_trx_commit = 0
innodb_flush_method = O_DIRECT
innodb_log_file_size = 64M
innodb_log_buffer_size = 16M
innodb_io_capacity = 100
innodb_autoinc_lock_mode=2
thread_cache_size = 8192
thread_stack = 256K
max_connections = 8192
wait_timeout = 900
key_buffer_size=4M
read_buffer_size=128K
read_rnd_buffer_size=128K
sort_buffer_size=128K
join_buffer_size=4M
query_cache_type = 0
query_cache_size = 0
sql_mode=""
binlog_format=ROW
expire_logs_days=1

Start the mysqld service on the nodes pandora002 and pandora003 to start synchronization:

# systemctl start mysqld

At this point you can install the Pandora FMS database on node 1 and verify that it replicates correctly to the other two servers.
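One way to check the cluster health on any node is to query the Galera status variables, e.g. mysql -uroot -p -e "SHOW GLOBAL STATUS LIKE 'wsrep_%'". The sketch below filters the two key values out of a sample of that output; a healthy 3-node cluster reports size 3 and status Primary. The sample text is an assumption for illustration, not output captured from a real cluster:

```shell
# Illustrative output of: SHOW GLOBAL STATUS LIKE 'wsrep_%'
# (assumed sample; fields are tab-separated as mysql prints them).
sample=$(printf 'wsrep_local_state_comment\tSynced\nwsrep_cluster_size\t3\nwsrep_cluster_status\tPrimary\n')
# Keep only the cluster size and the component status.
printf '%s\n' "$sample" | grep -E '^wsrep_cluster_(size|status)'
```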

Once the correct operation of the replication has been verified, stop the node that you have started in bootstrap mode (in this case the pandora001 node):

(NODE1) # systemctl stop mysql@bootstrap

Change the wsrep_cluster_address parameter in the my.cnf file of NODE 1 to add the other 2 cluster nodes to it; it should look as follows:

wsrep_cluster_address=gcomm://pandora002,pandora003

Start the mysqld service again on the pandora001 node, but this time in normal mode, so that it synchronizes with the other 2 already active nodes.

# systemctl start mysqld

Changes to the Pandora FMS schema for correct operation

To avoid problems with deadlocks (record locks) caused by the constant parallel reads and writes, the following Pandora FMS tables must be modified as follows:

alter table tagent_access add column `id_tagent_access` bigint(20) unsigned NOT NULL;
alter table tagent_access add primary key (id_tagent_access);
alter table tagent_access modify id_tagent_access bigint(20) auto_increment;
alter table tagente_datos add column `id_tagente_datos` bigint(20) unsigned NOT NULL;
alter table tagente_datos add primary key (id_tagente_datos);
alter table tagente_datos modify id_tagente_datos bigint(20) auto_increment;
alter table tagente_datos_inc add column `id_tagente_datos_inc` bigint(20) unsigned NOT NULL;
alter table tagente_datos_inc add primary key (id_tagente_datos_inc);
alter table tagente_datos_inc modify id_tagente_datos_inc bigint(20) auto_increment;
alter table tagente_datos_string add column `id_tagente_datos_string` bigint(20) unsigned NOT NULL;
alter table tagente_datos_string add primary key (id_tagente_datos_string);
alter table tagente_datos_string modify id_tagente_datos_string bigint(20) auto_increment;
alter table tagente_datos_inventory add column `id_tagente_datos_inventory` bigint(20) unsigned NOT NULL;
alter table tagente_datos_inventory add primary key (id_tagente_datos_inventory);
alter table tagente_datos_inventory modify id_tagente_datos_inventory bigint(20) auto_increment;
alter table torigen add primary key (`origen`);
alter table tusuario add primary key (`id_user`);

For these changes to work correctly, they must be applied to a database without data, that is, one containing only the schema.

Applying the changes in an environment with existing data

In this scenario, the tables that received the modifications above must be rebuilt to avoid deadlocks in the environment. The process is as follows:

1.- Dump (export the data of) each of the tables involved separately, excluding the DROP and CREATE TABLE statements:

# mysqldump -u root -p pandora tagent_access --skip-add-drop-table --complete-insert --no-create-info > tagent_access.sql
# mysqldump -u root -p pandora tagente_datos --skip-add-drop-table --complete-insert --no-create-info > tagente_datos.sql
# mysqldump -u root -p pandora tagente_datos_inc --skip-add-drop-table --complete-insert --no-create-info > tagente_datos_inc.sql
# mysqldump -u root -p pandora tagente_datos_string --skip-add-drop-table --complete-insert --no-create-info > tagente_datos_string.sql
# mysqldump -u root -p pandora tagente_datos_inventory --skip-add-drop-table --complete-insert --no-create-info > tagente_datos_inventory.sql
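The five dumps can also be generated in a loop. The sketch below only echoes each command (a dry run) so it can be reviewed before execution, and it assumes the standard Pandora FMS table names:

```shell
# Dry run: print one mysqldump command per table instead of executing it.
# Remove the "echo" (and quoting around the redirection) to run for real.
for t in tagent_access tagente_datos tagente_datos_inc \
         tagente_datos_string tagente_datos_inventory; do
  echo "mysqldump -u root -p pandora $t --skip-add-drop-table --complete-insert --no-create-info > $t.sql"
done
```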

2.- Eliminate the affected tables:

mysql> drop table tagent_access;
mysql> drop table tagente_datos;
mysql> drop table tagente_datos_inc;
mysql> drop table tagente_datos_string;
mysql> drop table tagente_datos_inventory;

3.- Create all the tables again, with all the ALTER statements and all the new columns, so that the environment works without problems. To recreate the tables, use the CREATE TABLE statements for tagent_access, tagente_datos, tagente_datos_inc, tagente_datos_string and tagente_datos_inventory from the file /var/www/html/pandora_console/pandoradb.sql.

4.- Load with source all the dump files created in step 1:

# mysql -u root -p pandora
mysql> source tagent_access.sql
mysql> source tagente_datos.sql
mysql> source tagente_datos_inc.sql
mysql> source tagente_datos_string.sql
mysql> source tagente_datos_inventory.sql

At the end of the process, all the tables will have their incremental IDs rebuilt, starting from 0 in every table.
