HA Environments with Percona XtraDB Cluster
Introduction
Pandora FMS relies on a MySQL database to store its configuration and data. To scale ("grow") horizontally, the Pandora FMS Metaconsole can spread monitoring across several independent instances. However, some scenarios are too small to justify a Metaconsole, with the management overhead it entails, yet too big to store all the data in a single database.
To solve this problem, Pandora FMS relies on Percona XtraDB Cluster, which can run in multimaster mode (several main servers), so that every node accepts both reads and writes. The load is thus distributed among the servers that host the cluster, giving the database more processing capacity than a standalone node can offer. This in turn increases the number of metrics that a single Pandora FMS node can monitor.
This scheme allows installing a Percona XtraDB Cluster replication node on the same server as the Pandora FMS server and Web Console, or keeping them on separate machines. It is enough to point each Pandora FMS server or Web Console at the node it should read from or write to, distributing the load however is most convenient in each environment.
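As an illustration, each Pandora FMS server selects its database node through the dbhost token of /etc/pandora/pandora_server.conf (the Web Console has an equivalent dbhost setting in its include/config.php). A minimal sketch with the node names used later in this guide; the values are illustrative:

# /etc/pandora/pandora_server.conf (fragment, illustrative values)
# Point other Pandora FMS servers or consoles at pandora002 or
# pandora003 to spread the load across the cluster.
dbengine mysql
dbname pandora
dbuser pandora
dbpass pandora
dbhost pandora001
dbport 3306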
The cluster needs a minimum of three nodes to work, and at least two of them must remain active for the database to stay operational.
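Once the cluster is running, the node count and quorum can be checked from any node with the standard Galera status variables, for example:

mysql> SHOW STATUS LIKE 'wsrep_cluster_size';    -- 3 with all nodes up
mysql> SHOW STATUS LIKE 'wsrep_cluster_status';  -- Primary while quorum holds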
Percona XtraDB Cluster Installation
To install Percona XtraDB Cluster correctly, first install the Percona XtraDB Cluster server and client on all nodes.
The node data used in this example is the following:
NODE 1: pandora001 192.168.80.111
NODE 2: pandora002 192.168.80.112
NODE 3: pandora003 192.168.80.113
Run on all nodes:
# yum install Percona-XtraDB-Cluster-server-57 Percona-XtraDB-Cluster-client-57
Once installed on all nodes, start the service on each node and set a password for the root user that is common to all of them, as well as a user (in this case sstuser) that will be used for replication among all nodes. Execute:
# systemctl start mysqld
Get a temporary password for root and use it to log in to the MySQL server (remember to run this on all nodes):
# mysql -uroot -p$(grep "temporary password" /var/log/mysqld.log | \
  rev | cut -d' ' -f1 | rev)

mysql> SET PASSWORD FOR 'root'@'localhost' = PASSWORD('Pandor4!');
mysql> UNINSTALL PLUGIN validate_password;
mysql> SET PASSWORD FOR 'root'@'localhost' = PASSWORD('pandora');
Now create the sstuser user:
mysql> create user sstuser@'%' identified by 'pandora';
mysql> grant all on *.* to sstuser@'%';
mysql> flush privileges;
mysql> quit
Stop the mysqld service on all nodes after finishing this configuration:
# systemctl stop mysqld
Once all the servers are stopped, start one of them in bootstrap mode, so that it acts as the main server from which the other two nodes replicate.
To do so, add a specific configuration to the node chosen as the main one, in this case the pandora001 node. Modify the /etc/my.cnf file of the pandora001 node, adding the parameters indicated between the two # Modified for PFMS lines:
[mysqld]
# Modified for PFMS\
server-id=1
# Modified for PFMS/
datadir=/var/lib/mysql
character-set-server=utf8
skip-character-set-client-handshake

# Add configuration for Percona XtraDB Cluster.
# Cache size depends on the server resources. Recommended 1G - 10 GB.
# Modified for PFMS\
wsrep_provider_options="gcache.size=10G; gcache.page_size=10G"
wsrep_provider=/usr/lib64/galera3/libgalera_smm.so
wsrep_cluster_name=pandora-cluster
wsrep_cluster_address=gcomm://
wsrep_node_name=pandora001
wsrep_node_address=192.168.80.111
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:pandora
pxc_strict_mode=PERMISSIVE
# Modified for PFMS/

# Mysql optimizations for Pandora FMS node with 4G RAM
max_allowed_packet = 64M
innodb_buffer_pool_size = 1G
innodb_lock_wait_timeout = 90
innodb_file_per_table
innodb_flush_log_at_trx_commit = 0
innodb_flush_method = O_DIRECT
innodb_log_file_size = 64M
innodb_log_buffer_size = 16M
innodb_io_capacity = 100
innodb_autoinc_lock_mode=2
thread_cache_size = 8192
thread_stack = 256K
max_connections = 8192
wait_timeout = 900
key_buffer_size=4M
read_buffer_size=128K
read_rnd_buffer_size=128K
sort_buffer_size=128K
join_buffer_size=4M
query_cache_type = 0
query_cache_size = 0
sql_mode=""
binlog_format=ROW
expire_logs_days=1
With this configuration in place, start the mysql service on the pandora001 node in bootstrap mode:

# systemctl start mysql@bootstrap.service
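To confirm that the bootstrap node has come up as a single-node Primary cluster before joining the others, check the Galera status variables, for example:

mysql> SHOW STATUS LIKE 'wsrep_cluster_size';        -- expected: 1
mysql> SHOW STATUS LIKE 'wsrep_cluster_status';      -- expected: Primary
mysql> SHOW STATUS LIKE 'wsrep_local_state_comment'; -- expected: Synced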
On the pandora002 and pandora003 nodes, modify the /etc/my.cnf file with the parameters indicated between the two # Modified for PFMS lines:
- Node pandora002:
[mysqld]
server-id=2
datadir=/var/lib/mysql
character-set-server=utf8
skip-character-set-client-handshake

# Add configuration for Percona XtraDB Cluster.
# Cache size depends on the server resources. Recommended 1G - 10 GB.
# Modified for PFMS\
wsrep_provider_options="gcache.size=10G; gcache.page_size=10G"
wsrep_provider=/usr/lib64/galera3/libgalera_smm.so
wsrep_cluster_name=pandora-cluster
wsrep_cluster_address=gcomm://pandora001,pandora003
wsrep_node_name=pandora002
wsrep_node_address=192.168.80.112
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:pandora
pxc_strict_mode=PERMISSIVE
# Modified for PFMS/

# Mysql optimizations for Pandora FMS
max_allowed_packet = 64M
innodb_buffer_pool_size = 1G
innodb_lock_wait_timeout = 90
innodb_file_per_table
innodb_flush_log_at_trx_commit = 0
innodb_flush_method = O_DIRECT
innodb_log_file_size = 64M
innodb_log_buffer_size = 16M
innodb_io_capacity = 100
innodb_autoinc_lock_mode=2
thread_cache_size = 8192
thread_stack = 256K
max_connections = 8192
wait_timeout = 900
key_buffer_size=4M
read_buffer_size=128K
read_rnd_buffer_size=128K
sort_buffer_size=128K
join_buffer_size=4M
query_cache_type = 0
query_cache_size = 0
sql_mode=""
binlog_format=ROW
expire_logs_days=1
- Node pandora003:
[mysqld]
server-id=3
datadir=/var/lib/mysql
character-set-server=utf8
skip-character-set-client-handshake

# Add configuration for Percona XtraDB Cluster.
# Cache size depends on the server resources. Recommended 1G - 10 GB.
# Modified for PFMS\
wsrep_provider_options="gcache.size=10G; gcache.page_size=10G"
wsrep_provider=/usr/lib64/galera3/libgalera_smm.so
wsrep_cluster_name=pandora-cluster
wsrep_cluster_address=gcomm://pandora001,pandora002
wsrep_node_name=pandora003
wsrep_node_address=192.168.80.113
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:pandora
pxc_strict_mode=PERMISSIVE
# Modified for PFMS/

# Mysql optimizations for Pandora FMS
max_allowed_packet = 64M
innodb_buffer_pool_size = 1G
innodb_lock_wait_timeout = 90
innodb_file_per_table
innodb_flush_log_at_trx_commit = 0
innodb_flush_method = O_DIRECT
innodb_log_file_size = 64M
innodb_log_buffer_size = 16M
innodb_io_capacity = 100
innodb_autoinc_lock_mode=2
thread_cache_size = 8192
thread_stack = 256K
max_connections = 8192
wait_timeout = 900
key_buffer_size=4M
read_buffer_size=128K
read_rnd_buffer_size=128K
sort_buffer_size=128K
join_buffer_size=4M
query_cache_type = 0
query_cache_size = 0
sql_mode=""
binlog_format=ROW
expire_logs_days=1
Start the mysqld service on the pandora002 and pandora003 nodes to begin the synchronization:
# systemctl start mysqld
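Each joining node takes a full state snapshot (SST) from the bootstrap node using the sstuser credentials configured above. You may check on pandora002 and pandora003 that the transfer finished:

mysql> SHOW STATUS LIKE 'wsrep_local_state_comment'; -- Synced once the SST is done
mysql> SHOW STATUS LIKE 'wsrep_cluster_size';        -- 3 with both joiners in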
At this point you may install the Pandora FMS database on node 1 and check that it replicates correctly to the other two servers.
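A quick way to test replication before loading real data is to write on one node and read on another; a minimal sketch using a hypothetical throwaway database:

(pandora001) mysql> CREATE DATABASE repl_test;
(pandora001) mysql> CREATE TABLE repl_test.t1 (id INT PRIMARY KEY, note VARCHAR(32));
(pandora001) mysql> INSERT INTO repl_test.t1 VALUES (1, 'written on pandora001');
(pandora002) mysql> SELECT * FROM repl_test.t1;  -- the row should already be there
(pandora001) mysql> DROP DATABASE repl_test;     -- clean up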
Once you have verified that replication works correctly, stop the node that was started in bootstrap mode (in this case the pandora001 node):

(NODE 1) # systemctl stop mysql@bootstrap.service
In the my.cnf file of NODE 1, change the wsrep_cluster_address parameter to add the other two cluster nodes. It must look as follows:

wsrep_cluster_address=gcomm://pandora002,pandora003
Start the mysqld service again on the pandora001 node, this time in normal mode, so that it synchronizes with the other two nodes already active:

# systemctl start mysqld
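All three nodes should now report the same membership; for example, from any node:

mysql> SHOW STATUS LIKE 'wsrep_incoming_addresses';
-- expected to list the three nodes (in any order), e.g.:
-- 192.168.80.111:3306,192.168.80.112:3306,192.168.80.113:3306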
Changes to the Pandora FMS schema for correct operation
To avoid deadlock problems caused by the constant simultaneous reads and writes, apply the following modifications to the Pandora FMS tables:
alter table tagent_access add column `id_tagent_access` bigint(20) unsigned NOT NULL;
alter table tagent_access add primary key (id_tagent_access);
alter table tagent_access modify id_tagent_access bigint(20) auto_increment;

alter table tagente_datos add column `id_tagente_datos` bigint(20) unsigned NOT NULL;
alter table tagente_datos add primary key (id_tagente_datos);
alter table tagente_datos modify id_tagente_datos bigint(20) auto_increment;

alter table tagente_datos_inc add column `id_tagente_datos_inc` bigint(20) unsigned NOT NULL;
alter table tagente_datos_inc add primary key (id_tagente_datos_inc);
alter table tagente_datos_inc modify id_tagente_datos_inc bigint(20) auto_increment;

alter table tagente_datos_string add column `id_tagente_datos_string` bigint(20) unsigned NOT NULL;
alter table tagente_datos_string add primary key (id_tagente_datos_string);
alter table tagente_datos_string modify id_tagente_datos_string bigint(20) auto_increment;

alter table tagente_datos_inventory add column `id_tagente_datos_inventory` bigint(20) unsigned NOT NULL;
alter table tagente_datos_inventory add primary key (id_tagente_datos_inventory);
alter table tagente_datos_inventory modify id_tagente_datos_inventory bigint(20) auto_increment;

alter table torigen add primary key (`origen`);
alter table tusuario add primary key (`id_user`);
For these changes to work correctly, they should be applied to a database without data, containing only the schema.
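These surrogate auto-increment keys stay conflict-free across simultaneous writers because Galera (with wsrep_auto_increment_control=ON, the default) interleaves the ID sequences per node. This can be observed with:

mysql> SHOW VARIABLES LIKE 'auto_increment%';
-- On a 3-node cluster, auto_increment_increment is 3 and
-- auto_increment_offset differs on each node, so IDs generated
-- on different nodes never collide.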
Applying changes in an environment with previous data
In this scenario, the tables that receive the modifications must be rebuilt to avoid deadlocks in the environment. The process is as follows:
1.- DUMP (export the data of) all the tables involved separately, excluding the DROP TABLE and CREATE TABLE statements:
# mysqldump -u root -p pandora tagent_access --skip-add-drop-table --complete-insert --no-create-info > tagent_access.sql
# mysqldump -u root -p pandora tagente_datos --skip-add-drop-table --complete-insert --no-create-info > tagente_datos.sql
# mysqldump -u root -p pandora tagente_datos_inc --skip-add-drop-table --complete-insert --no-create-info > tagente_datos_inc.sql
# mysqldump -u root -p pandora tagente_datos_string --skip-add-drop-table --complete-insert --no-create-info > tagente_datos_string.sql
# mysqldump -u root -p pandora tagente_datos_inventory --skip-add-drop-table --complete-insert --no-create-info > tagente_datos_inventory.sql
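Equivalently, the five exports can be scripted with a small shell loop (a sketch with the same mysqldump options; -p will prompt for the password on each iteration):

for t in tagent_access tagente_datos tagente_datos_inc tagente_datos_string tagente_datos_inventory; do
  # Dump data (INSERTs only) for each affected table into its own file.
  mysqldump -u root -p pandora "$t" --skip-add-drop-table --complete-insert --no-create-info > "$t.sql"
done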
2.- Delete the affected tables:
mysql> drop table tagent_access;
mysql> drop table tagente_datos;
mysql> drop table tagente_datos_inc;
mysql> drop table tagente_datos_string;
mysql> drop table tagente_datos_inventory;
3.- Create all the tables again with the ALTER statements applied and the new columns in place, so that the environment works without problems. To recreate the tables, use the CREATE TABLE statements from the /var/www/html/pandora_console/pandoradb.sql file for the tables tagent_access, tagente_datos, tagente_datos_inc, tagente_datos_string and tagente_datos_inventory. A sketch for extracting one of them is shown below.
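One way to pull a single CREATE TABLE block out of pandoradb.sql is with sed, assuming each definition in that file starts with CREATE TABLE IF NOT EXISTS and ends at the first line closed with ';' (verify the output against the actual file before loading it):

# sed -n '/^CREATE TABLE IF NOT EXISTS `tagent_access`/,/;$/p' /var/www/html/pandora_console/pandoradb.sql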
4.- Import with source all the DUMP data created in step 1:

# mysql -u root -p pandora
mysql> source tagent_access.sql
mysql> source tagente_datos.sql
mysql> source tagente_datos_inc.sql
mysql> source tagente_datos_string.sql
mysql> source tagente_datos_inventory.sql
At the end of the process, all the tables will have their incremental IDs created from scratch.