一、服务器概况
Galera Cluster需要至少三个节点,在此次实验过程中,三个节点IP地址:
10.16.24.107
10.16.24.108
10.16.24.109
OS为redhat 6.3,推荐用centos 6.6.
服务器配置:
8G 内存,4核CPU,30G数据磁盘空间。
关闭防火墙:
service iptables stop;
systemctl stop firewalld (centOS 7)
chkconfig iptables off
关闭SELinux:
/etc/selinux/config中的SELINUX=disabled
三台主机上加入/etc/hosts:
10.16.24.107 db1
10.16.24.108 db2
10.16.24.109 db3
二、MariaDB 软件安装源(三个节点均需配置)
vi /etc/yum.repos.d/mariadb.repo
[mariadb]
name = MariaDB
#baseurl = http://yum.mariadb.org/5.5/centos6_amd64
#baseurl = http://yum.mariadb.org/10.0/rhel7-amd64
baseurl = http://yum.mariadb.org/10.0.25/rhel6_x86
gpgkey=https://yum.mariadb.org/RPM_GPG_KEY_MariaDB
enabled=1
gpgcheck=1
三、安装MariaDB_Galera_server软件(三个节点均需安装,先确保主机能上外网)
yum erase -y mysql
rpm -e mysqlclient16-5.1.61-4.ius.el6.x86_64
rpm -e mysql-connector-odbc-5.1.5r1144-7.el6.x86_64
yum install http://dl.fedoraproject.org/pub/epel/6/x86_64/socat_1.7.2.3_1.el6.x86_64.rpm (若安装报错),请执行下面括号内的安装:
(rpm -Uhv http://dl.iuscommunity.org/pub/ius/stable/Redhat/6/x86_64/epel_release_6_5.noarch.rpm
rpm -Uhv http://dl.iuscommunity.org/pub/ius/stable/Redhat/6/x86_64/ius_release_1.0_14.ius.el6.noarch.rpm
将下面的enabled=1改成enabled=0
cd /etc/yum.repos.d
vi epel.repo
yum install yum_plugin_replace
yum replace --enablerepo=ius-archive openssl10 –replace-with openssl10)
yum install MariaDB_Galera_server MariaDB_client rsync galera
已安装的版本信息:
MariaDB_Galera_server.i686 0:10.0.25_1.el6
MariaDB_client.i686 0:10.0.25_1.el6
MariaDB_compat.i686 0:10.0.25_1.el6
galera.i686 0:25.3.15_1.rhel6.el6
或下载二进制包:
https://downloads.mariadb.org/mariadb-galera/10.0.25/
四、初始化第一个节点(10.16.24.107)
MySQL配置文件(/etc/my.cnf):
[MYSQLD]
user=mysql
basedir=/usr
datadir=/data/mysql/3306
socket=/data/mysql/3306/mysql.sock
pid_file=mysqld.pid
port=3306
log_error=mysql.err
#log_output=FILE
#relay_log=relay_bin
### INNODB OPTIONS
innodb_buffer_pool_size=1G
innodb_additional_mem_pool_size=20M
innodb_flush_log_at_trx_commit=2
innodb_file_per_table=1
innodb_data_file_path = ibdata1:100M:autoextend
## You may want to tune the below depending on number of cores and disk sub
innodb_read_io_threads=4
innodb_write_io_threads=4
innodb_doublewrite=1
innodb_log_file_size=512M
innodb_log_files_in_group=2
innodb_log_buffer_size=64M
innodb_buffer_pool_instances=4
innodb_thread_concurrency=0
innodb_file_format=barracuda
innodb_flush_method = O_DIRECT
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
## avoid statistics update when doing e.g show tables
innodb_stats_on_metadata=0
#engine_condition_pushdown=1
#default_storage_engine=innodb
# CHARACTER SET
#collation_server = utf8_unicode_ci
#init_connect='SET NAMES utf8'
character_set_server = utf8
# REPLICATION SPECIFIC _ GENERAL
#server_id must be unique across all mysql servers participating in replication.
server_id=88057
#auto_increment_increment=2
#auto_increment_offset=SERVERID
# REPLICATION SPECIFIC _ MASTER
binlog_format=ROW
log_bin=binlog
expire_logs_days=7
log_slave_updates=1
# OTHER THINGS, BUFFERS ETC
key_buffer_size = 24M
tmp_table_size = 64M
max_heap_table_size = 64M
max_allowed_packet = 512M
#sort_buffer_size = 256K
#read_buffer_size = 256K
#read_rnd_buffer_size = 512K
#myisam_sort_buffer_size = 8M
skip_name_resolve
memlock=0
sysdate_is_now=1
max_connections=1000
thread_cache_size=512
query_cache_type = 0
query_cache_size = 0
table_open_cache=1024
lower_case_table_names=0
##
## WSREP options
##
[galera]
# Full path to wsrep provider library or 'none'
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
bind-address=0.0.0.0
wsrep_node_address=”10.16.24.107”
# Provider specific configuration options
wsrep_provider_options="gcache.size=1999M"
# Logical cluster name. Should be the same for all nodes.
wsrep_cluster_name= ‘galera_cluster’
# Group communication system handle
wsrep_cluster_address ="gcomm://10.16.24.107,10.16.24.108,10.16.24.109"
# Human_readable node name (non_unique). Hostname by default.
wsrep_node_name=”db1”
# Address for incoming client connections. Autodetect by default.
#wsrep_node_incoming_address=
# How many threads will process writesets from other nodes
wsrep_slave_threads=4
# DBUG options for wsrep provider
#wsrep_dbug_option
# Generate fake primary keys for non_PK tables (required for multi_master
# and parallel applying operation)
wsrep_certify_nonPK=1
# Location of the directory with data files. Needed for non_mysqldump
# state snapshot transfers. Defaults to mysql_real_data_home.
#wsrep_data_home_dir=
# Maximum number of rows in write set
wsrep_max_ws_rows=131072
# Maximum size of write set
wsrep_max_ws_size=1073741824
# to enable debug level logging, set this to 1
wsrep_debug=0
# convert locking sessions into transactions
wsrep_convert_LOCK_to_trx=0
# how many times to retry deadlocked autocommits
wsrep_retry_autocommit=1
# change auto_increment_increment and auto_increment_offset automatically
wsrep_auto_increment_control=1
# replicate myisam
wsrep_replicate_myisam=1
# retry autoinc insert, which failed for duplicate key error
wsrep_drupal_282555_workaround=0
# enable "strictly synchronous" semantics for read operations
wsrep_causal_reads=0
# Command to call when node status or cluster membership changes.
# Will be passed all or some of the following options:
# __status _ new status of this node
# __uuid _ UUID of the cluster
# __primary _ whether the component is primary or not ("yes"/"no")
# __members _ comma_separated list of members
# __index _ index of this node in the list
#wsrep_notify_cmd=
##
## WSREP State Transfer options
##
# State Snapshot Transfer method
# ClusterControl currently DOES NOT support wsrep_sst_method=mysqldump
wsrep_sst_method=rsync
# Address on THIS node to receive SST at. DON'T SET IT TO DONOR ADDRESS!!!
# (SST method dependent. Defaults to the first IP of the first interface)
#wsrep_sst_receive_address=
# SST authentication string. This will be used to send SST to joining nodes.
# Depends on SST method. For mysqldump method it is root:<root password>
wsrep_sst_auth= sst_user:dbpass
# Desired SST donor name.
#wsrep_sst_donor=
# Protocol version to use
# wsrep_protocol_version=
[MYSQL]
socket=/data/mysql/3306/mysql.sock
[client]
socket=/data/mysql/3306/mysql.sock
[mysqldump]
max_allowed_packet = 512M
[MYSQLD_SAFE]
pid_file=mysqld.pid
log_error=mysql.err
basedir=/usr/
datadir=/data/mysql/3306
初始化并启动MySQL:
mkdir -p /data/mysql/3306
chown -R mysql:mysql /data/mysql
su - mysql
mysql_install_db --defaults-file=/etc/my.cnf
service mysql start --wsrep-new-cluster
'/usr/bin/mysqladmin' -u root password 'safe2016'
五、初始化第二个节点(10.16.24.108)
然后修改配置文件:wsrep_node_address=”10.16.24.108”,wsrep_node_name="db2"
mkdir -p /data/mysql/3306
chown -R mysql:mysql /data/mysql
su - mysql
mysql_install_db --defaults-file=/etc/my.cnf
service mysql start
'/usr/bin/mysqladmin' -u root password 'safe2016'
六、初始化第三个节点(10.16.24.109)
然后修改配置文件:wsrep_node_address=”10.16.24.109”,wsrep_node_name="db3"
mkdir -p /data/mysql/3306
chown -R mysql:mysql /data/mysql
su - mysql
mysql_install_db --defaults-file=/etc/my.cnf
service mysql start
'/usr/bin/mysqladmin' -u root password 'safe2016'
七、为集群创建帐号
mysql> DELETE FROM mysql.user WHERE user='';
mysql> GRANT ALL ON *.* TO 'root'@'%' IDENTIFIED BY 'dbpass';
mysql> GRANT USAGE ON *.* to sst_user@'%' IDENTIFIED BY 'dbpass';
mysql> GRANT ALL PRIVILEGES on *.* to sst_user@'%';
mysql> FLUSH PRIVILEGES;
八、日常维护
1.集群启动
启动第一个节点:service mysql start --wsrep-new-cluster
启动其它节点:server mysql start
关闭节点:server mysql stop
2.通过查看4567端口确认集群是否启动
1netstat -plantu | grep mysqld
3.查询 galera插件是否已启用
mysql -e "SHOW status LIKE 'wsrep_ready'"-p
mysql -e "SHOW VARIABLES LIKE 'wsrep_cluster_address'"-p
4.检查集群是否正常
MariaDB [demo]> Show global status like "wsrep_local_state";
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| wsrep_local_state | 4 | ---表示正常监听
+-------------------+-------+
1 row in set (0.00 sec)
MariaDB [demo]> Show global status like "wsrep_local_state_comment";
+---------------------------+--------+
| Variable_name | Value |
+---------------------------+--------+
| wsrep_local_state_comment | Synced |
+---------------------------+--------+
1 row in set (0.00 sec)
MariaDB [demo]> Show global status like "wsrep_cluster_status";
+----------------------+---------+
| Variable_name | Value |
+----------------------+---------+
| wsrep_cluster_status | Primary | ---表示主节点能正常写入
+----------------------+---------+
1 row in set (0.00 sec)
MariaDB [demo]> Show global status like "wsrep_connected";
+-----------------+-------+
| Variable_name | Value |
+-----------------+-------+
| wsrep_connected | ON |
+-----------------+-------+
1 row in set (0.00 sec)
5.监控集群
查看MySQL版本: mysql> SHOW GLOBAL VARIABLES LIKE 'version';
MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'version';
+---------------+-----------------------+
| Variable_name | Value |
+---------------+-----------------------+
| version | 10.0.25-MariaDB-wsrep |
+---------------+-----------------------+
1 row in set (0.00 sec)
查看wsrep版本: mysql> SHOW GLOBAL STATUS LIKE 'wsrep_provider_version';
查看wsrep有关的所有变量: mysql> SHOW VARIABLES LIKE 'wsrep%' \G
MariaDB [(none)]> SHOW GLOBAL STATUS LIKE 'wsrep_provider_version';
+------------------------+----------------+
| Variable_name | Value |
+------------------------+----------------+
| wsrep_provider_version | 25.3.15(r3578) |
+------------------------+----------------+
1 row in set (0.00 sec)
查看Galera集群状态: mysql> show status like 'wsrep%';
监控状态参数说明:
集群完整性检查:
wsrep_cluster_state_uuid:在集群所有节点的值应该是相同的,有不同值的节点,说明其没有连接入集群.
wsrep_cluster_conf_id:正常情况下所有节点上该值是一样的.如果值不同,说明该节点被临时"分区"了.当节点之间网络连接恢复的时候应该会恢复一样的值.
wsrep_cluster_size:如果这个值跟预期的节点数一致,则所有的集群节点已经连接.
wsrep_cluster_status:集群组成的状态.如果不为"Primary",说明出现"分区"或是"split-brain"状况.
节点状态检查:
wsrep_ready: 该值为ON,则说明可以接受SQL负载.如果为Off,则需要检查wsrep_connected.
wsrep_connected: 如果该值为Off,且wsrep_ready的值也为Off,则说明该节点没有连接到集群.(可能是wsrep_cluster_address或wsrep_cluster_name等配置错造成的.具体错误需要查看错误日志)
wsrep_local_state_comment:如果wsrep_connected为On,但wsrep_ready为OFF,则可以从该项查看原因.
复制健康检查:
wsrep_flow_control_paused:表示复制停止了多长时间.即表明集群因为Slave延迟而慢的程度.值为0~1,越靠近0越好,值为1表示复制完全停止.可优化wsrep_slave_threads的值来改善.
wsrep_cert_deps_distance:有多少事务可以并行应用处理.wsrep_slave_threads设置的值不应该高出该值太多.
wsrep_flow_control_sent:表示该节点已经停止复制了多少次.
wsrep_local_recv_queue_avg:表示slave事务队列的平均长度.slave瓶颈的预兆.
最慢的节点的wsrep_flow_control_sent和wsrep_local_recv_queue_avg这两个值最高.这两个值较低的话,相对更好.
检测慢网络问题:
wsrep_local_send_queue_avg:网络瓶颈的预兆.如果这个值比较高的话,可能存在网络瓶
冲突或死锁的数目:
wsrep_last_committed:最后提交的事务数目
wsrep_local_cert_failures和wsrep_local_bf_aborts:回滚,检测到的冲突数目
九.高可用测试
1)在db1上创建demo数据库,db2是自动创建demo
DB1:
MariaDB [(none)]> create database demo;
Query OK, 1 row affected (0.00 sec)
DB2:
MariaDB [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| demo |
| information_schema |
| mysql |
| performance_schema |
| test |
+--------------------+
DB3:
MariaDB [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| demo |
| information_schema |
| mysql |
| performance_schema |
| test |
+--------------------+
2)在关掉db1后,操作db2,再启动db1,db2的数据自动与db1同步
DB1:
[mysql@mvxl0782 ~]$ service mysql stop
Shutting down MySQL..... SUCCESS!
DB2:
MariaDB [demo]> create table demo1 (id int not null);
Query OK, 0 rows affected (0.04 sec)
MariaDB [demo]> insert into demo1 select 1;
Query OK, 1 row affected (0.02 sec)
Records: 1 Duplicates: 0 Warnings: 0
DB1:
[mysql@mvxl0782 ~]$ service mysql start
Starting MySQL...SST in progress, setting sleep higher. SUCCESS!
MariaDB [demo]> use demo
Database changed
MariaDB [demo]> select * from demo1;
+----+
| id |
+----+
| 1 |
+----+
1 row in set (0.00 sec)