Keepalived + MySQL master-slave replication in containers (failed attempt)

Published: 2025-08-29


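Because of how keepalived itself works, this setup cannot deliver what it was meant to: two keepalived instances and two MySQL servers (master-slave replication) running as containers on a single host, with automatic failover to the other available node after a fault. The reasons are given at the end.
The database master-slave replication part is still usable.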

Preparation

Create the network

docker network create \
  --subnet=192.168.10.0/24 \
  --gateway=192.168.10.1 \
  --driver=bridge docker-mysql-ha

Compose file (/data/dockercompose/docker-compose-mysql-ha.yml)

services:
  mysql-master:
    container_name: mysql-master
    hostname: mysql-master
    image: mysql:8.0
    networks:
      docker-mysql-ha:
        ipv4_address: 192.168.10.11
    environment:
      TZ: Asia/Shanghai
      MYSQL_ROOT_PASSWORD: 20240510
      MYSQL_USER: kd1
      MYSQL_PASSWORD: 20240511
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /data/dockerfiles/mysql-master/mysql_data:/var/lib/mysql
      - /data/dockerfiles/mysql-master/conf/my.cnf:/etc/my.cnf
      - /data/dockerfiles/mysql-master/mysqld:/var/run/mysqld
      - /data/dockerfiles/mysql-master/log:/var/log/mysql
    restart: unless-stopped

  mysql-slave:
    container_name: mysql-slave
    hostname: mysql-slave
    image: mysql:8.0
    networks:
      docker-mysql-ha:
        ipv4_address: 192.168.10.12
    environment:
      TZ: Asia/Shanghai
      MYSQL_ROOT_PASSWORD: 20240510
      MYSQL_USER: kd1
      MYSQL_PASSWORD: 20240511
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /data/dockerfiles/mysql-slave/mysql_data:/var/lib/mysql
      - /data/dockerfiles/mysql-slave/conf/my.cnf:/etc/my.cnf
      - /data/dockerfiles/mysql-slave/mysqld:/var/run/mysqld
      - /data/dockerfiles/mysql-slave/log:/var/log/mysql
    restart: unless-stopped
    
networks:
  docker-mysql-ha:
    external: true

Create the persistent storage directories

mkdir -p /data/dockerfiles/mysql-master/conf/
mkdir -p /data/dockerfiles/mysql-slave/conf/

Add the configuration files

Master node configuration
my.cnf

# For advice on how to change settings please see
# http://dev.mysql.com/doc/refman/8.0/en/server-configuration-defaults.html

[mysqld]
#
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
# innodb_buffer_pool_size = 128M
#
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
# log_bin
#
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M

# Remove leading # to revert to previous value for default_authentication_plugin,
# this will increase compatibility with older clients. For background, see:
# https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_default_authentication_plugin
# default-authentication-plugin=mysql_native_password

# Basic paths
datadir=/var/lib/mysql
socket=/var/run/mysqld/mysqld.sock
user=mysql
pid-file=/var/run/mysqld/mysqld.pid
secure-file-priv=/var/lib/mysql-files

# Network and connections
skip-host-cache
skip-name-resolve
max_connections=200
wait_timeout=28800
interactive_timeout=28800

# Character set and collation
character-set-server=utf8mb4
collation-server=utf8mb4_unicode_ci
init_connect='SET NAMES utf8mb4'

# Server time zone (ideally the same as the host)
default-time-zone='+08:00'

# Binary log (recommended; used for data recovery and master-slave replication)
server-id=1
binlog_format=ROW
log_bin=/var/log/mysql/mysql-bin.log
binlog_expire_logs_seconds=604800  # keep 7 days
sync_binlog=1

# Slow query log (key for tuning)
slow_query_log=1
slow_query_log_file=/var/log/mysql/mysql-slow.log
long_query_time=1
log_queries_not_using_indexes=1

# Error log
# log_error=/var/log/mysql/mysql-error.log

# InnoDB tuning
innodb_buffer_pool_size=512M
innodb_redo_log_capacity=256M
innodb_file_per_table=1
innodb_flush_log_at_trx_commit=1

# Cache and buffer tuning
table_open_cache=400
tmp_table_size=64M
max_heap_table_size=64M
sort_buffer_size=4M
read_buffer_size=2M
join_buffer_size=4M

[client]
socket=/var/run/mysqld/mysqld.sock
default-character-set=utf8mb4

[mysqldump]
default-character-set=utf8mb4
quick
max_allowed_packet=64M

!includedir /etc/mysql/conf.d/

Slave node configuration
my.cnf

# For advice on how to change settings please see
# http://dev.mysql.com/doc/refman/8.0/en/server-configuration-defaults.html

[mysqld]
#
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
# innodb_buffer_pool_size = 128M
#
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
# log_bin
#
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M

# Remove leading # to revert to previous value for default_authentication_plugin,
# this will increase compatibility with older clients. For background, see:
# https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_default_authentication_plugin
# default-authentication-plugin=mysql_native_password

# Basic paths
datadir=/var/lib/mysql
socket=/var/run/mysqld/mysqld.sock
user=mysql
pid-file=/var/run/mysqld/mysqld.pid
secure-file-priv=/var/lib/mysql-files

# Network and connections
skip-host-cache
skip-name-resolve
max_connections=200
wait_timeout=28800
interactive_timeout=28800

# Character set and collation
character-set-server=utf8mb4
collation-server=utf8mb4_unicode_ci
init_connect='SET NAMES utf8mb4'

# Server time zone (ideally the same as the host)
default-time-zone='+08:00'

# Binary log (recommended; used for data recovery and master-slave replication)
server-id=2
relay-log=/var/lib/mysql/relay-log
binlog_format=ROW
log_bin=/var/log/mysql/mysql-bin.log
binlog_expire_logs_seconds=604800  # keep 7 days
sync_binlog=1

# Slow query log (key for tuning)
slow_query_log=1
slow_query_log_file=/var/log/mysql/mysql-slow.log
long_query_time=1
log_queries_not_using_indexes=1

# Error log
# log_error=/var/log/mysql/mysql-error.log

# InnoDB tuning
innodb_buffer_pool_size=512M
innodb_redo_log_capacity=256M
innodb_file_per_table=1
innodb_flush_log_at_trx_commit=1

# Cache and buffer tuning
table_open_cache=400
tmp_table_size=64M
max_heap_table_size=64M
sort_buffer_size=4M
read_buffer_size=2M
join_buffer_size=4M

[client]
socket=/var/run/mysqld/mysqld.sock
default-character-set=utf8mb4

[mysqldump]
default-character-set=utf8mb4
quick
max_allowed_packet=64M

!includedir /etc/mysql/conf.d/

Start the containers

docker compose -f /data/dockercompose/docker-compose-mysql-ha.yml up -d

MySQL master-slave replication setup

Master configuration

docker exec -it mysql-master mysql -uroot -p20240510

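# Create a new user named replicator. '%' allows connections from any IP address. The account password is replica123.
mysql> CREATE USER 'replicator'@'%' IDENTIFIED BY 'replica123';

# Grant the REPLICATION SLAVE privilege to replicator. It lets this user connect to the master as a slave and read the binary log. *.* applies it to all databases and tables.
mysql> GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'%';

# Reload the grant tables (not strictly needed after CREATE USER / GRANT, but harmless):
mysql> FLUSH PRIVILEGES;


# Show the current binlog status. Note the File and Position values; the slave needs them later.
mysql> SHOW MASTER STATUS;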

+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000005 |      868 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
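
Copying File and Position by hand is easy to get wrong. As a minimal sketch, the helper below pulls both values out of the tab-separated output that the mysql client prints in batch mode (`-N -B`). The name `parse_master_status` is hypothetical and not part of MySQL; the container name and root password in the usage comment are the ones used in this article.

```shell
# parse_master_status: hypothetical helper, not part of MySQL.
# Reads the first tab-separated row of `SHOW MASTER STATUS` from stdin
# and prints "<binlog file> <position>".
parse_master_status() {
    awk -F'\t' 'NR == 1 { print $1, $2 }'
}

# Usage (needs the running mysql-master container from this article):
#   docker exec mysql-master mysql -uroot -p20240510 -N -B \
#       -e 'SHOW MASTER STATUS' | parse_master_status
```

The two printed fields map directly to MASTER_LOG_FILE and MASTER_LOG_POS on the slave.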

Slave configuration

docker exec -it mysql-slave mysql -uroot -p20240510

# On the slave, set the master connection info
mysql> CHANGE MASTER TO
    MASTER_HOST='192.168.10.11',
    MASTER_PORT=3306,
    MASTER_USER='replicator',
    MASTER_PASSWORD='replica123',
    MASTER_LOG_FILE='mysql-bin.000005',  -- replace with the File value from SHOW MASTER STATUS on the master
    MASTER_LOG_POS=868,                 -- replace with the Position value from the same output
    GET_MASTER_PUBLIC_KEY = 1;

# Start replication
mysql> START SLAVE;

# Check the replication status
mysql> SHOW SLAVE STATUS\G

Confirm that both of the following are Yes:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Sample output:
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for source to send event
                  Master_Host: 192.168.10.11
                  Master_User: replicator
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000005
          Read_Master_Log_Pos: 868
               Relay_Log_File: relay-log.000002
                Relay_Log_Pos: 326
        Relay_Master_Log_File: mysql-bin.000005
             Slave_IO_Running: Yes              -- should be Yes
            Slave_SQL_Running: Yes              -- should be Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 868
              Relay_Log_Space: 530
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
                  Master_UUID: b8b661ab-6d15-11f0-be67-0242c0a80a0b
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Replica has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
       Master_public_key_path: 
        Get_master_public_key: 1
            Network_Namespace: 
1 row in set, 1 warning (0.00 sec)
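
The check on the two Running fields can also be scripted. The sketch below assumes the `SHOW SLAVE STATUS\G` text arrives on stdin; `replication_ok` is a hypothetical helper name, and the container name and password in the usage comment are the ones from this article.

```shell
# replication_ok: hypothetical helper.
# Reads `SHOW SLAVE STATUS\G` output from stdin and succeeds only if both
# Slave_IO_Running and Slave_SQL_Running report Yes.
replication_ok() {
    local running
    # The colon right after "Running" keeps Slave_SQL_Running_State out of the count.
    running=$(grep -cE '^[[:space:]]*Slave_(IO|SQL)_Running:[[:space:]]*Yes' || true)
    [ "$running" -eq 2 ]
}

# Usage (needs the running mysql-slave container from this article):
#   docker exec mysql-slave mysql -uroot -p20240510 -e 'SHOW SLAVE STATUS\G' \
#       | replication_ok && echo "replication healthy"
```

This checks only the two thread states. Last_IO_Error and Last_SQL_Error are still worth reading when it fails.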

Verify synchronization

  • On the master
CREATE DATABASE test_sync;
USE test_sync;
CREATE TABLE users (id INT PRIMARY KEY, name VARCHAR(100));
INSERT INTO users VALUES (1, 'Alice');
  • On the slave
SELECT * FROM test_sync.users;

# Sample output
+----+-------+
| id | name  |
+----+-------+
|  1 | Alice |
+----+-------+
1 row in set (0.00 sec)

Changing the replication configuration

  • Master
docker exec -it mysql-master mysql -uroot -p20240510

-- Make sure the replication user created earlier still exists
mysql> SELECT user, host FROM mysql.user WHERE user = 'replicator';
+------------+------+
| user       | host |
+------------+------+
| replicator | %    |
+------------+------+
1 row in set (0.00 sec)

-- Check the current binlog status again. Note the File and Position values; the slave needs them later.
mysql> SHOW MASTER STATUS;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000011 |      756 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
  • Slave
docker exec -it mysql-slave mysql -uroot -p20240510

-- Stop the I/O thread of the previously configured replication
STOP REPLICA IO_THREAD;

-- On the slave, set the master connection info
CHANGE MASTER TO
    MASTER_HOST='192.168.0.8',
    MASTER_PORT=3306,
    MASTER_USER='replicator',
    MASTER_PASSWORD='replica123',
    MASTER_LOG_FILE='mysql-bin.000011',
    MASTER_LOG_POS=756,                 -- Position from SHOW MASTER STATUS above
    GET_MASTER_PUBLIC_KEY = 1;

-- Start replication
mysql> START SLAVE;

-- Check the replication status
mysql> SHOW SLAVE STATUS\G

Confirm that both of the following are Yes:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
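
The CHANGE MASTER TO statement in this section takes its file and position from SHOW MASTER STATUS, and a mistyped position points the slave at the wrong place in the binlog. One way to avoid retyping is to generate the statement. The sketch below uses a hypothetical helper, `build_change_master`; the user name and password are the ones created earlier in this article.

```shell
# build_change_master: hypothetical helper.
# Arguments: <master host> <binlog file> <binlog position>
# The replication user and password are the ones created earlier in this article.
build_change_master() {
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_PORT=3306, " "$1"
    printf "MASTER_USER='replicator', MASTER_PASSWORD='replica123', "
    printf "MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d, " "$2" "$3"
    printf "GET_MASTER_PUBLIC_KEY=1;\n"
}

# Usage with the coordinates shown in this section:
#   build_change_master 192.168.0.8 mysql-bin.000011 756
```

The printed statement can be pasted into the mysql prompt on the slave, or passed to the client with `-e`.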

Verify synchronization

  • On the master
USE test_sync;
INSERT INTO users VALUES (5, 'Slice');
  • On the slave
SELECT * FROM test_sync.users;

# Sample output
+----+-------+
| id | name  |
+----+-------+
|  1 | Alice |
|  5 | Slice |
+----+-------+
2 rows in set (0.00 sec)

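Removing replication from the slave

-- Stop the replication threads
STOP SLAVE;

-- Reset the replication configuration (clears all replication-related information)
RESET SLAVE ALL;

-- Check whether read_only is enabled: ON means enabled, OFF means disabled
SHOW VARIABLES LIKE 'read_only';

-- (Optional) To turn the slave into an ordinary, writable server, run:
SET GLOBAL read_only = OFF;

-- (Optional) Check the replication status to confirm it is gone; an empty result or an error means replication has been removed.
SHOW SLAVE STATUS\G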

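Install the Keepalived containers (two)

  • Two Keepalived containers are needed, one tied to mysql-master and one to mysql-slave.
    The Keepalived containers share the host's network namespace (host mode) so that they can bind the VIP:
# Create the configuration directories
mkdir -p /data/dockerfiles/keepalived/master
mkdir -p /data/dockerfiles/keepalived/slave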

Write the Keepalived configuration

Note: interface should be set to the name of the network interface that belongs to docker-mysql-ha, i.e. the 192.168.10.x network

  • Master side:
    File: /data/dockerfiles/keepalived/master/keepalived.conf
vrrp_script check_mysql {
    script "/usr/local/etc/keepalived/check_mysql.sh 192.168.10.11"
    interval 2
    weight -30
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1

    authentication {
        auth_type PASS
        auth_pass 1234
    }

    virtual_ipaddress {
        172.21.147.250
    }

    track_script {
        check_mysql
    }
}

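  • Script: /data/dockerfiles/keepalived/master/check_mysql.sh
#!/bin/bash
# Health check for keepalived: exit 0 if the MySQL port accepts a TCP connection, 1 otherwise.
CHECK_HOST=$1
CHECK_PORT=3306

# Open a TCP connection through bash's /dev/tcp and give up after 1 second.
if timeout 1 bash -c "echo > /dev/tcp/${CHECK_HOST}/${CHECK_PORT}"; then
    exit 0
else
    exit 1
fi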

  • Slave side:
    File: /data/dockerfiles/keepalived/slave/keepalived.conf
vrrp_script check_mysql {
    script "/usr/local/etc/keepalived/check_mysql.sh 192.168.10.12"
    interval 2
    weight -30
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1

    authentication {
        auth_type PASS
        auth_pass 1234
    }

    virtual_ipaddress {
        172.21.147.250
    }

    track_script {
        check_mysql
    }
}

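  • Script: /data/dockerfiles/keepalived/slave/check_mysql.sh
#!/bin/bash
# Health check for keepalived: exit 0 if the MySQL port accepts a TCP connection, 1 otherwise.
CHECK_HOST=$1
CHECK_PORT=3306

# Open a TCP connection through bash's /dev/tcp and give up after 1 second.
if timeout 1 bash -c "echo > /dev/tcp/${CHECK_HOST}/${CHECK_PORT}"; then
    exit 0
else
    exit 1
fi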

  • Don't forget to make the scripts executable:
chmod +x /data/dockerfiles/keepalived/*/check_mysql.sh
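
A side note on the priorities in the two configurations above. With a negative weight, keepalived lowers an instance's priority by that amount while the tracked script is failing. The sketch below only reproduces that arithmetic for the values used here (150 and 100, weight -30). `effective_priority` is a hypothetical helper, not a keepalived command, and it does not model positive weights.

```shell
# effective_priority: hypothetical helper that mirrors the arithmetic only.
# Arguments: <base priority> <script weight> <ok|fail>
# With a negative weight, the weight is added to the priority while the
# tracked script is failing, which lowers it.
effective_priority() {
    local base=$1 weight=$2 state=$3
    if [ "$state" = "fail" ] && [ "$weight" -lt 0 ]; then
        echo $(( base + weight ))
    else
        echo "$base"
    fi
}

master_failed=$(effective_priority 150 -30 fail)   # 150 - 30
backup_ok=$(effective_priority 100 -30 ok)          # unchanged

if [ "$master_failed" -gt "$backup_ok" ]; then
    echo "master keeps the VIP ($master_failed > $backup_ok)"
else
    echo "backup takes over ($backup_ok >= $master_failed)"
fi
```

Under these assumptions the master drops to 120 when its check fails, which is still above the backup's 100. A MySQL outage on its own would therefore not move the VIP; the weight's magnitude has to exceed the 50-point gap between the two priorities for that to happen.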

Compose file

Make sure the containers use the host network (otherwise they cannot bind the VIP):

version: "3.8"

services:
  keepalived-master:
    image: osixia/keepalived:2.0.20
    container_name: keepalived-master
    network_mode: host
    cap_add:
      - NET_ADMIN
      - NET_RAW
      - NET_BROADCAST
    volumes:
      - /data/dockerfiles/keepalived/master/keepalived.conf:/usr/local/etc/keepalived/keepalived.conf:ro
      - /data/dockerfiles/keepalived/master/check_mysql.sh:/usr/local/etc/keepalived/check_mysql.sh:ro
    restart: always

  keepalived-slave:
    image: osixia/keepalived:2.0.20
    container_name: keepalived-slave
    network_mode: host
    cap_add:
      - NET_ADMIN
      - NET_RAW
      - NET_BROADCAST
    volumes:
      - /data/dockerfiles/keepalived/slave/keepalived.conf:/usr/local/etc/keepalived/keepalived.conf:ro
      - /data/dockerfiles/keepalived/slave/check_mysql.sh:/usr/local/etc/keepalived/check_mysql.sh:ro
    restart: always

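Reasons

1. Fatal architectural flaws

The VIP and the host IP are in different subnets

  • The VIP (192.168.1.100/24) and the host IP (192.168.10.25/24) are in different subnets.
  • The two cannot talk to each other directly at layer 2, and ARP broadcasts for the VIP never reach a host that answers them.
  • An interface can technically carry addresses from more than one subnet, but a VIP only works when the other hosts on the segment can reach it, and an address from a foreign subnet is not reachable without extra routes.

Routing black hole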

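[Diagram: client 192.168.10.24 sends a request for the VIP 192.168.1.100 to the router; the router routes 192.168.10.0/24 to 192.168.10.25, and the request for the VIP ends in a black hole.]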
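  • The router has no route entry for the 192.168.1.0/24 network.
  • Return packets are dropped because the routing is asymmetric.

Proxy ARP fails

  • The host cannot answer ARP requests for the 192.168.1.0/24 network.
  • Clients keep sending ARP requests and get no reply.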
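
The subnet mismatch behind these points can be checked mechanically before a VIP is chosen. The sketch below tests whether an address falls inside a given network, using the addresses from this section. `ip_to_int` and `in_subnet` are hypothetical helper names for illustration, and only plain dotted-quad IPv4 is handled.

```shell
# ip_to_int and in_subnet are hypothetical helpers for illustration.
# ip_to_int: convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local old_ifs="$IFS"
    IFS=.
    set -- $1            # split the address on dots into $1..$4
    IFS="$old_ifs"
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# in_subnet <ip> <network/prefix>: succeed if <ip> lies inside the network.
in_subnet() {
    local ip=$1 net=${2%/*} bits=${2#*/}
    local mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
    [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

for addr in 192.168.10.25 192.168.1.100; do
    if in_subnet "$addr" 192.168.10.0/24; then
        echo "$addr is on-link in 192.168.10.0/24"
    else
        echo "$addr is outside 192.168.10.0/24"
    fi
done
```

With these inputs the host address 192.168.10.25 is reported as on-link and the VIP 192.168.1.100 as outside the network, which is the mismatch this section describes.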

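2. Contradictions in the design

| Design element | Contradiction | Consequence |
|---|---|---|
| VIP failover on a single host | A single node has nothing to fail over to | Adds complexity with no benefit |
| Port redirection (DNAT) | Overlaps with what the VIP does | Confusing traffic path |
| Keepalived monitoring | Monitoring a single point is pointless | No real HA |
| VIP in a different subnet | The VIP is off-link and has no route | Unreachable |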

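3. Conflicts with basic networking

Layer 2 / layer 3 mismatch

  • The layer-3 address (192.168.1.100) is bound to the layer-2 interface eth0.
  • eth0, however, already belongs to the 192.168.10.0/24 subnet.
  • Linux does accept a second address from another subnet on the same interface, but nothing else on the 192.168.10.0/24 segment knows that 192.168.1.100 lives there, so the address is configured yet unreachable.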

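ARP limitations

# ARP request from the client
ARP who-has 192.168.1.100 tell 192.168.10.24

# Response from the host
# The requested IP is outside the interface's own subnet, and the request goes unanswered.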
