Zabbix High-Availability Architecture Deployment Plan (2, latest version)

Published: 2025-06-11

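Zabbix High-Availability Architecture Deployment Plan (MySQL + Dual VIP + HAProxy + Nginx)

Overview: this setup uses MySQL as the database, two virtual IPs (10.0.0.100 and 10.0.0.200), HAProxy as the database load balancer, and Nginx as the web entry point.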

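1. Architecture planning

Server1 (10.0.0.12): primary Zabbix Server + MySQL master + HAProxy (primary) + Keepalived
Server2 (10.0.0.15): backup Zabbix Server + MySQL slave + HAProxy (backup) + Keepalived
Server3 (10.0.0.18): Nginx load balancer

2. Environment preparation

Run on all servers:

# Update the system; this may take a while (optional)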
yum update -y
 
# Disable the firewall and SELinux (in production, configure proper rules instead)
systemctl disable --now firewalld
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config

# Configure hostname resolution
cat > /etc/hosts << EOF
127.0.0.1   localhost localhost.localdomain
10.0.0.12   server1 zabbix-master
10.0.0.15   server2 zabbix-backup
10.0.0.18   server3 zabbix-lb
10.0.0.100  zabbix-web
10.0.0.200  zabbix-db
EOF

# Install basic tools (optional)
yum install -y vim wget net-tools

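3. Install MySQL

Run on Server2 (10.0.0.15). For Server1, it is recommended to install MySQL while installing Zabbix.

# Install MySQL
yum install mysql-server -y

# Start the service and enable it at boot (the unit is named mysqld, as in the restart commands later on)
systemctl enable --now mysqld

# Secure the installation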
mysql_secure_installation

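4. Configure MySQL master-slave replication (with plenty of setbacks)

Master (server1) configuration. It is recommended to do this step only after Zabbix has been installed by following the official Zabbix instructions.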

cat > /etc/my.cnf.d/mysql-server.cnf << EOF
[mysqld]
server-id=1
log-bin=mysql-bin
binlog-do-db=zabbix
expire-logs-days=10
max-binlog-size=100M
binlog-format=ROW
innodb_flush_log_at_trx_commit=1
sync_binlog=1
EOF

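# Restart MySQL
systemctl restart mysqld

# Create the replication user (run in the mysql client; the zabbix user was already created during the Zabbix installation, so it is not repeated here)
CREATE USER 'repl'@'%' IDENTIFIED BY 'ReplicationPassword';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';
SHOW MASTER STATUS;

Record the File and Position values from the SHOW MASTER STATUS output; they go into the slave configuration below.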
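
Copying File and Position by hand is easy to get wrong. As a minimal sketch, the values can be turned into the slave-side statement automatically. The helper name build_change_master is hypothetical, and the input is assumed to be the tab-separated output of `mysql -N -B -e 'SHOW MASTER STATUS'`; the function only prints SQL and does not run it.

```shell
#!/usr/bin/env bash
# Build a CHANGE MASTER TO statement from the first row of tab-separated
# `SHOW MASTER STATUS` output read on stdin.
# build_change_master is a hypothetical helper; it only prints SQL.
build_change_master() {
    local host="$1" user="$2" password="$3"
    local file pos _rest
    # The first two columns are File and Position.
    IFS=$'\t' read -r file pos _rest || true
    if [ -z "$file" ] || [ -z "$pos" ]; then
        echo "no binary log coordinates found on stdin" >&2
        return 1
    fi
    cat <<SQL
CHANGE MASTER TO
    MASTER_HOST='${host}',
    MASTER_USER='${user}',
    MASTER_PASSWORD='${password}',
    MASTER_LOG_FILE='${file}',
    MASTER_LOG_POS=${pos};
SQL
}
```

On the master this could be used as `mysql -u root -p -N -B -e 'SHOW MASTER STATUS' | build_change_master 10.0.0.12 repl 'ReplicationPassword'`, and the printed statement pasted into the mysql client on the slave.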

Slave (server2) configuration

cat > /etc/my.cnf.d/mysql-server.cnf << EOF
[mysqld]
server-id=2
log-bin=mysql-bin
binlog-do-db=zabbix
expire-logs-days=10
max-binlog-size=100M
binlog-format=ROW
relay-log=mysql-relay-bin
read-only=1
innodb_flush_log_at_trx_commit=1
sync_binlog=1
EOF

# Restart MySQL
systemctl restart mysqld

# Point the slave at the master (run in the mysql client; replace the FILE and POSITION values)

CHANGE MASTER TO
    MASTER_HOST='10.0.0.12',
    MASTER_USER='repl',
    MASTER_PASSWORD='ReplicationPassword',
    MASTER_LOG_FILE='mysql-bin.000006',
    MASTER_LOG_POS=1117065;

START SLAVE;
SHOW SLAVE STATUS\G

Make sure that both Slave_IO_Running and Slave_SQL_Running are Yes.
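
The Yes/Yes check can also be scripted. The sketch below assumes the text of `SHOW SLAVE STATUS\G` is piped in on stdin; replication_ok is a hypothetical helper name, not a MySQL tool.

```shell
#!/usr/bin/env bash
# Read `SHOW SLAVE STATUS\G` output on stdin; succeed only when both
# replication threads report "Yes". replication_ok is a hypothetical name.
replication_ok() {
    awk -F': *' '
        /^[ \t]*Slave_IO_Running:/  { io  = $2 }
        /^[ \t]*Slave_SQL_Running:/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}
```

For example, on the slave: `mysql -u root -p -e 'SHOW SLAVE STATUS\G' | replication_ok && echo "replication healthy"`.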

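Setback 1: replication failed with an error several times.

Solution:

On the master node

  • Change the replication user's authentication plugin to mysql_native_password (widely compatible; the default in MySQL 5.7 and earlier):

ALTER USER 'repl'@'%' IDENTIFIED WITH mysql_native_password BY 'ReplicationPassword'; FLUSH PRIVILEGES;

On the slave node

Reconfigure the master-slave connection (no SSL, for simple setups):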

CHANGE MASTER TO 
    MASTER_HOST='10.0.0.12', 
    MASTER_USER='repl', 
    MASTER_PASSWORD='ReplicationPassword', 
    MASTER_LOG_FILE='mysql-bin.xxxxxx', 
    MASTER_LOG_POS=xxxxxx;
START SLAVE;
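SHOW SLAVE STATUS\G  # check whether replication has recovered

Setback 2: after that change, a different error appeared.

Cause: server_id was 1 on both the master and the slave.

Solution:

Change server_id=xxx on either the master or the slave so that the two values differ (in /etc/my.cnf.d/mysql-server.cnf), then restart MySQL on that node.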
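
A quick way to catch this before starting replication is to compare the configured ids. This is a sketch only: get_server_id and server_ids_differ are hypothetical helpers, and they read the config files rather than the running servers (the live value is available with `SELECT @@server_id;` on each node).

```shell
#!/usr/bin/env bash
# Print the server-id (or server_id) value from a my.cnf-style file.
# get_server_id is a hypothetical helper; MySQL treats "-" and "_" in
# option names as equivalent, so both spellings are matched.
get_server_id() {
    sed -n 's/^[[:space:]]*server[-_]id[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$1" | tail -n 1
}

# Succeed only when both files define different, non-empty ids.
server_ids_differ() {
    local a b
    a="$(get_server_id "$1")"
    b="$(get_server_id "$2")"
    [ -n "$a" ] && [ -n "$b" ] && [ "$a" != "$b" ]
}
```

One possible use is to copy the slave's mysql-server.cnf to the master (for example into /tmp) and run `server_ids_differ /etc/my.cnf.d/mysql-server.cnf /tmp/mysql-server.cnf || echo "server ids clash"`.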

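Setback 3: after that fix, yet another error appeared.

Query performance_schema: log in to MySQL and query the performance_schema.replication_applier_status_by_worker table to get the detailed error from the worker thread.
SELECT * FROM performance_schema.replication_applier_status_by_worker\G
Focus on the LAST_ERROR_MESSAGE field, which shows the specific SQL statement or reason behind the failed transaction.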

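Solution:
(1) Export the data on the master
mysqldump -u root -p zabbix > zabbix_db.sql

(2) Copy the zabbix dump from the master to the slave
scp zabbix_db.sql <slave_user>@<slave_ip>:/tmp/

(3) Import the zabbix database on the slave (CREATE DATABASE in the mysql client, then the import from the shell):

CREATE DATABASE zabbix CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
mysql -u root -p zabbix < /tmp/zabbix_db.sql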

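(4) Stop replication on the slave and skip the failed transaction

STOP SLAVE;

Non-GTID mode, option A: repoint replication past the failed event
CHANGE MASTER TO
    MASTER_LOG_FILE='mysql-bin.000006',
    MASTER_LOG_POS=75510;  -- start position of the event after the failed one

Non-GTID mode, option B: skip one event group
SET GLOBAL sql_slave_skip_counter = 1;

GTID mode: sql_slave_skip_counter cannot be used here. Check the current GTID set, find the GTID of the failed transaction (assumed to be xxx:123), and commit an empty transaction in its place:
SET GTID_NEXT='xxx:123';
BEGIN; COMMIT;
SET GTID_NEXT='AUTOMATIC';

Start replication on the slave: START SLAVE;
Verify the replication status: SHOW SLAVE STATUS\G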

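Setback 4: after the steps above, a new error appeared.

Solution:

Run STOP SLAVE; RESET SLAVE; and then CHANGE MASTER TO ... once more (the same statement as before), followed by START SLAVE. After that, the status check showed replication back to normal.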


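5. Import the Zabbix database schema

Run on the master (Server1). See the official Zabbix download and installation guide for reference.

# Add the Zabbix repository
rpm -Uvh https://repo.zabbix.com/zabbix/7.0/rocky/9/x86_64/zabbix-release-latest-7.0.el9.noarch.rpm
dnf clean all

# Import the Zabbix database schema (needs the zabbix-sql-scripts package from step 6, plus an existing zabbix database and user)
zcat /usr/share/zabbix-sql-scripts/mysql/server.sql.gz | mysql --default-character-set=utf8mb4 -uzabbix -p zabbix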

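6. Install Zabbix Server

Run on both the master and the slave server:

# Install Zabbix Server, the web front end, and the agent
dnf install -y zabbix-server-mysql zabbix-web-mysql zabbix-nginx-conf zabbix-sql-scripts zabbix-agent

# Configure the Zabbix Server database connection (back up the original file first, then create a new one)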
cat > /etc/zabbix/zabbix_server.conf << EOF
LogFile=/var/log/zabbix/zabbix_server.log
LogFileSize=0
PidFile=/var/run/zabbix/zabbix_server.pid
DBHost=10.0.0.200
DBName=zabbix
DBUser=zabbix
DBPassword=ZabbixPassword
DBPort=3306
SNMPTrapperFile=/var/log/snmptrap/snmptrap.log
Timeout=4
AlertScriptsPath=/usr/lib/zabbix/alertscripts
ExternalScripts=/usr/lib/zabbix/externalscripts
LogSlowQueries=3000
StartPollers=15
StartPollersUnreachable=5
StartTrappers=5
StartPingers=1
StartDiscoverers=1
CacheSize=128M
HistoryCacheSize=64M
TrendCacheSize=64M
ValueCacheSize=256M
EOF

# Set the time zone for the web front end
sed -i 's/;date.timezone =/date.timezone = Asia\/Shanghai/' /etc/php.ini

# Start the services
systemctl enable --now zabbix-server zabbix-agent nginx php-fpm
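
Both servers have to point at the DB VIP rather than at a local MySQL instance, so it can be worth reading the value back from the file. A small sketch, assuming the plain KEY=VALUE layout written above; conf_value is a hypothetical helper and not part of Zabbix.

```shell
#!/usr/bin/env bash
# Print the last uncommented value of KEY in a Zabbix-style KEY=VALUE file.
# conf_value is a hypothetical helper, not part of Zabbix.
conf_value() {
    local key="$1" file="$2"
    awk -F= -v key="$key" '
        $1 == key { sub(/^[^=]*=/, ""); value = $0 }
        END { if (value != "") print value }
    ' "$file"
}
```

For example, `[ "$(conf_value DBHost /etc/zabbix/zabbix_server.conf)" = "10.0.0.200" ] || echo "DBHost does not point at the DB VIP"` can be run on Server1 and Server2.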

7. Configure HAProxy (database load balancing)

Run on both Server1 and Server2:

# Install HAProxy
dnf install -y haproxy

# Configure HAProxy
cat > /etc/haproxy/haproxy.cfg << EOF
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /var/lib/haproxy/stats
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode tcp
    option tcplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000

listen mysql-cluster
    bind 10.0.0.200:3306
    mode tcp
    balance source
    option mysql-check user haproxy_check
    server mysql-master 10.0.0.12:3306 check weight 100
    server mysql-slave 10.0.0.15:3306 check weight 50 backup

listen stats
    bind *:9000
    mode http
    stats enable
    stats uri /stats
    stats realm HAProxy\ Statistics
    stats auth admin:password
EOF

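# Create the health-check user (HAProxy's mysql-check only works with an account that has no password, and it needs no privileges)
mysql -u root -p << EOF
CREATE USER 'haproxy_check'@'%' IDENTIFIED WITH mysql_native_password BY '';
FLUSH PRIVILEGES;
EOF

# Start HAProxy
systemctl enable --now haproxy

Problem encountered: HAProxy failed to restart.

The cause was a problem in the configuration file.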
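
The original notes do not say which part of the configuration file was at fault. As a hedged sketch, two things are worth ruling out in a layout like this one (these are assumptions, not the author's diagnosis): syntax errors, which `haproxy -c -f` reports, and a bind address the node cannot use, either because the node does not currently hold 10.0.0.200 or because mysqld already listens on port 3306 on all addresses. The helper names haproxy_binds and haproxy_preflight are hypothetical.

```shell
#!/usr/bin/env bash
# List the address:port value of every "bind" line in an HAProxy config.
# haproxy_binds and haproxy_preflight are hypothetical helper names.
haproxy_binds() {
    awk '$1 == "bind" { print $2 }' "$1"
}

# Syntax-check the config, then show each bind address together with the
# kernel setting that allows binding to an address the host does not own.
haproxy_preflight() {
    local cfg="${1:-/etc/haproxy/haproxy.cfg}"
    haproxy -c -f "$cfg" || return 1
    haproxy_binds "$cfg"
    sysctl net.ipv4.ip_nonlocal_bind
}
```

If haproxy_preflight reports `net.ipv4.ip_nonlocal_bind = 0` on the node that does not hold the VIP, setting it to 1 with `sysctl -w net.ipv4.ip_nonlocal_bind=1` lets HAProxy start there; `journalctl -u haproxy` shows the exact startup error in either case.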

8. Configure Keepalived for the two VIPs

Master server (10.0.0.12) configuration:

# Install Keepalived
dnf install -y keepalived

# Configure Keepalived
cat > /etc/keepalived/keepalived.conf << EOF
! Configuration File for keepalived

global_defs {
    router_id ZABBIX_MASTER
}

# Health-check scripts (defined before the vrrp_instance blocks that track them)
# The web front end runs on nginx (step 6), so nginx is checked rather than httpd
vrrp_script chk_nginx {
    script "systemctl is-active nginx"
    interval 2
    weight -20
}

vrrp_script chk_haproxy {
    script "systemctl is-active haproxy"
    interval 2
    weight -20
}

# Web VIP (10.0.0.100)
vrrp_instance VI_WEB {
    state MASTER
    interface eth0
    virtual_router_id 101
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.100/24
    }
    track_script {
        chk_nginx
    }
}

# DB VIP (10.0.0.200)
vrrp_instance VI_DB {
    state MASTER
    interface eth0
    virtual_router_id 201
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 2222
    }
    virtual_ipaddress {
        10.0.0.200/24
    }
    track_script {
        chk_haproxy
    }
}
EOF

# Start Keepalived
systemctl enable --now keepalived

Backup server (10.0.0.15) configuration:

# Install Keepalived
dnf install -y keepalived

# Configure Keepalived
cat > /etc/keepalived/keepalived.conf << EOF
! Configuration File for keepalived

global_defs {
    router_id ZABBIX_BACKUP
}

# Health-check scripts (defined before the vrrp_instance blocks that track them)
# The web front end runs on nginx (step 6), so nginx is checked rather than httpd
vrrp_script chk_nginx {
    script "systemctl is-active nginx"
    interval 2
    weight -20
}

vrrp_script chk_haproxy {
    script "systemctl is-active haproxy"
    interval 2
    weight -20
}

# Web VIP (10.0.0.100)
vrrp_instance VI_WEB {
    state BACKUP
    interface eth0
    virtual_router_id 101
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.100/24
    }
    track_script {
        chk_nginx
    }
}

# DB VIP (10.0.0.200)
vrrp_instance VI_DB {
    state BACKUP
    interface eth0
    virtual_router_id 201
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 2222
    }
    virtual_ipaddress {
        10.0.0.200/24
    }
    track_script {
        chk_haproxy
    }
}
EOF

# Start Keepalived
systemctl enable --now keepalived
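
The failover behavior follows from simple arithmetic on priority and weight. The sketch below models keepalived's rule for a single tracked script (a negative weight is subtracted while the script fails, a positive weight is added while it succeeds); effective_priority is a hypothetical helper for illustration, not a keepalived command.

```shell
#!/usr/bin/env bash
# Effective VRRP priority for one tracked script, following keepalived's
# weight rule: a negative weight is subtracted while the script fails,
# a positive weight is added while the script succeeds.
# effective_priority is a hypothetical helper; status is "ok" or "fail".
effective_priority() {
    local base="$1" weight="$2" status="$3"
    if [ "$weight" -lt 0 ] && [ "$status" = "fail" ]; then
        echo $(( base + weight ))
    elif [ "$weight" -gt 0 ] && [ "$status" = "ok" ]; then
        echo $(( base + weight ))
    else
        echo "$base"
    fi
}
```

With the values used here, `effective_priority 101 -20 fail` gives 81 for Server1 when HAProxy stops, which is below Server2's 100, so the DB VIP moves to Server2. Once the check passes again, Server1 returns to 101 and takes the VIP back.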

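9. Configure Nginx load balancing (Server3)

# Install Nginx
dnf install -y nginx

# Configure Nginx to proxy the Zabbix web front end (the quoted 'EOF' keeps the shell from expanding $host and the other Nginx variables)
cat > /etc/nginx/conf.d/zabbix.conf << 'EOF'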
upstream zabbix_backend {
    server 10.0.0.100:80 weight=10 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    server_name zabbix.example.com;

    location / {
        proxy_pass http://zabbix_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_connect_timeout 150;
        proxy_send_timeout 100;
        proxy_read_timeout 100;
        proxy_buffers 4 32k;
        client_max_body_size 8m;
        client_body_buffer_size 128k;
        
        # Zabbix web tuning
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_cache_bypass $http_upgrade;
    }
}
EOF

# Start Nginx
systemctl enable --now nginx
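
The Nginx configuration is written through a shell heredoc and contains Nginx variables such as $host. A quoted delimiter (<< 'EOF') keeps them literal, while an unquoted one lets the shell substitute them first. The short demonstration below shows the difference; the variable host is set to an empty string only to mimic a shell where it is not defined.

```shell
#!/usr/bin/env bash
# Mimic a shell in which $host has no value.
host=''

# Unquoted delimiter: the shell expands $host before the text is written.
unquoted="$(cat <<EOF
proxy_set_header Host $host;
EOF
)"

# Quoted delimiter: the text is written literally, so Nginx sees $host.
quoted="$(cat <<'EOF'
proxy_set_header Host $host;
EOF
)"

printf 'unquoted: %s\nquoted:   %s\n' "$unquoted" "$quoted"
```

After writing the file, `grep -n 'proxy_set_header' /etc/nginx/conf.d/zabbix.conf` followed by `nginx -t` is a quick way to confirm that the variables survived and that the configuration parses.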

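10. Verify high availability

Open http://10.0.0.18/zabbix to complete the initial configuration of the web interface.
Verify MySQL master-slave replication (run on Server2; a connection through the DB VIP 10.0.0.200 reaches the master, which has no slave status to report):
mysql -u root -p -e "SHOW SLAVE STATUS\G"

Test failover:
Stop the Keepalived service on Server1 and verify that the VIPs move to Server2 automatically.
Open http://10.0.0.18/zabbix to confirm that the service still works.
Restart the Keepalived service on Server1 and verify that the VIPs move back automatically.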
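
To see which node currently holds a VIP during the failover test, the address list can be checked on each server. A minimal sketch, assuming the one-line-per-address output of `ip -o -4 addr show` on stdin; has_vip is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Succeed when the given IPv4 address appears in `ip -o -4 addr show`
# output supplied on stdin. has_vip is a hypothetical helper name.
has_vip() {
    local vip="$1"
    awk -v vip="$vip" '
        # Field 4 is ADDRESS/PREFIX; compare the address part exactly.
        { split($4, cidr, "/"); if (cidr[1] == vip) found = 1 }
        END { exit !found }
    '
}
```

For example, `ip -o -4 addr show | has_vip 10.0.0.100 && echo "this node holds the web VIP"` can be run on Server1 and Server2 before and after stopping Keepalived.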

11. Firewall configuration (production)

# Server1 and Server2
firewall-cmd --permanent --add-service=http
firewall-cmd --permanent --add-port=10051/tcp
firewall-cmd --permanent --add-port=3306/tcp
firewall-cmd --permanent --add-port=9000/tcp  # HAProxy statistics page
firewall-cmd --permanent --add-protocol=vrrp  # Keepalived
firewall-cmd --reload

# Server3
firewall-cmd --permanent --add-service=http
firewall-cmd --reload

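12. Monitoring and maintenance

MySQL replication status: check replication lag regularly
HAProxy status: open http://10.0.0.12:9000/stats (or http://10.0.0.15:9000/stats); HAProxy runs on Server1 and Server2, not on Server3
Keepalived status: check that the VIPs are working
Zabbix self-monitoring: configure Zabbix to monitor the status of its own components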
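
For the regular replication lag check, a threshold test on Seconds_Behind_Master is one simple option. The sketch below reads `SHOW SLAVE STATUS\G` text from stdin; lag_within is a hypothetical helper name and the 30-second limit in the example is an arbitrary assumption.

```shell
#!/usr/bin/env bash
# Read `SHOW SLAVE STATUS\G` output on stdin and succeed when
# Seconds_Behind_Master is a number no greater than max_lag seconds.
# A NULL value (replication not running) counts as failure.
# lag_within is a hypothetical helper name.
lag_within() {
    local max_lag="$1"
    awk -F': *' -v max="$max_lag" '
        /^[ \t]*Seconds_Behind_Master:/ { lag = $2; seen = 1 }
        END { exit !(seen && (lag ~ /^[0-9]+$/) && (lag + 0 <= max + 0)) }
    '
}
```

On the slave, `mysql -u root -p -e 'SHOW SLAVE STATUS\G' | lag_within 30 || echo "replication lag too high or replication stopped"` could be run from cron or wrapped as a Zabbix item.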

