5-2 Running MySQL with One Master and Multiple Slaves on a StatefulSet


Preface

A MySQL deployment with one master and multiple slaves is a production-grade database setup. First, it keeps multiple automatically synchronized copies of the data, strengthening data safety. Second, it scales the database: read-write splitting improves load capacity, with read operations from web and other business servers handled by multiple slave databases, while write operations that modify data go to a single master database. The data files are backed up automatically through master-slave replication.


Below we deploy this one-master-multiple-slave MySQL setup on a k8s cluster using NFS, PV/PVC, a ConfigMap, and a StatefulSet.



Image Preparation

Pull the mysql image (the database server image) and push it to the local registry:

docker pull mysql:5.7.36
docker tag mysql:5.7.36 easzlab.io.local:5000/myhub/mysql:5.7.36
docker push easzlab.io.local:5000/myhub/mysql:5.7.36

Pull the xtrabackup image, used for master-slave replication, and push it to the local registry:

docker pull registry.cn-hangzhou.aliyuncs.com/hxpdocker/xtrabackup:1.0
docker tag registry.cn-hangzhou.aliyuncs.com/hxpdocker/xtrabackup:1.0 easzlab.io.local:5000/myhub/xtrabackup:1.0
docker push easzlab.io.local:5000/myhub/xtrabackup:1.0
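Optionally, confirm that both images actually landed in the local registry. A minimal check, assuming easzlab.io.local:5000 is a standard Docker Registry v2 (these /v2/ endpoints belong to the registry HTTP API):

# List the repositories and tags held by the local registry
curl http://easzlab.io.local:5000/v2/_catalog
curl http://easzlab.io.local:5000/v2/myhub/mysql/tags/list
curl http://easzlab.io.local:5000/v2/myhub/xtrabackup/tags/list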

NFS Preparation

On the NFS server 192.168.100.155, create the directories and export them:

# Create 3 shared directories, owned by root:root.
mkdir -p /data/k8sdata/myserver/mysql-datadir-{1..3}
chown root:root /data/k8sdata/myserver/mysql-datadir-{1..3}

# Export the shares: the allowed range is the physical machines' subnet, rw allows read-write, and no_root_squash keeps a remote root as root (MySQL needs root privileges).
cat << EOF >> /etc/exports
/data/k8sdata/myserver/mysql-datadir-1 192.168.100.0/24(rw,no_root_squash)
/data/k8sdata/myserver/mysql-datadir-2 192.168.100.0/24(rw,no_root_squash)
/data/k8sdata/myserver/mysql-datadir-3 192.168.100.0/24(rw,no_root_squash)
EOF

# Re-export and list the exports
exportfs -r
exportfs -v

# NFS uses many ports; in a test environment the firewall can simply be disabled
systemctl disable --now firewalld
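From one of the k8s nodes, verify that the exports are visible before wiring them into PVs (this assumes the NFS client utilities, nfs-common or nfs-utils, are installed on the node):

# List the exports offered by the NFS server
showmount -e 192.168.100.155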

Creating PVs

Each MySQL pod claims one PV, so the number of pods must not exceed the total number of PVs. Note that PersistentVolumes are cluster-scoped objects and take no namespace.

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-datadir-1
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  nfs:
    path: /data/k8sdata/myserver/mysql-datadir-1 
    server: 192.168.100.155

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-datadir-2
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  nfs:
    path: /data/k8sdata/myserver/mysql-datadir-2
    server: 192.168.100.155

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-datadir-3
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  nfs:
    path: /data/k8sdata/myserver/mysql-datadir-3
    server: 192.168.100.155

Check the PV status:

sudo kubectl get pv 
NAME              CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
mysql-datadir-1   50Gi       RWO            Retain           Available                                   13s
mysql-datadir-2   50Gi       RWO            Retain           Available                                   13s
mysql-datadir-3   50Gi       RWO            Retain           Available                                   13s
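One caveat before moving on: neither these PVs nor the volumeClaimTemplates used later set a storageClassName. If the cluster defines a default StorageClass, PVCs without an explicit class are assigned to it by the DefaultStorageClass admission plugin and will not bind these static PVs, so it is worth checking:

# A class marked "(default)" would capture PVCs that omit storageClassName
kubectl get storageclass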

MySQL: One Master, Multiple Slaves

Configuration Object

Create a ConfigMap holding the configuration for the master and for the slaves:

apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql
  namespace: myserver
  labels:
    app: mysql
data:
  master.cnf: |
    # Apply this config only on the master.
    [mysqld]
    log-bin
    log_bin_trust_function_creators=1
    lower_case_table_names=1
  slave.cnf: |
    # Apply this config only on slaves.
    [mysqld]
    super-read-only
    log_bin_trust_function_creators=1

Apply it:

kubectl apply -f mysql-configmap.yaml
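A quick sanity check that both files landed in the ConfigMap:

# master.cnf and slave.cnf should both appear under Data
kubectl describe configmap mysql -n myserver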

Headless Service

A Headless Service differs from a normal Service in that it has no Cluster IP; resolving its name returns the endpoints of all pods behind it, which gives each StatefulSet pod a stable DNS name.
mysql-read is a normal Service that load-balances read connections across the MySQL pods; writes must always go to the master, mysql-0.mysql. Together the two services implement read-write splitting.

# Headless service for stable DNS entries of StatefulSet members.
apiVersion: v1
kind: Service
metadata:
  namespace: myserver
  name: mysql
  labels:
    app: mysql
spec:
  ports:
  - name: mysql
    port: 3306
  clusterIP: None
  selector:
    app: mysql
    
---
# Client service for connecting to any MySQL instance for reads.
# For writes, you must instead connect to the master: mysql-0.mysql.
apiVersion: v1
kind: Service
metadata:
  name: mysql-read
  namespace: myserver
  labels:
    app: mysql
spec:
  ports:
  - name: mysql
    port: 3306
  selector:
    app: mysql

Apply it:

kubectl apply -f mysql-services.yaml
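Once the pods are up, the difference between the two services is easy to observe from inside the cluster. A minimal sketch using a throwaway pod (busybox:1.28 is an assumption; any image with nslookup works):

# The headless service resolves to the IP of every MySQL pod
kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -n myserver -- nslookup mysql

# mysql-read resolves to a single ClusterIP that load-balances reads
kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -n myserver -- nslookup mysql-read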

MySQL StatefulSet

Create the one-master-multiple-slave MySQL cluster. Each pod runs 4 containers (the stable DNS names they rely on are sketched below):
Init container 1 (init-mysql): uses the ordinal in the pod name to mark mysql-0 as the master and the others as slaves, then copies in the matching configuration file.
Init container 2 (clone-mysql): mysql-0 does nothing; mysql-1 performs a full data clone from mysql-0, mysql-2 clones from mysql-1, and so on.
Main container mysql: the database server itself; every instance can technically read and write, and read-write splitting relies on the mysql and mysql-read services.
Main container xtrabackup: keeps replication running (after the initial clone, every slave replicates from mysql-0) and listens on port 3307 to serve a full copy to the next pod.
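As referenced above, the StatefulSet gives every pod a stable, ordinal-based DNS identity through the headless service. Assuming the default cluster domain cluster.local, the names look like this:

# <pod-name>.<headless-service>.<namespace>.svc.<cluster-domain>
mysql-0.mysql.myserver.svc.cluster.local   # master; all writes go here
mysql-1.mysql.myserver.svc.cluster.local   # slave
mysql-2.mysql.myserver.svc.cluster.local   # slave
# Within the same namespace the short form mysql-0.mysql is enough,
# which is exactly what the replication setup below uses as MASTER_HOST.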

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
  namespace: myserver
spec:
  selector:
    matchLabels:
      app: mysql
  serviceName: mysql
  replicas: 3
  template:
    metadata:
      labels:
        app: mysql
    spec:
      initContainers:
      - name: init-mysql # init container 1: decide master/slave from the pod name ordinal and generate the matching config file
        image: easzlab.io.local:5000/myhub/mysql:5.7.36
        command:
        - bash
        - "-c"
        - |
          set -ex
          # Generate mysql server-id from pod ordinal index.
          [[ `hostname` =~ -([0-9]+)$ ]] || exit 1 # extract the trailing ordinal from the hostname
          ordinal=${BASH_REMATCH[1]} 
          echo [mysqld] > /mnt/conf.d/server-id.cnf
          # Add an offset to avoid reserved server-id=0 value.
          echo server-id=$((100 + $ordinal)) >> /mnt/conf.d/server-id.cnf
          # Copy appropriate conf.d files from config-map to emptyDir.
          if [[ $ordinal -eq 0 ]]; then # ordinal 0 is the master: copy the master config
            cp /mnt/config-map/master.cnf /mnt/conf.d/
          else # otherwise copy the slave config
            cp /mnt/config-map/slave.cnf /mnt/conf.d/
          fi
        volumeMounts:
        - name: conf # emptyDir shared with the main containers
          mountPath: /mnt/conf.d
        - name: config-map
          mountPath: /mnt/config-map
      - name: clone-mysql # init container 2: initial full data clone from the previous pod (slave N clones from pod N-1 instead of everyone cloning from the master; afterwards all slaves sync incrementally from the master)
        image: easzlab.io.local:5000/myhub/xtrabackup:1.0 
        command:
        - bash
        - "-c"
        - |
          set -ex
          # Skip the clone if data already exists.
          [[ -d /var/lib/mysql/mysql ]] && exit 0
          # Skip the clone on master (ordinal index 0).
          [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
          ordinal=${BASH_REMATCH[1]}
          [[ $ordinal -eq 0 ]] && exit 0 # the master (ordinal 0) skips the clone
          # Clone data from previous peer.
          ncat --recv-only mysql-$(($ordinal-1)).mysql 3307 | xbstream -x -C /var/lib/mysql # stream the backup from the previous pod; xbstream unpacks it
          # Prepare the backup.
          xtrabackup --prepare --target-dir=/var/lib/mysql # apply the logs so the cloned data files are consistent
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
      containers:
      - name: mysql # main container 1: the MySQL server
        image: easzlab.io.local:5000/myhub/mysql:5.7.36
        env:
        - name: MYSQL_ALLOW_EMPTY_PASSWORD
          value: "1"
        ports:
        - name: mysql
          containerPort: 3306
        volumeMounts:
        - name: data # data directory mounted at /var/lib/mysql
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf # config files under /etc/mysql/conf.d
          mountPath: /etc/mysql/conf.d
        resources: # resource requests (no limits set)
          requests:
            cpu: 500m
            memory: 1Gi
        livenessProbe: # liveness probe: restart the container if mysqladmin ping fails
          exec:
            command: ["mysqladmin", "ping"]
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
        readinessProbe: # readiness probe: only route service traffic once queries succeed
          exec:
            # Check we can execute queries over TCP (skip-networking is off).
            command: ["mysql", "-h", "127.0.0.1", "-e", "SELECT 1"]
          initialDelaySeconds: 5
          periodSeconds: 2
          timeoutSeconds: 1
      - name: xtrabackup # main container 2: keeps replicating the master's binlog and serves backups to peers
        image: easzlab.io.local:5000/myhub/xtrabackup:1.0 
        ports:
        - name: xtrabackup
          containerPort: 3307
        command:
        - bash
        - "-c"
        - |
          set -ex
          cd /var/lib/mysql
          # Determine binlog position of cloned data, if any.
          if [[ -f xtrabackup_slave_info ]]; then
            # XtraBackup already generated a partial "CHANGE MASTER TO" query
            # because we're cloning from an existing slave.
            mv xtrabackup_slave_info change_master_to.sql.in
            # Ignore xtrabackup_binlog_info in this case (it's useless).
            rm -f xtrabackup_binlog_info
          elif [[ -f xtrabackup_binlog_info ]]; then
            # We're cloning directly from master. Parse binlog position.
            [[ `cat xtrabackup_binlog_info` =~ ^(.*?)[[:space:]]+(.*?)$ ]] || exit 1
            rm xtrabackup_binlog_info
            echo "CHANGE MASTER TO MASTER_LOG_FILE='${BASH_REMATCH[1]}',\
                  MASTER_LOG_POS=${BASH_REMATCH[2]}" > change_master_to.sql.in # generate the CHANGE MASTER statement
          fi
          # Check if we need to complete a clone by starting replication.
          if [[ -f change_master_to.sql.in ]]; then
            echo "Waiting for mysqld to be ready (accepting connections)"
            until mysql -h 127.0.0.1 -e "SELECT 1"; do sleep 1; done
            echo "Initializing replication from clone position"
            # In case of container restart, attempt this at-most-once.
            mv change_master_to.sql.in change_master_to.sql.orig 
            # run CHANGE MASTER and START SLAVE
            mysql -h 127.0.0.1 <<EOF
          $(<change_master_to.sql.orig),
            MASTER_HOST='mysql-0.mysql',
            MASTER_USER='root',
            MASTER_PASSWORD='',
            MASTER_CONNECT_RETRY=10;
          START SLAVE;
          EOF
          fi
          # Start a server to send backups when requested by peers (listen on 3307 to serve the next pod's full clone).
          exec ncat --listen --keep-open --send-only --max-conns=1 3307 -c \
            "xtrabackup --backup --slave-info --stream=xbstream --host=127.0.0.1 --user=root"
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
      volumes:
      - name: conf
        emptyDir: {}
      - name: config-map
        configMap:
          name: mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
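Apply the manifest (the filename mysql-statefulset.yaml is an assumption, following the naming pattern used above):

kubectl apply -f mysql-statefulset.yaml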

Check the status of the pods:

sudo kubectl get pod -n myserver 
NAME      READY   STATUS    RESTARTS      AGE
mysql-0   2/2     Running   0             21m
mysql-1   2/2     Running   1 (20m ago)   21m
mysql-2   2/2     Running   0             18m
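The volumeClaimTemplates create one PVC per pod, named data-<statefulset-name>-<ordinal>; each should now be Bound to one of the static PVs created earlier:

# Expect data-mysql-0, data-mysql-1 and data-mysql-2, all in status Bound
kubectl get pvc -n myserver
kubectl get pv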

Exec into a slave pod and check the replication status:

kubectl exec -it mysql-1 -n myserver -- mysql
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: mysql-0.mysql
                  Master_User: root
                  Master_Port: 3306
                Connect_Retry: 10
              Master_Log_File: mysql-0-bin.000003
          Read_Master_Log_Pos: 316
               Relay_Log_File: mysql-1-relay-bin.000002
                Relay_Log_Pos: 484
        Relay_Master_Log_File: mysql-0-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
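To confirm replication end to end, write through the master and read back through mysql-read. A minimal smoke test (the database name replication_test is arbitrary):

# Write on the master (container "mysql" of pod mysql-0)
kubectl exec -it mysql-0 -c mysql -n myserver -- mysql -e "CREATE DATABASE replication_test;"

# Read through the load-balanced read service; the new database should appear
kubectl run mysql-client --image=easzlab.io.local:5000/myhub/mysql:5.7.36 -i --rm --restart=Never -n myserver -- mysql -h mysql-read -e "SHOW DATABASES;"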

Troubleshooting: NFS Directory Permissions

When the pods are first created they may end up in CrashLoopBackOff; the logs show Operation not permitted:

sudo kubectl logs mysql-0 -n myserver 
# a pod runs 4 containers; kubectl defaults to the mysql container
Defaulted container "mysql" out of: mysql, xtrabackup, init-mysql (init), clone-mysql (init)
# no permission to change the ownership of /var/lib/mysql/
chown: changing ownership of '/var/lib/mysql/': Operation not permitted

Cause: the NFS directories were exported with the wrong options.
Correct options: 192.168.100.0/24(rw,no_root_squash)
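The fix is to correct /etc/exports on the NFS server, re-export, and let the StatefulSet recreate the failed pod:

# On the NFS server: remove the space inside the parentheses, then re-export
exportfs -r
exportfs -v

# On k8s: delete the crashed pod; the StatefulSet controller recreates it
kubectl delete pod mysql-0 -n myserver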

The word squash literally means to crush: a remote user entering the NFS server is demoted to a lower-privileged identity.
The no_*_squash options preserve identity, so root stays root and tom stays tom; what is actually preserved is the uid and gid.

The squash options combine in four ways (a quick way to verify the behavior follows the list):

  1. (root_squash, no_all_squash): the default; root is squashed to nobody, ordinary users keep their identity.
  2. (no_root_squash, no_all_squash): root stays root on the server (uid=gid=0); ordinary users are unchanged as well.
  3. (root_squash, all_squash): both root and ordinary users are squashed to nobody; reduced privileges, relatively safer.
  4. (no_root_squash, all_squash): once all_squash is present, no_root_squash has no effect; the result is the same as 3.
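The effect of each combination can be verified from any NFS client by creating a file as root and checking the resulting owner (mounting on /mnt is just an example):

mount -t nfs 192.168.100.155:/data/k8sdata/myserver/mysql-datadir-1 /mnt
touch /mnt/squash-test
ls -l /mnt/squash-test   # no_root_squash: owned by root; root_squash: owned by nobody/nfsnobody
rm /mnt/squash-test
umount /mnt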

