Kubernetes Monitoring in Practice, Supplement 1: Deploying Metrics Server for kubectl top and HPA Support

Published: 2025-05-16

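As containerization and microservice architectures mature, systems grow steadily more complex, and a complete monitoring and resource-management stack becomes key to keeping them stable. Earlier articles in this series covered deploying Prometheus, Node Exporter, Grafana, and Alertmanager, and closed the alerting loop with a DingTalk webhook.

This supplementary article deploys Metrics Server, the Kubernetes-native component for collecting resource metrics. It underpins the kubectl top command and the Horizontal Pod Autoscaler (HPA), and so improves resource observability and makes usage-driven scaling possible.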

I. Introduction to Metrics Server

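Metrics Server is the resource-metrics aggregator maintained by the Kubernetes project. It collects CPU and memory usage for every node and Pod by calling the Summary API on each Kubelet, keeps the results in memory (nothing is persisted), and serves them through the metrics.k8s.io API, which clients query via the API Server.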
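The usage values served by the Metrics API arrive as Kubernetes quantity strings rather than plain numbers. The sketch below shows one way to convert them; the sample object and the helper names `parse_quantity` and `summarize` are made up for illustration, and only the most common suffixes are handled.

```python
from decimal import Decimal

# Multipliers for the most common Kubernetes quantity suffixes.
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
_DECIMAL = {
    "n": Decimal("1e-9"), "u": Decimal("1e-6"), "m": Decimal("1e-3"),
    "k": Decimal("1e3"), "M": Decimal("1e6"), "G": Decimal("1e9"),
}


def parse_quantity(text: str) -> Decimal:
    """Convert a quantity string such as '250m' or '128Mi' to a plain number."""
    # Check two-letter binary suffixes first so 'Mi' is not mistaken for 'M'.
    for suffix, factor in _BINARY.items():
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * factor
    if text and text[-1] in _DECIMAL:
        return Decimal(text[:-1]) * _DECIMAL[text[-1]]
    return Decimal(text)


def summarize(item: dict) -> dict:
    """Reduce one NodeMetrics-style item to millicores and MiB."""
    cpu_cores = parse_quantity(item["usage"]["cpu"])
    mem_bytes = parse_quantity(item["usage"]["memory"])
    return {
        "name": item["metadata"]["name"],
        "cpu_millicores": int(cpu_cores * 1000),
        "memory_mib": int(mem_bytes / 2**20),
    }


# Hypothetical sample shaped like one item of a NodeMetrics list.
sample_node_metrics = {
    "metadata": {"name": "node-1"},
    "usage": {"cpu": "156340000n", "memory": "2097152Ki"},
}

print(summarize(sample_node_metrics))
```

CPU is typically reported in nanocores (suffix `n`) and memory in kibibytes (suffix `Ki`), which is why both suffix families need handling.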

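Once Metrics Server is deployed, you can:

  • Use kubectl top to view current resource usage for nodes and Pods
  • Give the HPA (Horizontal Pod Autoscaler) the base metrics it needs to scale on resource usage
  • Show resource usage in some Kubernetes dashboards (such as Kubernetes Dashboard)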
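To make the HPA item concrete, here is a minimal sketch of the replica calculation described in the Kubernetes HPA documentation: the desired count is the current count scaled by the ratio of observed to target metric, rounded up, and no change is made when the ratio is within a tolerance of 1.0 (0.1 by default). The function name and the `min_replicas`/`max_replicas` defaults are illustrative assumptions, and the real controller applies further rules (stabilization windows, handling of unready Pods) that are omitted here.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified HPA rule: ceil(current_replicas * current_value / target_value)."""
    ratio = current_value / target_value
    # Close enough to the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the bounds configured on the HPA object.
    return max(min_replicas, min(max_replicas, desired))


# Average CPU utilization of 90% against a 60% target, with 4 replicas running.
print(desired_replicas(4, 90, 60))
# Utilization of 63% is within the default tolerance, so nothing changes.
print(desired_replicas(4, 63, 60))
```

The `current_value` fed into this calculation is exactly what Metrics Server supplies for CPU and memory targets, which is why the HPA cannot act on resource metrics without it.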

⚠️ Note that Metrics Server does not persist data and cannot be queried with PromQL. It suits use cases that need current values rather than historical data.


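II. Deploying Metrics Server

1. Create the RBAC resources (metrics-server-rbac.yaml)

Grant Metrics Server the permissions it needs, including read access to node and Pod metrics, and create the ServiceAccount together with its RoleBinding and ClusterRoleBindings.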

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - nodes/stats
  - namespaces
  - configmaps
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system

2. Create the Service (metrics-server-svc.yaml)

Expose the Metrics Server HTTPS port so that the Kubernetes API Server can reach the metrics service once it is registered.

apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server

3. Create the Deployment (metrics-server-deploy.yaml)

Deploy Metrics Server and set its key runtime options: startup arguments, the TLS port, probes, the ServiceAccount, and a temporary directory.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
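        # Skips verification of the Kubelet serving certificate; convenient for test clusters, not recommended for production.
        - --kubelet-insecure-tls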
        image: harbor.local/k8s/metrics-server:0.4.3
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 4443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          periodSeconds: 10
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir

4. Create the APIService (metrics-server-apiservice.yaml)

Register version v1beta1 of the metrics.k8s.io API group so that Kubernetes can serve the real-time metrics from Metrics Server through its standard API.

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100
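With this APIService registered, the API Server serves the metrics under the `/apis/metrics.k8s.io/v1beta1` prefix. The helpers below are a small sketch that builds the request paths for nodes and Pods; the function names are invented for illustration.

```python
API_PREFIX = "/apis/metrics.k8s.io/v1beta1"


def node_metrics_path(node=None):
    """Path for all nodes, or for a single node when a name is given."""
    return f"{API_PREFIX}/nodes" + (f"/{node}" if node else "")


def pod_metrics_path(namespace=None, pod=None):
    """Path for Pods cluster-wide, in one namespace, or for a single Pod."""
    if namespace is None:
        if pod is not None:
            raise ValueError("a Pod name requires a namespace")
        return f"{API_PREFIX}/pods"
    base = f"{API_PREFIX}/namespaces/{namespace}/pods"
    return base + (f"/{pod}" if pod else "")


print(node_metrics_path())
print(pod_metrics_path("kube-system"))
```

Any of these paths can be passed to `kubectl get --raw <path>` to inspect the raw JSON that kubectl top and the HPA consume.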

5. Apply all resources

kubectl apply -f metrics-server-rbac.yaml
kubectl apply -f metrics-server-svc.yaml
kubectl apply -f metrics-server-deploy.yaml
kubectl apply -f metrics-server-apiservice.yaml
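After applying the manifests, a quick way to confirm the rollout is to check whether the APIService reports the `Available` condition. The sketch below inspects an APIService object that has already been loaded as a dictionary; the function name and both sample objects are hypothetical.

```python
def apiservice_available(apiservice: dict):
    """Return (available, detail) based on the object's 'Available' condition."""
    for cond in apiservice.get("status", {}).get("conditions", []):
        if cond.get("type") == "Available":
            ok = cond.get("status") == "True"
            detail = cond.get("reason", "") or cond.get("message", "")
            return ok, detail
    return False, "no Available condition reported yet"


# Hypothetical objects: one healthy, one whose backing Service has no endpoints.
healthy = {"status": {"conditions": [
    {"type": "Available", "status": "True", "reason": "Passed"}]}}
broken = {"status": {"conditions": [
    {"type": "Available", "status": "False", "reason": "MissingEndpoints"}]}}

print(apiservice_available(healthy))
print(apiservice_available(broken))
```

In practice the input would come from `kubectl get apiservice v1beta1.metrics.k8s.io -o json`. Once the condition is true, `kubectl top nodes` and `kubectl top pods -A` should start returning data after a short delay.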

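III. Configuring Prometheus to Scrape Resource Metrics

⚠️ Note: Prometheus is not meant to scrape Metrics Server directly. It can instead collect node and container resource usage from the Kubelet's cAdvisor endpoint, reached here through the API Server proxy.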

    - job_name: 'kubernetes-node-cadvisor'
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
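The three relabeling rules above can be hard to read at a glance. The following sketch mimics what they do to one discovered node target; it is an illustration only, not how Prometheus implements relabeling, and the function name and sample labels are made up.

```python
import re


def relabel_node_target(discovered: dict) -> dict:
    """Apply the three relabel rules of the job to one discovered node."""
    labels = {}
    # Rule 1 (labelmap): copy each node label, dropping the discovery prefix.
    pattern = re.compile(r"__meta_kubernetes_node_label_(.+)")
    for name, value in discovered.items():
        match = pattern.fullmatch(name)
        if match:
            labels[match.group(1)] = value
    # Rule 2: send every scrape to the API Server instead of the node itself.
    labels["__address__"] = "kubernetes.default.svc:443"
    # Rule 3: build the per-node proxy path to the Kubelet's cAdvisor endpoint.
    node = re.fullmatch(r"(.+)", discovered.get("__meta_kubernetes_node_name", ""))
    if node:
        labels["__metrics_path__"] = (
            f"/api/v1/nodes/{node.group(1)}/proxy/metrics/cadvisor"
        )
    return labels


# Hypothetical labels as service discovery might produce them for one node.
discovered = {
    "__address__": "10.0.0.11:10250",
    "__meta_kubernetes_node_name": "node-1",
    "__meta_kubernetes_node_label_kubernetes_io_os": "linux",
}
print(relabel_node_target(discovered))
```

Routing through the API Server proxy means Prometheus only needs network access and credentials for the API Server, at the cost of sending all scrape traffic through it.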

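Summary

🚀 This article covered the deployment of Metrics Server, a key component of Kubernetes-native monitoring. With it in place, kubectl top works and the HPA has the resource metrics it needs for autoscaling.
✅ The next supplementary article continues to round out the monitoring stack by deploying kube-state-metrics, which exposes the state of Kubernetes objects (such as Deployments, Pods, and Nodes) as metrics and gives Prometheus structured data about cluster state.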

