How to Configure the Horizontal Pod Autoscaler (HPA)

Horizontal Pod Autoscaler (HPA) Configuration Guide

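I. HPA Core Principles

HPA adjusts the number of Pods dynamically, based on the following:

Metrics: scaling decisions are driven by resource metrics (CPU, memory) or custom business metrics.
Threshold rule: the target value that triggers scaling (for example, average CPU utilization above 70%).
Time window: the controller re-evaluates the metrics periodically (every 15 seconds by default) and, by default, applies a 5-minute stabilization window before scaling down.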
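The threshold rule above can be sketched in a few lines. The Kubernetes documentation gives the core formula as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), with a default tolerance of 10% around the target. The function name `desired_replicas` and its parameters are illustrative and not part of any Kubernetes API, and the sketch ignores details such as unready Pods and missing metrics.

```python
import math

def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation for a single metric."""
    ratio = current_value / target_value
    # Inside the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # The result is always clamped to [min_replicas, max_replicas].
    return max(min_replicas, min(max_replicas, desired))

print(desired_replicas(3, 90, 70))  # → 4 (3 * 90 / 70 = 3.86, rounded up)
print(desired_replicas(3, 72, 70))  # → 3 (within the 10% tolerance)
```

Rounding up means the controller prefers slightly too many replicas over too few when the metric is above target.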

II. Four Ways to Configure HPA

Method 1: HPA Based on CPU Utilization (Recommended)

Step 1: Install Metrics Server

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Step 2: Create the HPA resource

apiVersion: autoscaling/v2  # stable since Kubernetes 1.23; v2beta2 was removed in 1.26
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70  # target average CPU utilization: 70% of requested CPU

Step 3: Apply the configuration

kubectl apply -f hpa.yaml

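Method 2: HPA Based on Custom Metrics

Step 1: Deploy a custom metrics adapter (using Prometheus as an example)

# Example: install Prometheus Adapter from the prometheus-community Helm chart.
# Point it at your Prometheus server through the chart values.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus-adapter prometheus-community/prometheus-adapter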

Step 2: Define an HPA that uses the custom metric

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: custom-metric-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
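  metrics:
  - type: Pods  # per-Pod custom metric served by the adapter
    pods:
      metric:
        name: my_custom_metric  # must be a metric name exposed by the adapter
      target:
        type: AverageValue
        averageValue: "100"  # target average of 100 per Pod; scales up above it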

Method 3: HPA Based on Memory Utilization

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: memory-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
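A `Utilization` target such as the 80% above is measured against the containers' resource requests, not against node capacity, so Pods without requests cannot be scaled this way. A minimal sketch of that calculation follows, assuming the controller divides total usage by total requests across the Pods that report metrics. The Pod names and MiB figures are hypothetical.

```python
def average_utilization(usage_by_pod, request_by_pod):
    """Return usage as an integer percentage of requested resources."""
    total_usage = sum(usage_by_pod.values())
    # Only Pods that actually reported usage are counted.
    total_request = sum(request_by_pod[pod] for pod in usage_by_pod)
    return total_usage * 100 // total_request

# Hypothetical memory figures in MiB for two replicas of my-app.
usage_mib = {"my-app-a": 450, "my-app-b": 350}
request_mib = {"my-app-a": 512, "my-app-b": 512}
print(average_utilization(usage_mib, request_mib))  # → 78
```

At 78% the workload sits just under the 80% target, so no scale-up would be triggered.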

Method 4: Combined Strategy Based on Multiple Metrics

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: multi-metric-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Pods
    pods:
      metric:
        name: request_count
      target:
        type: AverageValue
        averageValue: "500"
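When several metrics are listed, the Kubernetes documentation states that the controller computes a desired replica count for each metric and uses the largest. The sketch below illustrates that rule with hypothetical readings. It omits the tolerance band and the min/max clamp for brevity, and the function names are illustrative.

```python
import math

def replicas_for_metric(current_replicas, current_value, target_value):
    """Desired replicas for one metric: ceil(current * value / target)."""
    return math.ceil(current_replicas * current_value / target_value)

def desired_for_all(current_replicas, metrics):
    """metrics: list of (current_value, target_value) pairs, one per metric."""
    return max(replicas_for_metric(current_replicas, current, target)
               for current, target in metrics)

# Hypothetical readings: CPU at 60% against a 70% target,
# request_count at 800 per Pod against a target of 500.
print(desired_for_all(4, [(60, 70), (800, 500)]))  # → 7
```

Here CPU alone would keep 4 replicas, but the request count asks for 7, so 7 wins.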

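III. Metric Types Supported by HPA

Metric type	Description
Resource	System resources (CPU, memory) of the target Pods; requires Metrics Server
Pods	Custom per-Pod business metrics (such as request count or error rate), averaged across Pods; requires an adapter such as Prometheus Adapter
Object	A metric describing a single Kubernetes object (such as requests per second on an Ingress); requires a custom metrics adapter
External	A metric from outside the cluster (such as the length of a cloud message queue); requires an external metrics adapter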

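IV. Key Parameters

Field	Purpose
`minReplicas`	Minimum number of replicas (HPA never scales below this value)
`maxReplicas`	Maximum number of replicas (HPA never scales above this value)
`scaleTargetRef`	The resource to scale (for example a Deployment or StatefulSet); it must expose the scale subresource
`averageUtilization`	Target resource utilization, as a percentage of the Pods' resource requests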

V. Verifying That HPA Works

Check the HPA status

kubectl get hpa -n default
# Example output (exact columns vary with the kubectl version):
# NAME     REFERENCE           TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
# my-hpa   Deployment/my-app   75%/70%   1         10        3          5m
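For scripted checks, the same information is available in structured form with `kubectl get hpa my-hpa -n default -o json`. The sketch below summarizes such an object. The `summarize_hpa` helper is illustrative, and the sample JSON is hand-written to mirror the status fields of an `autoscaling/v2` HPA rather than captured from a real cluster.

```python
import json

def summarize_hpa(hpa):
    """Return a one-line summary of an autoscaling/v2 HPA object (as a dict)."""
    status = hpa.get("status", {})
    parts = [
        hpa["metadata"]["name"],
        f"current={status.get('currentReplicas')}",
        f"desired={status.get('desiredReplicas')}",
    ]
    # Append the observed utilization of each resource metric, if reported.
    for metric in status.get("currentMetrics") or []:
        if metric.get("type") == "Resource":
            res = metric["resource"]
            parts.append(f"{res['name']}={res['current'].get('averageUtilization')}%")
    return " ".join(parts)

# Hand-written sample shaped like the JSON output of kubectl.
sample = json.loads("""
{
  "metadata": {"name": "my-hpa"},
  "status": {
    "currentReplicas": 1,
    "desiredReplicas": 3,
    "currentMetrics": [
      {"type": "Resource",
       "resource": {"name": "cpu", "current": {"averageUtilization": 75}}}
    ]
  }
}
""")
print(summarize_hpa(sample))  # → my-hpa current=1 desired=3 cpu=75%
```

A gap between the current and desired counts, as in this sample, means a scaling action is pending or in progress.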

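Simulate the trigger condition

# Generate CPU load from a temporary busybox Pod (busybox ships wget, not curl)
kubectl run -i --tty loadgen --rm --image=busybox --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O- http://my-app:8080; done"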

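VI. Advanced Configuration

1. Stepwise Scaling (rate limits via scaling policies)

# Manifest fragment (goes under spec)
behavior:
  scaleUp:
    policies:
    - type: Pods
      value: 2            # add at most 2 Pods
      periodSeconds: 30   # per 30-second period
    - type: Pods
      value: 5            # or at most 5 Pods per 60 seconds
      periodSeconds: 60
  scaleDown:
    policies:
    - type: Pods
      value: 3            # remove at most 3 Pods per 30 seconds
      periodSeconds: 30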
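In the `autoscaling/v2` API, the `behavior` section limits how fast the replica count may change through `policies` entries of type `Pods` or `Percent`, each with a `value` and a `periodSeconds`. When several policies apply, `selectPolicy` (default `Max`) picks the one that allows the largest change. The following simplified sketch computes the scale-up ceiling for a single period. It assumes no scaling has happened earlier in that period, and the function name is illustrative.

```python
import math

def scale_up_limit(current_replicas, policies, select_policy="Max"):
    """Highest replica count the policies allow within one period.

    policies: list of dicts with 'type' ('Pods' or 'Percent') and 'value'.
    """
    limits = []
    for policy in policies:
        if policy["type"] == "Pods":
            # Absolute cap: add at most `value` Pods.
            limits.append(current_replicas + policy["value"])
        elif policy["type"] == "Percent":
            # Relative cap: grow by at most `value` percent, rounded up.
            limits.append(math.ceil(current_replicas * (100 + policy["value"]) / 100))
        else:
            raise ValueError(f"unknown policy type: {policy['type']}")
    return max(limits) if select_policy == "Max" else min(limits)

policies = [{"type": "Pods", "value": 2}, {"type": "Percent", "value": 100}]
print(scale_up_limit(4, policies))         # → 8 (the Percent policy allows more)
print(scale_up_limit(4, policies, "Min"))  # → 6 (the Pods policy is stricter)
```

Choosing `Min` makes scale-up more conservative, while the default `Max` favors reacting quickly to load.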

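2. Scaling on queue length (for message queues)

# External metric example (requires an external metrics adapter); goes under spec.metrics
- type: External
  external:
    metric:
      name: queue_length
    target:
      type: AverageValue
      averageValue: "100"  # target of 100 queued messages per Pod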

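VII. Troubleshooting

Symptom	Solution
HPA does not react	1. Check that Metrics Server is running (`kubectl top pods` should return data) 2. Confirm that the metrics API is registered and available (`kubectl get apiservice v1beta1.metrics.k8s.io`)
Scaling does not take effect	1. Verify that the metric has actually crossed the target (`kubectl describe hpa my-hpa`) 2. Check that the target Pods' containers declare resource requests, which `Utilization` targets require
Frequent flapping	Lengthen `behavior.scaleDown.stabilizationWindowSeconds` (300 seconds by default); changes within the controller's default 10% tolerance are already ignored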
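Flapping is usually damped by the scale-down stabilization window: according to the Kubernetes documentation, the controller considers the recommendations computed during the window and acts on the highest one, so a brief dip in load does not remove Pods immediately. A minimal sketch of that rule follows. The timestamps and replica counts are hypothetical, and the function name is illustrative.

```python
def stabilized_recommendation(history, now, window_seconds=300):
    """Pick the highest recommendation made within the window.

    history: list of (timestamp_seconds, recommended_replicas),
    including the most recent recommendation.
    """
    recent = [replicas for ts, replicas in history if now - ts <= window_seconds]
    if not recent:
        raise ValueError("no recommendation inside the window")
    return max(recent)

# Hypothetical recommendations recorded at t = 0, 100, 200 and 400 seconds.
history = [(0, 8), (100, 5), (200, 6), (400, 3)]
print(stabilized_recommendation(history, now=400))                     # → 6
print(stabilized_recommendation(history, now=400, window_seconds=60))  # → 3
```

With the default 300-second window the workload stays at 6 replicas even though the latest recommendation is 3, while a 60-second window follows the drop almost at once.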

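Summary

A well-configured HPA provides:

• Cost optimization: fewer Pods under low load, so fewer resources are consumed.

• Elastic scaling: automatic scale-out under high load to keep the service stable.

• Business alignment: precise scaling driven by business metrics such as QPS.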
