Kubernetes Autoscaling: HPA Based on Prometheus QPS Metrics
Scaling on Prometheus custom metrics
The resource metrics pipeline only covers CPU and memory, which is usually enough. But if you want to drive HPA from custom metrics, such as request QPS or 5xx error counts, you need the custom metrics pipeline; the most mature implementation today is Prometheus Custom Metrics. The custom metrics are served by Prometheus and aggregated into the apiserver by k8s-prometheus-adapter, giving the same effect as the core metrics pipeline (metrics-server).

1. Deploy Prometheus
Prometheus is a monitoring system originally built at SoundCloud. Open-sourced to the community in 2012, it has a very active developer and user base. To underline its open-source, independently maintained status, Prometheus joined the Cloud Native Computing Foundation (CNCF) in 2016 as the second hosted project after Kubernetes.
Prometheus features:
- Automatic collection with service discovery;
- A multi-dimensional data model: time series identified by a metric name and key/value labels (see the sample after this list);
- PromQL: a flexible query language that exploits the multi-dimensional data for complex queries;
- No reliance on distributed storage; a single server node is self-contained;
- Time series collected over HTTP using a pull model;
- Pushing time series supported via the Pushgateway component;
- Targets discovered through service discovery or static configuration;
- Multiple graphing modes and dashboard support (Grafana);
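As a concrete illustration of the data model, a sample in the Prometheus text exposition format is just a metric name, a set of key/value labels, and a value:

http_requests_total{method="GET", code="200"} 1027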
Prometheus components and architecture:

Prometheus Server: collects metrics, stores the time series data, and provides the query interface
Client Library: client instrumentation libraries
Push Gateway: short-term metric storage, mainly for ephemeral jobs
Exporters: collect metrics from existing third-party services and expose them for scraping
Alertmanager: alerting
Web UI: a simple web console
Deployment:
First, install NFS on a node (this node will act as the NFS server):
[root@k8s-node1 ~]# yum install -y nfs-utils
NFS configuration and usage
On the server side we create a shared directory, /opt/sharedata, as the remote mount point for clients, then set its permissions.
$ mkdir -p /opt/sharedata/
$ chmod 777 /opt/sharedata/    # a directory needs the execute bit to be traversable
Then edit the NFS configuration file, /etc/exports:
[root@k8s-node1 ~]# cat /etc/exports
/opt/sharedata 192.168.171.0/24(rw,sync,insecure,no_subtree_check,no_root_squash)
A note on this configuration: each option after the export path has its own meaning (rw read-write, sync synchronous writes, no_root_squash do not map root to nobody, and so on). Here, /opt/sharedata is exported to clients whose IP falls within the 192.168.171.0/24 range. If a client outside that range also needs to mount it, you can widen the range or use * to allow all clients; for example, /home *(ro,sync,insecure,no_root_squash) exports /home read-only to everyone.
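After editing /etc/exports, the export table can be re-read without restarting the service:

$ exportfs -ra    # re-export everything listed in /etc/exports
$ exportfs -v     # list the active exports and their options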
# Start the NFS service
$ service nfs start
# or, equivalently:
$ /bin/systemctl start nfs.service
[root@k8s-node1 ~]# showmount -e localhost
Export list for localhost:
/opt/sharedata 192.168.171.0/24
Example:
Mount the remote directory onto the local /share directory (create the mount point first):
$ mkdir -p /share
$ mount 192.168.171.11:/opt/sharedata /share
$ df -h | grep 192.168.171.11
Filesystem Size Used Avail Use% Mounted on
192.168.171.11:/opt/sharedata 27G 11G 17G 40% /share
To unmount the NFS share on the client, run:
$ umount /share
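To make the mount survive reboots, an /etc/fstab entry along these lines can be added (a sketch; tune the options to your environment):

192.168.171.11:/opt/sharedata  /share  nfs  defaults,_netdev  0  0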
Now, on the master, deploy the NFS client provisioner:
Link: https://pan.baidu.com/s/1b4Fu8j4Flf2Lzd0naT_iRg  extraction code: 7l3z
Import nfs-client.zip from the shared package.
# cd nfs-client
[root@k8s-master1 nfs-client]# cat deployment.yaml
...omitted
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          image: quay.io/external_storage/nfs-client-provisioner:latest
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: fuseim.pri/ifs
            - name: NFS_SERVER
              value: 192.168.171.12   ## NFS server address
            - name: NFS_PATH
              value: /opt/sharedata   ## exported directory
      volumes:
        - name: nfs-client-root
          nfs:
            server: 192.168.171.12
            path: /opt/sharedata
...omitted
[root@k8s-master1 nfs-client]# kubectl apply -f .
storageclass.storage.k8s.io/managed-nfs-storage created
serviceaccount/nfs-client-provisioner created
deployment.apps/nfs-client-provisioner created
serviceaccount/nfs-client-provisioner unchanged
clusterrole.rbac.authorization.k8s.io/nfs-client-provisioner-runner created
clusterrolebinding.rbac.authorization.k8s.io/run-nfs-client-provisioner created
role.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner created
rolebinding.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner created
[root@k8s-master1 nfs-client]# kubectl get po
NAME READY STATUS RESTARTS AGE
nfs-client-provisioner-9c784f97-cqzhb 1/1 Running 0 2m16s
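Before moving on, it is worth confirming that dynamic provisioning actually works. A minimal sketch: create a throwaway PVC against the managed-nfs-storage class created above (test-claim is a hypothetical name) and check that it binds:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-claim            # hypothetical name; delete after the test
spec:
  storageClassName: managed-nfs-storage
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
EOF

kubectl get pvc test-claim    # STATUS should turn Bound once the provisioner creates a PV
kubectl delete pvc test-claim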
From the same shared package (link: https://pan.baidu.com/s/1b4Fu8j4Flf2Lzd0naT_iRg  extraction code: 7l3z), import the prometheus manifests:
# cd prometheus
# kubectl apply -f .
[root@k8s-master1 nfs-client]# kubectl get po -o wide -n kube-system
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-6d8cfdd59d-pbbbc 1/1 Running 2 2d1h 10.244.2.15 k8s-node2 <none> <none>
kube-flannel-ds-amd64-q8g25 1/1 Running 3 2d1h 192.168.171.13 k8s-node2 <none> <none>
metrics-server-7dbbcf4c7-v5zpm 1/1 Running 3 47h 10.244.2.14 k8s-node2 <none> <none>
prometheus-0 2/2 Running 0 6m48s 10.244.3.15 k8s-node3 <none> <none>
[root@k8s-master1 nfs-client]# kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.0.0.2 <none> 53/UDP,53/TCP 2d1h
metrics-server ClusterIP 10.0.0.5 <none> 443/TCP 47h
prometheus NodePort 10.0.0.147 <none> 9090:30090/TCP 7m46s
Access the Prometheus UI at http://NodeIP:30090.
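Besides the UI, the Prometheus HTTP API is handy for a quick sanity check that targets are being scraped (NodeIP is a placeholder for any node's address):

curl 'http://NodeIP:30090/api/v1/query?query=up'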

2. Deploy the Custom Metrics Adapter
Metrics collected by Prometheus cannot be consumed by Kubernetes directly, because the two data formats are incompatible. Another component, k8s-prometheus-adapter, converts Prometheus metrics into a format the Kubernetes API can serve; and since this is a custom API, it also has to be registered with the main APIServer through the Kubernetes aggregation layer, so it can be reached directly under /apis/.
https://github.com/DirectXMan12/k8s-prometheus-adapter
The Prometheus adapter has a stable Helm chart, which we use directly.
First, prepare the Helm environment:
wget https://get.helm.sh/helm-v3.0.0-linux-amd64.tar.gz
tar zxvf helm-v3.0.0-linux-amd64.tar.gz
mv linux-amd64/helm /usr/bin/
helm repo add stable http://mirror.azure.cn/kubernetes/charts
helm repo update
helm repo list
Deploy prometheus-adapter, pointing it at the Prometheus service:
# helm install prometheus-adapter stable/prometheus-adapter --namespace kube-system --set prometheus.url=http://prometheus.kube-system,prometheus.port=9090
# helm list -n kube-system
# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
prometheus-adapter-77b7b4dd8b-ktsvx 1/1 Running 0 9m
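If the adapter Pod is Running but registration fails later on, its logs usually reveal whether it can reach the Prometheus URL given above:

# kubectl logs -n kube-system deploy/prometheus-adapter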
Make sure the adapter has registered with the APIServer:
[root@k8s-master1 ~]# kubectl get apiservices |grep custom
v1beta1.custom.metrics.k8s.io kube-system/prometheus-adapter True 87s
[root@k8s-master1 ~]# kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"
{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"custom.metrics.k8s.io/v1beta1","resources":[{"name":"namespaces/kubelet_volume_stats_inodes_used","singularName":"","namespaced":false,"kind":"MetricValueList","verbs":["get"]},{"name":"namespaces/kubelet_volume_stats_used_bytes","singularName":"","namespaced":false,"kind":"MetricValueList","verbs":["get"]},{"name":"persistentvolumeclaims/kubelet_volume_stats_capacity_bytes","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"namespaces/kubelet_volume_stats_inodes_free","singularName":"","namespaced":false,"kind":"MetricValueList","verbs":["get"]},{"name":"jobs.batch/kubelet_volume_stats_inodes_free","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"persistentvolumeclaims/kubelet_volume_stats_inodes_used","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"jobs.batch/kubelet_volume_stats_used_bytes","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"namespaces/kubelet_container_log_filesystem_used_bytes","singularName":"","namespaced":false,"kind":"MetricValueList","verbs":["get"]},{"name":"namespaces/kubelet_volume_stats_inodes","singularName":"","namespaced":false,"kind":"MetricValueList","verbs":["get"]},{"name":"persistentvolumeclaims/kubelet_volume_stats_inodes","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"persistentvolumeclaims/kubelet_volume_stats_available_bytes","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"jobs.batch/kubelet_volume_stats_capacity_bytes","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"namespaces/kubelet_volume_stats_capacity_bytes","singularName":"","namespaced":false,"kind":"MetricValueList","verbs":["get"]},{"name":"persistentvolumeclaims/kubelet_volume_stats_inodes_free","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"persistentvolumeclaims/kubelet_volume_stats_used_bytes","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"pods/kubelet_container_log_filesystem_used_bytes","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"jobs.batch/kubelet_container_log_filesystem_used_bytes","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"jobs.batch/kubelet_volume_stats_available_bytes","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"namespaces/kubelet_volume_stats_available_bytes","singularName":"","namespaced":false,"kind":"MetricValueList","verbs":["get"]},{"name":"jobs.batch/kubelet_volume_stats_inodes","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"jobs.batch/kubelet_volume_stats_inodes_used","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]}]}
3. Autoscaling on the QPS metric in practice
Deploy an application:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: metrics-app
  name: metrics-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: metrics-app
  template:
    metadata:
      labels:
        app: metrics-app
      annotations:
        prometheus.io/scrape: "true"   ## allow this Pod to be scraped
        prometheus.io/port: "80"       ## port to scrape
        prometheus.io/path: "/metrics" ## URL path to scrape
    spec:
      containers:
      - image: zhdya/metrics-app
        name: metrics-app
        ports:
        - name: web
          containerPort: 80
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 3
          periodSeconds: 5
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 3
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: metrics-app
  labels:
    app: metrics-app
spec:
  ports:
  - name: web
    port: 80
    targetPort: 80
  selector:
    app: metrics-app
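Save both manifests to a file (app.yaml is a hypothetical name) and apply them:

[root@k8s-master1 hpa]# kubectl apply -f app.yaml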
[root@k8s-master1 hpa]# kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
metrics-app-7674cfb699-5l72f 1/1 Running 0 19s 10.244.1.13 k8s-node1 <none> <none>
metrics-app-7674cfb699-btch5 0/1 Running 0 19s 10.244.2.16 k8s-node2 <none> <none>
metrics-app-7674cfb699-kksjr 0/1 Running 0 19s 10.244.0.15 k8s-master1 <none> <none>
[root@k8s-master1 hpa]# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 2d1h
metrics-app ClusterIP 10.0.0.163 <none> 80/TCP 39s
The metrics-app exposes a Prometheus metrics endpoint, which you can see by hitting the Service:
[root@k8s-master1 hpa]# curl 10.0.0.163/metrics
# HELP http_requests_total The amount of requests in total
# TYPE http_requests_total counter
http_requests_total 20
# HELP http_requests_per_second The amount of requests per second the latest ten seconds
# TYPE http_requests_per_second gauge
http_requests_per_second 0.5
## A quick load-balancing test while we are at it:
[root@k8s-master1 hpa]# curl 10.0.0.163
Hello! My name is metrics-app-7674cfb699-btch5. The last 10 seconds, the average QPS has been 0.5. Total requests served: 35
[root@k8s-master1 hpa]# curl 10.0.0.163
Hello! My name is metrics-app-7674cfb699-5l72f. The last 10 seconds, the average QPS has been 0.5. Total requests served: 38
[root@k8s-master1 hpa]# curl 10.0.0.163
Hello! My name is metrics-app-7674cfb699-kksjr. The last 10 seconds, the average QPS has been 0.5. Total requests served: 37
The collected per-container request counts:

Create the HPA policy:
# vi app-hpa-v2.yml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: metrics-app-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: metrics-app
  minReplicas: 1
  maxReplicas: 8
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 800m   # 800m = 0.8 requests/second per Pod
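Apply the policy:

[root@k8s-master1 hpa]# kubectl apply -f app-hpa-v2.yml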
[root@k8s-master1 hpa]# kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
metrics-app-hpa Deployment/metrics-app <unknown>/800m 1 8 3 36s
Here we use the metric the app exposes to Prometheus to exercise autoscaling on a custom metric (QPS).
4. Configure the adapter to collect the specific metric
Creating the HPA is not the end of the story: the adapter does not yet know which metric you want (http_requests_per_second), so the HPA cannot obtain it for the Pods.
Edit the prometheus-adapter ConfigMap in the kube-system namespace and add a new seriesQuery at the top of its rules: section:
# kubectl edit cm prometheus-adapter -n kube-system
apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app: prometheus-adapter
    chart: prometheus-adapter-v0.1.2
    heritage: Tiller
    release: prometheus-adapter
  name: prometheus-adapter
data:
  config.yaml: |
    rules:   ## add the following rule:
    - seriesQuery: 'http_requests_total{kubernetes_namespace!="",kubernetes_pod_name!=""}'   ## this series can be queried directly in Prometheus
      resources:
        overrides:
          kubernetes_namespace: {resource: "namespace"}
          kubernetes_pod_name: {resource: "pod"}
      name:
        matches: "^(.*)_total"
        as: "${1}_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
...
This rule takes http_requests_total, computes its per-second rate over a 2-minute window, and sums it across the service's Pods, exposing the result as http_requests_per_second.
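With the template parameters filled in for our Pods, the query the adapter issues roughly expands to the following PromQL (a sketch for the default namespace):

sum(rate(http_requests_total{kubernetes_namespace="default",kubernetes_pod_name=~"metrics-app-.*"}[2m])) by (kubernetes_pod_name)

Note that the adapter may take a while to notice ConfigMap changes; deleting its Pod forces a restart and an immediate reload (assuming the Deployment carries the same app=prometheus-adapter label as the ConfigMap above):

# kubectl delete pod -n kube-system -l app=prometheus-adapter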
Test the API:
[root@k8s-master1 hpa]# kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests_per_second"
{"kind":"MetricValueList","apiVersion":"custom.metrics.k8s.io/v1beta1","metadata":{"selfLink":"/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/http_requests_per_second"},"items":[{"describedObject":{"kind":"Pod","namespace":"default","name":"metrics-app-7674cfb699-5l72f","apiVersion":"/v1"},"metricName":"http_requests_per_second","timestamp":"2019-12-12T15:52:47Z","value":"416m"},{"describedObject":{"kind":"Pod","namespace":"default","name":"metrics-app-7674cfb699-btch5","apiVersion":"/v1"},"metricName":"http_requests_per_second","timestamp":"2019-12-12T15:52:47Z","value":"416m"},{"describedObject":{"kind":"Pod","namespace":"default","name":"metrics-app-7674cfb699-kksjr","apiVersion":"/v1"},"metricName":"http_requests_per_second","timestamp":"2019-12-12T15:52:47Z","value":"416m"}]}
[root@k8s-master1 hpa]# kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
metrics-app-hpa Deployment/metrics-app 416m/800m 1 8 2 20m
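The replica count follows the standard HPA formula: desiredReplicas = ceil(currentReplicas × currentMetricValue / targetValue). Here that is ceil(3 × 416m / 800m) = ceil(1.56) = 2, matching the REPLICAS column above.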
Load test:
ab -n 100000 -c 100 http://10.0.0.163/metrics
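If ab is not available, on CentOS it ships with the httpd-tools package:

# yum install -y httpd-tools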
Watch the Deployment scale out:
[root@k8s-master1 ~]# kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
metrics-app-7674cfb699-5l72f 1/1 Running 0 48m 10.244.1.13 k8s-node1 <none> <none>
metrics-app-7674cfb699-6rht6 1/1 Running 0 16s 10.244.0.16 k8s-master1 <none> <none>
metrics-app-7674cfb699-9ltvr 0/1 ContainerCreating 0 1s <none> k8s-master1 <none> <none>
metrics-app-7674cfb699-btch5 1/1 Running 0 48m 10.244.2.16 k8s-node2 <none> <none>
metrics-app-7674cfb699-kft7p 1/1 Running 0 16s 10.244.3.16 k8s-node3 <none> <none>
metrics-app-7674cfb699-plhrp 0/1 ContainerCreating 0 1s <none> k8s-node2 <none> <none>
metrics-app-7674cfb699-sgvln 0/1 ContainerCreating 0 1s <none> k8s-node1 <none> <none>
metrics-app-7674cfb699-wr56r 0/1 ContainerCreating 0 1s <none> k8s-node1 <none> <none>
nfs-client-provisioner-f9fdd5cc9-ffzbd 1/1 Running 0 8m7s 10.244.2.17 k8s-node2 <none> <none>
[root@k8s-master1 ~]# kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
metrics-app-7674cfb699-5l72f 1/1 Running 0 48m 10.244.1.13 k8s-node1 <none> <none>
metrics-app-7674cfb699-6rht6 1/1 Running 0 18s 10.244.0.16 k8s-master1 <none> <none>
metrics-app-7674cfb699-9ltvr 0/1 Running 0 3s 10.244.0.17 k8s-master1 <none> <none>
metrics-app-7674cfb699-btch5 1/1 Running 0 48m 10.244.2.16 k8s-node2 <none> <none>
metrics-app-7674cfb699-kft7p 1/1 Running 0 18s 10.244.3.16 k8s-node3 <none> <none>
metrics-app-7674cfb699-plhrp 0/1 Running 0 3s 10.244.2.18 k8s-node2 <none> <none>
metrics-app-7674cfb699-sgvln 0/1 Running 0 3s 10.244.1.16 k8s-node1 <none> <none>
metrics-app-7674cfb699-wr56r 0/1 Running 0 3s 10.244.1.17 k8s-node1 <none> <none>
nfs-client-provisioner-f9fdd5cc9-ffzbd 1/1 Running 0 8m9s 10.244.2.17 k8s-node2 <none> <none>
Check the HPA status:
[root@k8s-master1 ~]# kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
metrics-app-hpa Deployment/metrics-app 414345m/800m 1 8 8 21m
[root@k8s-master1 ~]# kubectl describe hpa metrics-app-hpa
...omitted
Metrics: ( current / target )
"http_requests_per_second" on pods: 818994m / 800m
Min replicas: 1
Max replicas: 8
Deployment pods: 8 current / 8 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True ReadyForNewScale recommended size matches current size
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from pods metric http_requests_per_second
ScalingLimited True TooManyReplicas the desired replica count is more than the maximum replica count
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedComputeMetricsReplicas 19m (x12 over 22m) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get pods metric value: unable to get metric http_requests_per_second: unable to fetch metrics from custom metrics API: the server could not find the metric http_requests_per_second for pods
Warning FailedGetPodsMetric 7m18s (x61 over 22m) horizontal-pod-autoscaler unable to get metric http_requests_per_second: unable to fetch metrics from custom metrics API: the server could not find the metric http_requests_per_second for pods
Normal SuccessfulRescale 88s horizontal-pod-autoscaler New size: 4; reason: pods metric http_requests_per_second above target
Summary

1. The application exposes monitoring metrics at /metrics in the Prometheus exposition format;
2. Prometheus scrapes each Pod's http_requests_total metric from /metrics;
3. Prometheus aggregates the collected series;
4. The custom metrics adapter periodically queries Prometheus and serves http_requests_per_second through the aggregated API on the APIServer;
5. The HPA periodically queries the APIServer to check the metric against its configured autoscaler rule;
6. If the rule is triggered, the HPA changes the Deployment's replica count to scale up or down.