Deploying the Prometheus Operator (via OLM)
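OLM
OLM, the Operator Lifecycle Manager, is a tool for managing the lifecycle of operators.
Installing OLM first requires installing the Operator SDK.
Installing the Operator SDK
The Operator SDK is installed on our own server, which must be able to connect to the k8s cluster through a kubeconfig.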

    # Detect the platform
    export ARCH=$(case $(uname -m) in x86_64) echo -n amd64 ;; aarch64) echo -n arm64 ;; *) echo -n $(uname -m) ;; esac)
    export OS=$(uname | awk '{print tolower($0)}')

    # Download the matching binary
    export OPERATOR_SDK_DL_URL=https://github.com/operator-framework/operator-sdk/releases/download/v1.31.0
    curl -LO ${OPERATOR_SDK_DL_URL}/operator-sdk_${OS}_${ARCH}

    # Verify the downloaded binary; on success the output should look like "operator-sdk_linux_amd64: OK"
    gpg --keyserver keyserver.ubuntu.com --recv-keys 052996E2A20B5C7E
    curl -LO ${OPERATOR_SDK_DL_URL}/checksums.txt
    curl -LO ${OPERATOR_SDK_DL_URL}/checksums.txt.asc
    gpg -u "Operator SDK (release) <cncf-operator-sdk@cncf.io>" --verify checksums.txt.asc
    grep operator-sdk_${OS}_${ARCH} checksums.txt | sha256sum -c -

    # Install it into the PATH
    chmod +x operator-sdk_${OS}_${ARCH} && sudo mv operator-sdk_${OS}_${ARCH} /usr/local/bin/operator-sdk

Installing OLM
Once the Operator SDK is installed, we can use it to install OLM into the k8s cluster.

    operator-sdk olm install

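To confirm the installation before moving on, `operator-sdk olm status` reports whether the OLM resources are present. As a complementary check, the sketch below counts pods that are neither `Running` nor `Completed`. The helper name `not_running` is made up for illustration, and the usage lines assume OLM went into its default `olm` namespace.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints how many pods are neither Running nor Completed.
# The columns are NAME READY STATUS RESTARTS AGE, so STATUS is field 3.
not_running() {
  awk '$3 != "Running" && $3 != "Completed" { n++ } END { print n + 0 }'
}

# Usage against a live cluster (assumes the default "olm" namespace):
#   operator-sdk olm status
#   kubectl get pods -n olm --no-headers | not_running
```

A result of `0` suggests every OLM pod has started; any other number is a cue to inspect the pods with `kubectl describe`.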
OperatorHub
OperatorHub hosts many of the operators we need, and makes it easy to look up the operator for a given application.
OperatorHub.io | The registry for Kubernetes Operators
Installing Prometheus
Installing the Prometheus Operator
With the prerequisites above in place, we can install the Prometheus Operator through OLM.
First, look up the Prometheus operator on OperatorHub.
The commands shown on that page are all that is needed to install the Prometheus Operator.

    # Install the Prometheus Operator
    kubectl create -f https://operatorhub.io/install/prometheus.yaml

    # List the installed operators
    kubectl get csv -n operators

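Installing Prometheus
The page describes how to install each of the related resources; we can consult the example YAML there to install the ones we need.
Prometheus CRD
RBAC configuration
Before installing Prometheus we first need to create the RBAC authorization it requires. The following file is sufficient: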

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: prometheus
    rules:
    - apiGroups: [""]
      resources:
      - nodes
      - nodes/metrics
      - services
      - endpoints
      - pods
      verbs: ["get", "list", "watch"]
    - apiGroups: [""]
      resources:
      - configmaps
      verbs: ["get"]
    - apiGroups:
      - networking.k8s.io
      resources:
      - ingresses
      verbs: ["get", "list", "watch"]
    - nonResourceURLs: ["/metrics"]
      verbs: ["get"]
    ---
    apiVersion: v1
    automountServiceAccountToken: false
    kind: ServiceAccount
    metadata:
      labels:
        app.kubernetes.io/component: controller
        app.kubernetes.io/name: prometheus-operator
        app.kubernetes.io/version: 0.68.0
      name: prometheus-k8s
      namespace: monitoring
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      labels:
        app.kubernetes.io/component: controller
        app.kubernetes.io/name: prometheus-operator
        app.kubernetes.io/version: 0.68.0
      name: prometheus
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: prometheus
    subjects:
    - kind: ServiceAccount
      name: prometheus-k8s
      namespace: monitoring

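The manifests above place the ServiceAccount in the `monitoring` namespace, which is not created by any step shown here. A minimal sketch of creating it first follows. The file names `monitoring-namespace.yaml` and `rbac.yaml` are assumptions chosen for illustration.

```shell
# Write a Namespace manifest (hypothetical file name); the namespace name
# matches the one referenced by the RBAC manifests above.
cat > monitoring-namespace.yaml <<'EOF'
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
EOF

# Then apply it, followed by the RBAC manifests (assumed saved as rbac.yaml):
#   kubectl apply -f monitoring-namespace.yaml
#   kubectl apply -f rbac.yaml
```

Applying the namespace before the RBAC file avoids a "namespace not found" error when the ServiceAccount is created.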
Prometheus configuration
Prometheus server

    apiVersion: monitoring.coreos.com/v1
    kind: Prometheus
    metadata:
      name: prometheus
      namespace: monitoring
    spec:
      replicas: 1
      serviceAccountName: prometheus-k8s
      serviceMonitorSelector: {}
      ruleSelector: {}
      podMonitorSelector: {}
      probeSelector: {}
      alerting:
        alertmanagers:
        - namespace: monitoring
          name: alertmanager-main
          port: web
      # Needed later for writing to VictoriaMetrics
      remoteWrite:
      - url: "http://victoria.cefso.online/api/v1/write"

Service manifest for the Prometheus server

    apiVersion: v1
    kind: Service
    metadata:
      name: prometheus
      namespace: monitoring
      labels:
        promethues-servicemonitor: 'true'
    spec:
      type: ClusterIP
      ports:
      - name: web
        port: 9090
        protocol: TCP
        targetPort: web
      selector:
        app.kubernetes.io/name: prometheus

Ingress manifest for the Prometheus server

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: prometheus
      namespace: monitoring
    spec:
      ingressClassName: traefik
      rules:
      - host: "prometheus.cefso.online"
        http:
          paths:
          - pathType: Prefix
            path: "/"
            backend:
              service:
                name: prometheus
                port:
                  name: web

Once all of these are installed, Prometheus is reachable.
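Monitoring configuration
Because we are using the Prometheus Operator, service discovery can be configured through CRDs. The relevant OperatorHub page also gives a brief introduction. The two that matter here are PodMonitor and ServiceMonitor.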
ServiceMonitor

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: prometheus-servicemonitor
      namespace: monitoring
    spec:
      selector:
        matchLabels:
          promethues-servicemonitor: "true"
      endpoints:
      - port: web
        interval: 30s

About ServiceMonitor

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: micrometer-demo
      namespace: default
    spec:
      endpoints:
      - interval: 15s
        path: /actuator/prometheus
        port: metrics  # Note: port takes the port name, not the port number.
      namespaceSelector:
        any: true
      selector:
        matchLabels:
          micrometer-prometheus-discovery: 'true'

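The parts of this YAML file mean the following:

- `name` and `namespace` under `metadata` specify the key metadata the ServiceMonitor needs.
- `endpoints` under `spec` lists the service endpoints, that is, the addresses from which Prometheus scrapes metrics. `endpoints` is an array, so several endpoints can be defined at once. Each endpoint in this example has three fields:
  - `interval`: how often Prometheus scrapes this endpoint, written as a duration. The example uses `15s`.
  - `path`: the path Prometheus scrapes. The example uses `/actuator/prometheus`.
  - `port`: the port to scrape through. The value is the `name` given to the port when the Service was created; the example uses `metrics`. Important: this is the port name, not the port number.
- `namespaceSelector` under `spec` defines the scope of Services to discover. It has two mutually exclusive fields:
  - `any`: accepts only the value `true`. When set, changes to every Service that matches the selector are watched, regardless of namespace.
  - `matchNames`: an array listing the namespaces to watch. For example, to watch only Services in the `default` and `arms-prom` namespaces, set it as follows:

        namespaceSelector:
          matchNames:
          - default
          - arms-prom

- `selector` under `spec` selects the Services. The Service used in this example carries the label `micrometer-prometheus-discovery: 'true'`, so `selector` is set as follows:

      selector:
        matchLabels:
          micrometer-prometheus-discovery: 'true'
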
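For reference, a Service that this ServiceMonitor would discover could look like the sketch below. Only the label and the port name `metrics` come from the example. The Service name, the `app: micrometer-demo` pod selector, port `8080`, and the file name are assumptions for illustration.

```shell
# Write a Service manifest that matches the ServiceMonitor example:
# it carries the label the selector looks for and names its port "metrics".
cat > micrometer-demo-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: micrometer-demo            # assumed name
  namespace: default
  labels:
    micrometer-prometheus-discovery: 'true'   # matched by spec.selector
spec:
  selector:
    app: micrometer-demo           # assumed pod label
  ports:
  - name: metrics                  # referenced by endpoints[].port
    port: 8080                     # assumed port number
    targetPort: 8080
EOF

# Apply it with:
#   kubectl apply -f micrometer-demo-service.yaml
```

If the port were left unnamed, the ServiceMonitor's `port: metrics` would have nothing to match and no target would appear in Prometheus.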
PodMonitor

    apiVersion: monitoring.coreos.com/v1
    kind: PodMonitor
    metadata:
      name: prometheus-podmonitor
      namespace: monitoring
    spec:
      selector:
        matchLabels:
          app.kubernetes.io/name: prometheus
      podMetricsEndpoints:
      - port: web
        interval: 30s