Using ClickHouse on Kubernetes
Introduction
ClickHouse is an open-source analytical database with excellent performance. This article describes how to deploy and use ClickHouse in a Kubernetes environment.
We use the open-source clickhouse-operator: https://github.com/Altinity/clickhouse-operator
Prerequisites:
Kubernetes 1.15+. We used Kubernetes 1.20 on three 8C16G nodes.
A storage CSI driver. We used NFS. NFS is used here for testing only; it is not recommended as database storage in production.
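If the cluster registers more than one StorageClass, the volume claim template in the cluster YAML below can pin one explicitly instead of relying on the default. A hedged fragment (the class name nfs-client is an assumption; use whatever name your NFS CSI provisioner registers):

```yaml
volumeClaimTemplates:
  - name: clickhouse-data-storage
    spec:
      storageClassName: nfs-client   # assumed name; check `kubectl get storageclass`
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 500Mi
```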
Installation and Deployment
1. Download the operator code
wget https://github.com/Altinity/clickhouse-operator/archive/refs/tags/0.18.4.tar.gz
After extracting the archive (tar -zxf 0.18.4.tar.gz), the main directories are:
deploy: manifests and scripts for installing the operator and ZooKeeper
docs: example configurations for ClickHouse clusters
cmd, pkg: operator source code. We will not look at the source code here.
2. Install the operator
Installation script:
deploy/operator/clickhouse-operator-install.sh
By default, the operator is installed into the kube-system namespace.
# kubectl get po -n kube-system | grep click
clickhouse-operator-994c5bb44-g9t9s   2/2     Running   2          24h
In addition, the script creates the related CRDs:
# kubectl get crd | grep click
clickhouseinstallations.clickhouse.altinity.com            2022-04-19T07:43:25Z
clickhouseinstallationtemplates.clickhouse.altinity.com    2022-04-19T07:43:25Z
clickhouseoperatorconfigurations.clickhouse.altinity.com   2022-04-19T07:43:25Z
CRDs:
clickhouseinstallations: describes one ClickHouse installation (the clusters and their configuration).
clickhouseinstallationtemplates: reusable templates that ClickHouseInstallation resources can reference.
clickhouseoperatorconfigurations: configuration of the operator itself (defaults, watched namespaces, and so on).
3. Install ZooKeeper
ClickHouse's distributed DDL (ON CLUSTER) and table replication depend on ZooKeeper. You can use an external ZooKeeper cluster, or one running inside the Kubernetes cluster.
Here we used the following script to create a 3-node ZooKeeper cluster:
deploy/zookeeper/quick-start-persistent-volume/zookeeper-3-nodes-create.sh
# kubectl get po -n zoo3ns
NAME          READY   STATUS    RESTARTS   AGE
zookeeper-0   1/1     Running   2          23h
zookeeper-1   1/1     Running   1          23h
zookeeper-2   1/1     Running   0          23h
# kubectl get svc -n zoo3ns
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
zookeeper    ClusterIP   10.104.149.244   <none>        2181/TCP,7000/TCP   23h
zookeepers   ClusterIP   None             <none>        2888/TCP,3888/TCP   23h
After the script completes, you can see the ZooKeeper pods and services.
In the ClickHouse configuration, this service can be used to set up the ZooKeeper connection.
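Rather than the ClusterIP, the service's DNS name can also be used; it stays stable even if the service is re-created. A sketch of the corresponding zookeeper section (the host follows the standard <service>.<namespace> Kubernetes DNS pattern for the zookeeper service in zoo3ns):

```yaml
spec:
  configuration:
    zookeeper:
      nodes:
        - host: zookeeper.zoo3ns   # instead of the ClusterIP 10.104.149.244
          port: 2181               # default ZooKeeper client port
```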
4. Install the ClickHouse cluster
The docs/chi-examples directory provides many example YAMLs for ClickHouse cluster configurations.
Here we create a multi-shard, multi-replica, multi-cluster ClickHouse environment and use it to walk through ClickHouse's core concepts and configuration options.
ClickHouse YAML file:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "ck-cluster-x"
spec:
  configuration:
    users:
      user2/networks/ip: "::/0"
      user2/password: qwerty
      user2/profile: default
    zookeeper:
      nodes:
        - host: 10.104.149.244
    clusters:
      - name: "ck-cluster-1"
        layout:
          shardsCount: 2
          replicasCount: 2
      - name: "ck-cluster-2"
        layout:
          shardsCount: 2
          replicasCount: 2
  defaults:
    templates:
      # Templates are specified as default for all clusters
      podTemplate: pod-template-resource-limit
      hostTemplate: host-template-custom-ports
  templates:
    hostTemplates:
      - name: host-template-custom-ports
        spec:
          tcpPort: 7000
          httpPort: 7001
          interserverHTTPPort: 7002
    podTemplates:
      - name: pod-template-resource-limit
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:22.3
              volumeMounts:
                - name: clickhouse-data-storage
                  mountPath: /var/lib/clickhouse
              # Container has explicitly specified resource limits
              resources:
                requests:
                  memory: "1024Mi"
                  cpu: "500m"
                limits:
                  memory: "1024Mi"
                  cpu: "500m"
    volumeClaimTemplates:
      - name: clickhouse-data-storage
        spec:
          accessModes:
            - ReadWriteOnce
          # VolumeClaim has explicitly specified resource limits
          resources:
            requests:
              storage: 500Mi
Key configuration:
configuration.zookeeper: ZooKeeper settings. Multiple nodes can be listed. Here we use the ZooKeeper service IP 10.104.149.244 with the default port (2181).
users: user accounts. Here a user2 account is created with password qwerty, allowed to connect from any address (::/0).
clusters: the clusters. A cluster in ClickHouse is a somewhat special concept: a named group of shards and replicas that distributed tables and ON CLUSTER DDL refer to. Several clusters can be defined in one installation.
name: cluster name
layout: shard and replica layout of the cluster
shardsCount: number of shards
replicasCount: number of replicas per shard
defaults:
templates: names of the default templates (pod, host, volume claim, etc.) applied to all clusters. The templates themselves are defined under templates.
podTemplate: sets the pod's image, resource limits, and other parameters.
hostTemplate: configures the ports ClickHouse listens on. The default ports are usually fine; here custom ports 7000/7001/7002 are used.
templates: defines the templates themselves:
hostTemplates
podTemplates. Set memory somewhat generously (1Gi here); otherwise the server may get OOM-killed.
volumeClaimTemplates
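The cluster names defined above can be used directly in distributed DDL. A hedged sketch (the database and table names are made up for illustration; the {shard} and {replica} macros are assumed to be populated per host by the operator, which it does by default):

```sql
-- Executed once, applied on every host of ck-cluster-1.
CREATE TABLE default.events ON CLUSTER 'ck-cluster-1'
(
    ts  DateTime,
    msg String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}')
ORDER BY ts;
```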
Create the ClickHouse installation:
kubectl apply -f 02.yaml
Once creation finishes, you can see the following.
The clickhouseinstallations resource:
2 clusters, 8 hosts, status Completed.
# kubectl get clickhouseinstallations
NAME           CLUSTERS   HOSTS   STATUS      AGE
ck-cluster-x   2          8       Completed   86m
Pods
A total of 8 pods are started: 2 clusters × 2 shards × 2 replicas.
# kubectl get pod
NAME                                  READY   STATUS    RESTARTS   AGE
chi-ck-cluster-x-ck-cluster-1-0-0-0   1/1     Running   0          81m
chi-ck-cluster-x-ck-cluster-1-0-1-0   1/1     Running   0          79m
chi-ck-cluster-x-ck-cluster-1-1-0-0   1/1     Running   0          78m
chi-ck-cluster-x-ck-cluster-1-1-1-0   1/1     Running   0          76m
chi-ck-cluster-x-ck-cluster-2-0-0-0   1/1     Running   0          75m
chi-ck-cluster-x-ck-cluster-2-0-1-0   1/1     Running   0          73m
chi-ck-cluster-x-ck-cluster-2-1-0-0   1/1     Running   0          71m
chi-ck-cluster-x-ck-cluster-2-1-1-0   1/1     Running   0          69m
StatefulSets
Each StatefulSet manages exactly one pod. The name encodes the topology: chi-<installation>-<cluster>-<shard>-<replica>.
# kubectl get sts
NAME                                READY   AGE
chi-ck-cluster-x-ck-cluster-1-0-0   1/1     89m
chi-ck-cluster-x-ck-cluster-1-0-1   1/1     88m
chi-ck-cluster-x-ck-cluster-1-1-0   1/1     88m
chi-ck-cluster-x-ck-cluster-1-1-1   1/1     88m
chi-ck-cluster-x-ck-cluster-2-0-0   1/1     87m
chi-ck-cluster-x-ck-cluster-2-0-1   1/1     87m
chi-ck-cluster-x-ck-cluster-2-1-0   1/1     87m
chi-ck-cluster-x-ck-cluster-2-1-1   1/1     86m
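The naming scheme is visible in the output above. As a quick sanity check, the expected StatefulSet names for this layout (2 clusters × 2 shards × 2 replicas = 8) can be generated without touching the cluster:

```shell
# Derive the expected StatefulSet names:
# chi-<installation>-<cluster>-<shard>-<replica>
chi="ck-cluster-x"
names=""
for cluster in ck-cluster-1 ck-cluster-2; do
  for shard in 0 1; do
    for replica in 0 1; do
      names="$names chi-$chi-$cluster-$shard-$replica"
    done
  done
done
# Print one name per line
for n in $names; do echo "$n"; done
```

The corresponding pod names simply append the StatefulSet ordinal (-0), since each StatefulSet has a single replica.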
Services
Each pod has a corresponding headless service.
In addition, there is one LoadBalancer-type service for the whole installation.
# kubectl get svc
NAME                                TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
chi-ck-cluster-x-ck-cluster-1-0-0   ClusterIP      None             <none>        7001/TCP,7000/TCP,7002/TCP   90m
chi-ck-cluster-x-ck-cluster-1-0-1   ClusterIP      None             <none>        7001/TCP,7000/TCP,7002/TCP   90m
chi-ck-cluster-x-ck-cluster-1-1-0   ClusterIP      None             <none>        7001/TCP,7000/TCP,7002/TCP   90m
chi-ck-cluster-x-ck-cluster-1-1-1   ClusterIP      None             <none>        7001/TCP,7000/TCP,7002/TCP   89m
chi-ck-cluster-x-ck-cluster-2-0-0   ClusterIP      None             <none>        7001/TCP,7000/TCP,7002/TCP   89m
chi-ck-cluster-x-ck-cluster-2-0-1   ClusterIP      None             <none>        7001/TCP,7000/TCP,7002/TCP   89m
chi-ck-cluster-x-ck-cluster-2-1-0   ClusterIP      None             <none>        7001/TCP,7000/TCP,7002/TCP   88m
chi-ck-cluster-x-ck-cluster-2-1-1   ClusterIP      None             <none>        7001/TCP,7000/TCP,7002/TCP   88m
clickhouse-ck-cluster-x             LoadBalancer   10.100.145.161   <pending>
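The EXTERNAL-IP stays <pending> unless the cluster has a LoadBalancer implementation. The operator lets you customize the generated service via serviceTemplates, e.g. to fall back to a plain ClusterIP service. A hedged sketch (the template name is made up; the port numbers match the hostTemplate above):

```yaml
spec:
  defaults:
    templates:
      serviceTemplate: svc-template   # hypothetical name
  templates:
    serviceTemplates:
      - name: svc-template
        spec:
          type: ClusterIP             # instead of the default LoadBalancer
          ports:
            - name: http
              port: 7001
            - name: tcp
              port: 7000
```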