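Backing Up and Restoring a K8s Cluster and Its Applications with Velero
Environment
A test cluster of three virtual machines (one master, two workers), with NFS providing dynamic storage provisioning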
Host | IP | OS |
---|---|---|
k8s-master | 192.168.1.10 | centos7.9 |
k8s-node1 | 192.168.1.11 | centos7.9 |
k8s-node2 | 192.168.1.12 | centos7.9 |
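I. Introduction
1.1 Overview
Backup and disaster recovery
One-command restore
Cluster migration
PV backup with encrypted backup data, implemented by two uploader plugins; the --uploader-type startup flag selects which one is used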
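1.2 Architecture
The Velero client calls the Kubernetes API server to create a Backup object.
The BackupController notices the new Backup object and validates it.
The BackupController starts the backup process, collecting the data to back up by querying the API server for resources.
The BackupController calls the object storage service (for example, AWS S3) to upload the backup file.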
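The Backup object that starts this flow is an ordinary custom resource. As a minimal sketch, the helper below prints such a manifest; the function name, the backup name, and the 720h TTL are illustrative assumptions, and it assumes Velero runs in the velero namespace with file-system backup enabled.

```shell
#!/bin/sh
# Print a minimal Velero Backup manifest to stdout.
# $1 = backup name, $2 = namespace to back up (both example values).
render_backup_manifest() {
  rb_name=$1
  rb_target_ns=$2
  cat <<EOF
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: ${rb_name}
  namespace: velero
spec:
  includedNamespaces:
    - ${rb_target_ns}
  defaultVolumesToFsBackup: true
  ttl: 720h0m0s
EOF
}

# Usage (needs a cluster with Velero installed):
#   render_backup_manifest demo-backup default | kubectl apply -f -
```

Applying the manifest should have the same effect as creating the backup from the velero CLI, since the CLI only creates this object and the controller does the rest.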
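1.3 Backing up stateful data
Velero has two ways to back up stateful data, compared below
Dimension | CSI snapshots | File copy |
---|---|---|
Application performance impact | Low; the CSI interface calls the storage system's snapshot feature | Depends on data volume; consumes extra resources |
Data availability | Depends on the storage system; requires a CSI driver that supports snapshots | Object storage is isolated from production, has independent availability, and supports cross-site availability |
Data consistency | Crash consistency supported; consistency is achieved together with the hook mechanism | No guarantee on its own; relies on hooks |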
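File copy encrypts, compresses, and incrementally backs up the data. The compression ratio is around 60%, and the backup files are encrypted binaries that show up as garbage when opened directly.
The two file-copy plugins:
Restic (default) https://restic.readthedocs.io/en/latest/100_references.html#terminology
Kopia https://kopia.io/docs/advanced/architecture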
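1.4 Backup best practices
If your storage supports snapshots, combine high-frequency local snapshots with low-frequency Restic backups to S3
Choose the backup granularity and backup policy from the application's point of view
When several clusters share the same object storage, take care to prevent conflicts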
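One way to keep clusters apart in a shared bucket is to give each cluster its own prefix through the --prefix option of velero install. The sketch below only derives such a prefix; the function name and the clusters/ layout are assumptions made for illustration, not something Velero prescribes.

```shell
#!/bin/sh
# Derive a per-cluster prefix inside a shared bucket.
# $1 = cluster name (lowercase letters, digits and dashes only).
cluster_prefix() {
  case $1 in
    ''|*[!a-z0-9-]*)
      echo "invalid cluster name: '$1'" >&2
      return 1
      ;;
  esac
  printf 'clusters/%s\n' "$1"
}

# Usage (hypothetical cluster name prod-a):
#   velero install ... --bucket velerodata --prefix "$(cluster_prefix prod-a)"
```

With distinct prefixes, each cluster only syncs the backups stored under its own path.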
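1.5 Sync mechanism
Velero uploads the metadata of each backup to S3. If a backup is deleted from the cluster but its data in S3 is not, Velero periodically syncs the backups found in S3 back into the cluster, so the deleted backup reappears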
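To remove a backup for good, delete it through the velero CLI rather than with kubectl delete: as far as I understand the sync behavior, the CLI deletion also removes the data in object storage, so nothing is left to sync back. A minimal sketch (the wrapper name is made up):

```shell
#!/bin/sh
# Delete one or more backups through Velero, including their data in object storage.
# Arguments: backup names.
purge_backups() {
  for pb_name in "$@"; do
    velero backup delete "$pb_name" --confirm || return 1
  done
}

# Usage:
#   purge_backups test1 test2
```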
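1.6 Pitfalls
Deleting a backup or restore task that has been running for a long time without finishing can block Velero so that it cannot process later tasks
With the file-copy method, backups of applications whose files change rapidly, such as Elasticsearch (ES) or ClickHouse (CK), fail more often than not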
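A possible workaround for such workloads is to exclude their data volumes from file-system backup and protect them with the application's own snapshot tooling instead. The sketch below uses the backup.velero.io/backup-volumes-excludes pod annotation; the function name and the example namespace, pod, and volume names are assumptions. For pods managed by a StatefulSet or Deployment, the annotation would normally go into the pod template so that it survives restarts.

```shell
#!/bin/sh
# Exclude named volumes of one pod from Velero's file-system backup.
# $1 = namespace, $2 = pod name, $3 = comma-separated volume names.
exclude_volumes_from_fs_backup() {
  kubectl -n "$1" annotate pod "$2" \
    "backup.velero.io/backup-volumes-excludes=$3" --overwrite
}

# Usage (hypothetical Elasticsearch pod):
#   exclude_volumes_from_fs_backup logging es-0 data
```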
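II. Installation
There are two mainstream ways to install Velero: choose the first if you want the least effort, or install with Helm if you need customization
Note: prepare an object store in advance. MinIO is used here, at 192.168.1.100:9000 with credentials minioadmin/minioadmin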
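If the MinIO client mc is available, the bucket can be created from the command line. This is a minimal sketch: the alias name velero-minio and the function name are arbitrary choices, and the bucket name in the usage line is an example.

```shell
#!/bin/sh
# Register a MinIO endpoint under an alias and create a bucket if it is missing.
# $1 = endpoint URL, $2 = access key, $3 = secret key, $4 = bucket name.
ensure_bucket() {
  eb_endpoint=$1
  eb_access_key=$2
  eb_secret_key=$3
  eb_bucket=$4
  mc alias set velero-minio "$eb_endpoint" "$eb_access_key" "$eb_secret_key" &&
    mc mb --ignore-existing "velero-minio/$eb_bucket"
}

# Usage:
#   ensure_bucket http://192.168.1.100:9000 minioadmin minioadmin velerodata
```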
2.1 Installing from the binary release
Download the release tarball and place the velero binary in /usr/local/bin
## Download
wget https://github.com/vmware-tanzu/velero/releases/download/v1.11.1/velero-v1.11.1-linux-amd64.tar.gz
## Extract
tar -zxvf velero-v1.11.1-linux-amd64.tar.gz -C /usr/local/bin && mv /usr/local/bin/velero-v1.11.1-linux-amd64/velero /usr/local/bin
Create the MinIO credentials file
cat > velero-auth.txt << EOF
[default]
aws_access_key_id = minioadmin
aws_secret_access_key = minioadmin
EOF
Install (the bucket must be created beforehand)
velero --kubeconfig /root/.kube/config install \
  --use-node-agent --provider aws --plugins velero/velero-plugin-for-aws:v1.7.0 \
  --bucket velerodata --secret-file ./velero-auth.txt \
  --use-volume-snapshots=false --namespace velero --default-volumes-to-fs-backup \
  --backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://192.168.1.100:9000
Check the installation status
[root@master1 ~]# kubectl get pod -n velero
NAME                      READY   STATUS    RESTARTS   AGE
node-agent-86vhf          1/1     Running   0          34s
node-agent-hz624          1/1     Running   0          34s
velero-55568bff5b-rr7sz   1/1     Running   0          34s
[root@master1 ~]# kubectl get crd | grep velero
backuprepositories.velero.io        2023-08-28T12:58:41Z
backups.velero.io                   2023-08-28T12:58:41Z
backupstoragelocations.velero.io    2023-08-28T12:58:41Z
deletebackuprequests.velero.io      2023-08-28T12:58:41Z
downloadrequests.velero.io          2023-08-28T12:58:41Z
podvolumebackups.velero.io          2023-08-28T12:58:41Z
podvolumerestores.velero.io         2023-08-28T12:58:41Z
restores.velero.io                  2023-08-28T12:58:41Z
schedules.velero.io                 2023-08-28T12:58:41Z
serverstatusrequests.velero.io      2023-08-28T12:58:41Z
volumesnapshotlocations.velero.io   2023-08-28T12:58:41Z
Check the status of the backup storage location
[root@master1 ~]# kubectl get backupstoragelocations -A
NAMESPACE   NAME      PHASE       LAST VALIDATED   AGE   DEFAULT
velero      default   Available   3s               10m   true
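In a script, the same check can be made by reading the phase field directly. A minimal sketch, assuming Velero is installed in the velero namespace; the function names are made up for illustration.

```shell
#!/bin/sh
# Print the phase of a BackupStorageLocation (defaults to the one named "default").
bsl_phase() {
  kubectl -n velero get backupstoragelocation "${1:-default}" \
    -o jsonpath='{.status.phase}'
}

# Succeed only when the location is reported as Available.
bsl_is_available() {
  [ "$(bsl_phase "$1")" = "Available" ]
}

# Usage:
#   bsl_is_available default || echo "object storage is not reachable yet"
```

A phase other than Available usually points at the bucket name, the credentials file, or the s3Url value.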
2.2 Installing with Helm
http://www.1024sky.cn/blog/article/77733
III. Basic operations
3.1 Deploying test applications
Deploy two test applications to verify backup and restore, with MinIO storing the backup data
mysql 5.7
nginx
MySQL manifest
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
spec:
  storageClassName: nfs
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1              # API version
kind: Deployment                 # resource type
metadata:                        # resource metadata
  name: mysql-dep                # resource name (required)
spec:                            # desired state
  replicas: 1                    # number of pods; defaults to 1 if omitted
  selector:
    matchLabels:
      app: mysql-pod
  template:                      # pod template
    metadata:                    # pod metadata
      labels:                    # at least one label is required
        app: mysql-pod
    spec:                        # desired pod state
      containers:
        - name: mysql            # container name
          image: mysql:5.7       # image
          imagePullPolicy: IfNotPresent
          ports:                 # container ports
            - containerPort: 3306
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: "root"
          volumeMounts:
            - name: mysql-persistent-storage
              mountPath: /var/lib/mysql
      volumes:
        - name: mysql-persistent-storage
          persistentVolumeClaim:
            claimName: mysql-pv-claim
---
apiVersion: v1                   # API version
kind: Service                    # resource type
metadata:                        # resource metadata
  name: mysql-svc                # resource name (required)
  labels:
    app: mysql-svc
spec:                            # desired state
  type: NodePort                 # service type
  ports:
    - port: 3306
      targetPort: 3306           # same as containerPort
      protocol: TCP
      nodePort: 30306
  selector:
    app: mysql-pod
nginx manifest
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: nfs
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      volumes:
        - name: data-volume
          persistentVolumeClaim:
            claimName: my-pvc
      containers:
        - name: nginx
          image: nginx
          volumeMounts:
            - name: data-volume
              mountPath: /var/www/html
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 80
  selector:
    app: nginx
Write some test data into the test applications (the MySQL statements are shown here)
-- Create the test database
CREATE DATABASE testdb;
-- Switch to the test database
USE testdb;
-- Create the test table
CREATE TABLE test_table (
  id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(50),
  age INT,
  email VARCHAR(100)
);
-- Insert test rows
INSERT INTO test_table (name, age, email) VALUES
  ('John Doe', 25, 'john.doe@example.com'),
  ('Jane Smith', 30, 'jane.smith@example.com'),
  ('Mike Johnson', 35, 'mike.johnson@example.com');
select * from test_table;
+----+--------------+------+--------------------------+
| id | name         | age  | email                    |
+----+--------------+------+--------------------------+
|  1 | John Doe     |   25 | john.doe@example.com     |
|  2 | Jane Smith   |   30 | jane.smith@example.com   |
|  3 | Mike Johnson |   35 | mike.johnson@example.com |
+----+--------------+------+--------------------------+
3.2 Backup
Create a one-off backup that covers the database and nginx
velero backup create test3 --include-namespaces=default --default-volumes-to-fs-backup
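The command returns as soon as the request is submitted, so a script usually has to wait for the backup to finish. The following is a minimal polling sketch; the function names, the retry count, and the delay are illustrative assumptions, and it reads the phase from the Backup object in the velero namespace.

```shell
#!/bin/sh
# Print the current phase of a Velero backup.
backup_phase() {
  kubectl -n velero get backup "$1" -o jsonpath='{.status.phase}'
}

# Poll until the backup completes, fails, or the attempts run out.
# $1 = backup name, $2 = max polls (default 60), $3 = seconds between polls (default 5).
wait_for_backup() {
  wb_name=$1
  wb_tries=${2:-60}
  wb_delay=${3:-5}
  while [ "$wb_tries" -gt 0 ]; do
    wb_phase=$(backup_phase "$wb_name")
    case $wb_phase in
      Completed)
        return 0
        ;;
      PartiallyFailed|Failed|FailedValidation)
        echo "backup $wb_name ended in phase $wb_phase" >&2
        return 1
        ;;
    esac
    wb_tries=$((wb_tries - 1))
    sleep "$wb_delay"
  done
  echo "timed out waiting for backup $wb_name" >&2
  return 1
}

# Usage:
#   wait_for_backup test3 && echo "backup test3 completed"
```

For interactive use, velero backup describe test3 and velero backup logs test3 give more detail when a backup does not complete.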
3.3 Scheduled backup
For testing, the backup interval is set to once per minute here
[root@master1 yaml]# velero schedule create schedule-backup --schedule="* * * * *" --include-namespaces=default --default-volumes-to-fs-backup
Schedule "schedule-backup" created successfully.
[root@master1 yaml]# kubectl get schedule -A
NAMESPACE   NAME              STATUS    SCHEDULE    LASTBACKUP   AGE   PAUSED
velero      schedule-backup   Enabled   * * * * *                15s
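Outside of a test, a schedule would normally run far less often and set a retention period with --ttl so that old backups expire. A minimal sketch; the wrapper name, the daily 02:00 cron expression, and the 7-day TTL in the usage line are example values, not recommendations from the original setup.

```shell
#!/bin/sh
# Create a schedule for the default namespace with an explicit retention period.
# $1 = schedule name, $2 = cron expression, $3 = TTL (for example 168h0m0s).
create_schedule() {
  velero schedule create "$1" \
    --schedule="$2" \
    --ttl "$3" \
    --include-namespaces=default \
    --default-volumes-to-fs-backup
}

# Usage (daily at 02:00, keep each backup for 7 days):
#   create_schedule daily-default "0 2 * * *" 168h0m0s
```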
3.4 Restore
First delete MySQL and nginx manually to simulate data loss
[root@master1 ~]# kubectl get pod
No resources found in default namespace.
[root@master1 ~]# kubectl get all
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   64d
[root@master1 ~]#
Start the restore by creating a restore from one of the backups taken earlier
velero restore create --from-backup schedule-backup-20230830131234
[root@master1 ~]# kubectl get backup -A
NAMESPACE   NAME                             AGE
velero      schedule-backup-20230830131234   5m13s
velero      test1                            21m
velero      test2                            11m
velero      test3                            9m26s
[root@master1 ~]# velero restore create --from-backup schedule-backup-20230830131234
Restore request "schedule-backup-20230830131234-20230830091758" submitted successfully.
Run `velero restore describe schedule-backup-20230830131234-20230830091758` or `velero restore logs schedule-backup-20230830131234-20230830091758` for more details.
[root@master1 ~]# kubectl get restore -A
NAMESPACE   NAME                                            AGE
velero      schedule-backup-20230830131234-20230830091758   7s
[root@master1 ~]# kubectl get pod
NAME                                READY   STATUS     RESTARTS   AGE
mysql-dep-5984d97dc4-8fp7p          0/1     Init:0/1   0          10s
nginx-deployment-6787dfdbf6-gf88w   0/1     Init:0/1   0          10s
After a short wait, all pods are Running
[root@master1 ~]# kubectl get pod
NAME                                READY   STATUS    RESTARTS   AGE
mysql-dep-5984d97dc4-8fp7p          1/1     Running   0          62s
nginx-deployment-6787dfdbf6-gf88w   1/1     Running   0          62s
[root@master1 ~]#
[root@master1 ~]# kubectl exec -it mysql-dep-5984d97dc4-8fp7p /bin/bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Defaulted container "mysql" out of: mysql, restore-wait (init)
bash-4.2#
bash-4.2# ls
bin  boot  dev  docker-entrypoint-initdb.d  entrypoint.sh  etc  home  lib  lib64  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var
bash-4.2# mysql -uroot -proot
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.43 MySQL Community Server (GPL)
Copyright (c) 2000, 2023, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| testdb             |
+--------------------+
5 rows in set (0.01 sec)
mysql> use testdb;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select * from test_table;
+----+--------------+------+--------------------------+
| id | name         | age  | email                    |
+----+--------------+------+--------------------------+
|  1 | John Doe     |   25 | john.doe@example.com     |
|  2 | Jane Smith   |   30 | jane.smith@example.com   |
|  3 | Mike Johnson |   35 | mike.johnson@example.com |
+----+--------------+------+--------------------------+
3 rows in set (0.00 sec)
mysql>
The data has been restored
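The manual check can also be scripted. The sketch below counts the rows of the test table through kubectl exec and compares the result with the expected number; the function names are made up, and the deployment name, credentials, and table come from the test setup in this walkthrough.

```shell
#!/bin/sh
# Count the rows of testdb.test_table inside the restored MySQL deployment.
# stderr is discarded to hide the password warning and the default-container notice.
restored_row_count() {
  kubectl exec deploy/mysql-dep -- \
    mysql -uroot -proot -N -e 'SELECT COUNT(*) FROM testdb.test_table' 2>/dev/null
}

# Succeed only if the row count equals the expected number.
# $1 = expected row count (3 for the test data used here).
verify_restore() {
  [ "$(restored_row_count)" = "$1" ]
}

# Usage:
#   verify_restore 3 && echo "restore verified"
```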