
Automating etcd Cluster Backups with a K8S CronJob

Date: 2023-08-14 17:05:15
Tags: master1 CronJob k8s root snapshot etcd K8S backup

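Preface:

This post uses a k8s CronJob to back up an etcd cluster automatically and transfers each snapshot over SFTP to a server outside the k8s cluster for storage.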


Procedure:

Environment:

Server role                     IP              OS           etcd version
K8S cluster operations server   192.168.1.136   CentOS 7.9   3.4.9
Storage server                  192.168.1.105   CentOS 7.9   -

Build the image from a Dockerfile:

[root@k8s-master1 ~]# mkdir -p /software/k8s-yaml/etcd-backup/
[root@k8s-master1 ~]# cd /software/k8s-yaml/etcd-backup/

[root@k8s-master1 etcd-backup]# vim Dockerfile

FROM python:3-alpine

RUN mkdir /root/.ssh  \
    && touch /root/.ssh/config \
    && echo -e "Host *\n\tStrictHostKeyChecking no\n\tUserKnownHostsFile /dev/null\n\tKexAlgorithms +diffie-hellman-group1-sha1\n\tPubkeyAcceptedKeyTypes +ssh-rsa\n\tHostkeyAlgorithms +ssh-rsa" > /root/.ssh/config

RUN apk add -U --no-cache curl lftp ca-certificates openssh \
    && curl -L https://yunwei-software.oss-cn-zhangjiakou.aliyuncs.com/etcdctl -o /usr/local/bin/etcdctl \
    && chmod +x /usr/local/bin/etcdctl

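Note: the etcdctl binary downloaded above is version 3.4.9. If your cluster does not run etcd 3.4.9, either use ADD to copy your cluster's own etcdctl into the image, or adapt the Dockerfile below, which pulls the release from GitHub.

Dockerfile that pulls etcdctl from GitHub: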

FROM python:3-alpine

RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.aliyun.com/g' /etc/apk/repositories

# Set this to the etcd version of your own cluster
ARG ETCD_VERSION=v3.4.9

RUN apk add -U --no-cache curl lftp ca-certificates openssh

RUN mkdir /root/.ssh  \
    && touch /root/.ssh/config \
    && echo -e "Host *\n\tStrictHostKeyChecking no\n\tUserKnownHostsFile /dev/null\n\tKexAlgorithms +diffie-hellman-group1-sha1\n\tPubkeyAcceptedKeyTypes +ssh-rsa\n\tHostkeyAlgorithms +ssh-rsa" > /root/.ssh/config

ADD s3cmd-master.zip /s3cmd-master.zip
RUN unzip /s3cmd-master.zip -d /opt \
    && cd /opt/s3cmd-master \
    && python setup.py install \
    && rm -rf /s3cmd-master.zip

RUN curl -L https://github.com/etcd-io/etcd/releases/download/${ETCD_VERSION}/etcd-${ETCD_VERSION}-linux-amd64.tar.gz -o /opt/etcd-${ETCD_VERSION}-linux-amd64.tar.gz \
    && cd /opt && tar xzf etcd-${ETCD_VERSION}-linux-amd64.tar.gz \
    && mv etcd-${ETCD_VERSION}-linux-amd64/etcdctl /usr/local/bin/etcdctl \
    && rm -rf etcd-${ETCD_VERSION}-linux-amd64*
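
To keep the ETCD_VERSION build argument in line with the cluster, the value can be derived from an existing etcdctl binary instead of being typed by hand. This is a minimal sketch, assuming etcdctl is already on the PATH of the build host and prints a first line of the form "etcdctl version: 3.4.9". The helper name etcd_version_tag is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: turn the first line of `etcdctl version`
# (for example "etcdctl version: 3.4.9") into a release tag such as "v3.4.9".
etcd_version_tag() {
    etcdctl version | awk -F': ' '/^etcdctl version/ {print "v" $2; exit}'
}

# Usage (not run here): pass the tag to the ARG declared in the Dockerfile.
#   docker build --build-arg ETCD_VERSION="$(etcd_version_tag)" -t lws_etcd_backups:v1 .
```

Passing the value with --build-arg overrides the default given on the ARG line, so the Dockerfile itself does not need to be edited for each cluster.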

Build the image and push it to an image registry (a local or a cloud-hosted one both work) so that the other nodes can pull it:

[root@k8s-master1 etcd-backup]# docker build -t lws_etcd_backups:v1 .
[root@k8s-master1 etcd-backup]# docker tag lws_etcd_backups:v1 registry.cn-zhangjiakou.aliyuncs.com/newtime-test/etcd_backups:lws_v1
[root@k8s-master1 etcd-backup]# docker push registry.cn-zhangjiakou.aliyuncs.com/newtime-test/etcd_backups:lws_v1

Create the ConfigMap:

[root@k8s-master1 etcd-backup]# vim etcd-backup-cm.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: cron-sftp
  namespace: backups
data:
  entrypoint.sh: |
    #!/bin/sh

    #variables
    sftp_user="ftp01"
    sftp_passwd="Nisec123456"
    sftp_url="sftp://192.168.1.105:22"
    backup_dir=/home/ftp/etcd-backup/$CLUSTER_NAME

    # backup etcd data
    mkdir -p /snapshot
    chmod +x /usr/local/bin/etcdctl
    file=etcd-snapshot-$(date +%Y%m%d-%H%M%S).db
    etcdctl --endpoints $ENDPOINTS \
    --cert=/opt/etcd/ssl/server.pem \
    --key=/opt/etcd/ssl/server-key.pem \
    --cacert=/opt/etcd/ssl/ca.pem \
    snapshot save /snapshot/$file || exit 1

    # upload etcd snapshot file
    lftp -u $sftp_user,$sftp_passwd $sftp_url<<EOF
    mkdir -p $backup_dir
    cd $backup_dir
    lcd /snapshot
    put $file
    bye
    EOF

    # remove the expired snapshot files (oldest first); only names matching
    # etcd-snapshot-*.db are counted, so "." and ".." are never touched
    remote_files=$(lftp -u $sftp_user,$sftp_passwd $sftp_url -e "ls $backup_dir;bye" | awk '{print $NF}' | grep '^etcd-snapshot-.*\.db$' | sort)
    total_num=$(echo "$remote_files" | grep -c .)
    if [ "$total_num" -gt "$BACKUP_COUNTS" ]; then
      expired_num=$(expr $total_num - $BACKUP_COUNTS)
      expired_files=$(echo "$remote_files" | head -n $expired_num)
      for f in $expired_files; do
        to_remove=${backup_dir}/${f}
        echo "start to remove $to_remove"
        lftp -u $sftp_user,$sftp_passwd $sftp_url -e "rm -f $to_remove;bye"
      done
    fi

    # remove local etcd snapshot file
    rm -f /snapshot/$file

Note: adjust the SFTP settings in the variables section to match your environment.
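
The retention step keeps only the newest BACKUP_COUNTS snapshots. Because the file names embed a %Y%m%d-%H%M%S timestamp, sorting them by name also sorts them by age, so the oldest files are the first entries. The sketch below shows that selection logic on its own, with no SFTP involved. The function name expired_snapshots is illustrative and is not part of the script above.

```shell
#!/bin/sh
# expired_snapshots KEEP
# Reads file names on stdin (one per line), ignores anything that is not an
# etcd snapshot (such as "." and ".."), and prints the oldest names that
# exceed KEEP.
expired_snapshots() {
    keep=$1
    names=$(grep '^etcd-snapshot-.*\.db$' | sort)
    [ -n "$names" ] || return 0
    total=$(printf '%s\n' "$names" | grep -c .)
    if [ "$total" -gt "$keep" ]; then
        printf '%s\n' "$names" | head -n "$((total - keep))"
    fi
}

# Example: three snapshots on the server, keep the newest two.
printf '%s\n' . .. \
    etcd-snapshot-20230225-145505.db \
    etcd-snapshot-20230225-143011.db \
    etcd-snapshot-20230225-145005.db | expired_snapshots 2
# prints etcd-snapshot-20230225-143011.db
```

Filtering on the snapshot name pattern before counting matters here: a raw directory listing can include "." and "..", which would otherwise inflate the count and end up in the removal list.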

# Create the cron-sftp ConfigMap
[root@k8s-master1 etcd-backup]# kubectl create ns backups
[root@k8s-master1 etcd-backup]# kubectl apply -f etcd-backup-cm.yaml
[root@k8s-master1 etcd-backup]# kubectl get cm -n backups
NAME               DATA   AGE
cron-sftp          1      6s
kube-root-ca.crt   1      11s

Create the CronJob:

[root@k8s-master1 etcd-backup]# vim etcd-backup-cronjob.yaml

apiVersion: batch/v1beta1   # on Kubernetes 1.21+ use batch/v1; batch/v1beta1 was removed in 1.25
kind: CronJob
metadata:
  name: etcd-backup-sftp
  namespace: backups
spec:
 schedule: "*/5 * * * *"
 jobTemplate:
  spec:
    template:
      metadata:
       labels:
        app: etcd-backup
      spec:
        containers:
        - name: etcd-backup
          image: registry.cn-zhangjiakou.aliyuncs.com/newtime-test/etcd_backups:lws_v1
          imagePullPolicy: IfNotPresent
          workingDir: /
          command: ["sh", "./entrypoint.sh"]
          env:
          - name: ENDPOINTS
            value: "192.168.1.136:2379"
          - name: ETCDCTL_API
            value: "3"
          - name: BACKUP_COUNTS
            value: "5"
          - name: CLUSTER_NAME
            value: "cluster1"
          volumeMounts:
            - mountPath: /entrypoint.sh
              name: configmap-volume
              readOnly: true
              subPath: entrypoint.sh
            - mountPath: /opt/etcd/ssl
              name: etcd-certs
              readOnly: true
            - mountPath: /etc/localtime
              name: lt-config
            - mountPath: /etc/timezone
              name: tz-config
        volumes:
          - name: configmap-volume
            configMap:
              defaultMode: 0777
              name: cron-sftp
          - name: etcd-certs
            hostPath:
              path: /opt/etcd/ssl
          - name: lt-config
            hostPath:
              path: /etc/localtime
          - name: tz-config
            hostPath:
              path: /etc/timezone
        hostNetwork: true
        restartPolicy: OnFailure

Note: you can use nodeAffinity to schedule the etcd backup CronJob onto any etcd node. For example:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: node-role.kubernetes.io/etcd
          operator: Exists

My cluster has 4 nodes, and I copied the etcd SSL certificates to every node, so I did not set nodeAffinity.

# Copy the SSL certificates to all nodes (repeat for each node; -r is required to copy a directory):
[root@k8s-master1 etcd-backup]# scp -rp /opt/etcd/ssl 192.168.1.139:/opt/etcd/
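
With several nodes, the copy can be wrapped in a loop. This is a sketch, assuming SSH access as root to each node and that /opt/etcd already exists there. Apart from 192.168.1.139, the addresses in the usage line are placeholders, and the function name is made up for illustration.

```shell
#!/bin/sh
# copy_etcd_certs HOST...
# Copies the local /opt/etcd/ssl directory into /opt/etcd/ on every host
# given, stopping at the first host where scp fails.
copy_etcd_certs() {
    for host in "$@"; do
        echo "copying etcd certificates to $host"
        scp -rp /opt/etcd/ssl "$host:/opt/etcd/" || return 1
    done
}

# Usage (not run here):
#   copy_etcd_certs 192.168.1.139 <node-3-ip> <node-4-ip>
```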

Apply etcd-backup-cronjob.yaml:

[root@k8s-master1 etcd-backup]# kubectl apply -f etcd-backup-cronjob.yaml
[root@k8s-master1 etcd-backup]# kubectl get cj -n backups
NAME               SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
etcd-backup-sftp   */5 * * * *   False     0        <none>           7s

# After 5 minutes, check that a pod has been created:
[root@k8s-master1 etcd-backup]# kubectl get pods -n backups
NAME                                READY   STATUS      RESTARTS   AGE
etcd-backup-sftp-1677308100-cw4b8   0/1     Completed   0          1m51s
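
Rather than waiting for the next five-minute tick, a one-off Job can be created from the CronJob's template to test the backup immediately. A minimal sketch follows. The function name and the etcd-backup-manual- prefix are arbitrary choices, not something the setup above requires.

```shell
#!/bin/sh
# run_backup_now [SUFFIX]
# Creates a one-off Job from the etcd-backup-sftp CronJob in the backups
# namespace. SUFFIX defaults to the current epoch time so names stay unique.
run_backup_now() {
    suffix=${1:-$(date +%s)}
    kubectl create job "etcd-backup-manual-$suffix" \
        --from=cronjob/etcd-backup-sftp -n backups
}

# Usage (not run here):
#   run_backup_now
#   kubectl get pods -n backups
```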

[root@k8s-master1 etcd-backup]# kubectl logs etcd-backup-sftp-1677308100-cw4b8 -n backups
{"level":"info","ts":1677308105.1600003,"caller":"snapshot/v3_snapshot.go:119","msg":"created temporary db file","path":"/snapshot/etcd-snapshot-20230225-145505.db.part"}
{"level":"info","ts":"2023-02-25T14:55:05.191+0800","caller":"clientv3/maintenance.go:200","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":1677308105.1914499,"caller":"snapshot/v3_snapshot.go:127","msg":"fetching snapshot","endpoint":"192.168.1.136:2379"}
{"level":"info","ts":"2023-02-25T14:55:05.872+0800","caller":"clientv3/maintenance.go:208","msg":"completed snapshot read; closing"}
{"level":"info","ts":1677308106.153034,"caller":"snapshot/v3_snapshot.go:142","msg":"fetched snapshot","endpoint":"192.168.1.136:2379","size":"18 MB","took":0.992465311}
{"level":"info","ts":1677308106.1532946,"caller":"snapshot/v3_snapshot.go:152","msg":"saved","path":"/snapshot/etcd-snapshot-20230225-145505.db"}
Snapshot saved at /snapshot/etcd-snapshot-20230225-145505.db
mkdir: Access failed: Failure (/home/ftp/etcd-backup/cluster1)
start to remove /home/ftp/etcd-backup/cluster1/.
start to remove /home/ftp/etcd-backup/cluster1/..
start to remove /home/ftp/etcd-backup/cluster1/etcd-snapshot-20230225-143011.db

Check the etcd backups:

[Image omitted: etcd backup files]
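
The backups can also be listed from the command line with lftp, reusing the connection details from entrypoint.sh. This is a sketch, assuming lftp is installed on the machine where it runs. The function name is illustrative, and reading the password from an environment variable named SFTP_PASSWD is an assumption (the script above hard-codes it).

```shell
#!/bin/sh
# list_remote_backups [CLUSTER_NAME]
# Lists the snapshot directory of one cluster on the storage server.
# SFTP_PASSWD is an assumed environment variable holding the ftp01 password.
list_remote_backups() {
    cluster=${1:-cluster1}
    lftp -u "ftp01,$SFTP_PASSWD" sftp://192.168.1.105:22 \
        -e "ls /home/ftp/etcd-backup/$cluster; bye"
}

# Usage (not run here):
#   SFTP_PASSWD=... list_remote_backups cluster1
```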

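The K8S cluster in our server room has not had any problems so far, and I have not yet had time to test restoring from a snapshot .db file. I will run that experiment when I get the chance.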
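
Short of a full restore test, a downloaded snapshot can at least be inspected with etcdctl snapshot status, which reports its hash, revision, key count and size. This is a sketch, assuming the snapshot file has been copied to a machine with etcdctl 3.4 installed. The function name check_snapshot is illustrative.

```shell
#!/bin/sh
# check_snapshot FILE
# Refuses missing or empty files, then prints the snapshot summary as a table.
check_snapshot() {
    snapshot=$1
    [ -s "$snapshot" ] || { echo "missing or empty: $snapshot" >&2; return 1; }
    ETCDCTL_API=3 etcdctl snapshot status "$snapshot" --write-out=table
}

# Usage (not run here):
#   check_snapshot ./etcd-snapshot-20230225-145505.db
```

A snapshot that passes this check is readable, but that is not proof that a cluster can be rebuilt from it. Only an actual restore test shows that.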

Links for the restore procedure and further reference:

Reference link


From: https://blog.51cto.com/u_14500227/7078535