Back up Kyma
Context
The workload of a Kyma cluster consists of Kubernetes objects and volumes.
Object backup
Kyma relies on the managed Kubernetes cluster for periodic backups of Kubernetes objects, so no manual steps are required.
CAUTION: The automatic backup doesn't include Kubernetes volumes. Back up your volumes regularly, either on demand or with a periodic job.
For example, Gardener uses etcd as the Kubernetes backing store for all cluster data and runs periodic jobs that take major and minor snapshots of the etcd database, so Kubernetes objects are included in the backup.
The major snapshot, which includes all resources, is taken daily, while minor snapshots are taken every five minutes.
If the etcd database experiences any problems, Gardener automatically restores the Kubernetes cluster using the most recent snapshot.
Volume backup
We recommend that you back up your volumes periodically with the Kubernetes VolumeSnapshot API resource. You can use a snapshot to provision a new volume prepopulated with the snapshot data, or to restore an existing volume to the state the snapshot represents.
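For example, a minimal sketch of an on-demand snapshot and a PersistentVolumeClaim provisioned from it could look as follows. All names (`my-pvc`, `my-snapshot`, `my-restored-pvc`, `csi-snapshot-class`) are placeholders; replace them with values that match your cluster.

```yaml
# Snapshot of an existing PVC.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: my-snapshot
  namespace: default
spec:
  volumeSnapshotClassName: csi-snapshot-class   # must reference a VolumeSnapshotClass for your CSI driver
  source:
    persistentVolumeClaimName: my-pvc           # the PVC to back up
---
# New PVC prepopulated with the snapshot data.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-restored-pvc
  namespace: default
spec:
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: my-snapshot
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi   # at least the size of the source volume
```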
Taking volume snapshots is possible thanks to Container Storage Interface (CSI) drivers, which allow third-party storage providers to expose their storage systems in Kubernetes. For details, see the full list of available CSI drivers.
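A VolumeSnapshotClass like the one referenced above could be defined as sketched below. The class name is a placeholder, and the driver is only an assumption (the GCE Persistent Disk CSI driver); use the driver that matches your storage provider.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-snapshot-class      # placeholder; referenced by VolumeSnapshot resources
driver: pd.csi.storage.gke.io   # assumption: GCE PD CSI driver; replace with your provider's driver
deletionPolicy: Delete
```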
You can create on-demand volume snapshots manually, or set up a job that takes snapshots automatically on a schedule.
Back up resources using Velero
You can back up and restore individual resources manually or automatically with Velero. For more information, read the Velero documentation. Be aware that a full backup of a Kyma cluster isn't supported; start from an existing Kyma installation and restore specific resources individually.
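For instance, assuming Velero is already installed in the cluster, backing up and restoring selected resources with the Velero CLI can look like the following sketch. The namespace `my-namespace` and the backup name `nightly-apps` are placeholders.

```bash
# Back up only Deployments, Services, and PVCs from a single namespace.
velero backup create nightly-apps \
  --include-namespaces my-namespace \
  --include-resources deployments,services,persistentvolumeclaims

# Check the backup status.
velero backup describe nightly-apps

# Restore the resources from that backup.
velero restore create --from-backup nightly-apps
```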
Create on-demand volume snapshots
To provision a new volume or restore an existing one, create an on-demand volume snapshot. Follow the instructions for your provider:
- Gardener
- AKS
- GKE
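Whichever provider you use, the flow is the same: apply a VolumeSnapshot manifest, such as the one sketched in the Volume backup section, and wait until the CSI driver marks it as ready. A minimal kubectl sketch, assuming the snapshot is named `my-snapshot` in the `default` namespace:

```bash
# Create the snapshot from a manifest file.
kubectl apply -f volume-snapshot.yaml

# Check whether the snapshot is ready to use; "true" means it can be restored from.
kubectl get volumesnapshot my-snapshot -n default -o jsonpath='{.status.readyToUse}'
```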
Create a periodic snapshot job
You can also create a CronJob that takes volume snapshots periodically. A sample CronJob definition, including the required ServiceAccount and RBAC roles, looks as follows:
```yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: volume-snapshotter
  namespace: {NAMESPACE}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: volume-snapshotter
  namespace: {NAMESPACE}
rules:
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshots"]
    verbs: ["create", "get", "list", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: volume-snapshotter
  namespace: {NAMESPACE}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: volume-snapshotter
subjects:
  - kind: ServiceAccount
    name: volume-snapshotter
    namespace: {NAMESPACE}
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: volume-snapshotter
  namespace: {NAMESPACE}
spec:
  schedule: "@hourly" # Run once an hour, at the beginning of the hour
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: volume-snapshotter
          restartPolicy: Never # Required for Job Pod templates
          containers:
            - name: job
              image: europe-docker.pkg.dev/kyma-project/prod/tpi/k8s-tools:v20230809-6a330b54
              command:
                - /bin/bash
                - -c
                - |
                  # Create a volume snapshot with a random name.
                  RANDOM_ID=$(openssl rand -hex 4)

                  cat <<EOF | kubectl apply -f -
                  apiVersion: snapshot.storage.k8s.io/v1
                  kind: VolumeSnapshot
                  metadata:
                    name: volume-snapshot-${RANDOM_ID}
                    namespace: {NAMESPACE}
                    labels:
                      "job": "volume-snapshotter"
                      "name": "volume-snapshot-${RANDOM_ID}"
                  spec:
                    volumeSnapshotClassName: {SNAPSHOT_CLASS_NAME}
                    source:
                      persistentVolumeClaimName: {PVC_NAME}
                  EOF

                  # Wait until the volume snapshot is ready to use.
                  attempts=3
                  retryTimeInSec="30"
                  for ((i=1; i<=attempts; i++)); do
                    STATUS=$(kubectl get volumesnapshot volume-snapshot-${RANDOM_ID} -n {NAMESPACE} -o jsonpath='{.status.readyToUse}')
                    if [ "${STATUS}" == "true" ]; then
                      echo "Volume snapshot is ready to use."
                      break
                    fi
                    if [[ "${i}" -lt "${attempts}" ]]; then
                      echo "Volume snapshot is not yet ready to use, let's wait ${retryTimeInSec} seconds and retry. Attempts ${i} of ${attempts}."
                    else
                      echo "Volume snapshot is still not ready to use after ${attempts} attempts, giving up."
                      exit 1
                    fi
                    sleep ${retryTimeInSec}
                  done

                  # Delete old volume snapshots created by previous runs.
                  kubectl delete volumesnapshot -n {NAMESPACE} -l job=volume-snapshotter,name!=volume-snapshot-${RANDOM_ID}
```
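Before applying the definition, replace the {NAMESPACE}, {SNAPSHOT_CLASS_NAME}, and {PVC_NAME} placeholders with values from your cluster, save the manifest to a file, and apply it with `kubectl apply -f {FILE_NAME}`. Each run creates a new, randomly named VolumeSnapshot and deletes the snapshots created by previous runs, so only the most recent snapshot is kept.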