Enable backups
|
This page applies to SUSE® Observability {release-version} or newer. If you are upgrading from a version prior to {release-version}, see Migrating to S3Proxy for important information about changes in configuration of the backup and restore solution. |
Overview
SUSE® Observability has a built-in backup mechanism that can be configured to store backups to AWS S3, Azure Blob Storage, or a Kubernetes Persistent Volume.
Most backups are enabled with a single Helm value global.backup.enabled. Settings backups are always enabled but behave differently based on the following values:
Backup scope
The following data can be automatically backed up:
- Settings (StackPacks, monitors, views, tokens) - always enabled:
  - When global.backup.enabled is false: backups are stored via the S3 Proxy to a Kubernetes Persistent Volume
  - When global.backup.enabled is true: backups are stored via the S3 Proxy on your configured storage backend
- Topology data and Settings stored in StackGraph - enabled when global.backup.enabled is true
- Metrics stored in SUSE® Observability’s Victoria Metrics instance(s) - enabled when global.backup.enabled is true
- Telemetry data stored in SUSE® Observability’s Elasticsearch instance - enabled when global.backup.enabled is true
- OpenTelemetry data stored in SUSE® Observability’s ClickHouse instance - enabled when global.backup.enabled is true
Storage options
Backups use [S3 Proxy](https://github.com/gaul/s3proxy) as an S3-compatible gateway to your storage backend. S3 Proxy is automatically deployed by the Helm chart.
It can be configured to store the backups in three locations: AWS S3, Azure Blob Storage, or a Kubernetes Persistent Volume.
Enable backups
AWS S3
|
Encryption: Amazon S3-managed keys (SSE-S3) should be used when encrypting S3 buckets that store the backups. ⚠️ Encryption with AWS KMS keys stored in AWS Key Management Service (SSE-KMS) isn’t supported and results in errors in the Elasticsearch logs.
|
Using separate S3 buckets
To enable scheduled backups to separate AWS S3 buckets (one per datastore), add the following YAML fragment to the Helm values.yaml file used to install SUSE® Observability:
global:
  backup:
    enabled: true
  s3proxy:
    credentials:
      # Credentials used internally in the cluster to access the backup storage (S3Proxy).
      accessKey: YOUR_ACCESS_KEY
      secretKey: YOUR_SECRET_KEY
backup:
  storage:
    backend:
      s3:
        enabled: true
        # Specify AWS region (can also be set via environment variable instead)
        region: "eu-west-1"
        # Option 1: Use explicit credentials
        accessKey: AWS_ACCESS_KEY
        secretKey: AWS_SECRET_KEY
        # Option 2: Use IAM role / IRSA
        # Leave accessKey and secretKey empty
        # Option 3: Use credentials from external secret:
        fromExternalSecret: "my-aws-credentials" # Kubernetes secret containing backendAccessKey and backendSecretKey
        # Optional: Custom S3-compatible endpoint (e.g., MinIO)
        # endpoint: "https://minio.example.com"
  stackGraph:
    bucketName: AWS_STACKGRAPH_BUCKET
  elasticsearch:
    bucketName: AWS_ELASTICSEARCH_BUCKET
  configuration:
    bucketName: AWS_CONFIGURATION_BUCKET
victoria-metrics-0:
  backup:
    bucketName: AWS_VICTORIA_METRICS_BUCKET
victoria-metrics-1:
  backup:
    bucketName: AWS_VICTORIA_METRICS_BUCKET
clickhouse:
  backup:
    bucketName: AWS_CLICKHOUSE_BUCKET
Replace the following values:
- YOUR_ACCESS_KEY and YOUR_SECRET_KEY are the credentials that will be used to secure the S3 Proxy instance. The automatic backup jobs and the restore jobs use these to authenticate. They’re also required if you want to manually access the S3 Proxy instance.
  - YOUR_ACCESS_KEY should contain 5 to 20 alphanumeric characters.
  - YOUR_SECRET_KEY should contain 8 to 40 alphanumeric characters.
- AWS_ACCESS_KEY and AWS_SECRET_KEY are the AWS credentials for the IAM user that has access to the S3 buckets where the backups will be stored. See below for the permission policy that needs to be attached to that user.
- AWS_*_BUCKET are the names of the S3 buckets where the backups should be stored.

The names of AWS S3 buckets are global across the whole of AWS. Therefore, S3 buckets with the default names (sts-elasticsearch-backup, sts-configuration-backup, sts-stackgraph-backup, sts-victoria-metrics-backup and sts-clickhouse-backup) will probably not be available.
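For Option 3 in the example above, the referenced Kubernetes secret could look like the following sketch. The key names backendAccessKey and backendSecretKey are taken from the comment in the example, and the secret name matches the fromExternalSecret value. Creating the secret in the same namespace as the SUSE® Observability release is an assumption.

```yaml
# Hypothetical secret referenced by the fromExternalSecret value.
# Assumption: created in the namespace where SUSE Observability is installed.
apiVersion: v1
kind: Secret
metadata:
  name: my-aws-credentials
type: Opaque
stringData:
  backendAccessKey: AWS_ACCESS_KEY   # replace with the IAM access key
  backendSecretKey: AWS_SECRET_KEY   # replace with the IAM secret key
```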
Using a single S3 bucket with prefixes
Instead of using separate buckets for each datastore, you can use a single S3 bucket with different prefixes:
global:
  backup:
    enabled: true
  s3proxy:
    credentials:
      # Credentials used internally in the cluster to access the backup storage (S3Proxy).
      accessKey: YOUR_ACCESS_KEY
      secretKey: YOUR_SECRET_KEY
backup:
  storage:
    backend:
      s3:
        enabled: true
        region: "eu-west-1"
        # For other authentication options and custom endpoint, see the previous example
        accessKey: AWS_ACCESS_KEY
        secretKey: AWS_SECRET_KEY
  elasticsearch:
    bucketName: BUCKET
    s3Prefix: elasticsearch
  stackGraph:
    bucketName: BUCKET
    s3Prefix: stackgraph
  configuration:
    bucketName: BUCKET
    s3Prefix: configuration
victoria-metrics-0:
  backup:
    bucketName: BUCKET
    s3Prefix: victoria-metrics-0
victoria-metrics-1:
  backup:
    bucketName: BUCKET
    s3Prefix: victoria-metrics-1
clickhouse:
  backup:
    bucketName: BUCKET
    s3Prefix: clickhouse
Replace BUCKET with your S3 bucket name. The backups for different datastores are organized using the configured s3Prefix values. The same YOUR_ACCESS_KEY, YOUR_SECRET_KEY, AWS_ACCESS_KEY, and AWS_SECRET_KEY values from the previous section apply here.
AWS S3 Permissions
The IAM user identified by AWS_ACCESS_KEY and AWS_SECRET_KEY must be configured with the following permission policy to access the S3 buckets:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowListBackupBuckets",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::AWS_STACKGRAPH_BUCKET",
        "arn:aws:s3:::AWS_ELASTICSEARCH_BUCKET",
        "arn:aws:s3:::AWS_VICTORIA_METRICS_BUCKET",
        "arn:aws:s3:::AWS_CLICKHOUSE_BUCKET",
        "arn:aws:s3:::AWS_CONFIGURATION_BUCKET"
      ]
    },
    {
      "Sid": "AllowWriteBackupBuckets",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:ListMultipartUploadParts",
        "s3:AbortMultipartUpload"
      ],
      "Resource": [
        "arn:aws:s3:::AWS_STACKGRAPH_BUCKET/*",
        "arn:aws:s3:::AWS_ELASTICSEARCH_BUCKET/*",
        "arn:aws:s3:::AWS_VICTORIA_METRICS_BUCKET/*",
        "arn:aws:s3:::AWS_CLICKHOUSE_BUCKET/*",
        "arn:aws:s3:::AWS_CONFIGURATION_BUCKET/*"
      ]
    }
  ]
}
Azure Blob Storage
Using separate containers
To enable backups to separate Azure Blob Storage containers (one per datastore), add the following YAML fragment to the Helm values.yaml file used to install SUSE® Observability:
global:
  backup:
    enabled: true
  s3proxy:
    credentials:
      # Credentials used internally in the cluster to access the backup storage (S3Proxy).
      accessKey: YOUR_ACCESS_KEY
      secretKey: YOUR_SECRET_KEY
backup:
  storage:
    backend:
      azure:
        enabled: true
        accountName: "mystorageaccount"
        # Option 1: Use explicit credentials
        accountKey: "your-storage-account-key"
        # Option 2: Use managed identity
        # Leave accountKey empty
        # Option 3: Use credentials from external secret:
        fromExternalSecret: "my-azure-credentials" # Kubernetes secret containing azureAccountName and azureAccountKey
Replace the following values:
- YOUR_ACCESS_KEY and YOUR_SECRET_KEY are the credentials that will be used to secure the S3 Proxy instance. The automatic backup jobs and the restore jobs use these to authenticate. They’re also required if you want to manually access the S3 Proxy instance.
- AZURE_STORAGE_ACCOUNT_NAME - the Azure storage account name (learn.microsoft.com), set as accountName ("mystorageaccount" in the example above).
- AZURE_STORAGE_ACCOUNT_KEY - the Azure storage account key (learn.microsoft.com) for the storage account where the backups should be stored, set as accountKey ("your-storage-account-key" in the example above).
The StackGraph, settings (configuration), Elasticsearch, Victoria Metrics, and ClickHouse backups are stored in BLOB containers called sts-stackgraph-backup, sts-configuration-backup, sts-elasticsearch-backup, sts-victoria-metrics-backup, and sts-clickhouse-backup respectively. These names can be changed by setting the Helm values backup.stackGraph.bucketName, backup.configuration.bucketName, backup.elasticsearch.bucketName, victoria-metrics-0.backup.bucketName, victoria-metrics-1.backup.bucketName and clickhouse.backup.bucketName.
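For Option 3 in the Azure example above, the referenced Kubernetes secret could look like the following sketch. The key names azureAccountName and azureAccountKey are taken from the comment in the example, and the secret name matches the fromExternalSecret value. Creating the secret in the same namespace as the SUSE® Observability release is an assumption.

```yaml
# Hypothetical secret referenced by the fromExternalSecret value.
# Assumption: created in the namespace where SUSE Observability is installed.
apiVersion: v1
kind: Secret
metadata:
  name: my-azure-credentials
type: Opaque
stringData:
  azureAccountName: AZURE_STORAGE_ACCOUNT_NAME   # replace with the storage account name
  azureAccountKey: AZURE_STORAGE_ACCOUNT_KEY     # replace with the storage account key
```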
Using a single container with prefixes
Instead of using separate containers for each datastore, you can use a single Azure Blob Storage container with different prefixes:
global:
  backup:
    enabled: true
  s3proxy:
    credentials:
      # Credentials used internally in the cluster to access the backup storage (S3Proxy).
      accessKey: YOUR_ACCESS_KEY
      secretKey: YOUR_SECRET_KEY
backup:
  storage:
    backend:
      azure:
        enabled: true
        accountName: "mystorageaccount"
        # For other authentication options, see the previous example
        accountKey: "your-storage-account-key"
  elasticsearch:
    bucketName: CONTAINER
    s3Prefix: elasticsearch
  stackGraph:
    bucketName: CONTAINER
    s3Prefix: stackgraph
  configuration:
    bucketName: CONTAINER
    s3Prefix: configuration
victoria-metrics-0:
  backup:
    bucketName: CONTAINER
    s3Prefix: victoria-metrics-0
victoria-metrics-1:
  backup:
    bucketName: CONTAINER
    s3Prefix: victoria-metrics-1
clickhouse:
  backup:
    bucketName: CONTAINER
    s3Prefix: clickhouse
Replace the following values:
- YOUR_ACCESS_KEY and YOUR_SECRET_KEY are the credentials that will be used to secure the S3 Proxy instance. The automatic backup jobs and the restore jobs use these to authenticate. They’re also required if you want to manually access the S3 Proxy instance.
- CONTAINER is the name of your Azure Blob Storage container. The backups for different datastores are organized using the configured s3Prefix values. The same AZURE_STORAGE_ACCOUNT_NAME and AZURE_STORAGE_ACCOUNT_KEY values from the previous section apply here.
Kubernetes Persistent Volume
|
Using Kubernetes Persistent Volumes for backups has significant limitations:
Recommendation: Use AWS S3 or Azure Blob Storage instead for production environments. |
Basic configuration
To enable backups to cluster-local storage, enable global backups with PVC storage by adding the following YAML fragment to the Helm values.yaml file used to install SUSE® Observability:
global:
  backup:
    enabled: true
  s3proxy:
    credentials:
      # Credentials used internally in the cluster to access the backup storage (S3Proxy).
      accessKey: YOUR_ACCESS_KEY
      secretKey: YOUR_SECRET_KEY
backup:
  storage:
    backend:
      pvc:
        enabled: true
        size: 500Gi # Size for main backup storage, size depends on the expected combined backups size
Replace the following values:
- YOUR_ACCESS_KEY and YOUR_SECRET_KEY are the credentials that will be used to secure the S3 Proxy instance. The automatic backup jobs and the restore jobs use these to authenticate. They’re also required if you want to manually access the S3 Proxy instance.
Configuration and topology data (StackGraph)
Configuration and topology data (StackGraph) backups are full backups, stored in a single file with the extension .graph. Each file contains a full backup and can be moved, copied or deleted as required.
Backup schedule
By default, the StackGraph backups are created daily at 03:00 AM server time.
The backup schedule can be configured using the Helm value backup.stackGraph.scheduled.schedule, specified in Kubernetes cron schedule syntax (kubernetes.io).
Backup retention
By default, the StackGraph backups are kept for 30 days. As StackGraph backups are full backups, this can require a lot of storage.
The backup retention delta can be configured using the Helm value backup.stackGraph.scheduled.backupRetentionTimeDelta, specified in the format of GNU date --date argument. The default is 30 days ago. See Relative items in date strings for more examples.
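As an illustration, the two StackGraph values described above could be set together as follows. The cron expression and the retention string mirror the defaults described in this section; they are a sketch and not necessarily the chart's literal default values.

```yaml
backup:
  stackGraph:
    scheduled:
      # Daily at 03:00 (fields: minute hour day-of-month month day-of-week)
      schedule: "0 3 * * *"
      # Relative date string in the format accepted by GNU date --date
      backupRetentionTimeDelta: "30 days ago"
```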
Settings
Settings (previously called Configuration) include installed instances of StackPacks with their configuration and other customizations created by the user, such as monitors, custom views, and service tokens.
Settings backups are lightweight (typically only several megabytes) and quick to restore with minimal downtime. After a settings backup is restored, new data will be processed as before, recreating topology, health states, and alerts. However, topology history (including health) is not preserved in settings backups - for that purpose, use the StackGraph backup described above.
|
Settings backups are always enabled, regardless of the
|
Backup schedule
By default, settings backups are created daily at 04:00 AM server time.
The backup schedule can be configured using the Helm value backup.configuration.scheduled.schedule, specified in Kubernetes cron schedule syntax (kubernetes.io).
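A minimal sketch of overriding the settings backup schedule is shown below. The cron expression matches the 04:00 default described above and is only an example value.

```yaml
backup:
  configuration:
    scheduled:
      # Daily at 04:00 (fields: minute hour day-of-month month day-of-week)
      schedule: "0 4 * * *"
```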
Backup retention
Backup retention depends on the global.backup.enabled setting:
When global.backup.enabled is true (backups stored via S3 Proxy):
- By default, settings backups are kept for 365 days
- Configure retention using backup.configuration.scheduled.backupRetentionTimeDelta, specified in the format of the GNU date --date argument. The default is 365 days ago
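For example, a shorter retention window for settings backups stored via the S3 Proxy could be sketched like this. The 90-day value is purely illustrative.

```yaml
global:
  backup:
    enabled: true
backup:
  configuration:
    scheduled:
      # Illustrative value: keep settings backups for 90 days instead of the default 365
      backupRetentionTimeDelta: "90 days ago"
```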
When global.backup.enabled is false (backups stored to a dedicated PV):
- By default, the last 10 backup files are kept on the Persistent Volume
- Configure the maximum number of files using backup.configuration.maxLocalFiles (default: 10)
- Configure the PV size using backup.configuration.scheduled.pvc.size (default: 2Gi)
Example configuration for dedicated PV storage:
backup:
  configuration:
    maxLocalFiles: 10
    scheduled:
      pvc:
        size: '2Gi'
Metrics (Victoria Metrics)
Victoria Metrics uses a two-tier smart backup strategy:
- Hourly incremental: takes a snapshot and uploads only changed blocks to a fixed latest path in the backup storage.
- Daily snapshot: performs a server-side copy from latest to a timestamped folder (YYYYMMDDHHmmss), creating a point-in-time restore point.
|
High Availability deployments High Available deployments use two instances of Victoria Metrics ( |
Backup schedule
Victoria Metrics backups run on two schedules:
- Hourly incremental backups:
  - victoria-metrics-0: 25 minutes past the hour
  - victoria-metrics-1: 35 minutes past the hour
- Daily snapshot backups:
  - victoria-metrics-0: at 00:55 daily
  - victoria-metrics-1: at 01:05 daily
The backup schedules can be configured using the Helm values victoria-metrics-0.backup.scheduled.hourly and victoria-metrics-0.backup.scheduled.daily (and the corresponding victoria-metrics-1 values).
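The fragment below sketches how these values might be set so that they reproduce the schedule listed above. It assumes the values accept standard five-field cron expressions, which this page does not state explicitly for Victoria Metrics; check the chart's default values before relying on it.

```yaml
# Assumption: hourly/daily take five-field cron expressions
# (minute hour day-of-month month day-of-week).
victoria-metrics-0:
  backup:
    scheduled:
      hourly: '25 * * * *'   # 25 minutes past every hour
      daily: '55 0 * * *'    # 00:55 every day
victoria-metrics-1:
  backup:
    scheduled:
      hourly: '35 * * * *'   # 35 minutes past every hour
      daily: '5 1 * * *'     # 01:05 every day
```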
Backup retention
Daily snapshot backups are retained using a two-tier retention policy:
- The most recent daily backups are kept unconditionally (default: 7, configured via keepLastDaily).
- From older backups, one backup per 7-day period is kept (default: up to 4 weekly backups, configured via keepLastWeekly).
- Everything else is deleted automatically.
The retention can be configured using the following Helm values:
victoria-metrics-0:
  backup:
    keepLastDaily: 7 # Number of most recent daily backups to retain
    keepLastWeekly: 4 # Number of weekly backups to retain (beyond the daily window)
victoria-metrics-1:
  backup:
    keepLastDaily: 7
    keepLastWeekly: 4
Disable scheduled backups
To disable scheduled Victoria Metrics backups, set the backup schedules for both instances to a date far in the past:
victoria-metrics-0:
  backup:
    scheduled:
      hourly: '0 0 1 1 1970' # January 1, 1970 (epoch start)
      daily: '0 0 1 1 1970' # January 1, 1970 (epoch start)
victoria-metrics-1:
  backup:
    scheduled:
      hourly: '0 0 1 1 1970' # January 1, 1970 (epoch start)
      daily: '0 0 1 1 1970' # January 1, 1970 (epoch start)
OpenTelemetry (ClickHouse)
ClickHouse uses both incremental and full backups. The old backups are deleted based on the configured retention policy.
Backup schedule
The ClickHouse backups are created:
- Full Backup - at 00:45 every day
- Incremental Backup - 45 minutes past the hour (from 3 am to 12 am)
Backup retention
By default, the tooling keeps the last 308 backups (full and incremental), which corresponds to roughly 14 days of backups at the default schedule.
The backup retention can be configured using the Helm value clickhouse.backup.config.keep_remote.
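The default of 308 appears to follow from the default schedule: one full backup plus 21 incremental backups (at 45 minutes past the hour for hours 3 through 23) gives 22 backups per day, and 22 × 14 = 308. Under that assumption, a different retention window can be sketched by scaling the count. The 30-day figure below is only an illustration.

```yaml
clickhouse:
  backup:
    config:
      # 22 backups per day x 30 days = 660 (assumes the default backup schedules)
      keep_remote: 660
```

If the backup schedules are changed, the number of backups per day changes too, so the count would need to be recalculated.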
Disable scheduled backups
To disable scheduled ClickHouse backups, set both full and incremental backup schedules to a date far in the past:
clickhouse:
  backup:
    scheduled:
      full_schedule: '0 0 1 1 1970' # January 1, 1970 (epoch start)
      incremental_schedule: '0 0 1 1 1970' # January 1, 1970 (epoch start)
Telemetry data (Elasticsearch)
Elasticsearch snapshots are incremental.
Snapshot retention
By default, Elasticsearch snapshots are kept for 30 days, with a minimum of 5 snapshots and a maximum of 30 snapshots.
The retention time and number of snapshots kept can be configured using the following Helm values:
- backup.elasticsearch.scheduled.snapshotRetentionExpireAfter, specified in Elasticsearch time units (elastic.co).
- backup.elasticsearch.scheduled.snapshotRetentionMinCount
- backup.elasticsearch.scheduled.snapshotRetentionMaxCount
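Put together, the three values could be set as in the sketch below. The numbers mirror the defaults described above. Writing the expiry as the Elasticsearch time unit string "30d" and the counts as plain integers is an assumption about how the chart expects them.

```yaml
backup:
  elasticsearch:
    scheduled:
      snapshotRetentionExpireAfter: "30d"   # Elasticsearch time unit: 30 days
      snapshotRetentionMinCount: 5          # always keep at least 5 snapshots
      snapshotRetentionMaxCount: 30         # never keep more than 30 snapshots
```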