Self-hosted Archival setup
Archival is a feature that automatically backs up Event Histories and Visibility records from Temporal Cluster persistence to a custom blob store.
Workflow Execution Event Histories are backed up after the Retention Period is reached. Visibility records are backed up immediately after a Workflow Execution reaches a Closed status.
Archival enables Workflow Execution data to persist as long as needed, while not overwhelming the Cluster's persistence store.
This feature is helpful for compliance and debugging.
Temporal's Archival feature is considered experimental and not subject to normal versioning and support policy.
Archival is not supported when running Temporal through Docker and is disabled by default when installing the system manually and when deploying through helm charts (but can be enabled in the config).
How to set up Archival
Archival consists of the following elements:
- Configuration: Archival is controlled by the server configuration (i.e. the
config/development.yaml
file). - Provider: Location where the data should be archived. Supported providers are S3, GCloud, and the local file system.
- URI: Specifies which provider should be used. The system uses the URI schema and path to make the determination.
Take the following steps to set up Archival:
- Set up the provider of your choice.
- Configure Archival.
- Create a Namespace that uses a valid URI and has Archival enabled.
Providers
Temporal directly supports several providers:
- Local file system: The filestore archiver is used to archive data in the file system of whatever host the Temporal server is running on. In the case of temporal helm-charts, the archive data is stored in the
history
pod. APIs do not function with the filestore archive. This provider is used mainly for local installations and testing and should not be relied on for production environments. - Google Cloud: The gcloud archiver is used to connect and archive data with Google Cloud.
- S3: The s3store archiver is used to connect and archive data with S3.
- Custom: If you want to use a provider that is not currently supported, you can create your own archiver to support it.
Make sure that you save the provider's storage location URI in a place where you can reference it later, because it is passed as a parameter when you create a Namespace.
Configuration
Archival configuration is defined in the config/development.yaml
file.
Let's look at an example configuration:
# Cluster level Archival config
archival:
# Event History configuration
history:
# Archival is enabled at the cluster level
state: 'enabled'
enableRead: true
# Namespaces can use either the local filestore provider or the Google Cloud provider
provider:
filestore:
fileMode: '0666'
dirMode: '0766'
gstorage:
credentialsPath: '/tmp/gcloud/keyfile.json'
# Default values for a Namespace if none are provided at creation
namespaceDefaults:
# Archival defaults
archival:
# Event History defaults
history:
state: 'enabled'
# New Namespaces will default to the local provider
URI: 'file:///tmp/temporal_archival/development'
You can disable Archival by setting archival.history.state
and namespaceDefaults.archival.history.state
to "disabled"
.
Example:
archival:
history:
state: 'disabled'
namespaceDefaults:
archival:
history:
state: 'disabled'
The following table showcases acceptable values for each configuration and what purpose they serve.
Config | Acceptable values | Description |
---|---|---|
archival.history.state | enabled , disabled | Must be enabled to use the Archival feature with any Namespace in the cluster. |
archival.history.enableRead | true , false | Must be true to read from the archived Event History. |
archival.history.provider | Sub provider configs are filestore , gstorage , s3 , or your_custom_provider . | Default config specifies filestore . |
archival.history.provider.filestore.fileMode | File permission string | File permissions of the archived files. We recommend using the default value of "0666" to avoid read/write issues. |
archival.history.provider.filestore.dirMode | File permission string | Directory permissions of the archive directory. We recommend using the default value of "0766" to avoid read/write issues. |
namespaceDefaults.archival.history.state | enabled , disabled | Default state of the Archival feature whenever a new Namespace is created without specifying the Archival state. |
namespaceDefaults.archival.history.URI | Valid URI | Must be a URI of the file store location and match a schema that correlates to a provider. |
Additional resources: Cluster configuration reference.
Namespace creation
Although Archival is configured at the cluster level, it operates independently within each Namespace.
If an Archival URI is not specified when a Namespace is created, the Namespace uses the value of defaultNamespace.archival.history.URI
from the config/development.yaml
file.
The Archival URI cannot be changed after the Namespace is created.
Each Namespace supports only a single Archival URI, but each Namespace can use a different URI.
A Namespace can safely switch Archival between enabled
and disabled
states as long as Archival is enabled at the cluster level.
Archival is supported in Global Namespaces (Namespaces that span multiple clusters). When Archival is running in a Global Namespace, it first runs on the active cluster; later it runs on the standby cluster. Before archiving, a history check is done to see what has been previously archived.
Test setup
To test Archival locally, start by running a Temporal server:
./temporal-server start
Then register a new Namespace with Archival enabled.
./temporal operator namespace create --namespace="my-namespace" --global false --history-archival-state="enabled" --retention="4d"
If the retention period isn't set, it defaults to 72h. The minimum retention period is one day. The maximum retention period is 30 days.
Setting the retention period to 0 results in the error A valid retention period is not set on request.
Next, run a sample Workflow such as the helloworld temporal sample.
When execution is finished, Archival occurs.
Retrieve archives
You can retrieve archived Event Histories by copying the workflowId
and runId
of the completed Workflow from the log output and running the following command:
./temporal workflow show --workflow-id="my-workflow-id" --run-id="my-run-id" --namespace="my-namespace"
How to create a custom Archiver
To archive data with a given provider, using the Archival feature, Temporal must have a corresponding Archiver component installed. The platform does not limit you to the existing providers. To use a provider that is not currently supported, you can create your own Archiver.
Create a new package
The first step is to create a new package for your implementation in /common/archiver. Create a directory in the archiver folder and arrange the structure to look like the following:
temporal/common/archiver
- filestore/ -- Filestore implementation
- provider/
- provider.go -- Provider of archiver instances
- yourImplementation/
- historyArchiver.go -- HistoryArchiver implementation
- historyArchiver_test.go -- Unit tests for HistoryArchiver
- visibilityArchiver.go -- VisibilityArchiver implementations
- visibilityArchiver_test.go -- Unit tests for VisibilityArchiver
Archiver interfaces
Next, define objects that implement the HistoryArchiver and the VisibilityArchiver interfaces.
The objects should live in historyArchiver.go
and visibilityArchiver.go
, respectively.
Update provider
Update the GetHistoryArchiver
and GetVisibilityArchiver
methods of the archiverProvider
object in the /common/archiver/provider/provider.go file so that it knows how to create an instance of your archiver.
Add configs
Add configs for your archiver to the config/development.yaml
file and then modify the HistoryArchiverProvider and VisibilityArchiverProvider structs in /common/common/config.go
accordingly.
Custom archiver FAQ
If my custom Archive method can automatically be retried by the caller, how can I record and access progress between retries?
Handle this situation by using ArchiverOptions
.
Here is an example:
func(a * Archiver) Archive(ctx context.Context, URI string, request * ArchiveRequest, opts...ArchiveOption) error {
featureCatalog: = GetFeatureCatalog(opts...) // this function is defined in options.go
var progress progress
// Check if the feature for recording progress is enabled.
if featureCatalog.ProgressManager != nil {
if err: = featureCatalog.ProgressManager.LoadProgress(ctx, & prevProgress);
err != nil {
// log some error message and return error if needed.
}
}
// Your archiver implementation...
// Record current progress
if featureCatalog.ProgressManager != nil {
if err: = featureCatalog.ProgressManager.RecordProgress(ctx, progress);
err != nil {
// log some error message and return error if needed.
}
}
}
If my Archive
method encounters an error that is non-retryable, how do I indicate to the caller that it should not retry?
func(a * Archiver) Archive(ctx context.Context, URI string, request * ArchiveRequest, opts...ArchiveOption) error {
featureCatalog: = GetFeatureCatalog(opts...) // this function is defined in options.go
err: = youArchiverImpl()
if nonRetryableErr(err) {
if featureCatalog.NonRetryableError != nil {
return featureCatalog.NonRetryableError() // when the caller gets this error type back it will not retry anymore.
}
}
}
How does my history archiver implementation read history?
The archiver package provides a utility called HistoryIterator which is a wrapper of ExecutionManager.
HistoryIterator
is more simple than the HistoryManager
, which is available in the BootstrapContainer, so archiver implementations can choose to use it when reading Workflow histories.
See the historyIterator.go file for more details.
Use the filestore historyArchiver implementation as an example.
Should my archiver define its own error types?
Each archiver is free to define and return its own errors. However, many common errors that exist between archivers are already defined in common/archiver/constants.go.
Is there a generic query syntax for the visibility archiver?
Currently, no. But this is something we plan to do in the future. As for now, try to make your syntax similar to the one used by our advanced list Workflow API.