AWS Deployment using EFS
Note
- If you set `backup_config` to `efs` in `config.toml` (that is, `backup_config = "efs"`), backup is already configured during deployment, and the steps below are not required and can be skipped. If you left `backup_config` blank, the backup must be configured manually.
Overview
A shared file system is always required to create OpenSearch snapshots. To register the snapshot repository with OpenSearch, mount the same shared file system at the same location on all master and data nodes, and register that location in the `path.repo` setting on all master and data nodes.
Setting up the backup configuration
Create an EFS file system; for sample steps, please refer here.
Create the folder structure `/mnt/automate_backups/` on all the frontend and backend nodes, then mount the EFS on all the VMs manually. To do that, please refer to this.
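The manual mount step can be sketched as a dry run that only prints the command to review. This is a minimal sketch, not the guide's official procedure: the file system ID and region are placeholders, `efs_mount_cmd` is a hypothetical helper name, and the NFS options are the ones commonly recommended by AWS for EFS.

```shell
#!/bin/sh
# Sketch: print the NFS mount command for an EFS file system (dry run only).
# The file system ID and region used below are placeholders, not values from this guide.

efs_mount_cmd() (
  fs_id="$1"
  region="$2"
  mount_point="${3:-/mnt/automate_backups}"
  # NFS client options commonly recommended by AWS for EFS.
  opts="nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"
  printf 'sudo mount -t nfs4 -o %s %s.efs.%s.amazonaws.com:/ %s\n' \
    "$opts" "$fs_id" "$region" "$mount_point"
)

# Review the printed command, then run it on each frontend and backend node.
efs_mount_cmd fs-0123456789abcdef0 us-east-1
```

Printing the command first makes it easy to confirm that every node mounts the same file system at the same path before anything is changed.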
Configuration in OpenSearch Node
Mount the EFS on all OpenSearch nodes. For example, mount the EFS at `/mnt/automate_backups/`.
Create an `opensearch` sub-directory and set permissions as shown below (on all the OpenSearch nodes):
sudo mkdir -p /mnt/automate_backups/opensearch
sudo chown hab:hab /mnt/automate_backups/opensearch/
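After creating the sub-directory, it may be worth confirming the ownership on each OpenSearch node. The following is a minimal sketch; `owner_of` and `has_owner` are hypothetical helper names, and the check relies only on `ls -ld` output.

```shell
#!/bin/sh
# Sketch: confirm the snapshot directory has the expected owner and group.

# Print "owner:group" for a path, as reported by `ls -ld`.
owner_of() {
  ls -ld "$1" | awk '{ print $3 ":" $4 }'
}

# Succeed only if the path is a directory owned by the expected owner:group.
has_owner() {
  [ -d "$1" ] && [ "$(owner_of "$1")" = "$2" ]
}

repo_dir="/mnt/automate_backups/opensearch"
if has_owner "$repo_dir" "hab:hab"; then
  echo "OK: $repo_dir is owned by hab:hab"
else
  echo "CHECK: $repo_dir is missing or not owned by hab:hab"
fi
```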
Configuration for OpenSearch Node from Provision host
Configure the OpenSearch `path.repo` attribute.
Create a TOML file (`os_config.toml`) and add the template below:
[path]
repo = "/mnt/automate_backups/opensearch"
Patch the config `os_config.toml` from the bastion to the OpenSearch cluster:
chef-automate config patch --opensearch os_config.toml
The above command restarts the OpenSearch cluster.
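The file creation step can also be scripted. This is a small sketch under stated assumptions: `write_os_config` is a hypothetical helper, and the file is written to a temporary directory here rather than to the bastion's working directory.

```shell
#!/bin/sh
# Sketch: generate os_config.toml with the path.repo setting from this guide.

write_os_config() {
  # $1 = snapshot repository path, $2 = output file
  cat > "$2" <<EOF
[path]
repo = "$1"
EOF
}

out_file="${TMPDIR:-/tmp}/os_config.toml"
write_os_config "/mnt/automate_backups/opensearch" "$out_file"
cat "$out_file"

# Then, from the bastion host (this restarts the OpenSearch cluster):
#   chef-automate config patch --opensearch "$out_file"
```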
Healthcheck commands
The following commands can be run on the OpenSearch node.
Check whether the OpenSearch service is up:
hab svc status
Check whether all the indices are green:
curl -k -X GET "https://localhost:9200/_cat/indices/*?v=true&s=index&pretty" -u admin:admin
Watch for a message about OpenSearch going from RED to GREEN:
journalctl -u hab-sup -f | grep 'automate-ha-opensearch'
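Scanning the index listing by eye gets tedious on larger clusters. The sketch below filters the tabular `_cat/indices` output for indices that are not green. `non_green_indices` is a hypothetical helper, the sample rows are made up for illustration, and the filter assumes the header row produced by `v=true` with health in the first column and the index name in the third.

```shell
#!/bin/sh
# Sketch: list indices whose health is not "green" from `_cat/indices?v` output.

# Reads the tabular output on stdin (header row first, as produced by v=true)
# and prints "<index> <health>" for every index that is not green.
non_green_indices() {
  awk 'NR > 1 && $1 != "green" { print $3, $1 }'
}

# Illustrative input only; these index names and numbers are made up.
sample='health status index           uuid pri rep docs.count docs.deleted store.size pri.store.size
green  open   example-index-a aaaa 5   1   10         0            1mb        500kb
yellow open   example-index-b bbbb 5   1   3          0            200kb      200kb'

printf '%s\n' "$sample" | non_green_indices
```

On a node, the same filter could be fed from the curl command above by adding `-s` to silence the progress meter and piping the output into `non_green_indices`. An empty result means every listed index is green.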
Configuration for Automate node from Bastion host
Mount the EFS on all the frontend nodes manually. For example, mount the EFS at `/mnt/automate_backups`.
Create an `automate.toml` file on the bastion host using the following command:
touch automate.toml
Add the following configuration to `automate.toml` on the bastion host:
[global.v1.external.opensearch.backup]
enable = true
location = "fs"

[global.v1.external.opensearch.backup.fs]
# The `path.repo` setting you've configured on your OpenSearch nodes must be a parent directory of the setting you configure here:
path = "/mnt/automate_backups/opensearch"

[global.v1.backups.filesystem]
path = "/mnt/automate_backups/backups"
Patch the config using the command below:
./chef-automate config patch --frontend automate.toml
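The comment in the configuration requires the OpenSearch backup `path` to sit at or below the `path.repo` value. A quick string check can catch a mismatch before patching. This is a minimal sketch; `is_under` is a hypothetical helper that compares path strings only and does not resolve symlinks.

```shell
#!/bin/sh
# Sketch: check that a backup path sits at or below the OpenSearch path.repo.

is_under() (
  # $1 = parent directory (path.repo), $2 = candidate path
  parent="${1%/}"
  child="${2%/}"
  case "$child" in
    "$parent" | "$parent"/*) true ;;
    *) false ;;
  esac
)

path_repo="/mnt/automate_backups/opensearch"
backup_path="/mnt/automate_backups/opensearch"
if is_under "$path_repo" "$backup_path"; then
  echo "OK: $backup_path is inside $path_repo"
else
  echo "MISMATCH: $backup_path is outside $path_repo"
fi
```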
Backup and Restore commands
Backup
Run the backup command from bastion as shown below to create a backup:
chef-automate backup create
Restoring the EFS Backed-up Data
To restore backed-up data of Chef Automate High Availability (HA) using Elastic File System (EFS), follow the steps given below:
Check the status of all Chef Automate and Chef Infra Server frontend nodes by executing the `chef-automate status` command.
Execute the restore command from the bastion:
chef-automate backup restore <BACKUP-ID> -b /mnt/automate_backups/backups --airgap-bundle </path/to/bundle>
Note
- If you are restoring the backup from an older version, you need to provide the `--airgap-bundle </path/to/current/bundle>` option.
- Large Compliance Report is not supported in Automate HA.
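The restore invocation can be assembled as a dry run, so the bundle option is included only when a bundle path is supplied. This is a sketch that prints the command rather than running it. `restore_cmd` is a hypothetical helper, and it assumes the `--airgap-bundle` option can be left out when no bundle is needed, as the note above suggests.

```shell
#!/bin/sh
# Sketch: build the restore command as text (nothing is executed).

restore_cmd() (
  backup_id="$1"
  backup_dir="${2:-/mnt/automate_backups/backups}"
  bundle="${3:-}"
  cmd="chef-automate backup restore $backup_id -b $backup_dir"
  # Append the bundle option only when a bundle path was given.
  if [ -n "$bundle" ]; then
    cmd="$cmd --airgap-bundle $bundle"
  fi
  printf '%s\n' "$cmd"
)

# Placeholders are kept exactly as they appear in this guide.
restore_cmd "<BACKUP-ID>" "" "</path/to/current/bundle>"
```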
Troubleshooting
Try these steps if Chef Automate returns an error while restoring data.
Check the Chef Automate status.
chef-automate status
Check the status of your Habitat service on the Automate node.
hab svc status
If the deployment services are not healthy, reload them.
hab svc load chef/deployment-service
Now check the status of the Automate node and then try running the restore command from the bastion host.
How to change the `base_path` or `path`
The steps for the file system backup are shown below:
- At the time of deployment, the default value of `backup_mount` is `/mnt/automate_backups`.
- If you modify `backup_mount` in `config.toml` before deployment, the deployment process does the configuration with the updated value.
- If you change the `backup_mount` value after deployment, you need to patch the configuration manually on all the frontend and backend nodes. For example, suppose you change `backup_mount` to `/bkp/backps`:
Update the FE nodes with the template below, using the command `chef-automate config patch fe.toml --fe`:
[global.v1.backups]
[global.v1.backups.filesystem]
path = "/bkp/backps"
[global.v1.external.opensearch.backup]
[global.v1.external.opensearch.backup.fs]
path = "/bkp/backps"
Update the OpenSearch nodes with the template below, using the command `chef-automate config patch os.toml --os`:
[path]
repo = "/bkp/backps"
Run the following curl request on one of the Automate frontend nodes:
curl localhost:10144/_snapshot?pretty
- If the response is empty (`{}`), no further action is needed.
- If the response has JSON output, it should have the correct value for `backup_mount`. Refer to the `location` value in the response; it should start with `/bkp/backps`.
{
  "chef-automate-es6-event-feed-service" : {
    "type" : "fs",
    "settings" : {
      "location" : "/mnt/automate_backups/opensearch/automate-elasticsearch-data/chef-automate-es6-event-feed-service"
    }
  },
  "chef-automate-es6-compliance-service" : {
    "type" : "fs",
    "settings" : {
      "location" : "/mnt/automate_backups/opensearch/automate-elasticsearch-data/chef-automate-es6-compliance-service"
    }
  },
  "chef-automate-es6-ingest-service" : {
    "type" : "fs",
    "settings" : {
      "location" : "/mnt/automate_backups/opensearch/automate-elasticsearch-data/chef-automate-es6-ingest-service"
    }
  },
  "chef-automate-es6-automate-cs-oc-erchef" : {
    "type" : "fs",
    "settings" : {
      "location" : "/mnt/automate_backups/opensearch/automate-elasticsearch-data/chef-automate-es6-automate-cs-oc-erchef"
    }
  }
}
- If the prefix of the `location` value does not match `backup_mount`, you need to delete the existing snapshots. Use the script below to delete the snapshots from one of the Automate frontend nodes.
snapshot=$(curl -XGET "http://localhost:10144/_snapshot?pretty" | jq 'keys[]')
for name in $snapshot; do
  key=$(echo $name | tr -d '"')
  curl -XDELETE "localhost:10144/_snapshot/$key?pretty"
done
- The above script requires `jq` to be installed. You can install it from the airgap bundle. Use the following command on one of the Automate frontend nodes to locate the `jq` package:
ls -ltrh /hab/cache/artifacts/ | grep jq
-rw-r--r--. 1 ec2-user ec2-user 730K Dec 8 08:53 core-jq-static-1.6-20220312062012-x86_64-linux.hart
-rw-r--r--. 1 ec2-user ec2-user 730K Dec 8 08:55 core-jq-static-1.6-20190703002933-x86_64-linux.hart
- If there are multiple `jq` versions, install the latest one. Use the command below to install the `jq` package on the Automate frontend node:
hab pkg install /hab/cache/artifacts/core-jq-static-1.6-20190703002933-x86_64-linux.hart -bf
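The `location` check described above can be done without `jq` by scanning the pretty-printed response. The following is a minimal sketch. `locations_outside` is a hypothetical helper, and it assumes the response is pretty-printed with one `"location"` entry per line, as `_snapshot?pretty` returns in the sample above.

```shell
#!/bin/sh
# Sketch: list snapshot repository locations that do not start with the expected mount.

# Reads pretty-printed `_snapshot?pretty` JSON on stdin and prints every
# "location" value that is not at or below the given prefix.
locations_outside() (
  prefix="${1%/}"
  sed -n 's/.*"location"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' |
    while IFS= read -r loc; do
      case "$loc" in
        "$prefix" | "$prefix"/*) ;;
        *) printf '%s\n' "$loc" ;;
      esac
    done
)

# Sample shaped like the response shown in this guide (one repository only).
sample='{
  "chef-automate-es6-ingest-service" : {
    "type" : "fs",
    "settings" : {
      "location" : "/mnt/automate_backups/opensearch/automate-elasticsearch-data/chef-automate-es6-ingest-service"
    }
  }
}'

printf '%s\n' "$sample" | locations_outside /bkp/backps
```

Any path printed by the helper still points at the old mount, which indicates that the existing snapshots need to be deleted as described above. No output means every location already matches.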
Steps for object storage as a backup option
- At the time of deployment, `backup_config` will be `object_storage`.
- To use `object_storage`, the following template is used at the time of deployment:
[object_storage.config]
google_service_account_file = ""
location = ""
bucket_name = ""
access_key = ""
secret_key = ""
endpoint = ""
region = ""
- If you configured it before deployment, no further action is needed.
- If you want to change the `bucket` or `base_path`, use the template below for the frontend nodes:
[global.v1]
[global.v1.external.opensearch.backup.s3]
bucket = "<BUCKET_NAME>"
base_path = "opensearch"
[global.v1.backups.s3.bucket]
name = "<BUCKET_NAME>"
base_path = "automate"
- You can choose any value for the variable `base_path`. The `base_path` patch is only required for the frontend nodes.
- Use the following command to apply the above template:
chef-automate config patch frontend.toml --fe
- After the configuration patch, use the following curl request to validate:
curl localhost:10144/_snapshot?pretty
- If the response is empty (`{}`), no further action is needed.
- If the response has JSON output, it should have the correct value for the `base_path`:
```sh
{
  "chef-automate-es6-event-feed-service" : {
    "type" : "s3",
    "settings" : {
      "bucket" : "MY-BUCKET",
      "base_path" : "opensearch/automate-elasticsearch-data/chef-automate-es6-event-feed-service",
      "readonly" : "false",
      "compress" : "false"
    }
  },
  "chef-automate-es6-compliance-service" : {
    "type" : "s3",
    "settings" : {
      "bucket" : "MY-BUCKET",
      "base_path" : "opensearch/automate-elasticsearch-data/chef-automate-es6-compliance-service",
      "readonly" : "false",
      "compress" : "false"
    }
  },
  "chef-automate-es6-ingest-service" : {
    "type" : "s3",
    "settings" : {
      "bucket" : "MY-BUCKET",
      "base_path" : "opensearch/automate-elasticsearch-data/chef-automate-es6-ingest-service",
      "readonly" : "false",
      "compress" : "false"
    }
  },
  "chef-automate-es6-automate-cs-oc-erchef" : {
    "type" : "s3",
    "settings" : {
      "bucket" : "MY-BUCKET",
      "base_path" : "opensearch/automate-elasticsearch-data/chef-automate-es6-automate-cs-oc-erchef",
      "readonly" : "false",
      "compress" : "false"
    }
  }
}
```
- If the `base_path` value does not match, you have to delete the existing snapshots. Refer to the file system steps above.
- While at the time of deployment