Encrypted Terraform state with OpenStack and Ceph

Tomáš Sapák
Jun 1, 2021

You’ll probably remember the first time you created an OpenStack/AWS instance with Terraform instead of the web GUI. You’ll want to share this moment with your colleagues by pushing your Terraform code to Git. Git is an excellent tool for versioning Terraform code, and it enables teams to collaborate on infrastructure management by using the same code.

You’ll soon encounter the first issue: the instance you bumped from m5.large to m5.xlarge yesterday is running out of memory again today. Then there might be another problem: somebody is mining Bitcoin on your instance (this could ruin your day, not because of wasted company CPU hours, but because you bought Bitcoin at 63k and somebody is mining it on your instance for free).

Local state

The source of both these issues is the local Terraform state. Terraform state contains information about managed resources. If you work with secrets in your Terraform code, those secrets will, in some cases, also be stored in the state. By default, Terraform state is stored locally in your Terraform directory. By pushing the state to the Git repository, you’ve enabled your colleagues to work with your infrastructure. Because the state is updated only when someone pushes to or pulls from Git, you’ll usually end up with a state that differs from your colleagues’ copies and is inconsistent with the actual state of your infrastructure. Worse, all your secrets are available to every user with access to your Git repository.

Remote state

After dealing with the compromised instance (we are in the cloud, so let’s bomb it), you’ll need to find a replacement for the local state. Remote backends solve both mentioned issues. As the name suggests, remote backends store Terraform state remotely, ensuring that all users work with the actual state. Usually, some locking is implemented to support concurrency.

Currently, more than ten remote backends are supported. The lowest-effort solution is usually to use object storage provided by the cloud provider (who wants to maintain yet another service?). The first choice for every OpenStack engineer is Swift object storage, which is supported by most private and public OpenStack cloud providers.

Swift

Using Swift has one crucial downside. Unlike the AWS S3 backend, it doesn’t support any encryption, and objects in buckets are accessible to any OpenStack project member or any OpenStack admin. Because you probably don’t want your nosy front-end colleague to have access to the password of the database with salaries of your whole company, Swift is out of the picture.

Ceph Object Gateway

According to an OpenStack survey from 2020 (https://www.openstack.org/analytics), 74% of all OpenStack deployments use Ceph RBD for block storage. Ceph is additionally compatible with the Swift and S3 protocols via its Ceph Object Gateway service. If your deployment counts among those lucky 74%, your object storage is probably also implemented by Ceph Object Gateway (this might not be the case for all deployments), and you have an S3-compatible backend with SSE-C support at your disposal.

Infrastructure requirements

As mentioned before, your infrastructure will need to meet several requirements:

Credentials

The Ceph Object Gateway S3 API is not compatible with classical Keystone credentials. It instead supports EC2 credentials, which can be generated with the following command:

openstack ec2 credentials create
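If you want to script this step, the following is a minimal sketch rather than a definitive implementation. It assumes the openstack client's shell output formatter (-f shell) prints key="value" lines and that the columns are named access and secret; the function name ec2_credentials_as_env is hypothetical.

```shell
# Create an EC2 credential pair and print it as export lines for the S3 backend.
# Assumption: "-f shell" prints lines such as access="..." and secret="...".
ec2_credentials_as_env() {
    # -c limits the output to the two columns we need
    _creds=$(openstack ec2 credentials create -f shell -c access -c secret) || return 1
    _access=$(printf '%s\n' "$_creds" | sed -n 's/^access="\(.*\)"$/\1/p')
    _secret=$(printf '%s\n' "$_creds" | sed -n 's/^secret="\(.*\)"$/\1/p')
    # Fail instead of printing empty values if the output was not as expected
    [ -n "$_access" ] && [ -n "$_secret" ] || return 1
    printf 'export AWS_ACCESS_KEY_ID=%s\n' "$_access"
    printf 'export AWS_SECRET_ACCESS_KEY=%s\n' "$_secret"
}
```

Note that every call creates a new credential pair, so run it once and keep the output somewhere safe.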

The openstack ec2 credentials create command generates an access key and a secret key for you. Additionally, you’ll need to generate an SSE customer key:

dd if=/dev/random of=/tmp/ssec.key bs=1 count=32
AWS_SSE_CUSTOMER_KEY=$(base64 /tmp/ssec.key)
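SSE-C with AES-256 expects a 256-bit key, so the base64 value must decode to exactly 32 bytes. The sketch below is one possible variant that avoids leaving the raw key in /tmp and adds a length check; the function names are hypothetical, and it assumes head -c and base64 --decode are available on your system.

```shell
# Print a fresh random 256-bit key, base64-encoded (nothing is written to disk).
generate_sse_customer_key() {
    head -c 32 /dev/urandom | base64
}

# Succeed only if the argument decodes to exactly 32 bytes.
is_valid_sse_customer_key() {
    _len=$(printf '%s' "$1" | base64 --decode 2>/dev/null | wc -c | tr -d ' ')
    [ "$_len" -eq 32 ]
}
```

For example, AWS_SSE_CUSTOMER_KEY=$(generate_sse_customer_key) followed by is_valid_sse_customer_key "$AWS_SSE_CUSTOMER_KEY" confirms the key has the expected length before you hand it to Terraform.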

Terraform will use these secrets to authenticate to OpenStack and to decrypt Terraform’s remote state. There are several ways to provide this information to Terraform. Probably the most Git-safe way is to create a .env file with the following content:

export AWS_ACCESS_KEY_ID=access_key_value
export AWS_SECRET_ACCESS_KEY=secret_key_value
export AWS_SSE_CUSTOMER_KEY=sse_key_value

and run source .env before any terraform command. Just remember to add .env to .gitignore.
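Creating the file and ignoring it in Git can be done in one step. This is a minimal sketch with a hypothetical helper name, write_env_file; it assumes the values contain no single quotes, which holds for hex access keys and base64 data.

```shell
# Write the three variables to <dir>/.env with owner-only permissions
# and make sure .env is listed in <dir>/.gitignore.
# Arguments: directory, access key, secret key, base64 SSE customer key.
write_env_file() {
    _dir=$1
    (
        umask 077   # a newly created file is readable by the owner only
        {
            printf "export AWS_ACCESS_KEY_ID='%s'\n" "$2"
            printf "export AWS_SECRET_ACCESS_KEY='%s'\n" "$3"
            printf "export AWS_SSE_CUSTOMER_KEY='%s'\n" "$4"
        } > "$_dir/.env"
    ) || return 1
    # Also tighten permissions if the file already existed
    chmod 600 "$_dir/.env" || return 1
    # Append .env to .gitignore unless that exact line is already present
    grep -qxF '.env' "$_dir/.gitignore" 2>/dev/null ||
        printf '.env\n' >> "$_dir/.gitignore"
}
```

Calling it twice rewrites .env but leaves a single .env entry in .gitignore.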

Terraform backend configuration

terraform {
  backend "s3" {
    endpoint                    = "object-store service fqdn"
    bucket                      = "container/bucket name"
    encrypt                     = true
    force_path_style            = true
    key                         = "state file name"
    region                      = "object-store region"
    skip_credentials_validation = true
    skip_region_validation      = true
    workspace_key_prefix        = "prefix inside a bucket"
  }
}

Configuration options:

  • endpoint: object store service FQDN, the public value from the openstack catalog list | grep object-store command
  • bucket: container name in Swift terminology, bucket in S3. It needs to be created before the first run with openstack container create container-name
  • encrypt: enables server-side encryption of the state; with AWS_SSE_CUSTOMER_KEY set, the key you supply is used (SSE-C)
  • force_path_style: use https://<HOST>/<BUCKET> instead of https://<BUCKET>.<HOST>
  • key: name of the object that holds the state file
  • region: object-store region, the region name that prefixes the public value in the output of openstack catalog list | grep object-store
  • skip_credentials_validation: skips validating credentials via the STS API. Since we don’t use AWS, this validation must be disabled by setting the option to true!
  • skip_region_validation: skips validating the region name. As before, we don’t use AWS, so it needs to be true.
  • workspace_key_prefix: folder(s) in the container/bucket where the state file is created for non-default workspaces. The whole state file path looks like /workspace_key_prefix/workspace/key.
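To make the workspace layout concrete, here is a small sketch that computes where the state object ends up. The helper name state_object_path is hypothetical; it relies on the S3 backend storing the default workspace directly under key and placing other workspaces under the prefix, and it prints the object name without a leading slash.

```shell
# Print the object name of the state file for a workspace.
# Arguments: workspace_key_prefix, workspace name, key.
state_object_path() {
    if [ "$2" = "default" ]; then
        # The default workspace ignores the prefix
        printf '%s\n' "$3"
    else
        printf '%s/%s/%s\n' "$1" "$2" "$3"
    fi
}
```

With workspace_key_prefix set to tf and key set to terraform.tfstate, the staging workspace resolves to tf/staging/terraform.tfstate, while the default workspace stays at terraform.tfstate.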

That’s it. After the first terraform apply run, an encrypted state file will be created. You can retrieve your state file with S3 tools, which require the SSE customer key and decrypt the state automatically on download. You can also use the Swift client to retrieve the state in its encrypted form (which is pretty cool because you can use it for a tertiary backup).
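As one example of an S3 tool, the AWS CLI can download and decrypt the state. This is a sketch under assumptions: it uses the aws s3 cp options --sse-c and --sse-c-key, which expect the raw (not base64-encoded) key, so it reads the key file written by the dd command earlier. The function name fetch_decrypted_state and the sample values in the usage note are hypothetical, and credentials are taken from the AWS_* variables in your environment.

```shell
# Download the state object and let the gateway decrypt it with the SSE-C key.
# Arguments: endpoint URL, bucket, object name, raw key file, output file.
fetch_decrypted_state() {
    aws s3 cp "s3://$2/$3" "$5" \
        --endpoint-url "$1" \
        --sse-c AES256 \
        --sse-c-key "fileb://$4"
}
```

A call could look like fetch_decrypted_state https://object-store.example.org my-bucket terraform.tfstate /tmp/ssec.key state.json, with the endpoint and bucket replaced by your own values.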

So are your secrets completely secure? Sadly not, because Ceph admins can probably still log SSE-C keys on Ceph Object Gateway and use them for decryption. But you’ve significantly decreased the risk of a data breach by lowering the number of people with direct access to the state file. And who can you trust if not the storage guys, right?

Tomáš Sapák

DevOps engineer, automation, and orchestration enthusiast. Love working with AWS and OpenStack.