Health - Terraform Enterprise | Terraform

HCP Terraform can perform automatic health assessments in a workspace to assess whether its real infrastructure matches the requirements defined in its Terraform configuration. Health assessments include the following types of evaluations:

Drift detection determines whether your real-world infrastructure matches your Terraform configuration.
Continuous validation determines whether custom conditions in the workspace’s configuration continue to pass after Terraform provisions the infrastructure.

When you enable health assessments, HCP Terraform periodically runs health assessments for your workspace. Refer to Health Assessment Scheduling for details.

Permissions

Working with health assessments requires the following permissions:

To view health status for a workspace, you need read access to that workspace.
To change organization health settings, you must be an organization owner.
To change a workspace’s health settings, you must be an administrator for that workspace.

Workspace requirements

Workspaces require the following settings to receive health assessments:

Terraform version 0.15.4+ for drift detection only
Terraform version 1.3.0+ for drift detection and continuous validation
Remote execution mode or Agent execution mode for Terraform runs

The latest Terraform run in the workspace must have been successful. If the most recent run ended in an errored, canceled, or discarded state, HCP Terraform pauses health assessments until there is a successfully applied run.

The workspace must also have at least one run in which Terraform successfully applies a configuration. HCP Terraform does not perform health assessments in workspaces with no real-world infrastructure.

Enable health assessments

You can enforce health assessments across all eligible workspaces in an organization within the organization settings. Enforcing health assessments at an organization-level overrides workspace-level settings. You can only enable health assessments within a specific workspace when HCP Terraform is not enforcing health assessments at the organization level.

To enable health assessments within a workspace:

Verify that your workspace satisfies the requirements.
Go to the workspace and click Settings > Health.
Select Enable under Health Assessments.
Click Save settings.

Health assessment scheduling

When you enable health assessments for a workspace, HCP Terraform runs the first health assessment based on whether there are active Terraform runs for the workspace:

No active runs: A few minutes after you enable the feature.
Active speculative plan: A few minutes after that plan is complete.
Other active runs: During the next assessment period.

After the first health assessment, HCP Terraform starts a new health assessment during the next assessment period if there are no active runs in the workspace. Health assessments may take longer to complete when you enable health assessments in many workspaces at once or your workspace contains a complex configuration with many resources.

A health assessment never interrupts or interferes with runs. If you start a new run during a health assessment, HCP Terraform cancels the current assessment and runs the next assessment during the next assessment period. This behavior may prevent HCP Terraform from performing health assessments in workspaces with frequent runs.

HCP Terraform pauses health assessments if the latest run ended in an errored state. This behavior occurs for all run types, including plan-only runs and speculative plans. Once the workspace completes a successful run, HCP Terraform restarts health assessments during the next assessment period.

Terraform Enterprise administrators can modify their installation's assessment frequency and number of maximum concurrent assessments from the admin settings console.

Concurrency

If you enable health assessments on multiple workspaces, assessments may run concurrently. Health assessments do not affect your concurrency limit. HCP Terraform also monitors and controls health assessment concurrency to avoid issues for large-scale deployments with thousands of workspaces. However, HCP Terraform performs health assessments in batches, so health assessments may take longer to complete when you enable them in a large number of workspaces.

Notifications

HCP Terraform sends notifications about health assessment results according to your workspace’s settings.

Workspace health status

On the organization's Workspaces page, HCP Terraform displays a Health warning status for workspaces with infrastructure drift or failed continuous validation checks.

On the right of a workspace’s overview page, HCP Terraform displays a Health bar that summarizes the results of the last health assessment.

The Drift summary shows the total number of resources in the configuration and the number of resources that have drifted.
The Checks summary shows the number of passed, failed, and unknown statuses for objects with continuous validation checks.

Drift detection

Drift detection helps you identify situations where your actual infrastructure no longer matches the configuration defined in Terraform. This deviation is known as configuration drift. Configuration drift occurs when changes are made outside Terraform's regular process, leading to inconsistencies between the remote objects and your configured infrastructure.

For example, a teammate could create configuration drift by directly updating a storage bucket's settings with conflicting configuration settings using the cloud provider's console. Drift detection could detect these differences and recommend steps to address and rectify the discrepancies.

Configuration drift differs from state drift. Drift detection does not detect state drift.

Configuration drift happens when external changes affecting remote objects invalidate your infrastructure configuration. State drift occurs when external changes affecting remote objects do not invalidate your infrastructure configuration. Refer to Refresh-Only Mode to learn more about remediating state drift.

View workspace drift

To view the drift detection results from the latest health assessment, go to the workspace and click Health > Drift. If there is configuration drift, HCP Terraform proposes the necessary changes to bring the infrastructure back in sync with its configuration.

Resolve drift

You can use one of the following approaches to correct configuration drift:

Overwrite drift: If you do not want the drift's changes, queue a new plan and apply the changes to revert your real-world infrastructure to match your Terraform configuration.
Update Terraform configuration: If you want the drift's changes, modify your Terraform configuration to include the changes and push a new configuration version. This prevents Terraform from reverting the drift during the next apply. Refer to the Manage Resource Drift tutorial for a detailed example.

Continuous validation

Continuous validation regularly verifies whether your configuration’s custom assertions continue to pass, validating your infrastructure. For example, you can monitor whether your website returns an expected status code, or whether an API gateway certificate is valid. Identifying failed assertions helps you resolve the failure and prevent errors during your next time Terraform operation.

Continuous validation evaluates preconditions, postconditions, and check blocks as part of an assessment, but we recommend using check blocks for post-apply monitoring. Use check blocks to create custom rules to validate your infrastructure's resources, data sources, and outputs.

Preventing false positives

Health assessments create a speculative plan to access the current state of your infrastructure. Terraform evaluates any check blocks in your configuration as the last step of creating the speculative plan. If your configuration relies on data sources and the values queried by a data source change between the time of your last run and the assessment, the speculative plan will include those changes. HCP Terraform will not modify your infrastructure as part of an assessment, but it can use those updated values to evaluate checks. This may lead to false positive results for alerts since your infrastructure did not yet change.

To ensure your checks evaluate the current state of your configuration instead of against a possible future change, use nested data sources that query your actual resource configuration, rather than a computed latest value. Refer to the AMI image scenario below for an example.

Example use cases

Review the provider documentation for check block examples with AWS, Azure, and GCP.

Monitoring the health of a provisioned website

The following example uses the HTTP Terraform provider and a scoped data source within a check block to assert the Terraform website returns a 200 status code, indicating it is healthy.

check "health_check" {  data "http" "terraform_io" {    url = "https://www.terraform.io"  }   assert {    condition = data.http.terraform_io.status_code == 200    error_message = "${data.http.terraform_io.url} returned an unhealthy status code"  }}

Continuous Validation alerts you if the website returns any status code besides 200 while Terraform evaluates this assertion. You can also find failures in your workspace's Continuous Validation Results page. You can configure continuous validation alerts in your workspace's notification settings.

Monitoring certificate expiration

Vault lets you secure, store, and tightly control access to tokens, passwords, certificates, encryption keys, and other sensitive data. The following example uses a check block to monitor for the expiration of a Vault certificate.

resource "vault_pki_secret_backend_cert" "app" {  backend = vault_mount.intermediate.path  name = vault_pki_secret_backend_role.test.name  common_name = "app.my.domain"} check "certificate_valid" {  assert {    condition = !vault_pki_secret_backend_cert.app.renew_pending    error_message = "Vault cert is ready to renew."  }}

Asserting up-to-date AMIs for compute instances

HCP Packer stores metadata about your Packer images. The following example check fails when there is a newer AMI version available.

data "hcp_packer_artifact" "hashiapp_image" {  bucket_name  = "hashiapp"  channel_name = "latest"  platform     = "aws"  region       = "us-west-2"} resource "aws_instance" "hashiapp" {  ami                         = data.hcp_packer_artifact.hashiapp_image.external_identifier  instance_type               = var.instance_type  associate_public_ip_address = true  subnet_id                   = aws_subnet.hashiapp.id  vpc_security_group_ids      = [aws_security_group.hashiapp.id]  key_name                    = aws_key_pair.generated_key.key_name   tags = {    Name = "hashiapp"  }} check "ami_version_check" {  data "aws_instance" "hashiapp_current" {    instance_tags = {      Name = "hashiapp"    }  }   assert {    condition = aws_instance.hashiapp.ami == data.hcp_packer_artifact.hashiapp_image.external_identifier    error_message = "Must use the latest available AMI, ${data.hcp_packer_artifact.hashiapp_image.external_identifier}."  }}

View continuous validation results

To view the continuous validation results from the latest health assessment, go to the workspace and click Health > Continuous validation.

The page shows all of the resources, outputs, and data sources with custom assertions that HCP Terraform evaluated. Next to each object, HCP Terraform reports whether the assertion passed or failed. If one or more assertions fail, HCP Terraform displays the error messages for each assertion.

The health assessment page displays each assertion by its named value. A check block's named value combines the prefix check with its configuration name.

If your configuration contains multiple preconditions and postconditions within a single resource, output, or data source, HCP Terraform will not show the results of individual conditions unless they fail. If all custom conditions on the object pass, HCP Terraform reports that the entire check passed. The assessment results will display the results of any precondition and postconditions alongside the results of any assertions from check blocks, identified by the named values of their parent block.