Infrastructure Drift: The Security Risk Hiding in Your AWS Account
Someone on your team added an inbound rule to a security group at 2 AM during an incident. They fixed the issue. They forgot to update Terraform. Your state file now lies about what's actually running in your account.
This is infrastructure drift. And it's more common than you think.
What Is Drift?
Drift happens when the actual state of your cloud resources diverges from the declared state in your infrastructure-as-code. Someone changes something via the console, a CLI script, or an automation that bypasses your IaC pipeline.
The result: your Terraform (or Pulumi, or CloudFormation) state says one thing. AWS says another. You're operating on false assumptions.
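To make the idea concrete, here is a minimal sketch of drift as a diff between two attribute maps. The function name `find_drift` and the sample security group are hypothetical; real tools compare far richer state, but the core comparison looks roughly like this.

```python
from typing import Any


def find_drift(declared: dict[str, Any], actual: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {attribute: (declared_value, actual_value)} for every mismatch."""
    missing = object()  # sentinel meaning "attribute absent on this side"
    drift = {}
    for key in sorted(declared.keys() | actual.keys()):
        want = declared.get(key, missing)
        have = actual.get(key, missing)
        if want != have:
            drift[key] = (
                None if want is missing else want,
                None if have is missing else have,
            )
    return drift


# Hypothetical security group: IaC declares HTTPS only, the account also has SSH open.
declared = {"ingress_ports": [443], "description": "web tier"}
actual = {"ingress_ports": [22, 443], "description": "web tier"}

print(find_drift(declared, actual))
# prints {'ingress_ports': ([443], [22, 443])}
```

An empty result means the two sides agree. Anything else is a deviation someone needs to explain.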
Why It's a Security Problem
Not all drift is equal. Someone changing a tag? Low risk. Someone opening port 22 to 0.0.0.0/0 on a production security group? That's a breach waiting to happen.
Common drift that creates security exposure:
- Security group rules added via console (opens unexpected ports)
- S3 bucket policies modified to allow public access
- IAM policies expanded with wildcard permissions
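- Encryption weakened: account-level EBS default encryption switched off, or a database or volume recreated by hand without encryption (existing RDS instances and EBS volumes cannot be unencrypted in place)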
- VPC flow logs turned off (eliminates audit trail)
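The dangerous part: these changes never pass through your IaC pipeline, so no code review or CI check sees them. terraform plan compares your configuration against live infrastructure only when someone runs it, and only for resources and attributes Terraform already manages. Anything created entirely outside Terraform never shows up in a plan.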
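As an illustration of the highest-risk case above, the sketch below scans ingress rules for sensitive ports open to the whole internet. It assumes rules shaped like the `IpPermissions` entries returned by the EC2 DescribeSecurityGroups API. The function name and the choice of "sensitive" ports are assumptions made for this example.

```python
SENSITIVE_PORTS = {22, 3389}  # SSH and RDP; an illustrative choice, adjust to taste
ANYWHERE = {"0.0.0.0/0", "::/0"}


def world_open_ports(ip_permissions: list[dict]) -> list[int]:
    """Return the sensitive ports that any rule exposes to the whole internet."""
    exposed: set[int] = set()
    for perm in ip_permissions:
        cidrs = {r.get("CidrIp") for r in perm.get("IpRanges", [])}
        cidrs |= {r.get("CidrIpv6") for r in perm.get("Ipv6Ranges", [])}
        if not cidrs & ANYWHERE:
            continue  # rule is restricted to specific ranges
        protocol = perm.get("IpProtocol")
        if protocol == "-1":  # "-1" means all protocols and all ports
            exposed |= SENSITIVE_PORTS
            continue
        if protocol not in ("tcp", "udp"):
            continue  # e.g. ICMP, where FromPort/ToPort are not port numbers
        low, high = perm.get("FromPort"), perm.get("ToPort")
        if low is None or high is None:
            continue
        exposed |= {p for p in SENSITIVE_PORTS if low <= p <= high}
    return sorted(exposed)


# Hypothetical rules: HTTPS is expected, SSH was added by hand at 2 AM.
rules = [
    {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
]

print(world_open_ports(rules))
# prints [22]
```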
How Often Does It Happen?
More than teams admit. In a 2024 survey by Firefly, 73% of cloud engineers reported finding configuration drift in their production environments. The most common cause? Manual changes during incidents — exactly when security discipline breaks down.
Detecting Drift
There are three approaches:
1. Periodic plan/refresh (basic)
Run terraform plan or pulumi refresh on a schedule. If it reports changes you didn't make, you have drift. Problem: this only works if someone actually reads the output, and it only covers resources your IaC already manages.
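One way to avoid depending on a human reading plan output is to let the exit code speak. The sketch below wraps `terraform plan -detailed-exitcode`, which exits 0 when the plan is empty, 1 on error, and 2 when the plan contains changes. The helper names and the stack directory are hypothetical, and the stack is assumed to be initialized with credentials already available.

```python
import subprocess


def interpret_plan_exit(code: int) -> str:
    """Map the -detailed-exitcode result to a status string."""
    if code == 0:
        return "clean"  # plan succeeded, nothing to change
    if code == 2:
        return "drift"  # plan succeeded, differences found
    return "error"      # plan itself failed


def check_stack(stack_dir: str) -> str:
    """Run a non-interactive plan in stack_dir and report its status."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=stack_dir,
        capture_output=True,
        text=True,
    )
    status = interpret_plan_exit(result.returncode)
    if status != "clean":
        # Send this wherever your team will actually see it.
        print(f"[{status}] {stack_dir}\n{result.stdout}{result.stderr}")
    return status


# Example (hypothetical path): check_stack("stacks/prod-network")
```

Note that a plain plan also reports configuration changes that were merged but not yet applied, so a "drift" status here means "code and reality disagree", whichever side moved.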
2. Cloud-native tools (partial)
AWS Config, Azure Policy, and GCP Security Command Center can detect some configuration changes, but they don't compare against your IaC state. They check against rules, not against your declared architecture.
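The gap between rule checks and state comparison is easy to show with a toy example. This is not the API of AWS Config or any other tool. Both functions and the port sets are made up purely to illustrate the distinction.

```python
def violates_rule(open_ports: set[int]) -> bool:
    """Rule-style check: only a fixed list of forbidden ports matters."""
    forbidden = {22, 3389}
    return bool(open_ports & forbidden)


def differs_from_declared(open_ports: set[int], declared_ports: set[int]) -> bool:
    """State-style check: anything not matching the declaration is drift."""
    return open_ports != declared_ports


declared_ports = {443}
live_ports = {443, 8080}  # someone opened 8080 by hand

print(violates_rule(live_ports))                          # prints False
print(differs_from_declared(live_ports, declared_ports))  # prints True
```

The hand-opened port passes the rule because nobody wrote a rule about 8080, yet it is plainly not what the code declares.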
3. IDP-integrated drift detection (comprehensive)
This is what AskArchie does. Every 6 hours (configurable), Archie runs a state comparison against every deployed stack. It classifies drift by severity — security changes are flagged as critical. It notifies via Slack. And it offers one-click remediation to restore the desired state.
The Right Response
Not all drift should be reverted. Sometimes the 2 AM security group change was correct, and your IaC needs to catch up. That's why drift management needs nuance:
- Critical drift (security groups, IAM, encryption): alert immediately, require acknowledgment or remediation
- Warning drift (configuration changes): notify, let the platform engineering (PE) team decide
- Info drift (tags, descriptions): log for audit, don't alarm
The goal isn't zero drift. It's known drift — every deviation is detected, classified, and either fixed or acknowledged with a reason.
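A three-tier scheme like the one above can be sketched as a small lookup. This is an illustration only, not how AskArchie or any specific tool classifies drift. The resource type prefixes and attribute names follow Terraform AWS provider naming, and which ones count as critical is an assumption made for the example.

```python
from enum import Enum


class Severity(Enum):
    CRITICAL = "critical"
    WARNING = "warning"
    INFO = "info"


# Assumed groupings, mirroring the three tiers described above.
INFO_ATTRIBUTES = {"tags", "tags_all", "description"}
CRITICAL_PREFIXES = ("aws_security_group", "aws_iam_")
CRITICAL_ATTRIBUTES = {"encrypted", "storage_encrypted", "kms_key_id"}


def classify(resource_type: str, attribute: str) -> Severity:
    """Classify one drifted attribute on one resource."""
    if attribute in INFO_ATTRIBUTES:
        return Severity.INFO  # cosmetic metadata: log it, don't page anyone
    if resource_type.startswith(CRITICAL_PREFIXES) or attribute in CRITICAL_ATTRIBUTES:
        return Severity.CRITICAL  # network access, identity, or encryption
    return Severity.WARNING  # everything else: notify and let a human decide


print(classify("aws_security_group", "ingress").value)        # prints critical
print(classify("aws_db_instance", "storage_encrypted").value)  # prints critical
print(classify("aws_instance", "instance_type").value)         # prints warning
print(classify("aws_s3_bucket", "tags").value)                 # prints info
```

Checking the info attributes first is a deliberate choice here: a tag change on a security group stays low risk, as in the tag example earlier.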
What You Can Do Today
1. Schedule terraform plan or pulumi refresh to run daily on every stack. Pipe the output somewhere visible.
2. Set up AWS Config rules for critical resources (security groups, S3 buckets, IAM).
3. After every incident, audit for manual changes that need to be backported to IaC.
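For the post-incident audit in step 3, CloudTrail is the usual starting point. The sketch below filters already-fetched CloudTrail records for write operations made from a console session during the incident window. It assumes the record fields `eventTime`, `readOnly`, `sessionCredentialFromConsole`, and `userIdentity.arn` as documented for CloudTrail, so verify them against your own logs. The function name and sample records are hypothetical.

```python
from datetime import datetime, timezone


def manual_changes(records: list[dict], start: datetime, end: datetime) -> list[tuple[str, str, str]]:
    """Return (time, event name, actor) for console write events in the window."""
    found = []
    for rec in records:
        when = datetime.fromisoformat(rec["eventTime"].replace("Z", "+00:00"))
        if not (start <= when <= end):
            continue
        if rec.get("readOnly") in (True, "true"):
            continue  # Describe/List/Get calls change nothing
        if str(rec.get("sessionCredentialFromConsole", "")).lower() != "true":
            continue  # not made from a console session
        actor = rec.get("userIdentity", {}).get("arn", "unknown")
        found.append((rec["eventTime"], rec["eventName"], actor))
    return found


# Hypothetical records from a 2 AM incident.
records = [
    {"eventTime": "2024-05-01T02:04:00Z", "eventName": "AuthorizeSecurityGroupIngress",
     "readOnly": False, "sessionCredentialFromConsole": "true",
     "userIdentity": {"arn": "arn:aws:iam::123456789012:user/oncall"}},
    {"eventTime": "2024-05-01T02:05:00Z", "eventName": "DescribeSecurityGroups",
     "readOnly": True, "sessionCredentialFromConsole": "true",
     "userIdentity": {"arn": "arn:aws:iam::123456789012:user/oncall"}},
    {"eventTime": "2024-05-01T02:10:00Z", "eventName": "RunInstances",
     "readOnly": False,
     "userIdentity": {"arn": "arn:aws:iam::123456789012:role/ci-pipeline"}},
]

window_start = datetime(2024, 5, 1, 1, 0, tzinfo=timezone.utc)
window_end = datetime(2024, 5, 1, 4, 0, tzinfo=timezone.utc)

for change in manual_changes(records, window_start, window_end):
    print(change)
```

This only catches console activity. Changes made with the CLI or ad hoc scripts need a different filter, for example excluding the roles your pipeline uses.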
Or try AskArchie — drift detection with severity classification, Slack alerts, and auto-remediation is built in.