Why We Started Looking for Alternatives in the First Place
The License Change That Made Our Legal Team Actually Read the Docs
August 2023 is when HashiCorp flipped Terraform from MPL 2.0 to the Business Source License (BSL 1.1). I remember reading the announcement and thinking “okay, open core, whatever” — then actually reading the license text. The specific restriction that matters: you cannot use Terraform to build a product or service that competes with HashiCorp’s commercial offerings. That sounds targeted until you realize HashiCorp sells Terraform Cloud, a platform for running infrastructure automation at scale. If your internal developer platform routes infrastructure requests through a UI and executes Terraform under the hood, your legal team will start asking uncomfortable questions about whether that “competes.”
The thing that caught me off guard was how vague “competing with HashiCorp” actually is in the license text. There’s no bright line. If you’re building an internal IaC orchestration layer for a fintech product — say, a self-service portal where teams provision their own AWS environments — is that competing with Terraform Cloud? Probably not. But “probably not” is not what your general counsel wants to hear before shipping to enterprise clients who require clean IP audits. We spent three weeks in legal back-and-forth that could’ve been avoided entirely by just not using BSL software. That’s the real cost nobody talks about: not the license fee, the legal time.
The specific clause reads: “you may not use the software to provide a competitive offering”. HashiCorp has since published clarifying FAQs that say most users are fine, but a FAQ is not a legal document. It’s not binding. If they ever decide your use case crosses a line, you’re arguing against their interpretation of their own license with no MPL protections to fall back on. I’ve watched two companies I know — both building platforms on top of Terraform — quietly migrate to OpenTofu after their Series A investors flagged the BSL dependency during due diligence. Nobody announced it. They just did it.
The “just pin to the last MPL version” argument — staying on Terraform 1.5.5 forever — is the one I hear most often from teams that don’t want to deal with migration. I get it. But think through what that actually means: no security patches from HashiCorp, no new provider features (AWS providers move fast — try using a new service without provider support), and a growing delta between your version and where the ecosystem is investing. By mid-2026, staying on 1.5.5 means your providers are carrying forward compatibility shims, your IDE tooling starts drifting, and every new hire you bring on has to unlearn the current default and work with a pinned fork. It’s not a strategy, it’s deferred migration pain with compounding interest.
One more practical thing before we get into alternatives: if you’re simultaneously evaluating your broader DevOps toolchain — CI systems, monitoring, secret management — you’re probably also looking at SaaS tooling decisions that have the same “free tier to paid cliff” problem Terraform Cloud has. I wrote separately about those trade-offs in the guide on Essential SaaS Tools for Small Business in 2026, which covers the pricing cliffs and lock-in risks worth knowing before you commit.
- BSL is not open source — the OSI explicitly does not recognize it as such, which matters for enterprise procurement policies that require OSI-approved licenses
- The risk is proportional to your product surface area — a 5-person team running personal projects is not HashiCorp’s target, but a platform team building infra tooling for paying customers is a different story
- OpenTofu is the fork to watch — it launched under the Linux Foundation with an MPL 2.0 license and has been tracking Terraform’s feature set closely; more on that in the next section
The Tools I Actually Evaluated (and One I Dismissed Immediately)
My Shortlist and What I Cut Before Wasting Any Real Time
The thing that surprised me most when I started this evaluation wasn’t the tooling itself — it was how different the migration cost turned out to be across tools that all claim “Terraform compatibility.” My team had roughly 40,000 lines of .tf files across three environments (AWS, GCP, a bit of Azure). That corpus is a reality check. Tools that look equivalent in a README demo fall apart completely when you throw real module hierarchies at them.
Here’s the shortlist I actually put time into:
- OpenTofu — the fork of Terraform pre-BSL. Near-identical HCL syntax. This is the path-of-least-resistance option for anyone already on Terraform.
- Pulumi (free tier) — real programming languages instead of HCL. Free tier includes state management up to a team of one, essentially. Anything beyond that is the Team plan at $50/month per user as of early 2026.
- Crossplane — Kubernetes-native infrastructure management. Wildly different mental model. I included it specifically because one of our teams already lives in k8s all day.
- Terragrunt + OpenTofu — not a replacement for Terraform, but a wrapper that solves DRY configuration and remote state orchestration at scale. I treat this as its own candidate because the combo changes the workflow significantly.
- Ansible — I kept this in scope only for teams managing fewer than five environments with mostly VM-based infra. Not IaC in the full sense, but I kept getting asked about it so I evaluated it honestly.
What I ruled out fast: AWS CDK was the first to go. I don’t care how elegant the TypeScript API is — the moment you’re running multi-cloud, you’re carrying dead weight. CDK for Terraform (cdktf) is a different story, but vanilla CDK is just an expensive CloudFormation wrapper. Farmer is an F# DSL for Azure Resource Manager. I respect the niche but my team doesn’t write F# and I’m not staffing around a templating language. Raw CloudFormation fails for the same reason as CDK — we’re multi-cloud by necessity, and JSON/YAML stacks that only talk to AWS aren’t a strategy, they’re a liability.
My evaluation criteria, in the order they actually mattered:
- Migration cost from existing .tf files. I ran each tool against a real module — our VPC factory — and measured how long it took to get a working plan. OpenTofu passed in about 20 minutes (mostly just swapping the Terraform binary). Pulumi required rewriting the whole thing in TypeScript using pulumi convert, which handled maybe 60% of it without manual edits.
- CI/CD integration pain. Specifically GitHub Actions and GitLab CI, which is what we use. I needed to know: does it have an official action/image, does it support OIDC for cloud auth, does the exit code behavior work cleanly in pipelines?
- State backend compatibility. My hard requirement was S3 + DynamoDB lock compatibility. Anything that forces me to their proprietary state backend adds operational risk I don’t want.
- Community health. I checked GitHub commit frequency, release cadence, Discord/Slack activity, and whether the project had a governance model or was backed by a single commercial entity with an obvious upsell incentive.
One gotcha I didn’t expect: Crossplane’s approach to state is fundamentally different from everything else on this list — there’s no state file at all. Kubernetes is the state. That’s philosophically interesting but practically means your disaster recovery story is now “restore the cluster,” which is a much harder conversation to have with ops. I almost cut it for that reason alone, but kept it because two scenarios genuinely fit it well: teams already doing GitOps with Flux/ArgoCD, and shops that want to give dev teams self-service infra without handing out cloud console access.
# Quick test I ran on every candidate — apply a VPC module, check the plan output
# OpenTofu example (drop-in from terraform)
tofu init
tofu plan -out=tfplan
tofu show -json tfplan | jq '.resource_changes | length'
# Pulumi equivalent after conversion
pulumi preview --json | jq '.steps | length'
The exit code behavior note from the CI/CD criterion is worth calling out specifically: with the -detailed-exitcode flag, terraform plan exits 0 on success with no changes, 1 on error, and 2 when there are changes to apply. OpenTofu inherits this behavior exactly. Pulumi exits non-zero on actual errors only. This sounds minor until you have a pipeline that interprets exit 2 as failure and pages someone at 2am over a routine drift detection run. Check your pipeline wrappers before you commit to either approach.
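To make that concrete, here's the mapping I'd encode in a pipeline wrapper, sketched in plain Python (the function name and outcome labels are mine; the exit codes are the documented -detailed-exitcode contract shared by terraform and tofu):

```python
# Sketch: interpret `tofu plan -detailed-exitcode` results in CI.
# With -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes present.

def classify_plan_exit(code: int) -> str:
    """Map a plan exit code to a CI outcome instead of raw pass/fail."""
    if code == 0:
        return "no-changes"        # clean: nothing to apply
    if code == 2:
        return "changes-pending"   # drift or new resources: not a failure
    return "error"                 # 1 (or anything else): a real failure

# A wrapper would page only on "error" and open a review task on
# "changes-pending", rather than treating exit 2 as a crash.
```

The design point: drift detection jobs want three outcomes, not two, so the wrapper has to widen the exit-code contract before your CI system flattens it back to pass/fail.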
OpenTofu: The Closest Drop-In Replacement
What OpenTofu Actually Is (And Why the Licensing Matters)
OpenTofu is the fork of Terraform created after HashiCorp switched to BSL in August 2023; it's now a Linux Foundation project, maintained by a coalition of companies including Gruntwork, Spacelift, and env0. It's genuinely MPL-2.0 licensed, which means you can use it in commercial products without the legal ambiguity that came with Terraform 1.6+. If your legal team has been asking uncomfortable questions about Terraform's BSL, this is the answer. Current stable is v1.8.x and it's moving fast.
Install is exactly what you’d expect:
# macOS
brew install opentofu
# Linux/CI (official install script)
curl --proto '=https' --tlsv1.2 -fsSL https://get.opentofu.org/install-opentofu.sh | sh
# Verify
tofu version
# OpenTofu v1.8.x
I switched a 14,000-line Terraform codebase to OpenTofu in about 40 minutes. The migration reality is genuinely this simple: install the tofu binary (and optionally alias terraform=tofu in your shell profile for muscle memory), then run tofu init in your existing project directory. The HCL syntax from Terraform 0.15 onward is identical. State files are compatible. Variable files, modules, locals — all of it just works. No conversion tooling, no migration scripts, no weekend war room.
Where OpenTofu Is Actually Ahead of Terraform
Native state encryption landed in 1.7 and it’s better than what Terraform offers. You can encrypt state at rest with AES-GCM, deriving the key from a passphrase via PBKDF2 or pulling it from a cloud KMS, without routing through a third-party product like Terraform Cloud. The config looks like this:
terraform {
  encryption {
    key_provider "pbkdf2" "my_key" {
      passphrase = var.state_passphrase
    }
    method "aes_gcm" "default" {
      keys = key_provider.pbkdf2.my_key
    }
    state {
      method = method.aes_gcm.default
    }
  }
}
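For intuition, the pbkdf2 key provider is doing standard PBKDF2 key derivation before the AES-GCM step; here's a stdlib-only Python sketch of that derivation (parameter values are illustrative, not OpenTofu's actual internals):

```python
import hashlib
import os

def derive_state_key(passphrase: str, salt: bytes, *, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte key from a passphrase via PBKDF2-HMAC (illustrative)."""
    return hashlib.pbkdf2_hmac("sha512", passphrase.encode(), salt, iterations, dklen=32)

salt = os.urandom(32)   # a real implementation stores the salt alongside the ciphertext
key = derive_state_key("correct horse battery staple", salt)
assert len(key) == 32   # sized for AES-256-GCM
```

The practical takeaway is that the passphrase itself is never the key; losing the passphrase (or the stored salt) means the state blob is unrecoverable, which is why the passphrase belongs in a secret manager, not a tfvars file.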
Early-evaluation functions (also 1.7+) let you use functions like templatestring during the evaluation phase before providers are initialized — practically useful for dynamic backend configs. Provider-defined functions mean providers can now ship their own reusable functions alongside resources. The AWS provider is already shipping useful ones for ARN parsing.
The Registry Problem Is Real
The thing that caught me off guard: OpenTofu uses its own registry at registry.opentofu.org, not registry.terraform.io. For the major providers — AWS, Azure, GCP, Kubernetes — this is completely transparent, they’re mirrored automatically. But I had two obscure internal-tooling providers that weren’t mirrored yet, and tofu init failed with a confusing “provider not found” error.
The fix is adding an explicit source address in your required_providers block:
terraform {
  required_providers {
    obscure-tool = {
      source  = "registry.terraform.io/some-vendor/obscure-tool"
      version = "~> 2.1"
    }
  }
}
OpenTofu respects explicit source URLs even when they point to the Terraform registry. It’s a one-line fix per obscure provider, but you need to know it’s the problem first. Run tofu providers after init to see exactly which ones got resolved and from where.
Dropping It Into CI/CD
If you’re on GitHub Actions, swapping is a one-line change. Replace hashicorp/setup-terraform with opentofu/setup-opentofu:
jobs:
  deploy:
    runs-on: ubuntu-latest
    env:
      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    steps:
      - uses: actions/checkout@v4
      - uses: opentofu/setup-opentofu@v1
        with:
          tofu_version: "1.8.x"
      - name: Init
        run: tofu init
      - name: Plan
        run: tofu plan -out=tfplan
      - name: Apply
        if: github.ref == 'refs/heads/main'
        run: tofu apply tfplan
State backends are unchanged. Your S3 backend block, GCS backend, Azure Blob Storage config — copy-paste them verbatim into your OpenTofu project. The backend configuration syntax is identical because it’s the same codebase up to the fork point, and OpenTofu has kept compatibility deliberately.
My honest take: if you have more than 10,000 lines of .tf files and you’re on Terraform 1.x, OpenTofu is the lowest-friction move you can make. The ecosystem risk is lower than staying on BSL Terraform for commercial use, the feature velocity is actually faster right now, and the migration cost was genuinely about an hour of my time. The registry issue with obscure providers is the only real gotcha, and now you know about it before you hit it.
Pulumi Free Tier: When You’re Tired of HCL
The thing that sold me on Pulumi wasn’t the marketing copy about “infrastructure as real code” — it was the first time I refactored a security group rule using a filter() call and TypeScript’s compiler caught a type mismatch before I ever ran pulumi preview. With HCL, that bug would have lived until plan time, or worse, apply time. That single experience explains why Pulumi exists and who it’s actually for.
Getting started is fast enough that you can evaluate it in an afternoon:
curl -fsSL https://get.pulumi.com | sh
pulumi new aws-typescript
That second command scaffolds a full TypeScript project with a tsconfig.json, a Pulumi.yaml, and an index.ts that already imports Pulumi’s AWS package. You’re writing real TypeScript immediately — not learning a new syntax. If your team already has a frontend or backend codebase in TypeScript, onboarding to Pulumi takes hours, not days. The flip side of this is real: if you’re onboarding an ops engineer who lives in Bash and has never used a typed language, HCL is genuinely easier to teach. Pulumi rewards software engineers who happen to do infra, not the other way around.
State Management: Cloud Default, S3 Escape Hatch
By default, Pulumi stores state in Pulumi Cloud, which has a free tier — check pulumi.com/pricing for the current limits since they adjust this occasionally. If you want fully self-managed state (no third-party service touching your infrastructure metadata), run this before you do anything else:
pulumi login s3://your-state-bucket
That’s it. Pulumi writes state to S3, and you handle encryption and access via IAM. I’ve run production stacks this way with no issues. The community edition is self-hostable with no resource limits, which is the actual answer to “is this really free?” — yes, as long as you’re comfortable managing state yourself. The Pulumi Cloud free tier is fine for solo developers and small teams; the S3 backend is what I’d use for anything with serious compliance requirements.
Migrating from Terraform Without Losing a Week
The migration path exists and it’s more useful than I expected, but don’t go in with unrealistic expectations:
pulumi convert --from terraform --language typescript
I ran this against a mid-sized Terraform codebase — about 800 lines of HCL across several modules. Roughly 70% of resources converted cleanly. The remaining 30% were exactly the things you’d guess: complex count patterns, nested for_each with local values, and dynamic blocks. Those need manual rewrites. The good news is that rewriting them in TypeScript is often cleaner than the original HCL — a for_each pattern in Terraform that required 15 lines of awkward syntax becomes a resources.map() call. The bad news is that if you have a large monolithic Terraform state, you’re doing partial migrations and state manipulation, which is fiddly work that you should test in a throwaway environment first.
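That for_each-to-code rewrite has a predictable shape. Here's a stripped-down sketch in plain Python (no pulumi import, so the control flow stands on its own; the resource names, sizes, and helper function are mine):

```python
# Sketch: what an HCL for_each-plus-lookup() pattern becomes in Python.
# With Pulumi each dict below would feed a resource constructor call.

SIZES = {"staging": "db.t3.medium", "prod": "db.r6g.large"}

def database_configs(envs: dict[str, str]) -> list[dict]:
    """Build one RDS-style config per environment."""
    return [
        {
            "name": f"app-db-{env}",
            "instance_class": size,
            "multi_az": env == "prod",   # conditionals are just Python
        }
        for env, size in envs.items()
    ]

configs = database_configs(SIZES)
```

The awkward parts of HCL (conditional attributes, derived names, per-environment overrides) collapse into ordinary expressions, which is exactly the 30% that pulumi convert can't mechanize for you.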
When I’d Actually Pick Pulumi Over OpenTofu
- Greenfield projects with a software-oriented team. If everyone already knows TypeScript or Python, the learning curve inverts — Pulumi is faster to pick up than HCL, not slower.
- Anywhere you need real abstraction. Building reusable infrastructure components with HCL modules is painful. In Pulumi, you write a class, add constructor arguments, done. I’ve built internal libraries of opinionated VPC configurations that teams import like any other npm package.
- When compile-time safety matters. I’ve caught misconfigured IAM policy ARNs, wrong CIDR block types, and missing required properties before a single API call was made. That feedback loop is genuinely faster than reading Terraform plan output.
- Multi-language teams. Go, Python, TypeScript, C# — pick the language your team actually uses. That’s not a gimmick; it means your backend developers can contribute to infra without learning a DSL.
Where I’d stick with OpenTofu instead: large teams with existing HCL investment, environments where you need a sysadmin-readable file format for audits, or anywhere the political cost of introducing a compiled language into the infra workflow isn’t worth the benefits. Pulumi is a better tool for software engineers; OpenTofu is a more accessible tool for mixed ops teams. That distinction matters more than any feature comparison.
Crossplane: For Teams Already Deep in Kubernetes
What Crossplane Actually Does (And Why the Mental Model Matters)
Crossplane isn’t an infrastructure tool that happens to run on Kubernetes — it is Kubernetes, extended. You install it as an operator, and from that point forward, cloud resources become Custom Resource Definitions. An S3 bucket is a Kubernetes object. An RDS instance is a Kubernetes object. Your infra state lives in etcd, not in some remote backend you have to configure and secure. The thing that caught me off guard the first time I used it: I ran kubectl get buckets and my AWS S3 buckets showed up. That’s the paradigm shift in one command.
Getting it running takes about three minutes if you already have a cluster:
helm repo add crossplane-stable https://charts.crossplane.io/stable
helm repo update
helm install crossplane crossplane-stable/crossplane \
--namespace crossplane-system \
--create-namespace
Then you install a provider — say, AWS:
cat <<EOF | kubectl apply -f -
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-aws-s3
spec:
  package: xpkg.upbound.io/upbound/provider-aws-s3:v1.1.0
EOF
After that you configure credentials via a ProviderConfig and you’re in. The install story is actually pretty clean. The hard part comes later.
Where This Approach Pays Off: GitOps Without the Ceremony
If you’re already running Flux or Argo CD to manage application config, adding infra management is conceptually zero overhead. You push a YAML manifest defining an RDS instance, Flux detects it, applies it, and Crossplane reconciles the actual AWS resource. Drift detection — the thing you’d normally have to run a scheduled tofu plan for — happens automatically. The controller loop is always running. Your infra gets treated with the same reconciliation guarantees as your deployments. I switched a team to this model after spending too many Friday afternoons chasing “who ran apply manually and didn’t commit it.” That problem disappears.
The Debugging Reality Check
Here’s the honest part: when something goes wrong, the debugging experience is rough compared to OpenTofu. With tofu plan, you get a readable diff before anything happens. With Crossplane, you’re digging through Kubernetes events:
kubectl describe bucket my-bucket -n crossplane-system
kubectl logs -n crossplane-system \
deployment/crossplane \
--container crossplane
Error messages are sometimes buried three layers deep — the Crossplane core logs, the provider logs, and the actual AWS API error wrapped in a Kubernetes condition. I’ve spent longer debugging a misconfigured IAM permission with Crossplane than I ever did with Terraform. The events surface eventually, but there’s no “here’s what went wrong and here’s how to fix it” output. You need to be comfortable reading controller logs and Kubernetes status conditions, or you’ll lose a lot of time.
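Most of that digging ends at the resource's status conditions. A small stdlib-only helper (the function and the abbreviated sample payload are mine) that summarizes unhealthy conditions from `kubectl get <kind> <name> -o json` output:

```python
import json

def failing_conditions(resource: dict) -> list[str]:
    """Return a readable line for each status condition that isn't True.

    Works on the dict parsed from `kubectl get <kind> <name> -o json`;
    Crossplane managed resources report Ready and Synced conditions there.
    """
    out = []
    for cond in resource.get("status", {}).get("conditions", []):
        if cond.get("status") != "True":
            out.append(f"{cond.get('type')}: {cond.get('reason', '?')} - "
                       f"{cond.get('message', '')}")
    return out

# Abbreviated example payload for a bucket stuck on bad IAM credentials:
doc = json.loads("""{
  "status": {"conditions": [
    {"type": "Synced", "status": "False", "reason": "ReconcileError",
     "message": "AccessDenied: not authorized to perform s3:CreateBucket"},
    {"type": "Ready", "status": "False", "reason": "Unavailable", "message": ""}
  ]}
}""")
for line in failing_conditions(doc):
    print(line)
```

It doesn't replace reading the provider logs, but it pulls the AWS API error out of the Kubernetes wrapping in one step instead of three.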
Compositions: The Feature That Makes This Worth It
Crossplane Compositions are the reason senior platform engineers get excited about this tool. They let you build opinionated abstractions on top of raw cloud resources. Instead of your developers writing a 60-line YAML manifest to configure an RDS instance with correct security groups, subnet groups, and parameter groups, you write a Composition once and expose a simple XRD (Composite Resource Definition) that looks like this:
apiVersion: database.mycompany.io/v1alpha1
kind: PostgresDatabase
metadata:
  name: my-app-db
spec:
  size: medium
  region: us-east-1
Your platform team owns the Composition that translates “medium” into actual AWS resource config. Developers get a clean API. The investment to build this is real — a well-structured Composition takes a day or two to write and test properly — but once it’s done, it’s genuinely better than any module-based abstraction I’ve built in Terraform.
Be Honest With Yourself About Whether This Fits
I’d skip Crossplane entirely if your team isn’t already running Kubernetes in production. The value proposition depends on the etcd-as-state-backend model, the kubectl toolchain familiarity, and the GitOps workflow being things you’re already invested in. If you’re managing infra for a team running workloads on EC2 or ECS or Lambda, Crossplane adds a Kubernetes dependency that you’ll spend more time maintaining than it saves you. Use OpenTofu, use Pulumi, use something that fits your actual stack. But if you’re deep in Kubernetes already — running a multi-tenant cluster, using Argo CD for everything, managing infra for a platform team — Crossplane’s learning curve pays back within a couple of months. The mental model clicks, and then managing cloud resources starts feeling like managing everything else you already manage.
Terragrunt: Not a Replacement, But Worth Mentioning
Terragrunt Pairs With OpenTofu to Replace What Terraform Cloud Charges You For
Terragrunt isn’t an alternative to Terraform — it’s a wrapper around it. But if you’re running OpenTofu (the actual free fork), pairing it with Terragrunt gets you workspace-level features that HashiCorp would otherwise charge you for on Terraform Cloud: automatic remote state wiring, dependency graphs between stacks, and DRY configurations across staging/prod/dev without copy-pasting the same S3 backend block into 12 different folders. I switched to this combo because I got tired of explaining to my team why we were paying for Terraform Cloud when OpenTofu exists and costs nothing.
The three problems Terragrunt actually solves are real and painful. First: DRY backend configs. Without it, you have a backend.tf in every environment folder pointing to the same S3 bucket with a slightly different key path — and someone always fat-fingers the key. Second: dependency ordering. If your VPC stack has to finish before your EKS stack starts, Terragrunt’s dependency blocks handle that without you writing a shell script. Third: run-all commands that let you plan or apply across every stack in a directory tree without manually traversing folders.
Here’s the terragrunt.hcl pattern I use at the root level to auto-generate backend config so every environment folder inherits it:
# root terragrunt.hcl
locals {
  env    = basename(dirname(get_terragrunt_dir()))
  region = "us-east-1"
}

remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terraform"
  }
  config = {
    bucket         = "my-tf-state-${local.env}"
    key            = "${path_relative_to_include()}/tofu.tfstate"
    region         = local.region
    encrypt        = true
    dynamodb_table = "tofu-state-lock-${local.env}"
  }
}

generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terraform"
  contents  = <<EOF
provider "aws" {
  region = "${local.region}"
}
EOF
}
Each environment folder then just has a terragrunt.hcl that calls include "root" { path = find_in_parent_folders() } and a main.tf with the actual resources. No S3 bucket names copy-pasted anywhere. The path_relative_to_include() function generates a unique state key based on the folder path automatically — so envs/staging/vpc and envs/prod/vpc get different state files without you touching anything.
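For reference, here's the directory layout that pattern assumes (folder and file names are illustrative):

```
infra/
├── terragrunt.hcl              # root config with remote_state + generate blocks
└── envs/
    ├── staging/
    │   └── vpc/
    │       ├── terragrunt.hcl  # include "root" { path = find_in_parent_folders() }
    │       └── main.tf
    └── prod/
        └── vpc/
            ├── terragrunt.hcl
            └── main.tf
```

With this tree, basename(dirname(get_terragrunt_dir())) resolves to "staging" or "prod", and path_relative_to_include() yields envs/staging/vpc versus envs/prod/vpc as the state keys.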
Now for the honest part: Terragrunt adds a real abstraction layer, and that layer will eventually bite you. terragrunt run-all plan errors are notoriously unhelpful. You’ll see something like Error: Module is not compatible with this version of Terragrunt with a stack trace pointing at dependency resolution, and the actual problem is a typo in a dependency block three folders up. Debugging requires running plans on individual modules until you isolate the failure — there’s no clean “here’s exactly what broke” output. The other thing that caught me off guard: generate blocks write files into your module directories at plan time. If you’re not careful with .gitignore, you’ll commit auto-generated provider.tf and backend.tf files and confuse everyone on the team who doesn’t know Terragrunt generated them.
- Use Terragrunt if: you’re managing more than 3 environments and copy-pasting backend configs is already annoying you
- Skip it if: you have one environment, one team, and low infrastructure complexity — the overhead isn’t worth it
- Required reading before committing: the Terragrunt docs on run-all ordering and the --terragrunt-ignore-dependency-errors flag, which exists for a reason
Side-by-Side: Which Tool Fits Which Situation
The Comparison Table, Then the Real Talk
Before I get into the scenarios, here’s the side-by-side so you have something concrete to reference. I’ve tried to be honest about the migration effort column — “low” doesn’t mean painless, it means you won’t spend three weeks rewriting everything.
| Factor | OpenTofu | Pulumi (Community) | Crossplane |
|---|---|---|---|
| License | MPL-2.0 (open source, maintained by Linux Foundation) | Apache 2.0 for core; some plugins are source-available | Apache 2.0 |
| Migration from Terraform | Near-zero — rename binary, run tofu init, done in most cases | Medium — pulumi convert --from terraform exists but the output needs manual cleanup on anything complex | High — completely different mental model; your .tf files are irrelevant here |
| State management | Same backends as Terraform (S3, GCS, Azure Blob, etc.) | Pulumi Cloud free tier for state, or self-host on S3/Azure — but self-hosted state requires extra config steps | No state files — Kubernetes etcd IS your state; reconciliation loop handles drift |
| Learning curve | Minimal if you know Terraform; HCL syntax is identical | Steep for infra people who don’t code; easy for devs who already know Python/TypeScript/Go | Steep — requires solid Kubernetes knowledge, CRDs, and understanding of operator pattern |
| Best fit team profile | Ops-heavy teams, existing Terraform users, anyone who wants a safe migration | Dev-heavy teams building complex infra logic, startups with strong engineering culture | Platform teams running Kubernetes as a control plane, companies building internal developer portals |
Scenario A: Startup with 50 Existing .tf Files and No Kubernetes
OpenTofu. No contest. I’ve done this migration twice now and the actual process is anticlimactic in the best way. You install the binary, run tofu init in your existing repo, and your state file stays exactly where it is. Here’s the entire migration for a typical setup:
# Install OpenTofu (Linux)
curl --proto '=https' --tlsv1.2 -fsSL https://get.opentofu.org/install-opentofu.sh | bash
# Drop into your existing Terraform repo
cd your-infra-repo
# OpenTofu reads your existing .tfstate without conversion
tofu init
tofu plan
The thing that caught me off guard the first time: provider registry. OpenTofu uses registry.opentofu.org by default, but all the major HashiCorp providers are mirrored there. The rare exception is very niche providers that nobody has bothered to publish yet. Check your provider list before you migrate — run a quick grep -r "required_providers" . --include="*.tf" and verify each one exists at registry.opentofu.org. That five-minute check will save you a day of debugging.
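That check is scriptable. Here's a sketch in Python (stdlib only; the regex is a rough heuristic for simple source strings, and the versions-endpoint URL follows the standard registry protocol OpenTofu implements, so treat both as assumptions to verify):

```python
import re
import urllib.error
import urllib.request

# Matches source = "namespace/name" or "host/namespace/name" (heuristic,
# not a full HCL parser -- good enough for a pre-migration sweep).
PROVIDER_RE = re.compile(r'source\s*=\s*"([\w.-]+/)?([\w-]+)/([\w-]+)"')

def parse_provider_sources(tf_text: str) -> list[tuple[str, str]]:
    """Return (namespace, name) pairs that would resolve via the default registry."""
    pairs = []
    for m in PROVIDER_RE.finditer(tf_text):
        host, ns, name = m.groups()
        if host is None:  # an explicit host already pins the registry
            pairs.append((ns, name))
    return pairs

def exists_on_opentofu_registry(namespace: str, name: str) -> bool:
    """Probe the registry's versions endpoint (assumed URL scheme)."""
    url = f"https://registry.opentofu.org/v1/providers/{namespace}/{name}/versions"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.URLError:
        return False

sample = '''
terraform {
  required_providers {
    aws = { source = "hashicorp/aws", version = "~> 5.0" }
  }
}
'''
print(parse_provider_sources(sample))
```

Run it over the output of that grep and probe each pair; anything that comes back missing is a provider you'll need an explicit registry.terraform.io source address for.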
Scenario B: Platform Team Building an Internal Developer Platform on EKS
Crossplane with Compositions is the right call here, but only if your team genuinely lives in Kubernetes. The pitch is compelling: your whole infra becomes Kubernetes resources, so developers request infrastructure the same way they request a deployment — with a YAML file and kubectl apply. You build a Composition that wraps an RDS instance, and developers just create a PostgreSQLInstance custom resource without ever touching AWS credentials directly. Here’s what that looks like from the developer side:
apiVersion: database.example.internal/v1alpha1
kind: PostgreSQLInstance
metadata:
  name: my-app-db
  namespace: team-payments
spec:
  parameters:
    storageGB: 20
    region: us-east-1
  compositionSelector:
    matchLabels:
      provider: aws
      environment: production
  writeConnectionSecretToRef:
    name: my-app-db-credentials
The developer doesn’t know or care that this triggers a Crossplane Composition that provisions an RDS subnet group, a parameter group, a security group, and the actual RDS instance in sequence. That abstraction is genuinely powerful for platform teams. The honest downside: writing those Compositions is painful. The patching syntax for passing values between resources is non-obvious, and debugging a broken Composition means reading Kubernetes events and controller logs, not a clean error message. Budget two to three weeks for your first production-grade Composition before you open it up to other teams.
Scenario C: Small Team, Everyone Writes Python, Starting Fresh
Pulumi Community Edition is the obvious pick here. Real Python, not a DSL that looks like Python. You get loops, functions, classes, conditionals — all the stuff that makes HCL painful when your logic gets complex. The community edition gives you state stored in Pulumi Cloud for free (with a limit that most small teams won’t hit), or you can point state at your own S3 bucket with pulumi login s3://your-state-bucket. A basic AWS stack looks like this:
import pulumi
import pulumi_aws as aws

# This is just Python — import libraries, write functions, do what you want
environments = ["staging", "production"]

for env in environments:
    bucket = aws.s3.Bucket(
        f"app-assets-{env}",
        acl="private",
        tags={"Environment": env, "ManagedBy": "pulumi"},
    )
    pulumi.export(f"{env}_bucket_name", bucket.id)
That loop replacing eight blocks of HCL is the exact moment developers who’ve suffered through Terraform’s count and for_each limitations understand why Pulumi exists. The honest caveat: Pulumi’s docs are inconsistent. The TypeScript examples are the best-maintained, and the Python examples sometimes lag behind by a few provider versions. If you hit a wall, check the TypeScript example first and translate it mentally — it’s annoying but faster than waiting for a GitHub issue to get resolved.
Scenario D: Enterprise Needing Audit Trails and Policy Enforcement
OpenTofu paired with Atlantis gives you a genuinely solid open-source alternative to Terraform Cloud’s paid tiers. Atlantis runs as a server in your cluster, listens to pull request webhooks from GitHub/GitLab/Bitbucket, runs tofu plan on every PR, posts the plan output as a comment, and only runs tofu apply when a designated approver comments atlantis apply. Every infrastructure change is a PR, every plan is reviewed, every apply is logged. That’s your audit trail without paying for HCP Terraform’s Business tier.
# atlantis.yaml in your repo root — controls which projects get planned
version: 3
projects:
  - name: networking
    dir: ./networking
    workspace: production
    autoplan:
      when_modified: ["**/*.tf", "../modules/**/*.tf"]
    apply_requirements:
      - approved
      - undiverged
  - name: application
    dir: ./application
    workspace: production
    autoplan:
      when_modified: ["**/*.tf"]
    apply_requirements:
      - approved
Add Open Policy Agent (OPA) checks on top — evaluated against the JSON plan output before anyone can apply — and you can enforce rules like “no S3 buckets without encryption” or “no security groups open to 0.0.0.0/0” at plan time. That combination covers most of what enterprise teams actually need from a governance standpoint.
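If you don’t want to stand up a full OPA/conftest pipeline right away, the same class of rules can be checked with a short script against the JSON plan (`tofu show -json plan.out`). This is a minimal sketch of the idea — the function and the two rules are mine, not any OpenTofu or OPA API:

```python
def violations(plan: dict) -> list[str]:
    """Scan a tofu/terraform JSON plan for two illustrative policy breaches."""
    problems = []
    for change in plan.get("resource_changes", []):
        after = (change.get("change") or {}).get("after") or {}
        if change.get("type") == "aws_s3_bucket":
            # Treat a bucket with no encryption configuration in its planned
            # state as a violation (in provider v4+ this may live on a
            # separate resource, so adapt the rule to your provider version).
            if not after.get("server_side_encryption_configuration"):
                problems.append(f"{change['address']}: no encryption configured")
        if change.get("type") == "aws_security_group":
            for rule in after.get("ingress") or []:
                if "0.0.0.0/0" in (rule.get("cidr_blocks") or []):
                    problems.append(f"{change['address']}: open to 0.0.0.0/0")
    return problems
```

Feed it `json.load(open("plan.json"))` in CI and fail the job if the list is non-empty. It won’t replace a real policy engine, but it catches the obvious stuff on day one.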
The One Thing I’d Actively Warn You Against
Don’t mix two of these tools in the same stack unless you have an extremely specific, well-documented reason. I’ve seen teams run Crossplane for their Kubernetes-native resources and OpenTofu for “everything else,” and the state management complexity compounds fast. You end up with resources that depend on each other across two separate reconciliation systems — Crossplane’s controller loop and OpenTofu’s state — and debugging a broken dependency between them is miserable. The Crossplane resource exports a connection secret, your OpenTofu config reads it with a data source, something changes in the wrong order, and now you’re reading two different logs, two different state representations, and trying to figure out which system is lying to you.
Pick one tool per stack boundary. If your Kubernetes platform team uses Crossplane, that’s their domain. If your infrastructure team uses OpenTofu, that’s theirs. Hard boundary between them, clean interfaces (like SSM parameters or Kubernetes secrets as the handoff point), and don’t let them share ownership of the same resources.
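As a concrete example of that handoff, the infrastructure side can publish an identifier to SSM and treat the parameter — not the resource — as the contract. A sketch on the OpenTofu side (resource and parameter names are illustrative):

```hcl
# OpenTofu owns the VPC and publishes its ID. The Kubernetes/Crossplane
# side only ever reads this parameter — it never touches the VPC resource.
resource "aws_ssm_parameter" "vpc_id" {
  name  = "/platform/network/vpc-id"
  type  = "String"
  value = aws_vpc.main.id
}
```

The point is that either side can be refactored freely as long as the parameter path and value stay stable.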
Setting Up OpenTofu with Atlantis for PR-Based Workflows (The Setup I Actually Use)
Why I Picked Atlantis Over Terraform Cloud (And How to Wire It Up With OpenTofu)
Terraform Cloud’s free tier caps you at 500 managed resources before usage-based billing kicks in. Atlantis gives you the same PR-based plan/apply workflow — comments on pull requests, locked state during applies, automatic plan output — for zero dollars, running on your own infrastructure. I switched after our team hit Terraform Cloud’s limits on a mid-sized AWS environment and realized we were paying for something we could self-host in an afternoon. The trade-off is real, though: you own the ops burden. If Atlantis goes down, your infra deployments stop. For most teams, that’s a fine trade.
Running Atlantis Locally First (Before You Commit to ECS)
Don’t deploy straight to ECS. Run it locally with Docker Compose to shake out your config before you touch production. Here’s the docker-compose.yml I actually use for local testing:
version: "3.8"
services:
  atlantis:
    image: ghcr.io/runatlantis/atlantis:latest
    ports:
      - "4141:4141"
    environment:
      ATLANTIS_GH_USER: "your-bot-account"
      ATLANTIS_GH_TOKEN: "${GH_TOKEN}"
      ATLANTIS_GH_WEBHOOK_SECRET: "${WEBHOOK_SECRET}"
      ATLANTIS_REPO_ALLOWLIST: "github.com/your-org/*"
      ATLANTIS_DEFAULT_TF_DISTRIBUTION: "opentofu"
      ATLANTIS_TOFU_VERSION: "1.7.0"
    volumes:
      - ./atlantis.yaml:/home/atlantis/atlantis.yaml
      - ~/.aws:/home/atlantis/.aws:ro
    command: server --config /home/atlantis/atlantis.yaml
Use ngrok to expose port 4141 so GitHub can actually send webhooks to your local instance. ngrok http 4141, take that URL, plug it into your GitHub repo’s webhook settings with /events appended. Open a test PR and you’ll see Atlantis post a plan comment within 30 seconds if everything’s wired right.
The Gotcha That Will Burn You: Atlantis Defaults to Terraform
This one caught me off guard and cost me an hour of debugging. Out of the box, Atlantis downloads Terraform — not OpenTofu. If you don’t set ATLANTIS_DEFAULT_TF_DISTRIBUTION=opentofu, it will silently pull the Terraform binary and run your plans with it. Your .tf files will still work because OpenTofu is backward-compatible, but you’re defeating the whole point and you might hit HashiCorp’s BSL license restrictions depending on your use case. Set the env var. Don’t forget it.
The atlantis.yaml Config That Points to OpenTofu
Beyond the env var, you can get explicit control by defining a custom workflow in your repo’s atlantis.yaml. This is what I run:
version: 3
automerge: false
delete_source_branch_on_merge: false
projects:
  - name: infra-prod
    dir: ./infra/prod
    workspace: default
    terraform_version: v1.7.0
    workflow: opentofu-workflow
workflows:
  opentofu-workflow:
    plan:
      steps:
        - env:
            name: OPENTOFU_VERSION
            value: "1.7.0"
        - run: tofu init -input=false
        - run: tofu plan -input=false -out=$PLANFILE
    apply:
      steps:
        - run: tofu apply -input=false $PLANFILE
The run steps call tofu directly, which means Atlantis’s binary download behavior gets bypassed entirely for plan and apply. You need tofu on the PATH inside the container — the official Atlantis image doesn’t include it, so either build a custom image that installs OpenTofu, or use the ATLANTIS_DEFAULT_TF_DISTRIBUTION=opentofu env var to let Atlantis handle the download. I use the env var for simplicity; custom images if I need a specific patch version pinned.
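If you do go the custom-image route, it’s only a few lines. This is a sketch, not the official recipe — verify the OpenTofu release URL pattern and the base image’s user setup against current docs, and the pinned version here is just an example:

```dockerfile
FROM ghcr.io/runatlantis/atlantis:latest
USER root
# Pin the OpenTofu version you've actually tested against. Assumes curl and
# unzip exist in the base image; install them first if they don't.
ARG TOFU_VERSION=1.7.0
RUN curl -fsSL "https://github.com/opentofu/opentofu/releases/download/v${TOFU_VERSION}/tofu_${TOFU_VERSION}_linux_amd64.zip" \
      -o /tmp/tofu.zip \
    && unzip /tmp/tofu.zip tofu -d /usr/local/bin \
    && chmod +x /usr/local/bin/tofu \
    && rm /tmp/tofu.zip
# Drop back to the unprivileged user the official image runs as.
USER atlantis
```

Build it, push it to your registry, and point the `image:` line in your compose file or ECS task definition at it.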
GitHub Actions as the Escape Hatch If You Don’t Want to Self-Host
Atlantis is the right call for teams running more than 10 modules — the PR locking alone prevents the “two engineers applying at the same time” disaster. But if you’re a solo dev or a two-person team, GitHub Actions with OIDC auth is lighter and has zero infrastructure to maintain. Here’s the setup I’d use:
# .github/workflows/tofu-plan.yml
name: OpenTofu Plan

on:
  pull_request:
    paths:
      - 'infra/**'

permissions:
  id-token: write
  contents: read
  pull-requests: write

jobs:
  plan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: opentofu/setup-opentofu@v1
        with:
          tofu_version: "1.7.0"

      - name: Configure AWS credentials via OIDC
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-tofu
          aws-region: us-east-1

      - name: Tofu Init
        run: tofu init
        working-directory: infra/prod

      - name: Tofu Plan
        id: plan
        run: tofu plan -no-color -out=tfplan 2>&1 | tee plan_output.txt
        working-directory: infra/prod

      - name: Post Plan to PR
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const output = fs.readFileSync('infra/prod/plan_output.txt', 'utf8');
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `Tofu Plan Output\n\n\`\`\`\n${output}\n\`\`\``
            });
The opentofu/setup-opentofu action is maintained by the OpenTofu project itself, so version pinning is reliable. OIDC auth to AWS means no long-lived credentials sitting in GitHub Secrets — your IAM role just needs a trust policy that allows token.actions.githubusercontent.com as the federated identity. The downside versus Atlantis: no state locking across PRs, no auto-apply on merge without more workflow engineering, and your plan history lives in GitHub’s logs rather than Atlantis’s UI. For teams with more than three people touching infra regularly, that gets messy fast. Go Atlantis. For solo projects or small teams, Actions is fine and one less thing to operate.
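The trust policy itself is short. Here’s a sketch — the account ID matches the placeholder used in the workflow above, the org/repo names are placeholders, and you should check the condition keys against AWS’s GitHub OIDC documentation before shipping it:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
        },
        "StringLike": {
          "token.actions.githubusercontent.com:sub": "repo:your-org/your-repo:*"
        }
      }
    }
  ]
}
```

The `sub` condition is the part that matters: without it, any GitHub repo could assume your role. Scope it to your org at minimum, ideally to specific repos or branches.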
What I’d Tell a Team Starting This Migration Today
Start with OpenTofu — Then Decide If You Actually Need a Bigger Change
The single biggest mistake I see teams make is treating this migration as an opportunity to rethink their entire infrastructure toolchain from day one. Don’t. If you’re running Terraform today and the BSL licensing risk is your primary concern, OpenTofu solves that problem without asking you to relearn anything. Start there. Get the legal exposure off your plate, stabilize, and then spend a quarter evaluating whether Pulumi or Crossplane actually solves a real problem you have — not a hypothetical one.
The first thing I ran when we kicked off our migration was this, against a staging environment:
tofu init -upgrade
tofu plan -out=migration.tfplan
tofu show migration.tfplan
Read that output. Don’t skim it. The -upgrade flag will pull the latest compatible provider versions and surface any constraint conflicts you’ve quietly accumulated over months of terraform apply runs where nobody updated the lockfile. We found three providers where our version constraints were pinned to ranges that OpenTofu’s registry resolved differently than HashiCorp’s. Not broken — just worth knowing before you’re staring at a failed apply against prod state at 11pm.
If you hit provider version issues that look like genuine incompatibilities rather than constraint mismatches, check the OpenTofu compatibility matrix before you assume something is broken. The registry coverage is excellent for anything in the major cloud providers, but some niche community providers lag behind. We had one internal provider wrapper that needed a minor patch — took two hours, not two days. One pleasant surprise: OpenTofu’s error messages are actually more verbose than Terraform’s in these cases, which helps.
Your state file is not a migration concern. I want to be direct about this because teams waste days worrying about it. The OpenTofu state format is identical to Terraform’s. If your state lives in S3 with a DynamoDB lock table, you point your backend config at the same bucket and key, run tofu init, and OpenTofu picks it up. No import, no conversion, no ceremony. The only config change in your backend.tf is that you’re now running tofu commands instead of terraform commands.
# backend.tf — literally unchanged from your Terraform config
terraform {
  backend "s3" {
    bucket         = "your-state-bucket"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}
Here’s the honest timeline from our migration: 8 engineers, roughly 15,000 lines of HCL spread across about 40 modules, all living in a monorepo running Atlantis for PR-based applies. Getting off Terraform and onto OpenTofu took one sprint — and most of that time was updating CI scripts and internal docs, not fixing actual breakage. The two sprints after that were getting Atlantis fully stable with OpenTofu. The Atlantis side bit us because we were on an older Atlantis version that needed an upgrade alongside the OpenTofu swap, and those two changes interacted in ways our testing hadn’t caught. If you’re running Atlantis, pin your Atlantis version upgrade to a separate PR from your OpenTofu cutover. Don’t batch those together the way we did.
- Week 1: Run tofu init -upgrade across all workspaces in non-prod. Fix lockfile conflicts. Get a clean tofu plan that shows no unexpected drift.
- Week 2: Update CI/CD pipelines. Replace the terraform binary with tofu in your Docker images or runner scripts. Run shadow applies — run both tools against the same state and compare plan output.
- Week 3: Prod cutover. This should be boring at this point.
- Sprints 2–3: Atlantis/Spacelift/whatever automation layer stabilization.
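The shadow-apply comparison in week 2 is much easier against structured output than against raw text. A rough sketch, assuming you’ve already exported both plans to JSON with `terraform show -json a.tfplan` and `tofu show -json b.tfplan` — the function names are mine:

```python
def planned_actions(plan: dict) -> dict[str, tuple[str, ...]]:
    """Map each resource address to its planned actions (create/update/delete)."""
    return {
        rc["address"]: tuple(rc["change"]["actions"])
        for rc in plan.get("resource_changes", [])
    }


def diff_plans(a: dict, b: dict) -> list[str]:
    """List every address where the two plans disagree on what would happen."""
    left, right = planned_actions(a), planned_actions(b)
    return [
        f"{addr}: {left.get(addr)} vs {right.get(addr)}"
        for addr in sorted(set(left) | set(right))
        if left.get(addr) != right.get(addr)
    ]
```

Load each file with `json.load` and pass the dicts in; an empty result means the two tools would do the same thing to your state, which is exactly the boring outcome you want before cutover.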
One more thing nobody mentions in the migration guides: update your .terraform.lock.hcl files and commit them. OpenTofu generates these with a slightly different hash format for some providers. If you don’t commit updated lockfiles, engineers on the team who pull and run tofu init locally will get integrity warnings that look alarming and aren’t. Five minutes of housekeeping saves an afternoon of confused Slack messages.
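The housekeeping itself is a couple of commands. This assumes `tofu providers lock` mirrors Terraform’s subcommand of the same name (check `tofu providers lock -help` for the flags your version supports), and the platform list should match whatever your laptops and CI runners actually are:

```shell
# Regenerate provider hashes for every platform your team runs, then commit.
tofu providers lock \
  -platform=linux_amd64 \
  -platform=darwin_arm64
git add '**/.terraform.lock.hcl'
git commit -m "Refresh provider lockfiles for OpenTofu"
```

Run it once per root module (or script it across the monorepo) and the integrity warnings go away for everyone at the same time.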