~/suraj.sh
case study · 01 · ai · infra · 2025 – present · live in production

InfraGenie — a GenAI Terraform provisioner.

A natural-language interface that turns infra requests into reviewable, modular Terraform. I built the first prototype in five days, my boss demoed it to a client in the US that same week, and I have been building it properly ever since.

role
software engineer · product owner
team
4–5 junior engineers
duration
may 2025 — present
stage
live · client in production
stack
python · streamlit → fastapi · azure · terraform
the brief

Five days from joining to a client demo.

I joined Synergech on May 28, 2025 as a software engineer. By June 2nd I had a working rough draft of InfraGenie — Python backend, Streamlit front-end — that could take a natural-language infra request and turn it into Terraform.

My boss demoed it to a client in the US that same week. They liked what they saw and wanted to ship it. That conversation turned a prototype into a product I have been building out ever since.

I now lead a team of four to five junior engineers building InfraGenie while also working directly with that client on a second product — an M&A automation platform for their insurance acquisition pipeline.

approach

An LLM as a router, not a writer.

The core design decision: the model does not write Terraform. It translates a natural-language request into a structured plan over a pre-vetted module library, then a templating engine renders the actual HCL deterministically, so the same plan always produces the same files. The model is a router. The Terraform is human-readable, diff-able, and safe for review.
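A minimal sketch of the "router, not writer" split, under stated assumptions. This is not InfraGenie's actual code: `MODULE_LIBRARY`, `PlanStep`, the module sources, and the version numbers are all hypothetical. The point it illustrates is that the planner only picks module keys and variable values, and plain deterministic code turns that plan into HCL.

```python
from dataclasses import dataclass, field

# Hypothetical vetted module library: key -> pinned source and version.
MODULE_LIBRARY = {
    "storage_account": {"source": "example-org/storage-account/azurerm", "version": "1.4.0"},
    "virtual_network": {"source": "example-org/vnet/azurerm", "version": "2.1.3"},
}


@dataclass
class PlanStep:
    module: str                                   # key into MODULE_LIBRARY, chosen by the planner
    name: str                                     # label of the rendered module block
    variables: dict = field(default_factory=dict)  # scalar inputs for the module


def hcl_value(value):
    """Render a Python scalar (bool, number, string) as an HCL literal."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def render_step(step: PlanStep) -> str:
    """Render one plan step as a module block; modules outside the library are rejected."""
    if step.module not in MODULE_LIBRARY:
        raise ValueError(f"module {step.module!r} is not in the vetted library")
    entry = MODULE_LIBRARY[step.module]
    lines = [
        f'module "{step.name}" {{',
        f'  source  = {hcl_value(entry["source"])}',
        f'  version = {hcl_value(entry["version"])}',
    ]
    # Sorted keys: the same plan always renders byte-identical output.
    for key in sorted(step.variables):
        lines.append(f"  {key} = {hcl_value(step.variables[key])}")
    lines.append("}")
    return "\n".join(lines)


def render_plan(steps: list[PlanStep]) -> str:
    """Render every step of the plan into one HCL document."""
    return "\n\n".join(render_step(step) for step in steps) + "\n"
```

Because the source and version come from the library entry and never from the model, a plan can only ever reference modules that were reviewed ahead of time.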

input · intent
  • natural-language brief: chat ui · web form
  • customer profile: org · region · compliance tier
  • policy guardrails: json schema · OPA rules
brain · planner
  • LLM planner: classify intent → structured plan
  • module library: vetted terraform modules
  • composer: plan + library → terraform/
output · review
  • terraform PR: branch protected · reviewable
  • tflint + checkov: policy gate
  • human review → apply: terraform cloud · azure

fig 1 — system architecture. the model never touches production directly.
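The guardrails stage in fig 1 implies the planner's structured output gets checked before anything is composed. A rough sketch of that check using only the standard library, assuming a hypothetical plan shape (`environment` plus a list of `steps`) and hypothetical allow-lists. A real setup would more likely use a JSON Schema validator, so treat the hand-rolled checks here as a stand-in.

```python
import json

# Hypothetical allow-lists; in practice these would come from the module library and config.
ALLOWED_MODULES = {"storage_account", "virtual_network"}
ALLOWED_ENVIRONMENTS = {"dev", "prod"}


def _is_allowed(value, allowed) -> bool:
    """True only for strings in the allow-list (also guards against unhashable JSON values)."""
    return isinstance(value, str) and value in allowed


def parse_plan(raw: str) -> dict:
    """Parse the planner's JSON output and reject anything outside the expected shape."""
    try:
        plan = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"planner output is not valid JSON: {exc}") from exc

    if not isinstance(plan, dict):
        raise ValueError("plan must be a JSON object")
    if not _is_allowed(plan.get("environment"), ALLOWED_ENVIRONMENTS):
        raise ValueError("plan.environment must be one of: " + ", ".join(sorted(ALLOWED_ENVIRONMENTS)))

    steps = plan.get("steps")
    if not isinstance(steps, list) or not steps:
        raise ValueError("plan.steps must be a non-empty list")

    for index, step in enumerate(steps):
        if not isinstance(step, dict):
            raise ValueError(f"steps[{index}] must be an object")
        if not _is_allowed(step.get("module"), ALLOWED_MODULES):
            raise ValueError(f"steps[{index}].module is not in the vetted library")
        if not isinstance(step.get("variables", {}), dict):
            raise ValueError(f"steps[{index}].variables must be an object")
    return plan
```

Anything the model emits that does not parse, or that names a module outside the allow-list, fails here with a readable error instead of reaching the composer.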

The unlock was realising the LLM shouldn't write the Terraform. It should pick which modules to compose. Determinism, where it matters.

— suraj sanjay
key decisions
  • LLM is a planner, not a code generator
  • Module library is pre-vetted and version-pinned
  • Every generation produces a PR — never a direct apply
  • Policy is enforced as code (OPA + Checkov), not by the model
  • Human-in-the-loop is mandatory for prod, optional for dev
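The last three decisions can be read as one small gate in front of apply. A sketch of that logic, with every name (`ChangeRequest`, `may_apply`, `dev_requires_review`) made up for illustration. In practice this kind of rule usually lives in branch protection and CI configuration rather than application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRequest:
    environment: str                   # "dev" or "prod"
    policy_checks_passed: bool         # outcome of the policy gate on the PR
    human_approved: bool               # a reviewer approved the PR
    dev_requires_review: bool = False  # hypothetical setting: review is optional for dev


def may_apply(change: ChangeRequest) -> bool:
    """Decide whether a generated PR is allowed to proceed to apply."""
    # Policy is enforced as code: a failed gate blocks every environment.
    if not change.policy_checks_passed:
        return False
    # Human review is mandatory for prod.
    if change.environment == "prod":
        return change.human_approved
    # Human review is optional for dev unless the team opts in.
    if change.environment == "dev":
        return change.human_approved or not change.dev_requires_review
    # Unknown environments are denied by default.
    return False
```

Note that nothing in the gate asks the model for an opinion: the inputs are the policy result and the reviewer's decision.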
how it shipped

Prototype to production.

01
may 28 – jun 2, 2025
5 days

Prototype

Joined Synergech. Built a rough draft in Python + Streamlit. Boss demoed it to a US client. They wanted to ship it.

02
jun — aug 2025
10 weeks

Building it properly

Replaced the Streamlit prototype with a proper FastAPI backend. Built out the module library and the planner. Onboarded the first junior engineers.

03
sep — nov 2025
12 weeks

Hardening & policy layer

Added OPA, Checkov, and the PR-only path. Internal pilot. Reduced hallucinations, added guardrails.

04
dec 2025 — now
ongoing

Client in production

Live with the client. Also building their M&A automation platform in parallel. Team of 4–5 juniors across both products.

stack

What runs under the hood.

application
  • Python
  • FastAPI
  • Streamlit (prototype)
  • PostgreSQL
ai · orchestration
  • LLM planner layer
  • Structured output
  • Module composer
infra · target
  • Terraform 1.7+
  • Azure landing zones
  • Terraform Cloud
policy · ci
  • OPA · Conftest
  • Checkov
  • tflint
  • GitHub Actions
honesty section

What broke in production.

Hallucinated module versions early in the pilot. The planner suggested a module version that did not exist. Caught in the PR gate. We now hard-pin the version registry into the system prompt and validate every emitted version string against it before render.
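A sketch of the "validate every emitted version string before render" step. The registry contents, the step shape, and the function names are assumptions for illustration, not the production code.

```python
# Hypothetical pinned registry: module key -> set of versions that actually exist.
VERSION_REGISTRY = {
    "storage_account": {"1.3.2", "1.4.0"},
    "virtual_network": {"2.1.3"},
}


def find_unknown_versions(steps):
    """Return (module, version) pairs from the plan that are not in the pinned registry."""
    unknown = []
    for step in steps:
        module, version = step.get("module"), step.get("version")
        known = VERSION_REGISTRY.get(module, set()) if isinstance(module, str) else set()
        if not isinstance(version, str) or version not in known:
            unknown.append((module, version))
    return unknown


def assert_versions_pinned(steps):
    """Raise before render if the planner emitted any version outside the registry."""
    unknown = find_unknown_versions(steps)
    if unknown:
        details = ", ".join(f"{module}@{version}" for module, version in unknown)
        raise ValueError(f"planner emitted versions outside the registry: {details}")
```

Running this check before the composer means a made-up version fails fast, with the offending module named, rather than surfacing later in the PR gate.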

The first client got cold feet about "AI writing infra". Reasonable. We made the planner output visible at every step and renamed the feature to "AI-assisted provisioning". Two demos later they were in. Naming is half the product.