Keeping production up — bare metal through cloud-native, and every platform shift in between. We build it, break it, and operate it.
Four things, done well. No frameworks, no decks, no "digital transformation." Just infrastructure you can read, run, and hand off.
A live production deployment: a recipe site for home cooks, built on the Foundry Platform. Not a portfolio piece. A real app, under real load, paying real AWS bills.
Terraform-managed from the root module down. Plan-on-PR, apply-on-merge via OIDC. A compromised pipeline's blast radius is limited to the role that pipeline can assume. Runs on personal money.
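A minimal sketch of how the plan-on-PR, apply-on-merge split can be scoped, assuming GitHub Actions as the CI provider (the copy above does not name one). The account ID, repository name, and `trust_policy` helper are hypothetical. Each role trusts exactly one OIDC subject, so a token minted for a pull request cannot assume the apply role.

```python
import json

# Assumption: GitHub Actions is the OIDC issuer. Swap in your CI provider's host.
OIDC_HOST = "token.actions.githubusercontent.com"


def trust_policy(account_id: str, repo: str, subject_suffix: str) -> dict:
    """Build an IAM trust policy that a single OIDC subject can assume."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{OIDC_HOST}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{OIDC_HOST}:aud": "sts.amazonaws.com",
                        f"{OIDC_HOST}:sub": f"repo:{repo}:{subject_suffix}",
                    }
                },
            }
        ],
    }


# Hypothetical account and repository, for illustration only.
ACCOUNT_ID = "123456789012"
REPO = "example-org/example-infra"

# Read-only plan role: assumable only from pull request runs.
plan_trust = trust_policy(ACCOUNT_ID, REPO, "pull_request")
# Apply role: assumable only from runs on the main branch.
apply_trust = trust_policy(ACCOUNT_ID, REPO, "ref:refs/heads/main")

print(json.dumps(apply_trust, indent=2))
```

The trust policy only decides who may assume a role. The permissions attached to each role decide how far the damage can spread, so the plan role would carry read-only permissions.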
The full runbook is longer, drier, and in the repo. These are the six we'd bring into a room on day one.
The person who designs the system is the person who carries the pager for it. Otherwise the design is a suggestion, not a commitment.
Least-privilege isn't a checkbox — it's the default. If a compromised pipeline can reach production, the problem is the pipeline, not the compromise.
An incident handled by a sleepy engineer following the runbook is better than one that depends on a hero who remembers. Write the doc. Update it when it lies.
You cannot operate what you cannot see. Logs, metrics, traces, and a single dashboard a human actually opens. No 'we'll add it later.'
Infrastructure changes go through code review. OIDC, no long-lived credentials, signed artifacts. The pipeline is the contract.
A NAT gateway you forgot about is a security problem. A forgotten log bucket is a compliance problem. Run the audit monthly, not yearly.
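One way the monthly audit could be sketched, assuming an inventory has already been exported from the cloud account. The `Resource` fields, the 30-day idle threshold, and the sample IDs are all hypothetical. The check flags anything with no owner or no recent use, which is how a forgotten NAT gateway or log bucket tends to look.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Resource:
    resource_id: str
    kind: str
    owner: Optional[str]  # None means nobody has claimed it
    last_used: date


def find_forgotten(
    resources: List[Resource], today: date, max_idle_days: int = 30
) -> List[Resource]:
    """Return resources that have no owner or have sat idle past the threshold."""
    cutoff = today - timedelta(days=max_idle_days)
    return [r for r in resources if r.owner is None or r.last_used < cutoff]


# Hypothetical inventory, for illustration only.
inventory = [
    Resource("nat-0abc", "nat_gateway", None, date(2024, 1, 10)),
    Resource("logs-old", "s3_bucket", "platform", date(2023, 11, 1)),
    Resource("nat-live", "nat_gateway", "platform", date(2024, 2, 27)),
]

flagged = find_forgotten(inventory, today=date(2024, 3, 1))
for resource in flagged:
    print(resource.resource_id, resource.kind)
```

Passing `today` in explicitly keeps the check reproducible, so the same inventory always produces the same findings in review.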
Built by an infrastructure operations professional with 25+ years of production experience — bare-metal data centers, 24×7 ops, single-homed environments where every decision had physical consequences.
AWS / Terraform / ECS Fargate. Foundry Platform. Moving the discipline without losing the rigor.
24×7 ops. On-call rotations. Incident command. Migrated a regulated workload through three data center transitions without a customer-visible outage.
Bare-metal. Single-homed environments. Every change had physical consequences. Learned what 'production' actually means.
First rotation. First outage I caused. First runbook I wrote. Everything since is a refinement.
Short engagements, long ones, and one-off audits. If you know what you need, send the repo. If you don't, send the symptoms.