Articles
Technical writing on AWS architecture, cloud governance, AI/ML systems, and agentic coding.

AWS Cost Allocation and FinOps Architecture Deep Dive
I have managed AWS spend across organizations ranging from $50,000 to $8 million per month. The single biggest difference between organizations that control their cloud costs an...

Achieving Determinism with LLM Agents: An Architecture Guide
I run a fleet of LLM agents that audit 42 repositories every day. Same code, same prompts, same model. And for weeks, every single run produced different results. Not because th...

Terraform + CloudFormation StackSets: Deploying IAM Roles Across Every Account in Your Organization
Every multi-account AWS organization needs a baseline IAM role in every member account. Cross-account access for security tooling, centralized billing queries, incident response...

AWS IAM: An Architecture Deep-Dive
Every AWS architecture decision I make runs through IAM eventually. Network topology, compute strategy, data pipeline design: none of it matters if the permissions are wrong. An...

OIDC and OAuth 2.0: An Architecture Deep-Dive
I have built OAuth integrations across web browsers, Electron desktop apps, and native iOS applications. The same protocol, three completely different implementation patterns, t...

Vector Database Architecture: How Vector Search Powers RAG Systems
I built my first vector search system with a flat NumPy array and brute-force cosine similarity. Three hundred fifty chunks, 1024 dimensions, under 2 MB. Search completed in micr...

Building an Enterprise Chatbot: React, FastAPI, and WebSocket Architecture
Every enterprise wants an AI chatbot now. Most of the tutorials out there will get you a working prototype in an afternoon. Deploying that prototype to production for a Fortune ...

Automated TDD with Claude Code: Testing Strategy for AI-Assisted Engineering
Every project I hand to Claude Code starts the same way: I write the testing strategy before the first line of application code exists. Not because I am a TDD purist (I have ski...

The Leverage Factor, Part 2: Defending the Numbers
"The Leverage Factor: Measuring AI-Assisted Engineering Output" generated more direct messages than anything else I have published. Some of the feedback was enthusiastic. A signif...

Agentic Coding, FOMO, and Flow State Addiction
Last Monday I went into my office at 7 AM to kick off a few Claude sessions before taking the trash cans to the street. I sat down, wrote three prompts, and started reviewing th...

Agentic Coding and Decision Fatigue: The Cognitive Cost of Supervising AI
Recently during heavy Claude Code usage, I started noticing an uncomfortable trend. At 8 AM I could run three agent sessions at once, spot a bad abstraction in a 200-line diff, ...

Building a Cloud Knowledge Benchmark: Testing What LLMs Actually Know About AWS
I spend most of my time building production systems on AWS. I also spend a growing fraction of my time working with LLMs to design and implement those systems. That combination ...

AWS RDS and Aurora Cost Optimization Strategies
Database costs are the second largest line item on most AWS bills I review, right behind compute. The problem is that RDS and Aurora pricing has enough moving parts to keep team...

The CAP Theorem, Consistency Models, and the Trade-Offs Nobody Warns You About
Every distributed system I have built forced a conversation about consistency before it forced a conversation about performance. Sometimes that conversation happened during desi...

Real-Time Messaging Protocols: WebSockets, SSE, gRPC, Long Polling, and MQTT Compared
I have built real-time features into more systems than I can count: chat, live dashboards, IoT telemetry pipelines, collaborative editors, trading feeds, notification systems. E...

Cutting AWS Egress Costs with a Centralized VPC and Transit Gateway
NAT Gateway costs are the silent budget killer in multi-account AWS environments. I've audited organizations spending $15,000/month on NAT Gateway data processing alone, spread ...

Step Functions for Cart and Fulfillment: Async Workflow Patterns That Survive Production
Every e-commerce team starts with a synchronous checkout. The API receives a cart, charges the card, decrements inventory, and returns a confirmation. It works until it doesn't....

Giving Claude Code a Voice with ElevenLabs
I spend hours in Claude Code every day. Long sessions where I am reading, thinking, switching contexts, and occasionally glancing at the terminal to see if the agent finished a ...

Video Content Moderation with SageMaker Pipelines and Open-Source Models
I have built video analysis pipelines that process thousands of uploads per day, routing each file through multiple ML models for content moderation, face recognition, transcrip...

Video Content Moderation: AWS Managed Services vs. Open-Source Models
I have built video content moderation pipelines both ways: one using AWS managed AI services orchestrated by Step Functions, another using open-source models running on SageMake...

AWS S3 Cost Optimization: The Complete Savings Playbook
S3 is the most used service on AWS and, for many organizations, the single largest line item on the bill after compute. The insidious thing about S3 costs is that they creep. No...

AWS Aurora: Getting Close to Multi-Region Active/Active
Every production architecture conversation I've had in the last five years eventually lands on the same question: can we go active/active across regions? The answer with Aurora ...

Video Content Moderation with Step Functions and AWS AI Services
Every platform that accepts user-uploaded video faces the same operational reality: a single piece of unmoderated content can produce legal liability, advertiser flight, and rep...

MySQL vs. PostgreSQL on Aurora: An Architecture Deep Dive
Every relational database argument eventually becomes a religious debate. I have no interest in that. What I care about is how these engines behave under load, where they break, ...

AWS EC2 Cost Optimization: Five Strategies That Cut Compute Bills in Half
EC2 is the single largest line item on most AWS bills. It is also the line item where the gap between what teams pay and what they should pay is the widest. I have audited AWS a...

The Leverage Factor: Measuring AI-Assisted Engineering Output
In finance, leverage is the use of borrowed capital to amplify returns. A trader with 10x leverage controls ten dollars of assets for every dollar of equity. The principle is st...

AWS DynamoDB: An Architecture Deep-Dive
DynamoDB sits at the center of more AWS architectures than any other database service. I've used it for everything from mobile backends handling millions of daily active users t...

AWS Cognito User Authentication: An Architecture Deep-Dive
User authentication looks simple from the outside. A sign-up form, a login page, maybe a "Forgot Password" link. Behind that surface sits a sprawling system of token management,...

AWS CodeDeploy: An Architecture Deep-Dive
Deployment automation is the single most impactful investment a team can make in operational reliability. Manual deployments (SSH into a box, pull the latest code, restart the s...

AWS CodeBuild: An Architecture Deep-Dive
Nobody wants to own build infrastructure. Everybody depends on it. I have spent years managing Jenkins clusters, debugging flaky build agents, patching security holes on build s...

AWS Event-Driven Messaging: SNS, SQS, EventBridge, and Beyond
Most teams bolt messaging onto their architecture after the first production outage caused by synchronous service-to-service calls. A payment service calls an inventory service ...

AWS CodePipeline: An Architecture Deep-Dive
I keep running into the same mistake across teams. They treat their build tool and their pipeline orchestrator as one thing. They'll jam deployment logic into CodeBuild buildspe...

AWS Lambda Container Images: An Architecture Deep-Dive
Having spent years packaging Lambda functions as zip archives, I hit the wall that every team eventually hits: the 250 MB deployment package limit. The first time it happened wa...

Building a Production CI/CD Pipeline for Containerized AWS Lambda Functions
Manually shipping containerized Lambda functions works for experiments. Build the image locally, push it to ECR, update the function, verify it works. Fine for one function upda...

iOS Telemetry Pipeline with Kinesis, Glue, and Athena
Any iOS app with real users generates telemetry. Session starts, feature usage, error events, performance metrics, purchase funnels. Most teams start by shipping all of it to Am...

Infrastructure as Code: CloudFormation, CDK, Terraform, and Pulumi Compared
Infrastructure as Code is one of those concepts that every cloud team claims to practice, yet the architectural differences between the tools they use (and the downstream implic...

Lambda Behind ALB Behind CloudFront: An Architecture Deep-Dive
Five ways to expose a Lambda function over HTTP. At least. AWS keeps adding more. Most teams pick API Gateway on day one and never revisit that decision. Fine. API Gateway handl...

Single Serving Applications - The Clones
I'm systematically replacing my SaaS subscriptions with Single Serving Applications. These are purpose-built, AI-generated apps designed for an audience of one. Each clone is bu...

Ephemeral Apps Are Almost Here
I recently built a Harvest clone in 18 minutes, a Trello clone in 19 minutes, and a Confluence clone in 16 minutes. All three were generated entirely by Claude Opus 4.6 from req...

SageMaker Pipelines: An Architecture Deep-Dive
I have deployed SageMaker Pipelines across production ML platforms ranging from simple training-to-deployment workflows to multi-model ensembles with conditional quality gates. ...

Building Large-Scale SageMaker Training Pipelines with Step Functions
I have spent the last several months orchestrating ML training pipelines that coordinate dozens of SageMaker jobs: preprocessing, feature engineering, distributed training, hype...

AWS OpenSearch Service: An Architecture Deep-Dive
AWS OpenSearch Service runs behind more production workloads than most engineers realize: log analytics, full-text search, security event monitoring, vector similarity search. L...

Amazon CloudFront: An Architecture Deep-Dive
Amazon CloudFront is one of the most underestimated services in the AWS portfolio. Most teams think of it as a caching layer you put in front of your S3 bucket or Application Lo...

Amazon ElastiCache: An Architecture Deep-Dive
ElastiCache looks easy. Deploy a managed cache, point your app at the endpoint, enjoy sub-millisecond reads. Then production happens. Engine selection, cluster topology, evictio...

AWS Elastic Load Balancing: An Architecture Deep-Dive
I've yet to ship a production architecture on AWS that doesn't involve Elastic Load Balancing somewhere. Most teams slap a load balancer in front of their service and move on. F...

Best Practices for Networking in AWS SageMaker
Three years of locking down SageMaker environments across regulated industries taught me one thing early: your networking decisions on day one determine whether the ML infrastru...

Overlooked Productivity Boosts with Claude Code
Most engineers who adopt Claude Code start with the obvious: "write me a function," "fix this bug," "add a test." Those are fine. They also miss at least half the value. The lar...

AWS Step Functions: An Architecture Deep-Dive
Most teams ignore Step Functions until they find themselves writing ad-hoc state management code inside Lambda functions, chaining queues together with brittle retry logic, or b...

Amazon API Gateway: An Architecture Deep-Dive
Amazon API Gateway sits in front of most serverless and microservice architectures on AWS. Three distinct API types, a control plane versus data plane split, a layered throttlin...