Available for new opportunities

Alex Chen

$ whoami >> _

Building resilient infrastructure at scale. 7+ years shipping production systems that don't page you at 3am.

99.98% Uptime SLA
200+ Deployments/mo
40% Cost Reduction
bash — alex@prod-cluster
$ kubectl get nodes
NAME           STATUS   ROLES           AGE
prod-node-01   Ready    control-plane   42d
prod-node-02   Ready    worker          42d
prod-node-03   Ready    worker          42d
$ terraform apply -auto-approve
Plan: 12 to add, 3 to change, 0 to destroy.
Apply complete! Resources: 12 added, 3 changed, 0 destroyed.
$ helm upgrade --install app ./charts
Release "app" has been upgraded. Happy Helming!
$

Infrastructure as a craft.

I'm a Senior DevOps Engineer based in San Francisco with a passion for building infrastructure that scales gracefully and fails safely. I believe great DevOps is invisible — your developers ship faster, your systems stay up, and nobody loses sleep.

My approach combines deep technical expertise with a product mindset. I don't just keep the lights on — I architect systems that give engineering teams superpowers.

AWS Certified · CKA Certified · Terraform Associate · Open Source Contributor
system.config
location: San Francisco, CA
experience: 7+ years
focus: Cloud & Platform Eng
education: B.S. Computer Science
status: ● open to work
coffee: black, always

The Stack.

Cloud Platforms

AWS GCP Azure DigitalOcean
Expert · 92%

Containers & Orchestration

Kubernetes Docker Helm Istio
Expert · 95%

CI/CD & Automation

GitHub Actions Jenkins ArgoCD CircleCI
Expert · 90%

Infrastructure as Code

Terraform Ansible Pulumi CloudFormation
Advanced · 88%

Monitoring & Observability

Prometheus Grafana Datadog ELK Stack
Advanced · 85%

Security & Compliance

Vault OPA Falco SOC2
Advanced · 80%

Work History.

Senior DevOps Engineer

Stripe
2022 — Present
  • Architected multi-region Kubernetes platform serving 50M+ daily API requests with 99.99% uptime
  • Reduced deployment time from 45 min to 8 min via GitOps pipeline with ArgoCD and progressive delivery
  • Led cloud cost optimization initiative saving $2.4M annually through right-sizing and spot instance strategy
  • Built internal developer platform (IDP) adopted by 200+ engineers, reducing onboarding time by 60%
Kubernetes AWS Terraform ArgoCD Datadog

DevOps Engineer

Airbnb
2019 — 2022
  • Migrated 300+ microservices from bare metal to GKE, reducing infrastructure costs by 35%
  • Implemented service mesh with Istio enabling zero-trust networking across all production services
  • Built automated disaster recovery system achieving RTO of 15 minutes and RPO of 5 minutes
GKE Istio Ansible Prometheus

Cloud Infrastructure Engineer

Cloudflare
2017 — 2019
  • Managed global CDN infrastructure across 200+ PoPs handling 10+ Tbps of traffic
  • Automated network provisioning reducing manual configuration time by 80%
  • Developed internal tooling for capacity planning used across 3 engineering teams
Linux Python BGP Nginx

Featured Work.

02

TerraVault

Terraform module library with 50+ production-ready AWS modules, automated security scanning, and cost estimation built into the CI pipeline.

Terraform AWS Python OPA
03

ObserveX

Unified observability platform integrating metrics, logs, and traces with ML-powered anomaly detection and automated incident response playbooks.

Grafana ClickHouse Python OpenTelemetry
04

DriftGuard

Real-time infrastructure drift detection tool that compares live cloud state against Terraform state files and triggers automated remediation workflows.

Go Terraform AWS SDK Slack API

Let's build something great.

Open to senior DevOps, Platform Engineering, and SRE roles. Also available for consulting on cloud architecture and infrastructure audits.
