Writing

Technical Notes & Published Work

A curated index of writing on AI systems, model deployment, vision workloads, RAG, and accelerator-backed cloud infrastructure.

Personal Deep Dives

Coming soon: concise breakdowns of GPU inference bottlenecks, ML systems tradeoffs, and cloud-native deployment patterns from my own projects.

Focus Areas

GPU Inference, ML Systems, Cloud Infra, RAG, Multimodal

AWS Published Blogs

Selected external publications on applied AI workloads.