Basin: Amazon Security's Data Lake
A platform processing 9 PB daily from 350,000+ sources to support ML workloads and security analytics across AWS.

The Challenge
Amazon Security needed an authoritative data lake that could ingest massive volumes of security data from hundreds of thousands of sources and support ML-based threat detection with sub-second query performance. It also had to scale with exponential growth while controlling costs and maintaining 99.9% availability.
My Approach
Led technical program management for Basin, coordinating work on 40+ microservices across Security, Operations, Finance, and Engineering teams. Focused on three parallel tracks: architecting scalable infrastructure to handle 965% year-over-year growth, implementing cost optimization initiatives that delivered $24.9M in annual savings, and establishing release management frameworks serving 2,000+ internal service teams.
Key Deliverables
Managed the data platform supporting anomaly detection models and ML-based threat detection
Designed data validation frameworks ensuring quality for downstream ML pipelines
Led 15 cost-reduction initiatives delivering $24.9M in annual savings
Architected release readiness frameworks and CI/CD standards for 2,000+ teams
Technologies & Tools
Related Projects
Fangorn Coral Collector Deployment
Deployed security log collectors on 350,000+ hosts across all AWS regions, achieving 97% security coverage with <1% CPU impact.
AIP: Alias Investigation Platform
Built a centralized IP investigation and insider risk platform serving 400+ investigators weekly and tracking 1M+ employee aliases in compliance with EU privacy requirements.