Summary
I help teams make Python and ML code run faster and more efficiently. Typical results include shorter processing times, higher throughput, and smoother ML deployments. I co-founded and scaled a video translation platform to 60k+ users on GCP/Kubernetes, focusing on runtime performance, async parallelism, and model-serving optimization. My expertise lies in profiling, GPU/CPU efficiency, and designing pipelines that stay responsive under load.
Selected Results
- Scaled Platform to 60k+ Users: Architected Verbalate's low-latency cloud infrastructure on GCP/Kubernetes and led its engineering.
- ~$120M/yr Revenue Line Impact: Deployed Adore Me's first ML recommender, now driving 15-20% of recommendations.
- +$9M/yr Marketing Spend Enabled: Co-built an influencer marketing platform that scaled marketing efforts with zero increase in customer acquisition cost (CAC).
Experience
- Scaled the platform to 60k+ users by leading product and engineering, focusing on a highly available and cost-effective GCP/Kubernetes architecture.
- Delivered production-grade ML systems (diarization, lip-sync, TTS) used in thousands of daily media processing jobs.
- Shipped a collaborative dubbing editor (React) and a retry-safe billing and metering pipeline (Stripe).
- Turnaround Project: Delivered a real-time production speech system on NVIDIA Jetson in under three weeks for a project that had been stalled for six months.
- Award-Winning Innovation: Won a special prize from Google for a real-time money transfer PoC at the Central Europe Open Banking competition.
- Finance Systems: Built a backtesting and options pricing platform that enabled the client to validate and commit to their core investment strategy.
- Optimized and deployed the company's first ML recommender, successfully serving a ~$120M/year business line.
- Built an aspect-based sentiment analysis pipeline and BI app that directly reshaped the product roadmaps of three teams.
- Automated the sourcing of 100,000 marketing prospects in one year by building an Instagram integration before official API support existed.
Consulting Services
- Python Performance & Cost Optimization: System Profiling (py-spy, Scalene), Latency Reduction (p95/p99), Cloud Cost Analysis (GCP/AWS), Database Query Tuning
- ML Systems Engineering: Production Model Serving (FastAPI, PyTorch), RAG & Recommender Systems, GPU Pipeline Optimization
- Systems Architecture & Reliability: Cloud-Native Design (GCP, AWS), Observability (OpenTelemetry, MLOps), Incident Forensics
- Python & Rust Integration: Accelerating critical code with Rust (PyO3), 10-100x speed improvements