About
Hi, I’m Hoang Vien Duy — an AI engineer working at the edge.
My focus is on-device / edge AI: taking models off the cloud and making them small, fast, and reliable on real hardware. That means model optimization (quantization, pruning, distillation), inference runtimes (TFLite, ONNX Runtime, ExecuTorch, MediaPipe), and the systems work needed to hit production latency targets on mobile and embedded devices.
I believe great on-device AI lives at the intersection of three disciplines:
- Machine learning — knowing what the model actually needs to do.
- Systems engineering — building pipelines that are fast and maintainable.
- Hardware awareness — understanding the NPUs, GPUs, and memory constraints you’re deploying onto.
What I write about
On this blog I share what I learn shipping AI to the edge: practical skills, emerging trends, and lessons from real-world deployment. Posts are written in both English and Vietnamese (Tiếng Việt).
Get in touch
- GitHub — @vieduy
- LinkedIn — Duy Hoang Vien
AI on the Edge. Optimized to the Core. Built for Speed.