Haoran Yu

Software Engineer
Haoran Yu is a Software Engineer focused on AI inference infrastructure and performance at scale. He builds and optimizes the model serving stack that powers production AI workloads across classical ML and large language models, from embedding models to multimodal LLMs. Specializing in distributed systems, he drives improvements in the latency, throughput, and reliability of high-performance model serving.
