Yuxiong He

Sr. Director, Software Engineering
Yuxiong He is a Sr. Director, Software Engineering, spearheading the development and research of Large Language Models (LLMs). As a pivotal co-leader of the Arctic project, she collaborates with a team of exceptional AI professionals to develop the Snowflake suite of foundational models. Her dedication to innovation is matched by her commitment to open source and open research, striving to build transformative and high-performing AI technologies. Previously, Yuxiong held the position of Partner Research and Product Manager at Microsoft, where she co-founded and led the DeepSpeed project. This industry-leading, open-source deep learning optimization library introduced groundbreaking innovations like ZeRO, 3D parallelism, and ZeroQuant. These advancements have significantly accelerated and democratized the training and inference processes of cutting-edge LLMs, making them more accessible to everyone in need. Yuxiong has published over 100 papers in major computer science conferences and journals. Her work has been recognized among the best papers at esteemed venues such as SIGIR, ICDE, WSDM, and Middleware, and her research continues to be widely applied in diverse systems and products.

MORE POSTS FROM Yuxiong He

Gen AI

Agent World Model (AWM): Infinity Synthetic Environments for Agentic Reinforcement Learning

Open-source Agent World Model generates 1,000 SQL-backed executable environments for agentic RL with benchmark-winning results.
FEB 13, 2026|10 min read
Gen AI

Smarter, Faster and Snowflake-Native: Real-Time Text2SQL Behind Snowflake Intelligence

Discover Arctic-Text2SQL-R1.5, Snowflake's new, native Text-to-SQL model built to overcome LLM latency and deliver higher accuracy for real-time conversational analytics.
NOV 04, 2025|7 min read

Arctic Long Sequence Training (ALST): Scalable And Efficient Training For Multi-Million Token Sequences

Snowflake's ALST enables scalable training of long-context models with up to 15 million tokens using Hugging Face and DeepSpeed, all without custom modeling code.
JUN 24, 2025|10 min read
Gen AI

Inside Snowflake Intelligence: Five Pillars of Enterprise-Grade Agentic AI

Explore the underlying architecture, orchestration, and system-level optimizations behind Snowflake Intelligence, a production-grade agentic AI system built for enterprise reasoning.
JUN 03, 2025|13 min read
Gen AI

Smaller Models, Smarter SQL: Arctic-Text2SQL-R1 Tops BIRD and Wins Broadly

A deep dive into how Snowflake AI built Arctic-Text2SQL-R1 using simple rewards, strong reasoning, and a scalable approach to real-world SQL generation.
MAY 29, 2025|14 min read
Gen AI

Arctic Inference with Shift Parallelism: The Fastest Open Source Inference System for Enterprise AI

Built by Snowflake AI Research, Arctic Inference uses Shift Parallelism, SwiftKV, and speculative decoding to power the fastest open-source enterprise AI.
MAY 29, 2025|15 min read
Gen AI

Scaling vLLM for Embeddings: 16x Throughput and Cost Reduction

Learn how we increased embedding throughput 3x in Snowflake Cortex—and 16x vs. vLLM—through smarter serialization, tokenization, and GPU optimization.
MAY 29, 2025|8 min read
Gen AI

Low-Latency and High-Throughput Inference for Long Context with Sequence Parallelism (aka Arctic Ulysses)

Ulysses, a novel sequence parallelism technique, boosts long-context LLM inference performance with 3.4x lower latency and better GPU efficiency.
APR 03, 2025|14 min read
Gen AI

Think. Execute. Excel: Arctic Text2SQL with Execution-Guided CoT

Learn how Snowflake’s ExCoT optimizes Text2SQL with execution-guided CoT and DPO, setting a new benchmark in natural language to SQL accuracy.
APR 02, 2025|10 min read

