Gen AI | LLM
Interactive Workloads: Optimizing GPU Capacity for Interactive and Batch Workloads
Vincent Chan | Hao Zhang | Flex Wang
SEP 12, 2024 | 7 min read