2024 - R1. Large Language Model Inference on Heterogeneous Clusters | HOT • Practice