Hire an Illini

Haoran Qiu

  • Advisor:
    • Ravishankar K. Iyer
  • Departments:
  • Areas of Expertise:
    • ML for Systems
    • Cloud Computing
    • Distributed Systems
  • Thesis Title:
    • Efficient and Robust Online Learning in Cloud Datacenters
  • Thesis abstract:
    • Cloud computing is undergoing an unprecedented evolution. On the demand side, emerging applications such as microservices, large-scale machine learning (ML) model training (e.g., PaLM and GPT), self-driving vehicle simulation, and personalized medicine require increased efficiency and scalability. At the same time, in dynamically evolving (and possibly multi-cloud) datacenters, the underlying cloud infrastructure is becoming significantly more complex, heterogeneous, and distributed. An intelligent approach is needed to automate performance-aware resource management, which is the focus of this thesis. Today’s vertical integration in cloud datacenter resource management is based on handcrafted heuristics, and current cloud infrastructure (compute/network/storage) optimizations are largely limited to the resources that they control. Variations across machine configurations, workloads, and deployment environments make heuristic generation repetitive and costly. Existing heuristics-based resource management approaches, which focus on individual pieces of the system stack, are therefore untenable: they cannot adapt quickly to new workloads or evolving cloud environments. To continue to meet applications’ performance and resilience requirements, my thesis proposes that (a) machine intelligence (i.e., ML and reinforcement learning, or RL) combined with systems domain knowledge and (b) cross-resource optimization (with Moore’s Law coming to an end) are two core building blocks of future resource management in cloud datacenters. The overarching goal is to design and implement a resource management framework, built on efficient and robust online learning, that seamlessly meets the performance and resilience requirements (typically specified as service-level objectives, or SLOs) of diverse workloads in cloud datacenters.

Contact information:
haoranq4@illinois.edu