Description
Here is the development roadmap for v0.4.0. Contributions and feedback are welcome.
Upgrades
- In-Place Upgrades: Support for updating components without pod recreation.
- Orchestrated Upgrade Order: Ensure the upgrade sequence is coordinated with the required component startup order.
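One way to picture the orchestrated upgrade order is as a topological sort over startup dependencies. The sketch below is only an illustration, not the project's implementation: the `upgrade_order` helper, the dependency map, and the component names (`router`, `prefill`, `decode`) are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def upgrade_order(startup_deps):
    """Return components ordered so each one comes after its dependencies.

    startup_deps maps a component name to the set of components that must
    be ready before it starts (hypothetical input format).
    """
    return list(TopologicalSorter(startup_deps).static_order())


# Hypothetical example: a router that needs both workers to be up first.
deps = {
    "router": {"prefill", "decode"},
    "prefill": set(),
    "decode": set(),
}
order = upgrade_order(deps)
```

In this sketch `router` always appears after `prefill` and `decode`; the relative order of the two workers is not constrained. A cyclic dependency map raises `graphlib.CycleError`, which is a reasonable point to reject the upgrade plan.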
Scheduling
- Original Node Scheduling: Support for scheduling pods back to their original nodes after restarts or preemptions.
- Multi-Level Gang Scheduling: Enable the co-scheduling of multiple, dependent groups of pods.
- Volcano Integration: Support for gang scheduling via the Volcano scheduler.
- Topology-Aware Scheduling: Co-locate Prefill and Decode pods on the same node whenever possible to maximize GPU utilization and VRAM efficiency.
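The all-or-nothing property behind gang scheduling can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how Volcano or this project places pods: the `gang_schedule` function, the pod and node names, and the GPU-count-only resource model are hypothetical, and the first-fit heuristic may reject a group that a smarter packing could place.

```python
def gang_schedule(pods, free_gpus):
    """Place every pod in the group, or none of them.

    pods: dict of pod name -> GPUs requested (hypothetical model)
    free_gpus: dict of node name -> GPUs currently free
    Returns a dict of pod name -> node name, or None if the group does not fit.
    """
    remaining = dict(free_gpus)  # work on a copy so a failed attempt changes nothing
    placement = {}
    # Try the largest requests first (first-fit decreasing heuristic).
    for pod, need in sorted(pods.items(), key=lambda kv: -kv[1]):
        node = next((n for n, free in remaining.items() if free >= need), None)
        if node is None:
            return None  # one pod cannot be placed, so the whole group is rejected
        remaining[node] -= need
        placement[pod] = node
    return placement


group = {"prefill-0": 4, "decode-0": 4}
fits = gang_schedule(group, {"node-a": 8})
rejected = gang_schedule(group, {"node-a": 4, "node-b": 2})
```

Because nodes are tried in insertion order, pods from the same group tend to land on the same node when it has room, which loosely mirrors the co-location goal of the topology-aware item above.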
Fault Tolerance
- Configurable Failure Policies: Allow users to define various FailurePolicy strategies to handle pod failures.
Runtime
- Simplified, Runtime-less Service Discovery: Streamline the cluster ConfigMap to reduce overhead and enable service discovery without requiring a dedicated EngineRuntime component.