DeepEP is the first open-source library for efficient expert-parallel communication, designed specifically for training and inference of Mixture-of-Experts (MoE) models.

On February 25, the second day of its "Open Source Week", DeepSeek released the source code of DeepEP. DeepEP is the first open-source EP (Expert Parallelism) communication library for MoE (Mixture-of-Experts) model training and inference. It provides highly optimized all-to-all communication, supports low-precision computation including FP8, and targets modern high-performance computing workloads.

DeepEP is also deeply optimized for the asymmetric-bandwidth forwarding scenario from NVLink to RDMA, delivering high throughput, and it supports controlling the number of SMs (Streaming Multiprocessors) used, balancing throughput between training and inference tasks. For latency-sensitive decoding scenarios, DeepEP provides low-latency kernels based on pure RDMA with adaptive-routing support, enabling more flexible control of GPU resources across different workloads.
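To make the dispatch-and-combine pattern concrete, below is a minimal conceptual sketch of expert-parallel all-to-all communication using PyTorch's standard torch.distributed primitives. This is not DeepEP's own API: the function and variable names are illustrative, and the sketch assumes an already-initialized process group (e.g. NCCL) and that a gating network has already assigned each token a destination rank. DeepEP replaces these generic collectives with kernels tuned for NVLink and RDMA.

```python
# Conceptual sketch (not DeepEP's API) of the all-to-all pattern an EP
# communication library optimizes. In expert parallelism, each rank holds
# a subset of experts; tokens routed to remote experts are exchanged with
# one all-to-all (dispatch), processed locally, then returned with a
# second all-to-all (combine).
import torch
import torch.distributed as dist

def moe_dispatch_combine(tokens: torch.Tensor,
                         dest_rank: torch.Tensor,
                         world_size: int) -> torch.Tensor:
    # tokens: [num_tokens, hidden]; dest_rank: [num_tokens] ints in
    # [0, world_size), the rank owning each token's target expert.
    # Sort tokens by destination so each rank's slice is contiguous.
    order = torch.argsort(dest_rank)
    sorted_tokens = tokens[order]

    # Exchange per-rank send counts so every rank knows what it receives.
    send_counts = torch.bincount(dest_rank, minlength=world_size)
    recv_counts = torch.empty_like(send_counts)
    dist.all_to_all_single(recv_counts, send_counts)

    # Dispatch: move each token to the rank that owns its expert.
    recv_tokens = tokens.new_empty(int(recv_counts.sum()), tokens.size(1))
    dist.all_to_all_single(recv_tokens, sorted_tokens,
                           recv_counts.tolist(), send_counts.tolist())

    expert_out = recv_tokens * 2.0  # stand-in for the local expert MLPs

    # Combine: the reverse all-to-all returns results to their source rank.
    combined = tokens.new_empty(sorted_tokens.shape)
    dist.all_to_all_single(combined, expert_out,
                           send_counts.tolist(), recv_counts.tolist())

    # Undo the sort to restore the original token order.
    out = torch.empty_like(combined)
    out[order] = combined
    return out
```

Every forward pass through a MoE layer performs one such dispatch and one combine, which is why the efficiency of this all-to-all exchange dominates expert-parallel communication cost.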