DeepSeek Open Sources DeepGEMM: Clean and efficient FP8 GEMM kernels

DeepGEMM is a library designed for clean and efficient FP8 General Matrix Multiplications (GEMMs) with fine-grained scaling, as proposed in DeepSeek-V3. It supports both normal and Mixture-of-Experts (MoE) grouped GEMMs. The library is written in CUDA and requires no compilation at installation: all kernels are compiled at runtime by a lightweight Just-In-Time (JIT) module. Currently