DeepSeek-V4-Flash-DSpark #468

Description

@lobanov

DeepSeek released DeepSeek-V4-Flash-DSpark, a variant of the V4 Flash model with built-in speculative decoding, and described the methodology in a companion paper. Given the focus on the performance of running the model locally, I think it makes sense to explore how DSpark could be integrated into ds4.

DSpark appears to provide a considerable improvement in tok/s at fixed concurrency over the previous generation of speculative decoding (MTP):

Image
