Deepmind version's probability distribution #36

@YEOMJINSEOP

Hi, thanks for sharing the code.

I think the DeepMind version of the code, def speculative_sampling_v2, should be fixed, because the way it compares p's probability distribution with q's probability distribution differs from the algorithm in the DeepMind paper.

[current code]
if r < torch.min(torch.tensor([1], device=q.device), p[:, prefix_len + i - 1, draft_token_idx] / q[:, prefix_len + i - 1, draft_token_idx]):

[correct code]
if r < torch.min(torch.tensor([1], device=q.device), q[:, prefix_len + i - 1, draft_token_idx] / p[:, prefix_len + i - 1, draft_token_idx]):
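
For reference, here is a minimal sketch of the acceptance test with explicit names instead of p and q, so the direction of the ratio is unambiguous. It assumes the rule in the DeepMind paper, where a drafted token is accepted with probability min(1, target probability / draft probability). The function name and its arguments are hypothetical and are not taken from the repository; the fix above depends on which of p and q holds the target model's distribution in speculative_sampling_v2.

```python
import random


def accept_draft_token(target_prob: float, draft_prob: float, r: float) -> bool:
    """Hypothetical helper: decide whether a drafted token is accepted.

    target_prob: probability the target (large) model assigns to the token.
    draft_prob:  probability the draft (small) model assigned to the token.
    r:           a uniform random sample from [0, 1).
    """
    if draft_prob <= 0.0:
        # The token was sampled from the draft model, so its draft
        # probability should always be positive.
        raise ValueError("draft_prob must be positive")
    # Accept with probability min(1, target / draft).
    return r < min(1.0, target_prob / draft_prob)


if __name__ == "__main__":
    # The target model likes the token more than the draft model did:
    # the ratio is capped at 1, so the token is accepted for any r in [0, 1).
    print(accept_draft_token(target_prob=0.6, draft_prob=0.3, r=random.random()))

    # The target model likes the token less: accepted only when r < 0.25.
    print(accept_draft_token(target_prob=0.1, draft_prob=0.4, r=0.3))
```

If the ratio were inverted (draft over target), tokens that the target model considers unlikely would always be accepted, which is the mismatch this issue is about.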
