CUDA: scale q8->f16 cache reserve on >=112 GiB cards (fixes session OOM on large models) #472
Open
slackarea wants to merge 1 commit into
Commits
Commits on Jun 28, 2026