Qwen3-Embedding-8B produces NaN embeddings for token 474 ("import") #845

@cbwchuck


System Info

Docker image: ghcr.io/huggingface/text-embeddings-inference:cuda-1.9
Start command: --tokenization-workers=16 --dtype float16 --auto-truncate --max-client-batch-size 128
Host OS: Ubuntu

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

I've been running Qwen3-Embedding-8B with float16 dtype and noticed that any input that starts with the token "import" (token ID 474), such as "importance", "import", or "important", causes an all-NaN vector to be returned.

# Reproduction

curl <TEI_URL>/embed \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"inputs": "importance"}'
# Returns: [[NaN, NaN, NaN, ...]]

Expected behavior

While investigating, I found other people reporting exactly the same issue at https://huggingface.co/Qwen/Qwen3-Embedding-8B/discussions/21, and padding the word with a leading space does seem to mitigate it.
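To check the leading-space mitigation side by side with the failing input, something like the following could be used. It is only a sketch: the `/embed` endpoint and the `{"inputs": ...}` payload come from the curl command above, while `embed`, `is_all_nan`, `compare`, and the `base_url` argument are hypothetical names introduced here. Invalid values are accepted as either `NaN` or `null`, since that may depend on how the server serializes them.

```python
import json
import math
import urllib.request


def embed(text: str, base_url: str) -> list:
    """POST a single input to <base_url>/embed and return its vector."""
    req = urllib.request.Request(
        f"{base_url}/embed",
        data=json.dumps({"inputs": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        # The response is a list with one vector per input.
        return json.loads(resp.read())[0]


def is_all_nan(vector: list) -> bool:
    """True if the vector is non-empty and every entry is NaN (or null)."""
    return bool(vector) and all(
        v is None or math.isnan(v) for v in vector
    )


def compare(word: str, base_url: str) -> dict:
    """Embed the word with and without a leading space and report NaN status."""
    return {
        "plain_is_nan": is_all_nan(embed(word, base_url)),
        "padded_is_nan": is_all_nan(embed(" " + word, base_url)),
    }
```

Calling `compare("importance", base_url)` with `base_url` set to the TEI server address would show whether only the unpadded input comes back as NaN.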

I checked out tag v1.9.2, traced through the model layer by layer, and found that the NaN originates from an FP16 overflow in the MLP layers. Here's the chain of events:

  1. RMSNorm normalizes hidden states to ~1.0 — perfectly safe in F16
  2. Attention runs fine on these normalized values
  3. The MLP's down_proj output, however, reaches values of about 2.95 million for this token
  4. The largest finite value F16 can represent is 65504, so this overflows to Inf
  5. The residual add (Inf + finite) stays Inf
  6. The next layer's RMSNorm receives Inf and produces NaN
  7. NaN propagates through every remaining layer

The overflow first appears at layer 2 and corrupts the entire output from that point on.
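The chain above can be reproduced on a toy vector with NumPy. This is a sketch under stated assumptions, not the TEI code path: the only numbers taken from the trace are the roughly 2.95 million peak and the F16 limit, everything else (the 4-element vectors, the all-ones projection, the weightless `rms_norm`, the `simulate` helper) is made up for illustration.

```python
import numpy as np

F16_MAX = float(np.finfo(np.float16).max)  # largest finite float16: 65504.0


def rms_norm(x: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Simplified RMSNorm (no learned scale), computed in float32."""
    x32 = x.astype(np.float32)
    rms = np.sqrt(np.mean(x32 * x32) + eps)
    return (x32 / rms).astype(x.dtype)


def simulate(down_proj_peak: float = 2.95e6):
    """Walk one toy hidden state through overflow -> Inf -> NaN."""
    with np.errstate(over="ignore", invalid="ignore"):
        # Steps 1-2: normalized hidden state, comfortably inside F16 range.
        hidden = np.array([0.5, -1.0, 1.5, 0.25], dtype=np.float16)
        # Steps 3-4: a down_proj output whose peak exceeds F16_MAX becomes Inf.
        mlp_out = np.array(
            [down_proj_peak, 10.0, -3.0, 1.0], dtype=np.float32
        ).astype(np.float16)
        # Step 5: residual add, Inf + finite stays Inf.
        residual = hidden + mlp_out
        # Step 6: RMSNorm divides Inf by an Inf RMS, which yields NaN.
        normed = rms_norm(residual)
        # Step 7: any later projection mixes that NaN into every output.
        next_proj = np.ones((4, 4), dtype=np.float16) @ normed
    return mlp_out, residual, normed, next_proj


if __name__ == "__main__":
    for name, value in zip(
        ("mlp_out", "residual", "normed", "next_proj"), simulate()
    ):
        print(name, value)
```

Note that in this toy version only the overflowing position turns into NaN after the norm; it is the following matrix multiply that spreads it to the whole vector. Passing a peak below `F16_MAX`, for example `simulate(100.0)`, keeps every stage finite.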

I'm not sure whether this is reasonable, but my hypothesis is that Qwen3-Embedding-8B was trained in BF16, so the MLP weights learned during training produce activations that fit within BF16's range but exceed F16's, and the overflow is the result of that precision mismatch.
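The range gap behind this hypothesis can be worked out from the two formats' bit layouts (F16: 5 exponent bits and 10 mantissa bits; BF16: 8 exponent bits and 7 mantissa bits). The helper name `max_finite` below is introduced only for this illustration, and the snippet says nothing about how the model was actually trained.

```python
def max_finite(exp_bits: int, mantissa_bits: int) -> float:
    """Largest finite value of an IEEE-style binary float format."""
    bias = 2 ** (exp_bits - 1) - 1
    # The all-ones exponent is reserved for Inf/NaN, so the largest usable
    # unbiased exponent equals the bias; the mantissa is all ones.
    return (2 - 2.0 ** -mantissa_bits) * 2.0 ** bias


F16_MAX = max_finite(exp_bits=5, mantissa_bits=10)   # 65504.0
BF16_MAX = max_finite(exp_bits=8, mantissa_bits=7)   # about 3.39e38

observed_peak = 2.95e6  # approximate down_proj value from the trace

if __name__ == "__main__":
    print("fits in F16: ", observed_peak <= F16_MAX)
    print("fits in BF16:", observed_peak <= BF16_MAX)
```

A value of about 2.95 million is therefore far inside BF16's range but roughly 45 times larger than the F16 maximum.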
