Add troubleshooting guide #860

Open: jaredoconnell wants to merge 1 commit into vllm-project:main from jaredoconnell:docs/troubleshooting-guide

@jaredoconnell jaredoconnell commented Jun 25, 2026

Collaborator

Summary

Adds a troubleshooting section to the docs

Details

Talks about

  • Debug logging
  • A tokenizer situation that we've had several people ask us about.
  • macOS process failures

Test Plan

  • Just read over it and make sure CI passes.

  • "I certify that all code in this PR is my own, except as noted below."

Use of AI

  • Includes code generated or substantially modified by an AI agent
  • Includes markdown generated or substantially modified by an AI agent
  • Includes tests generated or substantially modified by an AI agent

NOTE: the Generated-by or Assisted-by trailers should be used in git commit messages when code or tests were generated or substantially modified by an AI agent, as described in the project's DEVELOPING.md file.


git log

commit 57b86e6
Author: Jared O'Connell <joconnel@redhat.com>
Date: Thu Jun 25 14:24:05 2026 -0400

Added troubleshooting guide

Generated-by: Cursor AI Claude Opus 4.6
Signed-off-by: Jared O'Connell <joconnel@redhat.com>

@jaredoconnell changed the title from "Added troubleshooting guide" to "Add troubleshooting guide" on Jun 25, 2026

@dbutenhof dbutenhof left a comment

Collaborator

I might be tempted to add another troubleshooting scenario that we're on the cusp of solving:

Problem: `--profile kind=sweep --constraint kind=max_requests,count=1000` runs for a long time. (E.g., "A benchmark configured with --max-requests=1000 --rate-type=sweep took ~7 hours to complete on H100 GPUs, when users expected it to complete in minutes.")

Solution: The constraint is applied separately to each benchmark strategy, so sweep's first ("synchronous") strategy runs all 1000 requests sequentially. Specify a smaller request limit, or limit each benchmark strategy by duration rather than by count, with `--constraint kind=max_duration,seconds=60`.
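
For the docs, a rough back-of-the-envelope sketch might make the cost of the count-based limit concrete. The per-request latency and the number of strategies below are assumed values picked for illustration; they are not measurements from the report quoted above, and the real sweep size may differ.

```bash
#!/usr/bin/env bash
# Back-of-the-envelope estimate. latency_s and strategies are ASSUMED values
# chosen for illustration, not numbers reported in this PR.
requests=1000      # from --constraint kind=max_requests,count=1000
latency_s=20       # assumed: seconds per request when issued one at a time
strategies=10      # assumed: number of strategies the sweep runs
duration_s=60      # from --constraint kind=max_duration,seconds=60

# The synchronous strategy issues requests one after another, so its run time
# is roughly requests * latency.
sync_s=$(( requests * latency_s ))

# With a duration constraint, each strategy is capped independently.
capped_s=$(( strategies * duration_s ))

echo "count-limited synchronous strategy: ~$(( sync_s / 60 )) min"
echo "duration-limited sweep (all strategies): ~$(( capped_s / 60 )) min"
```

With these assumed values the synchronous strategy alone runs for more than five hours, while the duration-capped sweep finishes in about ten minutes of benchmark time.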

| Symptom | Section |
| ----------------------------------------------------------- | ------------------------------------------------------------ |
| Requests fail or results look wrong | [Debug logging](#debug-logging) |
| `trust_remote_code=True` when loading a tokenizer | [Tokenizer: trust_remote_code](#tokenizer-trust_remote_code) |


That doesn't sound like a "symptom" -- it's the solution. The "symptom" is more like the statement at your link: "a HuggingFace model requires custom code to load".

Enable verbose output to inspect request handling and worker startup:

```bash
GUIDELLM__LOGGING__CONSOLE_LOG_LEVEL=DEBUG guidellm run ...
```


You might want to suggest disabling console output, which can make debug logs hard to read, or using `GUIDELLM__LOGGING__LOG_FILE` to log to a file?
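
Something along these lines, as a sketch only. `GUIDELLM__LOGGING__LOG_FILE` is the variable named in this comment and should be checked against the settings before it goes into the docs; the file name is an arbitrary example, and whether the file receives DEBUG-level records without further configuration is not confirmed here.

```bash
# Sketch only: GUIDELLM__LOGGING__LOG_FILE is taken from this review comment
# and is unverified; the file name is an arbitrary example.
export GUIDELLM__LOGGING__CONSOLE_LOG_LEVEL=DEBUG
export GUIDELLM__LOGGING__LOG_FILE="guidellm-debug.log"

# Then run the benchmark as usual (arguments elided, as in the docs snippet):
# guidellm run ...
```

Exporting the variables keeps the `guidellm run` line itself short, at the cost of the settings persisting for the rest of the shell session.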

The repository moonshotai/Kimi-K2.6 contains custom code which must be executed
to correctly load the model. You can inspect the repository content at
https://hf.co/moonshotai/Kimi-K2.6.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.


This is the fix, not part of the "symptom".

Pass `trust_remote_code` through `--tokenizer` `load_kwargs`:

```bash
--tokenizer '{"kind":"huggingface_auto","load_kwargs":{"trust_remote_code":true}}'
```


Although I'd hesitate to put this in the documentation, I'd be really tempted to shortcut that with kind=huggingface_auto,load_kwargs.trust_remote_code=true 😁
