diff --git a/docs/getting-started/benchmark.md b/docs/getting-started/benchmark.md
index 042798f87..698dfa880 100644
--- a/docs/getting-started/benchmark.md
+++ b/docs/getting-started/benchmark.md
@@ -298,3 +298,7 @@ Learn more about output options in the [Outputs documentation](../guides/outputs
 ## Authentication
 
 When benchmarking against servers that require authentication (such as OpenAI's API), provide an API key in the backend configuration. See the [API Key Configuration](../guides/backends.md#api-key-configuration) section in the Backends documentation for details.
+
+## Troubleshooting
+
+See the [Troubleshooting guide](../guides/troubleshooting.md) for common issues.
diff --git a/docs/getting-started/install.md b/docs/getting-started/install.md
index 4e47e2f64..0aacb77e5 100644
--- a/docs/getting-started/install.md
+++ b/docs/getting-started/install.md
@@ -90,4 +90,4 @@ To use the vLLM Python backend (in-process inference), see [vLLM Python backend]
 
 ## Troubleshooting
 
-If you encounter any issues during installation, ensure that your Python and pip versions meet the prerequisites. For further assistance, please refer to the [GitHub Issues](https://github.com/vllm-project/guidellm/issues) page or consult the [Documentation](https://github.com/vllm-project/guidellm/tree/main/docs).
+If you encounter any issues during installation, ensure that your Python and pip versions meet the prerequisites. For common runtime issues (enabling debug logging, tokenizer loading errors, macOS worker crashes), see the [Troubleshooting guide](../guides/troubleshooting.md). For further assistance, please refer to the [GitHub Issues](https://github.com/vllm-project/guidellm/issues) page.
diff --git a/docs/guides/datasets.md b/docs/guides/datasets.md
index ad3ee1e46..c7f0c8157 100644
--- a/docs/guides/datasets.md
+++ b/docs/guides/datasets.md
@@ -38,6 +38,8 @@ You can specify the tokenizer with `--tokenizer` and a configuration string. The
 
 - `--tokenizer '{"kind":"huggingface_auto","model":"path/to/processor","load_kwargs":{"use_fast":false}}'`
 
+If a Hugging Face model requires custom code to load its tokenizer, set `trust_remote_code` in `load_kwargs`. See [Troubleshooting: trust_remote_code](troubleshooting.md#tokenizer-trust_remote_code).
+
 If your dataset uses non-standard column names, you can use `--data-column-mapper` to map your columns to GuideLLM's expected column names. This is particularly useful when:
 
 1. Your dataset uses different column names (e.g., `question` instead of `prompt`, `instruction` instead of `text_column`)
diff --git a/docs/guides/index.md b/docs/guides/index.md
index 7005e268a..a23069f1f 100644
--- a/docs/guides/index.md
+++ b/docs/guides/index.md
@@ -84,4 +84,12 @@ Whether you're interested in understanding the system architecture, exploring su
 
   [:octicons-arrow-right-24: Multimodal Guide](multimodal/index.md)
 
+- :material-lifebuoy:{ .lg .middle } Troubleshooting
+
+  ______________________________________________________________________
+
+  How to troubleshoot common errors.
+
+  [:octicons-arrow-right-24: Troubleshooting Guide](troubleshooting.md)
+
 
diff --git a/docs/guides/troubleshooting.md b/docs/guides/troubleshooting.md
new file mode 100644
index 000000000..091c1fe77
--- /dev/null
+++ b/docs/guides/troubleshooting.md
@@ -0,0 +1,73 @@
+---
+weight: 15
+---
+
+# Troubleshooting
+
+Find your symptom below, then follow the linked fix. For CLI syntax, see [Run a Benchmark](../getting-started/benchmark.md#cli-option-format).
+
+| Symptom                                                     | Section                                                      |
+| ----------------------------------------------------------- | ------------------------------------------------------------ |
+| Requests fail or results look wrong                         | [Debug logging](#debug-logging)                              |
+| Custom code error when loading a model's tokenizer          | [Tokenizer: trust_remote_code](#tokenizer-trust_remote_code) |
+| `Worker process ... died unexpectedly (signal 11)` on macOS | [macOS worker crash](#macos-worker-crash-signal-11)          |
+
+## Debug logging
+
+Enable debug output to inspect request handling and worker startup:
+
+```bash
+GUIDELLM__LOGGING__CONSOLE_LOG_LEVEL=DEBUG guidellm run ... --disable-progress
+```
+
+Run `guidellm env` to confirm the settings are being applied. The `--disable-progress` flag is optional, but without it the interactive progress display can overwrite console log messages. Alternatively, write logs to a file as described in the [logging guide](../developer/developing.md#logging).
+
+For all logging options (file output, log levels), see [Logging](../developer/developing.md#logging) in the development guide.
+
+## Tokenizer: trust_remote_code
+
+### Symptom
+
+You get an error that looks like:
+
+```text
+The repository moonshotai/Kimi-K2.6 contains custom code which must be executed
+to correctly load the model. You can inspect the repository content at
+https://hf.co/moonshotai/Kimi-K2.6.
+You can avoid this prompt in future by passing the argument `trust_remote_code=True`.
+```
+
+### Fix
+
+If you fully trust the model repository, set `trust_remote_code` in the `load_kwargs` of `--tokenizer`:
+
+```bash
+--tokenizer '{"kind":"huggingface_auto","load_kwargs":{"trust_remote_code":true}}'
+```
+
+Do not set this for a model you do not trust: it allows code from the model repository to run on your machine.
+
+See [Datasets: Tokenizer](datasets.md#tokenizer) for other tokenizer options.
+
+## macOS worker crash (signal 11)
+
+### Symptom
+
+```text
+RuntimeError: Worker process group startup failed: Worker process died
+unexpectedly (signal 11). Check system logs for details
+```
+
+You may also see repeated log lines such as:
+
+```text
+Worker process died unexpectedly (signal 11)
+```
+
+### Fix
+
+GuideLLM defaults to `fork` multiprocessing, which can segfault on macOS. Use `spawn` instead:
+
+```bash
+GUIDELLM__MP_CONTEXT_TYPE=spawn guidellm run ...
+```