### Description
When quantizing Qwen3-Reranker-4B with `swift export` and GPTQ, I hit an IndexError on `layer_inputs[j]` inside optimum.gptq. Is GPTQ quantization of reranker models currently unsupported?
### Reproduction
Traceback (most recent call last):
  File "/home/ps/www/ms-swift/swift/cli/export.py", line 5, in <module>
    export_main()
  File "/home/ps/www/ms-swift/swift/llm/export/export.py", line 53, in export_main
    return SwiftExport(args).main()
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ps/www/ms-swift/swift/llm/base.py", line 49, in main
    result = self.run()
             ^^^^^^^^^^
  File "/home/ps/www/ms-swift/swift/llm/export/export.py", line 30, in run
    quantize_model(args)
  File "/home/ps/www/ms-swift/swift/llm/export/quant.py", line 286, in quantize_model
    QuantEngine(args).quantize()
  File "/home/ps/www/ms-swift/swift/llm/export/quant.py", line 45, in quantize
    gptq_quantizer = self.gptq_model_quantize(v2=(args.quant_method == 'gptq_v2'))
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ps/www/ms-swift/swift/llm/export/quant.py", line 280, in gptq_model_quantize
    gptq_quantizer.quantize_model(self.model, self.tokenizer)
  File "/home/ps/anaconda3/envs/ms-swift/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/ps/anaconda3/envs/ms-swift/lib/python3.11/site-packages/optimum/gptq/quantizer.py", line 630, in quantize_model
    layer_inputs[j] = nested_move_to(layer_inputs[j], block_device)
IndexError: list index out of range
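
For reference, here is a minimal sketch of how I read the failing line. It only assumes that `layer_inputs` ends up with fewer entries than there are calibration samples, for example because no inputs to the first block were captured for this model. That cause is a guess on my part and not confirmed. `move_inputs` and `check_captured` are hypothetical names for illustration, not optimum or ms-swift APIs.

```python
def move_inputs(layer_inputs, num_samples):
    """Mimic the shape of the failing loop: one captured input is indexed per calibration sample."""
    for j in range(num_samples):
        # Stand-in for nested_move_to(layer_inputs[j], block_device).
        layer_inputs[j] = layer_inputs[j]
    return layer_inputs


def check_captured(layer_inputs, num_samples):
    """Hypothetical guard: fail with a descriptive message instead of a bare IndexError."""
    if len(layer_inputs) < num_samples:
        raise RuntimeError(
            f"captured {len(layer_inputs)} block inputs for {num_samples} calibration samples"
        )


captured = []  # assumption: nothing was captured for the reranker
try:
    move_inputs(captured, num_samples=2)
except IndexError as exc:
    error_message = str(exc)  # same message as in the traceback above
```

If this reading is right, the question becomes why no block inputs are captured for Qwen3-Reranker-4B, rather than whether the indexing loop itself is wrong.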
### Logs
```shell
```
### Environment Information
tqdm 4.67.1
traitlets 5.14.3
transformers 4.57.1
transformers-stream-generator 0.0.5
triton 3.4.0
### Known Issue
- [ ] The issue hasn't been already addressed in Documentation, Issues, and Discussions.