Thank you for this valuable contribution. I would like to build on your work. I can reproduce the reported results exactly with Qwen-1.8B, but after switching to ChatGLM3-6B-base I see some discrepancies in performance. Is this expected, and does ChatGLM3-6B-base need any hyperparameter adjustments to achieve comparable results?