FastChat supports a range of Hugging Face models and provides serving APIs. LLMeBench has an interface to the FastChat APIs that lets you query models and get their responses. Here is a quick start for loading a Hugging Face model with FastChat and running an LLMeBench asset on that model.
pip3 install "fschat[model_worker,webui]"
python3 -m fastchat.serve.controller
python3 -m fastchat.serve.model_worker --model-path gpt2 --host localhost --port 5004

Wait until the process finishes loading the model and you see "Uvicorn running on ...". The model worker will register itself with the controller.
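The manual wait for the model worker can also be scripted. The following is a minimal sketch, not part of FastChat or LLMeBench: the helper name `wait_for_port` and its parameters are hypothetical, and it only checks that a TCP connection to the given host and port succeeds. An open port suggests the server is accepting connections, but it does not by itself prove that the model has registered with the controller.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout.

    Hypothetical helper for illustration; not an API of FastChat or LLMeBench.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Try to open (and immediately close) a connection to the server.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the worker started above, a call such as `wait_for_port("localhost", 5004, timeout=300)` would block until the port opens or five minutes pass; the timeout value here is an arbitrary choice and larger models may need longer.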
Once the model worker is running, run the LLMeBench asset against it:

ENGINE_NAME="gpt2" AZURE_API_KEY="EMPTY" AZURE_API_URL="http://localhost:5004/v1" python3 -m llmebench --filter "AraBench_Ara2Eng_FastChat_ZeroShot*" --ignore_cache assets/benchmark_v1/ results/