From 00770766be94e80ad9704c5d5b9a9b7f0886fa24 Mon Sep 17 00:00:00 2001
From: Tejas Mahajan <141305477+mahajantejas@users.noreply.github.com>
Date: Fri, 12 Jun 2026 10:47:41 +0530
Subject: [PATCH 1/2] Enhance AI Evaluations documentation with new steps

- Added instructions for raising requests and navigating AI Evaluations.
- fixed the images stretching issue
---
 .../AI Evaluations in Glific.md | 26 ++++++++++++-------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/docs/5. Integrations/AI Evaluations in Glific.md b/docs/5. Integrations/AI Evaluations in Glific.md
index 91c72f447..f5468c54a 100644
--- a/docs/5. Integrations/AI Evaluations in Glific.md
+++ b/docs/5. Integrations/AI Evaluations in Glific.md
@@ -12,6 +12,14 @@
 AI Evaluations allow your organization to test and measure how accurately your AI Assistant responds to questions, by comparing its answers against a trusted set of "golden" question-answer pairs. This helps NGOs ensure their AI Assistants are performing well and giving the right information to beneficiaries before deploying them at scale.
+## Raising the request
+1. Navigate to `AI Toolkit` -> `AI Evals`, you should see the button to `Request Access`
+Screenshot 2026-06-12 at 10 31 40 AM
+
+2. CLick on the button to raise the request to enable the feature
+3. Glific team will enable this feature within 24 hours.
+4. Once enabled the `AI Evals` page is visible
+
 ## Prerequisites
 Before running an AI Evaluation, make sure you have:
 1. At least one AI Assistant configured in Glific (see: AI Assistants [documentation](https://glific.github.io/docs/docs/Integrations/Creating%20and%20modifying%20assistants%20in%20Glific))
@@ -21,7 +29,7 @@ Before running an AI Evaluation, make sure you have:
 1. Log in to your Glific account.
 2. On the left sidebar, click on `AI Toolkit`
 3. Click on `AI Evals`
-Screenshot 2026-05-21 at 10 50 06 AM
+Screenshot 2026-05-21 at 10 50 06 AM
 The page shows a table of all past evaluations with the following columns:
 - Evaluation Name — The name you gave the evaluation, along with the AI Assistant version and Golden QA dataset used along with its duplication factor.
@@ -33,20 +41,20 @@ The page shows a table of all past evaluations with the following columns:
 ## Part 1: Running an AI Evaluation
 ### Step 1: Click "+ Create AI Evaluation"
 From the AI Evaluations page, click the + Create AI Evaluation button in the top right corner.
-Screenshot 2026-05-21 at 10 57 19 AM
+Screenshot 2026-05-21 at 10 57 19 AM
 You will be taken to the Create AI Evaluation page.
-Screenshot 2026-05-21 at 11 02 42 AM
+Screenshot 2026-05-21 at 11 02 42 AM
 ### Step 2: Select or Upload a Golden QA Dataset
 Under the Select Golden QA section, you have two options:
-Screenshot 2026-05-21 at 11 07 46 AM
+Screenshot 2026-05-21 at 11 07 46 AM
 - Option A — Use an existing dataset: Click the "Search or select a Golden QA dataset" dropdown and choose from your previously uploaded datasets.
 - Option B — Upload a new dataset: Click the "Upload Golden QA" button to upload a new CSV file. Provide the duplication factor for the uploaded data set.
-Screenshot 2026-05-21 at 11 07 14 AM
+Screenshot 2026-05-21 at 11 07 14 AM
 Duplication factor is the number of times the golden questions are repeated in the given dataset while running the evaluation. Allowed values 1-5.
@@ -55,13 +63,13 @@ Tip: Your CSV must follow the format question, answer with one pair per row. Acc
 ### Step 3: Select an AI Assistant
 Click the "Search or select an AI assistant" dropdown under AI Assistant and choose the specific assistant (and its version) you want to evaluate.
-Screenshot 2026-05-21 at 11 08 55 AM
+Screenshot 2026-05-21 at 11 08 55 AM
 Note: Each AI Assistant can have multiple versions. Make sure you select the correct version you want to test — this is especially useful when comparing how a newer version performs versus an older one.
 ### Step 4: Enter an Evaluation Name
 Under Evaluation Details, type a unique, descriptive name for this evaluation run in the Evaluation Name field.
-Screenshot 2026-05-21 at 11 10 21 AM
+Screenshot 2026-05-21 at 11 10 21 AM
 Tip: Use a name that helps you identify the test later, such as v2-assistant-may-test or knowledge-base-check-q1.
@@ -87,7 +95,7 @@ Click the "Download Results" button on any completed evaluation to download a de
 Open the results CSV in a Google spreadsheet to perform further analysis and interpret the results of the evaluation.
-Screenshot 2026-05-21 at 11 19 17 AM
+Screenshot 2026-05-21 at 11 19 17 AM
 Through comparing the golden answer (ground_truth_answer) with the generated answers (llm_answer), isolating the rows with lower scores (less than 0.3), you should be able to understand what to change in your assistant (either the prompt or the knowledge base) to get better answers from the AI assistant.
@@ -122,7 +130,7 @@ Each dataset is a CSV file containing a set of questions paired with their ideal
 ## How to Use It
 **Uploading a Golden QA Dataset**
 Golden QA datasets can be uploaded from the Create AI Evaluation form (accessed via the `+ Create AI Evaluation` button on the AI Evaluations tab). On that form, click Upload Golden QA to upload a new CSV file. A template is available via link on the create form to help you get started quickly.
-Screenshot 2026-05-21 at 11 24 42 AM
+Screenshot 2026-05-21 at 11 24 42 AM
 Once uploaded, the dataset will appear in the Golden QA tab and remain available for future evaluations.
 **Browsing and Searching Datasets**
 On the Golden QA tab, all previously uploaded datasets are listed in a table sorted by creation date (newest first). Use the Search bar at the top right to filter datasets by name if you have a large library. You can also click the Created On column header to toggle the sort order.

From 2e2f6141d38367a2e3a4b5257598f5be00be7474 Mon Sep 17 00:00:00 2001
From: Tejas Mahajan <141305477+mahajantejas@users.noreply.github.com>
Date: Fri, 12 Jun 2026 11:27:28 +0530
Subject: [PATCH 2/2] Update docs/5. Integrations/AI Evaluations in Glific.md

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
---
 docs/5. Integrations/AI Evaluations in Glific.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/5. Integrations/AI Evaluations in Glific.md b/docs/5. Integrations/AI Evaluations in Glific.md
index f5468c54a..9d225db3b 100644
--- a/docs/5. Integrations/AI Evaluations in Glific.md
+++ b/docs/5. Integrations/AI Evaluations in Glific.md
@@ -16,7 +16,7 @@ AI Evaluations allow your organization to test and measure how accurately your A
 1. Navigate to `AI Toolkit` -> `AI Evals`, you should see the button to `Request Access`
 Screenshot 2026-06-12 at 10 31 40 AM
 
-2. CLick on the button to raise the request to enable the feature
+2. Click on the button to raise the request to enable the feature
 3. Glific team will enable this feature within 24 hours.
 4. Once enabled the `AI Evals` page is visible