Task 6: leakage classifier (evidence → answer text overlap) by david-arredondo · Pull Request #2 · vinash85/Chem2TextQA

david-arredondo · 2026-04-24T21:37:50Z

Summary

Addresses critique C2 (does the Phase-1 model borrow text verbatim from evidence?). Audits the full agree-only gold dataset — 188,541 Q&A across 15,509 compounds — via three metrics, with staged review cases.

Metrics

LCS (longest common contiguous token substring) — reported as a threshold curve rather than a single flag; max observed across the corpus is 19 tokens.
5-gram overlap (word-level, answer ∩ union of parent-compound evidence 5-grams) — threshold > 3.
Cosine (max cosine between answer embedding and any evidence-sentence embedding, sentence-transformers/all-MiniLM-L6-v2, L2-normalized) — threshold > 0.85.
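
For readers who want the three definitions in executable form, here is a minimal sketch. It assumes lowercased whitespace tokenization and counts overlap as the number of distinct shared 5-grams; the function names are illustrative, and this is not the code in `compute_metrics.py`. The cosine helper works on plain lists of pre-normalized floats, standing in for the all-MiniLM-L6-v2 embeddings.

```python
from __future__ import annotations


def tokens(text: str) -> list[str]:
    # Assumption: lowercase + whitespace split; the PR's tokenizer may differ.
    return text.lower().split()


def lcs_tokens(a: list[str], b: list[str]) -> int:
    """Length of the longest contiguous token run shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                cur[j] = prev[j - 1] + 1  # extend the run ending at (x, y)
                best = max(best, cur[j])
        prev = cur
    return best


def ngrams(toks: list[str], n: int = 5) -> set[tuple[str, ...]]:
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}


def ngram5_overlap(answer: list[str], evidence: list[list[str]]) -> int:
    """Distinct answer 5-grams that also occur in any evidence sentence."""
    ev: set[tuple[str, ...]] = set()
    for sent in evidence:
        ev |= ngrams(sent, 5)
    return len(ngrams(answer, 5) & ev)


def cos_max(answer_vec: list[float], evidence_vecs: list[list[float]]) -> float:
    # Vectors are assumed L2-normalized, so the dot product is the cosine.
    return max(
        (sum(p * q for p, q in zip(answer_vec, v)) for v in evidence_vecs),
        default=0.0,
    )
```

As a toy case, the answer "The compound inhibits cyclooxygenase and reduces inflammation in vivo" against the evidence sentence "Aspirin irreversibly inhibits cyclooxygenase and reduces inflammation in vivo models" shares a 7-token contiguous run and 3 distinct 5-grams, so under the `> 3` rule it would not raise the 5-gram flag.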

Headline results

| Signal | Count | Rate |
| --- | --- | --- |
| `lcs_tokens > 40` | 0 | 0.000% |
| `lcs_tokens > 10` | 153 | 0.081% |
| `flag_ngram` (5-gram overlap > 3) | 1,821 | 0.97% |
| `flag_cos` (cos > 0.85) | 4,068 | 2.16% |
| `flag_ngram ∨ flag_cos` | 5,569 | 2.95% |
| `flag_ngram ∩ flag_cos` (strongest signal) | 320 | 0.17% |

No Q&A contains a verbatim contiguous copy of 20 or more tokens from any evidence sentence (the corpus-wide maximum is 19). Flags concentrate in topics that must draw on literature (therapeutic_use, mechanism, toxicity, drug_interactions, binding_mode, cell_biology, signaling_pathways; flag rates of 10–20%) and are near zero on purely structural topics (functional_groups 0%, scaffold 0.1%, engineering 0.2%). This is the SMILES-derivable-vs-evidence-supported design rule working as intended.
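
The union row in the headline results follows from the other rows by inclusion-exclusion, and the rates follow from the 188,541-row total. A short check, using only the counts reported above (the variable names are illustrative):

```python
TOTAL = 188_541                      # Q&A rows in the agree-only gold dataset
n_ngram, n_cos, n_both = 1_821, 4_068, 320

# |A or B| = |A| + |B| - |A and B|
n_either = n_ngram + n_cos - n_both  # 5569


def rate(count: int, total: int = TOTAL, digits: int = 2) -> float:
    """Percentage of the corpus, rounded as in the table."""
    return round(100 * count / total, digits)


print(n_either, rate(n_either))      # 5569 2.95
```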

Layout

All under task-6-leakage-classifier/:

README.md: quick overview, headline numbers, and how to reproduce
PLAN.md: locked plan / contract (decisions, thresholds, sampling)
leakage_summary.md: aggregate stats, LCS threshold curve, per-split and per-topic breakdowns
flagged_examples.md: top 20 per category (LCS descending, ngram, cos, co-flagged), each with the answer and closest evidence sentence staged for review. No judgments are recorded; those are left to the reviewer.
per_qa_leakage.jsonl: 188,541 rows with lcs_tokens, ngram5_overlap, cos_max, and flag booleans
scripts/: reproducible pipeline (setup_env.sh, sample.py, embed.py, compute_metrics.py, summarize.py, run.sh)
archive_20k_sample/: initial 20K-Q&A pilot outputs (kept as an audit trail; the pilot and full-run results agreed closely)

Reproducibility

scripts/config.py has absolute paths for this machine (inputs at /data/luis/..., large outputs at /data/dandreas/...). Edit that file to repoint, then bash scripts/setup_env.sh && bash scripts/run.sh. End-to-end on one H100 takes ~25 min (embedding 1.28 M unique texts ~4.5 min, scoring ~9 min).

Test plan

Spot-check ~20 flagged cases in flagged_examples.md — does the sort by "most egregious" make sense? Any obvious false positives / false negatives?
Validate that the LCS threshold curve gives an actionable operating point for filtering (if that's desired as a dataset-hygiene step).
Sanity-check per-topic rates vs the design-rule expectation (structural ≈ 0%, functional > 0%).
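
A sketch of how the second and third checks might be run once `per_qa_leakage.jsonl` is loaded into a list of dicts. The field names `lcs_tokens`, `ngram5_overlap`, and `cos_max` come from the Layout section; the `topic` key and the helper names are assumptions, and the rows below are made-up toy data rather than results.

```python
from __future__ import annotations

from collections import defaultdict


def lcs_threshold_curve(rows: list[dict], thresholds=(5, 10, 20, 40)) -> dict[int, int]:
    """Map each threshold t to the number of rows with lcs_tokens > t."""
    return {t: sum(r["lcs_tokens"] > t for r in rows) for t in thresholds}


def per_topic_flag_rate(rows: list[dict], ngram_thr: int = 3, cos_thr: float = 0.85) -> dict[str, float]:
    """Fraction of rows per topic flagged by the 5-gram OR cosine rule."""
    hits: dict[str, int] = defaultdict(int)
    totals: dict[str, int] = defaultdict(int)
    for r in rows:
        totals[r["topic"]] += 1  # `topic` is an assumed field name
        if r["ngram5_overlap"] > ngram_thr or r["cos_max"] > cos_thr:
            hits[r["topic"]] += 1
    return {t: hits[t] / totals[t] for t in totals}


# Toy rows for illustration only.
toy_rows = [
    {"topic": "scaffold", "lcs_tokens": 2, "ngram5_overlap": 0, "cos_max": 0.31},
    {"topic": "scaffold", "lcs_tokens": 3, "ngram5_overlap": 0, "cos_max": 0.40},
    {"topic": "mechanism", "lcs_tokens": 12, "ngram5_overlap": 5, "cos_max": 0.88},
    {"topic": "mechanism", "lcs_tokens": 6, "ngram5_overlap": 1, "cos_max": 0.52},
]
curve = lcs_threshold_curve(toy_rows)
rates = per_topic_flag_rate(toy_rows)
```

Both helpers iterate over `rows` more than once or index into it, so the JSONL should be materialized as a list (for example with `json.loads` per line) rather than passed as a one-shot generator.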

Do not merge — per instructions, the author will coordinate the merge separately.

🤖 Generated with Claude Code

Detects per-Q&A leakage across the full agree-only gold dataset (188,541 Q&A across 15,509 compounds) via three metrics: longest common contiguous token substring (LCS), 5-gram overlap, and cosine similarity of all-MiniLM-L6-v2 embeddings. LCS is reported as a threshold curve since the corpus-wide max is 19 tokens — no Q&A has a 40+ token verbatim contiguous run copied from evidence. Full results, flagged examples, and reproducible scripts included. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Adds a Task-6 “leakage classifier” audit pipeline to quantify evidence→answer textual overlap (LCS, 5-gram overlap, embedding cosine) and to generate reviewer-facing summary + example reports for the agree-only gold dataset.

Changes:

Introduces a reproducible multi-step pipeline (sample → embed → compute_metrics → summarize) under task-6-leakage-classifier/scripts/.
Commits generated analysis artifacts (leakage_summary.md, flagged_examples.md, per_qa_leakage.jsonl) plus a preserved 20k pilot archive.
Documents methodology, thresholds, and headline results in README.md and PLAN.md.

Reviewed changes

Copilot reviewed 13 out of 15 changed files in this pull request and generated 9 comments.


| File | Description |
| --- | --- |
| task-6-leakage-classifier/scripts/config.py | Centralizes paths, thresholds, and run parameters for the pipeline. |
| task-6-leakage-classifier/scripts/sample.py | Streams gold JSONL and materializes per-QA rows (answer + parent evidence) for downstream scoring. |
| task-6-leakage-classifier/scripts/embed.py | Builds a text index and embeds unique answers/evidence sentences for cosine scoring. |
| task-6-leakage-classifier/scripts/compute_metrics.py | Computes LCS / 5-gram overlap / cosine per Q&A and writes `per_qa_leakage.jsonl`. |
| task-6-leakage-classifier/scripts/summarize.py | Aggregates stats and emits `leakage_summary.md` + `flagged_examples.md`. |
| task-6-leakage-classifier/scripts/setup_env.sh | Creates a conda env and installs required Python dependencies. |
| task-6-leakage-classifier/scripts/run.sh | Driver script to run all pipeline stages and capture logs. |
| task-6-leakage-classifier/README.md | Provides overview, key results, reproduction instructions, and metric definitions. |
| task-6-leakage-classifier/PLAN.md | Captures the “contract” for metric semantics, thresholds, and review procedure. |
| task-6-leakage-classifier/leakage_summary.md | Generated aggregate report for the full run. |
| task-6-leakage-classifier/flagged_examples.md | Generated top-ranked flagged examples for manual review. |
| task-6-leakage-classifier/archive_20k_sample/leakage_summary.md | Generated summary from the 20k pilot run (audit trail). |
| task-6-leakage-classifier/archive_20k_sample/flagged_examples.md | Generated flagged examples from the 20k pilot run (audit trail). |


Copilot · 2026-04-24T21:58:57Z

+OUT=/home/dandreas/chem2text/outputs/leakage
+LOG="$OUT/run.log"
+


run.sh hard-codes the output directory (OUT=/home/dandreas/...) instead of deriving it from scripts/config.py. This breaks the documented “edit config.py to repoint paths” workflow because logs will still be written to the old location unless run.sh is edited too. Consider reading OUT_DIR/RUN_LOG from config.py (or accepting OUT/LOG via env vars with sensible defaults).

Suggested change

-OUT=/home/dandreas/chem2text/outputs/leakage
-LOG="$OUT/run.log"
+read_config_value () {
+  python - "$HERE/config.py" "$1" <<'PY'
+import importlib.util
+import sys
+config_path, attr_name = sys.argv[1], sys.argv[2]
+try:
+    spec = importlib.util.spec_from_file_location("run_config", config_path)
+    if spec is None or spec.loader is None:
+        raise RuntimeError("unable to load config")
+    module = importlib.util.module_from_spec(spec)
+    spec.loader.exec_module(module)
+    value = getattr(module, attr_name, "")
+    if value is None:
+        value = ""
+    print(value)
+except Exception:
+    print("")
+PY
+}
+DEFAULT_OUT=/home/dandreas/chem2text/outputs/leakage
+CONFIG_OUT="$(read_config_value OUT_DIR)"
+CONFIG_LOG="$(read_config_value RUN_LOG)"
+OUT="${OUT:-${CONFIG_OUT:-$DEFAULT_OUT}}"
+LOG="${LOG:-${CONFIG_LOG:-$OUT/run.log}}"

Copilot · 2026-04-24T21:58:58Z

+    md.append("# Leakage Summary\n")
+    md.append(
+        f"Source: `per_qa_leakage.jsonl` ({n} rows, sampled from "
+        f"`dataset_gold.jsonl`; n={SAMPLE_SIZE}, seed={SAMPLE_SEED}).\n"


The summary header prints n={SAMPLE_SIZE} as if that were the effective sample size, but when SAMPLE_SIZE is set larger than the dataset it’s actually a cap and the run covers all rows. This is misleading in the generated leakage_summary.md; consider emitting something like effective_n={n} (and optionally sample_size_cap={SAMPLE_SIZE}) instead.

Suggested change

-        f"`dataset_gold.jsonl`; n={SAMPLE_SIZE}, seed={SAMPLE_SEED}).\n"
+        f"`dataset_gold.jsonl`; effective_n={n}, sample_size_cap={SAMPLE_SIZE}, seed={SAMPLE_SEED}).\n"

Copilot · 2026-04-24T21:58:58Z

+SAMPLE_JSONL = BIG_DIR / "sample.jsonl"         # ~1 GB
+EMBEDDINGS_NPZ = BIG_DIR / "embeddings.npz"     # ~1.6 GB
+TEXT_INDEX_JSON = BIG_DIR / "text_index.json"   # ~200 MB


The inline size estimates for artifacts (sample.jsonl, embeddings.npz, text_index.json) don’t match the sizes documented in the README (e.g., config says sample.jsonl “~1 GB” while README mentions 9.1 GB). Consider either updating these comments to match observed sizes or removing the specific numbers to avoid misleading operators about disk requirements.

Suggested change

-SAMPLE_JSONL = BIG_DIR / "sample.jsonl"         # ~1 GB
-EMBEDDINGS_NPZ = BIG_DIR / "embeddings.npz"     # ~1.6 GB
-TEXT_INDEX_JSON = BIG_DIR / "text_index.json"   # ~200 MB
+# Large pipeline artifacts live in BIG_DIR to avoid home-directory quota issues.
+SAMPLE_JSONL = BIG_DIR / "sample.jsonl"
+EMBEDDINGS_NPZ = BIG_DIR / "embeddings.npz"
+TEXT_INDEX_JSON = BIG_DIR / "text_index.json"

Copilot · 2026-04-24T21:58:58Z

+"""Produce leakage_summary.md + flagged_examples.md.
+
+summary: aggregate flag rates, metric distributions, intersections,
+         breakdown by topic and split.
+flagged: up to EXAMPLES_PER_CATEGORY sampled cases per flag category
+         (lcs, ngram, cos, any), for user review. No judgments.
+"""
+from __future__ import annotations
+
+import json
+import logging
+import random
+import sys
+from collections import Counter, defaultdict
+from pathlib import Path
+from statistics import median
+
+sys.path.insert(0, str(Path(__file__).parent))
+from config import (  # noqa: E402
+    COSINE_THRESHOLD,
+    EMBED_MODEL,
+    EXAMPLES_PER_CATEGORY,
+    EXAMPLES_SAMPLE_SEED,
+    FLAGGED_MD,


The module imports random and EXAMPLES_SAMPLE_SEED, and the header comment says examples are “sampled”, but the implementation deterministically takes the top-N after sorting. Either remove the unused imports/config knobs and update the header comment, or reintroduce seeded sampling if that’s the intended behavior.
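
The two selection policies this comment contrasts can be sketched as follows. The function names and the `key` argument are hypothetical; this is not the code in `summarize.py`, only an illustration of the difference between a deterministic top-N and a seeded random sample.

```python
from __future__ import annotations

import random


def top_n(rows: list[dict], key: str, n: int = 20) -> list[dict]:
    """Deterministic policy: the n highest-scoring rows, most egregious first."""
    return sorted(rows, key=lambda r: r[key], reverse=True)[:n]


def seeded_sample(rows: list[dict], n: int = 20, seed: int = 42) -> list[dict]:
    """Random policy: a reproducible sample drawn with a private RNG."""
    rng = random.Random(seed)  # avoids mutating the global random state
    pool = list(rows)
    return rng.sample(pool, min(n, len(pool)))
```

Under the first policy the seed and the `random` import are unnecessary; under the second, the staged examples estimate what a typical flagged case looks like rather than the worst cases.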

Copilot · 2026-04-24T21:58:58Z

+            # Collect evidence token-lists and embedding ids.
+            ev_token_lists: list[list[str]] = []
+            ev_ngrams: set[tuple[str, ...]] = set()
+            ev_emb_ids: list[int] = []
+            for e in evidence:
+                etext = (e.get("text") or "").strip()
+                if not etext:
+                    continue
+                eid = text_to_id.get(etext, -1)
+                if eid < 0:
+                    continue
+                ev_token_lists.append(tokenized[eid])
+                ev_ngrams |= ngrams(tokenized[eid], 5)
+                ev_emb_ids.append(eid)


This loop recomputes the union of evidence 5-grams (ev_ngrams) and rebuilds ev_token_lists / ev_emb_ids for every Q&A row even though evidence is identical for all Q&A belonging to the same cid (because sample.py duplicates the compound’s evidence into every sampled row). This is a major source of redundant work and inflates runtime/IO; consider caching precomputed evidence features keyed by cid (token lists, 5-gram union, embedding id list) and reusing them across Q&A for that compound.
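
One way the suggested cache could look. This is a sketch under stated assumptions: `tokenized` and `text_to_id` mirror the names in the hunk above, `ngrams` is re-implemented locally, and `EvidenceFeatures`, `evidence_features`, and the module-level `_cache` are hypothetical names that do not appear in the PR.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def ngrams(toks: list[str], n: int) -> set[tuple[str, ...]]:
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}


@dataclass
class EvidenceFeatures:
    token_lists: list[list[str]] = field(default_factory=list)
    ngram_union: set[tuple[str, ...]] = field(default_factory=set)
    emb_ids: list[int] = field(default_factory=list)


_cache: dict[int, EvidenceFeatures] = {}


def evidence_features(cid, evidence, text_to_id, tokenized) -> EvidenceFeatures:
    """Build per-compound evidence features once and reuse them for every Q&A."""
    cached = _cache.get(cid)
    if cached is not None:
        return cached
    feats = EvidenceFeatures()
    for e in evidence:
        etext = (e.get("text") or "").strip()
        eid = text_to_id.get(etext, -1) if etext else -1
        if eid < 0:
            continue  # empty or unindexed evidence text, as in the original loop
        feats.token_lists.append(tokenized[eid])
        feats.ngram_union |= ngrams(tokenized[eid], 5)
        feats.emb_ids.append(eid)
    _cache[cid] = feats
    return feats
```

If rows for one compound are contiguous in `sample.jsonl` (an assumption worth verifying), keeping only the most recent `cid` would give the same savings with bounded memory.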

Copilot · 2026-04-24T21:58:59Z

+            flag_lcs = lcs_val > LCS_TOKEN_THRESHOLD
+            flag_ngram = n_overlap > NGRAM5_OVERLAP_THRESHOLD
+            flag_cos = cos_max > COSINE_THRESHOLD
+            flag_any = flag_lcs or flag_ngram or flag_cos


flag_any is currently defined as flag_lcs or flag_ngram or flag_cos, but the reporting/README treats “any” as flag_ngram ∨ flag_cos (since LCS is now reported as a curve). To avoid semantic drift (especially if LCS thresholds change later), consider either redefining flag_any here to match the reported rule, or dropping it from the JSONL and letting downstream code compute whatever combination it needs.
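
A minimal sketch of the second option, where downstream code derives the flags from the raw metric columns instead of trusting a stored `flag_any`. The thresholds are the ones stated in the PR description; `reported_flags` and its `lcs_threshold` parameter are hypothetical names.

```python
from __future__ import annotations

NGRAM5_OVERLAP_THRESHOLD = 3   # from the PR description: overlap > 3
COSINE_THRESHOLD = 0.85        # from the PR description: cos > 0.85


def reported_flags(row: dict, lcs_threshold: int | None = None) -> dict[str, bool]:
    """Recompute flags so that flag_any matches the reported rule (ngram OR cos)."""
    flag_ngram = row["ngram5_overlap"] > NGRAM5_OVERLAP_THRESHOLD
    flag_cos = row["cos_max"] > COSINE_THRESHOLD
    flags = {
        "flag_ngram": flag_ngram,
        "flag_cos": flag_cos,
        "flag_any": flag_ngram or flag_cos,  # LCS deliberately excluded
    }
    if lcs_threshold is not None:
        # LCS is reported as a curve, so the caller picks the operating point.
        flags["flag_lcs"] = row["lcs_tokens"] > lcs_threshold
    return flags
```

With this shape, changing the LCS operating point later cannot silently change what "any" means in the summary.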

Copilot · 2026-04-24T21:58:59Z

+"""Deterministically sample 20,000 Q&A from the gold dataset.
+
+Sampling unit is (cid, qa_index). Draw is uniform across all Q&A — compounds
+with more Q&A are proportionally more likely to contribute rows, which is what
+we want for an unbiased per-Q&A leakage-rate estimate.


The docstring says this script samples “20,000 Q&A”, but the actual sample size is controlled by config.SAMPLE_SIZE (currently set to 500_000 to cover the full dataset). Updating the docstring to describe the SAMPLE_SIZE behavior (including the full-dataset short-circuit) would avoid confusion when reproducing.

Suggested change

-"""Deterministically sample 20,000 Q&A from the gold dataset.
-
-Sampling unit is (cid, qa_index). Draw is uniform across all Q&A — compounds
-with more Q&A are proportionally more likely to contribute rows, which is what
-we want for an unbiased per-Q&A leakage-rate estimate.
+"""Deterministically sample up to `SAMPLE_SIZE` Q&A from the gold dataset.
+
+Sampling unit is (cid, qa_index). Draw is uniform across all Q&A — compounds
+with more Q&A are proportionally more likely to contribute rows, which is what
+we want for an unbiased per-Q&A leakage-rate estimate. If the dataset contains
+`SAMPLE_SIZE` or fewer Q&A, the script keeps the full dataset instead of
+drawing a smaller sample.
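
The cap-with-short-circuit behavior described in this comment can be sketched as below. The function name is hypothetical and this is not the code in `sample.py`; the seed of 42 and the sorted `(cid, qa_index)` keys follow the plan text quoted elsewhere in this review.

```python
from __future__ import annotations

import random


def sample_keys(keys, sample_size: int, seed: int = 42) -> list[tuple[int, int]]:
    """Return up to sample_size (cid, qa_index) keys, deterministically."""
    ordered = sorted(keys)             # fixed order makes the draw reproducible
    if len(ordered) <= sample_size:    # cap covers the dataset: keep every row
        return ordered
    rng = random.Random(seed)
    return sorted(rng.sample(ordered, sample_size))
```

With `sample_size=500_000` and 188,541 available rows, the first branch is taken and no random draw happens, which is why the cap should not be reported as the effective sample size.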

Copilot · 2026-04-24T21:58:59Z

+
+5. **Answer field.** `phase1_answer` (the model under audit is the Phase-1 generator).
+
+6. **Manual inspection.** We do not make judgments. We stage flagged cases (all three categories, plus intersections) in `flagged_examples.md` for user review. Cap at ~20 cases per category (as per the task spec's "Sample 20 flagged cases") with random seed=42 within each category; the full flagged list is in `per_qa_leakage.jsonl` for anyone who wants to look past the sample.


The plan says flagged examples are sampled with a random seed (“seed=42 within each category”), but scripts/summarize.py now selects “most egregious first” via deterministic sorting (and doesn’t use EXAMPLES_SAMPLE_SEED). Please update the plan to match the implemented selection policy so reviewers know whether the examples are a random sample or the top-ranked cases.

Suggested change

-6. **Manual inspection.** We do not make judgments. We stage flagged cases (all three categories, plus intersections) in `flagged_examples.md` for user review. Cap at ~20 cases per category (as per the task spec's "Sample 20 flagged cases") with random seed=42 within each category; the full flagged list is in `per_qa_leakage.jsonl` for anyone who wants to look past the sample.
+6. **Manual inspection.** We do not make judgments. We stage flagged cases (all three categories, plus intersections) in `flagged_examples.md` for user review. Cap at ~20 cases per category (as per the task spec's "Sample 20 flagged cases"), selected deterministically by the summarizer as the top-ranked / most egregious flagged cases first rather than as a random sample; the full flagged list is in `per_qa_leakage.jsonl` for anyone who wants to look past the capped review set.

Copilot · 2026-04-24T21:58:59Z

+1. **Env.** Create a fresh conda env `chem2text_leakage` with Python 3.11, torch (cu128), sentence-transformers, tqdm, numpy. Script: `scripts/leakage/setup_env.sh`.
+2. **Sample.** Stream `dataset_gold.jsonl`, collect all (cid, qa_index) pairs sorted, seed 42 draw of 20,000. For each, emit a sample row with the phase1_answer and parent compound's evidence sentences. Script: `scripts/leakage/sample.py` → `sample.jsonl`.
+3. **Embed.** Collect unique text strings (answers + evidence sentences) from the sample, encode on one GPU in batches, save `.npz` + id→row index map. Script: `scripts/leakage/embed.py` → `embeddings.npz`, `text_index.json`.
+4. **Metrics.** For each sampled Q&A, compute `lcs_tokens`, `ngram5_overlap`, `cos_max`. Script: `scripts/leakage/compute_metrics.py` → `per_qa_leakage.jsonl`.
+5. **Summarize.** Aggregate flag rates, metric distributions, co-flagging, write `leakage_summary.md` and `flagged_examples.md`. Script: `scripts/leakage/summarize.py`.
+6. **Driver.** `scripts/leakage/run.sh` runs 2–5 in order, assuming 1 is done.
+
+## Reproducibility
+
+- All scripts take no positional arguments — all inputs/outputs are fixed paths or CLI-settable with the same defaults we used.
+- `PLAN.md` records the thresholds and sample seed. The scripts read the same constants from a shared `scripts/leakage/config.py`.


The “Method” section references scripts under scripts/leakage/..., but in this repo they live directly under task-6-leakage-classifier/scripts/ (e.g., scripts/run.sh, scripts/sample.py). Updating these paths will prevent copy/paste reproduction errors.

Suggested change

-1. **Env.** Create a fresh conda env `chem2text_leakage` with Python 3.11, torch (cu128), sentence-transformers, tqdm, numpy. Script: `scripts/leakage/setup_env.sh`.
-2. **Sample.** Stream `dataset_gold.jsonl`, collect all (cid, qa_index) pairs sorted, seed 42 draw of 20,000. For each, emit a sample row with the phase1_answer and parent compound's evidence sentences. Script: `scripts/leakage/sample.py` → `sample.jsonl`.
-3. **Embed.** Collect unique text strings (answers + evidence sentences) from the sample, encode on one GPU in batches, save `.npz` + id→row index map. Script: `scripts/leakage/embed.py` → `embeddings.npz`, `text_index.json`.
-4. **Metrics.** For each sampled Q&A, compute `lcs_tokens`, `ngram5_overlap`, `cos_max`. Script: `scripts/leakage/compute_metrics.py` → `per_qa_leakage.jsonl`.
-5. **Summarize.** Aggregate flag rates, metric distributions, co-flagging, write `leakage_summary.md` and `flagged_examples.md`. Script: `scripts/leakage/summarize.py`.
-6. **Driver.** `scripts/leakage/run.sh` runs 2–5 in order, assuming 1 is done.
-
-## Reproducibility
-
-- All scripts take no positional arguments — all inputs/outputs are fixed paths or CLI-settable with the same defaults we used.
-- `PLAN.md` records the thresholds and sample seed. The scripts read the same constants from a shared `scripts/leakage/config.py`.
+1. **Env.** Create a fresh conda env `chem2text_leakage` with Python 3.11, torch (cu128), sentence-transformers, tqdm, numpy. Script: `scripts/setup_env.sh`.
+2. **Sample.** Stream `dataset_gold.jsonl`, collect all (cid, qa_index) pairs sorted, seed 42 draw of 20,000. For each, emit a sample row with the phase1_answer and parent compound's evidence sentences. Script: `scripts/sample.py` → `sample.jsonl`.
+3. **Embed.** Collect unique text strings (answers + evidence sentences) from the sample, encode on one GPU in batches, save `.npz` + id→row index map. Script: `scripts/embed.py` → `embeddings.npz`, `text_index.json`.
+4. **Metrics.** For each sampled Q&A, compute `lcs_tokens`, `ngram5_overlap`, `cos_max`. Script: `scripts/compute_metrics.py` → `per_qa_leakage.jsonl`.
+5. **Summarize.** Aggregate flag rates, metric distributions, co-flagging, write `leakage_summary.md` and `flagged_examples.md`. Script: `scripts/summarize.py`.
+6. **Driver.** `scripts/run.sh` runs 2–5 in order, assuming 1 is done.
+
+## Reproducibility
+
+- All scripts take no positional arguments — all inputs/outputs are fixed paths or CLI-settable with the same defaults we used.
+- `PLAN.md` records the thresholds and sample seed. The scripts read the same constants from a shared `scripts/config.py`.

kushalviit · 2026-04-25T00:48:34Z

@luistafoi if results are fine then comment "good to merge"

kushalviit · 2026-04-25T00:48:51Z

@luistafoi if results are fine then comment "good to merge"

david-arredondo requested a review from luistafoi April 24, 2026 21:37

Macaulay001 requested a review from Copilot April 24, 2026 21:54

Copilot started reviewing on behalf of Macaulay001 April 24, 2026 21:55


Comment thread task-6-leakage-classifier/scripts/embed.py

kushalviit Apr 25, 2026


@luistafoi if results are fine then comment "good to merge"

david-arredondo wants to merge 1 commit into main from task-6-leakage-classifier
