evaluation-awareness

Here are 4 public repositories matching this topic...

aisa-group / decomposing-eval-awareness

Decomposing and measuring evaluation awareness in existing benchmarks and our proposed EvalAwareBench.

benchmark ai-safety situational-awareness ai-alignment llm evaluation-awareness

Updated Jun 1, 2026
Python

roldanjorge / sdf-belief-dissociation

Synthetic-document fine-tuning on Qwen2.5-7B: a controlled study of whether SDF installs sandbagging, finding a layered recognition/generation/behavior dissociation.

model-organisms ai-safety ai-alignment sandbagging evaluation-awareness synthetic-document-finetuning

Updated May 25, 2026
Python

compass-group-tue / arxiv2026_evaluation_meta_knowledge

Repository for the arXiv 2026 preprint "Models That Know How Evaluations Are Designed Score Safer".

ai ai-safety evaluation-awareness

Updated Jun 9, 2026
Jupyter Notebook

marinomaria / sensitivity-analysis-evaluation-awareness

[WIP] Code and datasets for the thesis "Sensitivity Analysis of Evaluation Awareness in Large Language Models" · Licenciatura en Ciencia de Datos, Universidad de Buenos Aires (UBA).

evaluation interpretability linear-probing llm evaluation-awareness

Updated Jun 3, 2026
Jupyter Notebook
