Will Lewis (WillLewis)

AI/ML Product Manager | agentic systems, RAG, evals, decision intelligence | Wharton MBA | UPenn

10 followers · 9 following

WillLewis/README.md

Agents plan, code acts
the harness holds every gate—
evals are the ship.

demos and case studies: wxl3.com

Pinned

atlas-agentic-fraud-lab (Public)

Adversarial Testing Lab for Agentic Safeguards (ATLAS). A synthetic multi-agent eval environment for adversarial fraud decisioning inspired by Anthropic's Project Deal. Measures how model quality, …

Python 1
agent-harness-environment (Public)

An eval and observability cockpit for coding agents. It runs policy-controlled coding agents in sandboxed toy repos with tool-use traces and MCP tools, compares harness policies, and scores recovery and safet…

Python
regulated-agent-launch-kit (Public)

A regulated-agent deployment kit for turning traces, evals, regressions, and approval gates into launch/no-launch decisions.

Python
voice-agent-prompt-lab (Public)

A voice agent demo and prompt evaluation harness for insurance first-notice-of-loss claims.

TypeScript
coachbench (Public)

Do agents make for good offensive & defensive coordinators in football? This is an adversarial-agent arena for short red-zone strategy contests. OC & DC agents compete through simultaneous legal pl…

Python 1
canon-ai (Public)

Can you eval an art form? Canon is a continuity linter for serialized TV, YouTube, and micro-drama fiction. Canon plays the role of what's currently the scriptwriting coordinator, verifies your story…

Python