name: email-triage-openenv
version: "1.0.0"
description: >
  An OpenEnv-compliant environment for training AI agents to manage a business
  email inbox. The agent labels emails by priority, drafts replies, routes
  messages to the appropriate department, and escalates critical issues.
tags:
  - openenv
  - email
  - triage
  - real-world
  - nlp
tasks:
  - id: task1
    name: Priority Labeling
    difficulty: easy
    max_steps: 10
    description: >
      Label 5 emails with the correct priority: urgent / high / normal / low / spam.
      Urgent emails (production outage, security alert) carry 2× weight in scoring.
    scoring:
      method: weighted_label_accuracy
      range: [0.0, 1.0]
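    # Illustrative arithmetic only; the exact weighted_label_accuracy formula is
    # not given in this file. Assuming score = (weight of correctly labeled
    # emails) / (total weight), with urgent emails at weight 2 and others at 1:
    #   inbox of 1 urgent + 4 non-urgent       -> total weight   = 2 + 4 = 6
    #   urgent and 3 of the others are correct -> correct weight = 2 + 3 = 5
    #   score = 5 / 6 ≈ 0.83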
  - id: task2
    name: Triage and Reply
    difficulty: medium
    max_steps: 25
    description: >
      Label 10 emails by priority and draft replies to emails that require a response.
      Replies must be at least 20 characters. Each unnecessary reply incurs a 0.05 penalty.
    scoring:
      method: weighted_label_plus_reply_coverage
      formula: "0.6 * label_score + 0.4 * reply_score - 0.05 * unnecessary_replies"
      range: [0.0, 1.0]
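    # Worked example of the task2 formula with hypothetical inputs
    # label_score = 0.8, reply_score = 0.5, unnecessary_replies = 1:
    #   0.6 * 0.8 + 0.4 * 0.5 - 0.05 * 1 = 0.48 + 0.20 - 0.05 = 0.63
    # The penalty term can push the raw value below 0.0; the declared range
    # suggests the result is clamped, but that is an assumption, not stated here.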
  - id: task3
    name: Full Inbox Management
    difficulty: hard
    max_steps: 40
    description: >
      Manage a 15-email inbox: label by priority, route to the correct department
      (devops / legal / engineering / management), and escalate time-critical or
      compliance issues. Critical emails carry 3× label weight.
    scoring:
      method: weighted_multi_action
      formula: "0.40 * label + 0.35 * routing + 0.25 * escalation + reply_bonus"
      range: [0.0, 1.0]
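    # Worked example of the task3 formula with hypothetical component scores
    # label = 0.9, routing = 0.8, escalation = 0.6:
    #   0.40 * 0.9 + 0.35 * 0.8 + 0.25 * 0.6 = 0.36 + 0.28 + 0.15 = 0.79
    # The size of reply_bonus is not defined in this file. If it were 0.05
    # (a made-up value), the total would be 0.84. Since the three weights already
    # sum to 1.0, any bonus presumably has to be capped at the range maximum.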
observation_space:
  type: object
  fields:
    task_id: string
    step: integer
    emails:
      type: array
      items:
        id: string
        sender: string
        subject: string
        body: string
        received_at: string (ISO 8601)
        has_attachment: boolean
        thread_id: string (nullable)
    inbox_size: integer
    done: boolean
    message: string
    metadata: object
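  # Hypothetical observation matching the fields above. All values are made up
  # for illustration, and the JSON wire format is an assumption:
  #   {
  #     "task_id": "task1",
  #     "step": 0,
  #     "emails": [
  #       {"id": "e1", "sender": "ops@example.com", "subject": "Production outage",
  #        "body": "The API is returning 500s.", "received_at": "2024-01-01T09:00:00Z",
  #        "has_attachment": false, "thread_id": null}
  #     ],
  #     "inbox_size": 5,
  #     "done": false,
  #     "message": "",
  #     "metadata": {}
  #   }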
action_space:
  type: object
  fields:
    action_type:
      type: enum
      values: [label, reply, archive, escalate, route, noop]
    email_id: string (nullable)
    label:
      type: enum
      values: [urgent, high, normal, low, spam]
      nullable: true
    reply_text: string (nullable, min 20 chars for reward)
    route_to: string (nullable)
    reason: string (nullable)
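  # Example actions consistent with the fields above (email ids are
  # hypothetical, and the JSON encoding is an assumption):
  #   {"action_type": "label", "email_id": "e1", "label": "urgent"}
  #   {"action_type": "route", "email_id": "e1", "route_to": "devops"}
  #   {"action_type": "reply", "email_id": "e2",
  #    "reply_text": "Thanks, we are looking into this now."}
  # Fields that do not apply to an action are presumably null or omitted.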
reward:
  type: object
  fields:
    value: float (0.0 – 1.0)
    breakdown: object
    message: string
  intermediate_rewards: true
  terminal_reward: true
api:
  reset: POST /reset
  step: POST /step
  state: GET /state
  tasks: GET /tasks
  grade: POST /grade
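  # A plausible episode using the endpoints above. The request bodies and the
  # response contents are assumptions; this file does not specify them:
  #   GET  /tasks                            -> list of available tasks
  #   POST /reset  {"task_id": "task1"}      -> initial observation
  #   POST /step   {"action_type": "label", "email_id": "e1", "label": "urgent"}
  #                                          -> next observation and reward
  #   GET  /state                            -> current environment state
  #   POST /grade                            -> final score in [0.0, 1.0]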
runtime:
  framework: fastapi
  python: ">=3.11"
  port: 7860