You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: docs/data/index.md
+33-1Lines changed: 33 additions & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -1,7 +1,7 @@
1
1
(data-index)=
2
2
# Data
3
3
4
-
NeMo Gym datasets use JSONL format for reinforcement learning (RL) training. Each dataset connects to an agent server—the component that orchestrates agent-environment interactions during training.
4
+
NeMo Gym datasets use JSONL format for reinforcement learning (RL) training. Each dataset connects to an **agent server** (orchestrates agent-environment interactions) which routes requests to a **resources server** (provides tools and computes rewards).
5
5
6
6
## Prerequisites
7
7
@@ -28,6 +28,38 @@ Additional fields like `expected_answer` vary by resources server—the componen
|`responses_create_params`| User | Input to the model during training. Contains `input` (messages) and optional `tools`, `temperature`, etc. |
36
+
|`agent_ref`|`ng_prepare_data`| Routes each row to its resource server. Auto-generated during data preparation. |
37
+
38
+
### Optional Fields
39
+
40
+
| Field | Description |
41
+
|-------|-------------|
42
+
|`expected_answer`| Ground truth for verification (task-specific). |
43
+
|`question`| Original question text (for reference). |
44
+
|`id`| Tracking identifier. |
45
+
46
+
:::{tip}
47
+
Check `resources_servers/<name>/README.md` for fields required by each resource server's `verify()` method.
48
+
:::
49
+
50
+
### The `agent_ref` Field
51
+
52
+
The `agent_ref` field maps each row to a specific resource server. A training dataset can blend multiple resource servers in a single file—`agent_ref` tells NeMo Gym which server handles each row.
**You don't create `agent_ref` manually.** The `ng_prepare_data` tool adds it automatically based on your config file. The tool matches the agent type (`responses_api_agents`) with the agent name from the config.
You must customize these variables for your dataset:
150
+
-`INPUT_FIELD`: The field name containing your input text. Common values: `"problem"` (math), `"question"` (QA), `"prompt"` (general), `"instruction"` (instruction-following)
151
+
-`SYSTEM_PROMPT`: Task-specific instructions for the model
152
+
-`TRAIN_RATIO`: Train/validation split ratio
153
+
:::
154
+
155
+
::::
156
+
157
+
Run and verify:
158
+
159
+
```bash
160
+
uv run preprocess.py
161
+
wc -l train.jsonl validation.jsonl
162
+
```
163
+
164
+
### Create Config for Custom Data
165
+
166
+
After preprocessing, create a config file to point `ng_prepare_data` at your local files.
167
+
168
+
::::{dropdown} Example config: custom_data.yaml
169
+
:icon: file-code
170
+
171
+
```yaml
172
+
custom_resources_server:
173
+
resources_servers:
174
+
custom_server:
175
+
entrypoint: app.py
176
+
domain: math # math | coding | agent | knowledge | other
177
+
description: Custom math dataset
178
+
verified: false
179
+
180
+
custom_simple_agent:
181
+
responses_api_agents:
182
+
simple_agent:
183
+
entrypoint: app.py
184
+
resources_server:
185
+
type: resources_servers
186
+
name: custom_resources_server
187
+
model_server:
188
+
type: responses_api_models
189
+
name: policy_model
190
+
datasets:
191
+
- name: train
192
+
type: train
193
+
jsonl_fpath: train.jsonl
194
+
license: Creative Commons Attribution 4.0 International
195
+
- name: validation
196
+
type: validation
197
+
jsonl_fpath: validation.jsonl
198
+
license: Creative Commons Attribution 4.0 International
0 commit comments