A reading guide before you start patching.
The processes
A Miles run is three kinds of processes wrapped in a Ray cluster:
- Trainer ranks: Megatron processes that load torch_dist checkpoints and run the RL loop.
- SGLang servers: independent HTTP services that produce rollouts.
- Miles Router: FastAPI proxy that distributes rollout requests, preserves metadata (R3), and enforces health checks.
- Data Source: Python object owned by the trainer; reads prompt JSONL and acts as a buffer between rollout and training.
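To make the buffer role of the Data Source concrete, here is a minimal sketch of an object that parses prompt JSONL and sits between rollout and training. The class name PromptBuffer and its methods are hypothetical and chosen for illustration only; the real implementation lives in miles/rollout/data_source.py and may differ.

```python
import json
from collections import deque
from typing import Iterable


class PromptBuffer:
    """Hypothetical stand-in for a data source; not the Miles class."""

    def __init__(self, jsonl_lines: Iterable[str]):
        # Parse each non-empty JSONL line into a prompt record.
        self._prompts = deque(
            json.loads(line) for line in jsonl_lines if line.strip()
        )
        # Finished rollout samples waiting to be consumed by training.
        self._samples = deque()

    def get_prompts(self, n: int) -> list:
        """Pop up to n prompts to hand to the rollout side."""
        count = min(n, len(self._prompts))
        return [self._prompts.popleft() for _ in range(count)]

    def add_samples(self, samples: list) -> None:
        """Store generated samples until the trainer asks for them."""
        self._samples.extend(samples)

    def get_samples(self, n: int) -> list:
        """Pop up to n buffered samples for a training step."""
        count = min(n, len(self._samples))
        return [self._samples.popleft() for _ in range(count)]
```

In this sketch the rollout side calls get_prompts and add_samples, while the training side calls get_samples, so neither side needs to know how the other is scheduled.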
The package layout
train.py and train_async.py are the two entry points. They’re thin: ~200 lines
each. Most logic lives in the modules above.
A request’s life
For a single GRPO iteration:
This is the sync path. The async path (train_async.py with --rollout-function-path fully_async_rollout.generate_rollout_fully_async) decouples the request from the trainer loop and uses a continuously running worker.
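The difference between the two paths can be illustrated with a generic producer/consumer sketch: a worker keeps producing samples into a bounded queue while the training loop pulls batches at its own pace. This is only an illustration of the pattern using asyncio; the names rollout_worker, train_loop, and run_demo are hypothetical and are not taken from fully_async_rollout or train_async.py.

```python
import asyncio


async def rollout_worker(queue: asyncio.Queue, num_samples: int) -> None:
    """Hypothetical worker: produces samples independently of the trainer."""
    for i in range(num_samples):
        await asyncio.sleep(0)  # stand-in for a request to a rollout server
        await queue.put({"sample_id": i})


async def train_loop(queue: asyncio.Queue, steps: int, batch_size: int) -> list:
    """Hypothetical trainer: pulls a batch when one is ready, then 'trains'."""
    seen = []
    for _ in range(steps):
        batch = [await queue.get() for _ in range(batch_size)]
        seen.append([sample["sample_id"] for sample in batch])
    return seen


async def main() -> list:
    # Bounded queue so the worker cannot run arbitrarily far ahead.
    queue = asyncio.Queue(maxsize=4)
    worker = asyncio.create_task(rollout_worker(queue, num_samples=6))
    batches = await train_loop(queue, steps=3, batch_size=2)
    await worker
    return batches


def run_demo() -> list:
    return asyncio.run(main())
```

Calling run_demo() returns the sample ids grouped by training step. The queue bound is a design choice in this sketch: it limits how stale buffered samples can become relative to the current weights.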
Where common changes go
| You want to … | Edit |
|---|---|
| Add a new RL algorithm | miles/backends/training_utils/loss.py + enum in miles/utils/arguments.py |
| Add a new built-in reward type | miles/rollout/sglang_rollout.py (rm dispatch) |
| Add a new built-in filter | miles/rollout/filter_hub/ |
| Wrap a new model architecture | miles_plugins/models/<model>.py + mbridge |
| Add a new flag | miles/utils/arguments.py |
| Change weight sync | miles/backends/megatron_utils/update_weight/ and miles/utils/distributed_utils.py |
| Change rollout buffer | miles/rollout/data_source.py |
Extension points (the right way)
The trainer is plug-in-friendly. Most extensions don’t need a code change inside Miles; just pass a --something-path my_pkg.thing flag. See Customization for the full list.
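A flag value such as my_pkg.thing is a dotted path to a Python object. As a rough sketch of how such a path can be resolved at runtime, the helper below uses the standard importlib module. The function name load_from_path is hypothetical, and this is an assumption about the general technique rather than a copy of how Miles resolves its own flags.

```python
import importlib
from typing import Any


def load_from_path(dotted_path: str) -> Any:
    """Resolve 'package.module.attr' to the attribute it names."""
    # Split on the last dot: everything before it is the module to import.
    module_name, _, attr = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.attr', got {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)
```

For example, load_from_path("json.dumps") returns the standard library's json.dumps function. A custom hook would work the same way as long as its package is importable from the environment the trainer runs in.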
If you find yourself patching the trainer to make something work, that’s a sign we’re
missing a hook. Open an issue.
Tests
pytest tests/fast for a quick check; run tests/e2e before landing anything that
touches the train loop.
Where to look first when reading the code
If you have 30 minutes and want to understand Miles end-to-end:
1. train.py: the loop, top to bottom.
2. miles/rollout/sglang_rollout.py:generate_rollout: how prompts become samples.
3. miles/backends/training_utils/loss.py: the loss and advantage computation.
4. miles/router/router.py: the FastAPI proxy.
5. miles/utils/distributed_utils.py: weight sync.

