Miles runs on AMD Instinct GPUs (MI300, MI325, MI350, MI355X) with ROCm. The launch
scripts are the same as on NVIDIA — only the container and a few env vars differ.
Container images
# MI350 / MI355
docker pull rlsys/miles:MI350-355-latest
# MI300 / MI325
docker pull rlsys/miles:MI300-latest
Or build from the repo:
cd docker
docker build -f Dockerfile.rocm_MI350-5 -t rlsys/miles:latest .
The base ROCm image bundles the patches needed for virtual memory management on MI300X;
thanks to Yang Wang for that work.
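The two prebuilt tags above correspond to GPU generations, so a wrapper script can choose between them. The sketch below is only an illustration: miles_image_for is a hypothetical helper name, not part of Miles, and it assumes the MI350-355-latest tag is the one intended for MI350 and MI355 parts.

```shell
# Hypothetical helper (not part of Miles): print the image tag for a GPU model.
# Assumption: MI350/MI355 use the MI350-355 image, MI300/MI325 use the MI300 image.
miles_image_for() {
  # Upper-case the argument so "mi300x" and "MI300X" are treated alike.
  model=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  case "$model" in
    MI350*|MI355*) echo "rlsys/miles:MI350-355-latest" ;;
    MI300*|MI325*) echo "rlsys/miles:MI300-latest" ;;
    *) echo "no known Miles image for: $1" >&2; return 1 ;;
  esac
}
```

For example, docker pull "$(miles_image_for MI325X)" would pull the MI300 image, while an unrecognized model makes the helper return a non-zero status instead of guessing.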
Launch the container
docker run --rm -it \
--device /dev/dri \
--device /dev/kfd \
-p 8265:8265 \
--group-add video \
--cap-add SYS_PTRACE \
--security-opt seccomp=unconfined \
--privileged \
-v $HOME/.ssh:/root/.ssh \
-v $HOME:$HOME \
--shm-size 128G \
--name miles_dev \
--ulimit memlock=-1 --ulimit stack=67108864 \
-w $PWD \
rlsys/miles:latest \
/bin/bash
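The command above passes /dev/kfd and /dev/dri through to the container, so it can help to confirm on the host that both exist before launching. This is a minimal sketch; missing_devices is a hypothetical helper name introduced here, not something the image or Miles provides.

```shell
# Hypothetical helper (not part of Miles): report any of the given paths
# that do not exist, and return non-zero if at least one is missing.
missing_devices() {
  rc=0
  for dev in "$@"; do
    if [ ! -e "$dev" ]; then
      echo "missing: $dev" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

On the host, missing_devices /dev/kfd /dev/dri returns zero only when both paths are present, so it can guard the docker run call with &&.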
Inside the container, install Miles in editable mode:
git clone https://github.com/radixark/miles.git
cd miles && pip install -e . --no-deps
Download model + data
hf download Qwen/Qwen3-4B --local-dir /root/Qwen3-4B
hf download --repo-type dataset BytedTsinghua-SIA/DAPO-Math-17K --local-dir /root/dapo-math-17k
hf download --repo-type dataset zhuzilin/aime-2024 --local-dir /root/aime-2024
Convert weights (CPU + Gloo)
We force CPU-only conversion on AMD to bypass some ROCm-specific issues. A GPU-based
ROCm converter is in development.
cd /root/miles
source scripts/models/qwen3-4B.sh
MEGATRON_LM_PATH=$(pip list | grep megatron-core | awk '{print $NF}')
PYTHONPATH=${MEGATRON_LM_PATH} python tools/convert_hf_to_torch_dist.py \
"${MODEL_ARGS[@]}" \
--hf-checkpoint /root/Qwen3-4B \
--save /root/Qwen3-4B_torch_dist
If the script reports that miles cannot be found, re-run pip install -e . --no-deps in the repo.
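The MEGATRON_LM_PATH line in the conversion command takes the last column of the megatron-core row in pip list. For an editable install, that column is the project location. The sketch below replays the same grep and awk pipeline on made-up rows. megatron_path_from_pip_list is a hypothetical name, and the version and path are placeholders, not values from the image.

```shell
# Hypothetical helper: read `pip list` output on stdin and print the last
# column of the megatron-core row, exactly as the pipeline above does.
megatron_path_from_pip_list() {
  grep megatron-core | awk '{print $NF}'
}

# Made-up rows: an editable install lists its project location last.
printf '%s\n' \
  'Package        Version  Editable project location' \
  'megatron-core  0.0.0    /path/to/Megatron-LM' \
  | megatron_path_from_pip_list
# prints /path/to/Megatron-LM
```

If megatron-core were installed non-editably, the row would end at the version, and the same pipeline would print the version string instead of a path. Checking that the result is a directory (for example with [ -d "$MEGATRON_LM_PATH" ]) catches that case before the converter runs.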
Launch
The standard scripts/run-qwen3-4B.sh works as-is. The image already sets the
ROCm-specific env vars you’d otherwise need:
HSA_OVERRIDE_GFX_VERSION=11.0.0 # or 9.4.0 for MI300
NCCL_NET=Socket # for non-RDMA setups
NCCL_IB_HCA=... # if your fabric supports it
PYTORCH_NO_HIP_MEMORY_CACHING=0
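To see which of these values a given image actually sets, the variables can be printed from a shell inside the container. This is a small sketch; show_env is a hypothetical helper name, and it reports only exported variables because it relies on printenv.

```shell
# Hypothetical helper (not part of Miles): print NAME=value for each exported
# variable, or note that it is unset.
show_env() {
  for name in "$@"; do
    if value=$(printenv "$name"); then
      printf '%s=%s\n' "$name" "$value"
    else
      printf '%s is unset\n' "$name"
    fi
  done
}

show_env HSA_OVERRIDE_GFX_VERSION NCCL_NET NCCL_IB_HCA PYTORCH_NO_HIP_MEMORY_CACHING
```

Any variable reported as unset can be exported by hand before running the launch script.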