

Miles supports NVIDIA's Nemotron-3 line, a Mamba + Attention hybrid family. The Nano tier ships as both a 4 B dense model and a 30 B-A3B MoE variant; the Super tier combines MoE with natively FP8 weights. All three variants load via the Megatron AutoBridge path, so there is no offline HF → torch_dist conversion step.
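Concretely, there is no separate conversion command to run: you fetch the HF checkpoint and launch the recipe directly. A minimal sketch using the Hugging Face CLI; the local download path is illustrative, not a documented Miles default:

```bash
# Fetch the HF checkpoint. No HF -> torch_dist converter is invoked
# afterwards; the recipe loads these weights through Megatron AutoBridge
# at startup. (Local directory below is illustrative.)
huggingface-cli download nvidia/Nemotron-3-Nano-4B \
  --local-dir /root/models/Nemotron-3-Nano-4B
```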

Variants

| Model | Active / Total params | HF ID | Recipe |
| --- | --- | --- | --- |
| Nemotron-3-Nano | 4 B / 4 B (dense) | nvidia/Nemotron-3-Nano-4B | nemotron-3-nano |
| Nemotron-3-Nano MoE | 3 B / 30 B | nvidia/Nemotron-3-Nano-30B-A3B | nemotron-3-nano-moe |
| Nemotron-3-Super | 12 B / 120 B (FP8) | nvidia/Nemotron-3-Super-120B-A12B-FP8 | nemotron-3-super |

Fastest path to train

Nemotron-3-Nano (dense, 4 B) is the smallest and runs on a single 8-GPU node:
```bash
cd /root/miles
bash scripts/run-nemotron-3-nano.sh
```
See the Nemotron-3-Nano page for the dense walkthrough, Nemotron-3-Nano MoE for the 30 B MoE variant, and Nemotron-3-Super for the FP8-native 120 B-A12B recipe.
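The MoE and Super recipes follow the same pattern. The script names below are inferred from the recipe names in the table above rather than confirmed filenames, so verify them against scripts/ in your checkout:

```bash
cd /root/miles
# Inferred script names; check the actual contents of scripts/.
bash scripts/run-nemotron-3-nano-moe.sh   # Nemotron-3-Nano MoE (30 B-A3B)
bash scripts/run-nemotron-3-super.sh      # Nemotron-3-Super (FP8-native 120 B-A12B)
```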

Which variant do I pick?

Start with Nemotron-3-Nano (dense) for the quickest run on a single 8-GPU node. Pick Nemotron-3-Nano MoE when you want the MoE architecture at a 30 B total / 3 B active footprint, and Nemotron-3-Super when you need the FP8-native 120 B-A12B recipe.
