Common issues
- Convert the model to MLX format (mlx_lm.convert or convert_hf_to_mlx) and point --model to the MLX path.
- ModuleNotFoundError: rl: run from the repo root, or install in editable mode with make install.
- Lower --samples or --pairs, or use a smaller MLX model.
- Use --beta-schedule target --target-kl 0.05, and/or lower --eta.
- force_words_ids: ensure the chosen model supports the tokens and adjust --temp/--top-p.
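For intuition about the --beta-schedule target --target-kl 0.05 setting, here is a minimal sketch of how a target-KL schedule is commonly built: a multiplicative controller that raises the KL coefficient when the measured KL overshoots the target and lowers it when the KL undershoots. This is an assumption about the general technique, not this repo's implementation; update_beta and its factor, tolerance, beta_min, and beta_max parameters are hypothetical names chosen for illustration.

```python
def update_beta(beta, observed_kl, target_kl=0.05, factor=1.5, tolerance=1.5,
                beta_min=1e-4, beta_max=100.0):
    """Return a KL coefficient nudged toward the target KL (hypothetical helper)."""
    if observed_kl > target_kl * tolerance:
        # The policy moved too far from the reference: penalize KL harder.
        beta = beta * factor
    elif observed_kl < target_kl / tolerance:
        # The policy barely moved: relax the penalty.
        beta = beta / factor
    # Inside the tolerance band beta is left unchanged; always clamp to bounds.
    return min(max(beta, beta_min), beta_max)


if __name__ == "__main__":
    beta = 0.1
    # Made-up KL measurements, only to show the direction of each update.
    for kl in [0.20, 0.12, 0.06, 0.02]:
        beta = update_beta(beta, kl)
        print(f"observed_kl={kl:.2f} -> beta={beta:.4f}")
```

The tolerance band keeps the coefficient from oscillating when the measured KL is already close to the target.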
Debug tips
- Add print(res.info) in the runner.
- Set --seed; envs and sampling both use seeds.
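Because both the envs and the sampler consume seeds, a run is only reproducible when both random streams are fixed. The sketch below illustrates the idea with the standard library alone; make_rngs and run_once are hypothetical stand-ins, not functions from this repo, and the seed + 1 offset is just one simple way to keep the two streams distinct.

```python
import random


def make_rngs(seed):
    """Derive separate, deterministic RNGs for the env and for sampling."""
    env_rng = random.Random(seed)
    sample_rng = random.Random(seed + 1)  # offset so the streams differ
    return env_rng, sample_rng


def run_once(seed, steps=5):
    """Toy rollout: the env picks a task id, the sampler draws a number."""
    env_rng, sample_rng = make_rngs(seed)
    trace = []
    for _ in range(steps):
        task_id = env_rng.randrange(1000)  # stand-in for an env draw
        draw = sample_rng.random()         # stand-in for a sampling draw
        trace.append((task_id, draw))
    return trace


if __name__ == "__main__":
    # Same seed reproduces the same trace; a different seed changes it.
    print(run_once(0) == run_once(0))
    print(run_once(0) == run_once(1))
```

If two runs with the same --seed still diverge, a useful first check is whether some component draws from an unseeded global random source.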