principle-induction

Algorithms: GRPO and GSPO

GRPO (single-action)

GSPO (grouped)

Implementation hooks

Flags (runner)