Training Large Language Models To Reason In Parallel With Global Forking Tokens • Paper 2510.05132 • Published Oct 1