Multi-Stream LLMs: Unblocking Language Models with Parallel Streams of Thoughts, Inputs and Outputs Paper • 2605.12460 • Published 12 days ago • 17
How Much Is One Recurrence Worth? Iso-Depth Scaling Laws for Looped Language Models Paper • 2604.21106 • Published 27 days ago • 8
smcleish/tuo-prod-0.6b-embed-4b-instruct-cs-4-summary-mean-1024-mlp-ov0-causal-1e-5-post-train-2e-5 Updated about 1 month ago
smcleish/0.6b-embed-4b-instruct-cs-8-summary-mean-1024-mlp-ov0-causal-1e-5-post-train-3e-5 Updated Apr 18
smcleish/tuo-prod-0.6b-embed-4b-instruct-cs-16-summary-mean-1024-mlp-ov0-causal-1e-5-post-train-5e-5 Updated Apr 16
smcleish/tuo-prod-0.6b-embed-4b-instruct-cs-16-summary-mean-1024-mlp-ov0-causal-1e-5-post-train-3e-5 Updated Apr 14