Agent-Ark

community

TheAgentArk

AI & ML interests

💙 Agents

Recent Activity

zhangchenxu new activity 9 days ago

Agent-Ark/Toucan-1.5M: Clarification on SFT dataset construction for reproducing results

zhangchenxu new activity about 1 month ago

Agent-Ark/Toucan-1.5M: [bot] Conversion to Parquet

zhangchenxu new activity about 1 month ago

Agent-Ark/Toucan-1.5M: Can we add information about which models are used to generate the question?

zhangchenxu in Agent-Ark/Toucan-1.5M 9 days ago

Clarification on SFT dataset construction for reproducing results

#5 opened about 1 month ago by bubuzeze

zhangchenxu in Agent-Ark/Toucan-1.5M about 1 month ago

[bot] Conversion to Parquet

#1 opened about 2 months ago by parquet-converter

Can we add information about which models are used to generate the question?

#2 opened about 2 months ago by 132lilinwei

zhangchenxu updated a dataset about 2 months ago

Agent-Ark/Toucan-1.5M

Viewer • Updated Oct 4 • 1.65M • 12k • 176

zhangchenxu authored a paper about 2 months ago

TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments

Paper • 2510.01179 • Published Oct 1 • 25

zhangchenxu published 3 models about 2 months ago

zhangchenxu updated 3 models about 2 months ago

Agent-Ark/Toucan-Qwen2.5-32B-Instruct-v0.1

1.12M • Updated Oct 2 • 7 • 4

Agent-Ark/Toucan-Qwen2.5-14B-Instruct-v0.1

841k • Updated Oct 2 • 29 • 2

Agent-Ark/Toucan-Qwen2.5-7B-Instruct-v0.1

333k • Updated Oct 2 • 90 • 5

zhangchenxu published a dataset about 2 months ago

Agent-Ark/Toucan-1.5M

Viewer • Updated Oct 4 • 1.65M • 12k • 176

zhangchenxu authored 2 papers 6 months ago

VisualSphinx: Large-Scale Synthetic Vision Logic Puzzles for RL

Paper • 2505.23977 • Published May 29 • 10

TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning

Paper • 2505.14625 • Published May 20 • 13

zhangchenxu authored 3 papers 9 months ago

SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities

Paper • 2502.12025 • Published Feb 17 • 3

KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding

Paper • 2503.02951 • Published Mar 4 • 33

Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published Feb 17 • 39

zhangchenxu authored a paper 10 months ago

CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models

Paper • 2406.12257 • Published Jun 18, 2024

zhangchenxu authored a paper about 1 year ago

Stronger Models are NOT Stronger Teachers for Instruction Tuning

Paper • 2411.07133 • Published Nov 11, 2024 • 38

amezasor authored a paper about 1 year ago

API Pack: A Massive Multilingual Dataset for API Call Generation

Paper • 2402.09615 • Published Feb 14, 2024

Team members: 4
