Speechless: Speech Instruction Training Without Speech for Low Resource Languages Paper • 2505.17417 • Published May 23 • 14
VoxRep: Enhancing 3D Spatial Understanding in 2D Vision-Language Models via Voxel Representation Paper • 2503.21214 • Published Mar 27 • 2
ReZero: Enhancing LLM search ability by trying one-more-time Paper • 2504.11001 • Published Apr 15 • 15
AlphaSpace: Enabling Robotic Actions through Semantic Tokenization and Symbolic Reasoning Paper • 2503.18769 • Published Mar 24 • 10
PoseLess: Depth-Free Vision-to-Joint Control via Direct Image Mapping with VLM Paper • 2503.07111 • Published Mar 10 • 3
AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO Paper • 2502.14669 • Published Feb 20 • 14
Ichigo: Mixed-Modal Early-Fusion Realtime Voice Assistant Paper • 2410.15316 • Published Oct 20, 2024 • 12
🍓 Ichigo v0.3 Collection The experimental family designed to train LLMs to understand sound natively. • 6 items • Updated Nov 11, 2024 • 18
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding Paper • 2404.16710 • Published Apr 25, 2024 • 80
LLM Hallucination Detection Papers Collection LLM hallucination and evaluation papers that I've been exploring and implementing. Some include my comments and annotated doodles. • 12 items • Updated Feb 20, 2024 • 13