Generate responses from text, image, and audio inputs
Apple Silicon-optimized voice cloning with MPS GPU acceleration