MiniMaxAI/README · Pronunciation Issue: "nan4 xiong1 nan4 di4" being spoken as "nan2 xiong1 nan2 di4"

Hi, I encountered a pronunciation issue when using the TTS model.

I provided the exact pinyin input "nan4 xiong1 nan4 di4" for the idiom “难兄难弟”, where both instances of "难" should be pronounced in the fourth tone (nan4). However, in the generated audio, both were spoken in the second tone (nan2), which changes the intended meaning.
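
To make the mismatch easy to see, here is a minimal sketch that compares the pinyin I supplied with what I hear in the audio. It is only an illustration: the helper names `parse_numbered_pinyin` and `tone_mismatches` are hypothetical, it does not call the TTS model or any MiniMax API, and the "heard" string is my own transcription of the output.

```python
import re

# One numbered-pinyin syllable: letters followed by a tone digit 1-5.
SYLLABLE = re.compile(r"^([a-zü]+)([1-5])$")


def parse_numbered_pinyin(text):
    """Split a string like 'nan4 xiong1' into [('nan', 4), ('xiong', 1)]."""
    result = []
    for token in text.split():
        match = SYLLABLE.match(token.lower())
        if match is None:
            raise ValueError(f"not numbered pinyin: {token!r}")
        result.append((match.group(1), int(match.group(2))))
    return result


def tone_mismatches(expected, heard):
    """Return (index, expected_syllable, heard_syllable) for each difference."""
    exp = parse_numbered_pinyin(expected)
    got = parse_numbered_pinyin(heard)
    if len(exp) != len(got):
        raise ValueError("syllable counts differ")
    return [(i, e, g) for i, (e, g) in enumerate(zip(exp, got)) if e != g]


# Expected input vs. my transcription of the generated audio.
mismatches = tone_mismatches("nan4 xiong1 nan4 di4", "nan2 xiong1 nan2 di4")
for index, exp, got in mismatches:
    print(f"syllable {index}: expected {exp[0]}{exp[1]}, heard {got[0]}{got[1]}")
```

Only the first and third syllables (both "难") differ; "xiong1" and "di4" come out as expected.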

Is there a way to enforce tone accuracy strictly based on the provided pinyin input? Or is this a known limitation of the model?

Thank you in advance!