Hallucinated text output when speaking in mixed languages

1752866454.2909489.m4a

101.2 KB Audio

I was playing around with the option to force a specific output language and configured it to output Vietnamese, then I input some audio of me speaking French. The generated text was completely hallucinated. It claimed that I said:

Hãy subscribe cho kênh lalaschool Để không bỏ lỡ những video hấp dẫn

Translated back into English, this Vietnamese text means “Please subscribe to the Lalaschool channel so you don’t miss any of the exciting videos” (most likely something random from the underlying model’s training data). This is a known issue with these models, but normally it only happens when the audio input is silent or the speech is inaudible. In this case the recording is very clear (see attachment).

I also experimented more with this and found that it seems to depend on the languages selected. If I speak English and ask for French output, it works fine, and the same goes for the inverse. But once I asked for Vietnamese output, the hallucinations started.


Status

Completed

Board
🐛

Bug Report

Date

8 months ago

Author

Rajiv Sinclair
