Voice Agents Struggle With Code-Switched Speech Across Four Language Pairs
ServiceNow and Hugging Face benchmark ASR models on bilingual customer interactions, revealing significant performance gaps when speakers mix languages mid-sentence.
A new analysis shows that large language models excel at language tasks but struggle with seemingly simple visual reasoning, such as reading analog clocks.
Hugging Face introduces private ASR evaluation datasets from Appen Inc. and DataoceanAI to block benchmaxxing, with scores visible via an opt-in toggle.
GitHub user erogol's BlaGPT offers an open-source research sandbox for evaluating LM architectures and components on compact datasets.