Elmer Data's Watch Test Exposes a Gap Between Conversational AI and Visual Reasoning
A new analysis shows that large language models excel at language tasks but struggle with seemingly simple visual reasoning—like reading analog clocks.