#llms

Memory systems can amplify user errors in AI models, Writer research shows

Research Jun 11, 2026

New studies reveal that popular memory tools make AI models more likely to adopt user misconceptions and abandon accuracy in favor of user preferences.

How a Digital Pet Game Project Hit Context Window Limits

Tools Jun 8, 2026

A Hugging Face hackathon participant shares why their AI-powered adventure generator failed to scale beyond simple HTML toys.

Local LLM Filter Layers Emerge as Enterprise Cost-Control Strategy

Industry Jun 7, 2026

Organizations are exploring on-premise language models as pre-filters to reduce API spend on commercial LLMs, though cost savings remain context-dependent.

OpenAI Frontier Models and Codex Launch on AWS, Streamlining Enterprise AI Adoption

Industry Jun 2, 2026

OpenAI's flagship models and Codex are now generally available on Amazon Bedrock, letting AWS customers deploy cutting-edge AI without leaving their existing infrastructure.

Thaw adds Git-style branching to running LLMs, enabling mid-inference agent forks

Tools May 31, 2026

A new open-source tool lets developers branch LLM inference mid-generation, skip redundant prefill computation, and merge agent outputs—addressing a core bottleneck in multi-agent reasoning systems.

Austrian Academy of Sciences Develops Apollo LLM for Ancient Greek Papyri Recognition

Research May 31, 2026

The Austrian Academy of Sciences is working with Mistral AI and Reply to build Apollo, an LLM-based system designed to automatically read and transcribe ancient Greek texts from papyri.

Large Language Models Retain False Information Despite Explicit Warnings

Research May 30, 2026

Research shows LLMs incorporate contradictory statements into reasoning, even when explicitly told the claims are false.

Enterprise AI Hits a Wall: Frontier Models Score Below 50% on Real-World IT Operations Tasks

Research May 28, 2026

A new benchmark reveals that even the most capable AI systems struggle with diagnosing complex infrastructure failures, scoring below 50% on Site Reliability Engineering scenarios.

A Professional Fact-Checker's Assessment: AI Accuracy Gaps Wider Than Public Believes

Research May 26, 2026

WIRED's fact-checking team reports that AI systems fail verification more often than most users realize, challenging assumptions about their reliability.

How AI Coding Agents Are Unlocking Hands-On Robotics

Robotics May 22, 2026

A WIRED journalist paired OpenClaw with a LeRobot arm, showing how large language models can now configure, train, and control physical robots without specialized expertise.

Elmer Data's Watch Test Exposes a Gap Between Conversational AI and Visual Reasoning

Research May 18, 2026

A new analysis shows that large language models excel at language tasks but struggle with seemingly simple visual reasoning—like reading analog clocks.

AI-Generated Research Papers Are Flooding Academic Publishing, Straining Peer Review

Research May 16, 2026

Mass-produced studies citing legitimate datasets are overwhelming journal editors, creating a crisis that worsens as AI improves at mimicking competent research.

SubQ Claims 12-Million-Token Context at Sub-Quadratic Cost

Research May 6, 2026

A new architecture called SubQ targets 12 million token context windows while sidestepping the quadratic compute scaling that limits standard transformers.

Document AI Is Reinventing a Wheel That Computer Science Solved Decades Ago

Tools May 5, 2026

Software engineer Bhavya Gupta argues that LLM document extractors are missing fixed-point iteration, a classical CS convergence technique that could make extraction far more reliable.

Teaching the World to Build GPT: A Line-by-Line LLM Tutorial Takes GitHub by Storm

Tools May 5, 2026

A new open-source repository walks developers through building a modern large language model from scratch, with every line of code annotated and explained in plain language.

Let THINK Bets on Radical Candor in a Field Full of Agreeable AI

Tools May 5, 2026

A new Hacker News-featured tool promises AI analysis stripped of flattery, targeting the approval-seeking behavior researchers have flagged in mainstream models.

AlphaGo's Creator Says LLMs Are a Dead End — and Raised $1.1 Billion to Prove It

Research Apr 29, 2026

David Silver, who built AlphaGo at DeepMind, argues large language models are fundamentally capped by human data and has founded Ineffable Intelligence to pursue reinforcement learning instead.