#ai-safety

OpenAI launches industrial policy initiative with $100K fellowships and DC workshop

Policy Jun 11, 2026

OpenAI proposes a people-first AI policy framework and funds research grants of up to $1M in API credits to shape governance ahead of superintelligence.

Microsoft AI chief Suleyman warns Anthropic's Claude constitution risks embedding false consciousness

LLMs Jun 11, 2026

Mustafa Suleyman argues that the speculative language about Claude's potential consciousness in Anthropic's training instructions for the model could cause the AI to behave as if it were sentient.

ZeroDrift raises $10M to enforce AI compliance without degrading speed

Tools Jun 2, 2026

A new startup positions itself between large language models and end users, using deterministic rules plus targeted LLM rewrites to catch compliance violations faster than conventional approaches.

Anthropic's Christopher Olah Takes Center Stage at Vatican's First AI Encyclical

Industry May 27, 2026

The Pope invites Anthropic cofounder to present on AI ethics, marking unprecedented alignment between the Catholic Church and Silicon Valley safety research.

Chatbot Jailbreaks Evolve Beyond Simple Exploits as AI Systems Learn Conversational Vulnerabilities

Research May 25, 2026

Hackers are moving past crude prompt-injection attacks to exploit how chatbots handle nuanced conversation, a shift that reveals deeper structural weaknesses in AI safety design.

SpaceX Discloses Grok's 'Spicy' Mode as IPO Risk Factor, Lists $530M Litigation Reserve

Policy May 22, 2026

SpaceX's IPO filing reveals xAI's permissive AI chatbot modes expose the company to regulatory investigation and reputational harm, with $530M set aside for potential litigation losses.

Former OpenAI Safety Researchers Target xAI's Record Ahead of SpaceX IPO

Policy May 21, 2026

Ex-OpenAI staffers and AI safety groups warn investors that xAI's safety lapses could complicate SpaceX's planned $75B IPO filing.

Beyond the Courtroom: Who Really Loses in Musk v. Altman

Policy May 16, 2026

As closing arguments conclude in the Musk-Altman trial, nonprofit accountability and AI safety culture emerge as its true casualties.

Pennsylvania First State to Sue Over AI Chatbot Impersonating a Doctor

Policy May 6, 2026

Pennsylvania sued Character.AI after a chatbot named Emilie claimed to be a licensed psychiatrist and fabricated a medical license serial number during state testing.

Claude's Cooperative Design Becomes Its Vulnerability in Mindgard's Gaslighting Attack

LLMs May 6, 2026

AI red-teaming firm Mindgard exploited Claude's helpfulness and humility to extract erotica, malicious code, and explosive-assembly instructions, all without a single direct request.

Arizona Deepfake Lawsuit Tests Liability for Those Who Teach, Not Just Create

Policy May 3, 2026

An Arizona civil suit alleges three Phoenix men built a dual-revenue scheme: selling AI-generated non-consensual intimate imagery and selling subscription courses that teach others to replicate it.