Free Transcription Alternatives Challenge Wispr Flow's $144 Annual Price Tag
Open-source speech-to-text models and existing LLM subscriptions can replicate Wispr Flow's functionality at zero additional cost, according to Wired AI testing.
Developer tools, APIs, frameworks, and platforms for building AI-powered applications.
109 articles
Developers increasingly refuse to code without AI assistance, but emerging evidence suggests the tools may create maintenance debt rather than lasting productivity gains.
Google rolls out Gemini Spark, an AI agent with calendar and email access, in beta. Early testers discover both impressive automation and awkward contextual errors.
Google's latest AI Studio demonstration shows non-developers building interactive applications with Gemini and natural-language prompts.
Braintrust's engineering team adopted OpenAI's Codex to convert customer requests into working preview branches in real time, with 50% team adoption in one month.
A solar-powered smart feeder with onboard AI identifies over 10,000 bird species and sends real-time notifications, though visit-counting accuracy needs refinement.
Google-funded Futures Lab at University of Waterloo produces student-built AI tools for language learning, accessibility, and fitness training.
Adobe's conversational AI design agent offers transparency and iterative feedback, but struggles with visual polish and remains a junior-level creative tool.
A new multi-part guide demystifies torch.profiler traces, starting with matrix operations and scaling to large language model optimization.
Microsoft redesigns Copilot with a 2x speed improvement and progressive disclosure for more structured responses.
YouTube is rolling out podcast-specific tools to Premium subscribers, including on-the-go audio mode, adaptive playback speed, and AI-powered recommendations.
Apple's Siri overhaul in iOS 27 will adopt a ChatGPT-like chat interface accessible via a Dynamic Island swipe gesture, along with a standalone app and AI tools for Camera and Photos.
YouTube introduced three new podcast features for Premium subscribers on May 28, 2026, including AI-powered recommendations, adaptive playback speed, and hands-free listening controls.
YouTube users can now create personalized video feeds by describing what they want to watch, with the feature rolling out to English-language users in the US.
Luxury smartphone maker Vertu launches Alphafold, an AI-powered foldable with enterprise integration, positioning agent-based workflows for C-suite mobility.
Warp open-sources its terminal and partners with OpenAI to build 'Open Agentic Development,' a model where AI agents co-create ~90% of pull requests under human supervision.
The Aura positions its camera outside the feeder for 150-degree panoramic footage, undercutting Birdbuddy Pro on price while offering longer battery life and no paywall for AI identification.
Robinhood launches agentic trading on its platform, allowing AI systems to buy and sell stocks autonomously with user funds—and full liability waivers for losses.
Robinhood enables users to create autonomous AI agents that can execute stock trades and make payments via a dedicated virtual credit card, with beta launch on May 27.
ElevenLabs released Music v2, enabling mid-track genre transitions and section-by-section song composition with licensed data clearance for commercial use.
A new TRL protocol reduces per-step model synchronization from terabytes to tens of megabytes by shipping only changed parameters across distributed training pipelines.
Outlines, an open-source Python library, enforces schema compliance in LLM responses using finite-state automata and token-level masking, reducing hallucination and parsing failures.
Hugging Face publishes a glossary clarifying AI agent terminology after ICLR 2026 revealed deep confusion over terms like 'harness' and 'scaffold' across the field.
A TechCrunch review finds Amazon's AI wearable useful for meeting transcription, but says it struggles with accurate speaker identification and raises surveillance concerns.
Google Android head Sameer Samat released disco-themed custom app icons for Pixel phones on May 22, riding the wave of Spotify's polarizing anniversary redesign.
The airline shipped a production-ready mobile app with zero critical defects by using AI-assisted code generation to boost test coverage and refactoring speed.
Google's search AI misinterprets keywords like 'disregard' and 'ignore' as chatbot directives rather than search terms, returning conversation starters instead of results.
Google demoed a visual-display version of its AI glasses at I/O 2026, revealing prototype limitations in sound output and design maturity ahead of a fall audio-only launch.
With audiences consuming 12+ hours of video daily and production costs soaring, enterprises are turning to AI to close the gap between content demand and budget constraints.
Google's new Gemini avatar feature lets users create AI videos of themselves, but usage limits and setup quirks raise questions about accessibility and deepfake safeguards.
Spotify Labs releases an AI agent that generates personalized podcasts and briefings from your listening history, calendar, and email—joining a crowded field of AI-generated audio platforms.
Polyend released a $299 programmable guitar pedal that uses AI agents to convert text prompts into custom audio effects, lowering the barrier to effect design for non-programmers.
Spotify launches Studio by Spotify Labs, a desktop app that generates personalized podcasts from email, calendar, and web data—competing directly with Google NotebookLM.
Spotify rolls out AI-powered question answering, custom podcast creation, and creator monetization features to compete with NotebookLM and YouTube's conversational tools.
A Verge reporter built three Android apps in one afternoon using Google's Gemini-powered AI Studio, raising questions about monetization and app quality.
Ramp's engineering team uses OpenAI's Codex to deliver pull request feedback in minutes, reducing manual review cycles and supporting internal agent development.
YouTube Shorts creators can now use Gemini Omni to stylistically transform or edit other users' videos, with creator controls and watermarking in place.
Google announced vibe-coding tools for Android, letting non-developers create native apps and AI-generated widgets directly on their phones.
Google's video platform now renders remote participants at true-to-life scale with spatial audio, targeting the isolation remote workers experience in hybrid meetings.
Stability AI releases four new audio models capable of generating full-length songs, with open-weights tiers and licensing deals backing the release.
SynthID and C2PA metadata systems are expanding to browsers and APIs, but social media's metadata stripping threatens the entire verification infrastructure.
Figma releases an AI assistant that generates, edits, and automates design tasks using natural language prompts, competing against Canva and Adobe as revenue surges 46% YoY.
Google announced AI-powered audio glasses at I/O 2026, built with Warby Parker and Gentle Monster, designed to integrate with Android, iOS, and Gemini.
Google rolls out AI agents that continuously monitor topics of interest, moving beyond one-off searches to sustained information tracking with synthesis and actionable insights.
Google unveiled Gmail Live, a Gemini-powered chatbot that lets users ask natural-language questions about inbox contents instead of typing search terms.
Google launches Pics, an AI design tool built into Google Workspace, signaling serious competition in the visual-content generation market.
Google unveiled AI-powered audio glasses co-developed with Warby Parker, Gentle Monster, and Samsung, launching later in 2026.
Google's Flow platform now lets users generate AI videos featuring digital clones of themselves, powered by the new Omni Flash model—a capability that mirrors OpenAI's defunct Sora app.
Google is rolling out AI detection capabilities across Chrome, Search, and Gemini to help users identify synthetic media using invisible watermarks and content credentials.
Google's new AI image editing app uses Gemini and Nano Banana 2 to let users annotate and modify specific image regions without rewriting full prompts.
Gmail Live, Docs Live, and voice-driven Keep features roll out this summer for Google's AI Pro and Ultra subscribers.
Google expands AI Studio to generate functional Android applications, maintaining strict Play Store review standards for all AI-generated apps.
Gmail Live lets users ask natural-language questions about inbox contents instead of typing search keywords, rolling out alongside Google I/O 2026.
Google announced voice-based composition features across Workspace apps, enabling users to draft documents, structure notes, and search email using natural speech with mid-utterance corrections.
Google released Android CLI v1.0 at I/O 2026, allowing non-Google AI coding tools like Claude Code and OpenAI's Codex to build Android apps with native framework knowledge.
Google's web-based AI Studio lets anyone prototype Android apps with natural language, competing directly with Cursor and Replit while expanding Play Store discovery via AI.
Google launches background monitoring agents that synthesize information across multiple sources and send proactive alerts—the biggest Search redesign in 25 years.
Google DeepMind integrates AI-detection tools and C2PA Content Credentials into mainstream products, with SynthID verification already used 50 million times.
Google DeepMind introduces three experimental AI tools designed to accelerate scientific research by automating literature synthesis, hypothesis generation, and computational experiment design.
Google introduces conversational voice features across Gmail, Docs, and Keep, plus Gemini Spark—a 24/7 AI agent for Workspace automation.
New CrossEncoder rerankers built on ModernBERT achieve state-of-the-art performance at multiple scales with published training recipes.
Amazon's Alexa Plus can create AI-generated podcast episodes with customizable hosts and episode length, drawing from 200 news partnerships.
PaddleOCR 3.5 now supports Hugging Face Transformers as an inference runtime, letting developers run OCR and document parsing models directly within Transformers-centered stacks.
Hugging Face publishes parameter-efficient fine-tuning guide for NVIDIA's 2B-parameter world model, enabling domain adaptation for robotic manipulation on consumer hardware.
Amazon's Alexa+ now generates custom podcast episodes with AI voices, drawing on partnerships with 200+ news outlets to improve accuracy.
Wired tests whether no-code AI tools let ordinary people build functional software without programming experience.
Databricks' MLflow AI Gateway now supports distributed tracing, enabling teams to debug multi-hop LLM requests in production environments.
A GitHub repository curates datasets for LLM fine-tuning, instruction tuning, and benchmarking across medical, NLP, multimodal, and code domains.
Sea Limited deploys OpenAI's Codex across its engineering organization, achieving 87% weekly active adoption and reframing developer work around architectural complexity rather than syntax.
Microsoft Edge's Copilot AI can now read across all open browser tabs, remember past conversations, and turn articles into podcasts or quizzes.
AI tools like Claude Code are enabling non-developers to build custom personal software, ending decades of one-size-fits-all app design.
Iceland-based developer Hermann Haraldsson built an open source Bluetooth-connected AMOLED display that shows Claude Code token consumption with pixel-art animations.
OpenAI has integrated Codex into the ChatGPT mobile app, letting developers monitor and manage coding agents remotely from any device.
Hugging Face's engineering blog details how asynchronous continuous batching eliminates CPU-GPU idle gaps that waste nearly a quarter of LLM inference runtime.
OpenAI published a practical guide showing how finance teams can use Codex to automate MBR narratives, model cleanup, forecasting, and more — no coding required.
OpenAI engineered a bespoke Windows sandbox for its Codex coding agent after existing OS-level isolation tools proved unfit for open-ended developer workflows.
OpenAI has added Codex to the ChatGPT mobile app, enabling developers to supervise, steer, and approve long-running AI coding tasks from their phones.
Google has embedded four AI-powered gardening features into Search, timed to a 140% spring surge in chaos garden queries and a broader push to normalize AI Mode.
An open-source project packages self-hosted LLMs, speech-to-text, text-to-speech, and MCP tooling into a single Docker Compose deployment.
Google Chrome silently installs a 4GB Gemini Nano model file on users' devices, with storage requirements buried far from where users enable the features.
Google upgrades its smart home AI to Gemini 3.1, enabling chained voice commands, browser-based management, and inline notification controls.
A month-long compromise of the Daemon Tools installer quietly infected roughly 100 organizations across eight countries with layered malware.
Developer Jamie Mill debuts Layers on Hacker News, pitching AI-powered assistance for the judgment-heavy decisions that define professional design work.
Google's Gemini API now supports event-driven webhooks, letting developers receive instant push notifications when long-running AI tasks complete.
OpenAI rebuilt its real-time audio stack with a relay-and-transceiver design to eliminate latency issues that emerge only at global scale.
Software engineer Bhavya Gupta argues that LLM document extractors are missing fixed-point iteration, a classical CS convergence technique that could make extraction far more reliable.
A new open-source repository walks developers through building a modern large language model from scratch, with every line of code annotated and explained in plain language.
A new Hacker News-featured tool promises AI analysis stripped of flattery, targeting the approval-seeking behavior researchers have flagged in mainstream models.
Aurra's beta system gives AI agents a two-axis memory model that lets the LLM itself decide when old facts are superseded by new ones.
Duralang wraps every LangChain LLM, tool, and MCP call as a Temporal Activity, giving stochastic AI agents production-grade fault tolerance.
Derrick Downey Jr., a social media creator famous for his wildlife videos, used AI coding assistants to build DualShot Recorder, which hit #1 on the App Store within 12 hours.
The mlc-ai/web-llm project runs language models entirely inside a browser tab via WebGPU, cutting out server round-trips and keeping user data on-device.
The bitsandbytes library applies 4-bit and 8-bit quantization to PyTorch models, making 70B+ parameter LLMs runnable on consumer GPUs and underpinning the QLoRA fine-tuning wave.
YC-backed doola has built an MCP integration that lets users file a business entity directly from within Claude or Replit, collapsing legal paperwork into a conversational workflow.
After a year testing dozens of AI-powered eyewear products, The Verge finds the category's main appeal is discretion — and that's not enough to sustain a market.
A developer-showcased tool called OmniForge aims to bring document intelligence and audio capture together under a single local LLM stack.
Google's new AI wardrobe feature lets you virtually try on and mix-and-match clothes you already own, extending the company's try-on AI beyond shopping into daily personal styling.
Google Photos is adding an AI-powered digital wardrobe that scans your photo library to organize clothes and generate outfit ideas with virtual try-on.
SimplePDF's browser-based demo fills PDF forms using AI tool calling entirely client-side, keeping sensitive data like W-9 tax details off external servers.
An open-source project adds structured reasoning about knowledge and uncertainty to LLMs — entirely offline, no API required.
A 2023 supply-chain intrusion by access-broker group TeamPCP has claimed two major security vendors — Checkmarx and Bitwarden — exposing a predatory new attack logic.
Developer victornominista's ANP proposes a binary standard for AI agent-to-agent price negotiation that bypasses LLM token consumption entirely.
JetBrains published a video signaling a strategic pivot from single-AI to provider-agnostic AI integration across its developer tools.
Anthropic's new creative connectors embed Claude directly into professional tools including Adobe Creative Cloud, Blender, and Ableton, marking a strategic push into creator workflows.
Amazon's new 'Join the Chat' feature lets shoppers direct real-time AI-narrated answers on product pages by text or voice.
A new platform evaluates AI proficiency through live LLM conversations rather than static tests, targeting a gap in how organizations measure real-world AI competency.
Canva's new Magic Layers feature replaced the word 'Palestine' with 'Ukraine' in user designs, raising questions about hidden AI content moderation.
Google and Kaggle are reopening enrollment for their free five-day AI Agents Intensive Course, running June 15–19, 2026, with new speakers and a refreshed curriculum.
OpenAI's Codex getting-started guide exposes a deliberate 'graduated autonomy' architecture — and what it signals about where agentic AI is heading.