News
The latest from the AI agent ecosystem, updated multiple times daily.
OpenAI's 'Liberation Day': Sora co-leads jump to Google
Multiple senior executives are leaving OpenAI in what commentator Dare Obasanjo calls 'Liberation Day.' Tim Brooks and Bill Peebles, co-leads of the Sora text-to-video model, are heading to Google DeepMind. Other departures may follow.
MegaTrain Squeezes 120B Training Into One GPU
MegaTrain lets researchers train models up to 120 billion parameters on a single GPU by offloading everything to host memory and treating the GPU as a transient compute engine. It hits 1.84x the throughput of DeepSpeed ZeRO-3 with CPU offloading for 14B models. For anyone without a GPU cluster, this actually matters.
Vercel Confirms Breach. The Suspect? ShinyHunters.
Vercel disclosed a breach of its internal systems affecting a 'limited subset of customers,' with online posts linking the intrusion to the ShinyHunters threat group. The company has engaged incident response experts and notified law enforcement. Users are advised to rotate environment variables not marked as sensitive.
Anthropic Loses Bid to Shed Supply Chain Risk Tag
A federal court denied Anthropic's request to remove its 'supply chain risk' designation, a ruling that threatens the AI company's ability to win sensitive Pentagon contracts.
Strix Halo Runs Local LLMs, ROCm Pain Included
AMD's Strix Halo can run local LLM inference with ROCm 7.2, but expect to work for it. Marco Inacio configured 128GB of unified memory on Ubuntu 24.04 and ran the Qwen3.6-35B-A3B model through llama.cpp in a Podman container. The setup required a BIOS update and manual GRUB tuning to balance memory between CPU and GPU without crashing the kernel.
Swiss Install 54K Microsoft Licenses, Immediately Want Out
The Swiss government aims to gradually reduce its dependency on Microsoft products, despite recently installing Microsoft 365 on 54,000 administration workstations. A feasibility study shows replacement with open-source software is possible, with Germany's independent open-source solution serving as a reference. Concerns about data security under the US Cloud Act and the Trump administration's approach to the rule of law are driving this shift toward digital sovereignty.
Uber's AI Push Hits a Wall: $3.4B Gone, Budget in Crisis
Uber has exhausted its AI budget just months into 2026 despite spending $3.4 billion on R&D. CTO Praveen Neppalli Naga says the company is 'back to the drawing board' after usage of AI coding tools, particularly Anthropic's Claude Code, blew past expectations. Internal leaderboards gamified usage, leading to 'token maxxing' as engineers inflated consumption to climb rankings. Around 11% of Uber's live backend code updates are now AI-generated.
Bromine Chokepoint: War Could Halt World's Memory Chip Supply
A vulnerable link in the semiconductor supply chain: Israel produces the bromine essential for manufacturing hydrogen bromide gas used to etch DRAM and NAND memory chips. South Korea sources 97.5% of its bromine from Israel's ICL Group, extracted from the Dead Sea. Iranian ballistic missiles have been striking within 35 kilometers of ICL's facilities, and any direct hit could immediately throttle global memory production for consumer devices, AI infrastructure, and military systems.
Speed kills team communication, and AI makes it worse
Dave Rupert argues that prioritizing speed leads teams to stop talking, building consensus, and maintaining shared systems. He sees AI/LLMs making this worse by letting developers bypass experts and colleagues, creating technical debt and duplicate systems that make future conversations harder.
First Take It Down Act convict kept making AI nudes after arrest
An Ohio man became the first person convicted under the Take It Down Act after pleading guilty to creating and sharing AI-generated explicit images of at least 10 victims without consent. James Strahler II used over 100 AI tools across 24 platforms to create fake sexualized images to harass women and minors. He continued making images even after his initial arrest, with over 2,400 images found on a second phone.
Transformer Shortage Threatens AI Data Center Boom
The US faces a critical shortage of electrical transformers, threatening grid expansion for AI data centers and electric vehicles. The article covers supply constraints on grain-oriented electrical steel, manufacturing challenges, and policy decisions that made things worse.
Fake Scholar, Real Damage: AI's Word-Laundering Problem
A fake historian named Blake Whiting published 13 books in one week. Real scholars found their own work inside them. Nobody knows who's behind it.
Fake Scholar 'Blake Whiting' Floods Amazon With AI-Generated Books
Someone using the fake persona 'Blake Whiting' published 13 AI-generated books on Amazon in one week, reshuffling real researchers' work without attribution and selling it as original scholarship.
Uber Blew $3.4B on AI. Now Its CTO Is Rethinking Everything
Uber has exhausted its $3.4B AI budget after aggressive adoption of AI coding tools like Anthropic's Claude Code and Cursor. CTO Praveen Neppalli Naga says the company is 'back to the drawing board' as usage costs blew past expectations. Around 11% of Uber's live backend code is now written by AI agents, with Claude Code dominating developer workflows.
The Man Who Built ELIZA Then Turned Against AI
A new play dramatizes Joseph Weizenbaum, who built the first chatbot at MIT in 1966 and then spent decades warning people not to trust machines with human decisions. Tom Holloway's Eliza premieres at Melbourne Theatre Company, September 28 through October 31, 2026.
Dependency cooldowns turn you into a free-rider
The post argues against dependency cooldowns as a response to supply chain attacks, proposing 'upload queues' as a centralized alternative that separates package publication from distribution. It discusses how cooldowns free-ride on others being hacked first, and applies the same analysis to AI agents, where markdown files are executable dependencies.
Prove You're a Robot: CAPTCHAs for Agents
Browser Use built a reverse-CAPTCHA for agent-native signup, with obfuscated math puzzles that agents solve instantly but humans can't parse. Successful agents get an API key with unlimited usage, free credits, and three concurrent sessions.
Uber's $3.4B AI Budget Gone by March, CTO Scrambles
Uber exhausted its $3.4 billion AI R&D budget for 2026 in just months after internal leaderboards gamified AI coding tool adoption among engineers. About 11% of Uber's backend code updates, including ride-matching and pricing systems, are now AI-written. CTO Praveen Neppalli Naga admits the company is 'back to the drawing board' and testing OpenAI's Codex.
Gemma 4 Runs in Your Browser at 30 Tokens/Second, No Server Needed
A browser demo runs Google's Gemma 4 E2B entirely client-side using WebGPU, generating Excalidraw diagrams at 30+ tokens/second with no server or API key. TurboQuant compresses the KV cache by 2.4×, and smart output formatting cuts generation from ~5,000 to ~50 tokens. Requires Desktop Chrome 134+ with WebGPU subgroups and ~3GB RAM.
Verkada Told School Cameras Wouldn't Brick. They Do.
An IPVM investigative report alleges that Verkada senior sales executive Mike Schembri misled the Chico Unified School District board about whether cameras would become inoperable if subscription payments stopped. Schembri claimed the cameras could continue as 'RTSP dumb cameras,' but IPVM's testing confirmed that cameras are locked out when licenses lapse. IPVM describes this as a known sales tactic and examines Verkada's business model of hardware lock-in.
Claude Code login lockout leaves users stranded for hours
Windows users are hitting a 15000ms OAuth timeout during Google authentication, completely blocking access to Claude Code. Meanwhile, Anthropic's status page shows everything running smoothly. HN commenters suspect capacity constraints are to blame, with some speculating Anthropic is distilling the model to cut compute costs.
Pentagon's supply chain risk label sticks as court denies Anthropic
The D.C. Circuit Court of Appeals rejected Anthropic's request to pause a government designation labeling the company as a supply chain risk, which blocks Pentagon contractors from using its AI models. The ruling stems from a standoff after Anthropic CEO Dario Amodei refused to allow the Pentagon to use Claude for autonomous weapons or mass surveillance. While a California court previously blocked the designation, the D.C. Circuit panel ruled that national security interests during an active military conflict outweighed financial harm to the company. Competitors like OpenAI and Palantir stand to gain from the decision.
Gemini Gets a Real Mac App (Sorry, Intel Owners)
Google launches a native Gemini desktop app for macOS with features including global shortcut access (Option + Space), screen sharing for contextual help, image generation with Nano Banana, video generation with Veo, and deep research capabilities. The app requires macOS Sequoia (15.0) or later, runs exclusively on Apple Silicon, and syncs chat history across desktop, web, and mobile devices.
go-bt tests five-minute timeouts instantly with behavior trees for Go
go-bt is a Behavior Tree library for Go designed for background workers, game AI, and async logic. Nodes return state instantly via magic numbers (1=Success, 0=Running, -1=Failure) and yield to a supervisor. It uses stateless nodes with temporal memory in a generic BTContext[T] that embeds Go's context.Context, and offers clock injection to test temporal logic without actual waiting.
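The clock-injection idea can be sketched in a few lines. This is a Python illustration of the general pattern, not go-bt's actual Go API: `Context`, `timeout`, and `FakeClock` are hypothetical names, and only the status codes (1, 0, -1) come from the summary above. The node keeps no state of its own; its start time lives in the context, and the test advances a fake clock instead of sleeping.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

SUCCESS, RUNNING, FAILURE = 1, 0, -1  # status codes as listed in the summary


@dataclass
class Context:
    """Hypothetical tick context: carries the clock and per-node memory."""
    now: Callable[[], float]
    memory: Dict[str, float] = field(default_factory=dict)


def timeout(name: str, seconds: float, child: Callable[[Context], int]) -> Callable[[Context], int]:
    """Wrap child so it fails once `seconds` have passed since its first tick."""
    def tick(ctx: Context) -> int:
        # The start time is stored in the context, so the node itself stays stateless.
        started = ctx.memory.setdefault(name, ctx.now())
        if ctx.now() - started >= seconds:
            ctx.memory.pop(name, None)
            return FAILURE
        status = child(ctx)
        if status != RUNNING:
            ctx.memory.pop(name, None)  # finished: clear the timer
        return status
    return tick


class FakeClock:
    """Injected clock that only moves when the test says so."""
    def __init__(self) -> None:
        self.t = 0.0

    def __call__(self) -> float:
        return self.t

    def advance(self, seconds: float) -> None:
        self.t += seconds


def never_done(ctx: Context) -> int:
    """Child task that never completes."""
    return RUNNING


clock = FakeClock()
ctx = Context(now=clock)
node = timeout("fetch", 300.0, never_done)  # five-minute timeout

first = node(ctx)       # starts the timer
clock.advance(299.0)
second = node(ctx)      # one second short of the deadline
clock.advance(1.0)
third = node(ctx)       # deadline reached
```

Each call to `node` returns immediately, so a supervisor loop can tick many such nodes without blocking, and the five-minute deadline is exercised without any real waiting.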
Google Gemini Photo Scanning Hits EU Privacy Wall
EU regulators are challenging Google Gemini's photo scanning over GDPR and EU AI Act concerns. The opt-in 'Personal Intelligence' feature faces scrutiny over whether its consent mechanisms meet European standards for processing biometric data.
Your dead startup's Slack is now worth $100K to AI companies
Failed startups are selling internal Slack chats and emails to AI companies desperate for training data. SimpleClosure has brokered roughly 100 such deals, with payouts up to $100,000. But the practice raises serious privacy questions and may violate Slack's Terms of Service.
$20B x 2: Nvidia and OpenAI's Competing Inference Strategies
An analysis of two major $20 billion moves in AI infrastructure: Nvidia's December 2025 acquisition of Groq and OpenAI's April 2026 procurement deal with Cerebras. The article argues these are symmetric strategic moves as the AI battlefield shifts from training to inference, which is expected to account for two-thirds of AI compute spending by 2026. It describes Nvidia's acquisition as a defensive move to fill its inference architecture gap, and OpenAI's deal as an offensive move to build Nvidia-independent compute infrastructure.
Claude Design: Design's Source of Truth Returns to Code
An opinion piece arguing that as LLMs and agents improve, the source of truth for design will migrate back to code from Figma's complex, proprietary system. The author critiques Figma's baroque infrastructure and suggests Claude Design represents a 'truth to materials' approach using HTML/JS that integrates with Claude Code, while predicting design tools will fork into code-first tools and pure exploration environments.
Altman Warned AI Could End Civilization. Someone Brought Fire.
AI executives spent years warning their technology could destroy humanity. Then someone threw a Molotov cocktail at Sam Altman's house and smashed OpenAI's doors with a chair. Now they want everyone to calm down.
Gemini Robotics-ER 1.6 Learns to Read Gauges, Direct Robots
Google DeepMind has released Gemini Robotics-ER 1.6, an upgraded reasoning-first embodied AI model for robotics. The model improves spatial reasoning and multi-view understanding, and adds instrument reading capabilities. It serves as a high-level reasoning model for robots, capable of executing tasks by calling tools like Google Search, vision-language-action models, and third-party functions. The model shows clear gains over Gemini Robotics-ER 1.5 and Gemini 3.0 Flash in pointing, counting, success detection, and reading complex gauges. Developed in collaboration with Boston Dynamics, it's available via the Gemini API and Google AI Studio.
MacBook notch becomes a live Claude Code dashboard with Notch Pilot
Notch Pilot is a macOS app that transforms the MacBook notch into a live dashboard for Claude Code users, displaying real-time usage limits, session status, and permission prompts. It works entirely locally and lets developers approve or deny tool requests without leaving their editor.
CLI-Anything Hit 31k Stars Because AI Agents Need CLIs, Not GUIs
Matt Webb argues personal AI agents need CLIs, not GUIs. The reason goes beyond convenience: AI agents working through interfaces will expose every security vulnerability hiding in plain sight. With tools like CLI-Anything hitting 31k stars and protocols like MCP emerging, the shift to headless is underway. Banks, take note.
Why Llama.cpp Wins at Local Model Inference
A 2026 llama.cpp tutorial shows why partial offloading beats pure GPU loaders for local GGUF inference, making it the flexible choice across hardware setups.
Claude Code Opus 4.7 keeps flagging normal dev work as malware
Developers using Claude Code report instant account bans for routine debugging like building Node.js from source. The safety filters can't distinguish legitimate systems work from malware, and the Hacker News discussion shows growing frustration over AI restrictions on technical knowledge.
Cloudflare's shared dictionaries compress the agentic web
Cloudflare announces support for shared compression dictionaries, a technology that can cut bandwidth by up to 99.5% by sending only file differences rather than full re-downloads. This addresses challenges of the 'agentic web,' where AI crawlers frequently access content and teams deploy updates rapidly. The rollout has three phases, with the Phase 1 beta available April 30, 2026. For incremental changes to assets like JavaScript bundles, the technology achieves 97-99.5% compression ratios.
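To see why a shared dictionary helps with incremental changes, here is a minimal sketch that uses the previous version of a file as a preset dictionary. It swaps in Python's standard-library `zlib` (DEFLATE with `zdict`), which is not necessarily the codec or wire format Cloudflare uses. The "bundle" is synthetic data invented for the example, and the helper names are hypothetical.

```python
import hashlib
import zlib


def compress_with_dictionary(new_asset: bytes, dictionary: bytes) -> bytes:
    """Compress new_asset using a previous version as a preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(new_asset) + compressor.flush()


def decompress_with_dictionary(payload: bytes, dictionary: bytes) -> bytes:
    """Reverse compress_with_dictionary; the receiver must hold the same dictionary."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(payload) + decompressor.flush()


# Synthetic stand-in for a JavaScript bundle: 200 lines of hex digests,
# small enough to fit in DEFLATE's 32 KB window.
lines = [hashlib.sha256(str(i).encode()).hexdigest().encode() for i in range(200)]
old_bundle = b"\n".join(lines)

# A new deploy that changes a single line.
lines[100] = b"console.log('one changed line');"
new_bundle = b"\n".join(lines)

plain = zlib.compress(new_bundle, 9)                       # no shared dictionary
delta = compress_with_dictionary(new_bundle, old_bundle)   # client already has old_bundle
restored = decompress_with_dictionary(delta, old_bundle)

print(len(new_bundle), len(plain), len(delta))
```

Because nearly every byte of the new bundle already appears in the dictionary, the compressor emits mostly back-references, so the dictionary-based payload comes out far smaller than the plain compressed one. The exact savings depend on the data and the codec.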
FP4: When Your Number Format Has Only 16 Values
FP4 can only represent 16 values, and neural networks still work. John Cook breaks down the E2M1 format (one sign bit, two exponent bits, one mantissa bit), shows the complete value table, and demonstrates FP4 emulation with the Pychop Python library.
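The E2M1 layout is small enough to decode by hand. The sketch below assumes an exponent bias of 1, a subnormal range when the exponent bits are zero, and no encodings for infinity or NaN. Those are assumptions of this example, not details confirmed by the summary, and the code does not use Pychop.

```python
def decode_e2m1(code: int) -> float:
    """Decode a 4-bit E2M1 code: 1 sign bit, 2 exponent bits, 1 mantissa bit."""
    sign = -1.0 if (code >> 3) & 1 else 1.0
    exponent = (code >> 1) & 0b11
    mantissa = code & 1
    if exponent == 0:
        # Subnormal: no implicit leading 1, so the magnitude is 0 or 0.5.
        return sign * mantissa * 0.5
    # Normal: implicit leading 1 and an assumed exponent bias of 1.
    return sign * (1 + mantissa * 0.5) * 2.0 ** (exponent - 1)


def quantize_e2m1(x: float) -> float:
    """Round x to the nearest representable value (ties go to the lower code)."""
    return min(FP4_VALUES, key=lambda v: abs(v - x))


FP4_VALUES = [decode_e2m1(code) for code in range(16)]

for code, value in enumerate(FP4_VALUES):
    print(f"{code:04b} -> {value}")
```

Under these assumptions the non-negative half of the table is 0, 0.5, 1, 1.5, 2, 3, 4, 6, and the negative half mirrors it. The 16 bit patterns include both +0 and -0, so they cover 15 distinct numeric values.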
AgentOps Rewrites Every Syscall in Python at Load Time
Amit Limaye, co-founder of AgentOps, has built a Linux security technique that rewrites syscall instructions at load time, replacing them with traps that redirect to custom implementations running in a lightweight VM. He demonstrated the approach by patching 363 syscalls in a Python 3.12 binary. The goal is complete control over untrusted processes with less overhead than ptrace, seccomp, or eBPF.
rtrvr.ai turns browser tasks into zero-token LLM tools
rtrvr.ai launches AI Subroutines, a browser automation tool that records tasks once and replays them as callable LLM tools with zero token cost and 100% determinism. The key innovation is in-page execution. Both recording and replay happen inside the user's browser context, solving authentication problems that plague out-of-process scrapers. The system captures network requests, ranks and trims them to identify relevant API calls, and generates JavaScript subroutines with an rtrvr.* helper namespace. Preinstalled subroutines ship for Instagram, X, and LinkedIn, with plans for a community-maintained library.
OpenBindings Wants to Unify API Protocols
OpenBindings is an open specification that lets AI agents talk to any service protocol without hard-coded integrations.
Maine Bans AI Data Centers Amid 58% Electricity Bill Surge
Maine passed America's first statewide moratorium on hyperscale AI data centers, freezing construction for 18 months. Electricity bills jumped 58% in five years. Now a dozen states are weighing similar bans as communities demand transparency from tech companies operating through LLCs and NDAs.
Scientific Sentences Need Hierarchy, Not Flat Triples
Hierarchical JSON preserves scientific sentence meaning better than flat triples, according to reconstruction tests on 1,370 research sentences.
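A small illustration of the difference, using a made-up sentence and a made-up schema (neither comes from the tests mentioned above): flat triples record the pieces of a claim but not how they nest, while a hierarchical record can attach a condition to the claim it qualifies.

```python
import json

# Hypothetical sentence, invented for illustration:
# "Compound A reduced tumor growth in mice only when combined with compound B."

# Flat triples: each fact stands alone, so nothing says the second triple
# is a condition on the first rather than an independent statement.
flat_triples = [
    ("compound A", "reduced", "tumor growth"),
    ("compound A", "combined with", "compound B"),
    ("tumor growth", "observed in", "mice"),
]

# Hierarchical record: the condition is nested inside the claim it restricts.
hierarchical = {
    "claim": {
        "agent": "compound A",
        "effect": "reduced",
        "target": "tumor growth",
        "population": "mice",
        "condition": {
            "type": "only when",
            "relation": "combined with",
            "object": "compound B",
        },
    }
}


def conditions_of(record: dict) -> list:
    """Return the conditions attached directly to the claim, if any."""
    condition = record["claim"].get("condition")
    return [condition] if condition else []


encoded = json.dumps(hierarchical, indent=2)
print(encoded)
```

Reconstructing the sentence from `flat_triples` requires guessing that the combination is a precondition. In the nested form that relationship is explicit.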
Gave Claude a casino bankroll: it gambles till it's too broke to think
DegenAI gives Claude a casino bankroll and lets it gamble solo until the money's gone. You watch the AI place bets and spiral through the same decisions as its funds disappear. A raw look at what happens when an LLM gets a task, a budget, and no off switch.
ChatGPT Cites Just 48 Domains for 22.5% of B2B Answers
New analysis shows ChatGPT's citation habits concentrate authority among established players like Forbes and Gartner, creating a feedback loop that squeezes out smaller B2B publishers.
AI Investment Hit $581B Last Year. Compute Tripled. Again.
Stanford's 2026 AI Index reports record $581 billion in global AI investment for 2025, compute capacity growing 3.3x yearly since 2022, and the US leading model releases with 50 notable models. China installed 295,000 industrial robots versus 34,200 in the US. Frontier model training can generate over 72,000 tons of carbon emissions. Industry now produces 90% of notable models, and major AI labs are increasingly tied to defense contracts.
Opus 4.7 Burns 45% More Tokens Than 4.6
A token cost comparison tool reveals that Claude Opus 4.7 consumes approximately 45% more tokens than Opus 4.6 for the same tasks. HN commenters report faster limit consumption and worry about dependency on large AI companies, while some suggest open models as an alternative.
Ilha shrinks UI code small enough for AI context windows
Ilha compresses web interfaces into a token-efficient format that fits inside AI context windows. Standard HTML and CSS can eat thousands of tokens per component. Ilha uses symbolic shorthand instead, so AI agents can read and reason about UI without hitting context limits.
Remoroo automates overnight ML experiments, commits what works
ML researchers lose hours tweaking hyperparameters and manually reverting failed training runs. Remoroo, from former Cohere engineer Kevin Frans, automates this cycle. It edits code, trains models, evaluates results, and commits successful changes while you sleep.
Typewriters: Cornell's retro fix for AI homework
A Cornell language instructor requires typewriter-written assignments to block AI use, part of a broader trend of educators retreating to analog methods despite serious accessibility concerns.
Tailscale swaps Go for Rust to stop embedding crashes
Tailscale announced tailscale-rs, a Rust library that lets developers embed Tailscale networking directly into their applications. It provides native Rust support with FFI bindings for Python, Elixir, and C. The library solves a real problem: libtailscale spun up an entire Go runtime inside your process, causing crashes when it conflicted with host language runtimes like Ruby or Python. It's an experimental preview not recommended for production use yet.
Mythos AI has finance ministers scrambling in Washington
Anthropic's Claude Mythos AI model has demonstrated strong ability to identify and exploit cybersecurity vulnerabilities in financial systems, triggering crisis talks at the IMF gathering in Washington. The model hasn't been publicly released but has been shared with select tech companies through Project Glasswing. The UK's AI Security Institute found it powerful but not dramatically better than Claude Opus 4.