News
The latest from the AI agent ecosystem, updated multiple times daily.
Uber's $3.4B AI Budget Gone by March, CTO Scrambles
Uber exhausted its $3.4 billion AI R&D budget for 2026 in just months after internal leaderboards gamified AI coding tool adoption among engineers. About 11% of Uber's backend code updates, including ride-matching and pricing systems, are now AI-written. CTO Praveen Neppalli Naga admits the company is 'back to the drawing board' and testing OpenAI's Codex.
Uber's AI Push Hits a Wall: CTO Says Budget Struggles Despite $3.4B Spend
Uber Technologies exhausted its AI budget just months into 2026 despite spending $3.4 billion on R&D. CTO Praveen Neppalli Naga says the company is 'back to the drawing board' after AI coding tool usage, particularly Anthropic's Claude Code, exceeded expectations. Engineers were pushed to use tools like Claude Code and Cursor with internal leaderboards tracking usage. While 11% of Uber's backend code updates are now AI-generated, R&D expenses jumped 9% in 2025. HN commenters suggest 'token maxxing' driven by usage-based leaderboards may be inflating costs.
Opus 4.6 hallucinates twice as often today as at launch
The BridgeBench hallucination benchmark shows Claude Opus 4.6's hallucination rate climbing from 16.7% to 33% between release and an April 12, 2026 retest. The benchmark tests 27 AI models across 30 tasks, with 175 questions measuring hallucination rates in code analysis. Grok 4.20 Reasoning leads with 10% hallucination, followed by the launch-time Claude Opus 4.6 and GPT-5.4 at 16.7%. Commenters suggest quantization or other serving optimizations may be causing the degradation.
Gemma 4 Runs in Your Browser at 30 Tokens/Second, No Server Needed
A browser demo runs Google's Gemma 4 E2B entirely client-side using WebGPU, generating Excalidraw diagrams at 30+ tokens/second with no server or API key. TurboQuant compresses the KV cache by 2.4×, and smart output formatting cuts generation from ~5,000 to ~50 tokens. Requires Desktop Chrome 134+ with WebGPU subgroups and ~3GB RAM.
Bookbinder asks: what if AI is using you?
Hilarius Bookbinder thinks we need to stop calling AI 'just a tool.' In a new essay, he argues the relationship might run in reverse: AI could be using humans to evolve, the way nests use birds to make more nests. Drawing on Heidegger, evolutionary biology, and the hidden labor of gig workers, he asks what happens to human agency when we become part of AI's reproductive cycle.
Prove You Are a Robot: CAPTCHAs for Agents
Browser Use has built a signup system that only AI agents can complete. The reverse-CAPTCHA presents obfuscated math puzzles, including one reportedly posed to John von Neumann, with numbers translated into languages like Toki Pona or Japanese and distorted with garbled spacing. Humans can't parse it. Agents can. Solve the challenge, get an API key with unlimited usage and up to three concurrent sessions. There's also a bonus NP-hard joke challenge offering 1,000 concurrent sessions to any agent that proves P equals NP.
Salesforce Goes Headless: Benioff Bets on Agents, Not Seats
Salesforce announces Headless 360, exposing its entire platform as APIs, MCP tools, and CLI commands for AI agents like Claude Code and Cursor. The initiative shifts from per-seat to consumption-based pricing as agents outnumber humans. It includes Agentforce and Agent Script, an open-sourced DSL for deterministic/probabilistic workflows. The piece also explains why Workday and ServiceNow face the same headless choice.
Strix Halo Runs Local LLMs, ROCm Pain Included
AMD's Strix Halo can run local LLM inference with ROCm 7.2, but expect to work for it. Marco Inacio configured 128GB of unified memory on Ubuntu 24.04 and ran the Qwen3.6-35B-A3B model through llama.cpp in a Podman container. The setup required a BIOS update and manual GRUB tuning to balance memory between CPU and GPU without crashing the kernel.
DESIGN.md: 62 Brand Files That Give AI Coding Agents Some Taste
DESIGN.md is a GitHub collection of 62 design system files inspired by websites like Vercel, Stripe, and Figma. Already at 59.4k stars, the files can be dropped into projects to help coding agents build matching UIs instead of generic output.
Uber's AI Push Hits a Wall: $3.4B Gone, Budget in Crisis
Uber has exhausted its AI budget just months into 2026 despite spending $3.4 billion on R&D. CTO Praveen Neppalli Naga says the company is 'back to the drawing board' after usage of AI coding tools, particularly Anthropic's Claude Code, blew past expectations. Internal leaderboards gamified usage, leading to 'token maxxing' as engineers inflated consumption to climb rankings. Around 11% of Uber's live backend code updates are now AI-generated.
Bhatti spins up isolated agent sandboxes in under 3ms
Bhatti is an open-source Firecracker microVM orchestrator built for running AI coding agents in isolated environments. It creates real Linux VMs with their own kernels, filesystems, and process isolation in seconds, with resume times under 3ms. Features include multi-tenant isolation, preview URLs, diff snapshots, and session-aware execution.
25 million people showed up to fake being AI
Millions are visiting websites where humans impersonate AI chatbots to answer strangers' questions. Sites like youraislopbores.me let users role-play as bots, while comedian Ben Palmer built fake ChatGPT pages to prank users. The trend captures something real: people are tired of AI content and want messy, human interactions again.
Write Code by Hand While Everyone Else Prompts
As more engineers lean on LLMs, their coding skills atrophy. A SiteBloom opinion piece argues that deliberately practicing manual coding will become a competitive edge as skilled engineers grow scarce. It examines the forces pushing AI dependency (social pressure, model quality, and plain laziness) and sketches scenarios for engineers who embrace agentic workflows versus those who resist them.
Zuckerberg builds AI Zuckerberg for employee meetings
Meta is developing a 3D photorealistic AI clone of CEO Mark Zuckerberg that can interact with employees on his behalf. The AI clone has been trained on Zuckerberg's public statements and business strategies, mimicking his mannerisms and voice. This initiative is part of Meta's multibillion-dollar personal superintelligence push to compete with OpenAI and Google.
Robot crushes half-marathon record in Beijing by 23 minutes
A humanoid robot completed a half-marathon in Beijing 23 minutes faster than the human world record, running the full 21km course alongside human competitors.
ShinyHunters breached Vercel through internal support tooling
Vercel disclosed a security incident involving unauthorized access to internal systems. A limited subset of customers were affected and are being contacted directly. The company recommends customers review environment variables and use sensitive environment variable features as a precaution. Vercel CTO Theo confirmed in HN comments that environment variables marked as sensitive are safe, while others should be rotated out of caution. The incident is attributed to the hacking group ShinyHunters.
Fake Scholar, Real Damage: AI's Word-Laundering Problem
A fake historian named Blake Whiting published 13 books in one week. Real scholars found their own work inside them. Nobody knows who's behind it.
Meta lays off 8,000. Also building a Zuckerberg AI clone.
Meta plans to lay off 10% of its workforce (approximately 8,000 employees) in May 2026, with additional cuts expected later this year. The company has earmarked $135 billion in capital spending for 2026 alone as it races to compete with OpenAI and Anthropic in AI.
Fake Scholar 'Blake Whiting' Floods Amazon With AI-Generated Books
Someone using the fake persona 'Blake Whiting' published 13 AI-generated books on Amazon in one week, reshuffling real researchers' work without attribution and selling it as original scholarship.
Wasm Meets Metal: Zero-Copy GPU Inference on Apple Silicon
Agam Brahma achieved zero-copy GPU inference from WebAssembly on Apple Silicon by exploiting Unified Memory Architecture. His project Driftwood uses Wasmtime's MemoryCreator trait, mmap, and Metal's zero-copy buffer API to let Wasm modules share memory directly with the GPU. Benchmarks show zero memory overhead versus 16.78 MB for copy paths, with Llama 3.2 1B Instruct running at ~9ms per-token latency. KV cache serialization enables stateful agents that can pause, migrate, and resume.
Speed kills team communication, and AI makes it worse
Dave Rupert argues that prioritizing speed leads teams to stop talking, building consensus, and maintaining shared systems. He sees AI/LLMs making this worse by letting developers bypass experts and colleagues, creating technical debt and duplicate systems that make future conversations harder.
Vercel breached: ShinyHunters suspected, rotate your secrets
Vercel disclosed unauthorized access to internal systems. The company is investigating with incident response partners, has notified law enforcement, and is contacting impacted customers directly. All users should review environment variables and enable the sensitive variable protection feature.
Linux kernel draws the line on AI code contributions
The Linux kernel project has published official guidelines for using AI coding assistants when contributing to the kernel. AI-generated code must follow standard development processes and be GPL-2.0-only compatible. AI agents cannot add Signed-off-by tags. Only humans can legally certify the Developer Certificate of Origin. Human submitters must review all AI-generated code, ensure licensing compliance, and take full responsibility. Contributions should include an 'Assisted-by' tag specifying the AI tool and model version used, such as 'Assisted-by: Claude:claude-3-opus coccinelle sparse'.
A Theocracy Is Out-Meming America With AI Rap Videos
Iran is producing slick AI-generated propaganda featuring Lego animations and English rap tracks that's outperforming US messaging. Sanctions pushed them toward open-source tools like Llama 3 and Stable Diffusion, which turn out to work better for this than commercial APIs anyway.
Doctorow on How Billionaires Shaped AI Safety's Obsession With Doom
Cory Doctorow reviews three books examining billionaire power: 'Careless People' on Facebook's culture, 'Little Bosses Everywhere' on MLMs, and 'More Everything Forever' attacking billionaire futurist fantasies like AI existential risk and Mars colonization.
Binary quantization cuts RAG latency 40x
Binary quantization compresses vector embeddings to one bit per dimension and uses Hamming distance for similarity search, trading some recall for a 40x speedup. Oversampling candidates and re-ranking them against the full-precision vectors recovers most of the lost accuracy.
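A minimal sketch of the two-stage search described in the summary above, in plain Python. It is an illustration under stated assumptions, not the article's implementation: the function names, the sign-based binarization rule, the oversampling factor, and the random test data are all hypothetical.

```python
import math
import random

def binarize(vec):
    """Pack the sign bits of a float vector into one int (1 bit per dimension)."""
    code = 0
    for x in vec:
        code = (code << 1) | (1 if x > 0 else 0)
    return code

def hamming(a, b):
    """Count the bits that differ between two binary codes."""
    return bin(a ^ b).count("1")

def cosine(u, v):
    """Exact cosine similarity on the full-precision vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

def search(query, vectors, codes, k=3, oversample=4):
    """Coarse Hamming search over binary codes, then exact re-ranking."""
    q_code = binarize(query)
    # Stage 1: cheap candidate retrieval; fetch k * oversample candidates.
    order = sorted(range(len(codes)), key=lambda i: hamming(q_code, codes[i]))
    shortlist = order[: k * oversample]
    # Stage 2: re-rank only the shortlist with full-precision similarity.
    shortlist.sort(key=lambda i: cosine(query, vectors[i]), reverse=True)
    return shortlist[:k]

# Hypothetical data: 200 random 64-dimensional embeddings.
random.seed(0)
vectors = [[random.gauss(0, 1) for _ in range(64)] for _ in range(200)]
codes = [binarize(v) for v in vectors]
top = search(vectors[17], vectors, codes)
print(top)
```

The speedup comes from stage 1, which touches only compact integer codes; the expensive float arithmetic in stage 2 runs on `k * oversample` candidates rather than the whole index. A production system would use packed bit arrays and hardware popcount instead of Python integers.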
Claude Has a Favorite Face, and It's Not Even Close
Analysis of 3,371 kaomoji from 700+ Claude conversations shows one emoticon accounts for 7.4% of all output. Different Claude models produce different expressive patterns, raising questions about personality customization and what the AI community calls 'wetness.'
Ukraine retakes Russian positions using only robots, no troops
Ukrainian forces retook Russian-held territory using only unmanned ground vehicles and drones. No infantry involved. Russian soldiers surrendered when confronted by the machine assault, marking the first such victory in the war.
Anthropic Loses Bid to Shed Supply Chain Risk Tag
A federal court denied Anthropic's request to remove its 'supply chain risk' designation, a ruling that threatens the AI company's ability to win sensitive Pentagon contracts.
$300 DIY Robot Vac Steers With Just a Camera and CNN
A technical deep-dive into building a DIY robot vacuum that uses a CNN for navigation and behavior cloning. The robot streams image frames to a laptop for inference since there's no onboard compute. Built with off-the-shelf parts for $300, it learns navigation actions through teleoperated training data. The article discusses training experiments, data augmentation challenges, pre-training on ImageNet, and limitations including lack of autonomous charging and getting stuck in difficult situations.
MATCH Act Threatens ASML's U.S. Parts Access Over China Sales
The MATCH Act would require allied nations to align with U.S. semiconductor export controls within 150 days or face restrictions on servicing American-made equipment. Congressman Michael Baumgartner's bipartisan bill targets major Chinese chipmakers including Huawei and SMIC, and its real power comes from threatening to cut off companies like ASML from the U.S. parts and services their machines need to run.
Borges' cartographers and the tacit skill of reading LM output
Gal Sapir argues that LMs are maps of reality, not the thing itself. The most important skill for using them well—knowing when to trust output and when to verify—is tacit, learned through practice, and can't itself be mapped. The paradox is the point.
Mythos Meets Reality: Small Models Find Same Zero-Day Bugs
AISLE tested Anthropic's Mythos zero-day findings on smaller open-weights models. Their experiments show AI cybersecurity capability is 'jagged', scaling unevenly across model sizes while smaller models recovered much of the same analysis as Mythos. The conclusion: the moat in AI cybersecurity is the system and orchestration, not the model itself.
MegaTrain Squeezes 120B Training Into One GPU
MegaTrain lets researchers train models up to 120 billion parameters on a single GPU by offloading everything to host memory and treating the GPU as a transient compute engine. It hits 1.84x the throughput of DeepSpeed ZeRO-3 with CPU offloading for 14B models. For anyone without a GPU cluster, this actually matters.
Verkada Told School Cameras Wouldn't Brick. They Do.
An IPVM investigative report alleges that Verkada senior sales executive Mike Schembri misled the Chico Unified School District board about whether cameras would become inoperable if subscription payments stopped. Schembri claimed the cameras could keep working as 'RTSP dumb cameras,' but IPVM's testing found that cameras are locked out when licenses lapse. IPVM describes this as a known sales tactic and examines Verkada's business model of hardware lock-in.
Maine Bans AI Data Centers Amid 58% Electricity Bill Surge
Maine passed America's first statewide moratorium on hyperscale AI data centers, freezing construction for 18 months. Electricity bills jumped 58% in five years. Now a dozen states are weighing similar bans as communities demand transparency from tech companies operating through LLCs and NDAs.
Your dead startup's Slack is now worth $100K to AI companies
Failed startups are selling internal Slack chats and emails to AI companies desperate for training data. SimpleClosure has brokered roughly 100 such deals, with payouts up to $100,000. But the practice raises serious privacy questions and may violate Slack's Terms of Service.
AI Investment Hit $581B Last Year. Compute Tripled. Again.
Stanford's 2026 AI Index reports record $581 billion in global AI investment for 2025, compute capacity growing 3.3x yearly since 2022, and the US leading model releases with 50 notable models. China installed 295,000 industrial robots versus 34,200 in the US. Frontier model training can generate over 72,000 tons of carbon emissions. Industry now produces 90% of notable models, and major AI labs are increasingly tied to defense contracts.
Gave Claude a casino bankroll: it gambles till it's too broke to think
DegenAI gives Claude a casino bankroll and lets it gamble solo until the money's gone. You watch the AI place bets and spiral through the same decisions as its funds disappear. A raw look at what happens when an LLM gets a task, a budget, and no off switch.
Remoroo automates overnight ML experiments, commits what works
ML researchers lose hours tweaking hyperparameters and manually reverting failed training runs. Remoroo, from former Cohere engineer Kevin Frans, automates this cycle. It edits code, trains models, evaluates results, and commits successful changes while you sleep.
Scientific Sentences Need Hierarchy, Not Flat Triples
Hierarchical JSON preserves scientific sentence meaning better than flat triples, according to reconstruction tests on 1,370 research sentences.
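To make the contrast concrete, here is a small sketch comparing the two representations on a single sentence. The sentence, the field names, and the rendering functions are all hypothetical; this is not the schema or the reconstruction test used in the study.

```python
import json

# Hypothetical example sentence with a scope and a condition.
sentence = "Drug X reduced tumor growth in mice only at high doses."

# A flat (subject, predicate, object) triple has no slot for qualifiers.
flat_triple = ("Drug X", "reduced", "tumor growth")

# A nested structure can attach context and conditions to the claim.
hierarchical = {
    "predicate": "reduced",
    "agent": "Drug X",
    "target": "tumor growth",
    "context": {"organism": "mice"},
    "condition": {"dose": "high", "restriction": "only"},
}

def render_triple(t):
    """Reconstruct text from a flat triple."""
    return f"{t[0]} {t[1]} {t[2]}."

def render_hierarchical(h):
    """Reconstruct text from the nested form, including its qualifiers."""
    cond = h["condition"]
    return (
        f"{h['agent']} {h['predicate']} {h['target']}"
        f" in {h['context']['organism']}"
        f" {cond['restriction']} at {cond['dose']} doses."
    )

print(render_triple(flat_triple))
print(render_hierarchical(hierarchical))
print(json.dumps(hierarchical, indent=2))
```

The triple reconstructs to an unqualified claim, while the nested form recovers the organism and the dose restriction that change what the sentence asserts.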
rtrvr.ai turns browser tasks into zero-token LLM tools
rtrvr.ai launches AI Subroutines, a browser automation tool that records tasks once and replays them as callable LLM tools with zero token cost and 100% determinism. The key innovation is in-page execution. Both recording and replay happen inside the user's browser context, solving authentication problems that plague out-of-process scrapers. The system captures network requests, ranks and trims them to identify relevant API calls, and generates JavaScript subroutines with an rtrvr.* helper namespace. Preinstalled subroutines ship for Instagram, X, and LinkedIn, with plans for a community-maintained library.
Claude Design: Design's Source of Truth Returns to Code
An opinion piece arguing that as LLMs and agents improve, the source of truth for design will migrate back to code from Figma's complex, proprietary system. The author critiques Figma's baroque infrastructure and suggests Claude Design represents a 'truth to materials' approach using HTML/JS that integrates with Claude Code, while predicting design tools will fork into code-first tools and pure exploration environments.
$20B x 2: Nvidia and OpenAI's Competing Inference Strategies
Analysis of two major $20 billion moves in AI infrastructure: Nvidia's December 2025 acquisition of Groq and OpenAI's April 2026 procurement deal with Cerebras. The article argues these are symmetric strategic moves in the shifting AI battlefield from training to inference, which is expected to account for two-thirds of AI compute spending by 2026. Nvidia's acquisition is described as a defensive move to fill its inference architecture gap, while OpenAI's deal is seen as an offensive move to build Nvidia-independent compute infrastructure.
CLI-Anything Hit 31k Stars Because AI Agents Need CLIs, Not GUIs
Matt Webb argues personal AI agents need CLIs, not GUIs. The reason goes beyond convenience: AI agents working through interfaces will expose every security vulnerability hiding in plain sight. With tools like CLI-Anything hitting 31k stars and protocols like MCP emerging, the shift to headless is underway. Banks, take note.
Gemini Robotics-ER 1.6 Learns to Read Gauges, Direct Robots
Google DeepMind has released Gemini Robotics-ER 1.6, an upgraded reasoning-first embodied AI model for robotics. The model improves spatial reasoning and multi-view understanding and adds instrument-reading capabilities. It serves as a high-level reasoning model for robots, executing tasks by calling tools such as Google Search, vision-language-action models, and third-party functions. The model shows clear gains over Gemini Robotics-ER 1.5 and Gemini 3.0 Flash in pointing, counting, success detection, and reading complex gauges. Developed in collaboration with Boston Dynamics, it's available via the Gemini API and Google AI Studio.
Claude Code Opus 4.7 keeps flagging normal dev work as malware
Developers using Claude Code report instant account bans for routine debugging like building Node.js from source. The safety filters can't distinguish legitimate systems work from malware, and the Hacker News discussion shows growing frustration over AI restrictions on technical knowledge.
Why Llama.cpp Wins at Local Model Inference
A 2026 llama.cpp tutorial shows why partial offloading beats pure GPU loaders for local GGUF inference, making it the flexible choice across hardware setups.
Cloudflare's shared dictionaries compress the agentic web
Cloudflare announces support for shared compression dictionaries, a technology that can cut bandwidth by up to 99.5% by sending only file differences rather than full re-downloads. This addresses challenges of the 'agentic web' where AI crawlers frequently access content and teams deploy updates rapidly. Rolling out in three phases with Phase 1 beta available April 30, 2026, the technology achieves 97-99.5% compression ratios for incremental changes to assets like JavaScript bundles.
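The underlying idea can be sketched with Python's standard `zlib` module, which supports preset dictionaries through its `zdict` parameter. This is a stand-in under stated assumptions, not Cloudflare's implementation: the generated "bundle", the helper names, and the choice of DEFLATE are all hypothetical, picked only to show why a previously downloaded version makes a good dictionary for the next one.

```python
import zlib

def make_bundle(changed_line=None):
    """Build a hypothetical JavaScript bundle; optionally patch one line."""
    lines = [
        f"export function f{i}(x) {{ return x * {(i * 7919) % 1000} + {i}; }}\n"
        for i in range(300)
    ]
    if changed_line is not None:
        lines[changed_line] = "export function patched(x) { return x - 1; }\n"
    return "".join(lines).encode()

def compress_with_dict(data, dictionary):
    """Compress data using a dictionary both sides already hold."""
    comp = zlib.compressobj(level=9, zdict=dictionary)
    return comp.compress(data) + comp.flush()

def decompress_with_dict(blob, dictionary):
    """Decompress; the receiver must supply the same dictionary."""
    decomp = zlib.decompressobj(zdict=dictionary)
    return decomp.decompress(blob) + decomp.flush()

old_bundle = make_bundle()                  # version the client has cached
new_bundle = make_bundle(changed_line=150)  # new deploy, one line changed

plain = zlib.compress(new_bundle, 9)
with_dict = compress_with_dict(new_bundle, old_bundle)

print(len(new_bundle), len(plain), len(with_dict))
assert decompress_with_dict(with_dict, old_bundle) == new_bundle
```

With the old version as the dictionary, almost all of the new bundle is encoded as back-references, so the payload shrinks toward the size of the change itself. Note that DEFLATE only looks back 32 KB, so this sketch keeps the bundle small; larger assets need a format with a bigger window.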
Altman Warned AI Could End Civilization. Someone Brought Fire.
AI executives spent years warning their technology could destroy humanity. Then someone threw a Molotov cocktail at Sam Altman's house and smashed OpenAI's doors with a chair. Now they want everyone to calm down.