Most LLM-backed apps waste money on duplicate API calls. Agent-cache, which appeared on Hacker News this week, targets this with multi-tier caching for LLM responses, tool outputs, and agent environments. It works with both Valkey and Redis, a deliberate choice given the open-source community's fork of Redis into Valkey after last year's license change. Supporting both means developers don't have to pick a side.
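The core dedup idea is simple even if Agent-cache's own API isn't documented: key each response on a hash of the model and prompt, and return the stored result on a repeat request. A minimal sketch, with a plain dict standing in for the Valkey/Redis backend (all names here are illustrative, not Agent-cache's):

```python
import hashlib
import json

# Illustrative stand-in for an LLM response cache. A real deployment
# would store entries in Valkey or Redis with a TTL; a dict suffices
# to show the lookup logic.
cache = {}

def cache_key(model: str, prompt: str) -> str:
    # Canonical JSON so the same (model, prompt) pair always hashes alike.
    payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def cached_completion(model, prompt, call_llm):
    key = cache_key(model, prompt)
    if key in cache:
        return cache[key]        # cache hit: no API spend
    response = call_llm(prompt)  # cache miss: pay for exactly one call
    cache[key] = response
    return response
```

Two identical requests cost one API call; that is the entire value proposition, multiplied across every agent session.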
The multi-tier approach keeps hot data in memory and pushes older session data to cheaper storage. For agents handling thousands of concurrent conversations, those savings compound quickly. But the Hacker News thread immediately flagged thin documentation: several commenters asked for specifics on how tier transitions work in practice.
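Since the project doesn't yet document its tier transitions, here is one common way such a scheme works, sketched under assumptions (LRU demotion, promote-on-hit); the class and its policy are hypothetical, not Agent-cache's actual design:

```python
from collections import OrderedDict

class TwoTierCache:
    """Small, fast hot tier backed by a larger, cheaper cold tier.

    Overflowing the hot tier demotes the least-recently-used entry;
    a hit in the cold tier promotes the entry back to hot.
    """

    def __init__(self, hot_capacity=2):
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()  # in-memory tier, size-limited
        self.cold = {}            # stands in for disk or a remote store

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)          # refresh recency
            return self.hot[key]
        if key in self.cold:
            self.put(key, self.cold.pop(key))  # promote to hot tier
            return self.hot[key]
        return None

    def put(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)
        if len(self.hot) > self.hot_capacity:
            # Demote the least-recently-used entry instead of dropping it.
            old_key, old_value = self.hot.popitem(last=False)
            self.cold[old_key] = old_value
```

The open questions the HN commenters raised live exactly in these choices: the eviction policy, whether promotion happens on every cold hit, and what "cheaper storage" means operationally.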
One person noted they'd been building similar caching by hand for months. That's the real signal. The audience for this tool clearly exists. The question is whether Agent-cache's documentation catches up before someone else ships the same idea with better onboarding.