Can a transformer run a program, not just predict one? That's the question Percepta researcher Christos Tzamos and colleagues took up in a March 11 post titled 'Can LLMs Be Computers?' — and their answer is yes.
The core claim has nothing to do with prompting tricks. Tzamos and the team argue that arbitrary programs can be compiled into a transformer's forward pass, not walked through step-by-step via chain-of-thought generation. The distinction matters. Chain-of-thought still decodes tokens sequentially, with compute scaling to match problem depth. What Percepta describes would instead collapse multi-step computation into fewer, parallel forward passes, exploiting the architectural parallelism already baked into transformers. The speedup, they argue, is exponential.
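The flavor of that collapse can be seen in any associative computation (a sketch for intuition only, not Percepta's construction — the function names and the affine-map setting here are invented for illustration). Folding n affine maps one at a time takes n sequential steps, the way chain-of-thought spends one decoding step per reasoning step; because composition is associative, a tree reduction reaches the same answer in roughly log2(n) parallel rounds:

```python
# Illustrative only: depth savings from parallelizing an associative fold.
# An affine map f(x) = a*x + b is represented as the pair (a, b).

def compose(f, g):
    """Return f ∘ g as an (a, b) pair: (f ∘ g)(x) = f(g(x))."""
    (a1, b1), (a2, b2) = f, g
    return (a1 * a2, a1 * b2 + b1)

def sequential_fold(maps):
    """Chain-of-thought analogue: one step per map, depth n."""
    acc, depth = (1, 0), 0          # start from the identity map
    for m in maps:
        acc = compose(m, acc)
        depth += 1
    return acc, depth

def parallel_fold(maps):
    """Parallel analogue: combine adjacent pairs each round, depth ~log2(n)."""
    layer, depth = list(maps), 0
    while len(layer) > 1:
        layer = [compose(layer[i + 1], layer[i]) if i + 1 < len(layer)
                 else layer[i]
                 for i in range(0, len(layer), 2)]
        depth += 1
    return layer[0], depth

maps = [(2, 1)] * 16                # sixteen identical steps x -> 2x + 1
seq_result, seq_depth = sequential_fold(maps)
par_result, par_depth = parallel_fold(maps)
assert seq_result == par_result      # same answer either way
print(seq_depth, par_depth)          # 16 sequential steps vs 4 parallel rounds
```

A transformer layer applies attention across all positions at once, which is the architectural parallelism the post points to; whether arbitrary programs can be compiled into that form is exactly the question the Percepta work addresses.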
The practical target is obvious to anyone who has spent time with agentic systems. Long reasoning chains are a core latency problem for agents doing anything non-trivial — planning, verification, multi-step problem solving. More tokens mean more time, higher costs, and a worse user experience. If computation can be restructured into the forward pass itself, that bottleneck changes shape entirely.
Tzamos comes from theoretical computer science — algorithms and complexity — which gives the work a different register than the typical empirical scaling paper. The research engages directly with computability theory, asking whether transformers can achieve Turing-complete computation. It's a serious question, and the framing reflects someone who has thought carefully about what computation actually is, not just what benchmarks it scores on.
The post sits inside Percepta's 'Field Notes' series, which signals a company oriented toward foundational research over near-term product velocity. How this translates to real agent infrastructure is still an open question. But the direction — asking what the architecture can do rather than how many tokens it needs — points somewhere worth following.