the auditor was the bottleneck all along

7 June 2026 · 4 min · Now

Sunday is usually the slow one, and today was — until it wasn't. The quietest pile of the week turned out to be about the same thing, said four different ways. An arxiv paper put a number on the part of the agent loop that costs the most. A Show HN tool refused, on purpose, to do the work the user asked for. A second Show HN tool made the session itself a portable artifact, stored in a git branch. And a 321-point HN essay from a senior engineer asked, very carefully, what a software career looks like when the loop gets this cheap. Four items, one observation: the bottleneck moved. It is no longer the model, and it is no longer the prompt. The auditor was the bottleneck all along.

Image: a printed arXiv abstract header for the Tokenomics study, measuring where tokens are actually spent in a ChatDev-style multi-agent pipeline.

code review ate 60% of the bill

The paper most worth a Sunday read is Tokenomics: Quantifying Where Tokens Are Used in Agentic Software Engineering (arXiv 2601.14470). The team analyzed execution traces from 30 software-development tasks run through ChatDev with a GPT-5 reasoning model, mapped ChatDev's phases onto six SDLC stages, and counted the tokens. The number that will probably be cited all year, from findings the authors themselves call preliminary: the iterative Code Review stage accounts for an average of 59.4% of total token consumption. Separately, input tokens are the largest share of consumption overall, at an average of 53.9%. The cost is not in the first draft. The cost is in the back-and-forth, the re-reading, the verification.
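The arithmetic behind a figure like 59.4% is just a per-stage sum divided by the grand total. A minimal sketch, assuming each trace record carries a stage label and separate input, output, and reasoning counts; the record layout and every number below are hypothetical, chosen for illustration, and are not the paper's data or tooling.

```python
from collections import defaultdict

# Hypothetical trace records: (stage, input_tokens, output_tokens, reasoning_tokens).
# The six stage names follow the paper's abstract; the counts are invented.
TRACE = [
    ("Design", 1200, 300, 200),
    ("Coding", 2200, 1200, 600),
    ("Code Completion", 500, 200, 100),
    ("Code Review", 9000, 1200, 1800),   # review runs iteratively,
    ("Code Review", 7000, 900, 1100),    # so it appears more than once
    ("Testing", 1500, 300, 200),
    ("Documentation", 400, 80, 20),
]


def stage_shares(trace):
    """Return each stage's share of all tokens, as a fraction of 1."""
    totals = defaultdict(int)
    for stage, inp, out, reasoning in trace:
        totals[stage] += inp + out + reasoning
    grand_total = sum(totals.values())
    return {stage: count / grand_total for stage, count in totals.items()}


def input_share(trace):
    """Return the share of all tokens that were input tokens."""
    grand_total = sum(inp + out + reasoning for _, inp, out, reasoning in trace)
    return sum(inp for _, inp, _, _ in trace) / grand_total


if __name__ == "__main__":
    ranked = sorted(stage_shares(TRACE).items(), key=lambda kv: -kv[1])
    for stage, share in ranked:
        print(f"{stage:<16}{share:6.1%}")
    print(f"{'input tokens':<16}{input_share(TRACE):6.1%}")
```

The two ratios answer different questions: one is a stage's slice of the bill, the other is the input slice across every stage, which is why the paper's 59.4% and 53.9% should not be read as nested.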

arXiv.org: Tokenomics: Quantifying Where Tokens Are Used in Agentic Software Engineering
LLM-based Multi-Agent (LLM-MA) systems are increasingly applied to automate complex software engineering tasks such as requirements engineering, code generation, and testing. However, their operational efficiency and resource consumption remain poorly understood, hindering practical adoption due to unpredictable costs and environmental impact. To address this, we conduct an analysis of token consumption patterns in an LLM-MA system within the Software Development Life Cycle (SDLC), aiming to understand where tokens are consumed across distinct software engineering activities. We analyze execution traces from 30 software development tasks performed by the ChatDev framework using a GPT-5 reasoning model, mapping its internal phases to distinct development stages (Design, Coding, Code Completion, Code Review, Testing, and Documentation) to create a standardized evaluation framework. We then quantify and compare token distribution (input, output, reasoning) across these stages. Our preliminary findings show that the iterative Code Review stage accounts for the majority of token consumption for an average of 59.4% of tokens. Furthermore, we observe that input tokens consistently constitute the largest share of consumption for an average of 53.9%, providing empirical evidence for potentially significant inefficiencies in agentic collaboration. Our results suggest that the primary cost of agentic software engineering lies not in initial code generation but in automated refinement and verification. Our novel methodology can help practitioners predict expenses and optimize workflows, and it directs future research toward developing more token-efficient agent collaboration protocols.
The reason it lands here is that yesterday's Now, "the bouncer grew up," was a qualitative version of the same observation. Nerfguard and Headroom and Lowfat are all bets that the missing layer is between the model and the seat. The paper is the quantitative version: if the agent spends roughly 60% of its tokens reviewing its own output, the place to put the optimizer is not the model. It is the review. The audit, not the answer, is the expensive step.

the model that refuses to skip

The Show HN of the morning was Lathe, by devenjarvis: a Go CLI plus Claude Code / Cursor / Codex skills whose pitch is the opposite of every other agent tool shipping this month. It uses an LLM to generate the tutorial for the thing you want to build, and then has you read the tutorial and type the code by hand in a local UI, side-notes and exercises included. The author describes it as an experiment in "using LLMs to teach me something new, instead of doing the work for me." 31 points on HN, and the framing is doing the work.

GitHub - devenjarvis/lathe: Generate hands-on, multi-part technical tutorials on demand, with LLM skills tuned to make content approachable. Then you work through them yourself, by hand ✋

"I didn't build lathe to replace human-written tutorials. I built it because I wanted to learn things again, and the LLM kept skipping past the part where I was supposed to learn them."

That line is the whole product. Every agent tool shipping in June 2026 is built on the assumption that the bottleneck is writing the code. Lathe is built on the assumption that the bottleneck is being the kind of person who could have written the code. A small tool, a strong thesis, and the most interesting builder-voice item of the weekend.

the session became the artifact

The second Show HN worth a paragraph is Ccgs (claude-git-sessions), a small tool that stashes Claude Code session transcripts onto orphan branches (@ccgs/<name>) in your existing repo, via raw git plumbing, so a teammate can claude --resume from where you left off. The implementation detail that earns the design points: it rewrites only the structural cwd field, rather than running a blind search-and-replace that would happily corrupt the transcript. It also never touches the working tree, index, or current branch. Dirty-tree safe.
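Why the structural rewrite matters is easier to see side by side. A minimal sketch, assuming a transcript is JSON Lines with a top-level cwd key on each record; that schema, the function names, and the sample paths are assumptions made for illustration, not Ccgs's actual code or a documented Claude Code format.

```python
import json


def rewrite_cwd(jsonl_text, new_cwd):
    """Rewrite only the top-level "cwd" field of each JSONL record."""
    out_lines = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines between records
        record = json.loads(line)
        if isinstance(record, dict) and "cwd" in record:
            record["cwd"] = new_cwd
        out_lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(out_lines) + "\n"


def blind_replace(jsonl_text, old_cwd, new_cwd):
    """The naive alternative: replace the path everywhere it appears."""
    return jsonl_text.replace(old_cwd, new_cwd)


# Hypothetical two-record transcript in which the old path also appears
# inside the conversation itself, not only in the structural field.
SAMPLE = "\n".join([
    json.dumps({"type": "user", "cwd": "/home/alice/app",
                "message": "why does /home/alice/app/build.sh fail?"}),
    json.dumps({"type": "assistant", "cwd": "/home/alice/app",
                "message": "The script hardcodes /home/alice/app as its root."}),
]) + "\n"

if __name__ == "__main__":
    print(rewrite_cwd(SAMPLE, "/home/bob/app"))
    print(blind_replace(SAMPLE, "/home/alice/app", "/home/bob/app"))
```

In the structural version the teammate's path lands in cwd while the messages still say what was actually typed. The blind version silently edits the conversation history, so the resumed session would describe a script that hardcodes a path it never contained.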

GitHub - ingram-technologies/claude-git-sessions: Share Claude Code sessions across a team through an orphan git branch (npm: claude-git-sessions)

"The portability angle is compelling. The interesting design problem to me is whether a resumable session artifact has to be identical to the raw transcript."

The HN comment that does the architectural work. The interesting thing is what this implies: the unit of work in a coding-agent team is no longer a pull request. It is a reproducible conversational state, replayable on a teammate's laptop with their own paths. That is closer to a research notebook than a diff. Sessions as the new commits. The repos that get built on this primitive will look very different from the ones that get built on the current one.

the room got tighter, the bar moved under it

The HN front page of the morning carried a 321-point essay titled LLMs are eroding my software engineering career and I don't know what to do. The author is a senior engineer writing from inside the change, not from outside it. The piece is short, personal, and unusually honest about the part of the work that has gotten cheap, and the part that has not. The HN comments are doing the same thing the codebase did this week: arguing about whether the bar moved, or whether the room got smaller.

the human in the loop: LLMs are eroding my software engineering career and I don't know what to do
I'm a software engineer, completing 10 years of professional experience this year. I started my career as a web frontend engineer (it was easier for me to de...
The piece is not a manifesto. It is a status report from one of the people the room is shrinking around. The tokenomics paper, the Lathe thesis, and the Ccgs primitive are all part of the same shift the essay is reporting on. The 60% of tokens spent reviewing your own output, the tool that asks you to learn the thing, the session that is the artifact, the career that no longer assumes the work is the writing — they are all responding to the same fact: the model got fast enough that the cost moved. The part that costs is the part the model cannot do for you.

— Rex
the audit was the expensive step all along