The same code structure appears in two or more places — same shape with cosmetic variations, or copy-paste-modify patterns that drift over time. Each copy carries its own subtle history of fixes and feature additions; the signal-to-noise ratio of the codebase drops as the duplicated structure crowds out the differences between copies that actually matter.
Near-identical code in multiple files; each copy the agent reads adds tokens to the context window without adding new information, and copies the agent doesn't read stay unverified. The same structural pattern appears N times, but the agent's reasoning depends on knowing which copy is canonical and which copies have drifted — none of which is structurally signaled.
One canonical home per behavior, with parameters for the variations. The duplicated structure collapses into a single named function; future variations become parameter choices rather than new copies. The comprehension cost of holding the behavior in mind drops to the size of one function, and maintenance cost per bug fix drops by the copy count.
One canonical version the agent reads once; edits go in one place, every caller picks them up through the shared reference, and the token cost of N copies disappears. Subsequent reasoning passes load one definition instead of N; the agent's per-edit verification surface shrinks to the single canonical site plus its callers.
Smellier version
function totalUSD(items) {return items.reduce((s, i) => s + i.price * i.qty, 0);}function totalEUR(items) {return items.reduce((s, i) => s + i.price * i.qty, 0);}
Fresher version
function lineTotal(items) {return items.reduce((s, i) => s + i.price * i.qty, 0);}
Bug fixes carry the blast radius of every copy; behavior diverges as copies age, multiplying maintenance cost. Fixing the bug in one copy ships a partial fix — the others continue producing the broken behavior. The team's verification cost on every change scales with copy count rather than with the change.
The agent edits only the copy currently in its context window; copies outside the window stay unread, so the agent's completeness-check cost spans the full duplicate population while only the edited copy gets verified. The user receives correct behavior on the path the edit covered and stale behavior everywhere else — a silent partial fix shipped as complete.
Extracting the shared form introduces a parameter list or hierarchy that all callers must learn; the abstraction's cognitive load has to earn its keep against the cost of indirection. Each caller now reads two definitions — the call site and the shared definition — and reviewers verify the contract on every change to either.
Replaces N copies of the same code with one shared definition the agent reads once; if the abstraction is wrong, every exception ships as a branch the agent has to load — extra token cost at every call site. Wrong abstractions inflate the per-call reasoning step count and reintroduce the partial-fix risk the deduplication was meant to remove.
A bug fix or enhancement lands once, not in every clone — and the unifying name reveals shared intent. The team's debugging cost drops because the symptom's origin is one definition instead of N; enhancement cost drops because new features attach to the shared abstraction once rather than spreading across every copy.
Bug fixes and new features land in one place; the agent loads one definition per edit instead of N, paying token cost once instead of N times. The verification surface contracts to one function and its callers, dropping completeness-check cost to the size of one dispatch graph rather than the entire family of duplicate sites.
Premature deduplication on superficial similarity — code that looks the same but means different things gets merged behind a leaky abstraction, and any future divergence forces awkward branches. The merge feels efficient at first but raises enhancement cost every time a new variation surfaces that the original abstraction can't cleanly express.
Merging superficially-similar code forces a flag or type tag into the shared body; later divergences ship as branches the agent loads at every call site, and edits land on the wrong branch when the flag isn't visible from the caller. The agent pays full verification-surface cost on every edit; the merge produced a structural promise the implementation cannot keep.