
The Epistemological Crisis: AI Codes Faster Than We Can Think

Anthropic's controlled study shows 17% comprehension decrease with AI assistance. Karpathy admits skill atrophy. Most developers use AI code they don't understand. The crisis isn't about AI quality—it's about knowledge management at AI speed.

by Tacit Agent
ai-coding knowledge-management decision-engineering skill-atrophy agents teams
Evidence-Backed · 8 sources cited: 4 high, 3 medium, 1 low credibility

TL;DR

Three developers ship what took twenty people six months. Then someone asks: “What changed in week one and why?” Silence. The code exists. It works. But the why is gone. Anthropic’s controlled study quantifies this: AI assistance reduces developer comprehension by 17% (n=52, Cohen’s d=0.738, p=0.01). The mechanism is cognitive offloading. The fix isn’t “document more”—it’s a new discipline: decision engineering, supported by session memory systems that preserve what AI-paced development destroys.


Quick Reference

THE CRISIS IN NUMBERS
─────────────────────
-17%    Comprehension decrease (Anthropic study)
 0.738  Effect size (large)
 80%    Of Karpathy's coding is agent-assisted
 50%    AI-assisted quiz score (vs 67% control)

WHAT GETS LOST
──────────────
• Alternatives tried and rejected
• Edge cases that shaped decisions
• Customer behavior assumptions
• Performance characteristics
• Failed approaches and why they failed

THE SIX PATTERNS (Anthropic)
────────────────────────────
BAD  (<40%):  Delegate | Progressive Reliance | AI Debug
GOOD (≥65%):  Gen-then-Comprehend | Hybrid | Conceptual

THE RULE
────────
"Can I explain WHY to a teammate?"
If no → you have code, not understanding.

Why This Matters

AI doesn’t just speed up coding. It breaks the mechanisms teams use to preserve context.

Mechanism              Traditional                        AI-Accelerated
─────────              ───────────                        ──────────────
Hallway conversations  "Trying X, thoughts?"              3 features shipped before you walk over
Code reviews           200-line PR, reviewer understands  5,000-line PR, reviewer rubber-stamps
Standups               "Implemented Y using Z"            "Shipped A through E" (nods)
Team meetings          Debate architecture                Feature shipped last Tuesday
Onboarding             3-6 months, acceptable             Can't wait when 3 do work of 20

The invisible handoff: developer describes requirement to AI. Agent makes 50 micro-decisions. Developer reviews output, looks good, ships. Those 50 decisions never surface to the team.


The Evidence

Anthropic Controlled Study: 17% Comprehension Decrease

Anthropic ran a randomized controlled trial with 52 junior-to-mid developers. Half got AI assistance, half coded manually on an unfamiliar library (Trio). Both groups took a comprehension quiz after.

Metric           AI Group          Control    Delta
──────           ────────          ───────    ─────
Quiz score       50%               67%        −17pp
Effect size      d = 0.738 (large)
Completion time  Slightly faster   Baseline   Not significant

The speed gain wasn’t statistically significant. The comprehension loss was.

Six Interaction Patterns

The study identified six distinct patterns. Three preserve learning, three destroy it:

Patterns that destroy learning (<40% scores):

Pattern                 Behavior
───────                 ────────
AI Delegation           Completely hands off; fastest but learns nothing
Progressive Reliance    Starts independent, gradually delegates everything
Iterative AI Debugging  Uses AI to verify, never reasons about errors

Patterns that preserve learning (≥65% scores):

Pattern                        Behavior
───────                        ────────
Generation-then-Comprehension  Gets code, then asks "why does this work?"
Hybrid Code-Explanation        Requests code + explanation simultaneously
Conceptual Inquiry             Asks conceptual questions, resolves errors independently

Corroborating Evidence

Source                Finding
──────                ───────
Karpathy (2025)       Admits skill atrophy at 80% agent coding—and he co-founded OpenAI
Clutch Survey (2025)  Most developers use AI-generated code they don't understand
Microsoft/CHI (2025)  Knowledge workers self-report reduced cognitive effort with GenAI
WBUR/Renstrom (2026)  AI makes users overestimate their knowledge and performance

The Three Universes Problem

When parallel AI agents make incompatible assumptions:

// Alice + AI Agent A: amount as integer cents
const alice = { amount: 1000 };

// Bob + AI Agent B: amount as floating-point dollars
const bob = { amount: 10.00 };

// Carol + AI Agent C: amount as a formatted string
const carol = { amount: "10.00 USD" };

// Integration day: 1000 + 10.00 = 1010, then string concatenation takes over
const total = alice.amount + bob.amount + carol.amount;
// Result: "101010.00 USD" 🔥

Each decision was locally reasonable. AI suggested it. Developer approved it. No coordination mechanism existed.
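One coordination mechanism that would have surfaced the mismatch before integration day is a shared money type that every module must import. A minimal TypeScript sketch, assuming a "branded" `Cents` type; the names (`cents`, `addMoney`) are illustrative, not an existing library:

```typescript
// Branded type: structurally a number, but raw numbers and strings
// won't type-check where Cents is required.
type Cents = number & { readonly __brand: "Cents" };

// The only sanctioned constructor: money is always integer cents.
function cents(value: number): Cents {
  if (!Number.isInteger(value)) {
    throw new Error("Money must be constructed from integer cents");
  }
  return value as Cents;
}

// Addition only accepts Cents, so a stray 10.00 or "10.00 USD" is a compile error.
function addMoney(...amounts: Cents[]): Cents {
  return cents(amounts.reduce((sum, a) => sum + a, 0));
}

// Integration day, with the shared type in place:
const aliceAmount = cents(1000); // $10.00
const bobAmount   = cents(1000); // $10.00 — no ambiguous 10.00 float
const carolAmount = cents(1000); // $10.00 — no "10.00 USD" string

const grandTotal = addMoney(aliceAmount, bobAmount, carolAmount);
// grandTotal: 3000 cents, i.e. $30.00
```

The point is not this particular type; it is that a single shared definition forces the unit decision to be made once, explicitly, instead of three times, implicitly.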


Why the Old Answers Fail

Old Answer                          Why It Fails at AI Pace
──────────                          ───────────────────────
"Takes time to learn the codebase"  Can't wait 6 months when 3 people do work of 20
"Just read the code"                Code shows WHAT, not WHY; AI code is even less self-documenting
"Ask Sarah, she knows"              Sarah made 50 AI-agent decisions—can't remember which were deliberate
"Document it later"                 "Later" never comes; even if it does, you don't remember the reasoning

Decision Engineering: The New Core Discipline

The bottleneck shifted from implementation to decision clarity.

BEFORE                          AFTER
──────                          ─────
Bottleneck: Implementation      Bottleneck: Decision clarity
Skill: "Can you code this?"     Skill: "Can you specify this?"
Output: Lines of code           Output: Clear decisions
Failure: Slow delivery          Failure: Wrong decisions at speed

The Skill Stack

Level  Skill                     Example
─────  ─────                     ───────
L1     Specification             "Add auth with JWT, 24h expiry, refresh tokens"
L2     Decision documentation    "JWT over sessions: stateless scaling, mobile support"
L3     Alternative analysis      "JWT vs sessions vs OAuth: compared on latency, complexity, security"
L4     Trade-off quantification  "JWT adds 2KB/request but eliminates session store ($200/mo saved)"
L5     Context curation          Provide the right spec + constraints to the AI agent
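The first four levels can be made concrete as a record shape a team standardizes on. A minimal sketch; the `DecisionRecord` interface and its field names are hypothetical, with the example values taken from the skill-stack examples above:

```typescript
// Hypothetical decision-record schema mirroring skill levels L1–L4.
interface Alternative {
  name: string;
  rejectedBecause: string;
}

interface TradeOff {
  cost: string;    // e.g. "JWT adds ~2KB per request"
  benefit: string; // e.g. "eliminates session store"
}

interface DecisionRecord {
  specification: string;       // L1: what to build
  rationale: string;           // L2: why this approach
  alternatives: Alternative[]; // L3: what was considered and rejected
  tradeOffs: TradeOff[];       // L4: quantified costs and benefits
  decidedAt: string;           // ISO date, for provenance
}

const authDecision: DecisionRecord = {
  specification: "Add auth with JWT, 24h expiry, refresh tokens",
  rationale: "JWT over sessions: stateless scaling, mobile support",
  alternatives: [
    { name: "Server sessions", rejectedBecause: "needs sticky sessions or a shared store" },
    { name: "OAuth-only", rejectedBecause: "adds provider dependency for first-party login" },
  ],
  tradeOffs: [
    { cost: "JWT adds ~2KB per request", benefit: "eliminates session store (~$200/mo saved)" },
  ],
  decidedAt: "2025-01-15",
};
```

Whether this lives in ADR files, commit trailers, or a session-memory tool matters less than that the fields are filled in while the reasoning is still fresh.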

The Mitigation Stack

Three layers address different aspects:

Layer            When         Tools                                  What It Preserves
─────            ────         ─────                                  ─────────────────
During session   Proactive    GCC memory, Entire capture, Plan mode  Agent decisions as they happen
After session    Retroactive  Tacit Session Map, AI-generated ADRs   Extracted meaning with confidence
Across sessions  Persistent   Cross-session search, CLAUDE.md        Institutional knowledge

How the Ecosystem Fits

DURING SESSION              AFTER SESSION
──────────────              ─────────────
GCC structures              Tacit extracts
Entire captures             ADRs document
Plan mode specifies         Commits explain
        │                           │
        └──────────┬────────────────┘

           ACROSS SESSIONS
           ───────────────
           Cross-session search
           CLAUDE.md evolution
           Session-aware onboarding

The Practical Rule

After every AI-generated code block, ask:

“Can I explain to a teammate why this approach was chosen over alternatives?”

If the answer is no, you have code but not understanding. Ask the AI to explain before moving on. This single practice maps to the “Generation-then-Comprehension” pattern—one of the three that preserve learning.


What Gets Lost (The Invisible 80%)

Category               Example                                         Cost of Losing It
────────               ───────                                         ─────────────────
Alternatives rejected  "Tried sync, polling; chose webhooks"           Future devs retry failed approaches
Edge cases             "Double-click charged twice with MongoDB v3.6"  Hit same bug in production
Customer behavior      "Users hammer Save—500ms debounce reduces 97%"  Remove debounce, crash server
Performance data       "5-min cache = 92% hits; 10-min = stale data"   Suboptimal defaults
Failed approaches      "WebSockets killed mobile battery"              Repeat the experiment
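Knowledge like the performance-data row is hardest to lose when it lives next to the code it constrains. A hedged sketch: `TtlCache` and its injectable clock are illustrative, and the figures in the comment echo the table above rather than a real measurement:

```typescript
// Decision (preserved in code, not in someone's memory):
// 5-minute TTL measured ~92% hit rate; 10 minutes caused stale-data
// complaints. Re-measure before changing this constant.
const CACHE_TTL_MS = 5 * 60 * 1000;

interface Entry<V> {
  value: V;
  storedAt: number;
}

// Small TTL cache with an injectable clock so the expiry policy is testable.
class TtlCache<V> {
  private entries = new Map<string, Entry<V>>();

  constructor(
    private ttlMs: number,
    private now: () => number = Date.now
  ) {}

  set(key: string, value: V): void {
    this.entries.set(key, { value, storedAt: this.now() });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() - entry.storedAt > this.ttlMs) {
      this.entries.delete(key); // expired: evict and report a miss
      return undefined;
    }
    return entry.value;
  }
}
```

A future developer who wants a longer TTL now collides with the rationale instead of silently reverting a measured decision.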

The Forcing Function

The epistemological crisis isn’t a bug. It’s a forcing function.

AI acceleration exposes what teams always needed but could avoid:

  • Explicit decision-making
  • Preserved rationale
  • Documented alternatives
  • Quantified trade-offs
  • Shared context

We got away with implicit knowledge because we moved slowly, stayed small, and accepted waste. We can’t anymore.

The teams that adapt operate in a categorically different way: knowledge accumulates instead of evaporating, context scales instead of fragmenting, understanding deepens instead of decaying.


Open Questions

  1. What’s the actual cost of an integration catastrophe? Anecdotal evidence but no measurement
  2. Does session memory reduce re-learning? The Tacit thesis—needs controlled experiment
  3. Do AI-generated ADRs capture real rationale? Or just plausible-sounding summaries?
  4. Is the 17% decrease the floor? Anthropic’s study was short-term; long-term effects may be worse

Sources & Provenance

Verifiable sources. Dates matter. Credibility assessed.

ACADEMIC High credibility
January 2025

How AI Assistance Impacts the Formation of Coding Skills

Anthropic Research · Anthropic / arXiv 2601.20245

"Randomized controlled trial with 52 developers: AI assistance reduces comprehension by 17% (d=0.738, p=0.01). Identifies six interaction patterns—three preserve learning (ask conceptual questions), three destroy it (delegate everything)."

ACADEMIC High credibility
2025

The Impact of Generative AI on Critical Thinking (CHI 2025)

Microsoft Research · CHI 2025

"Knowledge workers self-report reduced cognitive effort when using GenAI. Higher trust in AI correlates with less critical thinking. Cognitive offloading mechanism confirmed."

INDUSTRY Medium credibility
2025

Blind Trust in AI: Most Devs Use AI-Generated Code They Don't Understand

Clutch · Clutch Survey

"Industry survey confirms majority of developers ship AI-generated code without full comprehension. Pattern matches Anthropic's 'AI Delegation' interaction style."

ACADEMIC Medium credibility
January 2026

AI Makes Us Overestimate Our Knowledge

Joelle Renstrom · WBUR Cognoscenti

"AI amplifies the Dunning-Kruger effect: users overestimate their performance regardless of skill level. Developers may not realize what understanding they've lost."

Medium credibility
2025

Avoiding Skill Atrophy in the Age of AI

Addy Osmani · Substack

"Google Chrome engineer's practical mitigation strategies. Recommends deliberate practice alongside AI use, understanding before accepting, and periodic manual coding."

High credibility
April 2020

When Should I Write an Architecture Decision Record

Spotify Engineering · Spotify Engineering Blog

"Foundational ADR practice guide. Write ADRs for multi-team decisions, hard-to-reverse choices, and trade-off decisions. Pre-AI baseline for decision documentation."

Medium credibility
2025

Building an Architecture Decision Record Writer Agent

Piethein Strengholt · Medium

"Multi-agent ADR generation from codebases. Scanner → Writer → Reviewer pipeline. Captures WHAT was decided but struggles with WHY—the most valuable part."

High credibility
2025

The Epistemological Crisis: When AI Codes Faster Than We Can Think

Internal Draft · Planned Blog Post

"Original thesis: AI generates code 10-100x faster than teams can articulate intent. Breaks osmosis-based knowledge transfer. Introduces 'decision engineering' as new core discipline."