Here's a sentence that sounds wrong but isn't: the highest-leverage thing you can do as a developer is write five markdown files.
Not code. Not tests. Not a better CI pipeline. Markdown files.
It sounds absurd until you see the numbers. Before we documented our conventions, AI generated code we could use about 40% of the time. After — same AI, same prompts — that number jumped to 90%.
The difference? AI can't read your mind. But it can read your docs.
## The five files that change everything
This takes about 90 minutes total. You'll get that time back within the first day.
### 1. ARCHITECTURE.md
Your tech stack and the reasons behind it. What framework you're using, what ORM, what your authentication looks like. AI stops suggesting Express when you're using FastAPI. It stops recommending MongoDB when you're on PostgreSQL.
```markdown
# Architecture
- Backend: FastAPI + SQLAlchemy 2.0 (async)
- Frontend: React 18 + TypeScript + Vite
- Database: PostgreSQL 15+ with RLS
- Auth: JWT tokens, bcrypt hashing
- Key pattern: Domain-driven, event-based
```

### 2. DOMAIN_PATTERNS.md
The blueprint every feature follows. Directory structure, where business logic lives, how endpoints are wired. AI generates code that fits your existing structure instead of inventing its own.
```text
# Every domain follows this structure:
domains/{name}/
    __init__.py   # Public API
    models.py     # SQLAlchemy models
    schemas.py    # Pydantic schemas
    service.py    # Business logic (ONLY here)
    router.py     # Thin endpoints
    events.py     # Domain events
    tests/        # Unit + integration
```

### 3. API_CONVENTIONS.md
URL patterns, status codes, pagination rules, authentication headers. Every endpoint AI generates follows the same conventions as the ones you wrote by hand.
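A minimal version might look like this — the specific rules below are illustrative placeholders, not prescriptions; write down whatever your endpoints actually do:

```markdown
# API Conventions
- URLs: plural nouns, kebab-case (/api/v1/user-profiles)
- Path params: /api/v1/orders/{order_id}
- Status codes: 201 on create, 204 on delete, 422 on validation errors
- Pagination: ?limit=50&offset=0; responses include total_count
- Auth: Authorization: Bearer <token> on every protected route
- Errors: {"detail": "..."} body, never bare strings
```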
### 4. DATABASE_PATTERNS.md
Table naming conventions, standard columns every table gets, index naming, enum handling in migrations. No more generated schemas that conflict with your existing database.
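For instance, a sketch under assumed conventions (swap in your own naming rules and column set):

```markdown
# Database Patterns
- Tables: plural snake_case (user_profiles, order_items)
- Every table gets: id (UUID PK), created_at, updated_at (timestamptz)
- Indexes: ix_{table}_{column}; unique constraints: uq_{table}_{columns}
- Foreign keys: fk_{table}_{ref_table}
- Enums in migrations: VARCHAR + CHECK constraint (our choice over native enums)
```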
### 5. TESTING_STRATEGY.md
What to test, how to name tests, what fixtures look like. AI writes tests that actually belong in your test suite instead of generic ones that test framework behavior.
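One possible shape for this file — the naming scheme and fixture rules here are examples, not requirements:

```markdown
# Testing Strategy
- Unit tests: service.py logic only; mock the DB session
- Integration tests: router + real test database; one happy path, key failures
- Naming: test_{function}_{scenario}, e.g. test_create_order_duplicate_sku
- Fixtures: factory functions in tests/conftest.py; no shared mutable state
- Don't test: framework behavior, trivial Pydantic field validation
```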
## The proof
| Metric | Without docs | With docs |
|---|---|---|
| Time per domain | 3-5 days | 1-2 days |
| AI first-try accuracy | 40-60% | 85-95% |
| Lines rewritten after generation | 60-70% | 5-15% |
| Prompt length needed | 400-600 words | 50-100 words |
## Keep them alive
Docs that don't match reality are worse than no docs — they actively mislead AI into generating wrong code. Three rules:
- The 10-minute rule: If a doc update takes more than 10 minutes, it's too detailed. Trim it. These are reference cards, not textbooks.
- Update triggers: New architectural decisions, pattern changes, new domains, bugs caused by inconsistency, onboarding confusion.
- PR checklist: Did this PR change any architectural pattern? If yes, are the docs updated?
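That checklist item can live directly in your PR template; one possible phrasing:

```markdown
## Docs
- [ ] This PR changes no architectural pattern, OR
- [ ] The affected doc (ARCHITECTURE.md, DOMAIN_PATTERNS.md, ...) is updated
```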
## Try this
Create a .claude/ directory in your current project. Start with ARCHITECTURE.md — just list your stack, your key decisions, and your patterns. It takes 20 minutes. Then ask AI to generate a new endpoint. Compare the output to what it generated before.
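A starter file can be skeletal; everything in angle brackets below is a placeholder to fill in with your own stack:

```markdown
# Architecture
- Backend: <framework + ORM, sync or async>
- Frontend: <framework + build tool>
- Database: <engine + version, features you rely on>
- Auth: <token scheme, password hashing>
- Key decisions: <3-5 bullets: why this stack, patterns every feature follows>
```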
The difference will be obvious.