What Is Coding?
A mental model for becoming a better engineer.
Files, repos, and databases are the surface. The real work is designing behavior, state, rules, and boundaries.
A personal field guide I wrote to make myself a much better engineer. It is grounded partly in Parnas’s information hiding paper and Ousterhout’s software design philosophy, partly in my own synthesised mental model, and partly in the projects I am actually building.
The Surface Illusion
When I look at software, I usually see files. That view is not wrong, but it is shallow.
When I open a project, the first things I see are repos, files, folders, databases, APIs, and configs. That is software as a filing cabinet. It is a real view, but it is the surface.
At the surface, software looks like file management. At the deeper level, software is behavior, state, rules, interfaces, and abstractions. The files are where it is written down. The behavior is what actually exists.
This essay is about training myself to see the second layer first.
The simplest definition of coding
One sentence: coding is designing and controlling behavior.
A useful definition I keep returning to:
The real layers of coding
Eight layers, from raw bytes up to product behavior. A way to know which floor I am actually working on.
Data
Operations
State
Interfaces
Modules & abstractions
Systems
Product & business logic
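One way to see the lower layers side by side is to label them in a single small file. This is a sketch of my own reading, not a canonical mapping: the Lead shape, the 50-employee rule, and the LeadStore class are hypothetical names chosen for illustration.

```typescript
// Data: a plain description of something in the world (hypothetical shape).
type Lead = { email: string; employees: number };

// Operations: a pure rule that transforms or judges data.
function isQualified(lead: Lead): boolean {
  return lead.employees > 50;
}

// State: what the program remembers over time.
class LeadStore {
  private seen = new Map<string, Lead>();

  // Interface: the only way other code is allowed to change the state.
  add(lead: Lead): boolean {
    const key = lead.email.toLowerCase();
    if (this.seen.has(key)) return false; // duplicates are rejected
    this.seen.set(key, lead);
    return true;
  }

  // Callers get qualified leads without knowing how they are stored.
  qualified(): Lead[] {
    return Array.from(this.seen.values()).filter(isQualified);
  }
}
```

The file is one artefact, but four different layers live in it, and a bug in the duplicate check is a state problem, not a data problem.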
Files, repos, programs, databases: what they really are
Naming each artefact precisely so I stop treating them as interchangeable.
From files to knowledge boundaries
The shift from junior to senior engineering is the shift from “where do I put this file?” to “who is allowed to know this?”
A junior question: where should this file live? A senior question: who is allowed to know this thing?
The senior question is the one Parnas pushed in 1972. His paper on decomposing systems into modules argued that modularity is not just splitting a system into pieces: how well a decomposition works depends on the criteria used to divide the system. His central conclusion is sharp: it is usually wrong to begin decomposition from a flowchart of the program. Instead, begin with difficult design decisions or decisions likely to change, and design each module to hide one such decision from the rest of the system.[1]
That is information hiding. Modules should hide design decisions and expose as little as necessary through their interfaces. The rest of the system gets a small, stable contract; the messy decision lives in one place and can change without breaking everyone.
// scoring.ts
openai.chat.completions.create({ model: 'gpt-4o', ... })
// email.ts
openai.chat.completions.create({ model: 'gpt-4o-mini', ... })
// outreach.ts, summary.ts, evidence.ts, ...
// each importing the SDK directly
LLMClient hides model choice, prompt format, retries, fallback, and logging. Everyone else asks for a completion and stops caring how it is produced.
// llmClient.ts -- the wall
export const llm = {
  complete(prompt: Prompt): Promise<Completion> { ... }
}
// scoring.ts, email.ts, outreach.ts, ...
const result = await llm.complete(prompt)
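A minimal sketch of what might sit behind that wall, under stated assumptions: the Prompt, Completion, and Provider shapes and the createLlmClient factory are hypothetical, and providers are injected as plain functions so no real SDK call is shown. It covers only retries and fallback; prompt formatting and logging would live in the same place.

```typescript
// All names here are hypothetical; no real SDK is called.
type Prompt = { system?: string; user: string };
type Completion = { text: string; model: string };

// A provider is any function that turns a prompt into text for one model.
type Provider = { model: string; call: (prompt: Prompt) => Promise<string> };

function createLlmClient(providers: Provider[], attemptsPerProvider = 2) {
  return {
    // The only method the rest of the app sees.
    async complete(prompt: Prompt): Promise<Completion> {
      let lastError: unknown = new Error("no providers configured");
      // Try providers in order; retry each one before falling back to the next.
      for (const provider of providers) {
        for (let attempt = 0; attempt < attemptsPerProvider; attempt++) {
          try {
            const text = await provider.call(prompt);
            return { text, model: provider.model };
          } catch (error) {
            lastError = error; // remember the failure and keep going
          }
        }
      }
      throw lastError; // every provider failed on every attempt
    },
  };
}
```

Swapping the model, the retry count, or the fallback order changes this one file. Callers such as scoring.ts keep calling complete(prompt) and never learn which provider answered unless they read the model field.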
// scoring.ts
if (apolloLead.organization?.estimated_num_employees > 50) ...
// email.ts
const company = apolloLead.organization?.name ?? '???'
ApolloAdapter converts the raw response into a NormalizedLead. The rest of the app only knows the normalised shape.
// apolloAdapter.ts
export function fromApollo(raw): NormalizedLead { ... }
// scoring.ts
if (lead.employees > 50) ...
// email.ts
const company = lead.company.name
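The adapter body could look roughly like this. It is a sketch, not the provider's documented schema: RawApolloLead lists only the fields used in the snippets above, and the defaults (0 employees, "Unknown company") are my own assumptions about what the rest of the app would want.

```typescript
// Hypothetical subset of a raw provider payload; real responses may differ.
type RawApolloLead = {
  email?: string | null;
  organization?: {
    name?: string | null;
    estimated_num_employees?: number | null;
  } | null;
};

// The only shape the rest of the app is allowed to know.
type NormalizedLead = {
  email: string | null;
  employees: number; // assumption: 0 when the provider reports no head count
  company: { name: string };
};

function fromApollo(raw: RawApolloLead): NormalizedLead {
  return {
    // Trim and lower-case once here so no caller has to.
    email: raw.email?.trim().toLowerCase() ?? null,
    employees: raw.organization?.estimated_num_employees ?? 0,
    // Assumption: a readable placeholder beats '???' scattered across callers.
    company: { name: raw.organization?.name?.trim() || "Unknown company" },
  };
}
```

Every optional chain and fallback now lives in one function. If the provider renames a field, scoring.ts and email.ts do not change.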
Deep modules
Ousterhout’s test: how much functionality does this module give me, behind how small an interface?
Ousterhout’s clearest idea is the deep module. A good module provides a lot of functionality through a small, simple interface. A shallow module gives little benefit but still adds cognitive cost: another name to learn, another file to open, another concept to keep in your head.[2]
He also pushes a wider definition of “interface.” The interface is not just method signatures. It is everything the caller has to understand to use the module correctly — side effects, dependencies, ordering assumptions, behavior under failure. If the docstring needs four paragraphs to be safe, the interface is not really small.[2]
Shallow module
function getCompanyName(company) {
  return company.name
}
Deep module
CompanyResolver.resolve(rawLead)
// - dedupes against existing companies
// - matches by domain + name fuzzy
// - enriches from secondary source on miss
// - returns a NormalizedCompany with confidence
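To make the depth visible, here is a rough in-memory sketch of such a resolver. The RawLead and NormalizedCompany shapes, the confidence values, and the crude name normalisation standing in for fuzzy matching are all assumptions of mine, and enrichment from a secondary source is left out.

```typescript
// Hypothetical types; only CompanyResolver.resolve(rawLead) is named in the text.
type RawLead = { companyName: string; website?: string };
type NormalizedCompany = { id: number; name: string; domain: string | null; confidence: number };

class CompanyResolver {
  private companies: NormalizedCompany[] = [];

  // One small entry point; matching and deduplication stay private.
  resolve(raw: RawLead): NormalizedCompany {
    const domain = this.domainOf(raw.website);
    const key = this.nameKey(raw.companyName);

    // Strongest signal first: an exact domain match.
    const byDomain = domain ? this.companies.find((c) => c.domain === domain) : undefined;
    if (byDomain) return { ...byDomain, confidence: 1.0 };

    // Weaker signal: the normalised names agree.
    const byName = this.companies.find((c) => this.nameKey(c.name) === key);
    if (byName) return { ...byName, confidence: 0.7 };

    // No match: record a new company with low confidence.
    const created = { id: this.companies.length + 1, name: raw.companyName.trim(), domain, confidence: 0.5 };
    this.companies.push(created);
    return created;
  }

  private domainOf(website?: string): string | null {
    if (!website) return null;
    const host = website.toLowerCase().replace(/^https?:\/\//, "").replace(/^www\./, "").split("/")[0];
    return host || null;
  }

  // Crude stand-in for fuzzy matching: ignore case, punctuation, and legal suffixes.
  private nameKey(name: string): string {
    return name
      .toLowerCase()
      .replace(/[^a-z0-9 ]/g, "")
      .replace(/\b(inc|llc|ltd|gmbh)\b/g, "")
      .replace(/\s+/g, " ")
      .trim();
  }
}
```

The caller learns one method and one return type. Everything about how a match is decided can change without touching a single call site, which is the ratio Ousterhout's test is asking about.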
Tactical vs strategic programming
Two different questions an engineer can ask. The second one compounds.
Ousterhout draws a sharp line between two mindsets. Tactical programming optimises for making the current thing work. Strategic programming invests in design so that future development stays easier. Tactical is fine in short bursts. The trap is doing it forever — every shortcut becomes the next person’s constraint.[2]
The honest version of strategic programming is not perfectionism. It is: every time I touch a file, I leave its structure slightly better than I found it.
Tactical habits
- “I’ll clean it up later.”
- Copy-pasting provider-specific logic into a third file.
- Passing raw dictionaries around because typing the shape feels slow.
- Adding another flag instead of clarifying the abstraction.
- Creating tiny pass-through functions because the structure feels off.
- Adding special cases into general-purpose code.
- Changing the fewest lines possible because the surrounding code is scary.
- Skipping a name change because too many files reference the old one.
Strategic habits
- Design the interface before writing the implementation.
- Ask what is likely to change, then hide that thing.
- Hide external providers behind one wall.
- Keep raw external data at the edges; normalise before it travels.
- Improve one thing every time you touch a file.
- Write comments where the decision is not obvious from the code.
- Prefer one deep module over three shallow ones.
- Rename freely when the old name lies.
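One of these trade-offs, adding another flag versus clarifying the abstraction, is easy to show in a few lines. The example is hypothetical: buildSubject, OutreachKind, and the subject strings are invented for illustration and are not taken from any of my projects.

```typescript
// Tactical version (shown as a comment): each new case adds a boolean,
// and callers must know which flag combinations are valid.
// function buildSubject(name: string, isFollowUp: boolean, isReferral: boolean): string

// Strategic version: name the cases once and let the compiler enforce them.
type OutreachKind =
  | { kind: "intro" }
  | { kind: "followUp"; daysSinceLast: number }
  | { kind: "referral"; referredBy: string };

function buildSubject(name: string, outreach: OutreachKind): string {
  switch (outreach.kind) {
    case "intro":
      return `Hello ${name}`;
    case "followUp":
      return `Following up after ${outreach.daysSinceLast} days, ${name}`;
    case "referral":
      return `${outreach.referredBy} suggested I reach out, ${name}`;
  }
}
```

With two booleans there are four combinations and only three make sense. With the union, the invalid fourth case cannot be written, and a new kind of outreach forces every switch to be revisited.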
How to read any repo
A reading order that finds the architecture instead of getting lost in folders.
1. What product behavior does this repo create? One sentence, in user-visible terms.
2. What are the main domain nouns: the three to seven words that show up everywhere?
3. Where does data enter the system? HTTP routes, webhooks, cron jobs, file uploads.
4. Where is state stored? Which databases, caches, blob stores, external services?
5. Where does data leave? Responses, emails, exports, third-party APIs, queues.
6. What are the core rules? Permissions, validation, scoring, billing, compliance.
7. What external services does it depend on, and how is that dependency wrapped?
8. Which design decisions are hidden well, meaning one file owns the change?
9. Which decisions are leaking everywhere: copies, near-duplicates, scattered conditionals?
10. If I changed the database, the provider, the LLM, or the scoring logic, how many files would change?
The folder structure is the map you get for free. The real architecture is the dependency and knowledge structure: which modules know what, and which decisions ripple when something changes. The last question, how many files would change, is the X-ray for that.
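The "how many files would change" question can be roughed out as a count over an import graph. This is a sketch under assumptions: the graph is written by hand here, filesThatKnow is a hypothetical helper, and it only counts direct importers, so a real answer would need an actual parser and transitive analysis.

```typescript
// Hypothetical import graph: file name -> modules it imports directly.
type ImportGraph = Record<string, string[]>;

// Files that know about `dependency` directly, sorted by name.
// These are the files most likely to change when that dependency changes.
function filesThatKnow(graph: ImportGraph, dependency: string): string[] {
  return Object.keys(graph)
    .filter((file) => graph[file].indexOf(dependency) !== -1)
    .sort();
}

// Leaky layout: every feature file imports the SDK itself.
const leaky: ImportGraph = {
  "scoring.ts": ["openai"],
  "email.ts": ["openai"],
  "outreach.ts": ["openai"],
};

// Walled layout: one file owns the SDK, the rest import the wall.
const walled: ImportGraph = {
  "llmClient.ts": ["openai"],
  "scoring.ts": ["./llmClient"],
  "email.ts": ["./llmClient"],
  "outreach.ts": ["./llmClient"],
};
```

Calling filesThatKnow(leaky, "openai") returns three files, while filesThatKnow(walled, "openai") returns only llmClient.ts. The walled layout has more files, yet fewer of them know the decision.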
How to think before writing code
A reusable design note I fill in before opening an editor.
- Feature: What behavior needs to exist?
- Inputs: What data or events enter?
- Outputs: What should be produced?
- State: What needs to be remembered?
- Rules: What must always be true?
- Likely changes: What could change later?
- Hidden decisions: What should other modules not know?
- Interface: What is the simplest way for another part to use this?
- Red flags: Where might complexity leak?
- Feature: Generate personalised outreach emails for LeadMiner.
- Inputs: Lead profile, campaign settings, user tone preference.
- Outputs: A draft email ready for human review or send.
- State: Sent messages, campaign history, lead status, send window.
- Rules: No duplicates, respect unsubscribes, keep tone professional, no claims I cannot verify.
- Likely changes: LLM provider, prompt format, scoring logic, CRM export shape.
- Hidden decisions: Model provider, prompt construction, retry policy, compliance checks.
- Interface: messageGenerator.generate(lead, campaign)
- Red flags: Prompt strings copy-pasted across modules; raw provider fields used in send logic.
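The filled-in note translates fairly directly into a skeleton. What follows is a sketch, not the LeadMiner implementation: the OutreachLead, Campaign, and GenerateResult types and the createMessageGenerator factory are hypothetical, the text-writing step is injected so the model provider stays hidden, and only two of the rules (no duplicates, respect unsubscribes) are enforced.

```typescript
// Hypothetical shapes; only messageGenerator.generate(lead, campaign) comes from the note.
type OutreachLead = { email: string; firstName: string; unsubscribed: boolean };
type Campaign = { id: string; tone: "professional" | "friendly" };
type Draft = { to: string; body: string };
type GenerateResult =
  | { ok: true; draft: Draft }
  | { ok: false; reason: "unsubscribed" | "duplicate" };

// `write` is the hidden decision: it could wrap an LLM, a template, or a test stub.
function createMessageGenerator(write: (lead: OutreachLead, campaign: Campaign) => string) {
  // State: which lead/campaign pairs already have a draft.
  const drafted = new Set<string>();

  return {
    generate(lead: OutreachLead, campaign: Campaign): GenerateResult {
      // Rules run before any text is produced.
      if (lead.unsubscribed) return { ok: false, reason: "unsubscribed" };
      const key = `${campaign.id}:${lead.email.toLowerCase()}`;
      if (drafted.has(key)) return { ok: false, reason: "duplicate" };

      drafted.add(key);
      return { ok: true, draft: { to: lead.email, body: write(lead, campaign) } };
    },
  };
}
```

Each row of the note has a home: inputs are the two parameters, state is the Set, rules are the early returns, and the hidden decision is whatever function gets passed in as write.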
Apply this to my projects
The exercise is to name, for each product, the design decisions that should live behind one wall.
LeadMiner
- Lead source provider
- Raw data format
- Deduplication logic
- Scoring policy
- LLM provider
- Outreach policy
- CRM / export format
Locarde
- Evidence source
- Evidence storage
- Framework mapping
- Control evaluation logic
- Audit bundle generation
- Report rendering
GateCrown
- Client intake structure
- Risk assessment rules
- Document generation templates
- Regulatory mapping
- Training evidence logging
- Versioning of compliance documents
Engineering red flags
Specific code patterns that warn me I am leaking knowledge, not designing it.
processData, handleThing, manager, utils.
The final operating system
The short version I want to carry around in my head.
- Modeling reality as data.
- Defining rules that transform data.
- Managing state safely over time.
- Creating interfaces between parts.
- Hiding complexity behind abstractions.
- Making behavior understandable, changeable, and reliable.
Sources & notes
What I built on, where I synthesised, and what I did not claim.
This essay is a synthesis of my own learning, grounded partly in Parnas’s information hiding paper and Ousterhout’s software design philosophy. The eight-layer model in Section 03 is my own framing, not a structure proposed by either author. Where I draw a specific claim from a source, the inline reference points to it.
- [1] D. L. Parnas, “On the Criteria To Be Used in Decomposing Systems into Modules,” Communications of the ACM, vol. 15, no. 12, December 1972. The paper that argues modularity depends on the criteria of decomposition, that it is usually wrong to begin from a flowchart, and that each module should hide a difficult design decision or one likely to change.
- [2] John Ousterhout, A Philosophy of Software Design, 2nd edition. The source for the deep-class / deep-module idea, the wider definition of “interface,” and the tactical-vs-strategic programming distinction. Also discusses defining errors out of existence and the cost of complexity over time.
- John Ousterhout, talks and transcripts on software design (including the Stanford / Google talk that summarises the book). Useful as a faster path into the deep-class and strategic-programming ideas.