What Is Coding? A Mental Model for Becoming a Better Engineer

Section 01

The Surface Illusion

When I look at software, I usually see files. That view is not wrong, but it is shallow.

When I open a project, the first things I see are repos, files, folders, databases, APIs, and configs. That is software as a filing cabinet. It is a real view, but it is the surface.

At the surface, software looks like file management. At the deeper level, software is behavior, state, rules, interfaces, and abstractions. The files are where it is written down. The behavior is what actually exists.

This essay is about training myself to see the second layer first.

Files are where software is written down. Code is the behavior those files describe.

Surface

Repos · files · folders · configs · databases · APIs

What I see when I open the project.

Depth

Behavior · state · rules · interfaces · abstractions

What the software actually does.

Outcome

Product experience · business logic · user trust

What anyone outside the code actually cares about.

Section 02

The simplest definition of coding

One sentence: coding is designing and controlling behavior.

A useful definition I keep returning to:

Coding is designing and controlling behavior. When this input arrives, with this state, under these rules, produce this output.

The core loop

Input → Rules → State change → Output

Input

A request, a file, a user action, an event, an API response. Anything that arrives.

Rules

Business logic, validation, scoring, permissions. The conditions that decide what should happen.

State

Database, cache, filesystem, session, logs. What the software remembers.

Output

A UI response, an email, a saved record, a report, an API result. What leaves the system.
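
A minimal sketch of the loop above, assuming a hypothetical signup handler. None of these names come from a real codebase; the point is only that input and current state go in, rules decide, and a new state plus an output come out.

```typescript
// All names here (Signup, State, handleSignup) are hypothetical illustrations.
type Signup = { email: string };
type State = { readonly emails: ReadonlySet<string> };
type Output = { ok: true; message: string } | { ok: false; reason: string };

function handleSignup(input: Signup, state: State): { state: State; output: Output } {
  // Input: normalise what arrived.
  const email = input.email.trim().toLowerCase();

  // Rules: validation, then a uniqueness check against current state.
  if (email.indexOf("@") < 1) {
    return { state, output: { ok: false, reason: "invalid email" } };
  }
  if (state.emails.has(email)) {
    return { state, output: { ok: false, reason: "already registered" } };
  }

  // State change: build the next state rather than mutating the old one.
  const emails = new Set<string>(state.emails);
  emails.add(email);

  // Output: what leaves the system.
  return { state: { emails }, output: { ok: true, message: "welcome, " + email } };
}

const empty: State = { emails: new Set<string>() };
const first = handleSignup({ email: " Ada@Example.com " }, empty);
const second = handleSignup({ email: "ada@example.com" }, first.state);
console.log(first.output.ok, second.output.ok); // true false
```

Returning the next state instead of mutating the old one is a design choice for the sketch, not a requirement. It makes the state change visible in the function's return type.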

Section 03

The real layers of coding

Eight layers, from raw bytes up to product behavior. A way to know which floor I am actually working on.

Layer 01

Bytes

Definition

The machine ultimately stores and moves numbers.

Example

0x48 0x69 0x21 (the ASCII/UTF-8 bytes for “Hi!”)

Beginner miss

Forgetting that strings, files, and network packets are just labelled bytes underneath.

Ask yourself

What encoding, byte order, or unit assumption am I quietly making?
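
A small sketch of the two assumptions I make most often without noticing: encoding and byte order. It uses only the standard TextEncoder and DataView APIs; the variable names are mine.

```typescript
// Encoding: text becomes bytes only once an encoding is chosen.
// TextEncoder always produces UTF-8.
const encoder = new TextEncoder();
const hiBytes = Array.from(encoder.encode("Hi!"));
const hiHex = hiBytes.map((b) => "0x" + b.toString(16)).join(" ");
console.log(hiHex); // 0x48 0x69 0x21

// One visible character can be more than one byte: U+00E9 takes two bytes in UTF-8.
const accentByteCount = encoder.encode("\u00e9").length;

// Byte order: the same four bytes read as two different numbers.
const view = new DataView(new Uint8Array([0x00, 0x00, 0x01, 0x00]).buffer);
const asBigEndian = view.getUint32(0, false);
const asLittleEndian = view.getUint32(0, true);
console.log(accentByteCount, asBigEndian, asLittleEndian); // 2 256 65536
```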

Layer 02

Data

Definition

Humans assign meaning to bytes: strings, users, leads, documents, evidence, payments.

Example

Lead { name, email, company, source, score }

Beginner miss

Confusing the data with its shape. A “lead” is not its database row; the row is one representation.

Ask yourself

What is this thing in the real world, and what shape best serves the work it has to do?
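
To make the difference between the data and its shape concrete, here is a sketch with one domain type and one representation of it. The Lead fields follow the example above; the row's column names and the integer score storage are assumptions made up for illustration.

```typescript
// Domain shape: what a lead is for the work the app does.
type Lead = { name: string; email: string; company: string; source: string; score: number };

// One representation among several: a database row (hypothetical columns).
type LeadRow = {
  full_name: string;
  email_address: string;
  company_name: string | null;
  source_code: string;
  score_x100: number; // stored as an integer, e.g. 0.8 is stored as 80
};

// The conversion is the single place that knows both shapes.
function leadFromRow(row: LeadRow): Lead {
  return {
    name: row.full_name,
    email: row.email_address.toLowerCase(),
    company: row.company_name ?? "unknown",
    source: row.source_code,
    score: row.score_x100 / 100,
  };
}

const lead = leadFromRow({
  full_name: "Ada Lovelace",
  email_address: "Ada@Example.com",
  company_name: null,
  source_code: "referral",
  score_x100: 80,
});
console.log(lead.company, lead.score); // unknown 0.8
```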

Layer 03

Operations

Definition

Code transforms data: clean, sort, score, validate, summarise, send, export.

Example

scoreLead(lead, policy) → score

Beginner miss

Mixing transformation with side effects, so a “clean” function also writes to the database.

Ask yourself

Is this operation pure transformation, or is it also changing state and talking to the outside world?
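
A sketch of scoreLead as a pure transformation, with the write kept in a separate thin function. The policy fields and point values are hypothetical; only the scoreLead(lead, policy) shape comes from the example above.

```typescript
type Lead = { email: string; employees: number; source: string };
type ScoringPolicy = { minEmployees: number; trustedSources: string[] };

// Pure transformation: same inputs, same score, nothing read or written elsewhere.
function scoreLead(lead: Lead, policy: ScoringPolicy): number {
  let score = 0;
  if (lead.employees >= policy.minEmployees) score += 50;
  if (policy.trustedSources.indexOf(lead.source) !== -1) score += 30;
  if (lead.email.indexOf("@") > 0) score += 20;
  return score;
}

// The side effect lives in a thin wrapper that receives the writer as a parameter.
function scoreAndSave(
  lead: Lead,
  policy: ScoringPolicy,
  save: (email: string, score: number) => void,
): void {
  save(lead.email, scoreLead(lead, policy));
}

const policy: ScoringPolicy = { minEmployees: 50, trustedSources: ["referral"] };
const ada: Lead = { email: "ada@example.com", employees: 120, source: "referral" };
const saved: Array<[string, number]> = [];
scoreAndSave(ada, policy, (email, score) => saved.push([email, score]));
console.log(scoreLead(ada, policy), saved.length); // 100 1
```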

Layer 04

State

Definition

Software remembers things: databases, files, logs, sessions, queues, caches.

Example

leads, campaigns, sent_messages, audit_log

Beginner miss

Sprinkling stateful writes across the codebase so nothing has a single source of truth.

Ask yourself

For each piece of state: who owns it, who reads it, who is allowed to change it?
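
One way to answer the ownership question in code is to give a piece of state exactly one writer. This is a sketch with a hypothetical LeadStatusStore; the in-memory Map stands in for whatever storage would really back it.

```typescript
type LeadStatus = "new" | "contacted" | "unsubscribed";

// The only module allowed to change lead status. Everyone else reads or asks.
class LeadStatusStore {
  private readonly statuses = new Map<string, LeadStatus>();

  get(email: string): LeadStatus {
    return this.statuses.get(email) ?? "new";
  }

  markContacted(email: string): void {
    // The invariant is enforced in one place: unsubscribed is final.
    if (this.get(email) === "unsubscribed") {
      throw new Error("cannot contact unsubscribed lead: " + email);
    }
    this.statuses.set(email, "contacted");
  }

  unsubscribe(email: string): void {
    this.statuses.set(email, "unsubscribed");
  }
}

const statusStore = new LeadStatusStore();
statusStore.markContacted("ada@example.com");
statusStore.unsubscribe("ada@example.com");

let blocked = false;
try {
  statusStore.markContacted("ada@example.com");
} catch (error) {
  blocked = true;
}
console.log(statusStore.get("ada@example.com"), blocked); // unsubscribed true
```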

Layer 05

Interfaces

Definition

The contract between two parts of the system: functions, classes, APIs, events, schemas.

Example

leadSource.fetch(criteria) → Lead[]

Beginner miss

Exposing provider-specific Apollo or scraping shapes everywhere instead of a clean lead contract.

Ask yourself

What does the caller really need to know — and what can I hide from them?
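
A sketch of the leadSource.fetch(criteria) contract as a TypeScript interface, with one in-memory implementation. The Criteria fields are assumptions, and fetch is kept synchronous to match the example above; a provider-backed version would most likely return a Promise.

```typescript
type Lead = { name: string; email: string; company: string };
type Criteria = { company?: string; limit: number };

// The contract: callers learn this and nothing about the provider behind it.
interface LeadSource {
  fetch(criteria: Criteria): Lead[];
}

// One implementation. A provider-backed one would satisfy the same interface.
class InMemoryLeadSource implements LeadSource {
  private readonly leads: Lead[];

  constructor(leads: Lead[]) {
    this.leads = leads;
  }

  fetch(criteria: Criteria): Lead[] {
    const matches = this.leads.filter(
      (lead) => criteria.company === undefined || lead.company === criteria.company,
    );
    return matches.slice(0, criteria.limit);
  }
}

// Caller code depends only on the interface.
function countLeadsAt(source: LeadSource, company: string): number {
  return source.fetch({ company, limit: 100 }).length;
}

const leadSource = new InMemoryLeadSource([
  { name: "Ada", email: "ada@acme.test", company: "Acme" },
  { name: "Grace", email: "grace@acme.test", company: "Acme" },
  { name: "Alan", email: "alan@beta.test", company: "Beta" },
]);
const acmeCount = countLeadsAt(leadSource, "Acme");
console.log(acmeCount); // 2
```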

Layer 06

Modules & abstractions

Definition

Design decides what each part is allowed to know and what it must hide.

Example

LLMClient hides model, prompt, retries, fallback

Beginner miss

Treating modules as folders rather than as boundaries of knowledge.

Ask yourself

If this design decision changes tomorrow, how many files have to move with it?

Layer 07

Systems

Definition

Frontend, backend, database, workers, queues, external APIs, auth, and logging working together.

Example

Web → API → Worker → LLM → CRM

Beginner miss

Reasoning only about one process while the failure modes live in the seams between processes.

Ask yourself

What happens at the boundary — on retries, timeouts, partial failure, or version skew?
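
One seam failure I can sketch in a few lines: a queue retries a message after a timeout even though the first attempt succeeded. The worker below tolerates that by remembering message ids. Everything here is hypothetical, and the in-memory Set is a stand-in; a real worker would need durable storage and would have to record the id atomically with the effect.

```typescript
type SendEmail = { messageId: string; to: string; body: string };

class EmailWorker {
  private readonly handled = new Set<string>();
  readonly sent: string[] = [];

  handle(message: SendEmail): "sent" | "duplicate" {
    if (this.handled.has(message.messageId)) {
      return "duplicate"; // retry of work already done: safe to acknowledge
    }
    this.sent.push(message.to); // stand-in for the real side effect
    this.handled.add(message.messageId);
    return "sent";
  }
}

const worker = new EmailWorker();
const message: SendEmail = { messageId: "m-1", to: "ada@example.com", body: "Hello" };
const firstAttempt = worker.handle(message);
const retryAttempt = worker.handle(message); // the queue delivers it again
console.log(firstAttempt, retryAttempt, worker.sent.length); // sent duplicate 1
```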

Layer 08

Product & business logic

Definition

The software encodes real-world rules: compliance, scoring, permissions, pricing, workflows.

Example

do not message unsubscribed leads

Beginner miss

Burying real-world rules inside utility functions where the business cannot see or audit them.

Ask yourself

Which lines of code are policy decisions in disguise, and where should they actually live?
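
A sketch of what it can look like when policy lives in one named place. The unsubscribe rule comes from the example above; the seven-day gap is a made-up second rule, included only to show that each refusal carries the name of the rule that caused it.

```typescript
type Lead = { email: string; unsubscribed: boolean; lastContactedAt: Date | null };
type Decision = { allowed: true } | { allowed: false; rule: string };

// Hypothetical policy values, kept together so they can be read and audited.
const OUTREACH_POLICY = { minDaysBetweenMessages: 7 };
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function canMessage(lead: Lead, now: Date): Decision {
  if (lead.unsubscribed) {
    return { allowed: false, rule: "do not message unsubscribed leads" };
  }
  if (lead.lastContactedAt !== null) {
    const days = (now.getTime() - lead.lastContactedAt.getTime()) / MS_PER_DAY;
    if (days < OUTREACH_POLICY.minDaysBetweenMessages) {
      return { allowed: false, rule: "minimum gap between messages" };
    }
  }
  return { allowed: true };
}

const now = new Date("2024-01-10T00:00:00Z");
const fresh = canMessage({ email: "a@x.test", unsubscribed: false, lastContactedAt: null }, now);
const optedOut = canMessage({ email: "b@x.test", unsubscribed: true, lastContactedAt: null }, now);
const tooSoon = canMessage(
  { email: "c@x.test", unsubscribed: false, lastContactedAt: new Date("2024-01-08T00:00:00Z") },
  now,
);
console.log(fresh.allowed, optedOut.allowed, tooSoon.allowed); // true false false
```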

Section 04

Files, repos, programs, databases: what they really are

Naming each artefact precisely so I stop treating them as interchangeable.

file

A container for code, data, config, or documentation. By itself, just text on disk.

repo

The versioned history of project files and the collaboration record around them. Memory of changes.

program

A running behavior. The files describe it; the program is what actually executes.

database

Structured persistent state with querying, consistency guarantees, and recovery rules.

api

A contract between systems. Inputs, outputs, errors, and the assumptions both sides agree to.

log

Memory of what happened. Time-ordered events for debugging, auditing, and post-incident learning.

queue

Memory of work waiting to happen. Decouples producers from consumers and absorbs bursts.

config

Behavior I want to change without rewriting code. Feature flags, environments, secrets.

A kitchen analogy I use sparingly. Files are the recipe cards. The repo is the recipe book with handwritten edits and history. The database is the pantry plus the order records. The running program is the kitchen operating during service. The API is the printed menu that promises what the kitchen will accept. Caveat: do not overdo this. A kitchen analogy explains the artefacts. It does not explain concurrency, partial failure, schema migrations, or backpressure — the parts of software that actually hurt.

Section 05

From files to knowledge boundaries

The shift from junior to senior engineering is the shift from “where do I put this file?” to “who is allowed to know this?”

A junior question: where should this file live? A senior question: who is allowed to know this thing?

The senior question is the one Parnas pushed in 1972. His paper on decomposing systems into modules argued that modularity is not just splitting a system into pieces. The effectiveness depends on the criteria used to divide the system. His central conclusion is sharp: it is usually wrong to begin decomposition from a flowchart of the program. Instead, begin with difficult design decisions or decisions likely to change, and design each module to hide one such decision from the rest of the system.^[1]

That is information hiding. Modules should hide design decisions and expose as little as necessary through their interfaces. The rest of the system gets a small, stable contract; the messy decision lives in one place and can change without breaking everyone.

A module is not a folder. A module is a wall around a design decision that is likely to change.

Bad — provider leaks everywhere

OpenAI-specific call patterns, headers, retry shapes, and prompt formats live in twenty different files. Switching to a different model provider would touch every one of them.

// scoring.ts
openai.chat.completions.create({ model: 'gpt-4o', ... })

// email.ts
openai.chat.completions.create({ model: 'gpt-4o-mini', ... })

// outreach.ts, summary.ts, evidence.ts, ...
//   each importing the SDK directly

Good — one wall around the decision

An LLMClient hides model choice, prompt format, retries, fallback, and logging. Everyone else asks for a completion and stops caring how it is produced.

// llmClient.ts  -- the wall
export const llm = {
  complete(prompt: Prompt): Promise<Completion> { ... }
}

// scoring.ts, email.ts, outreach.ts, ...
const result = await llm.complete(prompt)
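
To show what could sit behind that wall, here is a sketch of an LLM client with retries and a fallback. It makes no real SDK calls: a provider is modelled as a plain function, and createLLMClient, Provider, and the model names are hypothetical. A real version would wrap the SDK inside functions of this shape and add logging and backoff.

```typescript
type Prompt = { system: string; user: string };
type Completion = { text: string; model: string };

// A provider is any function that turns a prompt into text.
type Provider = { model: string; call: (prompt: Prompt) => Promise<string> };

function createLLMClient(primary: Provider, fallback: Provider, maxAttempts = 2) {
  return {
    async complete(prompt: Prompt): Promise<Completion> {
      for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
          return { text: await primary.call(prompt), model: primary.model };
        } catch (error) {
          // Swallow and retry. A fuller version would log and back off here.
        }
      }
      // Primary exhausted: one fallback attempt, whose error reaches the caller.
      return { text: await fallback.call(prompt), model: fallback.model };
    },
  };
}

// Fake providers so the sketch runs without a network.
let primaryCalls = 0;
const flaky: Provider = {
  model: "primary-model",
  call: async () => {
    primaryCalls += 1;
    throw new Error("rate limited");
  },
};
const steady: Provider = {
  model: "fallback-model",
  call: async (prompt) => "echo: " + prompt.user,
};

const client = createLLMClient(flaky, steady);
const pending = client.complete({ system: "", user: "hello" });
pending.then((completion) => console.log(completion.model, completion.text));
```

Callers see only complete(prompt). The retry count, the fallback order, and the provider identity can all change without touching them.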

Bad — raw provider fields in business logic

Apollo’s raw response shape is used directly in scoring, email generation, export, and UI. The day Apollo changes a field name, the whole pipeline breaks.

// scoring.ts
if (apolloLead.organization?.estimated_num_employees > 50) ...

// email.ts
const company = apolloLead.organization?.name ?? '???'

Good — normalise at the edge

An ApolloAdapter converts the raw response into a NormalizedLead. The rest of the app only knows the normalised shape.

// apolloAdapter.ts
export function fromApollo(raw: unknown): NormalizedLead { ... }

// scoring.ts
if (lead.employees > 50) ...

// email.ts
const company = lead.company.name
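
A sketch of what the adapter body might look like. The two raw fields, organization.name and organization.estimated_num_employees, are the ones used in the examples above; the RawApolloLead type, the defaults, and the decision to treat an unknown headcount as 0 are my assumptions, not the provider's documented schema.

```typescript
// Assumed slice of the raw provider response, not a documented schema.
type RawApolloLead = {
  organization?: {
    name?: string | null;
    estimated_num_employees?: number | null;
  } | null;
};

type NormalizedLead = {
  employees: number; // assumption: unknown headcount is treated as 0
  company: { name: string };
};

// The one place that knows the provider's field names and what a missing field means.
function fromApollo(raw: RawApolloLead): NormalizedLead {
  return {
    employees: raw.organization?.estimated_num_employees ?? 0,
    company: { name: raw.organization?.name ?? "Unknown company" },
  };
}

const complete = fromApollo({ organization: { name: "Acme", estimated_num_employees: 80 } });
const sparse = fromApollo({});
console.log(complete.employees > 50, sparse.company.name); // true Unknown company
```

Choosing 0 for a missing headcount keeps comparisons like lead.employees > 50 simple, at the cost of hiding the difference between small and unknown. If that difference matters to scoring, the normalised shape should say so explicitly.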

Section 06

Deep modules

Ousterhout’s test: how much functionality does this module give me, behind how small an interface?

Ousterhout’s clearest idea is the deep module. A good module provides a lot of functionality through a small, simple interface. A shallow module gives little benefit but still adds cognitive cost: another name to learn, another file to open, another concept to keep in your head.^[2]

He also pushes a wider definition of “interface.” The interface is not just method signatures. It is everything the caller has to understand to use the module correctly — side effects, dependencies, ordering assumptions, behavior under failure. If the docstring needs four paragraphs to be safe, the interface is not really small.^[2]

Shallow module

Large interface · small hidden value

A function that wraps one line and adds a name. The caller now has to learn the wrapper and still know what is underneath.

Deep module

Small interface · large hidden complexity that the caller never has to think about

One method. Behind it: deduplication, enrichment, fallback, retries, scoring. The caller is freed from all of that.

Bad — shallow pass-through

A tiny function that just wraps a field access. It buys nothing for the caller and creates a layer to maintain forever.

function getCompanyName(company) {
  return company.name
}

Good — deep resolver

A small interface that hides real work: deduplication, domain matching, enrichment fallback, confidence scoring.

CompanyResolver.resolve(rawLead)
//   - dedupes against existing companies
//   - matches by domain + name fuzzy
//   - enriches from secondary source on miss
//   - returns a NormalizedCompany with confidence
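
A partial sketch of how one method can hide that work. It covers only deduplication, domain matching, and a confidence value; enrichment from a secondary source is left out. The types, the matching rules, and the confidence numbers are all hypothetical.

```typescript
type RawLead = { companyName: string; website?: string };
type NormalizedCompany = { id: number; name: string; domain: string | null; confidence: number };

class CompanyResolver {
  private readonly byKey = new Map<string, NormalizedCompany>();
  private nextId = 1;

  // The whole public interface: one method.
  resolve(rawLead: RawLead): NormalizedCompany {
    const domain = this.domainOf(rawLead.website);
    const key = domain ?? "name:" + this.normalizeName(rawLead.companyName);

    const existing = this.byKey.get(key);
    if (existing) return existing; // dedupe against companies already seen

    const company: NormalizedCompany = {
      id: this.nextId++,
      name: rawLead.companyName.trim(),
      domain,
      confidence: domain ? 0.9 : 0.5, // made-up values: a domain match beats a name match
    };
    this.byKey.set(key, company);
    return company;
  }

  private domainOf(website?: string): string | null {
    if (!website) return null;
    const host = website.toLowerCase().replace(/^https?:\/\//, "").replace(/^www\./, "").split("/")[0];
    return host || null;
  }

  private normalizeName(name: string): string {
    return name.toLowerCase().replace(/[^a-z0-9]/g, "");
  }
}

const resolver = new CompanyResolver();
const acmeA = resolver.resolve({ companyName: "Acme Inc", website: "https://www.acme.com/about" });
const acmeB = resolver.resolve({ companyName: "ACME", website: "http://acme.com" });
const betaA = resolver.resolve({ companyName: "Beta Labs" });
const betaB = resolver.resolve({ companyName: "beta-labs" });
console.log(acmeA.id === acmeB.id, betaA.id === betaB.id, acmeA.domain); // true true acme.com
```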

Small is not automatically good. Deep is good.

Section 07

Tactical vs strategic programming

Two different questions an engineer can ask. The second one compounds.

Ousterhout draws a sharp line between two mindsets. Tactical programming optimises for making the current thing work. Strategic programming invests in design so that future development stays easier. Tactical is fine in short bursts. The trap is doing it forever — every shortcut becomes the next person’s constraint.^[2]

The honest version of strategic programming is not perfectionism. It is: every time I touch a file, I leave its structure slightly better than I found it.

Tactical red flags

“I’ll clean it up later.”
Copy-pasting provider-specific logic into a third file.
Passing raw dictionaries around because typing the shape feels slow.
Adding another flag instead of clarifying the abstraction.
Creating tiny pass-through functions because the structure feels off.
Adding special cases into general-purpose code.
Changing the fewest lines possible because the surrounding code is scary.
Skipping a name change because too many files reference the old one.

Strategic habits

Design the interface before writing the implementation.
Ask what is likely to change — then hide that thing.
Hide external providers behind one wall.
Keep raw external data at the edges; normalise before it travels.
Improve one thing every time you touch a file.
Write comments where the decision is not obvious from the code.
Prefer one deep module over three shallow ones.
Rename freely when the old name lies.

Section 08

How to read any repo

A reading order that finds the architecture instead of getting lost in folders.

1. What product behavior does this repo create? One sentence, in user-visible terms.
2. What are the main domain nouns: the three to seven words that show up everywhere?
3. Where does data enter the system? HTTP routes, webhooks, cron jobs, file uploads.
4. Where is state stored? Which databases, caches, blob stores, external services?
5. Where does data leave? Responses, emails, exports, third-party APIs, queues.
6. What are the core rules? Permissions, validation, scoring, billing, compliance.
7. What external services does it depend on, and how is that dependency wrapped?
8. Which design decisions are hidden well, meaning one file owns the change?
9. Which decisions are leaking everywhere: copies, near-duplicates, scattered conditionals?
10. If I changed the database, the provider, the LLM, or the scoring logic, how many files would change?

The folder structure is the map you get for free. The real architecture is the dependency and knowledge structure — which modules know what, and which decisions ripple when something changes. Question ten is the X-ray for that.

Section 09

How to think before writing code

A reusable design note I fill in before opening an editor.

Template

Feature: What behavior needs to exist?
Inputs: What data or events enter?
Outputs: What should be produced?
State: What needs to be remembered?
Rules: What must always be true?
Likely changes: What could change later?
Hidden decisions: What should other modules not know?
Interface: What is the simplest way for another part to use this?
Red flags: Where might complexity leak?

Filled example — LeadMiner outreach

Feature: Generate personalised outreach emails for LeadMiner.
Inputs: Lead profile, campaign settings, user tone preference.
Outputs: A draft email ready for human review or send.
State: Sent messages, campaign history, lead status, send window.
Rules: No duplicates, respect unsubscribes, keep tone professional, no claims I cannot verify.
Likely changes: LLM provider, prompt format, scoring logic, CRM export shape.
Hidden decisions: Model provider, prompt construction, retry policy, compliance checks.
Interface: messageGenerator.generate(lead, campaign)
Red flags: Prompt strings copy-pasted across modules; raw provider fields used in send logic.

Section 10

Apply this to my projects

The exercise is to name, for each product, the design decisions that should live behind one wall.

Product 01

LeadMiner

Decisions to hide

Lead source provider
Raw data format
Deduplication logic
Scoring policy
LLM provider
Outreach policy
CRM / export format

Recommended modules

LeadSourceAdapter · LeadNormalizer · CompanyResolver · LeadScoringPolicy · LLMClient · MessageGenerator · CRMExporter · AuditLog

Product 02

Locarde

Decisions to hide

Evidence source
Evidence storage
Framework mapping
Control evaluation logic
Audit bundle generation
Report rendering

Recommended modules

EvidenceCollector · EvidenceStore · FrameworkRegistry · ControlMapper · PolicyEvaluator · AuditBundleGenerator · TrustReportRenderer

Product 03

GateCrown

Decisions to hide

Client intake structure
Risk assessment rules
Document generation templates
Regulatory mapping
Training evidence logging
Versioning of compliance documents

Recommended modules

ClientProfile · RiskAssessmentPolicy · CDDWorkflow · DocumentGenerator · ComplianceMapping · TrainingEvidenceLog

Section 11

Engineering red flags

Specific code patterns that warn me I am leaking knowledge, not designing it.

Duplication of detail

The same low-level detail appears in many files.

Why bad

Every change has to be made in every copy. Drift is inevitable.

Do instead

Put the detail behind one named module and let everyone call it.

Schema leak into UI

Database column names appear in the UI layer.

Why bad

A schema change ripples into screens that should not have known about the database.

Do instead

Map at the boundary: rows become domain objects before they leave the data layer.

Provider fields in business logic

Apollo, Stripe, or OpenAI field names appear deep in scoring or rules.

Why bad

Replacing the provider becomes an archaeology project instead of a swap.

Do instead

Normalise once at the edge; rules see your domain shape only.

Hard-to-name module

A module exists but its name is vague or contested.

Why bad

If I cannot name it, it is probably doing more than one job — or the wrong job.

Do instead

Split until each part has a clear, declarative name.

Vague names

Functions and modules called processData, handleThing, manager, utils.

Why bad

The names do not constrain behavior, so every change becomes plausible.

Do instead

Force the name to commit to one job. Rename freely when behavior shifts.

Interface needs an essay

Using the module safely requires four paragraphs of explanation.

Why bad

The interface is not really small. The complexity has just been moved into the comments.

Do instead

Redesign the interface to absorb the rules. Use defaults, types, and shapes to remove choices.

Small change, many files

Editing one rule requires touching many unrelated files.

Why bad

The decision was never hidden; it was scattered.

Do instead

Pull the rule into one policy module others depend on.

Tests fight the design

Tests are painful because responsibilities are mixed in one function.

Why bad

Tests are pain-detectors. If they hurt, the design probably does too.

Do instead

Split into pure transformation + a thin side-effect shell. Test them separately.

Transform plus side effect

A function both computes data and sends an email, writes to a queue, or hits an API.

Why bad

You can never call it “just to compute” in another context.

Do instead

Return the data; let a caller decide whether to commit the effect.
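
A sketch of that split, with hypothetical names. buildFollowUp only computes; sendFollowUp is the thin shell, and the sender is passed in so the caller decides whether and how the effect happens.

```typescript
type Lead = { name: string; email: string };
type Email = { to: string; subject: string; body: string };

// Pure: computes the email and returns it. Safe to call in a preview or a test.
function buildFollowUp(lead: Lead): Email {
  return {
    to: lead.email,
    subject: "Following up, " + lead.name,
    body: "Hi " + lead.name + ", just checking in.",
  };
}

// Thin shell: the only place where the effect is committed.
function sendFollowUp(lead: Lead, send: (email: Email) => void): void {
  send(buildFollowUp(lead));
}

const grace: Lead = { name: "Grace", email: "grace@example.com" };

// Compute only: nothing is sent.
const preview = buildFollowUp(grace);

// Commit the effect, here with a fake sender that records what it was given.
const outbox: Email[] = [];
sendFollowUp(grace, (email) => outbox.push(email));
console.log(preview.subject, outbox.length); // Following up, Grace 1
```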

Permanent “temporary” hack

A workaround written under pressure has become load-bearing.

Why bad

It is now part of the design without ever being designed.

Do instead

Promote it to a real abstraction with a clear contract — or remove it.

Section 12

The final operating system

The short version I want to carry around in my head.

Coding is

Modeling reality as data.
Defining rules that transform data.
Managing state safely over time.
Creating interfaces between parts.
Hiding complexity behind abstractions.
Making behavior understandable, changeable, and reliable.

Whenever I code, I should ask: what behavior am I creating, what state am I changing, and what complexity am I hiding?

Section 13

Sources & notes

What I built on, where I synthesised, and what I did not claim.

This essay is a synthesis of my own learning, grounded partly in Parnas’s information hiding paper and Ousterhout’s software design philosophy. The eight-layer model in Section 03 is my own framing, not a structure proposed by either author. Where I draw a specific claim from a source, the inline reference points to it.

^[1] D. L. Parnas, “On the Criteria To Be Used in Decomposing Systems into Modules,” Communications of the ACM, vol. 15, no. 12, December 1972. The paper that argues modularity depends on the criteria of decomposition, that it is usually wrong to begin from a flowchart, and that each module should hide a difficult design decision or one likely to change.
^[2] John Ousterhout, A Philosophy of Software Design, 2nd edition. The source for the deep-class / deep-module idea, the wider definition of “interface,” and the tactical-vs-strategic programming distinction. Also discusses defining errors out of existence and the cost of complexity over time.
John Ousterhout, talks and transcripts on software design (including the Stanford / Google talk that summarises the book). Useful as a faster path into the deep-class and strategic-programming ideas.

Verification · The eight-layer model is mine, labelled as a mental model in Section 03. The kitchen analogy in Section 04 is intentionally limited. Project recommendations in Section 10 are design suggestions for my own work, not formal architecture decisions. Nothing here should be cited as a claim by Parnas or Ousterhout that they did not make.
