Routing Back to C [Part 1]: Judgment Is the Last Expensive Thing
LLMs killed the cost of typing code. The bottleneck is now judgment, and judgment needs to know what's underneath. Part 1 of a six-part series where I rebuild the things every language hides from you, in C.
A LinkedIn post crossed my feed last week:
Every programmer should learn C. Implement a linked list, hash table, and binary tree. Then build a simple CLI program and a basic network server. Not because you’ll use it daily, but because it strips away every abstraction you’ve been hiding behind and shows you what’s really beneath whatever language you use daily.
I agreed instantly, then started writing this series.
Why Now
LLMs killed the cost of typing code. A function I’d have spent twenty minutes on lands in twenty seconds. That’s not the interesting part. The interesting part is what didn’t change: judgment. Knowing what the machine should do, why this approach over that one, what the cost of a line is.
Vibe-coding is fine for engineers who already have that judgment. They can read what comes back, smell when something is wrong, and steer. For everyone else, vibe-coding ships bloat at LLM speed: extra allocations, dead branches, accidental N+1s, frameworks pulled in to do what twelve lines of code would.
I make games. At 60fps, every frame has about 16 milliseconds to draw itself. At 120fps, about 8. Go over that budget and the game stutters. The player feels it instantly. There is nowhere to hide a slow line of code.
The LLM era should treat code the same way. Writing code is almost free now. Running it still isn’t. If we use that speed to ship more code without thinking, we just ship more slow software. The better move is the opposite: use the time we saved to make what we ship actually fast.
C is the cheapest way to learn what a line of code really costs.
What C Strips Away
Every modern language hides three things from you:
- Where memory lives. Stack, heap, segments, the address space.
- Who frees it. A garbage collector, a destructor, a compile-time checker, somebody.
- What a pointer is. A variable that holds an address, not a value.
Here’s the rough memory layout of any running C program on mainstream architectures (x86, x86_64, ARM, ARM64):
high addresses
┌──────────────────────┐
│ stack │ ← grows down
│ (function frames, │
│ local variables) │
├──────────────────────┤
│ ↓ │
│ │
│ ↑ │
├──────────────────────┤
│ heap │ ← grows up
│ (malloc / free) │
├──────────────────────┤
│ bss + data │ ← globals, statics
├──────────────────────┤
│ text │ ← your compiled code
└──────────────────────┘
low addresses
Stack-down and heap-up are convention, not a law of the C standard. A handful of older architectures (PA-RISC) grew the stack upward. Every modern target you’re likely to run on looks like the picture above.
Every language you use sits on top of this picture. Python objects live in the heap. Java objects live in the heap. Your Go goroutine has its own stack. The picture doesn’t go away because the syntax is prettier.
Stack vs Heap
The stack is automatic. Every function call pushes a frame. The frame holds the function’s local variables. When the function returns, the frame is popped and those variables are gone. Fast, ordered, no thinking required.
call main() call greet() greet returns
┌─────────┐ ┌─────────┐ ┌─────────┐
│ main │ │ greet │ │ main │
│ argc │ ├─────────┤ │ argc │
│ argv │ │ main │ │ argv │
└─────────┘ │ argc │ └─────────┘
│ argv │
└─────────┘
The heap is manual. You ask for a chunk of memory, you get a pointer to it, and that chunk is yours until you give it back. The stack frame that asked for it can disappear, the pointer can be passed around, copied, returned. The chunk stays alive until somebody calls free.
int *p = malloc(sizeof(int)); // ask for 4 bytes on the heap
*p = 42; // write 42 into that address
free(p); // give it back
Three lines. You’ve done what every garbage-collected language does for you a billion times a day.
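A minimal sketch of the claim that the chunk outlives the frame that asked for it. make_int is a hypothetical helper of mine, not a library function, and it adds the NULL check the three lines above skip for brevity.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate one int on the heap and hand the
 * pointer back. The frame of make_int is popped on return, but the
 * chunk it asked for stays alive. */
int *make_int(int value) {
    int *p = malloc(sizeof *p);   /* room for exactly one int */
    if (p == NULL) {
        return NULL;              /* malloc can fail; the caller must check */
    }
    *p = value;
    return p;                     /* the caller now owns the chunk and must free it */
}
```

Call it as int *p = make_int(42); and the 42 is still there after make_int has returned. Whoever holds p last is the one who owes the free.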
Pointers in One Picture
Before we go further, the one idea everything in C is built on. A pointer is a variable that holds an address instead of a value.
int a = 42; // a holds a value
int *p = &a; // p holds the address of a
Four symbols, four things, and they confuse every C beginner until they don’t:
| Written | Means | In our example |
|---|---|---|
| a | the value stored in a | 42 |
| &a | the address where a lives | 0x40 |
| p | the value stored in p, which happens to be an address | 0x40 |
| *p | “follow the address in p” and give me what’s there | 42 |
The & operator says “give me the address of this variable”. The * operator says “follow this address and give me the value at it”. They’re opposites. *(&a) is just a, and if p = &a then *p is a too.
In a picture:
name address value
┌───┐ ┌──────┐ ┌──────┐
│ a │ → │ 0x40 │ → │ 42 │ stack variable
└───┘ └──────┘ └──────┘
┌───┐ ┌──────┐ ┌──────┐
│ p │ → │ 0x80 │ → │ 0x40 │ p's value is a's address
└───┘ └──────┘ └──────┘
a is an integer variable at address 0x40, holding 42. p is a pointer variable at address 0x80, and its value is 0x40, the address of a. Writing *p in code walks the arrow from p to a and lands on 42.
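The same arrows in code, as a small sketch. Both function names are made up for illustration. The first walks the arrow from p to a and writes at the far end. The second is the classic reason pointers exist at all: letting a function change variables that live in its caller's frame.

```c
/* Returns the value of a after writing 43 through a pointer to it. */
int write_through_pointer(void) {
    int a = 42;
    int *p = &a;   /* p holds the address of a */
    *p = 43;       /* follow that address and overwrite what lives there */
    return a;      /* a was changed without being named in the assignment */
}

/* Swap two ints that live in the caller's frame. */
void swap_ints(int *x, int *y) {
    int tmp = *x;  /* read the value x points at */
    *x = *y;
    *y = tmp;
}
```

Without the pointers, swap_ints would receive copies and the caller's variables would not move.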
How Big Is an Address?
A pointer takes space. 8 bytes on 64-bit, 4 on 32-bit. The type it points at does not change this.
sizeof(char) = 1
sizeof(int) = 4 ← typical, not guaranteed
sizeof(int *) = 8 ┐
sizeof(char *) = 8 │ every data pointer is 8 bytes on 64-bit
sizeof(void *) = 8 ┘
Two footnotes the C standard forces on this picture. sizeof(char) is always exactly 1 by definition. sizeof(int) is implementation-defined: the standard only guarantees at least 16 bits, but every mainstream 32/64-bit compiler (gcc, clang, MSVC) picks 4 bytes. Pointer width is also platform-dependent rather than standard-mandated, but every 64-bit target you’ll realistically use makes data pointers 8 bytes. I’ll keep writing “4” and “8” without the disclaimer from here on.
On 64-bit, a pointer to a 4-byte int is already bigger than the int itself. A linked list of 32-bit values spends more memory on next pointers than on the values. First taste of why dynamic structures feel heavy.
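To put a number on that, here is a sketch with a hypothetical node type (the real one arrives in the linked list post). node_overhead is my own helper name. It reports how many bytes of each node are not the payload.

```c
#include <stddef.h>

/* Hypothetical singly linked list node holding one int. */
struct node {
    int value;           /* 4 bytes on mainstream targets */
    struct node *next;   /* 8 bytes on 64-bit targets */
};

/* Bytes in each node that are not the payload: the next pointer
 * plus any padding the compiler inserts for alignment. */
size_t node_overhead(void) {
    return sizeof(struct node) - sizeof(int);
}
```

On a typical 64-bit target the struct is 16 bytes: 4 for value, 4 of padding so next lands on an 8-byte boundary, and 8 for next. Twelve of every sixteen bytes carry no data, and that is before the allocator adds its own bookkeeping.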
One Byte, One Address
Memory is a flat array of bytes. Every byte has one address. A variable takes as many consecutive bytes as its type says, and is referred to by its first one.
address: 0x40 0x41 0x42 0x43 0x44
┌────┬────┬────┬────┬────┐
bytes: │ 2A │ 00 │ 00 │ 00 │ 07 │
└────┴────┴────┴────┴────┘
└──── int a = 42 ────┘ └─ char c = 7
That 2A 00 00 00 layout is little-endian: the least significant byte sits at the lowest address. x86, x86_64, ARM64 (in its usual mode) and most of what you’ll run on are little-endian. Big-endian machines (some older PowerPC, SPARC) and network byte order put the same int down as 00 00 00 2A. The value is identical, only the order of the bytes in memory differs.
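You can ask the machine which way it stores things. A small sketch, with helper names of my own choosing: copy the raw bytes of a 32-bit value out and look at the one at the lowest address.

```c
#include <stdint.h>
#include <string.h>

/* Returns the byte stored at the lowest address of a 32-bit value. */
unsigned char lowest_address_byte(uint32_t value) {
    unsigned char bytes[sizeof value];
    memcpy(bytes, &value, sizeof value);   /* copy the raw bytes out */
    return bytes[0];
}

/* Returns 1 on a little-endian machine, 0 otherwise. */
int is_little_endian(void) {
    return lowest_address_byte(42u) == 0x2A;
}
```

memcpy into an unsigned char array is the boring, always-legal way to inspect an object's bytes. On the machines listed above as little-endian, is_little_endian() comes back 1.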
Addresses are fixed-width. The thing behind them is not.
How Does the Pointer Know How Many Bytes?
The address doesn’t, 0x40 is just a number. The type of the pointer is what tells the compiler how wide to read. Same starting byte, three different windows:
char *cp = 0x40 → 1 byte ┌────┐
│ 2A │
└────┘
int *ip = 0x40 → 4 bytes ┌────┬────┬────┬────┐
│ 2A │ 00 │ 00 │ 00 │
└────┴────┴────┴────┘
long *lp = 0x40 → 8 bytes ┌────┬────┬────┬────┬────┬────┬────┬────┐
│ 2A │ 00 │ 00 │ 00 │ 07 │ ?? │ ?? │ ?? │
└────┴────┴────┴────┴────┴────┴────┴────┘
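Casting changes the lens. The memory didn’t change, the view did. Pick a lens wider than the object underneath and you read bytes that belong to someone else, which is how out-of-bounds reads and buffer overruns start: lp above swallows a, c, and three bytes nobody promised you. (long is 8 bytes on 64-bit Linux and macOS, but 4 on 64-bit Windows.) The type also sets stride: cp+1 = 0x41, ip+1 = 0x44, lp+1 = 0x48.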
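Stride is easy to measure. In this sketch (function names are mine) each helper takes a pointer, adds 1, and reports how many bytes the address actually moved.

```c
#include <stddef.h>

/* Bytes between cp and cp + 1. Always 1, because sizeof(char) is 1. */
size_t char_stride(void) {
    char pair[2] = {0, 0};
    char *cp = pair;
    return (size_t)((cp + 1) - cp);
}

/* Bytes between ip and ip + 1, measured by viewing both as char *. */
size_t int_stride(void) {
    int pair[2] = {0, 0};
    int *ip = pair;
    return (size_t)((char *)(ip + 1) - (char *)ip);
}
```

char_stride() is 1 and int_stride() is sizeof(int), which is 4 on the usual targets. Same "+1" in the source, different distance in memory, decided entirely by the pointer's type.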
Stack vs Heap, What It Actually Costs
Pointers carry no size info on their own. Stack and heap track it in completely different ways.
A stack variable is packed into its function’s frame, back-to-back with its neighbors, no metadata. Allocating is a stack-pointer bump. Freeing is a frame pop on return.
“No metadata” sounds impossible, so where is the tracking? In the compiler, at compile time. When your function is compiled, the compiler decides where each local variable will sit inside the frame and bakes those positions directly into the code it generates. Nothing is looked up at runtime.
void f() {
    int a = 42;
    long b = 99;
    a = a + 1;
}
Layout the compiler picks for this frame:
┌─────────────┐
│ a (4 B) │ ← fixed spot #1
├─────────────┤
│ b (8 B) │ ← fixed spot #2
└─────────────┘
Every read or write to a in the function body goes to fixed spot #1, every time. Initialising it, reading it, incrementing it, passing it to printf, it’s all the same four bytes. The location never moves. Only the bytes at the location change. The variable a as a “thing” doesn’t exist at runtime at all, it’s just those 4 bytes, known to the compiler and then forgotten.
This is why C has no reflection and no runtime type info. The sizes, types, and names existed at compile time and were thrown away on the way down.
The heap has to track sizes at runtime because malloc sizes aren’t known until the program runs. The stack doesn’t, because everything it needs to know is fixed the moment you hit compile.
A heap variable comes with a hidden header written by malloc just in front of the pointer it hands you. The header stores the chunk’s size (and some allocator flags). free(p) walks backwards by the header size to read that number and release the whole chunk.
STACK (frame) HEAP (malloc(100))
┌──────┐ ┌──────────┬─────────────────┐
│ a │ ← 4 B, no header │ header │ your 100 B │
├──────┤ │ size=100 │ │
│ b │ ← 8 B, no header └──────────┴─────────────────┘
├──────┤ ▲ ▲
│ c │ │ │
└──────┘ actual start p (what you got)
popped on return free(p) walks back
The exact header layout depends on which allocator you’re linked against (glibc’s ptmalloc, jemalloc, tcmalloc, mimalloc, Windows’ HeapAlloc, etc). I’ll use glibc for the numbers below because it’s what Linux defaults to.
A minimum chunk size plus alignment padding means malloc(1) is never really 1 byte. On 64-bit glibc, every chunk is at least 32 bytes and aligned to 16:
┌──────────┬───┬──────────────────┐
│ header │ 1 │ padding │
│ (8 B) │ B │ (23 B) │
└──────────┴───┴──────────────────┘
32 bytes reserved for 1 byte of user space
(glibc’s chunk actually has two 8-byte size fields, but the first one overlaps the previous chunk’s user data when that chunk is in use, so only 8 bytes of “dead” space sit before p.)
A 32x overhead on a single byte. Other allocators have their own numbers but the shape is the same: a fixed minimum plus alignment dwarfs tiny requests. This is why “a million small mallocs” is slower and fatter than “one big buffer you carve up yourself”, and why game engines and high-performance servers ship custom allocators (pools, arenas, slabs) instead of calling malloc directly.
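Here is roughly what "one big buffer you carve up yourself" looks like, as a minimal sketch. The struct arena type and its three functions are hypothetical names of mine, not a real library, and I'm assuming 16-byte alignment is enough for whatever you store in it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump arena: one malloc up front, then hand out slices. */
struct arena {
    unsigned char *base;   /* start of the big buffer */
    size_t capacity;       /* total bytes in the buffer */
    size_t used;           /* bytes handed out so far */
};

/* Returns 1 on success, 0 if the underlying malloc failed. */
int arena_init(struct arena *a, size_t capacity) {
    a->base = malloc(capacity);
    a->capacity = a->base ? capacity : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Carve size bytes off the arena, rounded up to a multiple of 16
 * (assumed alignment). No per-allocation header is written. */
void *arena_alloc(struct arena *a, size_t size) {
    size_t rounded = (size + 15u) & ~(size_t)15u;
    if (rounded < size || rounded > a->capacity - a->used) {
        return NULL;       /* size overflowed, or the arena is full */
    }
    void *p = a->base + a->used;
    a->used += rounded;    /* allocating is one addition */
    return p;
}

/* One free releases everything the arena ever handed out. */
void arena_destroy(struct arena *a) {
    free(a->base);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
}
```

Allocation is a bounds check and an addition, and there is no per-chunk free to get wrong. The price is that you can only release everything at once, which happens to fit a game frame or a single server request nicely. The rounding still spends 16 bytes on a 1-byte request; a pool of fixed-size slots would trim that further.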
Two hazards fall out of the same picture:
- Heap corruption. Write one byte past the end of your allocation and you clobber a neighboring header. The next unrelated malloc/free crashes nowhere near your bug.
- free(p + 1) is undefined. No valid header sits where the allocator expects one, so it reads random bytes as a size and destroys itself.
Bonus: C strings carry no length field either, so strlen walks byte-by-byte until it hits \0. Every serious language wraps strings in a struct with a length, precisely to avoid this.
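A sketch of that wrapper, to show how little it takes. struct str and both functions are hypothetical names for illustration, not any particular language's real layout.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical string view: a pointer plus an explicit length. */
struct str {
    const char *data;
    size_t len;
};

/* Pay the byte-by-byte walk once, then remember the answer. */
struct str str_from_cstr(const char *s) {
    struct str out = { s, strlen(s) };
    return out;
}

/* Constant time: no scan for the terminating '\0'. */
size_t str_len(struct str s) {
    return s.len;
}
```

Sixteen bytes of struct on 64-bit buys you a length lookup that no longer depends on how long the string is. The address says where, the length says how much.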
An address says where. A type says how much. A C pointer staples them together, and if you lose either one, you’re reading garbage.
Back to the Heap
Swap the stack allocation for the malloc from the last section and the picture barely changes:
int *p = malloc(sizeof(int));
*p = 42;
p now holds an address somewhere in the heap instead of the stack, and that address holds 42. Same arrows, different neighborhood. That’s the entire model. Linked lists, trees, graphs, the way every dynamic language stores objects: all of it is built out of this one idea.
What Goes Wrong Without a GC
Manual memory is small in code and large in consequences. Four classic failures:
- Leak. You malloc, you forget to free. The chunk stays reserved until the process exits. Tiny in a CLI. Lethal in a server.
- Dangling pointer. You free the chunk but keep the pointer. The pointer still has the old address, but that address now belongs to something else, or to nothing at all.
- Use-after-free. You dereference that dangling pointer. Reading it hands you whatever happens to be at that address now (garbage, another object’s data, zeros). Writing it silently corrupts whatever just moved in.
- Double free. You call free on the same chunk twice. The allocator’s internal bookkeeping breaks and the next malloc can hand you a chunk that’s already in use.
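Every language with a garbage collector or a compile-time ownership checker exists to take these bugs off your plate. Dangling pointers, use-after-free, and double free become impossible in safe code. Leaks get much rarer, though a forgotten reference can still keep memory alive.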
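In C the defence is discipline. One common habit, sketched here with a hypothetical helper name: never leave a freed pointer holding its old address.

```c
#include <stdlib.h>

/* Hypothetical helper: free the chunk and clear the caller's pointer
 * so it cannot dangle. Takes the address of the pointer itself. */
void free_and_clear(int **pp) {
    free(*pp);     /* free(NULL) is defined to do nothing */
    *pp = NULL;    /* the old address is gone from this variable */
}
```

Call it as free_and_clear(&p). A second call is harmless because free(NULL) is a no-op, and a stray dereference of p typically crashes on the spot on mainstream platforms instead of quietly corrupting a neighbor. It only protects this one variable, though. Any copies of the pointer made earlier are still dangling, which is exactly the gap a GC or an ownership checker closes.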
How a Garbage Collector Earns Its Keep
Two big families of automatic memory management show up in the wild. Reference counting keeps a counter on every object and frees it the instant the count hits zero. CPython, Swift, and Objective-C’s ARC all work this way. Cycles (A points at B, B points at A, nothing else reaches either) are a blind spot: the counts never hit zero. CPython bolts on a cycle detector. Swift and Objective-C leave it to you to break cycles with weak references. Tracing garbage collection ignores counts entirely and instead, every so often, walks the object graph from the roots and frees whatever it couldn’t reach. The runtimes behind Go, Java, JavaScript, and C# all use tracing GCs.
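Reference counting is small enough to write by hand. A minimal sketch, with type and function names of my own invention and no thread safety (real runtimes need atomic counters or a lock for that):

```c
#include <stdlib.h>

/* Hypothetical refcounted box around one int. */
struct rc_int {
    size_t refs;   /* how many owners currently hold this object */
    int value;
};

/* Create the object with one owner: whoever called rc_new. */
struct rc_int *rc_new(int value) {
    struct rc_int *obj = malloc(sizeof *obj);
    if (obj != NULL) {
        obj->refs = 1;
        obj->value = value;
    }
    return obj;
}

/* A new owner takes a reference. */
struct rc_int *rc_retain(struct rc_int *obj) {
    obj->refs++;
    return obj;
}

/* An owner lets go. Returns 1 if this call freed the object. */
int rc_release(struct rc_int *obj) {
    if (--obj->refs == 0) {
        free(obj);   /* last owner gone: reclaim immediately */
        return 1;
    }
    return 0;
}
```

Every retain and release is a write to the object, which is the per-operation cost refcounted runtimes pay. And two of these pointing at each other would hold each other at 1 forever, which is the cycle problem in miniature.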
The canonical tracing algorithm is mark and sweep.
roots: [r1] [r2]
│ │
▼ ▼
┌─────┐ ┌─────┐ ┌─────┐
│ A │ │ B │ │ C │ ← C is unreachable
└──┬──┘ └─────┘ └─────┘
│
▼
┌─────┐
│ D │
└─────┘
Mark: start at the roots (globals, the stack, CPU registers), follow every pointer, paint everything you can reach. Sweep: anything not painted is garbage. Free it.
after sweep:
┌─────┐ ┌─────┐
│ A │ │ B │
└──┬──┘ └─────┘
│
▼
┌─────┐
│ D │
└─────┘
That’s the conceptual starting point for every tracing GC. Real runtimes build on it heavily: most split the heap into a young and old generation, and the young generation is usually collected with a copying algorithm (Cheney-style) rather than plain sweep. Surviving objects are copied into a fresh space and everything left behind is freed in one shot. Old generations often use mark-compact to avoid fragmentation. Go, HotSpot Java, and V8 all mix these. The shared idea is the same: start from the roots, find what’s reachable, reclaim the rest. The variants differ in how they reclaim, not what they consider garbage.
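Plain mark and sweep fits in a page. This is a toy sketch under loud assumptions: a fixed array stands in for the heap, each object has at most two outgoing pointers, the roots are passed in explicitly, and "freeing" just flips a flag. All names are hypothetical.

```c
#include <stddef.h>

#define MAX_CHILDREN 2

/* Hypothetical toy object: up to two outgoing pointers and a mark bit. */
struct obj {
    struct obj *child[MAX_CHILDREN];
    int marked;   /* painted during mark */
    int live;     /* 0 once sweep has reclaimed it */
};

/* Mark: paint o and everything reachable from it. */
void mark(struct obj *o) {
    if (o == NULL || o->marked) {
        return;   /* nothing there, or already painted (this also stops cycles) */
    }
    o->marked = 1;
    for (size_t i = 0; i < MAX_CHILDREN; i++) {
        mark(o->child[i]);
    }
}

/* Sweep: reclaim every unpainted object and clear the paint on
 * survivors. Returns how many objects were reclaimed. */
size_t sweep(struct obj *heap, size_t count) {
    size_t reclaimed = 0;
    for (size_t i = 0; i < count; i++) {
        if (!heap[i].live) {
            continue;             /* already reclaimed earlier */
        }
        if (heap[i].marked) {
            heap[i].marked = 0;   /* reset for the next collection */
        } else {
            heap[i].live = 0;     /* a real collector returns the memory here */
            reclaimed++;
        }
    }
    return reclaimed;
}

/* One full collection: mark from every root, then sweep the heap. */
size_t collect(struct obj *heap, size_t count,
               struct obj **roots, size_t nroots) {
    for (size_t i = 0; i < nroots; i++) {
        mark(roots[i]);
    }
    return sweep(heap, count);
}
```

Wire up the diagram above (roots at A and B, A pointing at D, C on its own) and collect reclaims exactly one object, C. Note that the mark check handles cycles for free, which is the thing reference counting couldn't do. Real collectors replace the recursion with an explicit worklist so a long chain of objects can't overflow the stack.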
Four Memory Models
Modern languages pick one of four answers to “who frees this”:
| Model | Languages | Cost |
|---|---|---|
| Manual | C | Bugs are on you. Zero runtime overhead. |
| Tracing GC | Java, Go, JavaScript, C# | GC pauses, more memory used than needed. |
| Reference counting (+GC) | CPython, Swift, Obj-C ARC | Per-op counter bumps. Cycles need a backup collector (CPython) or weak references (Swift, Obj-C). |
| Ownership | Rust | A compile-time checker. Steeper to learn. |
C is at one end. Rust is at the other. Everything else is in the middle, paying some runtime cost (pauses, refcount ops, extra memory) so you don’t have to think about lifetimes.
You don’t have to like C to benefit from knowing C. Once you’ve held the manual end of the rope, you understand exactly what the GC and the ownership checker are doing for you, and what they cost.
The Series
Five builds, each its own post. Every one of them is a thing your everyday language gives you for free. Building it once in C makes the bill itemized.
- Linked list. The simplest data structure that actually needs the heap.
- Hash table. The dict, the map, the object. What’s actually in the box.
- Binary search tree. Pointers and recursion in one shape.
- CLI program. argv, file I/O, error handling. A real tool, not a toy.
- Network server. Sockets, accept loops, the handful of syscalls underneath every web framework.
Each post will end with a line about what the build cost in C and what your runtime is doing on your behalf to hide that cost.
Closing
Vibe-coding isn’t the problem. Vibe-coding without judgment is. The fastest way to build judgment about what your code costs is to spend a weekend writing code that doesn’t hide the cost. That’s what this series is.
Next post: a linked list, from struct node to free_list.
Glossary
- Stack: the region of memory used for function call frames. Automatically managed.
- Heap: the region of memory used for dynamically allocated objects. Manually managed in C, automatically in GC languages.
- Pointer: a variable that holds a memory address.
- malloc / free: the C standard library functions for asking for and returning heap memory.
- Garbage collector (GC): a runtime system that frees memory no longer reachable from the program.
- Tracing GC: a GC that walks reachable objects from a set of roots (mark) and frees the rest (sweep).
- Reference counting: an alternative GC strategy where each object tracks how many pointers reference it; freed when the count hits zero.
- Ownership: Rust’s compile-time approach to memory management. Every value has exactly one owner and is freed when the owner goes out of scope.
- Use-after-free: reading or writing a pointer to memory that has already been freed.
- Memory leak: heap memory that was allocated but never freed and is no longer reachable.
Game Programmer & Co-Founder of PixelPunch LLP. I ship games, build tools, and make the web work harder.