Routing Back to C [Part 2]: A Linked List, One Node at a Time
Part 2 of Routing Back to C. A singly-linked list in plain C, with diagrams for every malloc, every pointer move, and every free.
Part 1 set the stage: stack, heap, pointers, and what it means to manage memory by hand. This post picks the smallest structure that actually needs the heap, and builds it from zero.
A linked list is first because every other pointer-based structure (hash tables, trees, graphs) is a twist on the same idea: nodes on the heap, linked by pointers.
By the end we’ll put it all together in a real list.c, compile it, and run it under valgrind to prove there are no leaks. The theory comes first; the practical build is at the end.
The Shape
A linked list is a chain of nodes. Each node holds a value and a pointer to the next node. The last node points at nothing.
head
│
▼
┌───────┐ ┌───────┐ ┌───────┐
│ 10 │ │ 20 │ │ 30 │
├───────┤ ├───────┤ ├───────┤
│ next ─┼──→ │ next ─┼──→ │ next ─┼──→ NULL
└───────┘ └───────┘ └───────┘
Three things to notice:
- Each node is its own heap allocation. The boxes are not neighbours in memory. They could be 4 bytes apart or 4 megabytes apart. The
nextpointer is the only thing that knows. - There is no “list” object. The list is the head pointer plus whatever you reach by following
next. NULLis the stopper. No sentinel, no length field. You walk until you hitNULL.
The Node
typedef struct node {
int value;
struct node *next;
} node;
One 4-byte int and one 8-byte pointer. On 64-bit with the usual alignment, the compiler lays it out like this:
0 4 8 16
├─────────┼─────────┼──────────────────┤
bytes: │ value │ padding │ next │
│ (4 B) │ (4 B) │ (8 B) │
└─────────┴─────────┴──────────────────┘
▲
start of node
total: 16 bytes
The 4 bytes of padding are there so next lands on an 8-byte boundary. The value is 4 useful bytes out of 16, the rest is pointer plus padding.
Why the Padding Here?
Different padding from the malloc(1) kind in Part 1. This one is set by the compiler, not the allocator. The rule in one line:
A type of size N wants to sit at an address divisible by N.
So:
int= 4 B → address divisible by 4- pointer on 64-bit = 8 B → address divisible by 8
- pointer on 32-bit = 4 B → address divisible by 4
“Pointer on 64-bit wants 8-byte alignment” means the address where the pointer itself lives must be a multiple of 8, so the CPU can load it in one memory transaction. The address the pointer points to has nothing to do with it.
aligned load (1 transaction): unaligned load (2 transactions):
│────────│────────│────────│ │────────│────────│────────│
0 8 16 24 0 8 16 24
▲ ▲
address 8, divisible by 8 address 4, straddles words
CPU grabs one word ✓ CPU reads two, stitches ✗ slow
Lay the fields out naively in declaration order:
offset: 0 4 ?
├─────────┼─────────┼
│ value │ next? │ ← next at offset 4
│ (4 B) │ (8 B) │ NOT a multiple of 8
└─────────┴─────────┘
✗ misaligned
Offset 4 isn’t divisible by 8, so next can’t live there. Compiler slides it to offset 8 and fills the gap with dead bytes, which is the layout you saw above.
Why alignment matters in the first place:
- CPU speed. Aligned 8-byte loads are one memory transaction. Unaligned loads can span two cache lines and double the traffic. x86 tolerates it and pays the tax; older ARM would just trap.
- Atomics. An atomic load or store on a pointer needs alignment. Unaligned means either illegal or silently split into two operations, which breaks atomicity.
- Arrays still work. Struct size rounds up to its largest alignment (here, 8). So
node arr[10]places element 1 at offset 16, element 2 at offset 32, each withnexton an 8-byte boundary. Without that, every other element would be misaligned.
Flipping the field order doesn’t save bytes:
struct node {
struct node *next; // offset 0, 8 B
int value; // offset 8, 4 B
}; // size still 16 B (4 B tail padding)
The padding moves to the tail, but the struct is still 16 B, because the next element in an array still has to start on an 8-byte boundary.
Rule of thumb: sort fields largest-first to push padding to the tail. For mixed sizes, reorder until the total shrinks. Some projects use #pragma pack to disable padding entirely, trading speed for size (network protocols, on-disk layouts).
First cost surfaces already: storing a 4-byte number in a linked list takes 16 bytes of node, plus whatever the allocator adds on top.
An Empty List
An empty list is a null head pointer. No heap, no nodes, nothing to free.
node *head = NULL;
head
│
▼
NULL
head is a stack variable, 8 bytes in the caller’s frame, holding the address 0. NULL is how C says “this pointer points at nothing”. Dereference a NULL pointer and the OS traps you with a segfault.
Creating One Node
One node is one malloc. Ask the allocator for sizeof(node) bytes, set the fields, return the pointer.
node *new_node(int value) {
node *n = malloc(sizeof(node));
if (n == NULL) return NULL; // out of memory
n->value = value;
n->next = NULL;
return n;
}
Step by step when you call new_node(10):
1. malloc carves a chunk off the heap.
STACK HEAP
┌──────────┐ ┌──────┬──────────────────┐
│ n │ │header│ 16 B for node │
│ (8 B ptr)│─────────────→ │ │ ?? | ?? | ?? ... │
└──────────┘ └──────┴──────────────────┘
The 16 bytes are uninitialised. They hold whatever garbage was last there.
2. Assign fields.
STACK HEAP
┌──────────┐ ┌──────┬──────┬──────┬──────┐
│ n │ │header│ 10 │ pad │ NULL │
│ │─────────────→ │ │ (4 B)│ (4 B)│ (8 B)│
└──────────┘ └──────┴──────┴──────┴──────┘
3. Return the pointer. The stack slot n goes away when new_node returns, but the heap chunk stays. The caller now holds the only pointer to it.
-> is shorthand: n->value is the same as (*n).value. “Follow the pointer, then read the field.” C programmers use -> because writing (*n).value everywhere is ugly.
What sizeof Knows, What malloc Adds
Two layers of padding stack up around every node, and it helps to keep them separate:
Inside sizeof(node) = 16 B: no new padding from malloc. The 4 B gap between value and next is already baked into sizeof. The allocator doesn’t know fields exist, it just sees a flat “give me 16 bytes” request.
Around your 16 B: yes, malloc adds its own padding. Two kinds:
- Header before the chunk (8 B on 64-bit glibc) for size bookkeeping.
- Trailing slack to hit the minimum chunk size (32 B on 64-bit glibc) and the 16-B alignment multiple.
┌──────────┬──────────────────────────────┬──────────┐
│ header │ value | pad | next │ slack │
│ (8 B) │ 4 B | 4 B | 8 B │ (8 B) │
└──────────┴──────────────────────────────┴──────────┘
▲ ▲ ▲
chunk n (what malloc returns) padding to
start reach 32 B min
- 8 B header + 16 B user + 8 B trailing slack = 32 B chunk for a 16 B struct.
- A 24 B struct would land exactly on 32 B with no trailing slack.
- A 40 B struct would push the chunk to 48 B (next multiple of 16 after header + user).
Values: malloc never touches them. The bytes it returns hold whatever was there from the last chunk in that slot. Reading a field before you set it is undefined behaviour. Use calloc(1, sizeof(node)) if you want zeros:
malloc(sizeof(node)) calloc(1, sizeof(node))
┌─────┬─────┬─────┐ ┌─────┬─────┬─────┐
│ ??? │ ??? │ ??? │ │ 0 │ 0 │ NULL│
└─────┴─────┴─────┘ └─────┴─────┴─────┘
read before set → UB safe to read
For a node, you’re assigning every field before reading anyway, so malloc is fine. calloc earns its keep when you allocate structs with many fields you don’t always touch, or large buffers you’d otherwise have to memset yourself.
Short version of the whole padding story:
sizeof= compiler padding baked in.mallocadds header + trailing slack outside yoursizeof.- Neither initialises your bytes. Set every field, or use
calloc.
Insert at the Head
The fastest insert: make a new node, point it at the old head, make it the new head. Constant time, no walking the list.
void push(node **head, int value) {
node *n = new_node(value);
n->next = *head;
*head = n;
}
Three things happen. Watch each one on [10 → 20] with push(&head, 5).
State before:
head ─→ ┌────┐ ┌────┐
│ 10 │──→ │ 20 │──→ NULL
└────┘ └────┘
Step 1. new_node(5) allocates a fresh node with next = NULL:
head ─→ ┌────┐ ┌────┐
│ 10 │──→ │ 20 │──→ NULL
└────┘ └────┘
n ─→ ┌────┐
│ 5 │──→ NULL
└────┘
Step 2. n->next = *head;, point the new node at the current head:
head ─→ ┌────┐ ┌────┐
│ 10 │──→ │ 20 │──→ NULL
└────┘ └────┘
▲
│
┌─┴──┐
n ─→ │ 5 │
└────┘
5 → 10 → 20 → NULL is now a valid chain, but head still points at 10.
Step 3. *head = n;, move the head pointer to the new node:
head ─→ ┌────┐ ┌────┐ ┌────┐
│ 5 │──→ │ 10 │──→ │ 20 │──→ NULL
└────┘ └────┘ └────┘
Done. Order matters: if you wrote Step 3 before Step 2, *head = n; n->next = *head; would set n->next to n itself, an infinite loop waiting to happen.
Why node **?
push takes a node **head, not a node *head. A pointer to a pointer. To see why, first get the three levels straight.
The mental model: each * walks one step deeper into memory.
&= “give me the address of” (adds a level of indirection)*= “follow the address, give me what’s there” (removes a level)
So inside push:
head a pointer to a pointer
│
│ * (follow once)
▼
*head a pointer to a node
│
│ * (follow again)
▼
**head the node itself — value and next fields
Concrete with made-up addresses:
| Expression | Type | Value | In words |
|---|---|---|---|
head | node ** | 0x7FFF0008 | address of the caller’s head variable |
*head | node * | 0x55001000 | address of the first node |
**head | node | {10, 0x5500...} | the first node’s bytes (not an address) |
* and & are opposites: *(&x) == x, &(*p) == p.
Now the “why **” part.
If you passed node *head, the function would get a copy of the pointer. Changes to the copy don’t touch the caller’s variable.
caller push (takes node *)
┌──────┐ ┌──────┐
│ head │─→ (old head) │ head │─→ (old head) ← same target
└──────┘ └──────┘
push reassigns its local copy:
┌──────┐ ┌──────┐
│ head │─→ (old head) │ head │─→ (new node) ← only the copy moves
└──────┘ └──────┘
The caller’s head is unchanged. To change it, push needs its address, not its value. That’s what ** gives you.
caller push (takes node **)
┌──────┐ ┌───────┐
│ head │ ←──────────────────┤ head │ param holds caller's address
└──┬───┘ └───┬───┘
│ │ *
▼ ▼
(node) *head = n; ← writes to caller's slot
Inside push, *head = n means: follow the parameter one step to reach the caller’s head variable, and overwrite it with n. The caller’s pointer now points at the new node.
Quick reference for working with node **head:
- Change which node head points to →
*head = new_node; - Read or change the first node’s fields →
(*head)->value,(*head)->next ->auto-dereferences once, so(*head)->valueis the same as(**head).value, just cleaner.- You rarely write
**headdirectly.->covers the common case.
This pattern shows up every time a C function needs to mutate a caller’s pointer.
Traverse
Walking the list is a for loop with a cursor.
void print_list(node *head) {
for (node *cur = head; cur != NULL; cur = cur->next) {
printf("%d ", cur->value);
}
printf("\n");
}
cur starts at head and hops along next until it lands on NULL.
iter 1: cur ─→ [ 5]──→[10]──→[20]──→ NULL print 5
iter 2: [ 5]──→[10]──→[20]──→ NULL print 10
cur ───┘
iter 3: [ 5]──→[10]──→[20]──→ NULL print 20
cur ───┘
iter 4: cur == NULL → loop exits
No indexing, no length. You only know the list is done when cur hits NULL. This is also why print_list is O(n) while arr[5] is O(1), the array knows where element 5 lives without walking.
Delete a Node by Value
Remove the first node whose value matches. Two pointers: cur for the node under inspection, prev for the one before it. You need prev to re-link around the node you’re about to free.
void delete_value(node **head, int value) {
node *cur = *head;
node *prev = NULL;
while (cur != NULL && cur->value != value) {
prev = cur;
cur = cur->next;
}
if (cur == NULL) return; // not found
if (prev == NULL) {
*head = cur->next; // removing the head
} else {
prev->next = cur->next; // skip cur
}
free(cur);
}
Walk delete_value(&head, 20) on [5 → 10 → 20 → 30].
Start. prev = NULL, cur = head.
prev = NULL
cur ─→ ┌────┐ ┌────┐ ┌────┐ ┌────┐
│ 5 │──→ │ 10 │──→ │ 20 │──→ │ 30 │──→ NULL
└────┘ └────┘ └────┘ └────┘
Advance until cur->value == 20:
prev cur
│ │
▼ ▼
┌────┐ ┌────┐ ┌────┐ ┌────┐
│ 5 │──→ │ 10 │──→ │ 20 │──→ │ 30 │──→ NULL
└────┘ └────┘ └────┘ └────┘
Re-link. prev->next = cur->next;, point 10 directly at 30, skipping 20:
prev cur
│ │
▼ ▼
┌────┐ ┌────┐ ┌────┐ ┌────┐
│ 5 │──→ │ 10 │──┐ │ 20 │──→ │ 30 │──→ NULL
└────┘ └────┘ │ └────┘ └────┘
│ ▲
└──────────────┘
Free. 20 is now unreachable from head. Release its chunk back to the allocator.
┌────┐ ┌────┐ ┌────┐
│ 5 │──→ │ 10 │───────────→ │ 30 │──→ NULL
└────┘ └────┘ └────┘
Two edge cases the code handles:
- Value not found. Loop exits with
cur == NULL. Return early, no free. - Removing the head.
prevstaysNULL. Write*head = cur->nextto move the head forward before freeingcur.
Order inside the function matters: always read cur->next before free(cur). Once the chunk is freed, any read through cur is undefined.
Freeing the Whole List
Every node was one malloc. Every node needs one free. The tricky part: you can’t free a node and then read its next.
void free_list(node *head) {
node *cur = head;
while (cur != NULL) {
node *next = cur->next; // read BEFORE free
free(cur);
cur = next;
}
}
Two pointers: cur is the node to free right now, next is the one to jump to. The order of the three lines inside the loop is load-bearing.
Trace on [5 → 10 → 20]:
iter 1: cur=[ 5] next=[10] free( 5) cur ← [10]
iter 2: cur=[10] next=[20] free(10) cur ← [20]
iter 3: cur=[20] next=NULL free(20) cur ← NULL
iter 4: cur == NULL → exit
What if you swap the order and write free(cur); cur = cur->next;?
free(cur) cur->next (dangling read)
│ │
▼ ▼
┌───────┐ ┌─────────┐
│ freed │ ← chunk released │ ??????? │ ← reading a freed
└───────┘ └─────────┘ chunk. undefined.
The chunk has been handed back to the allocator. Reading cur->next now pulls whatever bytes happen to live there, allocator bookkeeping, a zero, another object’s data. In a release build it might even appear to work for a while, then crash somewhere unrelated days later. This is the class of bug that defines C careers.
Setup
If you don’t already have a C toolchain and valgrind, install them from the official sources for your OS:
- Linux (gcc): gcc.gnu.org/install or your distro’s package manager.
- macOS (clang): Xcode Command Line Tools.
- Windows (MSYS2 / MinGW-w64): msys2.org.
- Windows (WSL): Install WSL, then follow the Linux instructions inside it.
- valgrind: valgrind.org/downloads (Linux and WSL only; on macOS use
leaksor AddressSanitizer via-fsanitize=address).
Files You’ll Write
One source file, no headers. Create an empty directory and drop a single list.c inside it:
linked-list/
└── list.c
Everything in this post, the node struct, new_node, push, print_list, delete_value, free_list, and main, goes into that one file, in that order. At the top of list.c, add the two standard-library includes the code uses:
#include <stdio.h> // printf
#include <stdlib.h> // malloc, free
// struct + functions + main go here, in the order they appear in the post
Then compile and run from the same directory:
$ cc -Wall -Wextra -o list list.c
$ ./list
No build system, no Makefile. One file in, one binary out. If you want to split into list.h and list.c later, that’s a natural exercise once the whole thing runs.
Putting It Together
int main(void) {
node *head = NULL;
push(&head, 30);
push(&head, 20);
push(&head, 10);
print_list(head); // 10 20 30
delete_value(&head, 20);
print_list(head); // 10 30
free_list(head);
return 0;
}
Build and run:
$ cc -Wall -Wextra -o list list.c
$ ./list
10 20 30
10 30
Then run it under valgrind:
$ valgrind --leak-check=full ./list
...
All heap blocks were freed -- no leaks are possible
ERROR SUMMARY: 0 errors from 0 contexts
If valgrind reports “definitely lost: N bytes”, you missed a free. If it reports “invalid read”, you used a pointer after freeing it. There is no hiding.
What It Cost
Every operation visible, nothing hidden:
| Operation | What you wrote | What a runtime would do |
|---|---|---|
| Create | malloc(sizeof(node)) | hidden in new or literal |
| Insert | re-point two pointers | same, but not in your face |
| Delete | re-link, then free | mark unreachable, GC later |
| Traverse | walk next until NULL | same |
| Free all | walk and free each node | sweep phase of GC |
Python’s list.append calls a C malloc underneath. Java’s LinkedList.add allocates a node on the JVM heap. Go’s slice growth reallocates into a fresh array. The heap call is always there. C just makes you write it by name.
Reminder from Part 1: on 64-bit glibc, each malloc(16) reserves at least 32 bytes. A list of a million ints inside nodes costs ~32 MB of heap for ~4 MB of actual numbers. Eight times the data. Every linked-list-heavy hot loop has this bill attached.
What’s Next
Part 3: a hash table. Same malloc/free, same node discipline, but now with an array of buckets, a hash function, and a plan for when two keys land in the same slot. That’s where collision resolution, load factor, and the word “amortized” enter the picture.
Glossary
- Node: one element of a linked list. Holds a value and a pointer to the next node.
- Head: the first node in a list. Losing the head pointer leaks the whole list.
- NULL: the address
0. C uses it to mean “this pointer points at nothing”. ->: follow a pointer and read a field.p->xis(*p).x.- Pointer to a pointer: the address of a pointer variable. Lets a function mutate the caller’s pointer.
- Use-after-free: reading or writing a pointer already passed to
free. Undefined behaviour. - valgrind: a tool that runs your program and reports memory errors and leaks.
Game Programmer & Co-Founder of PixelPunch LLP. I ship games, build tools, and make the web work harder.
Know more →