root.system / 0x09 / indirection

A number
that means somewhere.

A pointer is just an integer, the same kind covered on the binary page. What makes it different is the meaning we give it: this number is the address of something else in memory. Every dynamic data structure, every reference, every callback, every syscall buffer in your program is built from this single idea. So are most of the famous bugs in the history of software.

The most powerful concept in programming has seven letters.

Pointer.

It is also the most dangerous.

Every dynamic data structure.
Every recursive algorithm.
Every network socket.
Every file descriptor.
Every callback and virtual function.
Every kernel buffer and device register.

All of it is this one idea.

A number that happens to be an address.

You already know everything you need to understand it.

You know binary from page 2.
A pointer is a binary number.

You know memory from page 6.
Every byte has an address.

You know variables from page 8.
A variable stores a value.

A pointer is a variable whose value is an address.

That's it.
That's the whole thing.

The complexity comes not from what a pointer is, but from what happens when you get one wrong.

Beginner // level 01

What's a pointer?

You already saw, on the memory page, that every byte in your program's address space has a number stamped on it. A pointer is a variable whose value is one of those numbers. Read through the pointer (dereference it) and you read the value stored at that address. Write through the pointer and you overwrite the value stored there.

That's the whole mechanism. Every "reference", "handle", "object", and "ID" in every language is, somewhere underneath, this idea.

[Diagram: the pointer p = 0x4000 (8 bytes) points at the value in memory, x = 42 (4 bytes), stored at address 0x4000.]

That address in the pointer diagram. 0x4000. Written in hex, because hex is how humans read binary addresses. A 64-bit pointer is eight bytes: eight bytes of binary, the same binary from page 2, the same hex from page 1. The only new thing is the meaning: this number points somewhere. ← see: Binary · Number Systems

On a 64-bit system, a plain pointer is always 8 bytes wide, no matter what it points at. A pointer to an i32 is 8 bytes. A pointer to a 1 GB array is 8 bytes. A pointer to another pointer is 8 bytes. (Rust's slice and trait-object pointers are the one wrinkle: they carry a length or a vtable pointer next to the address, so they take 16.) The type attached to a pointer is the compiler's way of remembering how to interpret the bytes at the destination; the pointer itself is always just one address.
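
The width claim is easy to check by asking the compiler. This is a minimal sketch using only the standard library; `pointer_widths` is a name made up for the example. It compares against `usize` instead of hard-coding 8, so the assertion also holds on a 32-bit target.

```rust
use std::mem::size_of;

/// Sizes, in bytes, of pointers to three very different targets.
/// (`pointer_widths` is an illustrative helper, not a std API.)
fn pointer_widths() -> [usize; 3] {
    [
        size_of::<*const i32>(),           // pointer to a 4-byte integer
        size_of::<*const [u8; 1 << 30]>(), // pointer to a 1 GiB array
        size_of::<*const *const i32>(),    // pointer to a pointer
    ]
}

fn main() {
    let word = size_of::<usize>(); // 8 on a 64-bit target
    for width in pointer_widths() {
        assert_eq!(width, word);
    }
    println!("each of these pointers is {word} bytes wide");
}
```

No 1 GiB array is ever allocated here: `size_of` only looks at the pointer type.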

Declare, take an address, dereference

Rust• • •
fn main() {
    let x: i32 = 42;

    // RAW pointer: a number that happens to be an address.
    // `*const i32` reads "pointer to a constant i32".
    let p: *const i32 = &x;

    println!("x lives at  : {:p}", &x);
    println!("p stores    : {:p}", p);

    // Dereferencing a raw pointer is `unsafe` because Rust can't
    // prove the address is still valid. With a known-good pointer
    // like this one, it's fine; in general, all the C bugs apply.
    let value = unsafe { *p };
    println!("*p          : {}", value);

    // The idiomatic Rust pointer is a *reference*, written `&T`.
    // The compiler tracks how long it's valid (its lifetime) and
    // refuses to compile code that could dereference a dead one.
    let r: &i32 = &x;
    println!("*r          : {}", *r);
}
C• • •
#include <stdio.h>

int main(void) {
    int x = 42;

    // A pointer is a variable whose value is an address.
    // `int *` reads "pointer to an int".
    int *p = &x;

    printf("x lives at  : %p\n", (void*)&x);
    printf("p stores    : %p\n", (void*)p);

    // Dereference. The compiler doesn't check that p is valid.
    printf("*p          : %d\n", *p);

    // Write through the pointer.
    *p = 100;
    printf("x is now    : %d\n", x);
    return 0;
}

Follow the pointer

[Interactive demo: a table of variables in memory, an operation panel, and the matching code. Click a pointer row or run an operation.]
let x: i32 = 42;
let p: *const i32 = &x;
// p  = 0x1000
// *p = 42

The demo has three modes. In basic mode, take an address, read through the pointer, or write through it and watch x change. Dangling mode drops x and shows what a use-after-free looks like in C versus Rust. Chain mode builds a pointer to a pointer.

// the three operators
&x says "the address of x." *p says "the thing at the address stored in p." p->field (C) and p.field (Rust) are shortcuts for "follow the pointer, then read the field." Once you internalise these three, every pointer-using language reads the same.
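
Here are the three operators side by side in safe Rust, as a small sketch. `Point` and `shift_right` are names invented for the example. Rust writes the third operator as a plain dot and follows the reference for you.

```rust
#[derive(Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Moves a point one step to the right by writing through `p`.
fn shift_right(p: &mut Point) {
    p.x += 1; // shorthand for (*p).x += 1
}

fn main() {
    let mut n = 41;
    let r = &mut n; // `&`: take the address of n
    *r += 1;        // `*`: write to the thing at that address
    assert_eq!(n, 42);

    let mut pt = Point { x: 0, y: 7 };
    shift_right(&mut pt); // `.`: follow the pointer, then touch the field
    assert_eq!(pt, Point { x: 1, y: 7 });
    println!("n = {n}, pt = {pt:?}");
}
```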
Intermediate // level 02

Why pointers exist

If pointers are just numbers that happen to be addresses, why do we go to so much trouble over them? Because they enable five things that nothing else can.

reason 01
Sharing without copying
Pass a 1 GB image to a function by handing it the 8-byte address instead of copying the bytes. Every fast language uses pointers (or references, which are pointers in disguise) for this.
reason 02
Dynamic allocation
When the size of a thing is only known at runtime, the bytes live on the heap (see /variables) and a pointer on the stack tells you where to find them. Vec, String, malloc: all the same shape.
reason 03
Recursive structures
Linked lists, trees, and graphs cannot exist without pointers. A node embedding its successor is impossible; a node pointing at its successor is trivial.
reason 04
Polymorphism
Function pointers and vtables let one call site dispatch to many implementations. Every plug-in system, every virtual method, every callback is a pointer-to-code.

And the fifth reason is the one closest to the hardware: talking to the world. Memory-mapped device registers, DMA buffers, syscall arguments, file mappings (the mmap trick from the OS page), shared memory between processes. Every one of those is a pointer that means something to the kernel, the device, or another process. Pointers are the universal handle.
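
Two of these reasons fit in a few lines. The sketch below is illustrative only, and every function name in it is invented. It passes a buffer by reference and confirms the callee sees the same address, then dispatches through a function pointer.

```rust
/// Sums a buffer through a borrowed slice: only the address and the
/// length are passed, never the bytes themselves.
fn checksum(data: &[u8]) -> u64 {
    data.iter().map(|&b| u64::from(b)).sum()
}

/// Reports the address at which the callee sees the buffer.
fn address_seen_by_callee(data: &[u8]) -> *const u8 {
    data.as_ptr()
}

fn double(n: i32) -> i32 { n * 2 }
fn negate(n: i32) -> i32 { -n }

/// One call site, many implementations: `op` is a pointer to code.
fn apply(op: fn(i32) -> i32, n: i32) -> i32 {
    op(n)
}

fn main() {
    let image = vec![1u8; 1_000_000]; // small stand-in for the 1 GB image
    assert_eq!(address_seen_by_callee(&image), image.as_ptr()); // same bytes, no copy
    println!("checksum = {}", checksum(&image));

    println!("{} {}", apply(double, 21), apply(negate, 21));
}
```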

Reason three deserves a second look. A linked-list node cannot contain another node of its own type, because its size would be infinite, but it can contain a pointer to one. A tree node cannot embed its children, but it can point to them. This is why every data structure page after this one depends on what you learn here. The arrays page does not need pointers. Every other data structure page does. ← see: Linked Lists · Hashing
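
The size argument can be checked directly. In this sketch, `Node` has the same shape as the list node used further down the page, and `Embedded` is a hypothetical type that rustc rejects.

```rust
use std::mem::size_of;

// This version does not compile: the struct would have to contain
// itself, so its size would be infinite (rustc reports error E0072).
//
// struct Embedded { value: i32, next: Option<Embedded> }

/// Pointing instead of embedding gives the node a fixed size.
struct Node {
    value: i32,
    next: Option<Box<Node>>,
}

fn main() {
    // The link is one pointer wide; `None` is stored as the null address.
    assert_eq!(size_of::<Option<Box<Node>>>(), size_of::<usize>());
    println!("a Node is {} bytes", size_of::<Node>());

    let tail = Node { value: 2, next: None };
    let head = Node { value: 1, next: Some(Box::new(tail)) };
    println!("{} -> {}", head.value, head.next.as_ref().map_or(0, |n| n.value));
}
```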

A linked list, in two languages

Linked lists are the canonical pointer example. Each node owns a value and a pointer to the next node. In C, that pointer is a raw struct Node *; in Rust, it's an Option<Box<Node>>, which is just a nullable owned pointer. The shape is identical. The guarantees are not.

Rust• • •
// Same shape in Rust: each node *owns* the next node, expressed
// as `Option<Box<Node>>`. None marks the end of the list.
struct Node {
    value: i32,
    next: Option<Box<Node>>,
}

fn cons(value: i32, next: Option<Box<Node>>) -> Box<Node> {
    Box::new(Node { value, next })
}

fn main() {
    let head = cons(1, Some(cons(2, Some(cons(3, None)))));

    let mut cur = Some(&*head);
    while let Some(node) = cur {
        print!("{} ", node.value);
        cur = node.next.as_deref();
    }
    println!();

    // No free() needed. `head` goes out of scope here; Drop
    // walks the chain and releases every allocation in order.
}
C• • •
#include <stdio.h>
#include <stdlib.h>

// A self-referential struct: each node holds a pointer to the next
// node, or NULL at the end. This is impossible without pointers.
typedef struct Node {
    int value;
    struct Node *next;
} Node;

Node *cons(int v, Node *next) {
    Node *n = malloc(sizeof *n);
    if (!n) {                 // malloc can fail; never write through NULL.
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    n->value = v;
    n->next  = next;
    return n;
}

int main(void) {
    // Build the list  1 -> 2 -> 3 -> NULL.
    Node *head = cons(1, cons(2, cons(3, NULL)));

    // Walk it.
    for (Node *cur = head; cur; cur = cur->next)
        printf("%d ", cur->value);
    putchar('\n');

    // Free it. Forget this step and the memory leaks.
    while (head) {
        Node *next = head->next;
        free(head);
        head = next;
    }
    return 0;
}
// pointer-chasing has a price
Every dereference is a load instruction. The CPU page covered why this matters: if the next node is not already in cache, the load stalls for tens to hundreds of cycles. A Vec<T> beats a linked list on almost every modern workload, because contiguous memory is what caches were built for.
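
Timing is machine-dependent, but the layout difference itself can be printed. This is a rough sketch; `stride` and `gap` are invented helpers. Neighbouring Vec elements sit exactly one element apart, while two separate heap allocations land wherever the allocator puts them.

```rust
/// Byte distance between the first two elements of a slice.
/// Panics if the slice has fewer than two elements.
fn stride(items: &[i32]) -> usize {
    let first = &items[0] as *const i32 as usize;
    let second = &items[1] as *const i32 as usize;
    second - first
}

/// Byte distance between two independently allocated values.
fn gap(a: &i32, b: &i32) -> usize {
    (a as *const i32 as usize).abs_diff(b as *const i32 as usize)
}

fn main() {
    let v = vec![10, 20, 30, 40];
    println!("Vec stride : {} bytes", stride(&v)); // always size_of::<i32>()

    let (a, b) = (Box::new(10), Box::new(20));
    println!("Box gap    : {} bytes", gap(&a, &b)); // allocator-dependent
}
```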
Advanced // level 03

Why pointers are dangerous (and how Rust changes the game)

A pointer is an unchecked promise. The compiler trusts that the address it stores is valid; the type system trusts that the bytes there match the declared type; the programmer trusts the address won't be reused or released while still in use. Each of those trusts is a bug waiting to happen.

The five classic pointer bugs

| bug | what happens | trigger |
|---|---|---|
| Null dereference | Read through a pointer that's NULL or nullptr. Usually a segfault. | Forgetting to check malloc's return, or following a missing parent in a tree. |
| Use-after-free | Read or write through a pointer to memory that's already been released. | Freeing one alias while another still points at the same allocation. |
| Double free | Calling free twice on the same address. Corrupts the allocator's bookkeeping; later allocations alias or crash. | Two pointers to the same block, both calling free. |
| Wild pointer | Dereferencing an uninitialised pointer. Reads from a random address. | Declaring int *p; in C and using it without first assigning a real address. |
| Out-of-bounds | Pointer arithmetic that walks past the end of an allocation. | p + n where n exceeds the buffer length. The basis of most buffer-overflow exploits. |

Every one of those is undefined behaviour in C. The compiler is allowed to assume they never happen, so when one does, the program can do anything. A large share of the memory-safety CVEs filed over the decades trace back to these five bugs.

Heartbleed was one of them: a buffer over-read. OpenSSL's TLS heartbeat handler trusted the length field in an incoming request instead of the payload's real size, so it copied up to 64 kilobytes of adjacent server memory back to any attacker who asked. Certificates. Private keys. Passwords. All from one pointer that went too far. The compiler said nothing. The bug shipped. ← see: Operating System (the OS is written in C; these bugs are why Rust matters there)
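
Heartbleed's over-read came from trusting a length supplied by the peer. The sketch below is a deliberately simplified model, not OpenSSL's code or API. `heartbeat_reply` is an invented name, and it shows only how a bounds-checked slice turns that request into a refusal.

```rust
/// Echoes back `claimed_len` bytes of `payload`, the way a heartbeat
/// reply does. `get` checks the range first, so a length that exceeds
/// the real payload yields `None` instead of neighbouring memory.
fn heartbeat_reply(payload: &[u8], claimed_len: usize) -> Option<&[u8]> {
    payload.get(..claimed_len)
}

fn main() {
    let payload = b"bird";

    // Honest request: the claimed length matches the payload.
    assert_eq!(heartbeat_reply(payload, 4), Some(&payload[..]));

    // Heartbleed-shaped request: 4 real bytes, 65,535 claimed.
    assert_eq!(heartbeat_reply(payload, 65_535), None);

    println!("over-long request rejected");
}
```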

The same mistake, two languages

Rust• • •
// The same logical mistake. Rust refuses to compile it.
fn main() {
    let a = Box::new(42);
    let b = &a;            // borrow: b lives only as long as a does.

    println!("{}", *a);

    drop(a);               // explicitly release.

    // println!("{}", **b);
    //
    // Uncomment the line above and the build fails at `drop(a)`:
    //   error[E0505]: cannot move out of `a` because it is borrowed
    //
    // The borrow checker saw that `b` would still be in use after
    // `a` was released. Compilation stops; no binary is produced.
    //
    // In safe Rust the entire class of "use after free" is eliminated,
    // not by a runtime check, but by refusing to build programs that
    // could express it.
}
C• • •
#include <stdio.h>
#include <stdlib.h>

int *make(int v) {
    int *p = malloc(sizeof *p);
    if (!p) exit(EXIT_FAILURE);  // never write through an unchecked malloc.
    *p = v;
    return p;
}

int main(void) {
    int *a = make(42);
    int *b = a;          // both pointers alias the SAME allocation.

    printf("%d\n", *a);  // 42, fine.
    free(a);             // the allocator reclaims those 4 bytes.

    // b is now a *dangling pointer*. The compiler said nothing.
    // What this prints depends on what the allocator wrote there
    // next: maybe 42, maybe garbage, maybe a segfault, maybe an
    // attacker-controlled value. All four are valid outcomes of
    // undefined behaviour.
    printf("%d\n", *b);
    return 0;
}

How Rust eliminates four-and-a-half of the five

rule 01
Ownership
Every allocation has a single owner. When the owner goes out of scope, the memory is freed exactly once. Double-free is impossible.
rule 02
Borrowing
A reference (&T or &mut T) must not outlive the owner it points at. The compiler tracks lifetimes and refuses to compile code that could leave a reference dangling.
rule 03
No null references
Safe references are never null. Optional pointers are written Option<&T> or Option<Box<T>>; you can't read them without first checking. Null dereference is impossible.
rule 04
Bounds-checked slices
Indexing into a slice is checked at runtime; out-of-bounds panics rather than corrupts. Dereferencing a raw pointer, or offsetting one with add or offset, is allowed only inside `unsafe`.
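
Rules 03 and 04 look like this in practice. It's a small sketch using only the standard library; `first_even` is an invented helper.

```rust
/// Returns a reference to the first even number, if there is one.
/// The "no result" case is a value (`None`), not a null pointer.
fn first_even(xs: &[i32]) -> Option<&i32> {
    xs.iter().find(|&&x| x % 2 == 0)
}

fn main() {
    let xs = vec![3, 8, 5];

    // Both cases must be handled before the value can be used.
    match first_even(&xs) {
        Some(n) => println!("first even: {n}"),
        None => println!("no even number"),
    }

    // Checked access: out of range is `None`, never a stray read.
    assert_eq!(xs.get(1), Some(&8));
    assert_eq!(xs.get(10), None);

    // Plain indexing is checked as well, but it panics instead:
    // let boom = xs[10];
}
```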

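Wild pointers go too: safe Rust won't compile a read of an uninitialised variable. The "half" is a related problem the table leaves out, and one Rust doesn't eliminate: memory leaks. You can still leak by keeping an allocation alive forever (an Rc cycle, a call to Box::leak). Leaks are allowed by Rust's safety model; they're bugs, but not unsound ones.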
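
An Rc cycle is the standard way to leak without writing any `unsafe`. In this sketch, `Node` and `leaks_after_drop` are names invented for the example. A `Weak` pointer watches the allocation so the leak can be observed.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// A node that may hold a strong (owning) link to another node.
struct Node {
    link: RefCell<Option<Rc<Node>>>,
}

/// Builds two nodes, optionally links them into a cycle, drops both
/// handles, and reports whether the first node is still allocated.
fn leaks_after_drop(make_cycle: bool) -> bool {
    let a = Rc::new(Node { link: RefCell::new(None) });
    let b = Rc::new(Node { link: RefCell::new(Some(Rc::clone(&a))) });
    if make_cycle {
        *a.link.borrow_mut() = Some(Rc::clone(&b)); // a -> b -> a
    }

    // A weak pointer observes the allocation without keeping it alive.
    let watcher: Weak<Node> = Rc::downgrade(&a);
    drop(a);
    drop(b);
    watcher.upgrade().is_some()
}

fn main() {
    assert!(!leaks_after_drop(false)); // b -> a only: everything is freed
    assert!(leaks_after_drop(true));   // cycle: the counts never reach zero
    println!("the cycle leaked, and no safety rule was broken");
}
```

The usual fix is to make one direction of the cycle a `Weak` link.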

The escape hatch: unsafe and raw pointers

Sometimes Rust's rules are too restrictive. Talking to C code, writing a custom allocator, implementing a lock-free data structure, or reading memory-mapped hardware: all of these need raw pointers. Rust gives them to you. *const T and *mut T behave like C pointers. Dereferencing one requires an unsafe block, which is the language's way of saying "I, the programmer, promise this is sound; the compiler can no longer help."

The standard library is full of unsafe internally: Vec, String, HashMap, every reference-counted type. The point isn't that unsafe is forbidden; it's that most code can be written without it, and the parts that can't are explicitly marked so a reviewer can audit them.
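
The pattern is a safe function wrapped around a small, justified `unsafe` block. Here it is in miniature; `read_at` is an invented helper, and real code would simply call `items.get(i)`.

```rust
/// Reads element `i` of `items` through a raw pointer.
/// The bounds check happens once, up front; the `unsafe` block relies on it.
fn read_at(items: &[i32], i: usize) -> Option<i32> {
    if i >= items.len() {
        return None;
    }
    let base: *const i32 = items.as_ptr();
    // SAFETY: `i < items.len()`, so `base.add(i)` stays inside the slice,
    // and the borrow of `items` keeps the allocation alive while we read.
    Some(unsafe { *base.add(i) })
}

fn main() {
    let items = [10, 20, 30];
    assert_eq!(read_at(&items, 2), Some(30));
    assert_eq!(read_at(&items, 3), None);
    println!("raw read ok");
}
```

Callers never see the raw pointer. A reviewer only has to audit the one block under the SAFETY comment.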

Pointers in Bitcoin

Bitcoin is built on pointers. Not metaphorically. Literally.

The program counter

When a Bitcoin node validates a new block, its CPU's program counter is a pointer to the next instruction in the Bitcoin Core binary.

Every SHA-256 round function call. Every ECDSA signature verification. Every UTXO lookup. The CPU fetches the instruction at that address, decodes it, executes it, increments the pointer, and repeats.

The program counter is the original pointer. Every program that has ever run is a CPU following a pointer through code.

The prev_hash pointer

Every Bitcoin block header contains uint8_t prev_hash[32];, a 32-byte field. This is Bitcoin's version of a pointer. Not a memory address: a cryptographic content address, the double SHA-256 hash of the previous block's header.

In a regular linked list, node->next is a memory address. Change the node's contents and the pointer still reaches it. In Bitcoin, block.prev_hash is a content hash. Change the block and its hash changes, the pointer stored in the next block no longer matches, every subsequent block becomes invalid, and any node that validates the chain rejects it.

The blockchain is a linked list where the pointers are unforgeable.
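
A toy version makes the tamper-evidence concrete. The sketch rests on two loud assumptions. `DefaultHasher` stands in for double SHA-256 only to avoid a dependency, and it is not cryptographic. `Block`, `append`, and `links_hold` are invented names, not Bitcoin Core's.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A toy block. `prev` plays the role of prev_hash.
#[derive(Hash)]
struct Block {
    prev: u64,
    data: String,
}

/// Stand-in for Bitcoin's double SHA-256. NOT cryptographic.
fn block_hash(block: &Block) -> u64 {
    let mut hasher = DefaultHasher::new();
    block.hash(&mut hasher);
    hasher.finish()
}

/// Appends a block whose `prev` is the hash of the current tip.
fn append(chain: &mut Vec<Block>, data: &str) {
    let prev = chain.last().map_or(0, block_hash); // 0 = no parent
    chain.push(Block { prev, data: data.to_string() });
}

/// Every block must "point" at the hash of the block before it.
fn links_hold(chain: &[Block]) -> bool {
    chain.windows(2).all(|pair| pair[1].prev == block_hash(&pair[0]))
}

fn main() {
    let mut chain = Vec::new();
    for data in ["genesis", "alice pays bob", "bob pays carol"] {
        append(&mut chain, data);
    }
    assert!(links_hold(&chain));

    // Rewrite history: the next block's `prev` no longer matches.
    chain[1].data = "alice pays mallory".to_string();
    assert!(!links_hold(&chain));
    println!("tampering detected");
}
```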

Rust• • •
use std::collections::HashMap;

/* The blockchain as a linked list via content addresses */

// Placeholder so the sketch compiles; real transactions carry
// inputs, outputs, and scripts.
struct Transaction;

struct Block {
    header: BlockHeader,
    transactions: Vec<Transaction>,
}

struct BlockHeader {
    version:     u32,
    prev_hash:   [u8; 32], // "pointer" to previous block
    merkle_root: [u8; 32],
    timestamp:   u32,
    bits:        u32,
    nonce:       u32,
}

struct Blockchain {
    blocks: HashMap<[u8; 32], Block>, // hash -> block
}

impl Blockchain {
    // Stub for illustration: a real node checks proof-of-work,
    // the merkle root, and every transaction here.
    fn validate_block(&self, _block: &Block) -> bool {
        true
    }

    /* Follow prev_hash chain from tip to genesis */
    fn validate_chain(&self, tip_hash: &[u8; 32]) -> bool {
        let mut current_hash = tip_hash;
        let no_parent = [0u8; 32]; // genesis stores all zeros as prev_hash

        loop {
            // prev_hash is a [u8; 32] value, not a raw pointer. A hash
            // with no matching block returns None from the HashMap:
            // no dangling pointer, no use-after-free, no null dereference.
            let block = match self.blocks.get(current_hash) {
                Some(b) => b,
                None    => return false, // block not found
            };

            if !self.validate_block(block) {
                return false;
            }

            // follow the prev_hash "pointer"
            current_hash = &block.header.prev_hash;

            // Genesis has just been validated: the walk is complete.
            if current_hash == &no_parent { return true; }
        }
    }
}
C• • •
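/* Bitcoin block header pointer chain */
#include <stddef.h>
#include <stdint.h>

typedef struct Block {
    struct BlockHeader {
        uint32_t version;
        uint8_t  prev_hash[32]; /* pointer to previous block */
        uint8_t  merkle_root[32];
        uint32_t timestamp;
        uint32_t bits;
        uint32_t nonce;
    } header;
    /* transactions follow... */
} Block;

/* Assumed to be defined elsewhere in the node (hypothetical helpers). */
int  validate_block(const Block *block);
void reject(const char *reason);

/* Following the chain - iterating via prev_hash */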
void validate_chain(const Block *tip,
                    Block *(*find_block)(const uint8_t[32]))
{
    const Block *current = tip;
    while (current != NULL) {
        if (!validate_block(current)) {
            reject("invalid block");
            return;
        }
        /* follow the pointer to the previous block       */
        /* prev_hash is a content address, not memory addr */
        current = find_block(current->header.prev_hash);
        /* returns NULL at the genesis block */
    }
}

The five bugs in Bitcoin context

Three of the five pointer bugs, as they could show up inside a Bitcoin node:

  • Null dereference: following prev_hash of the genesis block, where no previous block exists. C: crash or undefined behaviour. Rust: None from the HashMap, handled explicitly.
  • Use-after-free: accessing a UTXO entry after it was spent and evicted from the UTXO set. C: returns garbage data from freed memory. Rust: the borrow checker prevents compilation.
  • Out-of-bounds: a malicious transaction script crafted with length fields that exceed the buffer. C: reads adjacent memory, Heartbleed-style. Rust: panics safely, no data leaked.

These are not theoretical. Memory-safety bugs of exactly these kinds have shipped in widely deployed C and C++ code, and they are a large part of why Rust is now used in the Linux kernel and in Bitcoin infrastructure.

// the punchline
A pointer is the smallest unit of indirection in computing. Almost every interesting thing software does (data structures, polymorphism, dynamic memory, IPC, drivers, garbage collection, virtual memory itself) is some pattern of pointers on top of pointers. Understanding what they really are, where they live, and what makes them dangerous is the closest thing this site has to a single load-bearing skill.

Where to dig in next

Pointers go deep. A few worthwhile rabbit holes:

  • Smart pointers in Rust (Box, Rc, Arc, RefCell, Cell) and C++ (unique_ptr, shared_ptr, weak_ptr). Each one encodes a different ownership policy in the type system.
  • Pointer tagging: stealing the low bits of an aligned pointer to store extra data. The JVM, V8, and lots of GCs do this. Three free bits per word.
  • AddressSanitizer, Valgrind, Miri: tools that check C and C++ programs (the first two) and Rust programs (Miri) for use-after-free, leaks, and other pointer crimes at runtime.
  • The CHERI architecture: a CPU with hardware-enforced capabilities, where pointers carry bounds and permissions directly in their bit representation.

Every one of those is a different angle on the same fundamental thing: a number that means somewhere.
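
Pointer tagging is the easiest of these to try at home. The sketch below is a toy under stated assumptions. `Slot`, `pack`, `unpack`, and `round_trip` are invented names, and a production runtime would take more care over pointer provenance than these plain integer casts do.

```rust
/// Forces 8-byte alignment so the low three address bits are always zero.
#[repr(align(8))]
struct Slot(u64);

const TAG_MASK: usize = 0b111;

/// Packs a small tag into the unused low bits of an aligned pointer.
fn pack(ptr: *mut Slot, tag: usize) -> usize {
    assert!(tag <= TAG_MASK, "only three bits are free");
    assert_eq!(ptr as usize & TAG_MASK, 0, "pointer must be 8-byte aligned");
    ptr as usize | tag
}

/// Splits a packed word back into the original pointer and the tag.
fn unpack(word: usize) -> (*mut Slot, usize) {
    ((word & !TAG_MASK) as *mut Slot, word & TAG_MASK)
}

/// Allocates a value, tags its pointer, and recovers both.
fn round_trip(value: u64, tag: usize) -> (u64, usize) {
    let raw = Box::into_raw(Box::new(Slot(value)));
    let word = pack(raw, tag);
    let (ptr, tag_out) = unpack(word);
    // SAFETY: `ptr` equals `raw`, which came from `Box::into_raw` above
    // and has not been freed; `from_raw` takes ownership back exactly once.
    let slot = unsafe { Box::from_raw(ptr) };
    (slot.0, tag_out)
}

fn main() {
    assert_eq!(round_trip(42, 0b101), (42, 0b101));
    println!("value and tag both survived");
}
```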

Pointers across ScrapyBytes

The same ideas surface all over ScrapyBytes. Here is where this page connects to the rest of the curriculum, and how to follow each thread.

Memory

A pointer is a memory address, nothing more. The memory page is the street; a pointer is a house number on a slip of paper. This page only makes sense on top of that one.

scrapybytes.vercel.app/memory
Variables

A pointer variable holds an address instead of a value. The variables page is where the name lives; this page is where the name points somewhere else.

scrapybytes.vercel.app/variables
Arrays

arr[i] is pointer arithmetic in disguise: base address plus i times the element size. The arrays page is the friendly face of the pointer math here.

scrapybytes.vercel.app/arrays
Linked List

A linked list is pointers made into a structure. Each node holds a next pointer to the following node. The linked-list page is this page chained together.

scrapybytes.vercel.app/linked-list
CPU

The program counter is a pointer to the next instruction. The CPU spends every cycle following pointers through memory. The CPU page runs on this page.

scrapybytes.vercel.app/cpu
Hashing

A chained hash map is an array of pointers, each the head of a linked list. The cost of following them is exactly what this page describes, which is why open addressing drops them.

scrapybytes.vercel.app/hashing
Compile vs Runtime

A null dereference is a runtime crash; Rust's borrow checker turns many pointer bugs into compile-time errors. The compile-vs-runtime page is early catch versus production catch.

scrapybytes.vercel.app/compile-vs-runtime
Operating System

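Crossing the kernel boundary means handing the OS a pointer to your buffer. The kernel validates it first; a bad pointer there makes the call fail (EFAULT on Linux) rather than letting the kernel touch memory you don't own. The OS page is the strictest user of this one.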

scrapybytes.vercel.app/operating-system
Number Systems

A pointer is an address printed in hex, like 0x7fff5fbff8d4. The number systems page is why those addresses read in base sixteen.

scrapybytes.vercel.app/number-systems
Binary

A 64-bit pointer is 8 bytes of binary in a register or on the stack. The binary page is what a pointer is made of at the lowest level.

scrapybytes.vercel.app/binary
ASCII

A C string is a char* to its first byte, ending at a NUL (0x00). The ASCII page is the bytes that pointer walks.

scrapybytes.vercel.app/ascii
Logic Gates

Load and store instructions gate an address onto the address bus through logic gates that route it to the right cells. The logic gates page is a pointer in hardware.

scrapybytes.vercel.app/logic-gates
Recursion

Each stack frame holds a return-address pointer, and buffer-overflow attacks overwrite exactly that. The recursion page is pointer-chasing through code.

scrapybytes.vercel.app/recursion
Networking

A socket is a file descriptor, a small integer the OS uses as an index into the process's descriptor table, and every send() hands the kernel a buffer pointer to validate. The networking page is pointers across the boundary.

scrapybytes.vercel.app/networking
Blockchain

Bitcoin's prev_hash is a pointer by content, not address: change the block and the hash breaks the link. The blockchain page is the pointer made cryptographic.

scrapybytes.vercel.app/blockchain
Big O Notation

Following a pointer is O(1), but a cache miss can stall the CPU ~200 cycles, so the notation hides the real cost. The big-o page is where that gap lives.

scrapybytes.vercel.app/big-o
next up / 0x0A
When does each piece happen? Compile time vs runtime.