Before you can find anything
you have to sort everything.
Every search result. Every database query. Every contact list. Every Bitcoin transaction waiting to be confirmed. None of it works without sorting. This page is the algorithm that taught everyone to sort, the algorithm that does it properly, the algorithm the real world uses, and why the gap between them is the entire field of algorithm design.
Sorting is not about alphabetical order.
Sorting is about answering one question.
Over and over.
Faster and faster.
Is this thing bigger than that thing?
That is it.
Every sorting algorithm ever written.
In every language.
On every computer.
In every data centre on Earth.
Boils down to that one comparison.
And the entire science of sorting is about how many times you have to ask it.
The Big O page proved O(n log n) is the mathematical ceiling.
No comparison-based sort can do better.
This page builds to that ceiling.
One algorithm at a time.
Why sorting exists
Sort once. Search forever.
Before the first algorithm, the argument for sorting in one concrete comparison:
Bubble sort: the prototype
Bubble sort is the algorithm nobody uses and everybody learns. Not because it is useful. Because it is the prototype. Every faster algorithm is bubble sort with a smarter idea about which elements to compare and in what order. Understand bubble sort and you understand every sort that followed it.
The idea: walk the array repeatedly, swapping any pair that is out of order. Larger elements "bubble" to the end. After each full pass, one more element is guaranteed to be in its final position. Repeat until no swaps happen.
fn bubble_sort(arr: &mut [i32]) {
let n = arr.len();
for i in 0..n {
let mut swapped = false;
// each pass guarantees the largest remaining element
// reaches its final position, so inner loop shrinks
for j in 0..n.saturating_sub(i + 1) {
if arr[j] > arr[j + 1] {
arr.swap(j, j + 1);
// swap() is bounds-checked in debug mode.
// panics instead of silently corrupting memory.
// C would let this go out of bounds quietly.
swapped = true;
}
}
// early exit: no swap means the array is already sorted.
// best case O(n) on already-sorted input.
if !swapped { break; }
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_bubble() {
let mut v = vec![5, 3, 8, 1, 4];
bubble_sort(&mut v);
assert_eq!(v, vec![1, 3, 4, 5, 8]);
}
}#include <stddef.h>
void bubble_sort(int *arr, size_t n) {
if (!arr || n < 2) return;
for (size_t i = 0; i < n - 1; i++) {
int swapped = 0;
/* each pass places the largest remaining element
* at its final position, so we shrink the window */
for (size_t j = 0; j < n - i - 1; j++) {
if (arr[j] > arr[j + 1]) {
/* classic three-variable swap */
int tmp = arr[j];
arr[j] = arr[j + 1];
arr[j + 1] = tmp;
swapped = 1;
}
}
/* early exit: sorted if no swap occurred */
if (!swapped) break;
}
/* C trusts you to pass a valid pointer and size n.
* Out-of-bounds access is silent memory corruption.
* Rust's swap() panics in debug builds instead. */
}The early-exit optimisation turns O(n²) into O(n) on already-sorted data. This is the same pattern the Big O page described: best case, average case, and worst case are all different measurements of the same algorithm. The array page showed why contiguous memory makes these sequential comparisons cheap. ← see: Big O ← see: Arrays
Merge sort - the recursion payoff
The Big O page proved O(n log n) is the ceiling. The recursion page showed divide and conquer. Merge sort is where both ideas become code. It is the most important sorting algorithm you will ever learn. Not because it is the fastest. Because it teaches you that splitting a problem in half is always worth considering.
The recursion tree
The merge operation
The hard part of merge sort is not the splitting. It is the merging. Given two sorted arrays, compare their fronts and take the smaller element. Each comparison costs O(1) and produces one output element. Total comparisons per level: O(n). Total levels: log n. Total work: O(n log n).
fn merge_sort(arr: &mut [i32]) {
let n = arr.len();
if n <= 1 { return; } // base case: one element is sorted
let mid = n / 2;
// split: copy each half into owned Vecs
let mut left = arr[..mid].to_vec();
let mut right = arr[mid..].to_vec();
// recurse: sort each half independently.
// call stack grows log n levels deep.
// at n = 1_000_000 that is about 20 frames.
merge_sort(&mut left);
merge_sort(&mut right);
// merge: walk both sorted halves and stitch together
let (mut i, mut j, mut k) = (0, 0, 0);
while i < left.len() && j < right.len() {
if left[i] <= right[j] {
arr[k] = left[i]; i += 1;
} else {
arr[k] = right[j]; j += 1;
}
k += 1;
}
// drain whichever half still has elements
while i < left.len() { arr[k] = left[i]; i += 1; k += 1; }
while j < right.len() { arr[k] = right[j]; j += 1; k += 1; }
// The borrow checker ensures no two mutable references
// to arr exist at once. Safe to split and recombine.
}#include <stdlib.h>
#include <string.h>
#include <stddef.h>
static void merge(int *arr, size_t l,
size_t m, size_t r) {
size_t n1 = m - l, n2 = r - m;
/* temporary arrays for each half */
int *left = malloc(n1 * sizeof *left);
int *right = malloc(n2 * sizeof *right);
if (!left || !right) { free(left); free(right); return; }
memcpy(left, arr + l, n1 * sizeof *left);
memcpy(right, arr + m, n2 * sizeof *right);
size_t i = 0, j = 0, k = l;
while (i < n1 && j < n2) {
if (left[i] <= right[j]) arr[k++] = left[i++];
else arr[k++] = right[j++];
}
while (i < n1) arr[k++] = left[i++];
while (j < n2) arr[k++] = right[j++];
free(left);
free(right);
/* C requires explicit malloc/free every call.
* Forget free() and you leak on every merge.
* Rust drops the Vec automatically at end of scope. */
}
void merge_sort(int *arr, size_t l, size_t r) {
if (r - l <= 1) return; /* base case */
size_t m = l + (r - l) / 2;
merge_sort(arr, l, m); /* sort left */
merge_sort(arr, m, r); /* sort right */
merge(arr, l, m, r); /* merge both */
}
/* call as: merge_sort(arr, 0, n); */The call stack grows exactly log n levels deep. At n = 1,000,000 that is about 20 stack frames. The recursion page showed why this is safe: each frame is small and predictable. Merge sort is why divide-and-conquer is not just an academic exercise. ← see: Recursion
Quicksort: the practical champion
Merge sort is O(n log n) guaranteed. But quicksort is faster in practice. Because of cache behaviour. Merge sort needs temporary arrays. Every merge touches memory in two places. Quicksort sorts in place. It touches memory in one place. The CPU cache loves it.
The idea: pick one element as the pivot. Rearrange so every element smaller than the pivot is on its left, every larger element is on its right. The pivot is now in its exact final position. Recurse on each side.
fn quicksort(arr: &mut [i32]) {
if arr.len() <= 1 { return; }
// partition places the pivot in its final position
let pivot_idx = partition(arr);
// split_at_mut gives two non-overlapping mutable slices.
// the borrow checker verifies they don't overlap.
let (left, right) = arr.split_at_mut(pivot_idx);
quicksort(left);
quicksort(&mut right[1..]); // skip the pivot itself
}
fn partition(arr: &mut [i32]) -> usize {
let pivot = *arr.last().unwrap(); // last element as pivot
let mut i = 0;
for j in 0..arr.len() - 1 {
if arr[j] <= pivot {
arr.swap(i, j);
i += 1;
}
}
// move pivot to its final sorted position
let last = arr.len() - 1;
arr.swap(i, last);
i
}
// Note: the standard library does NOT use this.
// .sort() -- stable Timsort, O(n log n) guaranteed
// .sort_unstable() -- pdqsort, faster, O(n log n) guaranteed
// Always prefer std over a hand-rolled sort.#include <stddef.h>
static void swap_ints(int *a, int *b) {
int t = *a; *a = *b; *b = t;
}
static size_t partition(int *arr,
size_t lo, size_t hi) {
int pivot = arr[hi]; /* last element as pivot */
size_t i = lo;
for (size_t j = lo; j < hi; j++) {
if (arr[j] <= pivot) {
swap_ints(&arr[i], &arr[j]);
i++;
}
}
swap_ints(&arr[i], &arr[hi]); /* pivot to final spot */
return i;
}
void quicksort(int *arr, size_t lo, size_t hi) {
if (lo >= hi) return;
size_t p = partition(arr, lo, hi);
if (p > 0) quicksort(arr, lo, p - 1);
quicksort(arr, p + 1, hi);
}
/* call as: quicksort(arr, 0, n - 1);
*
* C stdlib equivalent:
* qsort(arr, n, sizeof(int), cmp_int);
* Takes void* - no type safety at all.
* Rust generics replace void* with compile-time checks. */Quicksort's worst case O(n²) is not theoretical. C's qsort() has historically hit O(n²) on sorted input. This is why production systems replaced pure quicksort. pdqsort detects the bad pattern and switches algorithms. The compile-vs-runtime page's lesson applies here too: behaviour that looks fine at compile time (sorted input) can blow up at runtime. ← see: Compile vs Runtime
What your language actually uses
You now know bubble sort, merge sort, and quicksort. Your language uses none of them. It uses a hybrid. An algorithm designed by studying the real-world data that programs actually sort in production. This section explains what that hybrid is and why pure algorithms lose to pragmatism.
Timsort: for real-world data
Used by Python, Java (for objects), Android, and Swift. Invented by Tim Peters in 2002 for Python.
The key insight: real-world data is rarely random. It has natural runs -- sequences already sorted or reversely sorted. Timsort exploits this.
- Scan the array for natural runs (already-sorted sequences, ascending or descending).
- If a run is too short, extend it with insertion sort -- fastest algorithm on tiny arrays.
- Merge the runs using merge sort's merge operation.
Result: O(n) on already-sorted data (one big run, done). O(n log n) worst case. Stable. Adaptive.
pdqsort: for raw speed
Used by Rust's .sort_unstable(), C++ Boost.Sort, Go's slices.Sort(). Invented by Orson Peters in 2021.
Pattern-Defeating Quicksort. Pure quicksort has pathological cases: sorted input, reverse sorted, all equal elements. pdqsort detects these patterns and switches strategy.
- Start with quicksort (cache-friendly, fast average case).
- Detect bad patterns: many equal elements uses Dutch flag partition; recursion too deep switches to heapsort (O(n log n) guaranteed); small arrays use insertion sort.
- Each switch prevents worst-case behaviour.
Result: O(n log n) worst case guaranteed. Fastest practical sort on random data. Not stable.
Standard library comparison
| Language | Function | Algorithm | Stable | Worst case |
|---|---|---|---|---|
| Python | sorted() / .sort() | Timsort | Yes | O(n log n) |
| Java | Arrays.sort() (objects) | Timsort | Yes | O(n log n) |
| Rust | .sort() | Timsort | Yes | O(n log n) |
| Rust | .sort_unstable() | pdqsort | No | O(n log n) |
| C | qsort() | Usually quicksort | No | O(n²) possible |
| C++ | std::sort() | Introsort | No | O(n log n) |
| Go | slices.Sort() | pdqsort | No | O(n log n) |
When to use what
// Use the standard library. Always.
// It is faster than anything you would write.
fn stdlib_sorting_examples() {
let mut nums = vec![5, 3, 8, 1, 4, 2, 7, 6];
// Stable sort (Timsort) -- equal elements keep original order
nums.sort();
// [1, 2, 3, 4, 5, 6, 7, 8]
// Unstable sort (pdqsort) -- faster, same O(n log n) guarantee
nums.sort_unstable();
// same result here, but equal elements may reorder
// Custom comparator
let mut words = vec!["banana", "apple", "cherry", "date"];
words.sort_by(|a, b| a.len().cmp(&b.len()));
// ["date", "apple", "banana", "cherry"]
// Sorting Bitcoin mempool transactions by fee rate (descending)
#[derive(Debug)]
struct MempoolEntry { fee_rate: u64, size_vbytes: u32 }
let mut mempool = vec![
MempoolEntry { fee_rate: 100, size_vbytes: 250 },
MempoolEntry { fee_rate: 500, size_vbytes: 141 },
MempoolEntry { fee_rate: 250, size_vbytes: 300 },
];
// highest fee rate mines first -- pdqsort, O(n log n) worst case
mempool.sort_unstable_by(|a, b| b.fee_rate.cmp(&a.fee_rate));
// [{500, 141}, {250, 300}, {100, 250}]
// sort_by_key for cleaner field extraction
mempool.sort_unstable_by_key(|tx| std::cmp::Reverse(tx.fee_rate));
}#include <stdlib.h>
#include <stdint.h>
#include <string.h>
/* qsort() comparison functions.
* void* args are untyped -- no compile-time safety. */
int cmp_int_asc(const void *a, const void *b) {
int x = *(const int *)a;
int y = *(const int *)b;
/* never use return x - y: integer overflow is undefined */
return (x > y) - (x < y);
}
int cmp_int_desc(const void *a, const void *b) {
return cmp_int_asc(b, a);
}
/* Bitcoin mempool: sort by fee rate descending */
typedef struct {
uint64_t fee_rate; /* satoshis per virtual byte */
uint32_t size_vbytes;
uint8_t txid[32];
} MempoolEntry;
static int cmp_fee_desc(const void *a, const void *b) {
const MempoolEntry *ea = a;
const MempoolEntry *eb = b;
if (eb->fee_rate > ea->fee_rate) return 1;
if (eb->fee_rate < ea->fee_rate) return -1;
return 0;
}
void sort_mempool(MempoolEntry *pool, size_t n) {
/* O(n log n) average, O(n^2) worst case in some impls */
qsort(pool, n, sizeof *pool, cmp_fee_desc);
}
int main(void) {
int nums[] = {5, 3, 8, 1, 4};
qsort(nums, 5, sizeof *nums, cmp_int_asc);
/* [1, 3, 4, 5, 8] */
MempoolEntry pool[] = {
{100, 250, {0}},
{500, 141, {0}},
{250, 300, {0}},
};
sort_mempool(pool, 3);
/* [{500,...}, {250,...}, {100,...}] -- highest fee first */
return 0;
}Sorting in the Bitcoin mempool
Every Bitcoin miner sorts transactions before building a block.
The goal is to maximise revenue within the block size limit (4 MB weight). The sort key is fee rate: satoshis per virtual byte. Higher fee rate goes in the block first. Lower fee rate waits for the next block. When the mempool is full, transactions below the threshold fee rate are evicted entirely.
This sort runs on every new block. Every ten minutes. Across thousands of mining nodes. Simultaneously.
use std::cmp::Reverse;
#[derive(Debug, Eq, PartialEq)]
struct MempoolEntry {
fee_rate: u64, // satoshis per virtual byte
size_vbytes: u32, // transaction weight
txid: [u8; 32], // transaction identifier
}
impl Ord for MempoolEntry {
fn cmp(&self, other: &Self) -> std::cmp::Ordering {
// descending fee rate: highest mines first
other.fee_rate.cmp(&self.fee_rate)
}
}
impl PartialOrd for MempoolEntry {
fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
Some(self.cmp(other))
}
}
fn build_block<'a>(
mempool: &'a mut Vec<MempoolEntry>,
block_limit_vbytes: u32,
) -> Vec<&'a MempoolEntry> {
// O(n log n) -- pdqsort via sort_unstable()
mempool.sort_unstable();
let mut block = Vec::new();
let mut used = 0u32;
for entry in mempool.iter() {
if used + entry.size_vbytes > block_limit_vbytes { break; }
block.push(entry);
used += entry.size_vbytes;
}
block
// This is the algorithm that decides
// whose Bitcoin transaction confirms next
// and whose waits another ten minutes.
// A sort on an array.
// The most consequential sort in finance.
}#include <stdint.h>
#include <stdlib.h>
#include <stddef.h>
typedef struct {
uint64_t fee_rate; /* sat / vbyte */
uint32_t size_vbytes;
uint8_t txid[32];
} MempoolEntry;
/* highest fee rate first */
static int cmp_fee_rate(const void *a, const void *b) {
const MempoolEntry *ea = a;
const MempoolEntry *eb = b;
if (eb->fee_rate > ea->fee_rate) return 1;
if (eb->fee_rate < ea->fee_rate) return -1;
return 0;
}
/* returns number of selected transactions */
size_t build_block(MempoolEntry *pool, size_t n,
uint32_t block_limit,
size_t *selected_idx) {
/* O(n log n) sort by fee rate descending */
qsort(pool, n, sizeof *pool, cmp_fee_rate);
uint32_t used = 0;
size_t count = 0;
for (size_t i = 0; i < n; i++) {
if (used + pool[i].size_vbytes > block_limit) break;
selected_idx[count++] = i;
used += pool[i].size_vbytes;
}
return count;
}
/* Sorting decides financial priority.
* The algorithm is O(n log n).
* The stakes are measured in satoshis. */Bitcoin Core's mempool sort is the same qsort/sort_unstable pattern as any other array sort. The hashing page showed how transactions get unique IDs. The networking page showed how they propagate across the network. The OS page showed how the sort runs as a user-space process. All of those pages converge at this one array sort. ← see: Blockchain ← see: Hashing
Sorting Algorithms across ScrapyBytes
The same ideas surface all over ScrapyBytes. Here is where this page connects to the rest of the curriculum, and how to follow each thread.
Sorting rearranges an array, in place or into a new one. Every algorithm on this page operates on the contiguous block from the arrays page.
scrapybytes.vercel.app/arrays →Big O NotationThe whole reason to prefer merge sort over bubble sort is O(n log n) versus O(n squared). The big-o page is the scoreboard this page competes on.
scrapybytes.vercel.app/big-o →RecursionMerge sort and quicksort are recursion: divide the array, sort the parts, combine. The recursion page is the technique behind the fastest sorts here.
scrapybytes.vercel.app/recursion →HashingSorting and hashing are rival answers to retrieval. Sort once and binary-search at O(log n), or hash and look up at O(1). The hashing page is the alternative; choose by whether you need order.
scrapybytes.vercel.app/hashing →Linked ListMerge sort is the natural choice for linked lists because it needs no random access. The linked-list page is a structure that changes which sort you reach for.
scrapybytes.vercel.app/linked-list →MemoryQuicksort wins in practice partly because it is cache-friendly and sorts in place. The memory page is why theoretically tied algorithms are not tied at all.
scrapybytes.vercel.app/memory →CPUBranch prediction and cache behaviour decide real sorting speed as much as Big O does. The CPU page is the hardware your sort actually runs on.
scrapybytes.vercel.app/cpu →BinaryEvery comparison, arr[j] > arr[j+1], compiles to a subtraction whose sign bit decides the swap. The binary page is sorting at the bit level.
In C a sort takes a pointer to the first element, and qsort() takes untyped void*. The pointers page is how a sort reaches the array.
The OS scheduler keeps its run queue sorted by virtual runtime, running the least-served process every millisecond. The operating system page sorts to schedule.
scrapybytes.vercel.app/operating-system →Distributed SystemsMerging k already-sorted partitions across nodes is O(n log k), the external merge sort databases use for data larger than RAM. The distributed systems page sorts at scale.
scrapybytes.vercel.app/distributed-systems →BlockchainBitcoin miners sort the mempool by fee rate, highest satoshis-per-byte first. The blockchain page runs O(n log n) on the most consequential array in finance.
scrapybytes.vercel.app/blockchain →