
Hashmap Is Not Really O(1)

Let’s talk honestly about one of the biggest myths in computer science.


O(1)

Imagine you’re in a job interview and the interviewer asks:

“What’s the time complexity of a hashmap lookup?”

And like everyone else, you answer:

“O(1) on average.”

Seems correct, right? Well… not really.

Let’s break the illusion.

0. What Exactly Is a Hashmap?

A hashmap (or hash table) is a data structure that stores key -> value pairs and uses a hash function to decide where each key should live in memory.

Conceptually, it looks like this:

hash(key) -> bucket index -> value

    +-------------------------------+
    |        hash(key)              |   <- O(len(key))
    +-------------------------------+
                 |
                 v
+-----------------------------------------+
| bucket_index = hash & (capacity - 1)    |
+-----------------------------------------+
                 |
                 v
+-----------------------------------------+
|      load bucket pointer (L1/L2/L3)     |
+-----------------------------------------+
                 |
                 v
+-------------------------------------------+
| compare tophash[] | load keys | load vals |
| ----------------------------------------- |
| maybe go into overflow buckets...         |
+-------------------------------------------+

Its design aims to make insertion, lookup, and deletion all run in “constant time,” hence the famous claim:

Hashmap operations are O(1). This is true in an academic sense, but misleading in the real world.

Let’s break it down.

0.1 Core Structure

A hash map consists of two main components:

The bucket array

This is an array where each element points to a “bucket”.

The array length is usually a power of two: 8, 16, 32, 64, … This allows the bucket index to be computed with a cheap bitwise AND instead of a modulo.
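To see why the power-of-two size matters, here is a minimal sketch (the hash value is the one used as an example later in this article):

```go
package main

import "fmt"

func main() {
	hash := uint64(0xFA9320AABBCCDDEE)
	capacity := uint64(8) // power of two

	// For power-of-two capacities, hash % capacity equals
	// hash & (capacity - 1), but the AND avoids a division instruction.
	fmt.Println(hash%capacity == hash&(capacity-1)) // true
	fmt.Println(hash & (capacity - 1))              // 6
}
```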

Example (size = 8):

index 0: bucket pointer
index 1: bucket pointer
index 2: bucket pointer
...
index 7: bucket pointer

Bucket / Entry list

A bucket may contain one or more entries (often a small fixed number of slots, e.g. 8 per bucket in Go).

A typical entry structure:

entry {
    tophash
    key
    value
    overflow
}

0.2 How Lookup Works

Suppose you do:

value := m["hello"]

Internally, the lookup performs the following steps:

Step 1 - Compute hash(key)

Hash function:

hash = algo.somehashfunction("hello")

Example:

hash = 0xFA9320AABBCCDDEE

Step 2 - Compute bucket index

Assume the map has 8 buckets (2^3):

bucketMask = bucketCount - 1 = 7 = 0x000...0007

Then:

index = hash & bucketMask

Example:

hash   = FA9320AABBCCDDEE
mask   = 0000000000000007
-------------------------
index = 0x6 (decimal 6)

So the key will be searched in bucket #6.

No memory access has happened yet. This is all done in CPU registers.
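Steps 1 and 2 can be sketched with Go's standard `hash/fnv` package (the real runtime uses a different, per-map-seeded hash, so the exact index will differ, but the shape is the same):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bucketIndex hashes the key and masks it down to a bucket index.
// capacity must be a power of two.
func bucketIndex(key string, capacity uint64) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	return h.Sum64() & (capacity - 1) // same as % capacity, but cheaper
}

func main() {
	fmt.Println(bucketIndex("hello", 8)) // always in range 0..7
}
```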

Step 3 - Load the Bucket Pointer

The map header stores an array of bucket pointers:

bucketPtr := h.buckets[index]

Assume keys are always strings and values are always ints. This points to a bucket struct that looks like:

type bucket struct {
	tophash [8]uint8
	keys    [8]string
	values  [8]int
	overflow *bucket
}

Now the runtime inspects the tophash array.

Step 4 - Use High Bits of the Hash (tophash) to Filter Slots

The runtime extracts the top 8 bits of the hash:

top = hash >> (64 - 8)    // highest 8 bits

This is the key speed trick: each slot is pre-filtered with a single-byte comparison before the full key is ever touched.

For string / interface keys, this avoids tons of expensive comparisons.

It is so fast in practice because it is cache-line friendly: when the CPU loads the bucket, those 8 tophash bytes come in one shot.

Then lookup does:

for i in 0..7:
    compare 1-byte tophash[i] with `top`

So the first filtering pass touches only the 8 tophash bytes. Only when a tophash matches do we touch the actual key (which may be a pointer to a string on the heap).

+----------------+--------------+
| Level          | Latency      |
+----------------+--------------+
| L1 Cache       | ~1 ns        |
| L2 Cache       | ~5 ns        |
| L3 Cache       | ~20 ns       |
| RAM            | ~80-150 ns   |
| Remote NUMA    | ~150-250 ns  |
+----------------+--------------+

Branch-predictable and tight loop

Typical case

That loop turns into something very predictable:

load th
cmp th, top
jne skip_slot
// rare path: key compare

The branch predictor quickly learns that most slots fail the tophash check, so the CPU’s speculative execution works in your favor.

Example:

hash = FA9320AABBCCDDEE
top  = FA

Then check:

bucket.tophash[i] == top ?

If it matches -> compare actual key.

bucket.keys[i] == "hello" ?

If not -> skip slot.

This drastically accelerates lookup.
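The whole tophash filter can be sketched in a few lines of Go, using the simplified bucket struct from Step 3 (this is a sketch, not the runtime's actual code):

```go
package main

import "fmt"

type bucket struct {
	tophash [8]uint8
	keys    [8]string
	values  [8]int
}

// lookup scans the 1-byte tophash array first and compares the full
// key only when the cheap filter matches.
func lookup(b *bucket, key string, top uint8) (int, bool) {
	for i := 0; i < 8; i++ {
		if b.tophash[i] != top { // cheap 1-byte filter
			continue
		}
		if b.keys[i] == key { // rare path: full key compare
			return b.values[i], true
		}
	}
	return 0, false
}

func main() {
	hash := uint64(0xFA9320AABBCCDDEE)
	top := uint8(hash >> 56) // highest 8 bits -> 0xFA
	b := &bucket{}
	b.tophash[2], b.keys[2], b.values[2] = top, "hello", 42
	fmt.Println(lookup(b, "hello", top)) // 42 true
}
```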

Step 5 - If No Match Found, Follow Overflow Buckets

If none of the 8 slots in the bucket match:

bucket = bucket.overflow
repeat...

key := "hello"
tophash := 0xFA

for each overflowBucket in chain:
    for i in 0..7:
        if overflowBucket.tophash[i] != tophash:
            continue

        if overflowBucket.key[i] == key:
            return overflowBucket.values[i], true
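Translating that pseudocode into Go (again a simplified sketch, not the runtime's code) makes the cost visible: the walk is bounded by the chain length, not by a constant:

```go
package main

import "fmt"

type bucket struct {
	tophash  [8]uint8
	keys     [8]string
	values   [8]int
	overflow *bucket
}

// lookupChain walks the bucket and its overflow chain. Worst case is
// O(number of entries hashed to this bucket), not O(1).
func lookupChain(b *bucket, key string, top uint8) (int, bool) {
	for ; b != nil; b = b.overflow {
		for i := 0; i < 8; i++ {
			if b.tophash[i] != top {
				continue
			}
			if b.keys[i] == key {
				return b.values[i], true
			}
		}
	}
	return 0, false
}

func main() {
	ovf := &bucket{}
	ovf.tophash[0], ovf.keys[0], ovf.values[0] = 0xFA, "hello", 7
	head := &bucket{overflow: ovf} // "hello" lives in the overflow bucket
	fmt.Println(lookupChain(head, "hello", 0xFA)) // 7 true
}
```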

0.3 How Insert Works

We follow the same process until we get the bucket index (Step 2).

Step 3 - Place entry into bucket

bucket := buckets[bucketIndex]

// Step 1: check for empty slot in bucket
for i in 0..7:
    if bucket.tophash[i] == empty:
        use slot i
        return

// Step 2: if bucket is full, look for space in overflow chain
for each overflowBucket in chain:
    for i in 0..7:
        if overflowBucket.tophash[i] == empty:
            use this slot
            return

// Step 3: if everything full, create new overflow bucket
newOverflow := new(bucket)
lastOverflow.overflow = newOverflow

use newOverflow.slot[0]

Some programming languages handle collisions with linked lists (O(n) worst case) or balanced trees (O(log n)).

New Overflow Bucket

The runtime allocates an overflow bucket and inserts the new key/value pair into it.

Resize criterion: load factor = number of entries / number of buckets

Example:

8 buckets, 6 entries -> load factor = 0.75

When the load factor exceeds a threshold (usually around 75%), the hash map allocates a larger bucket array (typically double the size) and rehashes the entries into it.

This operation is expensive (O(n)), but infrequent.
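The resize trigger is just a ratio check. A minimal sketch (Go's actual runtime uses a threshold of 6.5 entries per 8-slot bucket, i.e. about 0.81; 0.75 is a common choice elsewhere):

```go
package main

import "fmt"

// loadFactor returns entries per bucket. A map implementation grows
// when this ratio crosses its resize threshold.
func loadFactor(entries, buckets int) float64 {
	return float64(entries) / float64(buckets)
}

func main() {
	fmt.Println(loadFactor(6, 8))         // 0.75
	fmt.Println(loadFactor(7, 8) > 0.75)  // true: time to resize
}
```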

Latency
  ^
  |          spikes due to resize
  |               /\
  |              /  \
  |    _________/    \__________
  |   /
  +--+-----------------------------------> Number of keys
     |
   O(1-ish)

1. Hashmaps Are Only O(1) Under Perfect Assumptions

The textbooks assume:

That world doesn’t exist.

On real CPUs, real RAM, and real workloads: Hashmaps behave like O(1) amortized average-time data structures.

Average lookup: O(1)
Worst case: O(n)