Data Structure Every Day, Day 3: Doubly Linked List

From Tailed Lists to Doubly Linked Lists — Killing O(n) pop_back

In yesterday’s post, we improved a basic singly linked list by adding a tail pointer:

struct LinkedList {
    struct Node* head;
    struct Node* tail;
};

That change alone gave us:

OperationComplexity
push_frontO(1)
pop_frontO(1)
frontO(1)
push_backO(1)
pop_backO(n)
backO(1)

The big remaining problem was pop_back().

Because singly linked lists only store next, removing the tail requires walking the list to find the node before it:

O(n) traversal → find prev → remove tail

Today we fix that permanently.


The Upgrade: Doubly Linked Nodes

We add a prev pointer:

struct Node {
    void* data;
    size_t data_size;
    struct Node* next;
    struct Node* prev;
};

Now every node knows:

  • Who comes after it
  • Who comes before it

That single extra pointer changes everything.


New Complexity Table

OperationComplexity
push_frontO(1)
pop_frontO(1)
frontO(1)
push_backO(1)
pop_backO(1)
backO(1)

No more traversal. Ever.


Why pop_back Becomes O(1)

Old singly linked list:

HEAD → A → B → C → D → NULL
                        ↑ tail

To remove D, we had to find C by scanning from HEAD.


New doubly linked list:

HEAD → A ↔ B ↔ C ↔ D ← tail

Now:

tail->prev = C

So removal becomes:

free(tail)
tail = prev

Constant time.


Key Implementation Details

push_front (Must Update Old Head’s prev)

//onyl diff is that we enforce the standard that the head node has a NULL previous pointer and account for empty lists in a diff way
void push_front(struct LinkedList* list, struct Node* new_head) {
        if (!list || !new_head) return;

        new_head->next = list->head;
        new_head->prev = NULL; //as stated in our header, head previous pointer is NULL since its the head

        if (list->head) {
                list->head->prev = new_head;
        } else {
                list->tail = new_head;
        }

        list->head = new_head;
}

pop_front (Must Handle Single Node Case)

void pop_front(struct LinkedList* list) {
        if (!list || !list->head) return;

        struct Node* old = list->head;

        //now we grab the new head so that we can set the previous pointer to NULL
        struct Node* new_head = old->next;

        //this is how we handle the condition of one node
        if (new_head) {
                new_head->prev = NULL;
                list->head = new_head;
        } else {
                //we rmeoved the last node, empty list
                list->head = NULL;
                list->tail = NULL;
        }

        free_node(old);
}

pop_back (Now Trivial)

void pop_back(struct LinkedList* list) {
        if (!list || !list->head) return;

        // single node
        if (list->head == list->tail) {
                free_node(list->head);
                list->head = NULL;
                list->tail = NULL;
                return;
        }

        // find node right before tail - before, we had to iterate and thus it was O(n), now we can just use the previous pointer!
        struct Node* prev = list->tail->prev;

        free_node(list->tail);
        prev->next = NULL;
        list->tail = prev;
}

This now brings us to the new benchmark profile:

=== Doubly Linked List Benchmark ===
Implementation: struct LinkedList { head, tail }
Benchmarks: push_front, push_back, pop_front, pop_back, front, back
Note: pop_* includes free_node() time.

=============================
BENCH N=1000
=============================
push_front   O(1)/op total O(N)   |      0.061 ms |       61.1 ns/op
front        O(1)/op              |      2.876 ms |        1.4 ns/op (REPS=2000000)
back         O(1)/op              |      2.635 ms |        1.3 ns/op (REPS=2000000)
pop_front    O(1)/op (+free)      |      0.012 ms |       11.8 ns/op
push_back    O(1)/op total O(N)   |      0.014 ms |       14.0 ns/op
pop_back     O(1)/op total O(N) |      0.011 ms |       11.0 ns/op

=============================
BENCH N=3000
=============================
push_front   O(1)/op total O(N)   |      0.113 ms |       37.6 ns/op
front        O(1)/op              |      1.779 ms |        0.9 ns/op (REPS=2000000)
back         O(1)/op              |      1.611 ms |        0.8 ns/op (REPS=2000000)
pop_front    O(1)/op (+free)      |      0.026 ms |        8.8 ns/op
push_back    O(1)/op total O(N)   |      0.031 ms |       10.3 ns/op
pop_back     O(1)/op total O(N) |      0.025 ms |        8.2 ns/op

=============================
BENCH N=10000
=============================
push_front   O(1)/op total O(N)   |      0.293 ms |       29.3 ns/op
front        O(1)/op              |      1.614 ms |        0.8 ns/op (REPS=2000000)
back         O(1)/op              |      1.590 ms |        0.8 ns/op (REPS=2000000)
pop_front    O(1)/op (+free)      |      0.086 ms |        8.6 ns/op
push_back    O(1)/op total O(N)   |      0.108 ms |       10.8 ns/op
pop_back     O(1)/op total O(N) |      0.089 ms |        8.9 ns/op

=============================
BENCH N=30000
=============================
push_front   O(1)/op total O(N)   |      0.826 ms |       27.5 ns/op
front        O(1)/op              |      1.525 ms |        0.8 ns/op (REPS=2000000)
back         O(1)/op              |      1.613 ms |        0.8 ns/op (REPS=2000000)
pop_front    O(1)/op (+free)      |      0.251 ms |        8.4 ns/op
push_back    O(1)/op total O(N)   |      0.300 ms |       10.0 ns/op
pop_back     O(1)/op total O(N) |      0.264 ms |        8.8 ns/op

=============================
BENCH N=100000
=============================
push_front   O(1)/op total O(N)   |      2.745 ms |       27.5 ns/op
front        O(1)/op              |      1.530 ms |        0.8 ns/op (REPS=2000000)
back         O(1)/op              |      1.553 ms |        0.8 ns/op (REPS=2000000)
pop_front    O(1)/op (+free)      |      0.854 ms |        8.5 ns/op
push_back    O(1)/op total O(N)   |      1.109 ms |       11.1 ns/op
pop_back     O(1)/op total O(N) |      0.863 ms |        8.6 ns/op

TOTAL TIME: 26.819 ms

Which compared to yesterday’s post, is WAY faster, but this comes at a cost:

  • We now store an extra pointer for every node

The classic trade-off of allocating more memory to save CPU time… this is a tradeoff that you will run into in just about every piece of code you write. In this instance, the speed is well worth it however.

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *