Abstract Data Structures

Welcome to MindMentor!

Computer Science

Computer Science

How Do You Build Efficient Data Organizations?

You have a million users. Need to find one instantly by username. You have a navigation system. Need to find shortest route through thousands of intersections. You have an undo feature. Need to track every action, go back through them, restore any previous state.

Simple arrays and lists work for small datasets. But they become inefficient for complex operations:

  • Inserting in the middle of an array: Move every element after it
  • Searching unsorted array: Check every single element
  • Managing relationships between data: Complex and slow

Abstract Data Structures (ADS) solve these problems. They organize data in specific ways optimized for specific operations. Need fast insertion/deletion? Linked lists. Need fast searching? Binary search trees or hash maps. Need unique values only? Sets.

Understanding these structures changes everything. You start to see which structure fits which problem. You choose the right tool for each task. You build efficient systems that scale.

This isn't just theoretical computer science. These structures power every major application—from databases to compilers to operating systems to search engines.

Let's explore how to build efficient data organizations.

Linked Lists

Definition: A linked list is a linear data structure where elements (nodes) are stored non-contiguously in memory, with each node containing data and a reference (link) to the next node.

Why Linked Lists?

Problem with arrays: Arrays store elements in contiguous memory. To insert in middle:

  1. Create new, larger array
  2. Copy all elements before insertion point
  3. Insert new element
  4. Copy all elements after insertion point

Time: O(n)

Linked lists solve this:

  • Elements can be anywhere in memory
  • Insert by changing references
  • Time: O(1) if you have reference to insertion point

Singly Linked Lists

Definition: A singly linked list is a linked list where each node contains data and a reference to the next node, allowing traversal in only one direction.

Node Structure

class Node:
    def __init__(self, data):
        self.data = data
        self.next = None  # Reference to next node

Linked List Implementation

class LinkedList:
    def __init__(self):
        self.head = None  # First node

    def is_empty(self):
        return self.head is None

    def insert_at_beginning(self, data):
        new_node = Node(data)
        new_node.next = self.head
        self.head = new_node

    def insert_at_end(self, data):
        new_node = Node(data)
        if self.is_empty():
            self.head = new_node
            return
        # Traverse to end
        current = self.head
        while current.next is not None:
            current = current.next
        current.next = new_node

    def delete(self, data):
        if self.is_empty():
            return
        # Deleting head
        if self.head.data == data:
            self.head = self.head.next
            return
        # Find node to delete
        current = self.head
        while current.next is not None:
            if current.next.data == data:
                current.next = current.next.next
                return
            current = current.next

    def search(self, data):
        current = self.head
        while current is not None:
            if current.data == data:
                return True
            current = current.next
        return False

    def display(self):
        elements = []
        current = self.head
        while current is not None:
            elements.append(str(current.data))
            current = current.next
        print(" -> ".join(elements))

Visual Representation

Head → [10|next] → [20|next] → [30|next] → [40|None]

Using the Linked List

ll = LinkedList()
ll.insert_at_end(10)
ll.insert_at_end(20)
ll.insert_at_end(30)
ll.display()              # 10 -> 20 -> 30

ll.insert_at_beginning(5)
ll.display()              # 5 -> 10 -> 20 -> 30

ll.delete(20)
ll.display()              # 5 -> 10 -> 30

print(ll.search(10))      # True
print(ll.search(99))      # False

Complexity

OperationTime Complexity
Insert at beginningO(1)
Insert at endO(n)
DeleteO(n)
SearchO(n)
Access by indexO(n)

Doubly Linked Lists

Definition: A doubly linked list is a linked list where each node contains data and references to both the next and previous nodes, allowing bidirectional traversal.

Node Structure

class DoublyNode:
    def __init__(self, data):
        self.data = data
        self.next = None
        self.prev = None  # Reference to previous node

Doubly Linked List Implementation

class DoublyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None  # Reference to last node

    def insert_at_beginning(self, data):
        new_node = DoublyNode(data)
        if self.head is None:
            self.head = self.tail = new_node
            return
        new_node.next = self.head
        self.head.prev = new_node
        self.head = new_node

    def insert_at_end(self, data):
        new_node = DoublyNode(data)
        if self.tail is None:
            self.head = self.tail = new_node
            return
        new_node.prev = self.tail
        self.tail.next = new_node
        self.tail = new_node

    def delete(self, data):
        current = self.head
        while current is not None:
            if current.data == data:
                # Update previous node
                if current.prev:
                    current.prev.next = current.next
                else:
                    self.head = current.next
                # Update next node
                if current.next:
                    current.next.prev = current.prev
                else:
                    self.tail = current.prev
                return
            current = current.next

    def display_forward(self):
        elements = []
        current = self.head
        while current is not None:
            elements.append(str(current.data))
            current = current.next
        print(" <-> ".join(elements))

    def display_backward(self):
        elements = []
        current = self.tail
        while current is not None:
            elements.append(str(current.data))
            current = current.prev
        print(" <-> ".join(elements))

Visual Representation

None ← [prev|10|next] ↔ [prev|20|next] ↔ [prev|30|next] → None
Head Tail

Using Doubly Linked List

dll = DoublyLinkedList()
dll.insert_at_end(10)
dll.insert_at_end(20)
dll.insert_at_end(30)

dll.display_forward()    # 10 <-> 20 <-> 30
dll.display_backward()   # 30 <-> 20 <-> 10

✅ Advantages over Singly Linked

  • Can traverse both directions
  • Easier deletion (have previous reference)
  • Insert before node without traversing from head

⚠️ Disadvantages

  • More memory (extra pointer per node)
  • More complex operations

Circular Linked Lists

Definition: A circular linked list is a linked list where the last node points back to the first node instead of None, forming a circle. Can be singly or doubly circular.

Singly Circular Linked List

class CircularLinkedList:
    def __init__(self):
        self.head = None

    def insert_at_end(self, data):
        new_node = Node(data)
        if self.head is None:
            self.head = new_node
            new_node.next = self.head  # Points to itself
            return
        current = self.head
        while current.next != self.head:
            current = current.next
        current.next = new_node
        new_node.next = self.head  # Circle back

    def display(self):
        if self.head is None:
            return
        elements = []
        current = self.head
        while True:
            elements.append(str(current.data))
            current = current.next
            if current == self.head:
                break
        print(" -> ".join(elements) + " -> (back to head)")

Visual Representation

┌─────────────────────────────┐
↓                                              |
[10|next] → [20|next] → [30|next]
Head

Applications

  • Round-robin scheduling
  • Multiplayer games (turn rotation)
  • Music playlists (loop mode)
  • Buffer implementations

Binary Search Trees (BST)

Definition: A binary search tree is a tree data structure where each node has at most two children, with all values in the left subtree less than the node's value and all values in the right subtree greater than the node's value.

Tree Terminology

  • Node: Element containing data and references to children
  • Root: Top node (no parent)
  • Parent: Node with children
  • Child: Node connected to parent
  • Leaf: Node with no children
  • Height: Longest path from root to leaf
  • Subtree: Tree formed by a node and its descendants

BST Implementation

class TreeNode:
    def __init__(self, data):
        self.data = data
        self.left = None   # Left child
        self.right = None  # Right child

class BinarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, data):
        if self.root is None:
            self.root = TreeNode(data)
        else:
            self._insert_recursive(self.root, data)

    def _insert_recursive(self, node, data):
        if data < node.data:
            if node.left is None:
                node.left = TreeNode(data)
            else:
                self._insert_recursive(node.left, data)
        else:
            if node.right is None:
                node.right = TreeNode(data)
            else:
                self._insert_recursive(node.right, data)

    def search(self, data):
        return self._search_recursive(self.root, data)

    def _search_recursive(self, node, data):
        if node is None:
            return False
        if data == node.data:
            return True
        elif data < node.data:
            return self._search_recursive(node.left, data)
        else:
            return self._search_recursive(node.right, data)

    def inorder_traversal(self, node, result=[]):
        if node:
            self.inorder_traversal(node.left, result)
            result.append(node.data)
            self.inorder_traversal(node.right, result)
        return result

Visual Example

Insert: 50, 30, 70, 20, 40, 60, 80

          50
       /     \
    30       70
   /  \    /  \
 20   40  60   80

BST Properties

  • Left subtree values < node value
  • Right subtree values > node value
  • No duplicates (typically)

Using BST

bst = BinarySearchTree()
values = [50, 30, 70, 20, 40, 60, 80]
for val in values:
    bst.insert(val)

print(bst.search(40))    # True
print(bst.search(99))    # False

sorted_values = bst.inorder_traversal(bst.root, [])
print(sorted_values)     # [20, 30, 40, 50, 60, 70, 80]

Tree Traversals

Inorder (Left, Root, Right)

Produces sorted sequence

Preorder (Root, Left, Right)

Good for copying tree

Postorder (Left, Right, Root)

Good for deleting tree

Example Tree

        4
      /   \
    2     6
   / \   / \
 1   3 5   7

Inorder: 1, 2, 3, 4, 5, 6, 7 (sorted)

Preorder: 4, 2, 1, 3, 6, 5, 7

Postorder: 1, 3, 2, 5, 7, 6, 4

BST Complexity

OperationAverageWorst Case
SearchO(log n)O(n)
InsertO(log n)O(n)
DeleteO(log n)O(n)

Worst Case: Unbalanced Tree

When tree becomes unbalanced (like a linked list). Insert in order: 1, 2, 3, 4, 5

1
 \
  2
   \
    3
     \
      4
       \
        5

Solution: Self-balancing trees (AVL, Red-Black trees) maintain O(log n)

Hash Maps

Definition: A hash map (hash table, dictionary) is a data structure that maps keys to values using a hash function to compute an index into an array of buckets.

How Hash Maps Work

Hash function: Converts key to array index

Example:

Key: "Alice"
Hash function: sum of ASCII values % array_size
"Alice" → (65+108+105+99+101) % 10 = 478 % 10 = 8
Store at index 8

Basic Implementation

class HashMap:
    def __init__(self, size=10):
        self.size = size
        self.table = [[] for _ in range(size)]  # Array of lists

    def _hash(self, key):
        return hash(key) % self.size

    def put(self, key, value):
        index = self._hash(key)
        # Check if key exists, update value
        for i, (k, v) in enumerate(self.table[index]):
            if k == key:
                self.table[index][i] = (key, value)
                return
        # Key doesn't exist, add new
        self.table[index].append((key, value))

    def get(self, key):
        index = self._hash(key)
        for k, v in self.table[index]:
            if k == key:
                return v
        raise KeyError(f"Key '{key}' not found")

    def remove(self, key):
        index = self._hash(key)
        for i, (k, v) in enumerate(self.table[index]):
            if k == key:
                del self.table[index][i]
                return
        raise KeyError(f"Key '{key}' not found")

    def contains(self, key):
        index = self._hash(key)
        return any(k == key for k, v in self.table[index])

Using Hash Map

phonebook = HashMap()
phonebook.put("Alice", "555-1234")
phonebook.put("Bob", "555-5678")
phonebook.put("Carol", "555-9012")

print(phonebook.get("Alice"))    # 555-1234
print(phonebook.get("Bob"))      # 555-5678

phonebook.put("Alice", "555-0000")  # Update
print(phonebook.get("Alice"))    # 555-0000

phonebook.remove("Bob")
print(phonebook.contains("Bob")) # False

Collision Resolution

Collision: Two keys hash to same index

1. Chaining (used above)

Each bucket is a list

Index 0: []
Index 1: [("Alice","555-1234"), ("David","555-3333")]
Index 2: [("Bob","555-5678")]

2. Open Addressing

Find next available slot

  • Linear probing: Check index+1, index+2, ...
  • Quadratic probing: Check index+1², index+2², ...

Hash Map Complexity

CaseInsertSearchDelete
AverageO(1)O(1)O(1)
Worst (all collisions)O(n)O(n)O(n)

In practice: With good hash function and load factor management, operations are O(1).

Python Dictionaries

Python's built-in dictionaries are hash maps:

# Create dictionary
student = {
    "name": "Alice",
    "id": "S12345",
    "major": "CS",
    "gpa": 3.8
}

# Access
print(student["name"])       # Alice

# Modify
student["gpa"] = 3.9

# Add
student["year"] = 3

# Check existence
if "major" in student:
    print(student["major"])

# Remove
del student["year"]

# Iterate
for key, value in student.items():
    print(f"{key}: {value}")

Sets

Definition: A set is a data structure that stores unique elements with no particular order, typically implemented using hash tables for fast membership testing.

Set Implementation

class Set:
    def __init__(self):
        self.elements = {}  # Use hash map

    def add(self, element):
        self.elements[element] = True

    def remove(self, element):
        if element in self.elements:
            del self.elements[element]

    def contains(self, element):
        return element in self.elements

    def size(self):
        return len(self.elements)

    def union(self, other_set):
        result = Set()
        for element in self.elements:
            result.add(element)
        for element in other_set.elements:
            result.add(element)
        return result

    def intersection(self, other_set):
        result = Set()
        for element in self.elements:
            if other_set.contains(element):
                result.add(element)
        return result

    def difference(self, other_set):
        result = Set()
        for element in self.elements:
            if not other_set.contains(element):
                result.add(element)
        return result

Using Sets

set_a = Set()
set_a.add(1)
set_a.add(2)
set_a.add(3)

set_b = Set()
set_b.add(2)
set_b.add(3)
set_b.add(4)

union = set_a.union(set_b)              # {1, 2, 3, 4}
intersection = set_a.intersection(set_b) # {2, 3}
difference = set_a.difference(set_b)     # {1}

Python Sets

# Create set
fruits = {"apple", "banana", "orange"}

# Add
fruits.add("grape")

# Remove
fruits.remove("banana")

# Check membership
if "apple" in fruits:
    print("Has apples")

# Set operations
set1 = {1, 2, 3, 4}
set2 = {3, 4, 5, 6}

print(set1 | set2)  # Union: {1, 2, 3, 4, 5, 6}
print(set1 & set2)  # Intersection: {3, 4}
print(set1 - set2)  # Difference: {1, 2}
print(set1 ^ set2)  # Symmetric difference: {1, 2, 5, 6}

Applications of Sets

Remove Duplicates

numbers = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
unique = list(set(numbers))  # [1, 2, 3, 4]

Membership Testing

valid_users = {"alice", "bob", "carol"}
if username in valid_users:
    grant_access()

Finding Common Elements

course1_students = {"alice", "bob", "carol"}
course2_students = {"bob", "david", "alice"}
both_courses = course1_students & course2_students  # {"alice", "bob"}

Choosing the Right Structure

  • Need fast search by key? → Hash map — Example: User lookup by ID
  • Need ordered data? → Binary search tree — Example: Sorted phone book
  • Need unique elements? → Set — Example: Tracking unique visitors
  • Need fast insert/delete in middle? → Linked list — Example: Undo/redo functionality
  • Need bidirectional traversal? → Doubly linked list — Example: Browser history (forward/back)
  • Need circular iteration? → Circular linked list — Example: Round-robin scheduling

Complexity Comparison

StructureSearchInsertDeleteNotes
ArrayO(n)O(n)O(n)O(1) if at end
Linked ListO(n)O(1)*O(1)**With reference
BST (balanced)O(log n)O(log n)O(log n)Must stay balanced
Hash MapO(1)†O(1)†O(1)††Average case
SetO(1)†O(1)†O(1)††Average case

Putting It All Together

You started wondering how to build efficient data organizations. Now you understand.

Linked lists organize data with references instead of contiguous memory— singly for one-direction traversal, doubly for bidirectional, circular for loops.

Binary search trees maintain sorted hierarchies where left children are smaller and right children are larger, enabling O(log n) operations when balanced.

Hash maps use hash functions to map keys to array indices, providing O(1) average-case operations through clever indexing and collision resolution.

Sets store unique elements using hash tables, supporting fast membership testing and mathematical set operations like union and intersection.

Each structure optimizes for specific operations. Choose based on your needs—fast searching, efficient insertion, ordered traversal, unique values.

Every major system uses these structures. Databases use B-trees (similar to BST). Compilers use hash maps for symbol tables. Operating systems use linked lists for process scheduling. Search engines use sophisticated tree structures.

Understanding abstract data structures changes how you think about efficiency. You're no longer limited to basic arrays. You can choose the right tool for each problem, building systems that scale.