I'll be honest: heaps scared me. Whenever I saw "implement a priority queue" or "find the top K elements," I'd freeze. The word "heap" conjured images of memory management and low-level computer science stuff I didn't fully understand.

Then I actually sat down with Python's heapq module. Turns out? It's just a list with superpowers.

What's a Heap, Really?

Before we touch the code, let me explain heaps the way I wish someone had explained them to me.

A heap is a list where the smallest element is always at the front. That's it. That's the magic.

heap = [1, 3, 5, 7, 9, 8, 6]
#      ^
#      Smallest is always at index 0

The rest of the list follows a pattern called the "heap invariant"—each parent is smaller than its children. But honestly? You don't need to understand the tree structure to use heaps effectively. Python handles all that internally.

What you need to know:

  • Getting the minimum takes O(1) — just peek at index 0
  • Adding an item takes O(log n) — it bubbles up to the right spot
  • Removing the minimum takes O(log n) — it rebalances automatically

Compare that to a sorted list where insertion is O(n). For anything involving "give me the smallest thing" repeatedly, heaps win.

The Aha Moment: heappush and heappop

Let me show you the basics that made everything click:

import heapq
 
# Start with an empty list (that's your heap!)
tasks = []
 
# Add items with heappush
heapq.heappush(tasks, 5)
heapq.heappush(tasks, 2)
heapq.heappush(tasks, 8)
heapq.heappush(tasks, 1)
 
print(tasks)  # [1, 2, 8, 5] — looks weird, but 1 is at index 0
 
# Get the smallest with heappop
smallest = heapq.heappop(tasks)
print(smallest)  # 1
 
smallest = heapq.heappop(tasks)
print(smallest)  # 2
 
# The heap rebalances each time
print(tasks)  # [5, 8]

The key insight: heappop doesn't just remove index 0. It removes the smallest element and reorganizes the heap so the next smallest is now at index 0. That reorganization is O(log n), which is why heaps are fast.

The "Just Peek" Trick

Want to see the minimum without removing it?

heap = [3, 5, 7]
smallest = heap[0]  # Just look at index 0!

No special function needed. The smallest is always there.

heapify: Converting an Existing List

Sometimes you already have data and want to turn it into a heap. Don't push items one by one—use heapify:

import heapq
 
scores = [45, 23, 89, 12, 67, 34]
heapq.heapify(scores)  # In-place transformation!
print(scores)  # [12, 23, 34, 45, 67, 89]
 
# Now scores behaves like a heap
print(scores[0])  # 12 — the minimum

Here's the beautiful part: heapify is O(n), not O(n log n). It's mathematically cheaper than pushing each element individually. I was skeptical about this until I read about the "siftdown" algorithm—trust me, it works.

When I use heapify:

  • Loading data from a file/database and need priority ordering
  • Converting results from another operation
  • Starting a problem with an initial dataset

When I use heappush loop:

  • Building the heap dynamically as items arrive
  • Streaming data scenarios

nlargest and nsmallest: The Shortcuts I Wish I'd Known Sooner

For weeks, I wrote code like this when I needed the top 3 items:

# The hard way (what I used to do)
sorted_list = sorted(items, reverse=True)
top_3 = sorted_list[:3]

Then someone showed me this:

import heapq
 
scores = [45, 23, 89, 12, 67, 34, 78, 56]
 
# Top 3 scores
top_3 = heapq.nlargest(3, scores)
print(top_3)  # [89, 78, 67]
 
# Bottom 3 scores
bottom_3 = heapq.nsmallest(3, scores)
print(bottom_3)  # [12, 23, 34]

These functions are smarter than just sorting. When K is small relative to N, they use a heap internally and avoid sorting the entire list. O(n log k) vs O(n log n).

The key Parameter Changes Everything

Just like sorted(), you can use a key function:

import heapq
 
employees = [
    {"name": "Alice", "salary": 75000},
    {"name": "Bob", "salary": 55000},
    {"name": "Carol", "salary": 90000},
    {"name": "Dave", "salary": 65000},
]
 
# Highest paid employees
top_earners = heapq.nlargest(2, employees, key=lambda e: e["salary"])
print(top_earners)
# [{'name': 'Carol', 'salary': 90000}, {'name': 'Alice', 'salary': 75000}]
 
# Lowest paid
lowest = heapq.nsmallest(1, employees, key=lambda e: e["salary"])
print(lowest[0]["name"])  # Bob

When to Use What

This took me a while to internalize:

ScenarioBest Tool
Just need min/maxmin() / max()
Need top K, K is smallnsmallest / nlargest
Need top K, K ≈ Nsorted()
Repeated min extractionheap with heappop

Understanding the Heap Invariant (Finally)

Okay, let's demystify the "heap invariant." It's not as scary as it sounds.

A min-heap has one rule: every parent must be smaller than or equal to its children.

The heap is stored as a flat list, but conceptually it's a binary tree:

       1
      / \
     3   5
    / \
   7   9

Stored as: [1, 3, 5, 7, 9]

The parent-child relationships use simple index math:

  • Parent of index i is at (i - 1) // 2
  • Children of index i are at 2*i + 1 and 2*i + 2

But here's the thing: you never need to think about this. Python's heapq handles all the bookkeeping. I'm only explaining it because understanding why heaps work helped me trust them.

The invariant explains why the list might look "unsorted":

import heapq
 
data = [5, 3, 8, 1, 9, 2]
heapq.heapify(data)
print(data)  # [1, 3, 2, 5, 9, 8]

That list isn't sorted! But it does satisfy the heap property. 1 is less than its children (3 and 2), 3 is less than its children (5 and 9), and so on. The minimum bubbles to the top, but the rest is only partially ordered.

Python Only Has Min-Heaps (And How to Work Around It)

This tripped me up. I needed a max-heap to always get the largest element. But Python's heapq only does min-heaps.

The solution: negate your values.

import heapq
 
# Max-heap by negating
max_heap = []
heapq.heappush(max_heap, -5)
heapq.heappush(max_heap, -3)
heapq.heappush(max_heap, -8)
 
# "Pop minimum" gives us the most negative (i.e., largest original)
largest = -heapq.heappop(max_heap)
print(largest)  # 8

It feels hacky, but it's the standard pattern. If you're doing this a lot, wrap it in a class:

import heapq
 
class MaxHeap:
    def __init__(self):
        self._heap = []
    
    def push(self, value):
        heapq.heappush(self._heap, -value)
    
    def pop(self):
        return -heapq.heappop(self._heap)
    
    def peek(self):
        return -self._heap[0] if self._heap else None
    
    def __len__(self):
        return len(self._heap)

Building a Real Priority Queue

Here's where heaps become practical. A priority queue processes items by priority, not arrival order.

The simplest version:

import heapq
 
class PriorityQueue:
    def __init__(self):
        self._heap = []
    
    def push(self, item, priority):
        # Tuple comparison: compares priority first, then item
        heapq.heappush(self._heap, (priority, item))
    
    def pop(self):
        priority, item = heapq.heappop(self._heap)
        return item
    
    def __bool__(self):
        return bool(self._heap)
 
# Usage
pq = PriorityQueue()
pq.push("fix critical bug", priority=1)
pq.push("update docs", priority=5)
pq.push("add feature", priority=3)
 
while pq:
    print(pq.pop())
# fix critical bug
# add feature
# update docs

The Tie-Breaking Problem

There's a gotcha. If two items have the same priority, Python tries to compare the items themselves:

pq.push({"task": "a"}, priority=1)
pq.push({"task": "b"}, priority=1)  # Error! Can't compare dicts

The fix: add a tiebreaker. I use a counter:

import heapq
from dataclasses import dataclass, field
from typing import Any
 
@dataclass(order=True)
class PrioritizedItem:
    priority: int
    counter: int
    item: Any = field(compare=False)
 
class PriorityQueue:
    def __init__(self):
        self._heap = []
        self._counter = 0
    
    def push(self, item, priority):
        entry = PrioritizedItem(priority, self._counter, item)
        heapq.heappush(self._heap, entry)
        self._counter += 1
    
    def pop(self):
        return heapq.heappop(self._heap).item

The counter ensures FIFO order for equal priorities, and field(compare=False) excludes the item from comparisons.

Common Patterns That Finally Made Sense

Pattern 1: Top-K Problems

"Find the K largest elements" is everywhere. Interviews, data processing, recommendations.

import heapq
 
def top_k_scores(scores: list[int], k: int) -> list[int]:
    """Return the K highest scores."""
    return heapq.nlargest(k, scores)
 
# Or with more complex data
def top_k_users_by_points(users: list[dict], k: int) -> list[dict]:
    return heapq.nlargest(k, users, key=lambda u: u["points"])

But sometimes you're streaming data and can't hold everything in memory. Use a bounded min-heap:

import heapq
 
def streaming_top_k(stream, k):
    """Track top K items from a stream, memory efficient."""
    heap = []
    
    for item in stream:
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:  # Bigger than smallest in our top-K
            heapq.heapreplace(heap, item)  # Pop smallest, push new
    
    return sorted(heap, reverse=True)

This only keeps K items in memory, no matter how large the stream.

Pattern 2: Merging Sorted Streams

You have multiple sorted lists and need them merged into one sorted stream. This comes up with log files, database results, or external sorting.

import heapq
 
log_files = [
    ["2026-03-22 10:01 INFO Started"],
    ["2026-03-22 10:00 DEBUG Init", "2026-03-22 10:02 ERROR Failed"],
    ["2026-03-22 10:01 WARN Slow query"],
]
 
# Merge maintaining chronological order
for line in heapq.merge(*log_files):
    print(line)

Output is perfectly sorted. And merge is lazy—it doesn't load everything into memory at once.

Pattern 3: K-Way Merge

The merge pattern generalizes to any number of sorted iterables:

import heapq
 
def merge_sorted_files(file_paths):
    """Merge multiple sorted files into one sorted output."""
    files = [open(path) for path in file_paths]
    
    try:
        # heapq.merge handles everything
        for line in heapq.merge(*files, key=str.lower):
            yield line.strip()
    finally:
        for f in files:
            f.close()

Pattern 4: Task Scheduling

Schedule tasks to run at specific times:

import heapq
from datetime import datetime, timedelta
 
class Scheduler:
    def __init__(self):
        self._tasks = []
        self._counter = 0
    
    def schedule(self, delay_seconds: float, task_name: str):
        run_at = datetime.now() + timedelta(seconds=delay_seconds)
        heapq.heappush(self._tasks, (run_at, self._counter, task_name))
        self._counter += 1
    
    def next_task(self):
        if self._tasks:
            run_at, _, name = heapq.heappop(self._tasks)
            return run_at, name
        return None
    
    def peek_next(self):
        if self._tasks:
            return self._tasks[0][0], self._tasks[0][2]
        return None
 
scheduler = Scheduler()
scheduler.schedule(10, "send_email")
scheduler.schedule(5, "process_payment")
scheduler.schedule(15, "generate_report")
 
# Tasks come out in time order
print(scheduler.next_task())  # (~5s from now, "process_payment")
print(scheduler.next_task())  # (~10s from now, "send_email")

Pattern 5: Running Median

This one blew my mind—using TWO heaps to track a running median:

import heapq
 
class RunningMedian:
    """Track median of a stream using two heaps."""
    
    def __init__(self):
        self.small = []  # Max-heap (negated) for smaller half
        self.large = []  # Min-heap for larger half
    
    def add(self, num):
        # Always add to small first
        heapq.heappush(self.small, -num)
        
        # Move largest from small to large
        heapq.heappush(self.large, -heapq.heappop(self.small))
        
        # Keep small >= large in size
        if len(self.large) > len(self.small):
            heapq.heappush(self.small, -heapq.heappop(self.large))
    
    def median(self):
        if len(self.small) > len(self.large):
            return -self.small[0]
        return (-self.small[0] + self.large[0]) / 2
 
rm = RunningMedian()
for num in [2, 1, 5, 7, 2, 0, 5]:
    rm.add(num)
    print(f"Added {num}, median now: {rm.median()}")

The insight: small heap holds the smaller half of numbers (as a max-heap), large heap holds the larger half. Median is either the top of small (if odd count) or average of both tops.

The Efficiency Shortcuts: heappushpop and heapreplace

Two functions I overlooked at first but now use all the time:

import heapq
 
heap = [1, 3, 5, 7]
 
# heappushpop: push then pop (faster than separate calls)
result = heapq.heappushpop(heap, 4)
print(result)  # 1 (the old minimum)
print(heap)    # [3, 4, 5, 7]
 
# heapreplace: pop then push (faster than separate calls)
result = heapq.heapreplace(heap, 2)
print(result)  # 3 (the old minimum)
print(heap)    # [2, 4, 5, 7]

The difference is subtle:

  • heappushpop(heap, item): Pushes item, then pops. If the new item is smallest, you get it right back.
  • heapreplace(heap, item): Pops first, then pushes. The popped item might be smaller than what you're pushing.

I use heapreplace in the streaming top-K pattern—replace the smallest with a new candidate.

Quick Reference

import heapq
 
# Create heap from list
heapq.heapify(x)              # In-place, O(n)
 
# Add item
heapq.heappush(heap, item)    # O(log n)
 
# Remove and return smallest
heapq.heappop(heap)           # O(log n)
 
# Peek at smallest
heap[0]                       # O(1)
 
# Push then pop (efficient combo)
heapq.heappushpop(heap, item) # O(log n)
 
# Pop then push (efficient combo)
heapq.heapreplace(heap, item) # O(log n)
 
# Find K smallest/largest
heapq.nsmallest(k, iterable, key=None)  # O(n log k)
heapq.nlargest(k, iterable, key=None)   # O(n log k)
 
# Merge sorted iterables
heapq.merge(*iterables, key=None, reverse=False)  # O(n) total

What I Wish I'd Known Earlier

  1. Heaps are just lists. There's no special heap type. Any list can be a heap if you only modify it with heapq functions.

  2. The "heap" is the invariant, not the data structure. If you append to a heap without using heappush, you break it.

  3. nlargest/nsmallest exist. Don't sort when you only need a few elements.

  4. Negation is the standard max-heap trick. It's not hacky, it's the pattern.

  5. heapify is O(n). Don't push elements one by one when you have existing data.

  6. You don't need to understand the tree structure to use heaps. Just know: smallest is always at index 0, and the heapq functions maintain that.

Heaps stopped being scary once I realized they're just a clever way to always keep the minimum accessible. Now I reach for them any time I see "priority," "top K," or "schedule."

The heapq module is in the standard library. Nothing to install. It's been there all along, waiting for you to stop being afraid of it.

React to this post: