I've been diving deep into Python's memory management lately, and honestly, it's been a bit of a rabbit hole. When I first heard terms like "reference counting" and "weak references," I nodded along like I understood. Spoiler: I didn't. So I decided to actually learn this stuff properly, and here's what I figured out.

How Python Memory Management Actually Works

Before we get to the cool weakref stuff, we need to understand how Python handles memory in the first place. Python uses reference counting as its primary memory management strategy, with a garbage collector as backup for tricky situations.

Reference Counting Basics

Every object in Python has a reference count—basically a number tracking how many things are pointing at it. When that number hits zero, Python immediately frees the memory.

import sys
 
# Create an object
data = [1, 2, 3]
print(sys.getrefcount(data))  # 2 (includes the temp ref from getrefcount)
 
# Add another reference
also_data = data
print(sys.getrefcount(data))  # 3
 
# Remove one
del also_data
print(sys.getrefcount(data))  # 2
 
# Remove the last real reference
del data
# Object is immediately deallocated - memory freed!

The sys.getrefcount() function always returns one more than you'd expect because passing the object to the function temporarily creates another reference.

This system is elegant for most cases. Object goes out of scope? Reference count drops. Hits zero? Memory freed. No waiting, no periodic cleanup passes.

The Problem: Circular References

But here's where things get tricky. What if two objects reference each other?

class Node:
    def __init__(self, name):
        self.name = name
        self.neighbor = None
 
a = Node("A")
b = Node("B")
 
# Create a cycle
a.neighbor = b
b.neighbor = a
 
del a
del b
# Uh oh... both objects still exist!
# Each has refcount of 1 (from the other)

After del a and del b, both objects still have a reference count of 1—they're keeping each other alive! The variables are gone, but the objects are zombies, consuming memory forever.

This is where Python's garbage collector comes in. It periodically scans for these cycles and cleans them up. But there's a better approach: don't create the problem in the first place.
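
You can watch the collector reclaim such a cycle. A minimal sketch, reusing the Node class from above:

```python
import gc

class Node:
    def __init__(self, name):
        self.name = name
        self.neighbor = None

a = Node("A")
b = Node("B")
a.neighbor = b
b.neighbor = a

del a
del b

# The cycle is unreachable but still in memory until the collector runs
collected = gc.collect()
print(f"Collected {collected} objects")  # at least the two Node objects
```

gc.collect() returns the number of unreachable objects it found; the exact count varies by Python version because instance __dict__s may or may not be tracked as separate objects.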

Enter weakref

The weakref module provides weak references—references that don't increase an object's reference count. The object can be garbage collected even if weak references to it exist.

import weakref
 
class BigData:
    def __init__(self, data):
        self.data = data
        print(f"Created BigData with {len(data)} items")
    
    def __del__(self):
        print("BigData is being deleted!")
 
# Strong reference keeps object alive
obj = BigData([1] * 1000000)
 
# Create a weak reference
weak = weakref.ref(obj)
 
# Access through weak reference
print(weak())  # <BigData object>
 
# Delete strong reference
del obj
# Prints: BigData is being deleted!
 
# Weak reference now returns None
print(weak())  # None

The weak reference lets you access the object while it exists, but doesn't prevent it from being collected. When the object is gone, calling the weak reference returns None.

Adding Callbacks

You can get notified when the object dies:

import weakref
 
class Connection:
    def __init__(self, host):
        self.host = host
        print(f"Connected to {host}")
 
def connection_lost(weak_ref):
    print("Connection object was garbage collected!")
 
conn = Connection("localhost")
weak_conn = weakref.ref(conn, connection_lost)
 
del conn
# Prints: Connection object was garbage collected!

This is useful for cleanup tasks—maybe you want to log when resources are released, or update some registry.
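
For instance, here's a hypothetical self-pruning registry (the names are mine; the next section shows that WeakValueDictionary packages this pattern for you):

```python
import weakref

registry = {}

class Service:
    pass

def register(name, obj):
    # The callback fires when obj is collected and removes the stale entry
    def on_collect(wr, name=name):
        registry.pop(name, None)
        print(f"Removed {name!r} from registry")
    registry[name] = weakref.ref(obj, on_collect)

svc = Service()
register("auth", svc)
print(len(registry))  # 1

del svc  # callback fires as the object dies
print(len(registry))  # 0
```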

WeakValueDictionary: Caches That Don't Leak

This is where weakref really shines. Imagine you're building a cache:

# The naive approach - MEMORY LEAK!
class UserCache:
    _cache = {}
    
    @classmethod
    def get_user(cls, user_id):
        if user_id not in cls._cache:
            cls._cache[user_id] = load_user_from_db(user_id)
        return cls._cache[user_id]

The problem? Users stay in _cache forever, even if no one else is using them. Your cache grows and grows until you restart the app.

Here's the fix:

import weakref
 
class User:
    def __init__(self, user_id, name):
        self.user_id = user_id
        self.name = name
    
    def __repr__(self):
        return f"User({self.user_id}, {self.name!r})"
 
class UserCache:
    _cache = weakref.WeakValueDictionary()
    
    @classmethod
    def get_user(cls, user_id):
        if user_id in cls._cache:
            print(f"Cache hit for {user_id}")
            return cls._cache[user_id]
        
        # Simulate database lookup
        print(f"Loading user {user_id} from database...")
        user = User(user_id, f"User{user_id}")
        cls._cache[user_id] = user
        return user
 
# Usage
user1 = UserCache.get_user(1)  # Loads from database
user1_again = UserCache.get_user(1)  # Cache hit!
 
print(f"Cache size: {len(UserCache._cache)}")  # 1
 
# When no one holds user1 anymore...
del user1
del user1_again
 
# The entry disappears from the cache automatically!
# (immediately under CPython's refcounting; gc.collect() covers other implementations)
import gc
gc.collect()
 
print(f"Cache size: {len(UserCache._cache)}")  # 0

WeakValueDictionary holds weak references to its values. When a value is garbage collected, its key-value pair automatically disappears from the dictionary. No memory leak!

When to Use WeakValueDictionary

  • Object caches: Cache expensive computations without preventing GC
  • Object pools: Reuse objects while they exist
  • ID → Object mappings: Look up objects by ID without keeping them alive
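
As a sketch of the ID → Object case, a class-level WeakValueDictionary lets you look up live instances by ID without pinning them in memory (the Session class here is illustrative):

```python
import gc
import weakref

class Session:
    _live = weakref.WeakValueDictionary()

    def __init__(self, session_id):
        self.session_id = session_id
        # Register the instance without extending its lifetime
        Session._live[session_id] = self

    @classmethod
    def find(cls, session_id):
        # Returns the session if it is still alive, else None
        return cls._live.get(session_id)

s = Session("abc123")
print(Session.find("abc123") is s)  # True

del s
gc.collect()
print(Session.find("abc123"))  # None
```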

WeakKeyDictionary: Attaching Metadata

What if you want to store extra data about objects, without modifying those objects? And without keeping them alive forever?

import weakref
 
class Document:
    def __init__(self, title, content):
        self.title = title
        self.content = content
 
# Store metadata without modifying Document class
metadata = weakref.WeakKeyDictionary()
 
doc1 = Document("README", "# Hello")
doc2 = Document("CHANGELOG", "## v1.0")
 
# Attach metadata
metadata[doc1] = {"views": 42, "last_accessed": "2026-03-22"}
metadata[doc2] = {"views": 7, "last_accessed": "2026-03-21"}
 
print(f"doc1 views: {metadata[doc1]['views']}")  # 42
 
# Delete a document
del doc2
 
# Its metadata is automatically cleaned up
import gc
gc.collect()
 
print(list(metadata.keys()))  # [<Document object>] - only doc1

WeakKeyDictionary is the inverse of WeakValueDictionary—it holds weak references to its keys. When a key is garbage collected, the entry is removed.

When to Use WeakKeyDictionary

  • Object metadata: Store extra info without modifying classes
  • Memoization by object: Cache results keyed by mutable objects
  • Tracking object state: Monitor objects without keeping them alive

weakref.finalize: Clean Cleanup

Sometimes __del__ methods get complicated—especially with circular references. weakref.finalize provides a cleaner alternative for running cleanup code when objects are collected.

import weakref
import os
import tempfile
 
class TempFileManager:
    def __init__(self):
        # Create a temp file
        fd, self.path = tempfile.mkstemp()
        os.close(fd)
        
        with open(self.path, 'w') as f:
            f.write("temporary data")
        
        print(f"Created temp file: {self.path}")
        
        # Register cleanup - will run when object is collected
        # Note: cleanup function must not reference 'self'!
        self._finalizer = weakref.finalize(
            self, 
            self._cleanup, 
            self.path  # Pass path as argument, not self.path
        )
    
    @staticmethod
    def _cleanup(path):
        """Static method - can't reference self."""
        if os.path.exists(path):
            os.unlink(path)
            print(f"Cleaned up temp file: {path}")
    
    def __enter__(self):
        return self
    
    def __exit__(self, *args):
        # Explicitly clean up
        self._finalizer()
 
# Automatic cleanup on garbage collection
tmp = TempFileManager()
print(f"File exists: {os.path.exists(tmp.path)}")  # True
 
del tmp
import gc
gc.collect()
# Prints: Cleaned up temp file: /tmp/xxxxx

Why finalize over __del__?

  1. No circular reference issues: finalize holds no strong reference to the object
  2. Guaranteed to run: finalizers still fire at interpreter exit, where __del__ may not
  3. Can be called manually: _finalizer() triggers cleanup early
  4. Deterministic: check _finalizer.alive to see whether cleanup has happened

import weakref
 
class Resource:
    def __init__(self, name):
        self.name = name
        self._finalizer = weakref.finalize(
            self, 
            print, 
            f"Resource {name} cleaned up"
        )
    
    @property
    def alive(self):
        return self._finalizer.alive
    
    def close(self):
        """Explicit cleanup."""
        self._finalizer()
 
r = Resource("database")
print(f"Is alive: {r.alive}")  # True
 
r.close()  # Prints: Resource database cleaned up
print(f"Is alive: {r.alive}")  # False
 
# Won't run again - already cleaned up
del r

Avoiding Memory Leaks: Practical Patterns

Let me share some patterns I've learned for avoiding memory leaks.

Pattern 1: Observer Pattern with WeakSet

import weakref
 
class EventEmitter:
    def __init__(self):
        self._listeners = weakref.WeakSet()
    
    def subscribe(self, listener):
        self._listeners.add(listener)
    
    def emit(self, event):
        # Iterate safely - listeners might disappear mid-iteration
        for listener in list(self._listeners):
            listener.on_event(event)
 
class Logger:
    def on_event(self, event):
        print(f"LOG: {event}")
 
class Analytics:
    def on_event(self, event):
        print(f"TRACK: {event}")
 
emitter = EventEmitter()
 
logger = Logger()
analytics = Analytics()
 
emitter.subscribe(logger)
emitter.subscribe(analytics)
 
emitter.emit("user_login")
# LOG: user_login
# TRACK: user_login
 
# Delete logger - it automatically unsubscribes
del logger
 
emitter.emit("user_logout")
# TRACK: user_logout  (only analytics receives it)

With WeakSet, you don't need explicit unsubscribe logic. When a listener is garbage collected, it's automatically removed from the set.
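
One caveat worth knowing: this works because we store listener objects. Storing bound methods (emitter.subscribe(logger.on_event)) would fail silently, since Python creates a fresh bound-method object on every attribute access and nothing else keeps it alive. weakref.WeakMethod handles that case:

```python
import weakref

class Logger:
    def on_event(self, event):
        print(f"LOG: {event}")

logger = Logger()

# A plain weak reference to a bound method dies immediately...
dead = weakref.ref(logger.on_event)
print(dead())  # None - the temporary bound method is already gone

# ...but WeakMethod re-creates the bound method while the instance lives
wm = weakref.WeakMethod(logger.on_event)
wm()("user_login")  # LOG: user_login

del logger
print(wm())  # None - the instance is gone, so the method is too
```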

Pattern 2: Back-References Without Cycles

import weakref
 
class Parent:
    def __init__(self, name):
        self.name = name
        self.children = []
    
    def add_child(self, child):
        self.children.append(child)
        child._parent = weakref.ref(self)
 
class Child:
    def __init__(self, name):
        self.name = name
        self._parent = None
    
    @property
    def parent(self):
        if self._parent is None:
            return None
        return self._parent()  # Dereference weak ref
    
    def __repr__(self):
        parent_name = self.parent.name if self.parent else "None"
        return f"Child({self.name!r}, parent={parent_name})"
 
p = Parent("mom")
c = Child("kid")
p.add_child(c)
 
print(c.parent.name)  # mom
print(c)  # Child('kid', parent=mom)
 
# No circular reference - child holds weak ref to parent
del p  # Parent can be collected even though child exists
 
print(c.parent)  # None

Pattern 3: Self-Cleaning Cache with Size Limit

import weakref
from collections import OrderedDict
 
class LRUCache:
    """
    Cache that:
    1. Uses weak references (auto-cleanup when objects collected)
    2. Has a max size (LRU eviction)
    """
    
    def __init__(self, maxsize=100):
        self._cache = weakref.WeakValueDictionary()
        self._order = OrderedDict()  # Tracks access order
        self.maxsize = maxsize
    
    def get(self, key):
        if key in self._cache:
            # Mark as most recently used (re-adding the key if it was
            # evicted from _order while the object stayed alive)
            self._order[key] = None
            self._order.move_to_end(key)
            return self._cache[key]
        return None
    
    def put(self, key, value):
        # Add to cache
        self._cache[key] = value
        self._order[key] = None
        self._order.move_to_end(key)
        
        # Evict oldest if over size
        while len(self._order) > self.maxsize:
            oldest = next(iter(self._order))
            del self._order[oldest]
            # Don't delete from _cache - weak ref handles it
    
    def __len__(self):
        return len(self._cache)
 
cache = LRUCache(maxsize=3)
 
# Create some objects
objs = [object() for _ in range(5)]
 
for i, obj in enumerate(objs):
    cache.put(f"key{i}", obj)
 
print(len(cache))  # 5 (weak refs don't evict automatically)
 
# Delete the first two objects
del objs[:2]
 
import gc
gc.collect()
 
print(len(cache))  # 3 (weak refs auto-removed)

The gc Module: Your Debugging Friend

The gc module lets you peek under the hood and control Python's garbage collector.

Basic gc Operations

import gc
 
# Force a collection
collected = gc.collect()
print(f"Collected {collected} unreachable objects")
 
# Check GC status
print(f"GC enabled: {gc.isenabled()}")
 
# Temporarily disable (for performance-critical sections)
gc.disable()
# ... performance critical code ...
gc.enable()
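
To make that disable/enable dance exception-safe, you can wrap it in a small context manager (my own sketch, not part of the gc module):

```python
import gc
from contextlib import contextmanager

@contextmanager
def gc_paused():
    """Temporarily pause the garbage collector."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        # Restore the previous state even if the block raised
        if was_enabled:
            gc.enable()

with gc_paused():
    # Allocation-heavy work runs without collection pauses
    chunks = [str(i) * 10 for i in range(1000)]

print(gc.isenabled())  # True - restored afterwards
```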

Understanding Generations

Python's GC uses a generational strategy. Objects start in generation 0 and get promoted as they survive collections:

import gc
 
# (gen0_threshold, gen1_threshold, gen2_threshold)
print(gc.get_threshold())  # e.g. (700, 10, 10) - exact defaults vary by version

# Meaning:
# - Collect gen 0 when (allocations - deallocations) exceeds the first threshold
# - Collect gen 1 after 10 gen-0 collections
# - Collect gen 2 after 10 gen-1 collections
 
# Current counts
print(gc.get_count())  # (123, 5, 2)
# 123 new allocations, 5 gen-0 collections since last gen-1, etc.

Finding Memory Leaks

import gc
 
def find_objects_of_type(type_name):
    """Find all objects of a given type."""
    return [
        obj for obj in gc.get_objects()
        if type(obj).__name__ == type_name
    ]
 
class LeakyClass:
    instances = []  # Class variable - potential leak!
    
    def __init__(self, data):
        self.data = data
        LeakyClass.instances.append(self)  # Leak!
 
# Create and "delete" objects
for i in range(100):
    obj = LeakyClass(f"data{i}")
    del obj
 
gc.collect()
 
# Find the leaks
leaky = find_objects_of_type("LeakyClass")
print(f"Found {len(leaky)} LeakyClass objects!")  # 100 - they leaked!

Reference Analysis

import gc
 
class Mystery:
    pass
 
obj = Mystery()
 
# Who references this object?
referrers = gc.get_referrers(obj)
print(f"Referenced by {len(referrers)} objects")
 
# What does this object reference?
obj.data = [1, 2, 3]
referents = gc.get_referents(obj)
print(f"References {len(referents)} objects")  # the instance's attributes and its class

Debugging gc.garbage

Objects that the collector finds unreachable but cannot free end up in gc.garbage. Since Python 3.4 (PEP 442), cycles involving __del__ are collected normally, so this list is almost always empty:

import gc
 
class Trouble:
    def __init__(self, partner=None):
        self.partner = partner
    
    def __del__(self):
        # Before Python 3.4, a __del__ in a reference cycle
        # made the whole cycle uncollectable
        pass
 
# Create a cycle between two objects that define __del__
a = Trouble()
b = Trouble(a)
a.partner = b
 
del a, b
gc.collect()
 
# Check for trouble
if gc.garbage:
    print(f"Uncollectable objects: {gc.garbage}")
else:
    print("No garbage! (Python 3.4+ handles __del__ cycles)")

One thing to watch out for: gc.set_debug(gc.DEBUG_SAVEALL) diverts all unreachable objects into gc.garbage instead of freeing them — not just the uncollectable ones — which makes it handy for inspecting what a collection would have reclaimed, but misleading if you treat gc.garbage as a leak detector while it's set.

Quick Reference Table

Tool                   Use Case
weakref.ref()          Single weak reference to an object
WeakValueDictionary    Cache objects by key, auto-cleanup values
WeakKeyDictionary      Store metadata keyed by objects
WeakSet                Collection of objects (observers, listeners)
weakref.finalize()     Cleanup callback when object dies
gc.collect()           Force garbage collection
gc.get_referrers()     Debug: who references this object?
gc.get_objects()       Debug: all tracked objects

What I Learned

Memory management in Python is mostly automatic—that's the beauty of it. But understanding what happens behind the scenes helps you:

  1. Avoid leaks by breaking circular references with weak refs
  2. Build efficient caches that don't balloon forever
  3. Implement patterns (like observers) without manual cleanup
  4. Debug memory issues when things go wrong

The key insight for me was this: weak references let objects die when they should. Regular references are a promise: "I need you to stay alive." Weak references are more like: "I'd like to use you if you're around, but don't stick around on my account."

Start simple—understand reference counting first. Then reach for weakref when you hit caching or observer patterns. And keep gc in your toolbox for when you need to debug memory issues.

That's it! I'm still learning, but this foundation has already helped me write better Python code. Hope it helps you too.
