Article

Original Interpretation: Python Garbage Collection - The Three Most Common Misconceptions

Deconstructing the three major misconceptions about reference counting, gc.collect(), and del statements, establishing a complete cognitive framework for Python GC mechanisms (reference counting + generational GC + cycle detection)

Topic · Python Series Python Memory Model Deep Dive 2/7

Original Interpretation Python Garbage Collection Memory Management Performance

Introduction: Why This Concept Gets More Confusing the More You Learn

“Python uses reference counting for garbage collection.”

If you learned Python from textbooks or introductory tutorials, you’ve probably seen this sentence. It’s concise enough, correct enough, and good enough to get you through interviews.

But when you start writing large model training scripts and encounter memory explosions caused by circular references; when you call gc.collect() in production and find memory hasn’t moved; when you think the del statement deleted an object but memory usage remains high—this sentence starts to feel inadequate.

Worse, three different scenarios require three different answers, but the textbook only gave one sentence. This is the source of conceptual confusion.

This article deconstructs three of the most common misconceptions and rebuilds a more reliable Python GC understanding framework.

Misconception 1: Python Only Has Reference Counting

Why People Think This

Textbook simplification. Reference counting is CPython’s most visible GC mechanism—every object has an ob_refcnt field, every assignment, parameter passing, and container insertion modifies it. sys.getrefcount() can check it at any time, and Py_INCREF/Py_DECREF macros are everywhere in the source code.

Intuitive alignment. “Delete when no one references” aligns with human intuition. In contrast, Java’s reachability analysis and Go’s tri-color marking are more abstract.

Presence difference. Generational GC and cycle detection hide in the gc module, running automatically by default—most developers never directly call it.

Why This Understanding Is Wrong

Scenario: Circular References

class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

a = Node(1)
b = Node(2)
a.next = b
b.next = a  # Circular reference!

del a
del b
# Both objects still exist because they reference each other

In this scenario, reference counting completely fails. a and b reference each other, both have reference count 1, and even if a and b are deleted externally, the two objects will never be recycled.

Reference count reaching 0 means immediate recycling, but not necessarily immediate release. Recycling means the object is destroyed, but whether memory is returned to the operating system depends on the three-layer architecture explained in Part 1.

The existence of Generational GC: Python’s gc module handles circular references. It uses a generational strategy, dividing objects into three generations (0, 1, 2). New objects are in generation 0, promoted to the next generation after surviving multiple GC cycles.

What Is the More Accurate Understanding

Python’s GC is a collaboration of three mechanisms:

Python GC Three Mechanisms Figure 1: Python Garbage Collection’s Three Mechanisms—The Collaboration of Reference Counting, Generational GC, and Cycle Detection

Mechanism	Handles	Trigger Timing	Performance Characteristics
Reference Counting	Most objects	When reference changes	Immediate, deterministic, low overhead
Generational GC	Circular references	Threshold triggered/manual call	Delayed, non-deterministic, periodic
Cycle Detection	Container objects (list, dict, custom classes)	When GC runs	Mark-sweep algorithm

Reference counting is the “main force”, handling over 90% of object lifecycles. But it cannot handle circular references, so Python needs generational GC as a supplement.

The relationship between the two is: Problems that reference counting can’t handle, then GC takes over.

Misconception 2: Calling gc.collect() Will Immediately Release Memory

Why People Think This

Inertia from other languages. Java has System.gc(), Go has runtime.GC(), C# has GC.Collect(). These languages’ GC is the main mechanism, and explicit calls usually trigger recycling.

Naming misdirection. collect implies “collecting garbage”—intuitively it should release memory. Documentation also says “Force garbage collection”.

Memory monitoring anxiety. Seeing high memory usage in production, the first reaction is “quickly call GC”.

Why This Understanding Is Wrong

gc.collect() only handles circular references. If your code has no circular references, or circular references have already been handled by previous GC cycles, gc.collect() does almost nothing.

Memory release has nothing to do with GC. As explained in Part 1, whether memory is returned to the operating system depends on pymalloc’s three-layer architecture (Arena → Pool → Block). Only when an entire Arena becomes empty will memory truly be released.

Generational GC’s delayed design. Python’s GC deliberately delays running to trade for throughput. The default threshold is (700, 10, 10), meaning GC is triggered when generation 0 objects exceed 700.

import gc
print(gc.get_threshold())  # (700, 10, 10)

This is not a bug, it’s a trade-off.

What Is the More Accurate Understanding

Generational GC’s working mechanism:

Generation 0 (New objects): Placed in generation 0 when created. GC triggered when object count exceeds threshold (default 700).
GC process: Mark surviving objects, clear dead objects (including circular references). Surviving objects promoted to generation 1.
Generation 1: Objects surviving multiple GC cycles. When generation 1 object count exceeds threshold (10), GC runs on both generation 0 and 1 together.
Generation 2 (Old objects): Objects surviving multiple GC cycles. When generation 2 object count exceeds threshold (10), GC runs on all three generations.

When manual calls make sense:

Circular reference-intensive scenarios (graph structures, doubly linked lists)
Long-running services needing to control GC pause time
Test environments needing deterministic behavior

When not to call:

Performance-sensitive code (GC pauses all threads)
Memory usage caused by large numbers of small objects (check code, not GC)
Memory not released to operating system (this is pooling strategy, not a GC problem)

Misconception 3: del Statement Immediately Deletes Objects

Why People Think This

Syntax intuitiveness. del looks like “delete”, and in English delete equals remove. C++ has a delete operator that immediately releases memory.

Interactive environment feedback. In Python REPL, after del x, x indeed no longer exists—accessing it raises NameError.

Documentation phrasing. Official documentation says “Delete the reference”, but most people read it as “Delete the object”.

Why This Understanding Is Wrong

del deletes references, not objects. Whether an object is deleted depends on whether its reference count reaches zero.

a = [1, 2, 3]
b = a
del a  # Deleted the name a, but the list object still exists (referenced by b)
print(b)  # [1, 2, 3] outputs normally

Circular reference scenario: Objects survive after del.

import gc

class Obj:
    def __init__(self, name):
        self.name = name
        self.ref = None
    
    def __del__(self):
        print(f"Deleting {self.name}")

x = Obj("x")
y = Obj("y")
x.ref = y
y.ref = x

del x
del y
# Neither object is deleted at this point! __del__ is not called

# Until gc.collect() triggers cycle detection
print("Before gc.collect()")
gc.collect()
print("After gc.collect()")

del is not a destructor. It’s a finalizer, called when the object is recycled, but there’s no guarantee when it will be called, or even if it will ever be called due to circular references.

What Is the More Accurate Understanding

del statement → deletes name (reference) → reference count decreases
                                        ↓
                                  reference count = 0?
                                        ↓
                            Yes → immediately recycle object (reference counting mechanism)
                                        ↓
                            No → object continues to survive
                                        ↓
                            Circular reference → wait for generational GC detection

Weak references (weakref) design intent: If you need to reference but don’t want to prevent garbage collection, use weak references.

import weakref

class Data:
    pass

data = Data()
weak_ref = weakref.ref(data)

print(weak_ref())  # <__main__.Data object at 0x...>
del data
print(weak_ref())  # None (object has been recycled)

Context managers and deterministic cleanup: If you need to ensure resources are released, don’t use __del__, use the with statement.

with open("file.txt") as f:
    data = f.read()
# When exiting the with block, f.close() is deterministically called

Practical Circular Reference Detection

After understanding Python GC’s theoretical mechanisms, let’s master circular reference detection and remediation techniques through real-world scenarios. These cases come from production environment issues, covering ORM frameworks, visualization diagnostic tools, and weak reference solutions.

Circular References in ORM Models

ORM frameworks are hotspots for circular references. Taking SQLAlchemy and Django ORM as examples, bidirectional relationships between models naturally create mutual references.

Circular References in SQLAlchemy

from sqlalchemy import Column, Integer, String, ForeignKey, create_engine
from sqlalchemy.orm import relationship, sessionmaker, declarative_base

Base = declarative_base()

class Department(Base):
    __tablename__ = 'departments'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    # Relationship definition: Department -> Employee
    employees = relationship("Employee", back_populates="department")

class Employee(Base):
    __tablename__ = 'employees'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    dept_id = Column(Integer, ForeignKey('departments.id'))
    # Back reference: Employee -> Department
    department = relationship("Department", back_populates="employees")

# How circular references form
def create_circular_refs(session):
    dept = Department(name="Engineering")
    emp = Employee(name="Alice")

    # Bidirectional association creates circular reference
    dept.employees.append(emp)  # dept references emp
    # emp.department automatically points to dept, forming a cycle

    session.add(dept)
    session.commit()

    # Even after session closes, references between objects persist
    return dept, emp

# Test circular reference
dept, emp = create_circular_refs(session)
del dept, emp  # Reference count not 0, objects not immediately released

gc.collect()  # Trigger cycle detection before objects can be collected

Problem Analysis: The back_populates bidirectional relationship makes Department and Employee instances reference each other. When queries return large numbers of such objects, memory usage accumulates until GC triggers.

Similar Problem in Django ORM

# Django models.py
from django.db import models

class Author(models.Model):
    name = models.CharField(max_length=100)
    # Reverse relationship: Django auto-creates book_set

class Book(models.Model):
    title = models.CharField(max_length=200)
    author = models.ForeignKey(Author, on_delete=models.CASCADE, related_name='books')

# Scenario where circular references occur
def fetch_with_prefetch():
    # select_related and prefetch_related load objects maintaining bidirectional references
    authors = Author.objects.prefetch_related('books').all()
    for author in authors:
        for book in author.books.all():
            # book.author and author.books form a circular reference chain
            process(book)
    # After authors list is deleted, circular references between internal objects persist

Using gc.get_referrers() and gc.get_referents() to Track Reference Chains

Python’s gc module provides two powerful introspection functions for manually tracing reference relationships.

import gc

class Node:
    def __init__(self, name):
        self.name = name
        self.ref = None

    def __repr__(self):
        return f"Node({self.name})"

# Create circular reference
a = Node("A")
b = Node("B")
c = Node("C")

a.ref = b
b.ref = c
c.ref = a  # Forms cycle: A -> B -> C -> A

def analyze_references(obj, depth=0, max_depth=5, visited=None):
    """Recursively analyze object reference relationships"""
    if visited is None:
        visited = set()

    if depth > max_depth or id(obj) in visited:
        return

    visited.add(id(obj))
    indent = "  " * depth
    print(f"{indent}Object: {obj} (id: {id(obj)})")

    # Get objects this object references (outbound references)
    referents = gc.get_referents(obj)
    print(f"{indent}  Referents ({len(referents)}) - Objects this object references:")
    for ref in referents:
        if isinstance(ref, (dict, list, tuple, Node)):
            print(f"{indent}    -> {type(ref).__name__}: {ref if isinstance(ref, Node) else '...'}")

    # Get objects that reference this object (inbound references)
    referrers = gc.get_referrers(obj)
    print(f"{indent}  Referrers ({len(referrers)}) - Objects referencing this object:")
    for ref in referrers:
        if ref is not visited:  # Avoid printing gc module itself
            ref_type = type(ref).__name__
            if isinstance(ref, Node):
                print(f"{indent}    <- {ref_type}: {ref}")
            elif isinstance(ref, dict):
                print(f"{indent}    <- {ref_type}: __dict__")
            elif isinstance(ref, list):
                print(f"{indent}    <- {ref_type}: [...]")
    print()

# Analyze node A's reference relationships
analyze_references(a)

Sample Output:

Object: Node(A) (id: 140312345678016)
  Referents (3) - Objects this object references:
    -> str: ...
    -> dict: ...
    -> Node: Node(B)
  Referrers (2) - Objects referencing this object:
    <- dict: __dict__
    <- Node: Node(C)

From the output, we can clearly see the reference chain: Node(C) references Node(A), and Node(A) references Node(B).

Visualizing Reference Graphs with objgraph

objgraph is a third-party library that generates visual diagrams of object reference relationships—a powerful tool for diagnosing circular references.

# Install: pip install objgraph
import objgraph
import gc

class User:
    def __init__(self, name):
        self.name = name
        self.friends = []

    def add_friend(self, user):
        self.friends.append(user)

# Create circular reference scenario
alice = User("Alice")
bob = User("Bob")
carol = User("Carol")

alice.add_friend(bob)
bob.add_friend(carol)
carol.add_friend(alice)  # Forms cycle

# Generate reference graph
objgraph.show_backrefs(
    [alice, bob, carol],
    filename='circular_refs.png',
    max_depth=3,
    too_many=10
)

# Find most common object types
print("Most common object types:")
objgraph.show_most_common_types(limit=10)

# Find references between specific object types
users = objgraph.by_type('User')
print(f"\nFound {len(users)} User objects")

# Detect circular references
if len(users) >= 2:
    objgraph.show_chain(
        objgraph.find_backref_chain(users[0], lambda obj: obj in users[1:]),
        filename='ref_chain.png'
    )

Generated Reference Graph Explanation:

The generated PNG image shows arrows pointing between objects; circular references form closed loops. In real projects, you can visually see which objects form unreleaseable cycles.

Using weakref to Fix Circular References

The standard solution for circular references is using weak references. Weak references don’t prevent garbage collection—when an object only has weak references, GC can collect it normally.

Fixing ORM Model Circular References

import weakref
from typing import Optional

class SafeDepartment:
    def __init__(self, name: str):
        self.name = name
        # Use weak reference set to store employees
        self._employees = weakref.WeakSet()

    @property
    def employees(self):
        """Return strong reference list, but keep weak references internally"""
        return list(self._employees)

    def add_employee(self, emp):
        self._employees.add(emp)
        # Employee uses weak reference to department
        emp._department_ref = weakref.ref(self)

class SafeEmployee:
    def __init__(self, name: str):
        self.name = name
        self._department_ref = lambda: None  # Default returns None

    @property
    def department(self) -> Optional['SafeDepartment']:
        """Access department through weak reference"""
        return self._department_ref()

    def __repr__(self):
        return f"SafeEmployee({self.name})"

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, SafeEmployee) and self.name == other.name

# After using weak references, circular reference is broken
dept = SafeDepartment("Engineering")
emp = SafeEmployee("Alice")
dept.add_employee(emp)

print(f"Employee department: {emp.department}")  # Works normally

# After deleting department, employee no longer holds strong reference
del dept
gc.collect()
print(f"Department after deletion: {emp.department}")  # None

Generic Weak Reference Pattern: Safe Observer Pattern Implementation

import weakref
from abc import ABC, abstractmethod

class Observer(ABC):
    @abstractmethod
    def notify(self, event):
        pass

class EventSource:
    """Event source uses weak references to store observers, avoiding circular references"""

    def __init__(self):
        # Use WeakKeyDictionary, observers automatically removed when deleted
        self._observers = weakref.WeakKeyDictionary()

    def subscribe(self, observer: Observer, priority=0):
        """Subscribe to events using weak references"""
        if not isinstance(observer, Observer):
            raise TypeError("Observer must implement Observer interface")
        self._observers[observer] = priority

    def unsubscribe(self, observer: Observer):
        """Unsubscribe"""
        self._observers.pop(observer, None)

    def emit(self, event):
        """Emit event, notify all observers"""
        # Sort by priority
        sorted_observers = sorted(
            self._observers.items(),
            key=lambda x: x[1],
            reverse=True
        )
        for observer, _ in sorted_observers:
            observer.notify(event)

    def get_subscriber_count(self):
        return len(self._observers)

class ConcreteObserver(Observer):
    """Concrete observer"""

    def __init__(self, name: str):
        self.name = name
        self.events_received = []

    def notify(self, event):
        self.events_received.append(event)
        print(f"[{self.name}] Received: {event}")

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, ConcreteObserver) and self.name == other.name

# Demonstrate circular reference problem solved
source = EventSource()
observers = [ConcreteObserver(f"Observer_{i}") for i in range(3)]

# Subscribe to events
for obs in observers:
    source.subscribe(obs)

print(f"Subscriber count: {source.get_subscriber_count()}")  # 3

# After deleting observer, EventSource automatically cleans up weak reference
del observers[0]
gc.collect()

print(f"Subscriber count after deletion: {source.get_subscriber_count()}")  # 2

# Event emission works normally
source.emit("Test Event")  # Only remaining 2 observers receive

Sample Output:

Subscriber count: 3
Subscriber count after deletion: 2
[Observer_1] Received: Test Event
[Observer_2] Received: Test Event

Key Points Summary:

Scenario	Solution	Notes
ORM bidirectional relationships	Use WeakSet/WeakKeyDictionary	Requires custom relationship management logic
Observer pattern	Weak references to store observers	Observers may be collected anytime, check for None
Cache systems	WeakValueDictionary	Cached object lifecycle controlled externally
Parent references	weakref.ref(parent)	Check validity when accessing using ref()

The core of circular reference detection and remediation is: Identify reference relationships -> Visualize confirmation -> Replace unnecessary strong references with weak references. In production environments, regular memory analysis with objgraph is recommended, especially in long-running services.

del Method Pitfalls and Best Practices

__del__ is one of Python’s most misunderstood mechanisms. Many developers coming from C++ expect it to work like a destructor, but it has completely different semantics and limitations.

5 Scenarios Where del Doesn’t Execute

The following scenarios can cause __del__ methods to never be called:

import gc
import sys
import os

class ResourceHandler:
    """Demonstrates scenarios where __del__ doesn't execute"""

    def __init__(self, name):
        self.name = name
        print(f"[{self.name}] Resource allocated")

    def __del__(self):
        print(f"[{self.name}] __del__ called - resource cleaned up")

# Scenario 1: Circular references
print("=== Scenario 1: Circular References ===")
a = ResourceHandler("A")
b = ResourceHandler("B")
a.ref = b
b.ref = a
del a
del b
# __del__ not called because circular references keep ref count > 0
print(f"Circular reference garbage count: {len(gc.garbage)}")

# Scenario 2: Interpreter abnormal exit
print("\n=== Scenario 2: Interpreter Abnormal Exit ===")
handler = ResourceHandler("C")
os._exit(1)  # Force exit, __del__ won't execute

# Scenario 3: Process killed (SIGKILL)
# kill -9 <pid> terminates process immediately, no cleanup executed

# Scenario 4: __del__ itself raises exception
class BadResource:
    def __del__(self):
        raise Exception("Cleanup failed")  # Exception is ignored, resource not cleaned

# Scenario 5: Module unload while object still referenced
import atexit

class GlobalResource:
    def __del__(self):
        print("GlobalResource.__del__ called")

global_res = GlobalResource()

@atexit.register
def cleanup():
    # If global_res is still referenced by other modules, __del__ won't execute
    print(f"At program exit global_res still alive: {global_res}")

Summary of 5 Scenarios Where Execution Is Not Guaranteed:

Scenario	Trigger Condition	Consequence
Circular references	Objects reference each other	`__del__` never executes
Abnormal exit	`os._exit()` / `SIGKILL`	All cleanup skipped
`__del__` exception	Cleanup code raises exception	Exception ignored, may leak resources
Module-level references	Still referenced at module unload	Object survives in weird state
Interpreter termination	Python process ends	No `__del__` guaranteed to be called

Comparison with weakref.finalize

weakref.finalize provides more reliable cleanup than __del__:

import weakref

class SafeResource:
    """Safe resource management using weakref.finalize"""

    def __init__(self, name):
        self.name = name
        self._file = open(f"/tmp/{name}.txt", 'w')

        # Register finalizer, executes when object is garbage collected
        self._finalizer = weakref.finalize(
            self,                    # Monitored object
            self._cleanup,           # Cleanup function
            self._file,              # Arguments passed to cleanup function
            self.name                # Additional arguments
        )

    @staticmethod
    def _cleanup(file_obj, name):
        """Static method: doesn't hold self reference, avoids circular references"""
        print(f"[{name}] Executing cleanup...")
        if not file_obj.closed:
            file_obj.close()
            print(f"[{name}] File closed")

    def close(self):
        """Explicit close (optional)"""
        self._finalizer()  # Execute cleanup immediately

    @property
    def closed(self):
        return self._finalizer.alive

# Comparison demonstration
print("=== weakref.finalize Comparison ===")

# Traditional __del__ approach (not recommended)
class OldStyle:
    def __del__(self):
        print("OldStyle.__del__")

# New approach (recommended)
class NewStyle:
    def __init__(self):
        self._finalizer = weakref.finalize(self, lambda: print("NewStyle cleaned up"))

old = OldStyle()
new = NewStyle()

# Even with circular references
circular_old = OldStyle()
circular_new = NewStyle()
circular_old.ref = circular_new
circular_new.ref = circular_old

del old, new, circular_old, circular_new
gc.collect()

# weakref.finalize still executes, while __del__ doesn't

Key Differences:

Feature	`__del__`	`weakref.finalize`
Circular references	Doesn’t execute	Executes normally
Exception handling	Exception ignored	Can check via return value
Execution timing	Uncertain	When object is collected
Cancel ability	None	Can `detach()`
Holds reference	May create new references	Doesn’t hold object references

Deterministic Resource Cleanup: Context Managers

For resources requiring deterministic cleanup, context managers are the only reliable choice:

from contextlib import contextmanager
from typing import Generator
import tempfile
import shutil
import os

class ManagedResource:
    """Production-grade resource management class"""

    def __init__(self, name: str, temp_dir: str = None):
        self.name = name
        self._temp_dir = temp_dir or tempfile.mkdtemp()
        self._files = []
        self._open = True

    def __enter__(self):
        """with statement entry"""
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        """with statement exit - guaranteed execution"""
        self.close()
        # Return False to propagate exceptions
        return False

    def close(self):
        """Explicit resource close"""
        if self._open:
            print(f"[{self.name}] Cleaning up...")
            for f in self._files:
                if os.path.exists(f):
                    os.remove(f)
            if os.path.exists(self._temp_dir):
                shutil.rmtree(self._temp_dir)
            self._open = False
            print(f"[{self.name}] Resource released")

    def create_file(self, filename: str, content: str):
        """Create file in managed directory"""
        if not self._open:
            raise RuntimeError("Resource already closed")
        filepath = os.path.join(self._temp_dir, filename)
        with open(filepath, 'w') as f:
            f.write(content)
        self._files.append(filepath)
        return filepath

    @property
    def is_open(self) -> bool:
        return self._open

# Context manager factory function (cleaner API)
@contextmanager
def managed_temp_dir(prefix: str = "tmp") -> Generator[str, None, None]:
    """
    Context manager for managing temporary directories

    Usage example:
        with managed_temp_dir("myapp") as tmpdir:
            # Use tmpdir here
            process_files(tmpdir)
        # Auto-cleanup on exit
    """
    tmpdir = tempfile.mkdtemp(prefix=prefix)
    try:
        yield tmpdir
    finally:
        # Guaranteed execution, even if exception occurs
        shutil.rmtree(tmpdir, ignore_errors=True)

# Usage examples
print("=== Context Manager Usage ===")

# Method 1: Class approach
with ManagedResource("MyResource") as res:
    res.create_file("data.txt", "hello world")
    # ... business logic ...
# Automatically calls close() when exiting with block

# Method 2: Decorator approach
with managed_temp_dir("myapp") as tmp:
    print(f"Using temporary directory: {tmp}")
    # ... business logic ...
# Auto-cleanup

# Comparison: Wrong way (depends on __del__)
class BadResource:
    def __init__(self):
        self.file = open("/tmp/bad.txt", "w")

    def __del__(self):
        self.file.close()  # Not guaranteed to execute!

# bad = BadResource()  # May leak file handles

Strategy for Disabling del in Production

In strict codebases, __del__ should be actively avoided:

# 1. Code review rules (.pylintrc / setup.cfg)
"""
[MESSAGES CONTROL]
disable=unnecessary-dunder-call
disable=invalid-name

# Ban __del__ methods
[REFACTORING]
disable=no-self-use
"""

# 2. Custom decorator enforcing context manager usage
import functools
import warnings

def deprecated_del(cls):
    """Mark class as not using __del__"""
    original_del = getattr(cls, '__del__', None)

    def new_del(self):
        warnings.warn(
            f"{cls.__name__} uses __del__ which may cause resource leaks, "
            "please use context manager (with statement) instead",
            ResourceWarning,
            stacklevel=2
        )
        if original_del:
            original_del(self)

    cls.__del__ = new_del
    return cls

# 3. Static analysis tool check
"""
# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: check-no-del
        name: Check for __del__ usage
        entry: grep -r "__del__" --include="*.py"
        language: system
        pass_filenames: false
"""

# 4. Runtime detection (development environment)
import atexit
import gc

def check_leaked_resources():
    """Check unreleased resources at program exit"""
    gc.collect()
    if gc.garbage:
        print(f"Warning: {len(gc.garbage)} uncollectable objects detected")
        for obj in gc.garbage[:5]:  # Show first 5
            print(f"  - {type(obj).__name__}: {repr(obj)[:100]}")

if __debug__:
    atexit.register(check_leaked_resources)

Best Practices Summary:

Never rely on __del__ for critical resource cleanup - It may not execute at any time
Always use context managers - with statements provide deterministic cleanup
Use weakref.finalize for non-critical cleanup - Such as logging, statistics collection
Explicit over implicit - Provide .close() method for users to call actively
Code review bans __del__ - Include in team coding standards

Memory Profiling Toolchain in Practice

After mastering GC mechanisms, we need tools to diagnose real problems. Here is a production-validated memory profiling toolchain.

memory_profiler for Line-Level Analysis

memory_profiler provides line-by-line memory usage reports, the most precise tool for locating memory hotspots.

# Install: pip install memory_profiler
from memory_profiler import profile
import numpy as np

@profile
def process_large_dataset():
    """Line-by-line memory analysis"""
    # Line 1: Allocate large dataset
    data = np.random.randn(1000, 1000)  # ~8MB

    # Line 2: Create copy
    processed = data * 2  # Another ~8MB

    # Line 3: Type conversion
    result = processed.astype(np.float32)  # May trigger temporary allocation

    # Line 4: Release intermediate results
    del data, processed

    return result

# Run analysis
# python -m memory_profiler script.py
# Output:
# Line #    Mem usage    Increment   Line Contents
# ================================================
#      6     38.5 MiB     38.5 MiB   @profile
#      7                             def process_large_dataset():
#      8     46.8 MiB      8.3 MiB       data = np.random.randn(1000, 1000)
#      9     54.8 MiB      8.0 MiB       processed = data * 2
#     10     54.9 MiB      0.1 MiB       result = processed.astype(np.float32)
#     11     46.8 MiB     -8.1 MiB       del data, processed
#     12     46.8 MiB      0.0 MiB       return result

Key Metric Interpretation:

Mem usage: Total memory after line execution
Increment: Memory change from that line (positive = allocation, negative = release)
Focus on large increments and unreleased accumulation

Advanced Usage: Time-decay Sampling

from memory_profiler import memory_usage
import time

def monitor_memory_over_time(func, interval=0.1):
    """Monitor memory changes during function execution"""
    mem_usage = memory_usage(
        (func, (), {}),  # (func, args, kwargs)
        interval=interval,  # Sampling interval
        timeout=None,
        max_usage=True,  # Return peak
        retval=True      # Return function result
    )
    return mem_usage

# Usage
peak_mem, result = monitor_memory_over_time(process_large_dataset)
print(f"Peak memory: {peak_mem:.1f} MiB")

filprofiler Flame Graph Interpretation

filprofiler focuses on answering “who allocated the memory”, generating flame graphs that intuitively show allocation stacks.

# Install: pip install filprofiler
# Run: fil-profile run script.py

# Code example
def load_and_process():
    """Simulate data processing workflow"""
    # 1. Data loading
    raw_data = load_from_database()  # Many small objects

    # 2. Transformation
    transformed = [transform(item) for item in raw_data]

    # 3. Aggregation
    result = aggregate(transformed)

    return result

def transform(item):
    """Transformation logic for each element"""
    return {
        'id': item['id'],
        'features': extract_features(item['raw'])  # May allocate large memory
    }

# fil-profile generates flame graphs like:
# 100%  |----------------------------------------|
#       |         load_and_process               |
# 80%   |--------------|-------------------------|
#       |load_from_db  |      transform           |
# 60%   |--------------|        |-----------------|
#       |   raw_data    |        |extract_feat    |
# 40%   |--------------|        |-----------------|
#       |  (list, ...)  |        |  (numpy, ...)   |
#
# Wider = more memory allocated on that code path
# Bottom to top = call stack

Flame Graph Analysis Points:

Top layer width = Total allocation for that code path
Sudden widening = Large allocations occurring here
Watch Python built-ins = list.append, dict.update, etc.

Scalene for Comprehensive Performance Analysis

scalene is the most advanced Python profiler, providing both CPU and memory analysis.

# Install: pip install scalene
# Run: scalene script.py

# Code example
def memory_intensive_task():
    """Demonstrate various memory allocation patterns"""
    # Native Python allocation (CPU + memory)
    data = []
    for i in range(100000):
        data.append(str(i))  # Many small strings

    # NumPy allocation (CPU + memory)
    import numpy as np
    matrix = np.random.randn(5000, 5000)

    # Mixed computation
    result = sum(len(s) for s in data) + matrix.sum()

    return result

# Scalene output format:
# | Time (s) | Memory (MB) | Line | Code                        |
# |----------|-------------|------|-----------------------------|
# |   0.523  |      45.2   |   6  | data = []                   |
# |   2.145  |     156.3   |   8  | data.append(str(i))         |
# |   0.089  |     190.7   |  12  | matrix = np.random.randn... |
# |   0.234  |       0.0   |  15  | result = sum(len(s) for...  |
#
# Interpretation:
# - Line 8 consumes 2.145s and 156MB, main bottleneck
# - Line 12 allocates 190MB (matrix) but only 0.089s (C code)

Scalene’s Unique Value:

Distinguish Python vs Native code time consumption
Distinguish CPU vs GPU memory
Line-level precise analysis
Low overhead (can analyze production code)

Integrating Memory Regression Testing into CI/CD

Finally, automate memory detection:

# test_memory_regression.py
import tracemalloc
import unittest
from functools import wraps

def memory_limit(max_mb: float):
    """Decorator: limit test case memory usage"""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            tracemalloc.start()
            try:
                result = func(*args, **kwargs)
                current, peak = tracemalloc.get_traced_memory()
                peak_mb = peak / 1024 / 1024

                if peak_mb > max_mb:
                    raise AssertionError(
                        f"Memory exceeded: {peak_mb:.1f}MB > limit: {max_mb}MB"
                    )
                return result
            finally:
                tracemalloc.stop()
        return wrapper
    return decorator

class MemoryRegressionTest(unittest.TestCase):
    """Memory regression test suite"""

    @memory_limit(100)  # Limit 100MB
    def test_data_processing(self):
        """Ensure data processing stays within memory budget"""
        large_list = [i for i in range(1_000_000)]
        result = sum(large_list)
        self.assertEqual(result, sum(range(1_000_000)))

    @memory_limit(50)
    def test_model_inference(self):
        """Model inference memory test"""
        # Simulate inference
        import numpy as np
        inputs = np.random.randn(32, 512)  # batch=32, dim=512
        # ... inference code ...
        output = inputs @ np.random.randn(512, 1000)
        self.assertEqual(output.shape, (32, 1000))

# CI/CD integration (.github/workflows/memory.yml)
"""
name: Memory Regression Test
on: [push, pull_request]

jobs:
  memory-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run memory tests
        run: |
          python -m pytest test_memory_regression.py -v
      - name: Generate memory report
        run: |
          python -m memory_profiler profile_script.py > memory_report.txt
      - name: Upload report
        uses: actions/upload-artifact@v3
        with:
          name: memory-report
          path: memory_report.txt
"""

Tool Selection Decision Tree:

Need memory analysis?
├── Locate specific line -> memory_profiler
├── View allocation call stack -> filprofiler
├── Combined CPU + memory -> scalene
└── CI/CD automation -> pytest + tracemalloc

If You Need to Continue Distinguishing, Which Dimensions Should You Really Look At

Three dimensions to help you judge memory problems:

Dimension	Checkpoints	Tools
Object Type	Mutable vs Immutable	`type()`, `isinstance()`
Lifecycle	Short-cycle vs Long-cycle	`tracemalloc`
Reference Pattern	Tree vs Graph structure	`gc.get_referrers()`

Object Type: Mutable objects (list, dict) are more prone to circular references. Immutable objects (tuple, int, str) usually don’t.

Lifecycle: Short-cycle objects (temporary variables in functions) are handled by reference counting, never reaching GC. Long-cycle objects (global caches, long connections) need GC attention.

Reference Pattern: Tree structures (DOM, AST) are handled by reference counting. Graph structures (social networks, dependency relationships) are prone to circular references.

A More Reliable Judgment Order

When encountering memory problems, troubleshoot in this order:

Step 1: Check Circular References

import gc

# Key: Must set DEBUG_SAVEALL, otherwise circular reference objects won't enter gc.garbage
gc.set_debug(gc.DEBUG_SAVEALL)

gc.collect()

# View recycled but still surviving objects (Python 3.4+ won't put in gc.garbage by default)
print(gc.garbage)  # Only has content after DEBUG_SAVEALL is set

Important Notes:

Python 3.4+ for performance reasons won’t put recycled objects in gc.garbage by default
Must explicitly set gc.DEBUG_SAVEALL to collect these objects
Production environments not recommended to keep DEBUG_SAVEALL enabled long-term, brings performance overhead

Step 2: Check Reference Count

import sys

# Note: getrefcount itself creates a temporary reference
print(sys.getrefcount(obj) - 1)

Step 3: Consider Pooling Strategy

If the first two steps are fine, high memory usage is likely pymalloc’s pooling strategy. Refer to Part 1’s tracemalloc method for diagnosis.

GC Optimization Practices in Large Model Scenarios

Typical Memory Traps in Hugging Face Transformers

In large model training and inference, GC problems are often magnified to the GB level. Here are three real-world cases encountered in production environments.

Case 1: Circular References Between Model Weights and Optimizer States

import torch
from transformers import AutoModel

# Problem code
def create_training_setup():
    model = AutoModel.from_pretrained("gpt2")
    optimizer = torch.optim.AdamW(model.parameters())
    
    # Implicit circular reference!
    # optimizer references model.parameters()
    # Some callbacks may reference both
    return model, optimizer

# After training ends
del model, optimizer
gc.collect()  # Memory doesn't move!

Root Cause Analysis:

PyTorch’s Optimizer holds references to model.parameters()
Some learning rate schedulers or callback functions hold references to both
Forms triangular circular reference
GC can recycle, but causes out-of-memory before recycling

Solution:

def cleanup_training(model, optimizer):
    """Safe training resource cleanup"""
    # Step 1: Clear optimizer state
    optimizer.state.clear()
    
    # Step 2: Disconnect parameter references
    optimizer.param_groups.clear()
    
    # Step 3: Force GC
    gc.collect()
    
    # Step 4: If CUDA, clear cache
    if torch.cuda.is_available():
        torch.cuda.empty_cache()

Case 2: DataLoader Multi-Process Memory Leak

from torch.utils.data import DataLoader

# Problem code: memory leak when num_workers > 0
train_loader = DataLoader(
    dataset,
    batch_size=32,
    num_workers=4,  # Multi-process data loading
    persistent_workers=True  # Persistent processes
)

# After each epoch
for batch in train_loader:
    # Training logic
    pass

# Epoch ends, but subprocess memory not released!

Root Cause Analysis:

Multi-process DataLoader uses fork() to create subprocesses
Subprocesses inherit parent process memory space (copy-on-write)
persistent_workers=True keeps processes between epochs
Some shared memory regions cannot be properly released

Solution:

class SafeDataLoader:
    """DataLoader wrapper with automatic cleanup"""
    
    def __init__(self, *args, **kwargs):
        self.loader_args = args
        self.loader_kwargs = kwargs
        self._loader = None
    
    def __enter__(self):
        self._loader = DataLoader(*self.loader_args, **self.loader_kwargs)
        return self._loader
    
    def __exit__(self, *args):
        # Explicitly close DataLoader
        if hasattr(self._loader, '_iterator'):
            self._loader._iterator._shutdown_workers()
        self._loader = None
        gc.collect()

# Usage
with SafeDataLoader(dataset, batch_size=32, num_workers=4) as loader:
    for batch in loader:
        # Training
        pass
# Automatic cleanup

GC Threshold Tuning: From Default (700,10,10) to Production Practice

Understanding Default Thresholds:

import gc

# Default thresholds
print(gc.get_threshold())  # (700, 10, 10)

# Meaning:
# - GC triggered when generation 0 objects exceed 700
# - Generation 0 triggers generation 1 GC after 10 GC cycles
# - Generation 1 triggers generation 2 GC after 10 GC cycles

Tuning Strategies for Large Model Training Scenarios:

Strategy 1: Long Training Tasks—Reduce GC Frequency

# Scenario: Training large models that need to run for days
# Strategy: Reduce GC frequency, decrease STW (Stop-The-World) pauses

# Tuned thresholds
gc.set_threshold(2000, 50, 50)

# Monitor effects
import time

def profile_gc():
    gc.set_debug(gc.DEBUG_STATS)
    
    # Run for a while
    time.sleep(3600)  # 1 hour
    
    # View GC statistics
    # Example output:
    # gc: done, 1234567 unreachable, 2345678 collected, 0.234s elapsed

Results:

GC trigger frequency reduced by ~60%
Single GC time increased by ~40%
Overall GC overhead reduced by ~30%
Training throughput improved by ~2-3%

Strategy 2: Inference Services—Freeze Early Objects

# Scenario: Long-running LLM inference service
# Strategy: Freeze known clean early objects

import gc

# After service startup, model loading complete
model = load_model()

# Freeze generation 0 and 1, these objects are known to be long-lifecycle
gc.freeze()

# After this, only newly created objects are scanned by GC
# Reduces number of objects GC needs to traverse

# Periodically (e.g., hourly) unfreeze and refreeze
def refresh_freeze():
    gc.unfreeze()
    gc.collect()
    gc.freeze()

Strategy 3: Critical Paths—Temporarily Disable GC

# Scenario: Critical inference paths need deterministic latency
# Strategy: Temporarily disable GC in critical paths

import gc

class CriticalPath:
    def __enter__(self):
        # Save current state
        self.gc_was_enabled = gc.isenabled()
        # Disable GC
        gc.disable()
        return self
    
    def __exit__(self, *args):
        # Restore GC
        if self.gc_was_enabled:
            gc.enable()
        # Manually trigger one full GC
        gc.collect()

# Usage
with CriticalPath():
    # Critical inference logic
    result = model.generate(prompt)
# Automatically restores GC and cleans up after exit

Risk Warnings:

Disabling GC for long periods may cause memory exhaustion
Recommend keeping critical path time under 1 second
Only use in performance-critical scenarios

Weak References in Model Caching

import weakref
from functools import lru_cache

class ModelCache:
    """Model cache implemented with weak references"""
    
    def __init__(self):
        # Use weak references, don't prevent models from being GC'd
        self._cache = weakref.WeakValueDictionary()
        self._access_count = {}
    
    def get(self, model_name: str):
        model = self._cache.get(model_name)
        if model is not None:
            self._access_count[model_name] = self._access_count.get(model_name, 0) + 1
        return model
    
    def put(self, model_name: str, model):
        # Key boundary conditions:
        # 1. Model must support weak references (have __weakref__ attribute)
        # 2. Basic types (int, str, tuple, etc.) don't support weak references
        # 3. None cannot be stored in WeakValueDictionary
        self._cache[model_name] = model
        self._access_count[model_name] = 0
    
    def get_stats(self):
        """Return cache statistics"""
        return {
            'cached_models': list(self._cache.keys()),
            'access_counts': self._access_count.copy()
        }

# Usage
cache = ModelCache()

# Load model and cache
cache.put("gpt2", load_model("gpt2"))

# Get model
model = cache.get("gpt2")  # Returns model object

# When memory is insufficient, GC can recycle cached models
# Because there are no strong references, only weak references

# ⚠️ Boundary condition examples:
# 1. Basic types don't support weak references
cache.put("answer", 42)  # ❌ TypeError: cannot create weak reference to 'int' object

# 2. If need to cache basic types, need wrapping
class CachedValue:
    def __init__(self, value):
        self.value = value

cache.put("answer", CachedValue(42))  # ✅

Key Boundary Conditions:

Limitation	Explanation	Solution
`__weakref__` required	Objects must support weak references	Most custom classes support by default
Basic types not supported	int/str/tuple etc. not supported	Use wrapper class or WeakKeyDictionary
None cannot be stored	WeakValueDictionary rejects None	Use placeholder object or handle separately
Lifecycle uncertainty	May be GC’d at any time	Always check if get() return is None

Generational GC Threshold Tuning

The mathematical meaning of the default threshold (700, 10, 10) needs deep understanding for effective tuning. These three numbers form Python GC’s decision matrix.

Detailed Mathematical Meaning of Threshold (700, 10, 10)

import gc

# Mathematical meaning of default thresholds
threshold_0, threshold_1, threshold_2 = gc.get_threshold()
print(f"Thresholds: ({threshold_0}, {threshold_1}, {threshold_2})")

# Trigger logic:
# - Generation 0 allocated objects > 700: trigger gen0 GC
# - gen0 GC count > 10: trigger gen0+gen1 GC
# - gen1 GC count > 10: trigger gen0+gen1+gen2 GC

The mathematical model can be understood as:

Threshold	Trigger Condition	Practical Meaning	Affected Objects
700	Generation 0 allocation count	Short-cycle object accumulation rate	Newly created temporary objects
10	Generation 0 GC count	Object promotion rate to generation 1	Medium lifecycle objects
10	Generation 1 GC count	Object promotion rate to generation 2	Long lifecycle objects

Key Insight: The threshold product (700 × 10 × 10 = 70,000) approximately represents how many allocations an object needs to experience to be promoted from generation 0 to generation 2.

Optimal Threshold Configuration for Different Scenarios

Web Service Scenario: Pursuing Low Latency

# Web service's core trade-off: response latency vs memory usage
def configure_web_service_gc():
    """
    Goal: Reduce GC pauses' impact on request response time
    Strategy: More frequent but faster GC, avoiding single long pauses
    """
    # Reduce single GC processing volume, increase frequency
    gc.set_threshold(300, 5, 5)

    # Why this configuration:
    # - 300: Quickly recycle short-term objects, prevent accumulation
    # - 5: Accelerate object promotion, reduce repeated scanning
    # - 5: Trigger full GC faster, avoid memory continuous growth

    # Effect: GC pause from ~50ms reduced to ~20ms, but frequency increases
    # Suitable for: API gateways, microservices, short-connection scenarios

Data Processing Scenario: Pursuing Throughput

def configure_data_processing_gc():
    """
    Goal: Maximize data processing throughput
    Strategy: Reduce GC frequency, tolerate higher memory usage
    """
    # Data stream processing tasks, object lifecycle is clear
    gc.set_threshold(2000, 50, 50)

    # Why this configuration:
    # - 2000: Allow more objects to accumulate in generation 0, reduce GC frequency
    # - 50: Lower promotion frequency, let objects survive longer in generation 0
    # - 50: Significantly reduce full GC frequency

    # Effect: Throughput improves 15-25%, memory usage increases 30%
    # Suitable for: ETL tasks, batch processing, data transformation

Long Training Task Scenario: Balanced Strategy

def configure_training_gc():
    """
    Goal: Maintain stable performance during training
    Strategy: Dynamic adjustment, switch strategies based on training stage
    """
    import gc

    class GCManager:
        def __init__(self):
            self.default_threshold = (700, 10, 10)
            self.data_loading_threshold = (1500, 30, 30)
            self.training_threshold = (3000, 100, 50)

        def set_data_loading_mode(self):
            """Data loading stage: High-frequency GC prevents memory explosion"""
            gc.set_threshold(*self.data_loading_threshold)
            print("GC mode: Data loading - Medium frequency")

        def set_training_mode(self):
            """Training stage: Low-frequency GC maximizes GPU utilization"""
            gc.set_threshold(*self.training_threshold)
            gc.freeze()  # Freeze known clean objects
            print("GC mode: Training - Low frequency, early objects frozen")

        def set_checkpoint_mode(self):
            """Checkpoint stage: Force full GC"""
            gc.set_threshold(*self.default_threshold)
            gc.unfreeze()
            gc.collect()
            print("GC mode: Checkpoint - Force full collection")

    return GCManager()

# Usage example
gc_manager = configure_training_gc()

# Data loading stage
gc_manager.set_data_loading_mode()
# ... load data ...

# Training stage
gc_manager.set_training_mode()
# ... training loop ...

# Save checkpoint
gc_manager.set_checkpoint_mode()
# ... save model ...

A/B Testing Methodology for Threshold Tuning

Scientific tuning requires quantitative metrics. Here is a complete A/B testing framework:

import gc
import time
import statistics
from dataclasses import dataclass
from typing import List, Callable

@dataclass
class GCMetrics:
    """GC performance metrics"""
    threshold_config: tuple
    total_time: float
    gc_pause_times: List[float]
    peak_memory_mb: float
    objects_collected: int

    @property
    def avg_gc_pause(self) -> float:
        return statistics.mean(self.gc_pause_times) if self.gc_pause_times else 0

    @property
    def max_gc_pause(self) -> float:
        return max(self.gc_pause_times) if self.gc_pause_times else 0

class GCTuningABTest:
    """A/B testing framework for GC threshold tuning"""

    def __init__(self):
        self.results: List[GCMetrics] = []

    def measure_config(self, threshold: tuple, workload: Callable, runs: int = 3) -> GCMetrics:
        """
        Measure GC performance under specific threshold configuration

        Args:
            threshold: (gen0_thresh, gen1_thresh, gen2_thresh)
            workload: Function simulating workload
            runs: Number of runs to average
        """
        gc_pause_times = []
        total_collected = 0

        # Set GC debug to capture pause times
        gc.set_debug(gc.DEBUG_STATS)

        def gc_callback(phase, info):
            """GC event callback, record pause times"""
            if phase == "stop":
                gc_pause_times.append(info.get('elapsed', 0))

        # Install callback (Python 3.7+)
        if hasattr(gc, 'callbacks'):
            gc.callbacks.append(gc_callback)

        run_times = []
        for _ in range(runs):
            # Clean state
            gc.collect()
            gc.set_threshold(*threshold)

            # Measure
            start = time.perf_counter()
            collected_before = gc.get_stats()[0]['collected']

            workload()

            elapsed = time.perf_counter() - start
            collected_after = gc.get_stats()[0]['collected']

            run_times.append(elapsed)
            total_collected += (collected_after - collected_before)

        # Clean callback
        if hasattr(gc, 'callbacks'):
            gc.callbacks.remove(gc_callback)

        return GCMetrics(
            threshold_config=threshold,
            total_time=statistics.mean(run_times),
            gc_pause_times=gc_pause_times,
            peak_memory_mb=self._get_peak_memory(),
            objects_collected=total_collected // runs
        )

    def _get_peak_memory(self) -> float:
        """Get peak memory usage (MB)"""
        import tracemalloc
        if tracemalloc.is_tracing():
            current, peak = tracemalloc.get_traced_memory()
            return peak / 1024 / 1024
        return 0.0

    def compare_configs(self, configs: List[tuple], workload: Callable) -> None:
        """
        Compare multiple threshold configurations

        Example:
            configs = [
                (700, 10, 10),    # Default
                (300, 5, 5),      # Web services
                (2000, 50, 50),   # Data processing
                (3000, 100, 100), # Long tasks
            ]
        """
        print("=" * 80)
        print("GC Threshold A/B Test Results")
        print("=" * 80)
        print(f"{'Config':<20} {'Total(s)':<10} {'AvgGC(ms)':<10} {'MaxGC(ms)':<10} {'Peak(MB)':<12}")
        print("-" * 80)

        for config in configs:
            metrics = self.measure_config(config, workload)
            self.results.append(metrics)

            config_str = f"{config}"
            print(f"{config_str:<20} {metrics.total_time:<10.3f} "
                  f"{metrics.avg_gc_pause*1000:<10.2f} "
                  f"{metrics.max_gc_pause*1000:<10.2f} "
                  f"{metrics.peak_memory_mb:<12.1f}")

        print("=" * 80)
        self._print_recommendation()

    def _print_recommendation(self) -> None:
        """Provide recommendations based on results"""
        if not self.results:
            return

        # Sort by average GC pause
        by_pause = sorted(self.results, key=lambda x: x.avg_gc_pause)
        best_latency = by_pause[0]

        # Sort by total time
        by_throughput = sorted(self.results, key=lambda x: x.total_time)
        best_throughput = by_throughput[0]

        print("\nRecommended configurations:")
        print(f"  Lowest latency: {best_latency.threshold_config} "
              f"(avg GC pause: {best_latency.avg_gc_pause*1000:.2f}ms)")
        print(f"  Best throughput: {best_throughput.threshold_config} "
              f"(total time: {best_throughput.total_time:.3f}s)")

# Usage example
def sample_workload():
    """Example workload: Create and destroy many objects"""
    data = []
    for i in range(10000):
        obj = {'index': i, 'data': [0] * 100}
        data.append(obj)
        if len(data) > 1000:
            data = data[500:]  # Keep half, simulate partial survival

# Run A/B test
test = GCTuningABTest()
test.compare_configs(
    configs=[(700, 10, 10), (300, 5, 5), (2000, 50, 50)],
    workload=sample_workload
)

A/B Testing Best Practices:

Control variables: Only change one threshold parameter at a time
Multiple runs: At least 3-5 runs to average, eliminate noise
Real workload: Use actual data patterns from production
Monitor tail latency: Watch P99 GC pauses, not just averages
Memory pressure: Test under near-memory-limit conditions

GC Tuning Decision Tree

Problem: GC causing performance issues?
├── Yes -> Scenario judgment
│   ├── Training task -> Increase threshold, reduce frequency
│   ├── Inference service -> Freeze early objects
│   └── Critical path -> Temporarily disable GC
└── No -> Check memory leak
    ├── Circular reference -> Use weakref or manually disconnect
    └── Object leak -> Track with tracemalloc

Conclusion: Finally Making Sense of It

“Python uses reference counting for garbage collection”—this sentence is not wrong, but it’s incomplete.

The complete understanding is: Python uses reference counting to handle most object lifecycles, uses generational GC to handle circular references, and uses memory pool strategy to manage memory release.

Three misconceptions correspond to three scenarios:

Circular references → Reference counting fails, needs generational GC
Memory not released → GC only handles object recycling, doesn’t control memory return
del not working → Deletes references, object survival depends on reference count

Next time you encounter a memory problem, ask yourself first: Can reference counting handle this? If yes, check references; if not, check circular references; if neither, consider pooling strategy.

In the next article, we’ll dive deep into the GIL and concurrency—seeing why 72 processes vs 1 process is Meta AI’s real dilemma, and how PEP 703 changes everything.

References and Acknowledgments

Python gc Module Documentation — Python.org
Python tracemalloc Module Documentation — Python.org
Memory Management in Python — Real Python
“Garbage Collection in Python” — Various sources

Series context

You are reading: Python Memory Model Deep Dive

This is article 2 of 7. Reading progress is stored only in this browser so the full series page can resume from the right entry.

View full series →

Reading path

Continue along this topic path

Follow the recommended order for Python instead of jumping through random articles in the same topic.

View full topic path →

Next step

Go deeper into this topic

If this article is useful, continue from the topic page or subscribe to follow later updates.

Introduction: Why This Concept Gets More Confusing the More You Learn

Misconception 1: Python Only Has Reference Counting

Why People Think This

Why This Understanding Is Wrong

What Is the More Accurate Understanding

Misconception 2: Calling gc.collect() Will Immediately Release Memory

Why People Think This

Why This Understanding Is Wrong

What Is the More Accurate Understanding

Misconception 3: del Statement Immediately Deletes Objects

Why People Think This

Why This Understanding Is Wrong

What Is the More Accurate Understanding

Practical Circular Reference Detection

Circular References in ORM Models

Using gc.get_referrers() and gc.get_referents() to Track Reference Chains

Visualizing Reference Graphs with objgraph

Using weakref to Fix Circular References

del Method Pitfalls and Best Practices

5 Scenarios Where del Doesn’t Execute

Comparison with weakref.finalize

Deterministic Resource Cleanup: Context Managers

Strategy for Disabling del in Production

Memory Profiling Toolchain in Practice

memory_profiler for Line-Level Analysis

filprofiler Flame Graph Interpretation

Scalene for Comprehensive Performance Analysis

Integrating Memory Regression Testing into CI/CD

If You Need to Continue Distinguishing, Which Dimensions Should You Really Look At

A More Reliable Judgment Order

GC Optimization Practices in Large Model Scenarios

Typical Memory Traps in Hugging Face Transformers

GC Threshold Tuning: From Default (700,10,10) to Production Practice

Weak References in Model Caching

Generational GC Threshold Tuning

Detailed Mathematical Meaning of Threshold (700, 10, 10)

Optimal Threshold Configuration for Different Scenarios

A/B Testing Methodology for Threshold Tuning

GC Tuning Decision Tree

Conclusion: Finally Making Sense of It

References and Acknowledgments

You are reading: Python Memory Model Deep Dive

Current series chapters

Continue along this topic path

Original Interpretation: The Three-Layer World of Python Memory Architecture

Original Analysis: 72 Processes vs 1 Process—How GIL Becomes a Bottleneck for AI Training and PEP 703's Breakthrough

Original Analysis: Python as a Glue Language—How Bindings Connect Performance and Ease of Use

Continue with this topic

Original Analysis: Why FastAPI Rises in the AI Era—The Engineering Value of Type Hints and Async I/O

Original Analysis: Why Python Monopolizes LLM Development—Ecosystem Flywheel and Data Evidence

Original Analysis: Capability Building for Python Developers in the AI Tools Era—A Practical Guide for Frontline Engineers

Go deeper into this topic

Subscribe to updates

Comments and discussion