Article
Original Interpretation: Python Garbage Collection - The Three Most Common Misconceptions
Deconstructing the three major misconceptions about reference counting, gc.collect(), and del statements, establishing a complete cognitive framework for Python GC mechanisms (reference counting + generational GC + cycle detection)
Copyright and Disclaimer This article is an original interpretation based on Python official documentation and Real Python articles. Copyright of original articles belongs to their respective authors. This article is not an official translation, but clarification and reconstruction of common misconceptions.
Original References
- Garbage Collection in Python — Real Python
- Python
gcModule Documentation — Python.org
Introduction: Why This Concept Gets More Confusing the More You Learn
“Python uses reference counting for garbage collection.”
If you learned Python from textbooks or introductory tutorials, you’ve probably seen this sentence. It’s concise enough, correct enough, and good enough to get you through interviews.
But when you start writing large model training scripts and encounter memory explosions caused by circular references; when you call gc.collect() in production and find memory hasn’t moved; when you think the del statement deleted an object but memory usage remains high—this sentence starts to feel inadequate.
Worse, three different scenarios require three different answers, but the textbook only gave one sentence. This is the source of conceptual confusion.
This article deconstructs three of the most common misconceptions and rebuilds a more reliable Python GC understanding framework.
Misconception 1: Python Only Has Reference Counting
Why People Think This
Textbook simplification. Reference counting is CPython’s most visible GC mechanism—every object has an ob_refcnt field, every assignment, parameter passing, and container insertion modifies it. sys.getrefcount() can check it at any time, and Py_INCREF/Py_DECREF macros are everywhere in the source code.
Intuitive alignment. “Delete when no one references” aligns with human intuition. In contrast, Java’s reachability analysis and Go’s tri-color marking are more abstract.
Presence difference. Generational GC and cycle detection hide in the gc module, running automatically by default—most developers never directly call it.
Why This Understanding Is Wrong
Scenario: Circular References
class Node:
def __init__(self, value):
self.value = value
self.next = None
a = Node(1)
b = Node(2)
a.next = b
b.next = a # Circular reference!
del a
del b
# Both objects still exist because they reference each other
In this scenario, reference counting completely fails. a and b reference each other, both have reference count 1, and even if a and b are deleted externally, the two objects will never be recycled.
Reference count reaching 0 means immediate recycling, but not necessarily immediate release. Recycling means the object is destroyed, but whether memory is returned to the operating system depends on the three-layer architecture explained in Part 1.
The existence of Generational GC: Python’s gc module handles circular references. It uses a generational strategy, dividing objects into three generations (0, 1, 2). New objects are in generation 0, promoted to the next generation after surviving multiple GC cycles.
What Is the More Accurate Understanding
Python’s GC is a collaboration of three mechanisms:
Figure 1: Python Garbage Collection’s Three Mechanisms—The Collaboration of Reference Counting, Generational GC, and Cycle Detection
| Mechanism | Handles | Trigger Timing | Performance Characteristics |
|---|---|---|---|
| Reference Counting | Most objects | When reference changes | Immediate, deterministic, low overhead |
| Generational GC | Circular references | Threshold triggered/manual call | Delayed, non-deterministic, periodic |
| Cycle Detection | Container objects (list, dict, custom classes) | When GC runs | Mark-sweep algorithm |
Reference counting is the “main force”, handling over 90% of object lifecycles. But it cannot handle circular references, so Python needs generational GC as a supplement.
The relationship between the two is: Problems that reference counting can’t handle, then GC takes over.
Misconception 2: Calling gc.collect() Will Immediately Release Memory
Why People Think This
Inertia from other languages. Java has System.gc(), Go has runtime.GC(), C# has GC.Collect(). These languages’ GC is the main mechanism, and explicit calls usually trigger recycling.
Naming misdirection. collect implies “collecting garbage”—intuitively it should release memory. Documentation also says “Force garbage collection”.
Memory monitoring anxiety. Seeing high memory usage in production, the first reaction is “quickly call GC”.
Why This Understanding Is Wrong
gc.collect() only handles circular references. If your code has no circular references, or circular references have already been handled by previous GC cycles, gc.collect() does almost nothing.
Memory release has nothing to do with GC. As explained in Part 1, whether memory is returned to the operating system depends on pymalloc’s three-layer architecture (Arena → Pool → Block). Only when an entire Arena becomes empty will memory truly be released.
Generational GC’s delayed design. Python’s GC deliberately delays running to trade for throughput. The default threshold is (700, 10, 10), meaning GC is triggered when generation 0 objects exceed 700.
import gc
print(gc.get_threshold()) # (700, 10, 10)
This is not a bug, it’s a trade-off.
What Is the More Accurate Understanding
Generational GC’s working mechanism:
- Generation 0 (New objects): Placed in generation 0 when created. GC triggered when object count exceeds threshold (default 700).
- GC process: Mark surviving objects, clear dead objects (including circular references). Surviving objects promoted to generation 1.
- Generation 1: Objects surviving multiple GC cycles. When generation 1 object count exceeds threshold (10), GC runs on both generation 0 and 1 together.
- Generation 2 (Old objects): Objects surviving multiple GC cycles. When generation 2 object count exceeds threshold (10), GC runs on all three generations.
When manual calls make sense:
- Circular reference-intensive scenarios (graph structures, doubly linked lists)
- Long-running services needing to control GC pause time
- Test environments needing deterministic behavior
When not to call:
- Performance-sensitive code (GC pauses all threads)
- Memory usage caused by large numbers of small objects (check code, not GC)
- Memory not released to operating system (this is pooling strategy, not a GC problem)
Misconception 3: del Statement Immediately Deletes Objects
Why People Think This
Syntax intuitiveness. del looks like “delete”, and in English delete equals remove. C++ has a delete operator that immediately releases memory.
Interactive environment feedback. In Python REPL, after del x, x indeed no longer exists—accessing it raises NameError.
Documentation phrasing. Official documentation says “Delete the reference”, but most people read it as “Delete the object”.
Why This Understanding Is Wrong
del deletes references, not objects. Whether an object is deleted depends on whether its reference count reaches zero.
a = [1, 2, 3]
b = a
del a # Deleted the name a, but the list object still exists (referenced by b)
print(b) # [1, 2, 3] outputs normally
Circular reference scenario: Objects survive after del.
import gc
class Obj:
def __init__(self, name):
self.name = name
self.ref = None
def __del__(self):
print(f"Deleting {self.name}")
x = Obj("x")
y = Obj("y")
x.ref = y
y.ref = x
del x
del y
# Neither object is deleted at this point! __del__ is not called
# Until gc.collect() triggers cycle detection
print("Before gc.collect()")
gc.collect()
print("After gc.collect()")
del is not a destructor. It’s a finalizer, called when the object is recycled, but there’s no guarantee when it will be called, or even if it will ever be called due to circular references.
What Is the More Accurate Understanding
del statement → deletes name (reference) → reference count decreases
↓
reference count = 0?
↓
Yes → immediately recycle object (reference counting mechanism)
↓
No → object continues to survive
↓
Circular reference → wait for generational GC detection
Weak references (weakref) design intent: If you need to reference but don’t want to prevent garbage collection, use weak references.
import weakref
class Data:
pass
data = Data()
weak_ref = weakref.ref(data)
print(weak_ref()) # <__main__.Data object at 0x...>
del data
print(weak_ref()) # None (object has been recycled)
Context managers and deterministic cleanup: If you need to ensure resources are released, don’t use __del__, use the with statement.
with open("file.txt") as f:
data = f.read()
# When exiting the with block, f.close() is deterministically called
Practical Circular Reference Detection
After understanding Python GC’s theoretical mechanisms, let’s master circular reference detection and remediation techniques through real-world scenarios. These cases come from production environment issues, covering ORM frameworks, visualization diagnostic tools, and weak reference solutions.
Circular References in ORM Models
ORM frameworks are hotspots for circular references. Taking SQLAlchemy and Django ORM as examples, bidirectional relationships between models naturally create mutual references.
Circular References in SQLAlchemy
from sqlalchemy import Column, Integer, String, ForeignKey, create_engine
from sqlalchemy.orm import relationship, sessionmaker, declarative_base
Base = declarative_base()
class Department(Base):
__tablename__ = 'departments'
id = Column(Integer, primary_key=True)
name = Column(String)
# Relationship definition: Department -> Employee
employees = relationship("Employee", back_populates="department")
class Employee(Base):
__tablename__ = 'employees'
id = Column(Integer, primary_key=True)
name = Column(String)
dept_id = Column(Integer, ForeignKey('departments.id'))
# Back reference: Employee -> Department
department = relationship("Department", back_populates="employees")
# How circular references form
def create_circular_refs(session):
dept = Department(name="Engineering")
emp = Employee(name="Alice")
# Bidirectional association creates circular reference
dept.employees.append(emp) # dept references emp
# emp.department automatically points to dept, forming a cycle
session.add(dept)
session.commit()
# Even after session closes, references between objects persist
return dept, emp
# Test circular reference
dept, emp = create_circular_refs(session)
del dept, emp # Reference count not 0, objects not immediately released
gc.collect() # Trigger cycle detection before objects can be collected
Problem Analysis: The back_populates bidirectional relationship makes Department and Employee instances reference each other. When queries return large numbers of such objects, memory usage accumulates until GC triggers.
Similar Problem in Django ORM
# Django models.py
from django.db import models
class Author(models.Model):
name = models.CharField(max_length=100)
# Reverse relationship: Django auto-creates book_set
class Book(models.Model):
title = models.CharField(max_length=200)
author = models.ForeignKey(Author, on_delete=models.CASCADE, related_name='books')
# Scenario where circular references occur
def fetch_with_prefetch():
# select_related and prefetch_related load objects maintaining bidirectional references
authors = Author.objects.prefetch_related('books').all()
for author in authors:
for book in author.books.all():
# book.author and author.books form a circular reference chain
process(book)
# After authors list is deleted, circular references between internal objects persist
Using gc.get_referrers() and gc.get_referents() to Track Reference Chains
Python’s gc module provides two powerful introspection functions for manually tracing reference relationships.
import gc
class Node:
def __init__(self, name):
self.name = name
self.ref = None
def __repr__(self):
return f"Node({self.name})"
# Create circular reference
a = Node("A")
b = Node("B")
c = Node("C")
a.ref = b
b.ref = c
c.ref = a # Forms cycle: A -> B -> C -> A
def analyze_references(obj, depth=0, max_depth=5, visited=None):
"""Recursively analyze object reference relationships"""
if visited is None:
visited = set()
if depth > max_depth or id(obj) in visited:
return
visited.add(id(obj))
indent = " " * depth
print(f"{indent}Object: {obj} (id: {id(obj)})")
# Get objects this object references (outbound references)
referents = gc.get_referents(obj)
print(f"{indent} Referents ({len(referents)}) - Objects this object references:")
for ref in referents:
if isinstance(ref, (dict, list, tuple, Node)):
print(f"{indent} -> {type(ref).__name__}: {ref if isinstance(ref, Node) else '...'}")
# Get objects that reference this object (inbound references)
referrers = gc.get_referrers(obj)
print(f"{indent} Referrers ({len(referrers)}) - Objects referencing this object:")
for ref in referrers:
if ref is not visited: # Avoid printing gc module itself
ref_type = type(ref).__name__
if isinstance(ref, Node):
print(f"{indent} <- {ref_type}: {ref}")
elif isinstance(ref, dict):
print(f"{indent} <- {ref_type}: __dict__")
elif isinstance(ref, list):
print(f"{indent} <- {ref_type}: [...]")
print()
# Analyze node A's reference relationships
analyze_references(a)
Sample Output:
Object: Node(A) (id: 140312345678016)
Referents (3) - Objects this object references:
-> str: ...
-> dict: ...
-> Node: Node(B)
Referrers (2) - Objects referencing this object:
<- dict: __dict__
<- Node: Node(C)
From the output, we can clearly see the reference chain: Node(C) references Node(A), and Node(A) references Node(B).
Visualizing Reference Graphs with objgraph
objgraph is a third-party library that generates visual diagrams of object reference relationships—a powerful tool for diagnosing circular references.
# Install: pip install objgraph
import objgraph
import gc
class User:
def __init__(self, name):
self.name = name
self.friends = []
def add_friend(self, user):
self.friends.append(user)
# Create circular reference scenario
alice = User("Alice")
bob = User("Bob")
carol = User("Carol")
alice.add_friend(bob)
bob.add_friend(carol)
carol.add_friend(alice) # Forms cycle
# Generate reference graph
objgraph.show_backrefs(
[alice, bob, carol],
filename='circular_refs.png',
max_depth=3,
too_many=10
)
# Find most common object types
print("Most common object types:")
objgraph.show_most_common_types(limit=10)
# Find references between specific object types
users = objgraph.by_type('User')
print(f"\nFound {len(users)} User objects")
# Detect circular references
if len(users) >= 2:
objgraph.show_chain(
objgraph.find_backref_chain(users[0], lambda obj: obj in users[1:]),
filename='ref_chain.png'
)
Generated Reference Graph Explanation:
The generated PNG image shows arrows pointing between objects; circular references form closed loops. In real projects, you can visually see which objects form unreleaseable cycles.
Using weakref to Fix Circular References
The standard solution for circular references is using weak references. Weak references don’t prevent garbage collection—when an object only has weak references, GC can collect it normally.
Fixing ORM Model Circular References
import weakref
from typing import Optional
class SafeDepartment:
def __init__(self, name: str):
self.name = name
# Use weak reference set to store employees
self._employees = weakref.WeakSet()
@property
def employees(self):
"""Return strong reference list, but keep weak references internally"""
return list(self._employees)
def add_employee(self, emp):
self._employees.add(emp)
# Employee uses weak reference to department
emp._department_ref = weakref.ref(self)
class SafeEmployee:
def __init__(self, name: str):
self.name = name
self._department_ref = lambda: None # Default returns None
@property
def department(self) -> Optional['SafeDepartment']:
"""Access department through weak reference"""
return self._department_ref()
def __repr__(self):
return f"SafeEmployee({self.name})"
def __hash__(self):
return hash(self.name)
def __eq__(self, other):
return isinstance(other, SafeEmployee) and self.name == other.name
# After using weak references, circular reference is broken
dept = SafeDepartment("Engineering")
emp = SafeEmployee("Alice")
dept.add_employee(emp)
print(f"Employee department: {emp.department}") # Works normally
# After deleting department, employee no longer holds strong reference
del dept
gc.collect()
print(f"Department after deletion: {emp.department}") # None
Generic Weak Reference Pattern: Safe Observer Pattern Implementation
import weakref
from abc import ABC, abstractmethod
class Observer(ABC):
@abstractmethod
def notify(self, event):
pass
class EventSource:
"""Event source uses weak references to store observers, avoiding circular references"""
def __init__(self):
# Use WeakKeyDictionary, observers automatically removed when deleted
self._observers = weakref.WeakKeyDictionary()
def subscribe(self, observer: Observer, priority=0):
"""Subscribe to events using weak references"""
if not isinstance(observer, Observer):
raise TypeError("Observer must implement Observer interface")
self._observers[observer] = priority
def unsubscribe(self, observer: Observer):
"""Unsubscribe"""
self._observers.pop(observer, None)
def emit(self, event):
"""Emit event, notify all observers"""
# Sort by priority
sorted_observers = sorted(
self._observers.items(),
key=lambda x: x[1],
reverse=True
)
for observer, _ in sorted_observers:
observer.notify(event)
def get_subscriber_count(self):
return len(self._observers)
class ConcreteObserver(Observer):
"""Concrete observer"""
def __init__(self, name: str):
self.name = name
self.events_received = []
def notify(self, event):
self.events_received.append(event)
print(f"[{self.name}] Received: {event}")
def __hash__(self):
return hash(self.name)
def __eq__(self, other):
return isinstance(other, ConcreteObserver) and self.name == other.name
# Demonstrate circular reference problem solved
source = EventSource()
observers = [ConcreteObserver(f"Observer_{i}") for i in range(3)]
# Subscribe to events
for obs in observers:
source.subscribe(obs)
print(f"Subscriber count: {source.get_subscriber_count()}") # 3
# After deleting observer, EventSource automatically cleans up weak reference
del observers[0]
gc.collect()
print(f"Subscriber count after deletion: {source.get_subscriber_count()}") # 2
# Event emission works normally
source.emit("Test Event") # Only remaining 2 observers receive
Sample Output:
Subscriber count: 3
Subscriber count after deletion: 2
[Observer_1] Received: Test Event
[Observer_2] Received: Test Event
Key Points Summary:
| Scenario | Solution | Notes |
|---|---|---|
| ORM bidirectional relationships | Use WeakSet/WeakKeyDictionary | Requires custom relationship management logic |
| Observer pattern | Weak references to store observers | Observers may be collected anytime, check for None |
| Cache systems | WeakValueDictionary | Cached object lifecycle controlled externally |
| Parent references | weakref.ref(parent) | Check validity when accessing using ref() |
The core of circular reference detection and remediation is: Identify reference relationships -> Visualize confirmation -> Replace unnecessary strong references with weak references. In production environments, regular memory analysis with objgraph is recommended, especially in long-running services.
del Method Pitfalls and Best Practices
__del__ is one of Python’s most misunderstood mechanisms. Many developers coming from C++ expect it to work like a destructor, but it has completely different semantics and limitations.
5 Scenarios Where del Doesn’t Execute
The following scenarios can cause __del__ methods to never be called:
import gc
import sys
import os
class ResourceHandler:
"""Demonstrates scenarios where __del__ doesn't execute"""
def __init__(self, name):
self.name = name
print(f"[{self.name}] Resource allocated")
def __del__(self):
print(f"[{self.name}] __del__ called - resource cleaned up")
# Scenario 1: Circular references
print("=== Scenario 1: Circular References ===")
a = ResourceHandler("A")
b = ResourceHandler("B")
a.ref = b
b.ref = a
del a
del b
# __del__ not called because circular references keep ref count > 0
print(f"Circular reference garbage count: {len(gc.garbage)}")
# Scenario 2: Interpreter abnormal exit
print("\n=== Scenario 2: Interpreter Abnormal Exit ===")
handler = ResourceHandler("C")
os._exit(1) # Force exit, __del__ won't execute
# Scenario 3: Process killed (SIGKILL)
# kill -9 <pid> terminates process immediately, no cleanup executed
# Scenario 4: __del__ itself raises exception
class BadResource:
def __del__(self):
raise Exception("Cleanup failed") # Exception is ignored, resource not cleaned
# Scenario 5: Module unload while object still referenced
import atexit
class GlobalResource:
def __del__(self):
print("GlobalResource.__del__ called")
global_res = GlobalResource()
@atexit.register
def cleanup():
# If global_res is still referenced by other modules, __del__ won't execute
print(f"At program exit global_res still alive: {global_res}")
Summary of 5 Scenarios Where Execution Is Not Guaranteed:
| Scenario | Trigger Condition | Consequence |
|---|---|---|
| Circular references | Objects reference each other | __del__ never executes |
| Abnormal exit | os._exit() / SIGKILL | All cleanup skipped |
__del__ exception | Cleanup code raises exception | Exception ignored, may leak resources |
| Module-level references | Still referenced at module unload | Object survives in weird state |
| Interpreter termination | Python process ends | No __del__ guaranteed to be called |
Comparison with weakref.finalize
weakref.finalize provides more reliable cleanup than __del__:
import weakref
class SafeResource:
"""Safe resource management using weakref.finalize"""
def __init__(self, name):
self.name = name
self._file = open(f"/tmp/{name}.txt", 'w')
# Register finalizer, executes when object is garbage collected
self._finalizer = weakref.finalize(
self, # Monitored object
self._cleanup, # Cleanup function
self._file, # Arguments passed to cleanup function
self.name # Additional arguments
)
@staticmethod
def _cleanup(file_obj, name):
"""Static method: doesn't hold self reference, avoids circular references"""
print(f"[{name}] Executing cleanup...")
if not file_obj.closed:
file_obj.close()
print(f"[{name}] File closed")
def close(self):
"""Explicit close (optional)"""
self._finalizer() # Execute cleanup immediately
@property
def closed(self):
return self._finalizer.alive
# Comparison demonstration
print("=== weakref.finalize Comparison ===")
# Traditional __del__ approach (not recommended)
class OldStyle:
def __del__(self):
print("OldStyle.__del__")
# New approach (recommended)
class NewStyle:
def __init__(self):
self._finalizer = weakref.finalize(self, lambda: print("NewStyle cleaned up"))
old = OldStyle()
new = NewStyle()
# Even with circular references
circular_old = OldStyle()
circular_new = NewStyle()
circular_old.ref = circular_new
circular_new.ref = circular_old
del old, new, circular_old, circular_new
gc.collect()
# weakref.finalize still executes, while __del__ doesn't
Key Differences:
| Feature | __del__ | weakref.finalize |
|---|---|---|
| Circular references | Doesn’t execute | Executes normally |
| Exception handling | Exception ignored | Can check via return value |
| Execution timing | Uncertain | When object is collected |
| Cancel ability | None | Can detach() |
| Holds reference | May create new references | Doesn’t hold object references |
Deterministic Resource Cleanup: Context Managers
For resources requiring deterministic cleanup, context managers are the only reliable choice:
from contextlib import contextmanager
from typing import Generator
import tempfile
import shutil
import os
class ManagedResource:
"""Production-grade resource management class"""
def __init__(self, name: str, temp_dir: str = None):
self.name = name
self._temp_dir = temp_dir or tempfile.mkdtemp()
self._files = []
self._open = True
def __enter__(self):
"""with statement entry"""
return self
def __exit__(self, exc_type, exc_val, exc_tb):
"""with statement exit - guaranteed execution"""
self.close()
# Return False to propagate exceptions
return False
def close(self):
"""Explicit resource close"""
if self._open:
print(f"[{self.name}] Cleaning up...")
for f in self._files:
if os.path.exists(f):
os.remove(f)
if os.path.exists(self._temp_dir):
shutil.rmtree(self._temp_dir)
self._open = False
print(f"[{self.name}] Resource released")
def create_file(self, filename: str, content: str):
"""Create file in managed directory"""
if not self._open:
raise RuntimeError("Resource already closed")
filepath = os.path.join(self._temp_dir, filename)
with open(filepath, 'w') as f:
f.write(content)
self._files.append(filepath)
return filepath
@property
def is_open(self) -> bool:
return self._open
# Context manager factory function (cleaner API)
@contextmanager
def managed_temp_dir(prefix: str = "tmp") -> Generator[str, None, None]:
"""
Context manager for managing temporary directories
Usage example:
with managed_temp_dir("myapp") as tmpdir:
# Use tmpdir here
process_files(tmpdir)
# Auto-cleanup on exit
"""
tmpdir = tempfile.mkdtemp(prefix=prefix)
try:
yield tmpdir
finally:
# Guaranteed execution, even if exception occurs
shutil.rmtree(tmpdir, ignore_errors=True)
# Usage examples
print("=== Context Manager Usage ===")
# Method 1: Class approach
with ManagedResource("MyResource") as res:
res.create_file("data.txt", "hello world")
# ... business logic ...
# Automatically calls close() when exiting with block
# Method 2: Decorator approach
with managed_temp_dir("myapp") as tmp:
print(f"Using temporary directory: {tmp}")
# ... business logic ...
# Auto-cleanup
# Comparison: Wrong way (depends on __del__)
class BadResource:
def __init__(self):
self.file = open("/tmp/bad.txt", "w")
def __del__(self):
self.file.close() # Not guaranteed to execute!
# bad = BadResource() # May leak file handles
Strategy for Disabling del in Production
In strict codebases, __del__ should be actively avoided:
# 1. Code review rules (.pylintrc / setup.cfg)
"""
[MESSAGES CONTROL]
disable=unnecessary-dunder-call
disable=invalid-name
# Ban __del__ methods
[REFACTORING]
disable=no-self-use
"""
# 2. Custom decorator enforcing context manager usage
import functools
import warnings
def deprecated_del(cls):
"""Mark class as not using __del__"""
original_del = getattr(cls, '__del__', None)
def new_del(self):
warnings.warn(
f"{cls.__name__} uses __del__ which may cause resource leaks, "
"please use context manager (with statement) instead",
ResourceWarning,
stacklevel=2
)
if original_del:
original_del(self)
cls.__del__ = new_del
return cls
# 3. Static analysis tool check
"""
# .pre-commit-config.yaml
repos:
- repo: local
hooks:
- id: check-no-del
name: Check for __del__ usage
entry: grep -r "__del__" --include="*.py"
language: system
pass_filenames: false
"""
# 4. Runtime detection (development environment)
import atexit
import gc
def check_leaked_resources():
"""Check unreleased resources at program exit"""
gc.collect()
if gc.garbage:
print(f"Warning: {len(gc.garbage)} uncollectable objects detected")
for obj in gc.garbage[:5]: # Show first 5
print(f" - {type(obj).__name__}: {repr(obj)[:100]}")
if __debug__:
atexit.register(check_leaked_resources)
Best Practices Summary:
- Never rely on
__del__for critical resource cleanup - It may not execute at any time - Always use context managers -
withstatements provide deterministic cleanup - Use
weakref.finalizefor non-critical cleanup - Such as logging, statistics collection - Explicit over implicit - Provide
.close()method for users to call actively - Code review bans
__del__- Include in team coding standards
Memory Profiling Toolchain in Practice
After mastering GC mechanisms, we need tools to diagnose real problems. Here is a production-validated memory profiling toolchain.
memory_profiler for Line-Level Analysis
memory_profiler provides line-by-line memory usage reports, the most precise tool for locating memory hotspots.
# Install: pip install memory_profiler
from memory_profiler import profile
import numpy as np
@profile
def process_large_dataset():
"""Line-by-line memory analysis"""
# Line 1: Allocate large dataset
data = np.random.randn(1000, 1000) # ~8MB
# Line 2: Create copy
processed = data * 2 # Another ~8MB
# Line 3: Type conversion
result = processed.astype(np.float32) # May trigger temporary allocation
# Line 4: Release intermediate results
del data, processed
return result
# Run analysis
# python -m memory_profiler script.py
# Output:
# Line # Mem usage Increment Line Contents
# ================================================
# 6 38.5 MiB 38.5 MiB @profile
# 7 def process_large_dataset():
# 8 46.8 MiB 8.3 MiB data = np.random.randn(1000, 1000)
# 9 54.8 MiB 8.0 MiB processed = data * 2
# 10 54.9 MiB 0.1 MiB result = processed.astype(np.float32)
# 11 46.8 MiB -8.1 MiB del data, processed
# 12 46.8 MiB 0.0 MiB return result
Key Metric Interpretation:
- Mem usage: Total memory after line execution
- Increment: Memory change from that line (positive = allocation, negative = release)
- Focus on large increments and unreleased accumulation
Advanced Usage: Time-decay Sampling
from memory_profiler import memory_usage
import time
def monitor_memory_over_time(func, interval=0.1):
"""Monitor memory changes during function execution"""
mem_usage = memory_usage(
(func, (), {}), # (func, args, kwargs)
interval=interval, # Sampling interval
timeout=None,
max_usage=True, # Return peak
retval=True # Return function result
)
return mem_usage
# Usage
peak_mem, result = monitor_memory_over_time(process_large_dataset)
print(f"Peak memory: {peak_mem:.1f} MiB")
filprofiler Flame Graph Interpretation
filprofiler focuses on answering “who allocated the memory”, generating flame graphs that intuitively show allocation stacks.
# Install: pip install filprofiler
# Run: fil-profile run script.py
# Code example
def load_and_process():
"""Simulate data processing workflow"""
# 1. Data loading
raw_data = load_from_database() # Many small objects
# 2. Transformation
transformed = [transform(item) for item in raw_data]
# 3. Aggregation
result = aggregate(transformed)
return result
def transform(item):
"""Transformation logic for each element"""
return {
'id': item['id'],
'features': extract_features(item['raw']) # May allocate large memory
}
# fil-profile generates flame graphs like:
# 100% |----------------------------------------|
# | load_and_process |
# 80% |--------------|-------------------------|
# |load_from_db | transform |
# 60% |--------------| |-----------------|
# | raw_data | |extract_feat |
# 40% |--------------| |-----------------|
# | (list, ...) | | (numpy, ...) |
#
# Wider = more memory allocated on that code path
# Bottom to top = call stack
Flame Graph Analysis Points:
- Top layer width = Total allocation for that code path
- Sudden widening = Large allocations occurring here
- Watch Python built-ins =
list.append,dict.update, etc.
Scalene for Comprehensive Performance Analysis
scalene is the most advanced Python profiler, providing both CPU and memory analysis.
# Install: pip install scalene
# Run: scalene script.py
# Code example
def memory_intensive_task():
"""Demonstrate various memory allocation patterns"""
# Native Python allocation (CPU + memory)
data = []
for i in range(100000):
data.append(str(i)) # Many small strings
# NumPy allocation (CPU + memory)
import numpy as np
matrix = np.random.randn(5000, 5000)
# Mixed computation
result = sum(len(s) for s in data) + matrix.sum()
return result
# Scalene output format:
# | Time (s) | Memory (MB) | Line | Code |
# |----------|-------------|------|-----------------------------|
# | 0.523 | 45.2 | 6 | data = [] |
# | 2.145 | 156.3 | 8 | data.append(str(i)) |
# | 0.089 | 190.7 | 12 | matrix = np.random.randn... |
# | 0.234 | 0.0 | 15 | result = sum(len(s) for... |
#
# Interpretation:
# - Line 8 consumes 2.145s and 156MB, main bottleneck
# - Line 12 allocates 190MB (matrix) but only 0.089s (C code)
Scalene’s Unique Value:
- Distinguish Python vs Native code time consumption
- Distinguish CPU vs GPU memory
- Line-level precise analysis
- Low overhead (can analyze production code)
Integrating Memory Regression Testing into CI/CD
Finally, automate memory detection:
# test_memory_regression.py
import tracemalloc
import unittest
from functools import wraps
def memory_limit(max_mb: float):
"""Decorator: limit test case memory usage"""
def decorator(func):
@wraps(func)
def wrapper(*args, **kwargs):
tracemalloc.start()
try:
result = func(*args, **kwargs)
current, peak = tracemalloc.get_traced_memory()
peak_mb = peak / 1024 / 1024
if peak_mb > max_mb:
raise AssertionError(
f"Memory exceeded: {peak_mb:.1f}MB > limit: {max_mb}MB"
)
return result
finally:
tracemalloc.stop()
return wrapper
return decorator
class MemoryRegressionTest(unittest.TestCase):
"""Memory regression test suite"""
@memory_limit(100) # Limit 100MB
def test_data_processing(self):
"""Ensure data processing stays within memory budget"""
large_list = [i for i in range(1_000_000)]
result = sum(large_list)
self.assertEqual(result, sum(range(1_000_000)))
@memory_limit(50)
def test_model_inference(self):
"""Model inference memory test"""
# Simulate inference
import numpy as np
inputs = np.random.randn(32, 512) # batch=32, dim=512
# ... inference code ...
output = inputs @ np.random.randn(512, 1000)
self.assertEqual(output.shape, (32, 1000))
# CI/CD integration (.github/workflows/memory.yml)
"""
name: Memory Regression Test
on: [push, pull_request]
jobs:
memory-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Run memory tests
run: |
python -m pytest test_memory_regression.py -v
- name: Generate memory report
run: |
python -m memory_profiler profile_script.py > memory_report.txt
- name: Upload report
uses: actions/upload-artifact@v3
with:
name: memory-report
path: memory_report.txt
"""
Tool Selection Decision Tree:
Need memory analysis?
├── Locate specific line -> memory_profiler
├── View allocation call stack -> filprofiler
├── Combined CPU + memory -> scalene
└── CI/CD automation -> pytest + tracemalloc
If You Need to Continue Distinguishing, Which Dimensions Should You Really Look At
Three dimensions to help you judge memory problems:
| Dimension | Checkpoints | Tools |
|---|---|---|
| Object Type | Mutable vs Immutable | type(), isinstance() |
| Lifecycle | Short-cycle vs Long-cycle | tracemalloc |
| Reference Pattern | Tree vs Graph structure | gc.get_referrers() |
Object Type: Mutable objects (list, dict) are more prone to circular references. Immutable objects (tuple, int, str) usually don’t.
Lifecycle: Short-cycle objects (temporary variables in functions) are handled by reference counting, never reaching GC. Long-cycle objects (global caches, long connections) need GC attention.
Reference Pattern: Tree structures (DOM, AST) are handled by reference counting. Graph structures (social networks, dependency relationships) are prone to circular references.
A More Reliable Judgment Order
When encountering memory problems, troubleshoot in this order:
Step 1: Check Circular References
import gc
# Key: Must set DEBUG_SAVEALL, otherwise circular reference objects won't enter gc.garbage
gc.set_debug(gc.DEBUG_SAVEALL)
gc.collect()
# View recycled but still surviving objects (Python 3.4+ won't put in gc.garbage by default)
print(gc.garbage) # Only has content after DEBUG_SAVEALL is set
Important Notes:
- Python 3.4+ for performance reasons won’t put recycled objects in
gc.garbageby default - Must explicitly set
gc.DEBUG_SAVEALLto collect these objects - Production environments not recommended to keep DEBUG_SAVEALL enabled long-term, brings performance overhead
Step 2: Check Reference Count
import sys
# Note: getrefcount itself creates a temporary reference
print(sys.getrefcount(obj) - 1)
Step 3: Consider Pooling Strategy
If the first two steps are fine, high memory usage is likely pymalloc’s pooling strategy. Refer to Part 1’s tracemalloc method for diagnosis.
GC Optimization Practices in Large Model Scenarios
Typical Memory Traps in Hugging Face Transformers
In large model training and inference, GC problems are often magnified to the GB level. Here are three real-world cases encountered in production environments.
Case 1: Circular References Between Model Weights and Optimizer States
import torch
from transformers import AutoModel
# Problem code
def create_training_setup():
model = AutoModel.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters())
# Implicit circular reference!
# optimizer references model.parameters()
# Some callbacks may reference both
return model, optimizer
# After training ends
del model, optimizer
gc.collect() # Memory doesn't move!
Root Cause Analysis:
- PyTorch’s
Optimizerholds references tomodel.parameters() - Some learning rate schedulers or callback functions hold references to both
- Forms triangular circular reference
- GC can recycle, but causes out-of-memory before recycling
Solution:
def cleanup_training(model, optimizer):
"""Safe training resource cleanup"""
# Step 1: Clear optimizer state
optimizer.state.clear()
# Step 2: Disconnect parameter references
optimizer.param_groups.clear()
# Step 3: Force GC
gc.collect()
# Step 4: If CUDA, clear cache
if torch.cuda.is_available():
torch.cuda.empty_cache()
Case 2: DataLoader Multi-Process Memory Leak
from torch.utils.data import DataLoader
# Problem code: memory leak when num_workers > 0
train_loader = DataLoader(
dataset,
batch_size=32,
num_workers=4, # Multi-process data loading
persistent_workers=True # Persistent processes
)
# After each epoch
for batch in train_loader:
# Training logic
pass
# Epoch ends, but subprocess memory not released!
Root Cause Analysis:
- Multi-process DataLoader uses
fork()to create subprocesses - Subprocesses inherit parent process memory space (copy-on-write)
persistent_workers=Truekeeps processes between epochs- Some shared memory regions cannot be properly released
Solution:
class SafeDataLoader:
"""DataLoader wrapper with automatic cleanup"""
def __init__(self, *args, **kwargs):
self.loader_args = args
self.loader_kwargs = kwargs
self._loader = None
def __enter__(self):
self._loader = DataLoader(*self.loader_args, **self.loader_kwargs)
return self._loader
def __exit__(self, *args):
# Explicitly close DataLoader
if hasattr(self._loader, '_iterator'):
self._loader._iterator._shutdown_workers()
self._loader = None
gc.collect()
# Usage
with SafeDataLoader(dataset, batch_size=32, num_workers=4) as loader:
for batch in loader:
# Training
pass
# Automatic cleanup
GC Threshold Tuning: From Default (700,10,10) to Production Practice
Understanding Default Thresholds:
import gc
# Default thresholds
print(gc.get_threshold()) # (700, 10, 10)
# Meaning:
# - GC triggered when generation 0 objects exceed 700
# - Generation 0 triggers generation 1 GC after 10 GC cycles
# - Generation 1 triggers generation 2 GC after 10 GC cycles
Tuning Strategies for Large Model Training Scenarios:
Strategy 1: Long Training Tasks—Reduce GC Frequency
# Scenario: Training large models that need to run for days
# Strategy: Reduce GC frequency, decrease STW (Stop-The-World) pauses
# Tuned thresholds
gc.set_threshold(2000, 50, 50)
# Monitor effects
import time
def profile_gc():
gc.set_debug(gc.DEBUG_STATS)
# Run for a while
time.sleep(3600) # 1 hour
# View GC statistics
# Example output:
# gc: done, 1234567 unreachable, 2345678 collected, 0.234s elapsed
Results:
- GC trigger frequency reduced by ~60%
- Single GC time increased by ~40%
- Overall GC overhead reduced by ~30%
- Training throughput improved by ~2-3%
Strategy 2: Inference Services—Freeze Early Objects
# Scenario: Long-running LLM inference service
# Strategy: Freeze known clean early objects
import gc
# After service startup, model loading complete
model = load_model()
# Freeze generation 0 and 1, these objects are known to be long-lifecycle
gc.freeze()
# After this, only newly created objects are scanned by GC
# Reduces number of objects GC needs to traverse
# Periodically (e.g., hourly) unfreeze and refreeze
def refresh_freeze():
gc.unfreeze()
gc.collect()
gc.freeze()
Strategy 3: Critical Paths—Temporarily Disable GC
# Scenario: Critical inference paths need deterministic latency
# Strategy: Temporarily disable GC in critical paths
import gc
class CriticalPath:
def __enter__(self):
# Save current state
self.gc_was_enabled = gc.isenabled()
# Disable GC
gc.disable()
return self
def __exit__(self, *args):
# Restore GC
if self.gc_was_enabled:
gc.enable()
# Manually trigger one full GC
gc.collect()
# Usage
with CriticalPath():
# Critical inference logic
result = model.generate(prompt)
# Automatically restores GC and cleans up after exit
Risk Warnings:
- Disabling GC for long periods may cause memory exhaustion
- Recommend keeping critical path time under 1 second
- Only use in performance-critical scenarios
Weak References in Model Caching
import weakref
from functools import lru_cache
class ModelCache:
"""Model cache implemented with weak references"""
def __init__(self):
# Use weak references, don't prevent models from being GC'd
self._cache = weakref.WeakValueDictionary()
self._access_count = {}
def get(self, model_name: str):
model = self._cache.get(model_name)
if model is not None:
self._access_count[model_name] = self._access_count.get(model_name, 0) + 1
return model
def put(self, model_name: str, model):
# Key boundary conditions:
# 1. Model must support weak references (have __weakref__ attribute)
# 2. Basic types (int, str, tuple, etc.) don't support weak references
# 3. None cannot be stored in WeakValueDictionary
self._cache[model_name] = model
self._access_count[model_name] = 0
def get_stats(self):
"""Return cache statistics"""
return {
'cached_models': list(self._cache.keys()),
'access_counts': self._access_count.copy()
}
# Usage
cache = ModelCache()
# Load model and cache
cache.put("gpt2", load_model("gpt2"))
# Get model
model = cache.get("gpt2") # Returns model object
# When memory is insufficient, GC can recycle cached models
# Because there are no strong references, only weak references
# ⚠️ Boundary condition examples:
# 1. Basic types don't support weak references
cache.put("answer", 42) # ❌ TypeError: cannot create weak reference to 'int' object
# 2. If need to cache basic types, need wrapping
class CachedValue:
def __init__(self, value):
self.value = value
cache.put("answer", CachedValue(42)) # ✅
Key Boundary Conditions:
| Limitation | Explanation | Solution |
|---|---|---|
__weakref__ required | Objects must support weak references | Most custom classes support by default |
| Basic types not supported | int/str/tuple etc. not supported | Use wrapper class or WeakKeyDictionary |
| None cannot be stored | WeakValueDictionary rejects None | Use placeholder object or handle separately |
| Lifecycle uncertainty | May be GC’d at any time | Always check if get() return is None |
Generational GC Threshold Tuning
The mathematical meaning of the default threshold (700, 10, 10) needs deep understanding for effective tuning. These three numbers form Python GC’s decision matrix.
Detailed Mathematical Meaning of Threshold (700, 10, 10)
import gc
# Mathematical meaning of default thresholds
threshold_0, threshold_1, threshold_2 = gc.get_threshold()
print(f"Thresholds: ({threshold_0}, {threshold_1}, {threshold_2})")
# Trigger logic:
# - Generation 0 allocated objects > 700: trigger gen0 GC
# - gen0 GC count > 10: trigger gen0+gen1 GC
# - gen1 GC count > 10: trigger gen0+gen1+gen2 GC
The mathematical model can be understood as:
| Threshold | Trigger Condition | Practical Meaning | Affected Objects |
|---|---|---|---|
| 700 | Generation 0 allocation count | Short-cycle object accumulation rate | Newly created temporary objects |
| 10 | Generation 0 GC count | Object promotion rate to generation 1 | Medium lifecycle objects |
| 10 | Generation 1 GC count | Object promotion rate to generation 2 | Long lifecycle objects |
Key Insight: The threshold product (700 × 10 × 10 = 70,000) approximately represents how many allocations an object needs to experience to be promoted from generation 0 to generation 2.
Optimal Threshold Configuration for Different Scenarios
Web Service Scenario: Pursuing Low Latency
# Web service's core trade-off: response latency vs memory usage
def configure_web_service_gc():
"""
Goal: Reduce GC pauses' impact on request response time
Strategy: More frequent but faster GC, avoiding single long pauses
"""
# Reduce single GC processing volume, increase frequency
gc.set_threshold(300, 5, 5)
# Why this configuration:
# - 300: Quickly recycle short-term objects, prevent accumulation
# - 5: Accelerate object promotion, reduce repeated scanning
# - 5: Trigger full GC faster, avoid memory continuous growth
# Effect: GC pause from ~50ms reduced to ~20ms, but frequency increases
# Suitable for: API gateways, microservices, short-connection scenarios
Data Processing Scenario: Pursuing Throughput
def configure_data_processing_gc():
"""
Goal: Maximize data processing throughput
Strategy: Reduce GC frequency, tolerate higher memory usage
"""
# Data stream processing tasks, object lifecycle is clear
gc.set_threshold(2000, 50, 50)
# Why this configuration:
# - 2000: Allow more objects to accumulate in generation 0, reduce GC frequency
# - 50: Lower promotion frequency, let objects survive longer in generation 0
# - 50: Significantly reduce full GC frequency
# Effect: Throughput improves 15-25%, memory usage increases 30%
# Suitable for: ETL tasks, batch processing, data transformation
Long Training Task Scenario: Balanced Strategy
def configure_training_gc():
"""
Goal: Maintain stable performance during training
Strategy: Dynamic adjustment, switch strategies based on training stage
"""
import gc
class GCManager:
def __init__(self):
self.default_threshold = (700, 10, 10)
self.data_loading_threshold = (1500, 30, 30)
self.training_threshold = (3000, 100, 50)
def set_data_loading_mode(self):
"""Data loading stage: High-frequency GC prevents memory explosion"""
gc.set_threshold(*self.data_loading_threshold)
print("GC mode: Data loading - Medium frequency")
def set_training_mode(self):
"""Training stage: Low-frequency GC maximizes GPU utilization"""
gc.set_threshold(*self.training_threshold)
gc.freeze() # Freeze known clean objects
print("GC mode: Training - Low frequency, early objects frozen")
def set_checkpoint_mode(self):
"""Checkpoint stage: Force full GC"""
gc.set_threshold(*self.default_threshold)
gc.unfreeze()
gc.collect()
print("GC mode: Checkpoint - Force full collection")
return GCManager()
# Usage example
gc_manager = configure_training_gc()
# Data loading stage
gc_manager.set_data_loading_mode()
# ... load data ...
# Training stage
gc_manager.set_training_mode()
# ... training loop ...
# Save checkpoint
gc_manager.set_checkpoint_mode()
# ... save model ...
A/B Testing Methodology for Threshold Tuning
Scientific tuning requires quantitative metrics. Here is a complete A/B testing framework:
import gc
import time
import statistics
from dataclasses import dataclass
from typing import List, Callable
@dataclass
class GCMetrics:
"""GC performance metrics"""
threshold_config: tuple
total_time: float
gc_pause_times: List[float]
peak_memory_mb: float
objects_collected: int
@property
def avg_gc_pause(self) -> float:
return statistics.mean(self.gc_pause_times) if self.gc_pause_times else 0
@property
def max_gc_pause(self) -> float:
return max(self.gc_pause_times) if self.gc_pause_times else 0
class GCTuningABTest:
"""A/B testing framework for GC threshold tuning"""
def __init__(self):
self.results: List[GCMetrics] = []
def measure_config(self, threshold: tuple, workload: Callable, runs: int = 3) -> GCMetrics:
"""
Measure GC performance under specific threshold configuration
Args:
threshold: (gen0_thresh, gen1_thresh, gen2_thresh)
workload: Function simulating workload
runs: Number of runs to average
"""
gc_pause_times = []
total_collected = 0
# Set GC debug to capture pause times
gc.set_debug(gc.DEBUG_STATS)
def gc_callback(phase, info):
"""GC event callback, record pause times"""
if phase == "stop":
gc_pause_times.append(info.get('elapsed', 0))
# Install callback (Python 3.7+)
if hasattr(gc, 'callbacks'):
gc.callbacks.append(gc_callback)
run_times = []
for _ in range(runs):
# Clean state
gc.collect()
gc.set_threshold(*threshold)
# Measure
start = time.perf_counter()
collected_before = gc.get_stats()[0]['collected']
workload()
elapsed = time.perf_counter() - start
collected_after = gc.get_stats()[0]['collected']
run_times.append(elapsed)
total_collected += (collected_after - collected_before)
# Clean callback
if hasattr(gc, 'callbacks'):
gc.callbacks.remove(gc_callback)
return GCMetrics(
threshold_config=threshold,
total_time=statistics.mean(run_times),
gc_pause_times=gc_pause_times,
peak_memory_mb=self._get_peak_memory(),
objects_collected=total_collected // runs
)
def _get_peak_memory(self) -> float:
"""Get peak memory usage (MB)"""
import tracemalloc
if tracemalloc.is_tracing():
current, peak = tracemalloc.get_traced_memory()
return peak / 1024 / 1024
return 0.0
def compare_configs(self, configs: List[tuple], workload: Callable) -> None:
"""
Compare multiple threshold configurations
Example:
configs = [
(700, 10, 10), # Default
(300, 5, 5), # Web services
(2000, 50, 50), # Data processing
(3000, 100, 100), # Long tasks
]
"""
print("=" * 80)
print("GC Threshold A/B Test Results")
print("=" * 80)
print(f"{'Config':<20} {'Total(s)':<10} {'AvgGC(ms)':<10} {'MaxGC(ms)':<10} {'Peak(MB)':<12}")
print("-" * 80)
for config in configs:
metrics = self.measure_config(config, workload)
self.results.append(metrics)
config_str = f"{config}"
print(f"{config_str:<20} {metrics.total_time:<10.3f} "
f"{metrics.avg_gc_pause*1000:<10.2f} "
f"{metrics.max_gc_pause*1000:<10.2f} "
f"{metrics.peak_memory_mb:<12.1f}")
print("=" * 80)
self._print_recommendation()
def _print_recommendation(self) -> None:
"""Provide recommendations based on results"""
if not self.results:
return
# Sort by average GC pause
by_pause = sorted(self.results, key=lambda x: x.avg_gc_pause)
best_latency = by_pause[0]
# Sort by total time
by_throughput = sorted(self.results, key=lambda x: x.total_time)
best_throughput = by_throughput[0]
print("\nRecommended configurations:")
print(f" Lowest latency: {best_latency.threshold_config} "
f"(avg GC pause: {best_latency.avg_gc_pause*1000:.2f}ms)")
print(f" Best throughput: {best_throughput.threshold_config} "
f"(total time: {best_throughput.total_time:.3f}s)")
# Usage example
def sample_workload():
"""Example workload: Create and destroy many objects"""
data = []
for i in range(10000):
obj = {'index': i, 'data': [0] * 100}
data.append(obj)
if len(data) > 1000:
data = data[500:] # Keep half, simulate partial survival
# Run A/B test
test = GCTuningABTest()
test.compare_configs(
configs=[(700, 10, 10), (300, 5, 5), (2000, 50, 50)],
workload=sample_workload
)
A/B Testing Best Practices:
- Control variables: Only change one threshold parameter at a time
- Multiple runs: At least 3-5 runs to average, eliminate noise
- Real workload: Use actual data patterns from production
- Monitor tail latency: Watch P99 GC pauses, not just averages
- Memory pressure: Test under near-memory-limit conditions
GC Tuning Decision Tree
Problem: GC causing performance issues?
├── Yes -> Scenario judgment
│ ├── Training task -> Increase threshold, reduce frequency
│ ├── Inference service -> Freeze early objects
│ └── Critical path -> Temporarily disable GC
└── No -> Check memory leak
├── Circular reference -> Use weakref or manually disconnect
└── Object leak -> Track with tracemalloc
Conclusion: Finally Making Sense of It
“Python uses reference counting for garbage collection”—this sentence is not wrong, but it’s incomplete.
The complete understanding is: Python uses reference counting to handle most object lifecycles, uses generational GC to handle circular references, and uses memory pool strategy to manage memory release.
Three misconceptions correspond to three scenarios:
- Circular references → Reference counting fails, needs generational GC
- Memory not released → GC only handles object recycling, doesn’t control memory return
- del not working → Deletes references, object survival depends on reference count
Next time you encounter a memory problem, ask yourself first: Can reference counting handle this? If yes, check references; if not, check circular references; if neither, consider pooling strategy.
In the next article, we’ll dive deep into the GIL and concurrency—seeing why 72 processes vs 1 process is Meta AI’s real dilemma, and how PEP 703 changes everything.
References and Acknowledgments
- Python
gcModule Documentation — Python.org - Python
tracemallocModule Documentation — Python.org - Memory Management in Python — Real Python
- “Garbage Collection in Python” — Various sources
Series context
You are reading: Python Memory Model Deep Dive
This is article 2 of 7. Reading progress is stored only in this browser so the full series page can resume from the right entry.
Series Path
Current series chapters
Chapter clicks store reading progress only in this browser so the series page can resume from the right entry.
- Original Interpretation: The Three-Layer World of Python Memory Architecture Why doesn't memory drop after deleting large lists? Understanding the engineering trade-offs and design logic of Python's Arena-Pool-Block three-layer memory architecture
- Original Interpretation: Python Garbage Collection - The Three Most Common Misconceptions Deconstructing the three major misconceptions about reference counting, gc.collect(), and del statements, establishing a complete cognitive framework for Python GC mechanisms (reference counting + generational GC + cycle detection)
- Original Analysis: 72 Processes vs 1 Process—How GIL Becomes a Bottleneck for AI Training and PEP 703's Breakthrough Reviewing real production challenges at Meta AI and DeepMind, analyzing PEP 703's Biased Reference Counting (BRC) technology, and exploring the implications of Python 3.13+ nogil builds for large-scale model concurrency
- Original Analysis: Python as a Glue Language—How Bindings Connect Performance and Ease of Use A comparative analysis of ctypes, CFFI, PyBind11, Cython, and PyO3/Rust, exploring the technical nature and engineering choices of Python as a glue language for large models
- Original Analysis: Why FastAPI Rises in the AI Era—The Engineering Value of Type Hints and Async I/O Analyzing Python type hints, async I/O, and FastAPI's rise logic; establishing a feature-capability matching framework for LLM API service development
- Original Analysis: Why Python Monopolizes LLM Development—Ecosystem Flywheel and Data Evidence Synthesizing multi-source data from Stack Overflow 2025, PEP 703 industry testimonies, and LangChain ecosystem to analyze the causes and flywheel effects of Python's dominance in AI
- Original Analysis: Capability Building for Python Developers in the AI Tools Era—A Practical Guide for Frontline Engineers Based on Stack Overflow 2025 data, establishing a capability building roadmap from beginner to expert, providing stage assessment, priority ranking, and minimum executable solutions
Reading path
Continue along this topic path
Follow the recommended order for Python instead of jumping through random articles in the same topic.
Next step
Go deeper into this topic
If this article is useful, continue from the topic page or subscribe to follow later updates.
Loading comments...
Comments and discussion
Sign in with GitHub to join the discussion. Comments are synced to GitHub Discussions