gc module internal structure and API #6910

youknowone · 2026-01-30T23:07:33Z

Only interface, without actual GC

split from #6849

Summary by CodeRabbit

New Features
- Full garbage collection API: enable/disable, manual collection, and generational behavior
- GC monitoring: per-generation stats, collection counts, debug flags, and inspectable tracked objects
- GC controls: configurable thresholds, freeze/unfreeze permanent generation, and GC callbacks/garbage lists
Behavioral
- Objects can report GC tracking and expose their referents for inspection

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2026-01-30T23:07:43Z

📝 Walkthrough

Walkthrough

Implements a generational GC subsystem and exposes a Python-compatible gc module API, adds VM context fields for gc state, extends PyObject inspection for GC, and updates Cargo features and spellcheck tokens.

Changes

Cohort / File(s)	Summary
Spellcheck Dictionary `.cspell.dict/cpython.txt`	Added tokens `saveall`, `uncollectable`.
GC Module (stdlib) `crates/stdlib/src/gc.rs`	New public gc module API: debug flags, enable/disable, collect, thresholds/counts/stats, object introspection, freeze/unfreeze, callbacks invocation, and argument structs.
VM Feature & Exposure `crates/vm/Cargo.toml`, `crates/vm/src/lib.rs`	Added `gc` feature and made it default; exported `pub mod gc_state`.
GC Core State `crates/vm/src/gc_state.rs`	Introduced GcState, GcGeneration, GcStats, debug flags, tracking registries, freeze/unfreeze, threshold/count/stats accessors, collection stubs, and global singleton accessor.
Object Model Changes `crates/vm/src/object/core.rs`	Made `gc_finalized` public; added `is_gc_tracked()` and `gc_get_referents()` on PyObject for tracking and referent enumeration.
VM Context `crates/vm/src/vm/context.rs`	Added `gc_callbacks: PyListRef` and `gc_garbage: PyListRef` to Context and initialized them at genesis.

Sequence Diagram

sequenceDiagram
    actor Python
    participant GC as "gc module"
    participant GCState as "GcState (global)"
    participant Callbacks as "callbacks list"
    participant VM as "VM Context"

    Python->>GC: collect(generation=None)
    GC->>GCState: collect(generation)
    activate GCState
    GCState->>Callbacks: invoke_callbacks("start", generation, 0, 0)
    Callbacks->>VM: execute registered callbacks
    VM-->>Callbacks: callback results
    GCState->>GCState: perform collection (counts)
    GCState->>Callbacks: invoke_callbacks("stop", generation, collected, uncollectable)
    Callbacks->>VM: execute registered callbacks
    VM-->>Callbacks: callback results
    deactivate GCState
    GCState-->>GC: return (collected, uncollectable)
    GC-->>Python: return counts

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

impl gc finialized #6689: Related changes to PyObject finalization and gc_bits handling which intersects with making gc_finalized public and GC tracking logic.

Suggested reviewers

ShaharNaveh

Poem

🐰 I nibble code where cycles hide,
I count three generations wide,
I hop through lists of callbacks bright,
I freeze, unfreeze in moonlit night,
Hooray — the GC burrows right!

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'gc module internal structure and API' accurately describes the main purpose of this pull request, which adds the internal GC state structure and comprehensive API surface for the gc module.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-01-30T23:08:02Z

Code has been automatically formatted

The code in this PR has been formatted using:

cargo fmt --all
Please pull the latest changes before pushing again:

git pull origin gc-internal

youknowone · 2026-01-30T23:09:41Z

cc @discord9

Add gc_state module with GcState, GcGeneration, GcDebugFlags, GcStats. Replace gc module stubs with working API backed by gc_state. Add gc_callbacks and gc_garbage to Context. Add is_gc_tracked, gc_finalized, gc_get_referents to PyObject. Collection is stubbed (returns 0) — actual algorithm to follow.

coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI agents

In `@crates/stdlib/src/gc.rs`:
- Around line 189-196: get_referrers currently unconditionally returns an empty
list but the review warns against raising NotImplementedError; to match CPython
behavior, either implement basic GC-tracking or explicitly preserve the
empty-list semantics and document the limitation. Update the
get_referrers(FuncArgs, VirtualMachine) implementation to return
vm.ctx.new_list(vec![]) for non-discoverable referrers, add/adjust the doc
comment inside the function to state that only GC-tracked references are
discoverable and stack/C-extension references are not, and do not raise
NotImplementedError; alternatively, if you choose to implement minimal tracking,
modify the scanning logic in get_referrers to consult your GC-tracking
structures and collect referrers before falling back to vm.ctx.new_list(vec![]).

In `@crates/vm/src/gc_state.rs`:
- Around line 231-287: untrack_object currently never removes a deallocated
frozen object from permanent_objects, leaving stale pointers that break unfreeze
and get_freeze_count; update unsafe fn untrack_object to also acquire a write
lock on self.permanent_objects and remove the GcObjectPtr (gc_ptr) just like
tracked_objects and finalized_objects are handled, ensuring the permanent set
does not retain freed pointers (refer to GcObjectPtr, untrack_object,
permanent_objects, unfreeze, and get_freeze_count).

In `@crates/vm/src/object/core.rs`:
- Around line 902-919: is_gc_tracked currently returns capability
(vtable.trace.is_some() || dict.is_some()) instead of actual membership; add a
public query on GcState (e.g., GcState::is_tracked(&self, obj_id: ObjectId) or
similar) that checks the tracked_objects set used by
track_object/untrack_object, then change PyObject::is_gc_tracked to call that
GcState query rather than inspecting vtable.trace or dict; ensure you use the
object's unique identifier or reference the same key used in tracked_objects so
is_gc_tracked reflects real GC membership while gc_get_referents can remain
unchanged.

🧹 Nitpick comments (1)

crates/vm/src/gc_state.rs (1)

289-305: Finalize tracking has two sources of truth.

GcState::{is_finalized, mark_finalized} uses a HashSet, but PyObject::gc_finalized relies on per-object bits and stdlib gc.is_finalized uses that path. Consider consolidating on one mechanism to avoid divergence (e.g., expose set_gc_finalized as pub(crate) and set bits here, or drop the set).

coderabbitai · 2026-01-31T13:27:25Z

crates/stdlib/src/gc.rs

+    /// Return the list of objects that directly refer to any of the arguments.
    #[pyfunction]
-    fn get_referrers(_args: FuncArgs, vm: &VirtualMachine) -> PyResult {
-        Err(vm.new_not_implemented_error(""))
+    fn get_referrers(args: FuncArgs, vm: &VirtualMachine) -> PyListRef {
+        // This is expensive: we need to scan all tracked objects
+        // For now, return an empty list (would need full object tracking to implement)
+        let _ = args;
+        vm.ctx.new_list(vec![])
    }


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

CPython gc.get_referrers documentation behavior error handling

💡 Result:

gc.get_referrers(*objs) (CPython) is explicitly a debugging API with several important behavioral limitations:

What it returns: a list of objects that directly refer to any of the given objs, but it only finds referrers among GC-tracked containers (objects participating in cyclic-GC via tp_traverse). If an extension type holds references but does not support GC, it will not be reported. [1]

“Stale” referrers can appear: objects that have been dereferenced but are still part of uncollected cycles may show up; if you want “currently live” results, run gc.collect() first. [1]

Returned objects may be temporarily invalid: the docs warn that some returned referrers may still be under construction and in a temporarily invalid state—avoid doing anything other than inspection/debugging. [1]

It can miss real referrers: not all references in the runtime are represented as GC-tracked container links. For example, references held only on the VM evaluation stack (and some other internal/C-held references) may not be reported, so gc.get_referrers(x) can be empty even when x is obviously still reachable. [2]

Error handling / exceptions (what the docs imply)

The function fires an auditing event gc.get_referrers with argument objs; if you have auditing hooks / security policy enabled, that hook may raise and thus make get_referrers() fail. [1]

Otherwise, the standard docs do not specify special “semantic” errors (it’s a normal builtin; you can still get ordinary failures like MemoryError if allocation of the result list fails, or TypeError for incorrect calling convention). The main “gotcha” called out is invalid/unsafe returned objects, not a recoverable exception path. [1]

Sources: Python gc module docs (get_referrers) [1], explanation of missed referrers (VM stack / C-held refs) [2].

[1] (docs.python.org)
[2] (stackoverflow.com)

Citations:

1: https://docs.python.org/3.12/library/gc.html?utm_source=openai

2: https://stackoverflow.com/questions/55215649/does-any-object-tracked-in-python-gc-have-no-referrer?utm_source=openai

🏁 Script executed:

# First, let's check the actual file to confirm the current state cat -n crates/stdlib/src/gc.rs | head -220 | tail -40

Repository: RustPython/RustPython

Length of output: 1633

get_referrers returns an empty list, but raising NotImplementedError contradicts CPython behavior.

CPython's gc.get_referrers() does not raise NotImplementedError—it legitimately returns an empty list when no referrers are found (e.g., references on the VM stack or in non-GC-tracked C objects are not discoverable). The function is documented as a debugging API with inherent limitations, not as a "not implemented" operation.

The underlying concern is valid: the current implementation makes no attempt to find referrers and unconditionally returns an empty list. However, the suggested fix conflicts with CPython compatibility. Consider either:

Implementing at least basic tracking of GC-managed object references, or

Accepting the empty list as a known limitation while improving documentation

🤖 Prompt for AI Agents

In `@crates/stdlib/src/gc.rs` around lines 189 - 196, get_referrers currently unconditionally returns an empty list but the review warns against raising NotImplementedError; to match CPython behavior, either implement basic GC-tracking or explicitly preserve the empty-list semantics and document the limitation. Update the get_referrers(FuncArgs, VirtualMachine) implementation to return vm.ctx.new_list(vec![]) for non-discoverable referrers, add/adjust the doc comment inside the function to state that only GC-tracked references are discoverable and stack/C-extension references are not, and do not raise NotImplementedError; alternatively, if you choose to implement minimal tracking, modify the scanning logic in get_referrers to consult your GC-tracking structures and collect referrers before falling back to vm.ctx.new_list(vec![]).

crates/vm/src/gc_state.rs

coderabbitai · 2026-01-31T13:27:25Z

crates/vm/src/object/core.rs

+    /// Check if this object is tracked by the garbage collector.
+    /// Returns true if the object has a trace function or has an instance dict.
+    pub fn is_gc_tracked(&self) -> bool {
+        if self.0.vtable.trace.is_some() {
+            return true;
+        }
+        self.0.dict.is_some()
+    }
+
+    /// Get the referents (objects directly referenced) of this object.
+    /// Uses the full traverse including dict and slots.
+    pub fn gc_get_referents(&self) -> Vec<PyObjectRef> {
+        let mut result = Vec::new();
+        self.0.traverse(&mut |child: &PyObject| {
+            result.push(child.to_owned());
+        });
+        result
+    }


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Search for GC tracking-related code echo "=== Searching for TRACKED bit and tracking functions ===" rg -n "GcBits::TRACKED|track_object|untrack_object|GcState" -g '*.rs' --max-count 30

Repository: RustPython/RustPython

Length of output: 1074

🏁 Script executed:

#!/bin/bash # Examine the is_gc_tracked implementation and surrounding context echo "=== Looking at core.rs to understand current implementation ===" rg -B 5 -A 5 "fn is_gc_tracked" crates/vm/src/object/core.rs

Repository: RustPython/RustPython

Length of output: 485

🏁 Script executed:

#!/bin/bash # Search for how objects are added/removed from GC tracking echo "=== Looking for gc collection and tracking mechanisms ===" fd -e rs | xargs rg -l "gc_collect|gc_track|GcBits" | head -20

Repository: RustPython/RustPython

Length of output: 175

🌐 Web query:

CPython gc.is_tracked implementation and behavior

💡 Result:

What gc.is_tracked(obj) reports (behavior)

gc.is_tracked(obj) returns whether this object instance is currently in CPython’s cyclic-GC tracking lists (not whether it is “GC-capable” in principle). Its result can change after later garbage collections due to container “untracking” optimizations. [1][2]

Objects that cannot participate in reference cycles (most “atomic” objects like int, str, etc.) are not tracked. Containers and user-defined objects typically are tracked, but some built-in containers may be untracked when they are proven to contain only non-tracked contents. [1][2]

This behavior is version- and type-optimization-dependent; e.g., dictionaries/tuples may toggle between tracked and untracked depending on their contents and what GC cycle has run. [1][2]

How CPython implements it (core idea)

At the C level, Python exposes PyObject_GC_IsTracked(op) (Stable ABI since 3.9), documented as “analogous to gc.is_tracked()”: it returns 1 only if the object both supports the GC protocol and is currently tracked. [3]

Internally, CPython’s fast check for “tracked” is essentially:

tracked ⇔ the object’s GC header indicates it’s linked into a GC list

concretely: _PyObject_GC_IS_TRACKED(op) returns true when the per-object GC header field _gc_next is non-zero. [4]

and when an object is untracked, CPython unlinks it from the GC list and sets gc->_gc_next = 0. [5]

Practical implications / “gotchas”

gc.is_tracked(x) == False does not necessarily mean “cannot ever be tracked”: many container objects can be tracked/untracked over time depending on contents and GC passes. [2][4]

If you’re writing/inspecting C extensions: “GC-capable” is PyObject_IS_GC(obj), while “currently tracked” is PyObject_GC_IsTracked(obj) / _PyObject_GC_IS_TRACKED(obj). [3][4]

Sources: Python gc docs and GC design notes [1][2], CPython C-API GC support docs [3], internal header/coverage view showing _PyObject_GC_IS_TRACKED [4], and internal untrack implementation showing _gc_next = 0 [5].

[1] https://docs.python.org/3.12/library/gc.html
[2] https://daobook.github.io/devguide/garbage_collector.html
[3] https://docs.python.org/3/c-api/gcsupport.html
[4] https://cheimes.fedorapeople.org/cpython/lcov-report/Include/internal/pycore_gc.h.gcov.html
[5] https://android.googlesource.com/platform/prebuilts/python/linux-x86/+/5e62e45e850184b4a88ada6467f73f39f8b44609/include/python3.9/internal/pycore_object.h

🏁 Script executed:

#!/bin/bash # Look at track_object and untrack_object implementations sed -n '236,280p' crates/vm/src/gc_state.rs

Repository: RustPython/RustPython

Length of output: 1872

🏁 Script executed:

#!/bin/bash # Look for what data structures hold tracked objects sed -n '99,160p' crates/vm/src/gc_state.rs

Repository: RustPython/RustPython

Length of output: 2550

🏁 Script executed:

#!/bin/bash # Search for how is_gc_tracked is currently used in the codebase rg "is_gc_tracked" -g '*.rs' -B 2 -A 2

Repository: RustPython/RustPython

Length of output: 751

🏁 Script executed:

#!/bin/bash # Check the gc.rs stdlib module to see if it implements gc.is_tracked() head -100 crates/stdlib/src/gc.rs

Repository: RustPython/RustPython

Length of output: 3103

🏁 Script executed:

#!/bin/bash # Look for how GcState is accessed from PyObject methods rg "gc_state()" crates/vm/src/object/core.rs -B 3 -A 3

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

#!/bin/bash # Check if there are any methods to query if an object is in tracked_objects rg "tracked_objects" crates/vm/src/gc_state.rs -B 2 -A 2

Repository: RustPython/RustPython

Length of output: 1090

🏁 Script executed:

#!/bin/bash # Look at the full GcState struct definition and public methods sed -n '99,250p' crates/vm/src/gc_state.rs

Repository: RustPython/RustPython

Length of output: 5303

🏁 Script executed:

#!/bin/bash # Check if there's a way to query tracking state rg "pub.*fn.*is_tracked|is_in_gc" crates/vm/src/gc_state.rs

Repository: RustPython/RustPython

Length of output: 47

is_gc_tracked() incorrectly reports GC-capability rather than actual tracking state.

The method returns true based on trace.is_some() or dict.is_some(), which indicates the object can be tracked, not whether it currently is tracked. GcState maintains actual tracking membership in tracked_objects (populated/cleared by track_object/untrack_object), but this state is private and inaccessible from is_gc_tracked(). This diverges from CPython's gc.is_tracked(), which reports actual GC list membership.

To fix: expose a public query method on GcState to check tracking membership, and have is_gc_tracked() consult it instead of relying on static vtable/dict presence.

🤖 Prompt for AI Agents

In `@crates/vm/src/object/core.rs` around lines 902 - 919, is_gc_tracked currently returns capability (vtable.trace.is_some() || dict.is_some()) instead of actual membership; add a public query on GcState (e.g., GcState::is_tracked(&self, obj_id: ObjectId) or similar) that checks the tracked_objects set used by track_object/untrack_object, then change PyObject::is_gc_tracked to call that GcState query rather than inspecting vtable.trace or dict; ensure you use the object's unique identifier or reference the same key used in tracked_objects so is_gc_tracked reflects real GC membership while gc_get_referents can remain unchanged.

ShaharNaveh

lgtm!

github-actions · 2026-01-31T14:04:25Z

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[ ] lib: cpython/Lib/asyncio
[ ] test: cpython/Lib/test/test_asyncio (TODO: 14)

dependencies:

dependent tests: (8 tests)

asyncio: test_asyncio test_builtin test_contextlib_async test_inspect test_logging test_os test_sys_settrace test_unittest

[ ] test: cpython/Lib/test/test_dict.py (TODO: 3)
[ ] test: cpython/Lib/test/test_dictcomps.py (TODO: 1)
[ ] test: cpython/Lib/test/test_dictviews.py (TODO: 2)
[ ] test: cpython/Lib/test/test_userdict.py

dependencies:

dependent tests: (no tests depend on dict)

[ ] test: cpython/Lib/test/test_generators.py (TODO: 5)
[ ] test: cpython/Lib/test/test_genexps.py
[ ] test: cpython/Lib/test/test_generator_stop.py
[ ] test: cpython/Lib/test/test_yield_from.py (TODO: 10)

dependencies:

dependent tests: (no tests depend on generator)

[ ] lib: cpython/Lib/subprocess.py
[ ] test: cpython/Lib/test/test_subprocess.py (TODO: 9)

dependencies:

subprocess

dependent tests: (51 tests)

subprocess: test_android test_asyncio test_audit test_bz2 test_c_locale_coercion test_cmd_line test_cmd_line_script test_ctypes test_dtrace test_faulthandler test_file_eintr test_gzip test_inspect test_json test_msvcrt test_ntpath test_os test_platform test_plistlib test_poll test_py_compile test_regrtest test_repl test_runpy test_script_helper test_shutil test_signal test_site test_sqlite3 test_subprocess test_support test_sys test_sysconfig test_tempfile test_threading test_unittest test_urllib2 test_utf8_mode test_venv test_wait3 test_webbrowser test_zipfile
- asyncio: test_asyncio test_builtin test_contextlib_async test_logging test_sys_settrace test_unittest
- ctypes.util: test_ctypes
- ensurepip: test_ensurepip
- multiprocessing.util: test_concurrent_futures

[x] lib: cpython/Lib/weakref.py
[x] lib: cpython/Lib/_weakrefset.py
[x] test: cpython/Lib/test/test_weakref.py (TODO: 23)

dependencies:

weakref

dependent tests: (146 tests)

[ ] test: cpython/Lib/test/test_weakset.py (TODO: 1)

dependencies:

dependent tests: (no tests depend on weakset)

[ ] lib: cpython/Lib/zoneinfo
[ ] test: cpython/Lib/test/test_zoneinfo (TODO: 3)

dependencies:

zoneinfo
- datetime (native: _thread, math, sys, time)
- sysconfig (native: os.path, sys)
- collections, os
- bisect, calendar, functools, re, struct, weakref

dependent tests: (3 tests)

zoneinfo: test_email test_plistlib test_zoneinfo

Legend:

[+] path exists in CPython
[x] up-to-date, [ ] outdated

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@crates/vm/src/gc_state.rs`:
- Around line 236-252: In track_object, a poisoned write lock on
generation_objects[0] can cause the object to be inserted into tracked_objects
but not generation_objects (and counts not incremented), creating inconsistency
with get_objects or generation counts; fix by making the operation atomic:
either return a Result from track_object (propagate the lock error from
generation_objects[0].write()) or acquire/check both locks first and only insert
into tracked_objects if generation_objects[0].write() succeeded (i.e., perform
the generation_objects[0].write() and insert into gen0 and update
generations[0].count/alloc_count before touching tracked_objects), referencing
track_object, generation_objects, tracked_objects, generations, alloc_count and
get_objects so the change is applied where those symbols are used.
- Around line 391-429: The freeze() and unfreeze() functions drain sets into
local vectors then attempt to write the destination lock, which can fail and
silently drop objects; change them to handle lock errors defensively: after
draining a generation (generation_objects.write() in freeze) or draining
permanent (permanent_objects.write() in unfreeze), check the result of acquiring
the destination lock (permanent_objects.write() in freeze,
generation_objects[2].write() in unfreeze) and if it Err, re-acquire the
original source lock (or re-insert the drained pointers back into the original
generation/permanent) or else log a fatal error/panic so objects are not lost;
reference freeze, unfreeze, generation_objects, permanent_objects,
permanent.count, generations[*].count and ensure counts are only updated after
successful insertion into the destination set.

coderabbitai · 2026-01-31T14:08:49Z

crates/vm/src/gc_state.rs

+    pub unsafe fn track_object(&self, obj: NonNull<PyObject>) {
+        let gc_ptr = GcObjectPtr(obj);
+
+        // Add to generation 0 tracking first (for correct gc_refs algorithm)
+        // Only increment count if we successfully add to the set
+        if let Ok(mut gen0) = self.generation_objects[0].write()
+            && gen0.insert(gc_ptr)
+        {
+            self.generations[0].count.fetch_add(1, Ordering::SeqCst);
+            self.alloc_count.fetch_add(1, Ordering::SeqCst);
+        }
+
+        // Also add to global tracking (for get_objects, etc.)
+        if let Ok(mut tracked) = self.tracked_objects.write() {
+            tracked.insert(gc_ptr);
+        }
+    }


⚠️ Potential issue | 🟡 Minor

Potential inconsistency if generation lock is poisoned.

If generation_objects[0].write() returns Err (poisoned lock), the object is skipped from generation tracking but still added to tracked_objects. This creates an inconsistent state where get_objects(None) would return the object but get_objects(Some(0)) would not, and the generation count would be wrong.

Consider either:

Propagating the error to the caller, or

Ensuring both operations succeed or neither does (transactional semantics).

Since this is interface-only and actual GC implementation will follow, this may be acceptable for now if documented.

🤖 Prompt for AI Agents

In `@crates/vm/src/gc_state.rs` around lines 236 - 252, In track_object, a poisoned write lock on generation_objects[0] can cause the object to be inserted into tracked_objects but not generation_objects (and counts not incremented), creating inconsistency with get_objects or generation counts; fix by making the operation atomic: either return a Result from track_object (propagate the lock error from generation_objects[0].write()) or acquire/check both locks first and only insert into tracked_objects if generation_objects[0].write() succeeded (i.e., perform the generation_objects[0].write() and insert into gen0 and update generations[0].count/alloc_count before touching tracked_objects), referencing track_object, generation_objects, tracked_objects, generations, alloc_count and get_objects so the change is applied where those symbols are used.

coderabbitai · 2026-01-31T14:08:49Z

crates/vm/src/gc_state.rs

+    pub fn freeze(&self) {
+        // Move all objects from gen0-2 to permanent
+        let mut objects_to_freeze: Vec<GcObjectPtr> = Vec::new();
+
+        for (gen_idx, generation) in self.generation_objects.iter().enumerate() {
+            if let Ok(mut gen_set) = generation.write() {
+                objects_to_freeze.extend(gen_set.drain());
+                self.generations[gen_idx].count.store(0, Ordering::SeqCst);
+            }
+        }
+
+        // Add to permanent set
+        if let Ok(mut permanent) = self.permanent_objects.write() {
+            let count = objects_to_freeze.len();
+            for ptr in objects_to_freeze {
+                permanent.insert(ptr);
+            }
+            self.permanent.count.fetch_add(count, Ordering::SeqCst);
+        }
+    }
+
+    /// Unfreeze all objects (move from permanent to gen2)
+    pub fn unfreeze(&self) {
+        let mut objects_to_unfreeze: Vec<GcObjectPtr> = Vec::new();
+
+        if let Ok(mut permanent) = self.permanent_objects.write() {
+            objects_to_unfreeze.extend(permanent.drain());
+            self.permanent.count.store(0, Ordering::SeqCst);
+        }
+
+        // Add to generation 2
+        if let Ok(mut gen2) = self.generation_objects[2].write() {
+            let count = objects_to_unfreeze.len();
+            for ptr in objects_to_unfreeze {
+                gen2.insert(ptr);
+            }
+            self.generations[2].count.fetch_add(count, Ordering::SeqCst);
+        }
+    }


⚠️ Potential issue | 🟡 Minor

Objects can be lost if lock acquisition fails after draining.

In freeze(), if generation_objects are successfully drained but permanent_objects.write() returns Err (poisoned), the objects in objects_to_freeze are dropped without being added to any tracking set. Similarly in unfreeze().

Consider logging a warning or panic on lock failure here, since losing GC-tracked objects is a serious error condition.

Potential defensive handling

// Add to permanent set - if let Ok(mut permanent) = self.permanent_objects.write() { + let mut permanent = self.permanent_objects.write() + .expect("permanent_objects lock poisoned during freeze"); let count = objects_to_freeze.len(); for ptr in objects_to_freeze { permanent.insert(ptr); } self.permanent.count.fetch_add(count, Ordering::SeqCst); - }

🤖 Prompt for AI Agents

In `@crates/vm/src/gc_state.rs` around lines 391 - 429, The freeze() and unfreeze() functions drain sets into local vectors then attempt to write the destination lock, which can fail and silently drop objects; change them to handle lock errors defensively: after draining a generation (generation_objects.write() in freeze) or draining permanent (permanent_objects.write() in unfreeze), check the result of acquiring the destination lock (permanent_objects.write() in freeze, generation_objects[2].write() in unfreeze) and if it Err, re-acquire the original source lock (or re-insert the drained pointers back into the original generation/permanent) or else log a fatal error/panic so objects are not lost; reference freeze, unfreeze, generation_objects, permanent_objects, permanent.count, generations[*].count and ensure counts are only updated after successful insertion into the destination set.

youknowone force-pushed the gc-internal branch 3 times, most recently from 8052f5f to 5a8b225 Compare January 31, 2026 08:58

youknowone added 3 commits January 31, 2026 18:52

fix dict/weakref/generators

2d5be2f

unmark test_asyncio

4ff9fb8

youknowone force-pushed the gc-internal branch from d63fba5 to 4ff9fb8 Compare January 31, 2026 09:55

youknowone requested a review from ShaharNaveh January 31, 2026 10:20

youknowone marked this pull request as ready for review January 31, 2026 13:12

coderabbitai bot reviewed Jan 31, 2026

View reviewed changes

ShaharNaveh approved these changes Jan 31, 2026

View reviewed changes

apply review

06c8270

youknowone force-pushed the gc-internal branch from cec949d to 06c8270 Compare January 31, 2026 14:03

coderabbitai bot reviewed Jan 31, 2026

View reviewed changes

gc module internal structure and API #6910

Are you sure you want to change the base?

gc module internal structure and API #6910

Conversation

youknowone commented Jan 30, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

github-actions bot commented Jan 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

youknowone commented Jan 30, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 31, 2026

Choose a reason for hiding this comment

Error handling / exceptions (what the docs imply)

Uh oh!

Uh oh!

coderabbitai bot Jan 31, 2026

Choose a reason for hiding this comment

What gc.is_tracked(obj) reports (behavior)

How CPython implements it (core idea)

Practical implications / “gotchas”

Uh oh!

ShaharNaveh left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Jan 31, 2026

📦 Library Dependencies

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 31, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 31, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

youknowone commented Jan 30, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 30, 2026 •

edited

Loading

github-actions bot commented Jan 30, 2026 •

edited

Loading

What `gc.is_tracked(obj)` reports (behavior)