Add sys._stdlib_dir #6798

youknowone · 2026-01-19T12:08:28Z

Summary by CodeRabbit

Release Notes

Bug Fixes
- Improved standard library path resolution, particularly on Windows with enhanced lib directory detection for both text-file and symlink-based paths
Refactor
- Centralized standard library directory configuration through interpreter state management

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2026-01-19T12:08:35Z

📝 Walkthrough

Walkthrough

Introduces a new stdlib_dir field to the path configuration system, computed via platform-specific logic in getpath.rs, exposed through the sys module as _stdlib_dir, and wired during interpreter initialization to store the standard library location derived from either frozen or dynamic paths.

Changes

Cohort / File(s)	Summary
Path configuration structure `crates/vm/src/vm/setting.rs`	Added optional `stdlib_dir` field to `Paths` struct to store the standard library directory path
stdlib directory computation `crates/vm/src/getpath.rs`	Introduced private `calculate_stdlib_dir()` helper with platform-specific logic; added Step 9 to `init_path_config()` to assign computed `stdlib_dir` with existence validation
Python API exposure `crates/vm/src/stdlib/sys.rs`	Added `_stdlib_dir` attribute to `sys` module that returns `vm.state.config.paths.stdlib_dir` as a `PyObjectRef`
Interpreter initialization `src/interpreter.rs`	Replaced `vm.sys_module` hack with `PyRc`-based state access; centralized `stdlib_dir` assignment for both frozen and dynamic stdlib path flows
Windows build configuration `crates/pylib/build.rs`	Enhanced Lib path resolution to handle both symlinks and text files (git with/without symlink support); added canonicalization and `\?\` prefix stripping

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Move pylib -> crates/pylib #6225: Modifies build.rs Lib path resolution logic for platform/path handling
nt junction #6407: Strips Windows extended-length prefix (\?\) from canonicalized Lib path in build.rs
Introduce PyConfig to implement Paths correctly #6461: Extends PyConfig/Paths plumbing by adding and wiring the new stdlib_dir field through the path infrastructure

Suggested reviewers

coolreader18
arihant2math

Poem

🐰 A rabbit hops through stdlib ways,
Building paths through winding maze,
From frozen lib to dynamic ground,
stdlib_dir must now be found! 📚✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title 'Add sys._stdlib_dir' directly reflects the main change: adding a new Python-accessible attribute _stdlib_dir to the sys module, which is the primary public-facing feature of this changeset.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-01-19T12:08:51Z

Code has been automatically formatted

The code in this PR has been formatted using:

cargo fmt --all
Please pull the latest changes before pushing again:

git pull origin stdlib_dir

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@crates/pylib/build.rs`:
- Around line 14-31: The build script currently calls
canonicalize(...).to_str().unwrap(), which can panic on non‑UTF‑8 Windows paths;
change the emission to use a lossy conversion from the canonicalized Path (e.g.,
Path::to_string_lossy on the value returned by std::fs::canonicalize(&lib_path))
and print that into the cargo:rustc-env=win_lib_path value, and if
std::fs::canonicalize(&lib_path) returns an Err emit a visible build warning via
println!("cargo:warning=...") instead of silently skipping so missing
win_lib_path is obvious; update the code paths referencing canonicalized_path
and to_str().unwrap() accordingly.

coderabbitai · 2026-01-19T14:48:54Z

crates/pylib/build.rs

+    if cfg!(windows) {
+        // On Windows, the Lib entry can be either:
+        // 1. A text file containing the relative path (git without symlink support)
+        // 2. A proper symlink (git with symlink support)
+        // We handle both cases to resolve to the actual Lib directory.
+        let lib_path = if let Ok(real_path) = std::fs::read_to_string("Lib") {
+            // Case 1: Text file containing relative path
+            std::path::PathBuf::from(real_path.trim())
+        } else {
+            // Case 2: Symlink or directory - canonicalize directly
+            std::path::PathBuf::from("Lib")
+        };
+
+        if let Ok(canonicalized_path) = std::fs::canonicalize(&lib_path) {
+            // Strip the extended path prefix (\\?\) that canonicalize adds on Windows
+            let path_str = canonicalized_path.to_str().unwrap();
+            let path_str = path_str.strip_prefix(r"\\?\").unwrap_or(path_str);
+            println!("cargo:rustc-env=win_lib_path={path_str}");


⚠️ Potential issue | 🟠 Major

Avoid panic on non‑UTF‑8 Windows paths.
to_str().unwrap() can panic on valid Windows paths containing non‑UTF‑8 sequences, breaking the build. Prefer a lossy conversion and consider warning when canonicalization fails so missing win_lib_path isn’t silent.

🔧 Proposed safer emission

- if let Ok(canonicalized_path) = std::fs::canonicalize(&lib_path) { - // Strip the extended path prefix (\\?\) that canonicalize adds on Windows - let path_str = canonicalized_path.to_str().unwrap(); - let path_str = path_str.strip_prefix(r"\\?\").unwrap_or(path_str); - println!("cargo:rustc-env=win_lib_path={path_str}"); - } + if let Ok(canonicalized_path) = std::fs::canonicalize(&lib_path) { + // Strip the extended path prefix (\\?\) that canonicalize adds on Windows + let path_str = canonicalized_path.to_string_lossy(); + let path_str = path_str.strip_prefix(r"\\?\").unwrap_or(&path_str); + println!("cargo:rustc-env=win_lib_path={path_str}"); + } else { + println!("cargo:warning=failed to canonicalize Lib path: {:?}", lib_path); + }

🤖 Prompt for AI Agents

In `@crates/pylib/build.rs` around lines 14 - 31, The build script currently calls canonicalize(...).to_str().unwrap(), which can panic on non‑UTF‑8 Windows paths; change the emission to use a lossy conversion from the canonicalized Path (e.g., Path::to_string_lossy on the value returned by std::fs::canonicalize(&lib_path)) and print that into the cargo:rustc-env=win_lib_path value, and if std::fs::canonicalize(&lib_path) returns an Err emit a visible build warning via println!("cargo:warning=...") instead of silently skipping so missing win_lib_path is obvious; update the code paths referencing canonicalized_path and to_str().unwrap() accordingly.

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@crates/codegen/src/compile.rs`:
- Around line 6983-6986: The DictMerge path currently uses the first mapping as
the accumulator which can mutate a caller-provided **kwargs dict; change
compile.rs so that when merging multiple mappings (the loop around emit!(self,
Instruction::DictMerge { index: 1 })), you first allocate a fresh empty
accumulator and merge each mapping into that fresh map instead of using the
first input as the target; specifically, before the for-loop that emits
Instruction::DictMerge, emit an instruction to create a new empty dict (so the
accumulator is new) and then merge all incoming `**` groups into that fresh
accumulator (update the code that emits Instruction::DictMerge and the
surrounding logic in the same block to use the new accumulator).

🧹 Nitpick comments (1)

crates/vm/src/frame.rs (1)
962-962: Inconsistent downcast pattern with DictUpdate.

DictUpdate (line 934) uses a safe downcast with .expect():
let dict = dict.downcast_ref::<PyDict>().expect("exact dict expected");
But DictMerge uses an unsafe downcast without a SAFETY comment explaining the invariant. Consider using the same safe pattern as DictUpdate for consistency and defensive coding, or at minimum add a SAFETY comment like the other unsafe downcasts in this file (e.g., ListAppend, ListExtend).
♻️ Suggested fix for consistency
-                let dict: &Py<PyDict> = unsafe { dict_ref.downcast_unchecked_ref() };
+                let dict: &Py<PyDict> = unsafe {
+                    // SAFETY: compiler guarantees correct type
+                    dict_ref.downcast_unchecked_ref()
+                };
Or use the safe pattern:
-                let dict: &Py<PyDict> = unsafe { dict_ref.downcast_unchecked_ref() };
+                let dict = dict_ref.downcast_ref::<PyDict>().expect("exact dict expected");

coderabbitai · 2026-01-19T16:56:10Z

crates/codegen/src/compile.rs

+            // Merge all dicts: first dict is accumulator, merge rest into it
+            for _ in 1..size {
+                emit!(self, Instruction::DictMerge { index: 1 });
+            }


⚠️ Potential issue | 🟠 Major

Avoid mutating user-provided **kwargs when merging.
DictMerge uses the first mapping as the accumulator. If the first group is **kwargs, this mutates the caller’s dict when multiple ** groups exist (observable vs CPython, which builds a fresh dict). Please ensure the accumulator is always a new map before merging any ** inputs.

🔧 Possible fix (merge into a fresh accumulator)

- let mut size = 0; - let groupby = keywords.iter().chunk_by(|e| e.arg.is_none()); - for (is_unpacking, sub_keywords) in &groupby { - if is_unpacking { - for keyword in sub_keywords { - self.compile_expression(&keyword.value)?; - size += 1; - } - } else { - let mut sub_size = 0; - for keyword in sub_keywords { - if let Some(name) = &keyword.arg { - self.emit_load_const(ConstantData::Str { - value: name.as_str().into(), - }); - self.compile_expression(&keyword.value)?; - sub_size += 1; - } - } - emit!(self, Instruction::BuildMap { size: sub_size }); - size += 1; - } - } - if size > 1 { - // Merge all dicts: first dict is accumulator, merge rest into it - for _ in 1..size { - emit!(self, Instruction::DictMerge { index: 1 }); - } - } + let mut have_accumulator = false; + let groupby = keywords.iter().chunk_by(|e| e.arg.is_none()); + for (is_unpacking, sub_keywords) in &groupby { + if !have_accumulator { + emit!(self, Instruction::BuildMap { size: 0 }); // fresh accumulator + have_accumulator = true; + } + if is_unpacking { + for keyword in sub_keywords { + self.compile_expression(&keyword.value)?; + emit!(self, Instruction::DictMerge { index: 1 }); + } + } else { + let mut sub_size = 0; + for keyword in sub_keywords { + if let Some(name) = &keyword.arg { + self.emit_load_const(ConstantData::Str { + value: name.as_str().into(), + }); + self.compile_expression(&keyword.value)?; + sub_size += 1; + } + } + emit!(self, Instruction::BuildMap { size: sub_size }); + emit!(self, Instruction::DictMerge { index: 1 }); + } + }

🤖 Prompt for AI Agents

In `@crates/codegen/src/compile.rs` around lines 6983 - 6986, The DictMerge path currently uses the first mapping as the accumulator which can mutate a caller-provided **kwargs dict; change compile.rs so that when merging multiple mappings (the loop around emit!(self, Instruction::DictMerge { index: 1 })), you first allocate a fresh empty accumulator and merge each mapping into that fresh map instead of using the first input as the target; specifically, before the for-loop that emits Instruction::DictMerge, emit an instruction to create a new empty dict (so the accumulator is new) and then merge all incoming `**` groups into that fresh accumulator (update the code that emits Instruction::DictMerge and the surrounding logic in the same block to use the new accumulator).

youknowone force-pushed the stdlib_dir branch from f42eaaf to 22569fa Compare January 19, 2026 14:03

youknowone marked this pull request as ready for review January 19, 2026 14:44

coderabbitai bot reviewed Jan 19, 2026

View reviewed changes

youknowone force-pushed the stdlib_dir branch from 7371d30 to b212d9a Compare January 19, 2026 15:32

coderabbitai bot reviewed Jan 19, 2026

View reviewed changes

youknowone force-pushed the stdlib_dir branch from c9e25ea to 3867a0a Compare January 19, 2026 17:12

Add sys._stdlib_dir

99f3e3e

youknowone force-pushed the stdlib_dir branch from 3867a0a to 99f3e3e Compare January 19, 2026 17:32

youknowone merged commit b76906a into RustPython:main Jan 19, 2026
13 checks passed

youknowone deleted the stdlib_dir branch January 19, 2026 18:38

youknowone mentioned this pull request Jan 20, 2026

Prohibit AI PR submit #6813

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add sys._stdlib_dir #6798

Add sys._stdlib_dir #6798

Uh oh!

youknowone commented Jan 19, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Jan 19, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

github-actions bot commented Jan 19, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Jan 19, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Jan 19, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Add sys._stdlib_dir #6798

Add sys._stdlib_dir #6798

Uh oh!

Conversation

youknowone commented Jan 19, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai bot commented Jan 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

github-actions bot commented Jan 19, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 19, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

youknowone commented Jan 19, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 19, 2026 •

edited

Loading