Skip to content

impl more nt#6984

Merged
youknowone merged 6 commits intoRustPython:mainfrom
youknowone:nt
Feb 4, 2026
Merged

impl more nt#6984
youknowone merged 6 commits intoRustPython:mainfrom
youknowone:nt

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Feb 3, 2026

Summary by CodeRabbit

  • New Features

    • Added Windows code-page encodings and encoding/decoding support.
    • Added Windows path normalization and improved terminal detection.
  • Bug Fixes

    • Trimmed additional trailing newline variants for clearer error pointers.
    • Improved Windows symlink creation with robust fallbacks and clearer OSError context.
    • Strengthened Windows path resolution and tightened environment-variable validation (reject empty names).

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 3, 2026

📝 Walkthrough

Walkthrough

Windows-specific codecs and extensive Windows stdlib changes were added (code-page encode/decode, codec delegation), large rewrites to nt.rs (symlink, path normalization, reparse-point and attribute APIs), a tiny exceptions newline-trim tweak, and tighter env-var validation.

Changes

Cohort / File(s) Summary
Exception rendering
crates/vm/src/exceptions.rs
Trim trailing '\n' and '\r' when rendering SyntaxError text; affects r_text used for spacing and caret placement.
Codecs and delegation
crates/vm/src/stdlib/codecs.rs
Introduce Windows-only _codecs_windows submodule, add code_page_encode / code_page_decode and many Windows codec wrappers; add delegate_pycodecs helper and non-Windows delegating wrappers. (Duplicated blocks appear in diff.)
Windows stdlib — nt.rs (paths, symlinks, attrs, env)
crates/vm/src/stdlib/nt.rs
Large Windows-focused refactor: symlink creation with privileged-create probing, enhanced error mapping, reparse-tag-aware readlink/stat, win32_hchmod/fchmod_impl/lchmod, _getfinalpathname and _path_normpath (GetFinalPathNameByHandleW), SECURITY_ATTRIBUTES for pipes, new reparse constants/predicates, and long-path/overflow guards.
Environment validation (posix & nt)
crates/vm/src/stdlib/posix.rs, crates/vm/src/stdlib/nt.rs
Tightened environment-variable validation to reject empty keys in addition to keys containing =; affects env construction and related error paths.

Sequence Diagram(s)

sequenceDiagram
    participant Py as Python caller
    participant VM as VirtualMachine/_codecs
    participant Sub as _codecs_windows (submodule)
    participant Win as Windows API

    Py->>VM: call codec function (e.g., code_page_encode)
    alt on non-Windows
        VM->>VM: delegate_pycodecs! -> route to registered PyCodecs
        VM-->>Py: return result
    else on Windows
        VM->>Sub: call Windows-specific wrapper
        Sub->>Win: WideCharToMultiByte / MultiByteToWideChar
        Win-->>Sub: bytes/string + status
        Sub-->>VM: result
        VM-->>Py: return result
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • windows codecs #6337 — Adds Windows-specific codec implementations and delegation changes in crates/vm/src/stdlib/codecs.rs.
  • upgrade shutil #6558 — Overlapping changes to Windows fchmod/fchmod_impl and chmod behavior in crates/vm/src/stdlib/nt.rs.
  • nt.terminalsize #6374 — Related edits to Windows console/terminal handling and STD handle resolution in crates/vm/src/stdlib/nt.rs.

Suggested reviewers

  • ShaharNaveh

Poem

"I'm a rabbit in the VM, I hop and code with glee,
Symlinks curl like carrot tops, paths normalize for me.
Code pages hum in Windows breeze, carets point just right,
I nibble bugs and stitch the rest — the stdlib's shining bright!" 🐇✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'impl more nt' is vague and generic, using abbreviated terminology that doesn't convey meaningful information about the substantial changes across multiple stdlib modules. Use a more descriptive title that captures the main scope, such as 'Add Windows-specific stdlib implementations (nt, codecs, posix)' or 'Implement Windows terminal support and codec functions'.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone marked this pull request as ready for review February 3, 2026 15:54
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/nt.rs (1)

2262-2294: ⚠️ Potential issue | 🟠 Major

Change ValueError to OSError for non-symlink reparse tags.

readlink should raise OSError (with EINVAL errno), not ValueError, to match POSIX semantics when the reparse point is not a symlink or junction.

The proposed fix references ERROR_NOT_A_REPARSE_POINT, which does not exist in windows_sys. Instead, use libc::EINVAL:

🛠️ Corrected error mapping
-            } else {
-                return Err(vm.new_value_error("not a symbolic link".to_owned()));
-            };
+            } else {
+                let err = io::Error::from_raw_os_error(libc::EINVAL);
+                return Err(OSErrorBuilder::with_filename(&err, path, vm));
+            };
🤖 Fix all issues with AI agents
In `@crates/vm/src/stdlib/codecs.rs`:
- Around line 770-865: The decode currently falls back to non-strict decoding
whenever the first MultiByteToWideChar call returns 0; change code in the
code_page_decode path to check args.errors (the _errors variable) when
strict_flags uses MB_ERR_INVALID_CHARS: if size == 0 and _errors == "strict"
then return vm.new_unicode_decode_error(...) (use the same error formatting as
other branches) instead of falling back to call MultiByteToWideChar with flags
0; otherwise (when _errors != "strict") keep the existing fallback logic.
Reference symbols: args.errors / _errors, strict_flags, MB_ERR_INVALID_CHARS,
MultiByteToWideChar, vm.new_unicode_decode_error, and the code_page_decode
return paths.

In `@crates/vm/src/stdlib/nt.rs`:
- Around line 573-666: The disk_device check in _test_file_type_by_name
incorrectly omits network filesystems so regular files on UNC/share paths aren't
treated as disk devices; update the matches!(...) in _test_file_type_by_name to
include FILE_DEVICE_NETWORK_FILE_SYSTEM (import it from
windows_sys::Win32::System::Ioctl) alongside FILE_DEVICE_DISK,
FILE_DEVICE_VIRTUAL_DISK, and FILE_DEVICE_CD_ROM so _test_info(...) correctly
treats network-mounted regular files as disk devices and allows the rest of the
logic (including the handle-based fallback) to run.
- Around line 1287-1333: In _getfinalpathname, the CreateFileW call opens the
file with access 0 and share mode 0 which causes sharing violations; change the
access parameter to FILE_READ_ATTRIBUTES and the sharemode parameter to
FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE (matching other stdlib
patterns) when calling CreateFileW so the handle is opened permissively for
metadata reads; update the CreateFileW invocation in the _getfinalpathname
function accordingly and keep the rest of the logic
(GetFinalPathNameByHandleW/CloseHandle) unchanged.
🧹 Nitpick comments (1)
crates/vm/src/stdlib/nt.rs (1)

1585-1750: Add tests for the new Windows normpath implementation.

This is a sizeable port; consider tests for UNC/device paths, drive-relative forms, dot/dotdot handling, trailing separators, and bytes input to guard parity.

Comment on lines 770 to 865
if args.code_page < 0 {
return Err(vm.new_value_error("invalid code page number".to_owned()));
}
let _errors = args.errors.as_ref().map(|s| s.as_str()).unwrap_or("strict");
let code_page = args.code_page as u32;
let data = args.data.borrow_buf();
let len = data.len();

if data.is_empty() {
return Ok((String::new(), 0));
}

// Some code pages don't support MB_ERR_INVALID_CHARS
let strict_flags = if code_page == 65000
|| code_page == 42
|| (50220..=50222).contains(&code_page)
|| code_page == 50225
|| code_page == 50227
|| code_page == 50229
|| (57002..=57011).contains(&code_page)
{
0
} else {
MB_ERR_INVALID_CHARS
};

let size = unsafe {
MultiByteToWideChar(
code_page,
strict_flags,
data.as_ptr().cast(),
len as i32,
std::ptr::null_mut(),
0,
)
};

if size == 0 {
let size = unsafe {
MultiByteToWideChar(
code_page,
0,
data.as_ptr().cast(),
len as i32,
std::ptr::null_mut(),
0,
)
};
if size == 0 {
let err = std::io::Error::last_os_error();
return Err(vm.new_os_error(format!("code_page_decode failed: {err}")));
}

let mut buffer = vec![0u16; size as usize];
let result = unsafe {
MultiByteToWideChar(
code_page,
0,
data.as_ptr().cast(),
len as i32,
buffer.as_mut_ptr(),
size,
)
};
if result == 0 {
let err = std::io::Error::last_os_error();
return Err(vm.new_os_error(format!("code_page_decode failed: {err}")));
}
buffer.truncate(result as usize);
let s = String::from_utf16(&buffer).map_err(|e| {
vm.new_unicode_decode_error(format!("code_page_decode failed: {e}"))
})?;
return Ok((s, len));
}

let mut buffer = vec![0u16; size as usize];
let result = unsafe {
MultiByteToWideChar(
code_page,
strict_flags,
data.as_ptr().cast(),
len as i32,
buffer.as_mut_ptr(),
size,
)
};
if result == 0 {
let err = std::io::Error::last_os_error();
return Err(vm.new_os_error(format!("code_page_decode failed: {err}")));
}
buffer.truncate(result as usize);
let s = String::from_utf16(&buffer)
.map_err(|e| vm.new_unicode_decode_error(format!("code_page_decode failed: {e}")))?;

Ok((s, len))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

errors="strict" isn’t enforced on decode failures.
When the strict MultiByteToWideChar pass returns 0, the code always falls back to replacement decoding, so invalid bytes won’t raise a UnicodeDecodeError. Gate the fallback on errors != "strict" (or raise a decode error) when MB_ERR_INVALID_CHARS is supported.

🛠️ Proposed fix
-        let _errors = args.errors.as_ref().map(|s| s.as_str()).unwrap_or("strict");
+        let errors = args.errors.as_ref().map(|s| s.as_str()).unwrap_or("strict");
         let code_page = args.code_page as u32;
         let data = args.data.borrow_buf();
         let len = data.len();
@@
         let size = unsafe {
             MultiByteToWideChar(
                 code_page,
                 strict_flags,
@@
             )
         };

         if size == 0 {
+            if errors == "strict" && strict_flags != 0 {
+                return Err(vm.new_unicode_decode_error(format!(
+                    "'cp{code_page}' codec can't decode bytes: invalid character"
+                )));
+            }
             let size = unsafe {
                 MultiByteToWideChar(
                     code_page,
                     0,
CPython code_page_decode strict behavior MultiByteToWideChar MB_ERR_INVALID_CHARS UnicodeDecodeError
🤖 Prompt for AI Agents
In `@crates/vm/src/stdlib/codecs.rs` around lines 770 - 865, The decode currently
falls back to non-strict decoding whenever the first MultiByteToWideChar call
returns 0; change code in the code_page_decode path to check args.errors (the
_errors variable) when strict_flags uses MB_ERR_INVALID_CHARS: if size == 0 and
_errors == "strict" then return vm.new_unicode_decode_error(...) (use the same
error formatting as other branches) instead of falling back to call
MultiByteToWideChar with flags 0; otherwise (when _errors != "strict") keep the
existing fallback logic. Reference symbols: args.errors / _errors, strict_flags,
MB_ERR_INVALID_CHARS, MultiByteToWideChar, vm.new_unicode_decode_error, and the
code_page_decode return paths.

Comment on lines 573 to 666
/// _testFileTypeByName - test file type by path name
fn _test_file_type_by_name(path: &std::path::Path, tested_type: u32) -> bool {
use crate::common::fileutils::windows::{
FILE_INFO_BY_NAME_CLASS, get_file_information_by_name,
};
use crate::common::windows::ToWideString;
use windows_sys::Win32::Foundation::{CloseHandle, INVALID_HANDLE_VALUE};
use windows_sys::Win32::Storage::FileSystem::{
CreateFileW, FILE_FLAG_BACKUP_SEMANTICS, FILE_FLAG_OPEN_REPARSE_POINT,
FILE_READ_ATTRIBUTES, FILE_SHARE_DELETE, FILE_SHARE_READ, FILE_SHARE_WRITE,
OPEN_EXISTING,
CreateFileW, FILE_ATTRIBUTE_REPARSE_POINT, FILE_FLAG_BACKUP_SEMANTICS,
FILE_FLAG_OPEN_REPARSE_POINT, FILE_READ_ATTRIBUTES, OPEN_EXISTING,
};

// For islink/isjunction, use symlink_metadata to check reparse points
if (tested_type == PY_IFLNK || tested_type == PY_IFMNT)
&& let Ok(meta) = path.symlink_metadata()
{
use std::os::windows::fs::MetadataExt;
let attrs = meta.file_attributes();
use windows_sys::Win32::Storage::FileSystem::FILE_ATTRIBUTE_REPARSE_POINT;
if (attrs & FILE_ATTRIBUTE_REPARSE_POINT) == 0 {
return false;
use windows_sys::Win32::Storage::FileSystem::{FILE_DEVICE_CD_ROM, FILE_DEVICE_DISK};
use windows_sys::Win32::System::Ioctl::FILE_DEVICE_VIRTUAL_DISK;

match get_file_information_by_name(
path.as_os_str(),
FILE_INFO_BY_NAME_CLASS::FileStatBasicByNameInfo,
) {
Ok(info) => {
let disk_device = matches!(
info.DeviceType,
FILE_DEVICE_DISK | FILE_DEVICE_VIRTUAL_DISK | FILE_DEVICE_CD_ROM
);
let result = _test_info(
info.FileAttributes,
info.ReparseTag,
disk_device,
tested_type,
);
if !result
|| (tested_type != PY_IFREG && tested_type != PY_IFDIR)
|| (info.FileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) == 0
{
return result;
}
}
Err(err) => {
if let Some(code) = err.raw_os_error()
&& file_info_error_is_trustworthy(code as u32)
{
return false;
}
}
// Need to check reparse tag, fall through to CreateFileW
}

let wide_path = path.to_wide_with_nul();

// For symlinks/junctions, add FILE_FLAG_OPEN_REPARSE_POINT to not follow
let mut flags = FILE_FLAG_BACKUP_SEMANTICS;
if tested_type != PY_IFREG && tested_type != PY_IFDIR {
flags |= FILE_FLAG_OPEN_REPARSE_POINT;
}

// Use sharing flags to avoid access denied errors
let handle = unsafe {
CreateFileW(
wide_path.as_ptr(),
FILE_READ_ATTRIBUTES,
FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
0,
core::ptr::null(),
OPEN_EXISTING,
flags,
std::ptr::null_mut(),
)
};

if handle == INVALID_HANDLE_VALUE {
// Fallback: try using Rust's metadata for isdir/isfile
if tested_type == PY_IFDIR {
return path.metadata().is_ok_and(|m| m.is_dir());
} else if tested_type == PY_IFREG {
return path.metadata().is_ok_and(|m| m.is_file());
}
// For symlinks/junctions, try without FILE_FLAG_BACKUP_SEMANTICS
let handle = unsafe {
CreateFileW(
wide_path.as_ptr(),
FILE_READ_ATTRIBUTES,
FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
core::ptr::null(),
OPEN_EXISTING,
FILE_FLAG_OPEN_REPARSE_POINT,
std::ptr::null_mut(),
)
};
if handle == INVALID_HANDLE_VALUE {
return false;
}
let result = _test_file_type_by_handle(handle, tested_type, true);
if handle != INVALID_HANDLE_VALUE {
let result = _test_file_type_by_handle(handle, tested_type, false);
unsafe { CloseHandle(handle) };
return result;
}

let result = _test_file_type_by_handle(handle, tested_type, true);
unsafe { CloseHandle(handle) };
result
match unsafe { windows_sys::Win32::Foundation::GetLastError() } {
windows_sys::Win32::Foundation::ERROR_ACCESS_DENIED
| windows_sys::Win32::Foundation::ERROR_SHARING_VIOLATION
| windows_sys::Win32::Foundation::ERROR_CANT_ACCESS_FILE
| windows_sys::Win32::Foundation::ERROR_INVALID_PARAMETER => {
let stat = if tested_type == PY_IFREG || tested_type == PY_IFDIR {
crate::windows::win32_xstat(path.as_os_str(), true)
} else {
crate::windows::win32_xstat(path.as_os_str(), false)
};
if let Ok(st) = stat {
let disk_device = (st.st_mode & libc::S_IFREG as u16) != 0;
return _test_info(
st.st_file_attributes,
st.st_reparse_tag,
disk_device,
tested_type,
);
}
}
_ => {}
}

false
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file structure and locate the relevant code
git ls-files "crates/vm/src/stdlib/nt.rs"

Repository: RustPython/RustPython

Length of output: 91


🏁 Script executed:

# Check if FILE_DEVICE_NETWORK_FILE_SYSTEM constant exists in the codebase
rg "FILE_DEVICE_NETWORK_FILE_SYSTEM" --type rust

Repository: RustPython/RustPython

Length of output: 153


🏁 Script executed:

# Find the _test_info function to understand how disk_device is used
rg "_test_info" --type rust -A 10 -B 2

Repository: RustPython/RustPython

Length of output: 4357


🏁 Script executed:

# Check the Cargo.toml to see which version of windows_sys is used
rg "windows_sys" Cargo.toml

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Search for windows device type constants used in the codebase
rg "FILE_DEVICE_" --type rust | head -30

Repository: RustPython/RustPython

Length of output: 1896


🏁 Script executed:

# Get the full _test_info function to understand how disk_device affects PY_IFREG
sed -n '520,565p' crates/vm/src/stdlib/nt.rs

Repository: RustPython/RustPython

Length of output: 1844


🏁 Script executed:

# Get more context around _test_info to see the complete function
sed -n '490,595p' crates/vm/src/stdlib/nt.rs

Repository: RustPython/RustPython

Length of output: 4160


🏁 Script executed:

# Check how crates/common/src/fileutils.rs handles device types
rg "FILE_DEVICE_NETWORK_FILE_SYSTEM" -B 5 -A 5 crates/common/src/fileutils.rs

Repository: RustPython/RustPython

Length of output: 560


🏁 Script executed:

# Get the complete _test_info function to see PY_IFREG logic
sed -n '450,490p' crates/vm/src/stdlib/nt.rs

Repository: RustPython/RustPython

Length of output: 1943


🏁 Script executed:

# Check windows_sys for the FILE_DEVICE_NETWORK_FILE_SYSTEM constant availability
rg "FILE_DEVICE_NETWORK_FILE_SYSTEM" -t rust

Repository: RustPython/RustPython

Length of output: 153


🏁 Script executed:

# Verify the constant path in windows_sys
python3 << 'EOF'
# Let's check if the constant is available and in what module
# From the fileutils.rs output, it's: Ioctl::FILE_DEVICE_NETWORK_FILE_SYSTEM
# We need to verify if it's in Win32::System::Ioctl
import subprocess
result = subprocess.run(['rg', 'Ioctl::FILE_DEVICE_NETWORK_FILE_SYSTEM', '-A', '2', '-B', '2', 'crates/common/src/fileutils.rs'], 
                       capture_output=True, text=True)
print("Context around FILE_DEVICE_NETWORK_FILE_SYSTEM in fileutils.rs:")
print(result.stdout)
EOF

Repository: RustPython/RustPython

Length of output: 369


🏁 Script executed:

# Verify the exact module path for FILE_DEVICE_NETWORK_FILE_SYSTEM in windows_sys
rg "use.*Ioctl" crates/common/src/fileutils.rs -B 5 | head -20

Repository: RustPython/RustPython

Length of output: 268


🏁 Script executed:

# Verify the pattern used in fileutils.rs matches what was proposed in the diff
sed -n '1,50p' crates/common/src/fileutils.rs | grep -A 20 "use.*Ioctl"

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check the imports in fileutils.rs to confirm the module path
sed -n '1,100p' crates/common/src/fileutils.rs | grep -E "use windows_sys"

Repository: RustPython/RustPython

Length of output: 351


🏁 Script executed:

# Double-check the exact location where FILE_DEVICE_NETWORK_FILE_SYSTEM is used
rg "FILE_DEVICE_NETWORK_FILE_SYSTEM" -B 10 -A 2 crates/common/src/fileutils.rs | head -30

Repository: RustPython/RustPython

Length of output: 651


Include FILE_DEVICE_NETWORK_FILE_SYSTEM in disk_device type checks for network file support.

Regular files on network shares (UNC paths, etc.) report DeviceType as FILE_DEVICE_NETWORK_FILE_SYSTEM, but the current disk_device check only includes FILE_DEVICE_DISK, FILE_DEVICE_VIRTUAL_DISK, and FILE_DEVICE_CD_ROM. This causes _test_info() to return false for PY_IFREG on network paths, triggering an early return without reaching the handle-based fallback. Add FILE_DEVICE_NETWORK_FILE_SYSTEM to the matches pattern (import from windows_sys::Win32::System::Ioctl), or consider relaxing the device type gating to ensure regular files on all mountable filesystem types are handled correctly.

🤖 Prompt for AI Agents
In `@crates/vm/src/stdlib/nt.rs` around lines 573 - 666, The disk_device check in
_test_file_type_by_name incorrectly omits network filesystems so regular files
on UNC/share paths aren't treated as disk devices; update the matches!(...) in
_test_file_type_by_name to include FILE_DEVICE_NETWORK_FILE_SYSTEM (import it
from windows_sys::Win32::System::Ioctl) alongside FILE_DEVICE_DISK,
FILE_DEVICE_VIRTUAL_DISK, and FILE_DEVICE_CD_ROM so _test_info(...) correctly
treats network-mounted regular files as disk devices and allows the rest of the
logic (including the handle-based fallback) to run.

Comment on lines 1287 to +1333
fn _getfinalpathname(path: OsPath, vm: &VirtualMachine) -> PyResult {
let real = path
.as_ref()
.canonicalize()
.map_err(|e| e.to_pyexception(vm))?;
Ok(path.mode().process_path(real, vm))
use windows_sys::Win32::Storage::FileSystem::{
CreateFileW, FILE_FLAG_BACKUP_SEMANTICS, GetFinalPathNameByHandleW, OPEN_EXISTING,
VOLUME_NAME_DOS,
};

let wide = path.to_wide_cstring(vm)?;
let handle = unsafe {
CreateFileW(
wide.as_ptr(),
0,
0,
core::ptr::null(),
OPEN_EXISTING,
FILE_FLAG_BACKUP_SEMANTICS,
std::ptr::null_mut(),
)
};
if handle == INVALID_HANDLE_VALUE {
let err = io::Error::last_os_error();
return Err(OSErrorBuilder::with_filename(&err, path, vm));
}

let mut buffer: Vec<u16> = vec![0; Foundation::MAX_PATH as usize];
let result = loop {
let ret = unsafe {
GetFinalPathNameByHandleW(
handle,
buffer.as_mut_ptr(),
buffer.len() as u32,
VOLUME_NAME_DOS,
)
};
if ret == 0 {
let err = io::Error::last_os_error();
let _ = unsafe { Foundation::CloseHandle(handle) };
return Err(OSErrorBuilder::with_filename(&err, path, vm));
}
if (ret as usize) < buffer.len() {
let final_path = std::ffi::OsString::from_wide(&buffer[..ret as usize]);
break Ok(path.mode().process_path(final_path, vm));
}
buffer.resize(ret as usize, 0);
};

unsafe { Foundation::CloseHandle(handle) };
result
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate and examine the file
fd nt.rs crates/vm/src/stdlib/

Repository: RustPython/RustPython

Length of output: 203


🏁 Script executed:

# Read the specific lines in question
sed -n '1280,1340p' crates/vm/src/stdlib/nt.rs

Repository: RustPython/RustPython

Length of output: 2137


🏁 Script executed:

# Check the imports at the top of the file to understand the current setup
head -100 crates/vm/src/stdlib/nt.rs | grep -A 20 "use windows_sys"

Repository: RustPython/RustPython

Length of output: 2278


🏁 Script executed:

# Check if the proposed constants exist in windows_sys by searching Rust files
rg "FILE_READ_ATTRIBUTES|FILE_SHARE_READ|FILE_SHARE_WRITE|FILE_SHARE_DELETE" --type rs | head -20

Repository: RustPython/RustPython

Length of output: 91


🏁 Script executed:

# Check the windows_sys crate version and available exports
fd "Cargo.toml" | xargs grep -l "windows_sys" | head -5

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look at how other functions in nt.rs handle file access
rg "CreateFileW" crates/vm/src/stdlib/nt.rs -A 10 -B 2

Repository: RustPython/RustPython

Length of output: 6696


🌐 Web query:

GetFinalPathNameByHandleW access rights requirements FILE_READ_ATTRIBUTES Windows API best practices

💡 Result:

Minimal access rights (practical + “least privilege”)

For GetFinalPathNameByHandleW, the safest minimal access to request when opening the handle is:

  • dwDesiredAccess = FILE_READ_ATTRIBUTES (rather than GENERIC_READ) so you can read name/metadata without requesting read-data access. FILE_READ_ATTRIBUTES is the specific right used to read file metadata. [3][4]

Many callers also succeed with

  • dwDesiredAccess = 0 (query-only handle) when they only need metadata and want to avoid requesting GENERIC_READ; Microsoft documents that a zero access mask can still allow querying certain metadata/attributes. [2]
    However, this is not guaranteed to work for every file system / redirector / scenario, so FILE_READ_ATTRIBUTES is the more reliable “minimal” choice.

Share mode / flags (best practices)

When opening the handle you’ll pass to GetFinalPathNameByHandleW:

  • Prefer permissive sharing to avoid collisions with other opens:
    dwShareMode = FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE
  • If opening a directory, include:
    FILE_FLAG_BACKUP_SEMANTICS (required to open directory handles via CreateFile).
  • If you only need the path and not I/O, avoid unnecessary rights (don’t use GENERIC_READ unless you actually need to read file data).

Avoiding ERROR_ACCESS_DENIED surprises (notably SMB)

If you request a normalized name (FILE_NAME_NORMALIZED, the default), Microsoft notes that on SMB shares the system may resolve and query each path component; if the caller lacks permission on any component, the call can fail with ERROR_ACCESS_DENIED. [1]
Mitigations:

  • Use FILE_NAME_OPENED when you don’t strictly need normalization. [1]
  • Or ensure the security context has traverse/query permission for the full component chain on the share.

Recommended pattern (typical)

Open a handle “as lightly as possible” for path resolution:

  • CreateFileW(path, FILE_READ_ATTRIBUTES, FILE_SHARE_READ|FILE_SHARE_WRITE|FILE_SHARE_DELETE, …, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS /*if dir*/, NULL)
  • Then call GetFinalPathNameByHandleW(handle, …, dwFlags) (choose FILE_NAME_OPENED vs normalized per your needs). [1]

Sources: Microsoft GetFinalPathNameByHandleW remarks/flags and SMB behavior [1]; CreateFile note about dwDesiredAccess = 0 for querying metadata/attributes [2]; definition of FILE_READ_ATTRIBUTES right [3][4].


Use permissive share mode to avoid sharing violations when opening handle for final path.

The handle is opened with share mode 0, which prevents other processes from opening the file. This causes ERROR_SHARING_VIOLATION failures even though only metadata is needed. Use FILE_READ_ATTRIBUTES for access rights and FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE for share mode, matching Windows best practices and existing patterns in the codebase.

🛠️ Suggested change to reduce sharing violations
         use windows_sys::Win32::Storage::FileSystem::{
-            CreateFileW, FILE_FLAG_BACKUP_SEMANTICS, GetFinalPathNameByHandleW, OPEN_EXISTING,
-            VOLUME_NAME_DOS,
+            CreateFileW, FILE_FLAG_BACKUP_SEMANTICS, FILE_READ_ATTRIBUTES, FILE_SHARE_DELETE,
+            FILE_SHARE_READ, FILE_SHARE_WRITE, GetFinalPathNameByHandleW, OPEN_EXISTING,
+            VOLUME_NAME_DOS,
         };
 
         let wide = path.to_wide_cstring(vm)?;
         let handle = unsafe {
             CreateFileW(
                 wide.as_ptr(),
-                0,
-                0,
+                FILE_READ_ATTRIBUTES,
+                FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                 core::ptr::null(),
                 OPEN_EXISTING,
                 FILE_FLAG_BACKUP_SEMANTICS,
                 std::ptr::null_mut(),
             )
         };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn _getfinalpathname(path: OsPath, vm: &VirtualMachine) -> PyResult {
let real = path
.as_ref()
.canonicalize()
.map_err(|e| e.to_pyexception(vm))?;
Ok(path.mode().process_path(real, vm))
use windows_sys::Win32::Storage::FileSystem::{
CreateFileW, FILE_FLAG_BACKUP_SEMANTICS, GetFinalPathNameByHandleW, OPEN_EXISTING,
VOLUME_NAME_DOS,
};
let wide = path.to_wide_cstring(vm)?;
let handle = unsafe {
CreateFileW(
wide.as_ptr(),
0,
0,
core::ptr::null(),
OPEN_EXISTING,
FILE_FLAG_BACKUP_SEMANTICS,
std::ptr::null_mut(),
)
};
if handle == INVALID_HANDLE_VALUE {
let err = io::Error::last_os_error();
return Err(OSErrorBuilder::with_filename(&err, path, vm));
}
let mut buffer: Vec<u16> = vec![0; Foundation::MAX_PATH as usize];
let result = loop {
let ret = unsafe {
GetFinalPathNameByHandleW(
handle,
buffer.as_mut_ptr(),
buffer.len() as u32,
VOLUME_NAME_DOS,
)
};
if ret == 0 {
let err = io::Error::last_os_error();
let _ = unsafe { Foundation::CloseHandle(handle) };
return Err(OSErrorBuilder::with_filename(&err, path, vm));
}
if (ret as usize) < buffer.len() {
let final_path = std::ffi::OsString::from_wide(&buffer[..ret as usize]);
break Ok(path.mode().process_path(final_path, vm));
}
buffer.resize(ret as usize, 0);
};
unsafe { Foundation::CloseHandle(handle) };
result
fn _getfinalpathname(path: OsPath, vm: &VirtualMachine) -> PyResult {
use windows_sys::Win32::Storage::FileSystem::{
CreateFileW, FILE_FLAG_BACKUP_SEMANTICS, FILE_READ_ATTRIBUTES, FILE_SHARE_DELETE,
FILE_SHARE_READ, FILE_SHARE_WRITE, GetFinalPathNameByHandleW, OPEN_EXISTING,
VOLUME_NAME_DOS,
};
let wide = path.to_wide_cstring(vm)?;
let handle = unsafe {
CreateFileW(
wide.as_ptr(),
FILE_READ_ATTRIBUTES,
FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
core::ptr::null(),
OPEN_EXISTING,
FILE_FLAG_BACKUP_SEMANTICS,
std::ptr::null_mut(),
)
};
if handle == INVALID_HANDLE_VALUE {
let err = io::Error::last_os_error();
return Err(OSErrorBuilder::with_filename(&err, path, vm));
}
let mut buffer: Vec<u16> = vec![0; Foundation::MAX_PATH as usize];
let result = loop {
let ret = unsafe {
GetFinalPathNameByHandleW(
handle,
buffer.as_mut_ptr(),
buffer.len() as u32,
VOLUME_NAME_DOS,
)
};
if ret == 0 {
let err = io::Error::last_os_error();
let _ = unsafe { Foundation::CloseHandle(handle) };
return Err(OSErrorBuilder::with_filename(&err, path, vm));
}
if (ret as usize) < buffer.len() {
let final_path = std::ffi::OsString::from_wide(&buffer[..ret as usize]);
break Ok(path.mode().process_path(final_path, vm));
}
buffer.resize(ret as usize, 0);
};
unsafe { Foundation::CloseHandle(handle) };
result
🤖 Prompt for AI Agents
In `@crates/vm/src/stdlib/nt.rs` around lines 1287 - 1333, In _getfinalpathname,
the CreateFileW call opens the file with access 0 and share mode 0 which causes
sharing violations; change the access parameter to FILE_READ_ATTRIBUTES and the
sharemode parameter to FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE
(matching other stdlib patterns) when calling CreateFileW so the handle is
opened permissively for metadata reads; update the CreateFileW invocation in the
_getfinalpathname function accordingly and keep the rest of the logic
(GetFinalPathNameByHandleW/CloseHandle) unchanged.

@github-actions
Copy link
Contributor

github-actions bot commented Feb 3, 2026

Code has been automatically formatted

The code in this PR has been formatted using:

  • cargo fmt --all
    Please pull the latest changes before pushing again:
git pull origin nt

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
crates/vm/src/stdlib/codecs.rs (2)

465-521: ⚠️ Potential issue | 🟠 Major

Honor errors="strict" in mbcs_decode before falling back.

The _errors variable is parsed but never used. Currently, when the strict MultiByteToWideChar pass with MB_ERR_INVALID_CHARS fails, the code unconditionally retries without the flag, allowing invalid bytes to pass through even in strict mode. In strict mode, this should raise an error instead.

🛠️ Proposed fix
-        let _errors = args.errors.as_ref().map(|s| s.as_str()).unwrap_or("strict");
+        let errors = args.errors.as_ref().map(|s| s.as_str()).unwrap_or("strict");
         let data = args.data.borrow_buf();
         let len = data.len();
@@ -488,6 +488,11 @@ mod windows_codecs {
         if size == 0 {
+            if errors == "strict" {
+                return Err(vm.new_unicode_decode_error(
+                    "'mbcs' codec can't decode bytes: invalid character".to_owned(),
+                ));
+            }
             // Try without MB_ERR_INVALID_CHARS for non-strict mode (replacement behavior)

667-723: ⚠️ Potential issue | 🟠 Major

Honor errors="strict" in oem_decode and other decode functions.

At line 667, _errors is parsed but never used. When the strict MultiByteToWideChar pass fails (size == 0), the code unconditionally retries without MB_ERR_INVALID_CHARS, bypassing strict error checking. This violates codec semantics: errors="strict" should raise an error on invalid bytes, not silently fall back to lenient decoding.

The same issue exists in mbcs_decode (line 465) and code_page_decode (line 895). By contrast, the corresponding encode functions (oem_encode, code_page_encode) correctly use the errors parameter to control behavior in strict mode.

🛠️ Proposed fix for oem_decode
-        let _errors = args.errors.as_ref().map(|s| s.as_str()).unwrap_or("strict");
+        let errors = args.errors.as_ref().map(|s| s.as_str()).unwrap_or("strict");
         let data = args.data.borrow_buf();
         let len = data.len();

         if data.is_empty() {
             return Ok((String::new(), 0));
         }

         // Get the required buffer size for UTF-16
         let size = unsafe {
             MultiByteToWideChar(
                 CP_OEMCP,
                 MB_ERR_INVALID_CHARS,
                 data.as_ptr().cast(),
                 len as i32,
                 std::ptr::null_mut(),
                 0,
             )
         };

         if size == 0 {
+            if errors == "strict" {
+                return Err(vm.new_unicode_decode_error(
+                    "'oem' codec can't decode bytes: invalid character".to_owned(),
+                ));
+            }
             // Try without MB_ERR_INVALID_CHARS for non-strict mode (replacement behavior)

Apply the same logic to mbcs_decode and code_page_decode.

@github-actions
Copy link
Contributor

github-actions bot commented Feb 3, 2026

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[ ] lib: cpython/Lib/io.py
[ ] lib: cpython/Lib/_pyio.py
[ ] test: cpython/Lib/test/test_io.py (TODO: 59)
[ ] test: cpython/Lib/test/test_bufio.py
[ ] test: cpython/Lib/test/test_fileio.py
[ ] test: cpython/Lib/test/test_memoryio.py (TODO: 26)

dependencies:

  • io (native: _io, _thread, errno, sys)
    • _pyio
    • _collections_abc, abc, codecs, os, stat

dependent tests: (87 tests)

  • io: test__colorize test_android test_argparse test_ast test_asyncio test_bufio test_builtin test_bz2 test_calendar test_cmd test_cmd_line_script test_codecs test_compileall test_compiler_assemble test_concurrent_futures test_configparser test_contextlib test_csv test_dbm_dumb test_dis test_email test_enum test_file test_fileinput test_fileio test_ftplib test_getpass test_gzip test_hashlib test_httplib test_httpservers test_importlib test_inspect test_io test_json test_largefile test_logging test_lzma test_mailbox test_marshal test_memoryio test_memoryview test_mimetypes test_optparse test_pathlib test_pickle test_pickletools test_plistlib test_pprint test_print test_pty test_pulldom test_pyexpat test_regrtest test_robotparser test_shlex test_shutil test_site test_smtplib test_socket test_socketserver test_subprocess test_support test_sys test_tarfile test_tempfile test_threadedtempfile test_timeit test_tokenize test_traceback test_typing test_unittest test_univnewlines test_urllib test_urllib2 test_uuid test_wave test_wsgiref test_xml_dom_xmlbuilder test_xml_etree test_xml_etree_c test_xmlrpc test_zipapp test_zipfile test_zipimport test_zoneinfo test_zstd

[ ] test: cpython/Lib/test/test_cmd_line_script.py (TODO: 16)

dependencies:

dependent tests: (no tests depend on cmd_line_script)

[x] lib: cpython/Lib/codecs.py
[ ] test: cpython/Lib/test/test_codecs.py (TODO: 78)
[ ] test: cpython/Lib/test/test_codeccallbacks.py (TODO: 9)
[ ] test: cpython/Lib/test/test_codecencodings_cn.py
[ ] test: cpython/Lib/test/test_codecencodings_hk.py
[ ] test: cpython/Lib/test/test_codecencodings_iso2022.py
[ ] test: cpython/Lib/test/test_codecencodings_jp.py
[ ] test: cpython/Lib/test/test_codecencodings_kr.py
[ ] test: cpython/Lib/test/test_codecencodings_tw.py
[ ] test: cpython/Lib/test/test_codecmaps_cn.py
[ ] test: cpython/Lib/test/test_codecmaps_hk.py
[ ] test: cpython/Lib/test/test_codecmaps_jp.py
[ ] test: cpython/Lib/test/test_codecmaps_kr.py
[ ] test: cpython/Lib/test/test_codecmaps_tw.py
[ ] test: cpython/Lib/test/test_charmapcodec.py
[ ] test: cpython/Lib/test/test_multibytecodec.py

dependencies:

  • codecs

dependent tests: (105 tests)

  • codecs: test_charmapcodec test_codeccallbacks test_codecs test_eof test_exceptions test_importlib test_io test_json test_locale test_logging test_os test_plistlib test_str test_sys
    • encodings:
      • locale: test__locale test_builtin test_c_locale_coercion test_calendar test_decimal test_format test_re test_regrtest test_time test_types test_utf8_mode
    • json: test_subprocess test_sysconfig test_tomllib test_tools test_traceback test_zoneinfo
      • importlib.metadata: test_importlib
    • pickle: test_array test_asyncio test_bytes test_bz2 test_collections test_concurrent_futures test_coroutines test_csv test_ctypes test_defaultdict test_deque test_descr test_dict test_dictviews test_email test_enum test_enumerate test_fractions test_functools test_generators test_genericalias test_http_cookies test_inspect test_ipaddress test_iter test_itertools test_list test_lzma test_memoryio test_memoryview test_opcache test_operator test_ordered_dict test_pathlib test_pickle test_pickletools test_platform test_positional_only_arg test_posix test_random test_range test_set test_shelve test_slice test_socket test_statistics test_string test_trace test_tuple test_typing test_unittest test_uuid test_xml_dom_minicompat test_xml_etree test_zipfile test_zlib test_zoneinfo
      • logging.handlers: test_concurrent_futures
    • tokenize: test_linecache test_tabnanny test_tokenize test_unparse
      • inspect: test_abc test_argparse test_asyncgen test_code test_grammar test_ntpath test_patma test_posixpath test_signal test_yield_from test_zipimport

[ ] lib: cpython/Lib/compileall.py
[ ] test: cpython/Lib/test/test_compileall.py (TODO: 2)

dependencies:

  • compileall (native: importlib.util, sys)
    • pathlib (native: errno, itertools, pathlib._os, sys)
    • filecmp, functools, os, py_compile, struct

dependent tests: (1 tests)

  • compileall: test_compileall

[ ] lib: cpython/Lib/concurrent
[ ] test: cpython/Lib/test/test_concurrent_futures (TODO: 18)
[ ] test: cpython/Lib/test/test_interpreters
[ ] test: cpython/Lib/test/test__interpreters.py
[ ] test: cpython/Lib/test/test__interpchannels.py
[ ] test: cpython/Lib/test/test_crossinterp.py

dependencies:

  • concurrent (native: _interpqueues, _interpreters, concurrent.futures, concurrent.futures._base, itertools, multiprocessing.connection, multiprocessing.queues, sys, time)
    • collections (native: _weakref, itertools, sys)
    • logging (native: atexit, errno, logging.handlers, sys, time)
    • multiprocessing (native: _multiprocessing, _winapi, array, atexit, collections.abc, errno, itertools, mmap, msvcrt, sys, time)
    • pickle (native: itertools, sys)
    • queue (native: time)
    • traceback (native: collections.abc, itertools, sys)
    • functools, os, threading, types, weakref

dependent tests: (11 tests)

  • concurrent: test_asyncio test_concurrent_futures test_context test_genericalias
    • asyncio: test_asyncio test_contextlib_async test_inspect test_logging test_os test_sys_settrace test_unittest

[x] test: cpython/Lib/test/test_exceptions.py (TODO: 27)
[ ] test: cpython/Lib/test/test_baseexception.py
[x] test: cpython/Lib/test/test_except_star.py (TODO: 1)
[ ] test: cpython/Lib/test/test_exception_group.py (TODO: 1)
[x] test: cpython/Lib/test/test_exception_hierarchy.py (TODO: 2)
[x] test: cpython/Lib/test/test_exception_variations.py

dependencies:

dependent tests: (no tests depend on exception)

[x] lib: cpython/Lib/os.py
[ ] test: cpython/Lib/test/test_os.py (TODO: 3)
[x] test: cpython/Lib/test/test_popen.py

dependencies:

  • os

dependent tests: (160 tests)

  • os: test___all__ test__osx_support test_argparse test_ast test_asyncio test_atexit test_base64 test_baseexception test_bdb test_bool test_builtin test_bytes test_bz2 test_c_locale_coercion test_calendar test_cmd_line test_cmd_line_script test_codecs test_compile test_compileall test_concurrent_futures test_configparser test_contextlib test_ctypes test_dbm test_dbm_dumb test_dbm_sqlite3 test_decimal test_devpoll test_doctest test_dtrace test_eintr test_email test_ensurepip test_enum test_epoll test_exception_hierarchy test_exceptions test_faulthandler test_fcntl test_file test_file_eintr test_filecmp test_fileinput test_fileio test_float test_fnmatch test_fork1 test_fractions test_fstring test_ftplib test_future_stmt test_genericalias test_genericpath test_getpass test_gettext test_glob test_graphlib test_gzip test_hash test_hashlib test_http_cookiejar test_httplib test_httpservers test_imaplib test_importlib test_inspect test_io test_ioctl test_json test_kqueue test_largefile test_linecache test_locale test_logging test_lzma test_mailbox test_marshal test_math test_mimetypes test_mmap test_msvcrt test_multiprocessing_fork test_multiprocessing_forkserver test_multiprocessing_main_handling test_multiprocessing_spawn test_ntpath test_openpty test_optparse test_os test_pathlib test_pkg test_pkgutil test_platform test_plistlib test_poll test_popen test_posix test_posixpath test_pty test_py_compile test_pyexpat test_random test_regrtest test_repl test_reprlib test_robotparser test_runpy test_script_helper test_selectors test_shelve test_shutil test_signal test_site test_smtpnet test_socket test_socketserver test_sqlite3 test_ssl test_stat test_string_literals test_structseq test_subprocess test_support test_sys test_sysconfig test_tabnanny test_tarfile test_tempfile test_termios test_thread test_threading test_threadsignals test_tokenize test_tools test_trace test_tty test_typing test_unicode_file test_unicode_file_functions test_unittest test_univnewlines test_urllib test_urllib2 test_urllib2_localnet test_urllib2net test_urllibnet test_uuid test_venv test_wait3 test_wait4 test_webbrowser test_winapi test_winreg test_wsgiref test_xml_etree test_zipfile test_zipimport test_zoneinfo test_zstd

[x] test: cpython/Lib/test/test_posix.py (TODO: 4)

dependencies:

dependent tests: (no tests depend on posix)

[x] lib: cpython/Lib/runpy.py
[x] test: cpython/Lib/test/test_runpy.py (TODO: 2)

dependencies:

  • runpy

dependent tests: (3 tests)

  • runpy: test_runpy
    • ensurepip: test_ensurepip test_venv

[ ] lib: cpython/Lib/test/support
[ ] test: cpython/Lib/test/test_support.py (TODO: 4)
[ ] test: cpython/Lib/test/test_script_helper.py

dependencies:

  • support (native: _imp, _interpchannels, _opcode, _thread, collections.abc, concurrent.interpreters, concurrent.interpreters._crossinterp, email._header_value_parser, errno, importlib.machinery, importlib.util, logging.handlers, math, os.path, select, sys, test.support, test.support.import_helper, test.support.script_helper, test.test_tools, time)
    • dataclasses (native: itertools, sys)
    • getopt
    • hashlib
    • importlib (native: _imp, _io, _warnings, collections.abc, email.message, importlib.abc, itertools, marshal, sys)
    • opcode (native: _opcode, builtins)
    • shutil (native: errno, sys)
    • subprocess (native: builtins, errno, sys, time)
    • sysconfig (native: os.path, sys)
    • tempfile (native: _thread, errno, sys)
    • test (native: main, _ast, _codecs, _ctypes, _datetime, _imp, _interpchannels, _interpreters, _io, _locale, _multibytecodec, _opcode, _operator, _overlapped, _pickle, _pyrepl._module_completer, _pyrepl.console, _pyrepl.input, _pyrepl.keymap, _pyrepl.reader, _pyrepl.readline, _pyrepl.simple_interact, _pyrepl.terminfo, _pyrepl.utils, _ssl, _string, _sysconfig, _testcapi, _testconsole, _testinternalcapi, _testlimitedcapi, _testsinglephase, _thread, _tkinter, _winapi, array, asyncio.events, asyncio.log, asyncio.proactor_events, asyncio.selector_events, asyncio.staggered, atexit, binascii, builtins, cmath, collections.abc, compression._common, compression.zstd, concurrent.futures, concurrent.futures._base, concurrent.futures.interpreter, concurrent.futures.process, concurrent.futures.thread, concurrent.interpreters, concurrent.interpreters._crossinterp, ctypes.macholib.dyld, ctypes.macholib.dylib, ctypes.macholib.framework, ctypes.util, ctypes.wintypes, dbm.dumb, dbm.ndbm, dbm.sqlite3, email._header_value_parser, email._policybase, email.base64mime, email.charset, email.contentmanager, email.errors, email.generator, email.header, email.headerregistry, email.iterators, email.message, email.mime.application, email.mime.audio, email.mime.base, email.mime.image, email.mime.message, email.mime.multipart, email.mime.nonmultipart, email.mime.text, email.parser, email.policy, email.utils, ensurepip._uninstall, errno, faulthandler, fcntl, gc, html.entities, html.parser, http.client, http.cookiejar, http.cookies, http.server, idlelib.idle_test, importlib._bootstrap, importlib._bootstrap_external, importlib.abc, importlib.machinery, importlib.metadata, importlib.readers, importlib.resources, importlib.resources._adapters, importlib.resources.abc, importlib.util, itertools, logging.config, logging.handlers, marshal, math, mmap, msvcrt, multiprocessing.connection, multiprocessing.dummy, multiprocessing.heap, multiprocessing.managers, multiprocessing.pool, multiprocessing.process, multiprocessing.queues, multiprocessing.util, not_a_module, os.path, package.submodule, parse, pyexpat, readline, select, setuptools, sqlite3.main, string.templatelib, sys, sysconfig.main, termios, test._code_definitions, test._crossinterp_definitions, test._test_multiprocessing, test.fork_wait, test.libregrtest, test.libregrtest.filter, test.libregrtest.main, test.libregrtest.result, test.libregrtest.results, test.libregrtest.utils, test.list_tests, test.pickletester, test.profilee, test.ssl_servers, test.string_tests, test.support, test.support.ast_helper, test.support.bytecode_helper, test.support.hashlib_helper, test.support.hypothesis_helper, test.support.i18n_helper, test.support.import_helper, test.support.logging_helper, test.support.numbers, test.support.os_helper, test.support.pty_helper, test.support.script_helper, test.support.socket_helper, test.support.testcase, test.support.threading_helper, test.support.venv, test.support.warnings_helper, test.test__interpreters, test.test_ast.snippets, test.test_ast.utils, test.test_asyncio, test.test_asyncio.utils, test.test_capi.test_getargs, test.test_contextlib, test.test_ctypes, test.test_ctypes._support, test.test_dbm, test.test_doctest, test.test_doctest.decorator_mod, test.test_email, test.test_httplib, test.test_httpservers, test.test_import, test.test_import.data.circular_imports.binding, test.test_import.data.circular_imports.binding2, test.test_import.data.circular_imports.import_cycle, test.test_import.data.circular_imports.subpkg2.parent, test.test_import.data.circular_imports.subpkg2.parent.child, test.test_importlib, test.test_importlib.extension.test_loader, test.test_importlib.metadata.fixtures, test.test_importlib.util, test.test_inspect, test.test_json, test.test_math, test.test_module.final_a, test.test_module.final_b, test.test_profile, test.test_py_compile, test.test_pydoc, test.test_pyrepl, test.test_sqlite3, test.test_string._support, test.test_tkinter.support, test.test_tkinter.widget_tests, test.test_tools, test.test_ttk_textonly, test.test_unittest, test.test_unittest.support, test.test_unittest.test_result, test.test_unittest.testmock, test.test_unittest.testmock.support, test.test_warnings.data, test.test_zoneinfo, test.test_zoneinfo._support, test.tracedmodules, test.typinganndata, test.typinganndata.ann_module, test_regrtest_b.util, time, tkinter.colorchooser, tkinter.commondialog, tkinter.messagebox, tkinter.simpledialog, unicodedata, unittest.case, unittest.mock, unittest.util, urllib.error, urllib.parse, urllib.request, urllib.response, urllib.robotparser, winreg, wsgiref.handlers, wsgiref.headers, wsgiref.simple_server, wsgiref.util, wsgiref.validate, xml.dom, xml.dom.minicompat, xml.dom.minidom, xml.etree, xml.etree.ElementTree, xml.parsers, xml.parsers.expat, xml.sax, xml.sax.expatreader, xml.sax.handler, xml.sax.saxutils, xml.sax.xmlreader, xmlrpc.client, xmlrpc.server, zipfile._path)
    • unittest (native: builtins, os.path, sys, time, unittest.util)
    • venv (native: sys)
    • collections, io, logging, pathlib
    • annotationlib, ast, contextlib, dis, enum, functools, inspect, os, py_compile, re, selectors, shlex, socket, stat, string, textwrap, threading, types, warnings

dependent tests: (no tests depend on support)

[ ] lib: cpython/Lib/subprocess.py
[ ] test: cpython/Lib/test/test_subprocess.py (TODO: 5)

dependencies:

  • subprocess

dependent tests: (52 tests)

  • subprocess: test_android test_asyncio test_audit test_bz2 test_c_locale_coercion test_cmd_line test_cmd_line_script test_ctypes test_dtrace test_faulthandler test_file_eintr test_gzip test_inspect test_json test_msvcrt test_ntpath test_os test_osx_env test_platform test_plistlib test_poll test_py_compile test_regrtest test_repl test_runpy test_script_helper test_select test_shutil test_signal test_site test_sqlite3 test_subprocess test_support test_sys test_sysconfig test_tempfile test_threading test_unittest test_urllib2 test_utf8_mode test_venv test_wait3 test_webbrowser test_zipfile
    • asyncio: test_asyncio test_contextlib_async test_logging test_sys_settrace test_unittest
    • ctypes.util: test_ctypes
    • ensurepip: test_ensurepip
    • multiprocessing.util: test_concurrent_futures

[x] lib: cpython/Lib/weakref.py
[x] lib: cpython/Lib/_weakrefset.py
[x] test: cpython/Lib/test/test_weakref.py (TODO: 21)
[ ] test: cpython/Lib/test/test_weakset.py (TODO: 1)

dependencies:

  • weakref

dependent tests: (148 tests)

  • weakref: test_array test_ast test_asyncio test_code test_concurrent_futures test_context test_contextlib test_copy test_deque test_descr test_dict test_exceptions test_file test_fileio test_functools test_generators test_genericalias test_importlib test_inspect test_io test_ipaddress test_itertools test_logging test_memoryview test_mmap test_ordered_dict test_pickle test_picklebuffer test_queue test_re test_scope test_set test_slice test_socket test_sqlite3 test_ssl test_struct test_tempfile test_thread test_threading test_threading_local test_types test_typing test_unittest test_uuid test_weakref test_weakset test_xml_etree
    • asyncio: test_asyncio test_contextlib_async test_os test_sys_settrace test_unittest
    • concurrent.futures.process: test_concurrent_futures
    • copy: test_bytes test_codecs test_collections test_coroutines test_csv test_defaultdict test_dictviews test_email test_enum test_fractions test_http_cookies test_opcache test_optparse test_platform test_plistlib test_posix test_site test_statistics test_sysconfig test_tomllib test_xml_dom_minicompat test_zlib
      • dataclasses: test__colorize test_patma test_pprint test_regrtest test_zoneinfo
      • email.generator: test_email
      • http.cookiejar: test_http_cookiejar
      • http.server: test_httpservers test_robotparser test_urllib2_localnet
      • mailbox: test_mailbox
      • smtplib: test_smtplib test_smtpnet
      • tarfile: test_shutil test_tarfile
    • inspect: test_abc test_argparse test_asyncgen test_builtin test_decimal test_grammar test_ntpath test_operator test_posixpath test_signal test_traceback test_yield_from test_zipimport
      • bdb: test_bdb
      • importlib.metadata: test_importlib
      • rlcompleter: test_rlcompleter
      • trace: test_trace
      • xmlrpc.server: test_docxmlrpc test_xmlrpc
    • multiprocessing: test_concurrent_futures test_fcntl test_multiprocessing_main_handling
    • symtable: test_symtable
    • tempfile: test_bz2 test_cmd_line test_compile test_compileall test_dis test_doctest test_ensurepip test_faulthandler test_filecmp test_fileinput test_importlib test_linecache test_pathlib test_pkg test_pkgutil test_py_compile test_runpy test_selectors test_string_literals test_subprocess test_support test_tabnanny test_termios test_threadedtempfile test_urllib test_urllib2 test_urllib_response test_venv test_zipapp test_zipfile test_zipfile64 test_zstd
      • urllib.request: test_urllib2net test_urllibnet

[ ] lib: cpython/Lib/zipfile
[x] test: cpython/Lib/test/test_zipfile64.py

dependencies:

  • zipfile (native: binascii, importlib.util, itertools, sys, time)
    • posixpath (native: errno, sys)
    • io, pathlib, shutil
    • contextlib, os, re, stat, struct, threading

dependent tests: (8 tests)

  • zipfile: test_pkgutil test_shutil test_zipapp test_zipfile test_zipfile64 test_zipimport
    • importlib.metadata: test_importlib test_zoneinfo

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
crates/vm/src/stdlib/codecs.rs (2)

447-502: ⚠️ Potential issue | 🟠 Major

errors="strict" isn't enforced on decode failures in mbcs_decode.

The _errors variable (line 447) is extracted but never used. When MultiByteToWideChar with MB_ERR_INVALID_CHARS returns 0, the code unconditionally falls back to replacement decoding (lines 468-502). This means invalid bytes won't raise a UnicodeDecodeError even when errors="strict".

🛠️ Proposed fix
-        let _errors = args.errors.as_ref().map(|s| s.as_str()).unwrap_or("strict");
+        let errors = args.errors.as_ref().map(|s| s.as_str()).unwrap_or("strict");
         let data = args.data.borrow_buf();
         let len = data.len();
@@
         if size == 0 {
+            if errors == "strict" {
+                return Err(vm.new_unicode_decode_error(
+                    "'mbcs' codec can't decode bytes: invalid character".to_string(),
+                ));
+            }
             // Try without MB_ERR_INVALID_CHARS for non-strict mode (replacement behavior)
             let size = unsafe {

637-692: ⚠️ Potential issue | 🟠 Major

errors="strict" isn't enforced on decode failures in oem_decode.

Same issue as mbcs_decode: the _errors variable (line 637) is unused, and the code unconditionally falls back to replacement decoding when the strict check fails. This violates Python's codec error handling semantics.

🛠️ Proposed fix
-        let _errors = args.errors.as_ref().map(|s| s.as_str()).unwrap_or("strict");
+        let errors = args.errors.as_ref().map(|s| s.as_str()).unwrap_or("strict");
         let data = args.data.borrow_buf();
         let len = data.len();
@@
         if size == 0 {
+            if errors == "strict" {
+                return Err(vm.new_unicode_decode_error(
+                    "'oem' codec can't decode bytes: invalid character".to_string(),
+                ));
+            }
             // Try without MB_ERR_INVALID_CHARS for non-strict mode (replacement behavior)
             let size = unsafe {
🧹 Nitpick comments (1)
crates/vm/src/stdlib/codecs.rs (1)

338-348: Redundant #[cfg(windows)] attributes inside already-gated module.

The structs and functions inside _codecs_windows (lines 338, 347, 428, 440, 528, 537, 618, 630, 718, 729, 828, 842) have #[cfg(windows)] attributes, but the enclosing module at line 332 is already gated by #[cfg(windows)]. These inner attributes are redundant and can be removed for cleaner code.

@youknowone youknowone merged commit c045593 into RustPython:main Feb 4, 2026
14 checks passed
@youknowone youknowone deleted the nt branch February 4, 2026 00:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant