Skip to content

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Jan 3, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Added platform-aware character encoding detection function supporting Windows, Unix, iOS, Android, and Redox.
    • Added result retrieval method for Windows overlapped I/O operations with synchronization and error handling.
  • Bug Fixes

    • Improved Windows directory path resolution by propagating directory changes to per-drive environment variables.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 3, 2026

📝 Walkthrough

Walkthrough

This pull request introduces three platform-specific enhancements to stdlib modules: a locale encoding detection function with Windows/Unix/mobile fallbacks, an overlapped I/O result synchronization method with complex error handling, and per-drive environment variable propagation in the Windows chdir implementation.

Changes

Cohort / File(s) Summary
Locale module enhancement
crates/stdlib/src/locale.rs
Added getencoding() function with platform-aware logic: returns cp{ACP} on Windows via GetACP(), queries CODESET via nl_langinfo on Unix systems with UTF-8 fallback, and returns UTF-8 on iOS/Android/Redox.
Overlapped I/O synchronization
crates/stdlib/src/overlapped.rs
Added Overlapped.getresult() method to retrieve results from overlapped operations. Includes new helper for parsing SOCKADDR_IN6, nuanced error handling for Windows error codes (e.g., ERROR_BROKEN_PIPE, ERROR_IO_PENDING), and stateful return logic based on operation type (Read, Write, ReadFrom, etc.).
Windows chdir environment variable propagation
crates/vm/src/stdlib/os.rs
Extended Windows chdir implementation to set per-drive environment variables (e.g., =X:) after changing the process working directory, with UNC path detection to skip environment variable setting for network paths.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • ShaharNaveh

Poem

🐰 New paths reveal their secrets true,
From Windows codes to overlapped queues,
Each platform sings a different song,
As locale and I/O march along! ✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main changes: adding/fixing three Windows-related features (locale.getencoding, overlapped.getresult, and chdir) across three files in the codebase.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone marked this pull request as ready for review January 3, 2026 01:11
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
crates/stdlib/src/locale.rs (1)

52-53: Consider reusing existing GetACP wrapper.

The GetACP function is already exposed in crates/vm/src/stdlib/winapi.rs (line 148-150). While the current approach of importing directly from windows_sys works, consider consistency with the codebase.

crates/stdlib/src/overlapped.rs (1)

374-383: Minor inefficiency in Read result handling.

The code allocates a new Vec and creates new bytes when transferred differs from buffer length. This is correct but creates an extra allocation. Consider whether the original buffer could be truncated in-place if the underlying implementation allows it.

This is a minor optimization opportunity and not blocking.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5ae75a6 and e19ba5f.

⛔ Files ignored due to path filters (1)
  • Lib/test/_test_multiprocessing.py is excluded by !Lib/**
📒 Files selected for processing (3)
  • crates/stdlib/src/locale.rs
  • crates/stdlib/src/overlapped.rs
  • crates/vm/src/stdlib/os.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/os.rs
  • crates/stdlib/src/overlapped.rs
  • crates/stdlib/src/locale.rs
🧬 Code graph analysis (1)
crates/stdlib/src/locale.rs (1)
crates/vm/src/stdlib/winapi.rs (1)
  • GetACP (149-151)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (5)
crates/stdlib/src/overlapped.rs (3)

157-200: unparse_address implementation is functional but has a minor note.

The function correctly parses both IPv4 and IPv6 addresses stored in SOCKADDR_IN6. The _addr_len parameter is unused but might be needed for validation in future (e.g., ensuring the address structure is fully populated).

The IPv6 formatting doesn't apply zero-compression (e.g., ::1 instead of 0:0:0:0:0:0:0:1), but this matches the simple formatting approach and is acceptable for debugging/display purposes.


309-326: getresult error messages could be more informative.

The validation logic is correct, but consider aligning error messages with CPython's overlapped.c:

  • "operation not yet attempted" → matches CPython
  • "operation failed to start" → CPython might phrase this differently

346-370: Error handling for ERROR_BROKEN_PIPE is comprehensive.

The logic correctly allows ERROR_BROKEN_PIPE for read operations while propagating it as an error for other operation types. The nested match arms for ReadFrom and ReadFromInto checking for None results handle edge cases appropriately.

crates/stdlib/src/locale.rs (1)

265-299: Implementation of getencoding is correct and matches CPython behavior.

The platform-specific logic correctly handles:

  • Windows: Returns cp{ACP} format using GetACP()
  • Unix (standard): Uses nl_langinfo(CODESET) with UTF-8 fallback
  • iOS/Android/Redox: Returns hardcoded UTF-8

The let-chains syntax used (lines 285-288) is fully supported by the project's configured MSRV (1.89.0) with edition 2024, which meets the requirement for this feature.

Likely an incorrect or invalid review comment.

crates/vm/src/stdlib/os.rs (1)

1127-1160: Code is correct—no actionable issues identified.

The implementation properly mirrors CPython's win32_wchdir() behavior:

  1. SetEnvironmentVariableW return value: Intentionally ignored as a best-effort operation, consistent with CPython.

  2. Drive letter assumption: The code safely assumes cwd_wide[0] is a valid drive letter because env::current_dir() calls GetCurrentDirectoryW, which returns only UNC paths (handled separately) or drive letter paths—never extended-length paths (\\?\). Extended-length prefixes are only added by APIs like canonicalize() or explicit operations, not by GetCurrentDirectoryW.

  3. Silent current_dir() failure: Acceptable as this is best-effort post-chdir fixup behavior matching CPython.

Comment on lines +412 to +429
OverlappedData::ReadFrom(rf) => {
// Return (resized_buffer, (host, port)) tuple
let buf = rf
.allocated_buffer
.downcast_ref::<crate::vm::builtins::PyBytes>()
.unwrap();
let bytes = buf.as_bytes();
let resized_buf = if transferred as usize != bytes.len() {
vm.ctx.new_bytes(bytes[..transferred as usize].to_vec())
} else {
buf.to_owned()
};
let addr_tuple = unparse_address(&rf.address, rf.address_length, vm);
Ok(vm
.ctx
.new_tuple(vec![resized_buf.into(), addr_tuple])
.into())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Potential panic from unwrap() on downcast.

Line 417 uses .unwrap() on downcast_ref::<PyBytes>(). If rf.allocated_buffer is not a PyBytes object (due to a logic error elsewhere), this will panic.

Consider using a safer pattern or adding a comment explaining why this is guaranteed to succeed.

🔎 Suggested safer alternative
-                    let buf = rf
-                        .allocated_buffer
-                        .downcast_ref::<crate::vm::builtins::PyBytes>()
-                        .unwrap();
+                    let buf = rf
+                        .allocated_buffer
+                        .downcast_ref::<crate::vm::builtins::PyBytes>()
+                        .ok_or_else(|| vm.new_type_error("expected bytes object".to_owned()))?;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
OverlappedData::ReadFrom(rf) => {
// Return (resized_buffer, (host, port)) tuple
let buf = rf
.allocated_buffer
.downcast_ref::<crate::vm::builtins::PyBytes>()
.unwrap();
let bytes = buf.as_bytes();
let resized_buf = if transferred as usize != bytes.len() {
vm.ctx.new_bytes(bytes[..transferred as usize].to_vec())
} else {
buf.to_owned()
};
let addr_tuple = unparse_address(&rf.address, rf.address_length, vm);
Ok(vm
.ctx
.new_tuple(vec![resized_buf.into(), addr_tuple])
.into())
}
OverlappedData::ReadFrom(rf) => {
// Return (resized_buffer, (host, port)) tuple
let buf = rf
.allocated_buffer
.downcast_ref::<crate::vm::builtins::PyBytes>()
.ok_or_else(|| vm.new_type_error("expected bytes object".to_owned()))?;
let bytes = buf.as_bytes();
let resized_buf = if transferred as usize != bytes.len() {
vm.ctx.new_bytes(bytes[..transferred as usize].to_vec())
} else {
buf.to_owned()
};
let addr_tuple = unparse_address(&rf.address, rf.address_length, vm);
Ok(vm
.ctx
.new_tuple(vec![resized_buf.into(), addr_tuple])
.into())
}
🤖 Prompt for AI Agents
In crates/stdlib/src/overlapped.rs around lines 412 to 429, the code uses
.unwrap() on rf.allocated_buffer.downcast_ref::<crate::vm::builtins::PyBytes>()
which can panic if the object is not PyBytes; replace the unwrap with a safe
check (e.g., match or if let) that handles the None case by returning a
Python-level error (TypeError or ValueError) via the vm or otherwise converting
the object to bytes safely, ensuring no panic; include a concise error message
describing the unexpected buffer type and proceed to construct resized_buf only
after confirming the downcast succeeded.

@youknowone youknowone merged commit cff37af into RustPython:main Jan 3, 2026
12 of 13 checks passed
@youknowone youknowone deleted the fix-windows- branch January 3, 2026 02:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant