update_lib hard dependency resolver#6817
Conversation
📝 WalkthroughWalkthroughThis PR extends the library update toolchain with a new dependency resolution module ( Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Code has been automatically formatted The code in this PR has been formatted using:
git pull origin lib_updater |
| from update_lib.path import parse_lib_path | ||
|
|
||
| # Extract module name and cpython prefix from path | ||
| path_str = str(src_path).replace("\\", "/") |
There was a problem hiding this comment.
When I look at this again, wouldn't as_posix() would also do this?
There was a problem hiding this comment.
once tried, but not simply replacable.
ShaharNaveh
left a comment
There was a problem hiding this comment.
I like this change a lot!
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@scripts/update_lib/copy_lib.py`:
- Around line 65-77: The code that parses src_path via path_str and splits on
"/Lib/" mis-handles nested module paths and makes the "else" fallback
unreachable because parse_lib_path always raises; update the logic in
copy_lib.py so that when "/Lib/" is present you preserve the full after_lib (not
just its first path component) and derive the destination accordingly (e.g., use
the full nested path portion from after_lib to compute lib_path), and when
"/Lib/" is not present call parse_lib_path inside a try/except and on exception
perform an explicit direct copy using _copy_single (or consult DEPENDENCIES
mapping) instead of assuming parse_lib_path will succeed; refer to
variables/path pieces path_str, cpython_prefix, after_lib, name, and functions
parse_lib_path and _copy_single to locate the changes.
In `@scripts/update_lib/deps.py`:
- Around line 358-365: In resolve_all_paths, the loop over deps returned from
get_test_dependencies is appending dictionary keys ("hard_deps"/"data") into
result["test_deps"]; change the logic so you iterate over deps["hard_deps"]
(and, if you need test data, also extend/merge deps["data"]) instead of
iterating the deps dict itself; reference get_test_dependencies, the local
variable deps, and result["test_deps"], and add existence checks for "hard_deps"
and "data" before iterating to avoid KeyError.
🧹 Nitpick comments (2)
scripts/update_lib/tests/test_patch_spec.py (1)
325-358: Good test coverage for the three main branches.The tests correctly validate each branch of
_find_import_insert_line. Consider adding a test for a multi-line docstring to verifyend_linenohandling, though this is optional since the current coverage is sufficient.🧪 Optional: Additional test for multi-line docstring
def test_no_imports_with_multiline_docstring(self): """Test fallback to after multi-line docstring when no imports.""" code = '''"""Module docstring. This spans multiple lines. """ class Foo: pass ''' tree = ast.parse(code) line = _find_import_insert_line(tree) self.assertEqual(line, 4)scripts/update_lib/quick.py (1)
108-133: Dependency migration is solid; please confirm data deps are directories.
shutil.copytreewill fail if any entry intest_deps["data"]is a file. If files are possible, consider handling both cases.Optional hardening for file vs directory data deps
- if data_lib.exists(): - shutil.rmtree(data_lib) - shutil.copytree(data_src, data_lib) + if data_lib.exists(): + if data_lib.is_dir(): + shutil.rmtree(data_lib) + else: + data_lib.unlink() + if data_src.is_dir(): + shutil.copytree(data_src, data_lib) + else: + data_lib.parent.mkdir(parents=True, exist_ok=True) + shutil.copy2(data_src, data_lib)
| # Extract module name and cpython prefix from path | ||
| path_str = str(src_path).replace("\\", "/") | ||
| if "/Lib/" in path_str: | ||
| cpython_prefix, after_lib = path_str.split("/Lib/", 1) | ||
| # Get module name (first component, without .py) | ||
| name = after_lib.split("/")[0] | ||
| if name.endswith(".py"): | ||
| name = name[:-3] | ||
| else: | ||
| # Fallback: just copy the single file | ||
| lib_path = parse_lib_path(src_path) | ||
| _copy_single(src_path, lib_path, verbose) | ||
| return |
There was a problem hiding this comment.
Fallback path handling is currently unreachable; nested Lib paths mis-resolve module names.
When /Lib/ isn’t present, parse_lib_path will always raise, so the “fallback” can’t work. Also, deriving the name from the first component after /Lib/ mis-resolves nested paths like cpython/Lib/test/libregrtest (name becomes test), which can copy the wrong module. Consider copying nested paths directly (or reverse-mapping via DEPENDENCIES) and making the non-/Lib/ fallback explicit.
🛠️ Proposed fix (direct-copy for nested paths + explicit fallback)
if "/Lib/" in path_str:
cpython_prefix, after_lib = path_str.split("/Lib/", 1)
+ # If a deeper path is provided (e.g., Lib/test/libregrtest),
+ # copy it directly instead of resolving by top-level module name.
+ if "/" in after_lib:
+ lib_path = parse_lib_path(src_path)
+ _copy_single(src_path, lib_path, verbose)
+ return
# Get module name (first component, without .py)
name = after_lib.split("/")[0]
if name.endswith(".py"):
name = name[:-3]
else:
- # Fallback: just copy the single file
- lib_path = parse_lib_path(src_path)
- _copy_single(src_path, lib_path, verbose)
- return
+ # Fallback: allow direct Lib/... paths
+ if path_str.startswith("Lib/"):
+ lib_path = pathlib.Path(path_str)
+ _copy_single(src_path, lib_path, verbose)
+ return
+ raise ValueError(f"Path must contain '/Lib/' or start with 'Lib/' (got: {src_path})")🤖 Prompt for AI Agents
In `@scripts/update_lib/copy_lib.py` around lines 65 - 77, The code that parses
src_path via path_str and splits on "/Lib/" mis-handles nested module paths and
makes the "else" fallback unreachable because parse_lib_path always raises;
update the logic in copy_lib.py so that when "/Lib/" is present you preserve the
full after_lib (not just its first path component) and derive the destination
accordingly (e.g., use the full nested path portion from after_lib to compute
lib_path), and when "/Lib/" is not present call parse_lib_path inside a
try/except and on exception perform an explicit direct copy using _copy_single
(or consult DEPENDENCIES mapping) instead of assuming parse_lib_path will
succeed; refer to variables/path pieces path_str, cpython_prefix, after_lib,
name, and functions parse_lib_path and _copy_single to locate the changes.
| if include_deps: | ||
| # Auto-detect test dependencies | ||
| for test_path in result["test"]: | ||
| deps = get_test_dependencies(test_path) | ||
| for dep in deps: | ||
| if dep not in result["test_deps"]: | ||
| result["test_deps"].append(dep) | ||
|
|
There was a problem hiding this comment.
resolve_all_paths appends dict keys instead of dependency paths.
Iterating over deps yields "hard_deps" / "data" strings, so test_deps becomes invalid and downstream logic will break. Iterate over deps["hard_deps"] (and merge deps["data"] if you intend to collect test data).
🛠️ Proposed fix
if include_deps:
# Auto-detect test dependencies
for test_path in result["test"]:
deps = get_test_dependencies(test_path)
- for dep in deps:
- if dep not in result["test_deps"]:
- result["test_deps"].append(dep)
+ for dep in deps["hard_deps"]:
+ if dep not in result["test_deps"]:
+ result["test_deps"].append(dep)
+ for data_path in deps["data"]:
+ if data_path not in result["data"]:
+ result["data"].append(data_path)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if include_deps: | |
| # Auto-detect test dependencies | |
| for test_path in result["test"]: | |
| deps = get_test_dependencies(test_path) | |
| for dep in deps: | |
| if dep not in result["test_deps"]: | |
| result["test_deps"].append(dep) | |
| if include_deps: | |
| # Auto-detect test dependencies | |
| for test_path in result["test"]: | |
| deps = get_test_dependencies(test_path) | |
| for dep in deps["hard_deps"]: | |
| if dep not in result["test_deps"]: | |
| result["test_deps"].append(dep) | |
| for data_path in deps["data"]: | |
| if data_path not in result["data"]: | |
| result["data"].append(data_path) |
🤖 Prompt for AI Agents
In `@scripts/update_lib/deps.py` around lines 358 - 365, In resolve_all_paths, the
loop over deps returned from get_test_dependencies is appending dictionary keys
("hard_deps"/"data") into result["test_deps"]; change the logic so you iterate
over deps["hard_deps"] (and, if you need test data, also extend/merge
deps["data"]) instead of iterating the deps dict itself; reference
get_test_dependencies, the local variable deps, and result["test_deps"], and add
existence checks for "hard_deps" and "data" before iterating to avoid KeyError.
updaing datetime will triggers to update also _pydatetime
cc @moreal @ShaharNaveh
Summary by CodeRabbit
New Features
Bug Fixes
Tests
✏️ Tip: You can customize this high-level summary in your review settings.