
Conversation

Contributor

@moreal moreal commented Jan 11, 2026

First, to minimize crossings between Python and Rust, I made the parser do as much work as possible in Rust. This improved performance from 8.77s on the main branch (per my local benchmark) to 7.37s. While still slow compared to CPython's 0.2s, it's about an 18% improvement. I measured as follows, running it about three times and taking the highest value.

cargo build --release
time ./target/release/rustpython json_loads.py

Additionally, since JSON documents often repeat the same keys, I also implemented CPython's trick of reusing key objects. Measured separately, this didn't show any execution-time benefit; it's only expected to help memory efficiency.
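Here's a minimal sketch of the reuse idea, using Rc<str> in place of RustPython's PyStrRef (names and shape are illustrative, not the PR's actual code):

use std::collections::HashMap;
use std::rc::Rc;

/// Return a shared string for `key`, reusing an existing allocation if the
/// same key was seen before (common when parsing arrays of similar objects).
fn memo_key(memo: &mut HashMap<String, Rc<str>>, key: &str) -> Rc<str> {
    if let Some(existing) = memo.get(key) {
        return existing.clone(); // bumps the refcount, no new allocation
    }
    let shared: Rc<str> = Rc::from(key);
    memo.insert(key.to_owned(), shared.clone());
    shared
}

fn main() {
    let mut memo = HashMap::new();
    let a = memo_key(&mut memo, "id");
    let b = memo_key(&mut memo, "id");
    assert!(Rc::ptr_eq(&a, &b)); // the second "id" reuses the first
}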


While profiling with multiple flamegraph captures, I discovered that scanstring was taking the longest. The reason is that operations like chars().nth() are O(n) and therefore painfully slow on long inputs. I avoided this bottleneck by tracking byte indices alongside character indices.
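To make the bottleneck concrete, here's an illustrative sketch (plain &str rather than the Wtf8 the PR uses): convert the character index to a byte index once, then advance both in lockstep instead of calling chars().nth() repeatedly.

/// O(n) conversion from a character index to a byte index; pay this cost
/// once at the entry point instead of on every access.
fn char_to_byte_idx(s: &str, char_idx: usize) -> usize {
    s.char_indices()
        .nth(char_idx)
        .map(|(byte_idx, _)| byte_idx)
        .unwrap_or(s.len())
}

fn main() {
    let s = "héllo";
    let mut char_idx = 0;
    let mut byte_idx = char_to_byte_idx(s, char_idx); // done once, O(n)
    while let Some(c) = s[byte_idx..].chars().next() {
        // Afterwards both indices advance in lockstep, O(1) per character,
        // so slicing by byte index never rescans the prefix.
        char_idx += 1;
        byte_idx += c.len_utf8();
    }
    assert_eq!((char_idx, byte_idx), (s.chars().count(), s.len()));
}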

Compared to CPython's 0.2 seconds, it now takes around 0.4 to 0.5 seconds, a huge improvement: over 16 times faster than before!

For the original motivation, pip install: installing the black package used to take 5 seconds and now takes 1.61 seconds. Even with debug builds, what used to take 171 seconds now finishes in under 20 seconds.

Summary by CodeRabbit

  • Bug Fixes

    • Stronger detection and clearer errors for malformed JSON, invalid escapes, and out-of-range positions
    • Correct handling of trailing commas and other edge cases in objects/arrays
  • Improvements

    • More accurate, character-aware error reporting for strings and Unicode escapes (including surrogate handling)
    • Reduced string allocation for object keys and overall parsing robustness
  • New Features

    • Accepts an escaped "/" in strings and improves parsing fallback behavior


Contributor

coderabbitai bot commented Jan 11, 2026

📝 Walkthrough

Walkthrough

JSON parsing was refactored to track both Python character indices and underlying byte indices: string/object/array parsing moved into Rust, byte-oriented utilities and richer error reporting were added, and scanstring and DecodeError became public APIs.

Changes

  • Core JSON parsing restructure (crates/stdlib/src/json.rs): Updated JsonScanner::parse and JsonScanner::call signatures to accept PyStrRef, char_idx, and byte_idx. Added byte-/char-index utilities and whitespace helpers; implemented parse_object, parse_array, call_scan_once, finalize_object, and make_decode_error. Shifted dispatch from Python strings to Rust byte-oriented parsing and memoized object keys.
  • String parsing and error handling (crates/stdlib/src/json/machinery.rs): Added public scanstring(s: &Wtf8, char_offset: usize, strict: bool) -> Result<(Wtf8Buf, usize, usize), DecodeError> returning (decoded, chars_consumed, bytes_consumed). Introduced fast-path memchr2 scanning (see the sketch below), revised slow-path escape/surrogate handling, extended error contexts, and made DecodeError::new public.
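To illustrate the memchr2 fast path, a rough self-contained sketch (assuming only the memchr crate; the real scanstring operates on Wtf8 and also returns character/byte counts):

use memchr::memchr2;

/// Scan a JSON string body (starting just after the opening quote) and take
/// the fast path when no escape precedes the closing quote.
fn scan_fast(bytes: &[u8]) -> Option<&[u8]> {
    match memchr2(b'"', b'\\', bytes) {
        // First special byte is the closing quote: the whole prefix is the
        // string content, and no escape processing is needed.
        Some(i) if bytes[i] == b'"' => Some(&bytes[..i]),
        // An escape came first (or the string is unterminated): slow path.
        _ => None,
    }
}

fn main() {
    assert_eq!(scan_fast(b"hello\" rest"), Some(&b"hello"[..]));
    assert_eq!(scan_fast(b"he\\nllo\" rest"), None); // escape => slow path
}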

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant JsonScanner
    participant Parser
    participant ScanOnce
    participant ScanString
    participant ObjParser
    participant ArrParser
    participant Finalizer

    Caller->>JsonScanner: call(pystr, char_idx)
    JsonScanner->>Parser: parse(pystr, char_idx, byte_idx, scan_once)
    Parser->>ScanOnce: call_scan_once(byte_idx, char_idx)
    alt string
        ScanOnce->>ScanString: scanstring(s, char_offset, strict)
        ScanString-->>ScanOnce: (decoded, chars, bytes)
    else object
        ScanOnce->>ObjParser: parse_object(s, byte_idx)
        ObjParser->>ScanString: scanstring(key, char_offset, strict)
        ScanString-->>ObjParser: (key, chars, bytes)
        ObjParser->>ScanOnce: parse value (recursive)
        ObjParser->>Finalizer: finalize_object(pairs, hooks)
        Finalizer-->>ObjParser: Python object
    else array
        ScanOnce->>ArrParser: parse_array(s, byte_idx)
        ArrParser->>ScanOnce: parse element (recursive)
    else literal/number
        ScanOnce-->>Parser: parsed literal/number
    end
    ScanOnce-->>Parser: (value, end_positions)
    Parser-->>JsonScanner: PyIterReturn
    JsonScanner-->>Caller: result

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐇 Bytes and chars I hop between,

I chase escapes that once were mean,
I pair surrogates, count each rune,
I stitch the keys beneath the moon,
JSON blooms — a crunchy tune 🥕

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

  • Description Check ✅ Passed: Check skipped; CodeRabbit’s high-level summary is enabled.
  • Title check ✅ Passed: The PR title 'Improve json.loads performance' directly and clearly summarizes the main objective of the changeset, which refactors JSON parsing to move processing into Rust and reduce Python-Rust boundary crossings for performance gains.
  • Docstring Coverage ✅ Passed: Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.



🧹 Recent nitpick comments
crates/stdlib/src/json.rs (1)

36-40: Consider using the standard library's starts_with method.

The [u8] slice type has a built-in starts_with method that provides identical functionality.

♻️ Simplify using standard library
-    /// Check if a byte slice starts with a given ASCII pattern.
-    #[inline]
-    fn starts_with_bytes(bytes: &[u8], pattern: &[u8]) -> bool {
-        bytes.len() >= pattern.len() && &bytes[..pattern.len()] == pattern
-    }

Then replace call sites like:

-                b'n' if starts_with_bytes(&bytes[byte_idx..], b"null") => {
+                b'n' if bytes[byte_idx..].starts_with(b"null") => {

📜 Recent review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7e51d21 and 073a24a.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_json/test_decode.py is excluded by !Lib/**
📒 Files selected for processing (2)
  • crates/stdlib/src/json.rs
  • crates/stdlib/src/json/machinery.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style using cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/stdlib/src/json/machinery.rs
  • crates/stdlib/src/json.rs
🧬 Code graph analysis (2)
crates/stdlib/src/json/machinery.rs (1)
crates/stdlib/src/json.rs (1)
  • scanstring (702-732)
crates/stdlib/src/json.rs (1)
crates/stdlib/src/json/machinery.rs (3)
  • scanstring (140-252)
  • new (112-115)
  • len (123-128)
🔇 Additional comments (12)
crates/stdlib/src/json/machinery.rs (4)

33-33: LGTM!

Good use of memchr2 from the memchr crate for efficient byte scanning in the fast path.


112-116: LGTM!

Making DecodeError::new public is necessary to support the updated error construction in json.rs.


150-168: LGTM!

The fast path optimization using memchr2 is well-designed. When no escapes precede the closing quote, it avoids the chunk-based parsing overhead. The control character check in strict mode correctly scans the byte slice.


254-267: LGTM!

The decode_unicode function correctly parses a 4-digit hex escape sequence. The added flame_guard! is consistent with instrumentation in other hot paths.

crates/stdlib/src/json.rs (8)

20-34: LGTM!

The skip_whitespace helper efficiently processes JSON whitespace. Since all JSON whitespace characters (space, tab, newline, carriage return) are ASCII single-byte, the bytes-skipped count correctly equals the chars-skipped count.
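A self-contained sketch of that invariant (a hypothetical helper mirroring the description, not the PR's exact code):

/// Skip JSON whitespace. All four whitespace bytes are ASCII, so the number
/// of bytes skipped equals the number of characters skipped.
fn skip_whitespace(bytes: &[u8], mut idx: usize) -> usize {
    while let Some(&b) = bytes.get(idx) {
        match b {
            b' ' | b'\t' | b'\n' | b'\r' => idx += 1,
            _ => break,
        }
    }
    idx
}

fn main() {
    let skipped = skip_whitespace(b"  \t\n{\"k\": 1}", 0);
    assert_eq!(skipped, 4); // 4 bytes skipped == 4 chars skipped
}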


91-143: LGTM!

The updated parse method correctly dispatches to the new Rust-side parsing for strings, objects, and arrays while maintaining the fallback behavior for other values. The dual tracking of char_idx and byte_idx is correctly propagated.


226-386: LGTM!

The parse_object implementation correctly handles:

  • Empty objects
  • Key memoization for memory efficiency
  • Proper delimiter checking
  • Trailing comma detection with clear error messages
  • Recursive value parsing via call_scan_once

The dual index tracking maintains correct positions throughout.


388-469: LGTM!

The parse_array implementation is clean and handles all cases correctly, including empty arrays, proper delimiter checking, and trailing comma detection.


471-503: LGTM!

The hook handling correctly prioritizes object_pairs_hook over object_hook, matching CPython's behavior. The tuple construction for pairs_hook is correct.


505-617: LGTM!

The call_scan_once method provides a comprehensive dispatcher for JSON value types. The fallback path (lines 587-614) correctly handles edge cases by calling the Python scan_once function and converting the resulting char index back to a byte index. The O(n) conversion is acceptable since this fallback path should be rare.


632-663: LGTM!

The entry point correctly performs the O(n) character-to-byte index conversion once, then propagates both indices through the parsing stack. This amortizes the conversion cost effectively.


701-732: LGTM!

The scanstring function correctly converts the Python character index to a byte index before delegating to the machinery module. The error handling for out-of-bounds indices is appropriate.




Contributor

github-actions bot commented Jan 11, 2026

Code has been automatically formatted

The code in this PR has been formatted using:

  • cargo fmt --all

Please pull the latest changes before pushing again:

git pull origin improve-json-perf

@moreal moreal force-pushed the improve-json-perf branch 3 times, most recently from e7bfda5 to 4772f32 on January 14, 2026 at 16:29
@moreal moreal changed the title WIP: Improve json.loads Improve json.loads Jan 14, 2026
@moreal moreal marked this pull request as ready for review January 14, 2026 17:55
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
crates/stdlib/src/json.rs (1)

587-617: Consider validating non-negative return index in fallback path.

The fallback to Python scan_once correctly handles the interface, but if scan_once returns a negative index (line 600), the as usize cast at line 605 would produce a very large value, causing .nth() to return None and defaulting to s.len().

While negative indices from scan_once would indicate a bug in the Python code, an explicit check could provide a clearer error:

Optional defensive check
                             let end_char_idx: isize = tuple.as_slice()[1].try_to_value(vm)?;
+                            if end_char_idx < 0 {
+                                return Err(vm.new_value_error("scan_once returned negative index"));
+                            }
                             // For fallback, we need to calculate byte_idx from char_idx
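A quick illustration of the wrap-around this check guards against:

fn main() {
    let end_char_idx: isize = -1;
    // Casting a negative isize to usize wraps to a huge value rather than
    // failing, so a later .nth() would silently return None.
    assert_eq!(end_char_idx as usize, usize::MAX);
}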
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 13a875f and 7e51d21.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_json/test_decode.py is excluded by !Lib/**
📒 Files selected for processing (2)
  • crates/stdlib/src/json.rs
  • crates/stdlib/src/json/machinery.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style using cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/stdlib/src/json/machinery.rs
  • crates/stdlib/src/json.rs
🧬 Code graph analysis (2)
crates/stdlib/src/json/machinery.rs (1)
crates/stdlib/src/json.rs (1)
  • scanstring (702-732)
crates/stdlib/src/json.rs (1)
crates/stdlib/src/json/machinery.rs (3)
  • scanstring (140-252)
  • new (112-115)
  • len (123-128)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run rust tests (windows-2025)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (windows-2025)
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Run rust tests (macos-latest)
🔇 Additional comments (13)
crates/stdlib/src/json/machinery.rs (5)

33-33: LGTM on imports and public visibility change.

The memchr2 import enables efficient byte-scanning for the fast path, and making DecodeError::new public is necessary for error construction in the calling code.

Also applies to: 112-115


130-144: LGTM on function signature and documentation.

The documentation clearly explains the contract: the input slice starts after the opening quote, and char_offset enables accurate error position reporting. The triple return value (result, chars_consumed, bytes_consumed) elegantly supports the dual-index tracking pattern used throughout the PR.


145-168: LGTM on fast path implementation.

The optimization strategy is sound: use memchr2 to quickly locate the first " or \, and if the first match is the closing quote, the entire string content can be processed without escape handling. The control character check correctly scans for bytes < 0x20, and the character count properly uses code_points().count() to handle multi-byte sequences.


197-241: LGTM on escape sequence handling.

All standard JSON escape sequences are correctly handled, including the optional / escape. The error reporting correctly uses the character-based position (char_offset + char_i) for accurate Python-level error messages.
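For context, the standard single-character escape table looks roughly like this (a sketch; the PR's version works on WTF-8 chunks and reports error positions):

/// Map a single-character JSON escape (the byte after '\') to its value.
/// "\uXXXX" is handled separately; anything else is a decode error.
fn unescape(b: u8) -> Option<char> {
    Some(match b {
        b'"' => '"',
        b'\\' => '\\',
        b'/' => '/', // optional to emit, but accepted per the spec
        b'b' => '\u{0008}',
        b'f' => '\u{000C}',
        b'n' => '\n',
        b'r' => '\r',
        b't' => '\t',
        _ => return None,
    })
}

fn main() {
    assert_eq!(unescape(b'n'), Some('\n'));
    assert_eq!(unescape(b'/'), Some('/'));
    assert_eq!(unescape(b'x'), None);
}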


254-267: LGTM on decode_unicode implementation.

The function correctly parses 4 hex digits using bit shifting and properly handles errors when characters are missing or non-hexadecimal. The conversion from u16 to CodePoint correctly handles both BMP characters and surrogates.
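A minimal stand-in for that hex parsing (assumed shape; the real decode_unicode consumes a character iterator and reports positioned errors):

/// Parse exactly four hex digits into a u16, e.g. "00e9" -> 0x00E9.
fn decode_hex4(digits: &[u8]) -> Option<u16> {
    if digits.len() != 4 {
        return None;
    }
    let mut n: u16 = 0;
    for &d in digits {
        let v = (d as char).to_digit(16)? as u16;
        n = (n << 4) | v; // shift in one nibble at a time
    }
    Some(n)
}

fn main() {
    assert_eq!(decode_hex4(b"00e9"), Some(0x00E9));
    assert_eq!(decode_hex4(b"d83d"), Some(0xD83D)); // a lead surrogate
    assert_eq!(decode_hex4(b"00gq"), None);
}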

crates/stdlib/src/json.rs (8)

20-40: LGTM on helper functions.

The skip_whitespace function correctly handles all JSON whitespace characters, and the comment appropriately notes that ASCII whitespace characters are single-byte, making the byte count equal to the character count. The starts_with_bytes function is a clean utility for pattern matching.


92-188: LGTM on updated parse method.

The method correctly maintains dual byte/char indices throughout. String parsing delegates to machinery::scanstring with proper slice offset, and the nested object/array parsing is cleanly delegated to dedicated methods. The constant handling correctly relies on ASCII lengths equaling character counts.


226-386: LGTM on parse_object implementation.

The method correctly handles all object parsing edge cases including empty objects, missing delimiters, and trailing commas. The key memoization via HashMap<String, PyStrRef> is an effective optimization for JSON with repeated keys. Error messages are clear and position-accurate.


388-469: LGTM on parse_array implementation.

The array parsing correctly mirrors the object parsing structure with proper handling for empty arrays, element delimiters, and trailing comma detection. The recursive value parsing via call_scan_once maintains consistent index tracking.


471-503: LGTM on finalize_object implementation.

The method correctly implements the CPython JSON module's hook priority: object_pairs_hook takes precedence when present, otherwise a dict is built and optionally passed to object_hook. The behavior for duplicate keys (last value wins) matches CPython's json module; the JSON specification leaves duplicate-key handling to implementations.


619-629: LGTM on make_decode_error helper.

This provides a clean abstraction for creating decode errors within the JsonScanner methods, maintaining consistency with the existing py_decode_error function.


632-664: LGTM on Callable implementation.

The entry point correctly validates the index, computes the byte index from the character index (accepting the O(n) cost at the entry point), and properly handles the beyond-string-length case by returning StopIteration. This establishes the dual-index tracking that propagates through all nested parsing.


701-732: LGTM on scanstring function.

The function correctly bridges the Python API (character indices) with the Rust implementation (byte indices). The index conversion using code_point_indices().nth() handles multi-byte characters correctly, and the error handling for out-of-bounds indices is appropriate.


Comment on lines 212 to 227
    if let Some(lead) = uni.to_lead_surrogate() {
        // uni is a surrogate -- try to find its pair
        let mut chars2 = chars.clone();
-       if let Some(((pos2, _), (_, _))) = chars2
+       if let Some(((_, (byte_pos2, _)), (_, _))) = chars2
            .next_tuple()
            .filter(|((_, (_, c1)), (_, (_, c2)))| *c1 == '\\' && *c2 == 'u')
        {
-           let uni2 = decode_unicode(&mut chars2, pos2)?;
+           let uni2 =
+               decode_unicode(&mut chars2, char_offset + next_char_i + 1)?;
            if let Some(trail) = uni2.to_trail_surrogate() {
                // ok, we found what we were looking for -- \uXXXX\uXXXX, both surrogates
                uni = lead.merge(trail).into();
-               chunk_start = pos2 + 6;
+               chunk_start = byte_pos2 + 6;
                chars = chars2;
            }
        }

⚠️ Potential issue | 🟡 Minor

Error position for second surrogate escape may be misleading.

When parsing a surrogate pair like \uD800\uDC00, if the second escape is malformed, the error position passed to decode_unicode appears incorrect.

At line 219-220, the position char_offset + next_char_i + 1 evaluates to approximately char_offset + char_i + 2 (pointing into the first escape), whereas the actual error would be in the second escape starting around char_offset + char_i + 7.

This only affects error message positions, not parsing correctness, so it's a minor issue.

Suggested position fix

The position should account for the 6 characters consumed by the first \uXXXX plus the \ already consumed by next_tuple():

-                                let uni2 =
-                                    decode_unicode(&mut chars2, char_offset + next_char_i + 1)?;
+                                let uni2 =
+                                    decode_unicode(&mut chars2, char_offset + char_i + 7)?;

Alternatively, extract the character index from the tuple destructuring instead of computing it.
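For reference, the lead/trail merge that the accepted path performs can be sketched independently of RustPython's CodePoint types:

/// Combine a UTF-16 surrogate pair into the scalar value it encodes,
/// e.g. \uD83D\uDE00 -> U+1F600.
fn merge_surrogates(lead: u16, trail: u16) -> Option<char> {
    if !(0xD800..=0xDBFF).contains(&lead) || !(0xDC00..=0xDFFF).contains(&trail) {
        return None;
    }
    let scalar = 0x10000 + (((lead as u32 - 0xD800) << 10) | (trail as u32 - 0xDC00));
    char::from_u32(scalar)
}

fn main() {
    assert_eq!(merge_surrogates(0xD83D, 0xDE00), Some('😀'));
    assert_eq!(merge_surrogates(0xDC00, 0xD800), None); // wrong order
}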

🤖 Prompt for AI Agents
In `@crates/stdlib/src/json/machinery.rs` around lines 212 - 227, The error
position passed to decode_unicode for the second surrogate is wrong; adjust the
computed position so it points at the start of the second "\uXXXX" escape (not
inside the first). In the surrogate-pair branch around uni.to_lead_surrogate(),
change the position argument passed to decode_unicode from char_offset +
next_char_i + 1 to a value that accounts for the first "\uXXXX" (e.g.
char_offset + next_char_i + 7) or, better, use the character index/byte_pos
extracted from the tuple destructuring (the byte_pos2/tuple element) to compute
the exact offset and pass that into decode_unicode so error locations refer to
the second escape. Ensure chunk_start and chars updates remain consistent with
the new position calculation.

@moreal moreal changed the title Improve json.loads Improve json.loads performance Jan 14, 2026
moreal and others added 9 commits January 15, 2026 08:37
Parse JSON constants (null, true, false) directly in Rust within
call_scan_once() instead of falling back to Python scan_once.
This reduces Python-Rust boundary crossings for array/object values.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Parse JSON numbers starting with digits (0-9) directly in Rust within
call_scan_once() by reusing the existing parse_number() method.
This reduces Python-Rust boundary crossings for array/object values.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Parse special JSON constants (NaN, Infinity, -Infinity) and negative
numbers directly in Rust within call_scan_once(). This handles:
- 'N' -> NaN via parse_constant callback
- 'I' -> Infinity via parse_constant callback
- '-' -> -Infinity or negative numbers via parse_constant/parse_number

This reduces Python-Rust boundary crossings for array/object values.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
