Skip to content

buffer: re-enable Fast API for Buffer.write #54526

Merged
nodejs-github-bot merged 1 commit intonodejs:mainfrom
nxtedition:fast-utf8-2
Sep 4, 2024
Merged

buffer: re-enable Fast API for Buffer.write #54526
nodejs-github-bot merged 1 commit intonodejs:mainfrom
nxtedition:fast-utf8-2

Conversation

@ronag
Copy link
Member

@ronag ronag commented Aug 23, 2024

Re-enables fast Fast API for Buffer.write after fixing UTF8 handling.

13:05:25 buffers/buffer-write-string-short.js n=1000000 len=1 encoding='ascii'                            ***    266.52 %       ±7.27%  ±9.75% ±12.84%
13:05:25 buffers/buffer-write-string-short.js n=1000000 len=1 encoding='latin1'                           ***    201.16 %       ±4.74%  ±6.33%  ±8.29%
13:05:25 buffers/buffer-write-string-short.js n=1000000 len=1 encoding='utf8'                             ***    242.07 %       ±6.29%  ±8.44% ±11.14%
13:05:25 buffers/buffer-write-string-short.js n=1000000 len=16 encoding='ascii'                           ***    264.81 %       ±5.88%  ±7.86% ±10.29%
13:05:25 buffers/buffer-write-string-short.js n=1000000 len=16 encoding='latin1'                          ***    197.28 %       ±6.96%  ±9.32% ±12.26%
13:05:25 buffers/buffer-write-string-short.js n=1000000 len=16 encoding='utf8'                            ***    349.02 %       ±8.80% ±11.84% ±15.68%
13:05:25 buffers/buffer-write-string-short.js n=1000000 len=32 encoding='ascii'                           ***    271.75 %       ±8.70% ±11.69% ±15.43%
13:05:25 buffers/buffer-write-string-short.js n=1000000 len=32 encoding='latin1'                          ***    214.58 %       ±5.85%  ±7.84% ±10.32%
13:05:25 buffers/buffer-write-string-short.js n=1000000 len=32 encoding='utf8'                            ***    351.17 %       ±7.64% ±10.28% ±13.60%
13:05:25 buffers/buffer-write-string-short.js n=1000000 len=8 encoding='ascii'                            ***    273.30 %       ±8.65% ±11.63% ±15.38%
13:05:25 buffers/buffer-write-string-short.js n=1000000 len=8 encoding='latin1'                           ***    191.62 %       ±5.62%  ±7.52%  ±9.88%
13:05:25 buffers/buffer-write-string-short.js n=1000000 len=8 encoding='utf8'                             ***    224.55 %       ±4.72%  ±6.33%  ±8.33%

@nodejs-github-bot nodejs-github-bot added buffer Issues and PRs related to the buffer subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. labels Aug 23, 2024
@ronag ronag force-pushed the fast-utf8-2 branch 2 times, most recently from a930ad4 to 0fda693 Compare August 23, 2024 14:45
@ronag ronag marked this pull request as ready for review August 23, 2024 14:46
@ronag ronag force-pushed the fast-utf8-2 branch 2 times, most recently from 23f1b23 to 8be5c43 Compare August 23, 2024 14:50
{
let i = 0;

while (i < 1_000_000) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a non-deterministic test is unpleasant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can trigger fast api call using the method @targos implemented in his previous pull requests.

Copy link
Member

@lemire lemire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks to me. I wish we could use a deterministic test. Having to loop a million times to trigger and issue is both slow and error prone.

@codecov
Copy link

codecov bot commented Aug 23, 2024

Codecov Report

Attention: Patch coverage is 88.88889% with 8 lines in your changes missing coverage. Please review.

Project coverage is 87.60%. Comparing base (981c701) to head (0ae2ae1).
Report is 252 commits behind head on main.

Files with missing lines Patch % Lines
src/node_buffer.cc 88.88% 3 Missing and 5 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #54526      +/-   ##
==========================================
- Coverage   87.61%   87.60%   -0.01%     
==========================================
  Files         650      650              
  Lines      182835   182889      +54     
  Branches    35382    35389       +7     
==========================================
+ Hits       160185   160227      +42     
- Misses      15925    15927       +2     
- Partials     6725     6735      +10     
Files with missing lines Coverage Δ
src/node_buffer.cc 71.41% <88.88%> (+1.46%) ⬆️

... and 32 files with indirect coverage changes

@ronag ronag added the request-ci Add this label to start a Jenkins CI on a PR. label Aug 23, 2024
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Aug 23, 2024
@nodejs-github-bot

This comment was marked as outdated.

@ronag ronag added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Aug 23, 2024
@ronag ronag mentioned this pull request Aug 23, 2024
Copy link
Member

@jasnell jasnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some lint issues but otherwise LGTM

@ronag ronag added the request-ci Add this label to start a Jenkins CI on a PR. label Aug 24, 2024
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Aug 24, 2024
@nodejs-github-bot

This comment was marked as outdated.

@blexrob
Copy link

blexrob commented Aug 24, 2024

With the current commits, the following test will fail:

let i = 0;
const testStr = "\xc2\x80"; // when stored in 1-byte chars, this is a valid UTF-8 encoding
const expected = Buffer.from(testStr).toString("hex");
for(; i < 1_000_000; i++) {
  const buf = Buffer.from(testStr);
  const ashex = buf.toString("hex");
  if (ashex !== expected) {
    console.log(`Decoding changed in iteration ${i} when changing to FastWriteStringUTF8, got ${ashex}, expected ${expected}`);
    break;
  }
}

if(i<1_000_000) {
  console.error("FAILED after %d iterations",i);
} else
  console.log("PASSED after %d iterations",i);

The source string is stored as UTF16 (compressed to 1 byte), but still needs to be converted to UTF-8. The fast path can only be implemented as memcpy when only characters from the range 0-127 are present.

@ronag
Copy link
Member Author

ronag commented Aug 24, 2024

The source string is stored as UTF16 (compressed to 1 byte), but still needs to be converted to UTF-8. The fast path can only be implemented as memcpy when only characters from the range 0-127 are present.

I thought FastOneByteString means that it's only one byte chars?

@ronag
Copy link
Member Author

ronag commented Aug 24, 2024

< when only characters from the range 0-127 are present

@lemira does simdutf have this? validate_ascii?

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

Commit Queue failed
- Loading data for nodejs/node/pull/54526
✔  Done loading data for nodejs/node/pull/54526
----------------------------------- PR info ------------------------------------
Title      buffer: validate UTF8 on fast path  (#54526)
Author     Robert Nagy <[email protected]> (@ronag)
Branch     ronag:fast-utf8-2 -> nodejs:main
Labels     buffer, c++, author ready, needs-ci
Commits    1
 - buffer: validate UTF8 on fast path
Committers 1
 - Robert Nagy <[email protected]>
PR-URL: https://github.com/nodejs/node/pull/54526
Fixes: https://github.com/nodejs/node/issues/54521
Reviewed-By: Daniel Lemire <[email protected]>
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Paolo Insogna <[email protected]>
------------------------------ Generated metadata ------------------------------
PR-URL: https://github.com/nodejs/node/pull/54526
Fixes: https://github.com/nodejs/node/issues/54521
Reviewed-By: Daniel Lemire <[email protected]>
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Paolo Insogna <[email protected]>
--------------------------------------------------------------------------------
   ℹ  This PR was created on Fri, 23 Aug 2024 14:34:31 GMT
   ✔  Approvals: 6
   ✔  - Daniel Lemire (@lemire): https://github.com/nodejs/node/pull/54526#pullrequestreview-2257470621
   ✔  - Benjamin Gruenbaum (@benjamingr) (TSC): https://github.com/nodejs/node/pull/54526#pullrequestreview-2273156047
   ✔  - James M Snell (@jasnell) (TSC): https://github.com/nodejs/node/pull/54526#pullrequestreview-2258976699
   ✔  - Anna Henningsen (@addaleax): https://github.com/nodejs/node/pull/54526#pullrequestreview-2258958504
   ✔  - Yagiz Nizipli (@anonrig) (TSC): https://github.com/nodejs/node/pull/54526#pullrequestreview-2272343097
   ✔  - Paolo Insogna (@ShogunPanda) (TSC): https://github.com/nodejs/node/pull/54526#pullrequestreview-2260113366
   ✘  Last GitHub CI failed
   ℹ  Last Full PR CI on 2024-08-30T13:34:03Z: https://ci.nodejs.org/job/node-test-pull-request/61707/
   ℹ  Last Benchmark CI on 2024-08-30T10:28:48Z: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/1626/
- Querying data for job/node-test-pull-request/61707/
   ✔  Last Jenkins CI successful
--------------------------------------------------------------------------------
   ✔  Aborted `git node land` session in /home/runner/work/node/node/.ncu
https://github.com/nodejs/node/actions/runs/10637813307

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@avivkeller
Copy link
Member

avivkeller commented Sep 1, 2024

Given the repeated activity since the failed commit, I assume it is known that the commit-queue originally failed, so I removed the label. Feel free to re-add.

@nodejs-github-bot
Copy link
Collaborator

@avivkeller
Copy link
Member

Correct me if I'm wrong, but shouldn't the "PR-URL" meta be added by the bot?

@targos
Copy link
Member

targos commented Sep 2, 2024

Correct me if I'm wrong, but shouldn't the "PR-URL" meta be added by the bot?

You're right. @ronag please remove the PR-URL from the commit.

Re-enables fast Fast API for Buffer.write after fixing
UTF8 handling.

Fixes: nodejs#54521
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

CI: https://ci.nodejs.org/job/node-test-pull-request/61926/

@nodejs-github-bot
Copy link
Collaborator

Landed in dc74f17

@RafaelGSS RafaelGSS mentioned this pull request Sep 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

author ready PRs that have at least one approval, no pending requests for changes, and a CI started. buffer Issues and PRs related to the buffer subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. dont-land-on-v20.x PRs that should not land on the v20.x-staging branch and should not be released in v20.x. needs-ci PRs that need a full CI run.

Projects

None yet

Development

Successfully merging this pull request may close these issues.