fs: add recursive opendir/readdir#41439
Conversation
There was a problem hiding this comment.
The proposed behavior looks good to me, although the output design can be changed for the sake of easier control and management over the output for developers, but the current output design can also stay if desired.
Imagine a file structure as the following:
foo
|_ bar
|_ baz
|_ a.txt
|_ fizz
|_ b.txt
|_ fuzz
|_ c.txt
|_ d.txtDesign 1 (The proposed design aka the current design, only resolve paths to files):
import { readdirSync } from 'node:fs';
const output = readdirSync('./foo', { recursive: true });
console.log(output);
// ['bar/a.txt', 'fizz/b.txt', 'fuzz/c.txt', 'd.txt']Design 2 (Resolve paths to both directories and files):
import { readdirSync } from 'node:fs';
const output = readdirSync('./foo', { recursive: true });
console.log(output);
// ['bar', 'bar/baz', 'bar/a.txt', 'fizz', 'fizz/b.txt', 'fuzz', 'fuzz/c.txt', 'd.txt']Design 3 (Instead of returning an array of paths, return array of objects with information to each recursively read directories):
import { readdirSync } from 'node:fs';
const output = readdirSync('./foo', { recursive: true });
console.log(output);
/*
[
{
type: 'directory',
name: 'bar',
children: [
{
type: 'directory',
name: 'baz',
children: []
},
{
type: 'file',
name: 'a.txt'
}
]
},
{
type: 'directory',
name: 'fizz',
children: [
{
type: 'file',
name: 'b.txt'
}
]
},
{
type: 'directory',
name: 'fuzz',
children: [
{
type: 'file',
name: 'c.txt'
}
]
},
{
type: 'file',
name: 'd.txt'
}
]
*/Also remember to fix the formatting issues reported by our linter(s) :).
tniessen
left a comment
There was a problem hiding this comment.
Since the result is potentially huge and the process itself might take virtually forever, especially on network drives etc, we should probably encourage streaming APIs instead of returning arrays. If we just return arrays, the process might crash after hours or so when the array allocations run into an out-of-memory situation, never having produced any results.
|
Can you fix linting please? It's hard to review with all the lint comments. Thanks! |
|
I am wondering if it wouldn't be nice for users to change the recursive option to be either await fs.readdir('.', { recursive: 'breadth-first' }) // ['a', 'b', 'a/x']
await fs.readdir('.', { recursive: 'depth-first' }) // ['a', 'a/x', b']
await fs.readdir('.', { recursive: true }) // throw new Error('Specify recursive as either "breadth-first" or "depth-first")
A Considering |
|
Thanks for all the great feedback folks. I'm going to update the tests to exemplify an API design for this feature that includes things like:
|
|
We also discussed on the call to keep this initial implementation simple. Omit things like a I personally would like there to be some sort of default behavior when using just |
I would prefer to also see |
|
Summary of recent updates: Default case uses just I included examples with the I included another way to configure the {
result?: 'stream',
algorithm?: 'depth-first' | 'breadth-first'
}Unless folks can come up with another I did not update the promise tests - want to focus on the API shape for now. I did not include async tests for all of the sync operations. After API agreement I'll add those in. I did not include error case tests yet. Will think about that more later |
|
When implementing a readdir stream a while back I had to do Different thing: Would it be a good idea to support abort signals? |
bcoe
left a comment
There was a problem hiding this comment.
Tests are looking quite clean, good start. Added a recommendation that we add a test for symlink loops.
It might be worth looking at the test suite for fs-extra copy/recursive, they have a bunch of test cases around symlink behavior that we might be able to use for inspiration.
If i'm not mistaken the
Yes, for the async operations we could do this |
|
After some consideration, I think supporting alternative algorithms is going to make this PR more complicated than I can handle at this time. What would be the best way to defer the algorithm feature of this PR? One option: I can put the test files I've written for them into a new branch and open an issue to track. @bcoe @bnb |
|
imo getting a basic implementation in and expanding on that with additional algorithms in later PRs would be fine. |
|
See linked issue #41709 for |
|
Also for the sake of initial implementation, I'm dropping the promise API. I'd like to see us land the sync/async versions and then progress from there |
|
@Ethan-Arrowood What about the concerns raised in #41439 (review) and #41439 (review) ? |
|
Ahh I missed those somehow. My initial thought on the discussion is it seems like a fine idea. Focus on a stream based recursive readdir, and then also add a recursive option to opendir that would return an array. |
|
Okay new API design for initial implementation:
Future work:
What do we think? |
Does it mean that |
|
Good catch. I meant that the callback will be passed the same readable that is returned by the sync version |
Would anything really async happen before the stream is constructed and returned in the callback version? |
|
Not sure. I guess it'd be the difference between opendir and opendirSync under the hood. The promise version of this is simple, I plan on using an async iterable. The sync and callback versions I haven't given tooo much thought beyond implementing a Readable and using _read to exhaust recursive calls to something. |
|
Then why don't we make |
|
Ah i may be lost now. opendir just returns a Dir (via callback, promise, or sync) that's an async iterable of Dirents. Are you suggesting that when recursive is enabled, the async iterable will return everything in it as Dirents? |
MoLow
left a comment
There was a problem hiding this comment.
Amazing 🎉 , I will wait a few hours before landing this to make sure no-one has objections to the recent changes.
|
Landed in 7b39e80 |
|
Congrats @Ethan-Arrowood 🎉 |
|
Thank you thank you! |
|
WOOOO CONGRATS @Ethan-Arrowood 🎊 |
Adds a naive, linear recursive algorithm for the following methods: readdir, readdirSync, opendir, opendirSync, and the promise based equivalents. Fixes: #34992 PR-URL: #41439 Refs: nodejs/tooling#130 Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: Matteo Collina <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Moshe Atlow <[email protected]>
|
Wow, this looks like a big feature 👀 😲 Does this represent a built-in replacement for some use cases of the following libraries? (npm trends)
|
|
Yes it does! It may not be 1-to-1 replacements but it achieves similar functionality with all of those |
Implements a recursive option for the
readdirandreaddirSyncmethods.Closes: #34992
Ref: nodejs/tooling#130
Starting with tests to just get a sense for how things should behave. Will dive into the implementation next. Any feedback/concerns is welcome at any time.