[SPARK-54923][SQL] Add configuration for directory traversal depth before parallel partition discovery#54787
Open
xiaoxuandev wants to merge 1 commit into apache:master from
### What changes were proposed in this pull request?
Adds a new configuration `spark.sql.sources.parallelPartitionDiscovery.traversalDepth` that controls how many directory levels the driver sequentially expands before delegating to parallel workers during partition discovery.
For deep-narrow directory hierarchies (e.g., `root/subpath/{year}/{month}/`), the top levels have very few children, so parallelizing at level 1 creates too few tasks. This config allows the driver to sequentially traverse N-1 levels first, collecting enough paths to parallelize effectively.
Key design decisions:
- `traversalDepth=1` (default) preserves behavior identical to the original code
- Expanded intermediate paths use `isRootLevel=false` to preserve SPARK-27676 semantics
- A `tolerateDisappearance` flag handles the race condition where directories confirmed during sequential traversal vanish before parallel listing reaches them, without propagating to recursive subdirectory calls
- FileStatusCache keys are re-mapped to original root paths so the cache remains effective when `traversalDepth > 1`
- Sequential traversal filters hidden/underscore directories but preserves `_metadata` and `_common_metadata`
- Mixed directories (containing both files and subdirectories) stop expansion to avoid missing sibling files
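The sequential pre-expansion described above can be sketched as follows. This is an illustrative model only, not the PR's actual `HadoopFSUtils` code: the object and method names are hypothetical, and an in-memory map stands in for a real file system listing.

```scala
// Hypothetical sketch of driver-side sequential expansion before
// parallel listing. `children` maps a directory path to its child
// directories; a missing key models a leaf (or mixed) directory.
object TraversalSketch {
  def expand(children: Map[String, Seq[String]],
             roots: Seq[String],
             traversalDepth: Int): Seq[String] = {
    var frontier = roots
    var level = 1
    // Expand up to traversalDepth - 1 levels; depth 1 is a no-op.
    while (level < traversalDepth) {
      frontier = frontier.flatMap { p =>
        children.get(p) match {
          case Some(cs) if cs.nonEmpty =>
            // Filter hidden/underscore directories, but keep the
            // special _metadata and _common_metadata names.
            cs.filter { c =>
              val name = c.split('/').last
              !name.startsWith(".") &&
                (!name.startsWith("_") ||
                  name == "_metadata" || name == "_common_metadata")
            }
          case _ => Seq(p) // leaf or mixed dir: stop expanding it
        }
      }
      level += 1
    }
    frontier
  }
}
```

With a `root/subpath/{year}` layout and `traversalDepth = 3`, two levels are expanded on the driver, so each year directory becomes its own path for the parallel listing job instead of a single task covering the whole tree.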
### Why are the changes needed?
When tables have deep-narrow directory structures (e.g., a single intermediate path like `root/subpath/` before the partition columns), the parallel partition discovery creates only one task for the single root path. This means the entire directory tree is listed sequentially on a single executor, negating the benefit of parallel listing. By pre-expanding a few levels on the driver, we can distribute the work across multiple tasks.
### Does this PR introduce any user-facing change?
Yes. A new SQL configuration `spark.sql.sources.parallelPartitionDiscovery.traversalDepth` is available. Default value is 1 (no behavior change). Setting it to N causes the driver to sequentially expand up to N-1 directory levels before parallelizing.
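For example, a user with a deep-narrow table layout could raise the depth before reading (the value `3` and the path here are illustrative):

```scala
// Illustrative usage: expand 2 directory levels on the driver
// before handing paths to the parallel listing job.
spark.conf.set("spark.sql.sources.parallelPartitionDiscovery.traversalDepth", "3")
spark.read.parquet("/path/to/root/subpath/")
```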
### How was this patch tested?
- 11 new tests in `HadoopFSUtilsDeepNarrowSuite` covering: default behavior, single/multi-level expansion, leaf-before-depth, wide hierarchies, missing directories, mixed files+dirs, hidden directory filtering, `_metadata`/`_common_metadata` preservation, consistency across depths, and multiple input roots
- 2 new tests in `FileIndexSuite` covering: end-to-end `traversalDepth` with `InMemoryFileIndex` and config validation (rejects values < 1)
### Was this patch authored or co-authored using generative AI tooling?
Yes, Opus 4.6
rmannibucau approved these changes on Mar 14, 2026