Skip to content

[CELEBORN-2317] Validate applicationId to prevent worker path traversal#3674

Open
afterincomparableyum wants to merge 1 commit intoapache:mainfrom
afterincomparableyum:celeborn-2317
Open

[CELEBORN-2317] Validate applicationId to prevent worker path traversal#3674
afterincomparableyum wants to merge 1 commit intoapache:mainfrom
afterincomparableyum:celeborn-2317

Conversation

@afterincomparableyum
Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

This change:

  • Adds Utils.validateAppId enforcing ^[A-Za-z0-9_-]+$, which matches Spark (application_<ts>_<n>, local-<ts>), Flink, and MR formats.
  • Calls it at every worker RPC entry point that takes an applicationId or shuffleKey from the wire: Controller (ReserveSlots, CommitFiles, DestroyWorkerSlots), PushDataHandler.handleCore, and the two checkAuth sites in FetchHandler.
  • Adds a canonical path containment check in StorageManager.createDiskFile (local disk branch) as defense in depth, before mkdirs() runs.

Why are the changes needed?

The worker builds local shuffle paths by concatenating applicationId received over RPC: <workingDir>/<appId>/<shuffleId>/<fileName>. The int32 and fileName is built from int id/epoch/mode, but it was never validated against a charset. With auth disabled, any client on the network could supply appId = "../foo" and have the worker mkdir, create, or delete files outside its working dir. With auth enabled, the SASL clientId == applicationId equality check in RpcEndpoint did not constrain format, so a tenant whose registered id contained .. could still escape.

Does this PR resolve a correctness bug?

Yes

Does this PR introduce any user-facing change?

No

How was this patch tested?

Unit tests/CI

@afterincomparableyum
Copy link
Copy Markdown
Contributor Author

I'll fix CI issues and push

The worker builds local shuffle paths by concatenating applicationId received over RPC: `<workingDir>/<appId>/<shuffleId>/<fileName>`. The int32 and fileName is built from int id/epoch/mode, but it was never validated against a charset. With auth disabled, any client on the network could supply `appId = "../foo"` and have the worker mkdir, create, or delete files outside its working dir. With auth enabled, the SASL clientId == applicationId equality check in RpcEndpoint did not constrain format, so a tenant whose registered id contained `..` could still escape.

This change:
  - Adds Utils.validateAppId enforcing `^[A-Za-z0-9_-]+$`, which matches
    Spark (`application_<ts>_<n>`, `local-<ts>`), Flink, and MR formats.
  - Calls it at every worker RPC entry point that takes an applicationId
    or shuffleKey from the wire: Controller (ReserveSlots, CommitFiles,
    DestroyWorkerSlots), PushDataHandler.handleCore, and the two
    checkAuth sites in FetchHandler.
  - Adds a canonical path containment check in
    StorageManager.createDiskFile (local disk branch) as defense in
    depth, before mkdirs() runs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant