
Decouple operator statistics propagation from traversal/caching via statistics_from_inputs #22958

@asolimando

Is your feature request related to a problem or challenge?

The current statistics_with_args / StatisticsArgs design (#21815) embeds cache lookup
and child traversal directly inside each operator's statistics_with_args override via args.compute_child_statistics(...). This means:

  • Each operator must be aware of caching mechanics, coupling local propagation logic to the traversal strategy
  • Evolving the traversal or cache model requires touching every operator implementation

Describe the solution you'd like

Introduce a stateless statistics_from_inputs method on ExecutionPlan (defaulting to Statistics::new_unknown) that expresses only local propagation logic from pre-computed child statistics:

fn statistics_from_inputs(
    &self,
    input_stats: &[Arc<Statistics>],
    partition: Option<usize>,
) -> Result<Arc<Statistics>> {
    Ok(Arc::new(Statistics::new_unknown(self.schema().as_ref())))
}

The external StatisticsContext owns traversal and cache management, calling statistics_from_inputs after resolving child statistics. statistics_with_args remains the public API and is unchanged.
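
To make the split of responsibilities concrete, here is a minimal, self-contained sketch. It does not use DataFusion: `Stats`, `Plan`, `StatsContext`, `Scan`, `Limit`, `UnionAll`, and the pointer-keyed cache are hypothetical stand-ins chosen for illustration, and the real `StatisticsContext` may be structured quite differently. The operators contain only local propagation logic, while the context alone performs the bottom-up traversal and caching.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical, simplified stand-in for `Statistics` (row count only).
#[derive(Debug, Clone, PartialEq)]
struct Stats {
    num_rows: Option<usize>,
}

/// Hypothetical, simplified stand-in for `ExecutionPlan`.
trait Plan {
    fn children(&self) -> Vec<Arc<dyn Plan>>;

    /// Local propagation only: no traversal, no cache access.
    /// The default mirrors the "unknown statistics" fallback.
    fn statistics_from_inputs(&self, _input_stats: &[Arc<Stats>]) -> Arc<Stats> {
        Arc::new(Stats { num_rows: None })
    }
}

struct Scan { rows: usize }
impl Plan for Scan {
    fn children(&self) -> Vec<Arc<dyn Plan>> { vec![] }
    fn statistics_from_inputs(&self, _input_stats: &[Arc<Stats>]) -> Arc<Stats> {
        Arc::new(Stats { num_rows: Some(self.rows) })
    }
}

struct Limit { input: Arc<dyn Plan>, fetch: usize }
impl Plan for Limit {
    fn children(&self) -> Vec<Arc<dyn Plan>> { vec![Arc::clone(&self.input)] }
    fn statistics_from_inputs(&self, input_stats: &[Arc<Stats>]) -> Arc<Stats> {
        // Row count is capped by `fetch`; unknown stays unknown.
        let num_rows = input_stats[0].num_rows.map(|n| n.min(self.fetch));
        Arc::new(Stats { num_rows })
    }
}

struct UnionAll { inputs: Vec<Arc<dyn Plan>> }
impl Plan for UnionAll {
    fn children(&self) -> Vec<Arc<dyn Plan>> { self.inputs.clone() }
    fn statistics_from_inputs(&self, input_stats: &[Arc<Stats>]) -> Arc<Stats> {
        // Sum of child row counts; `None` if any child is unknown.
        let num_rows = input_stats.iter().map(|s| s.num_rows).sum::<Option<usize>>();
        Arc::new(Stats { num_rows })
    }
}

/// Owns traversal and caching. Operators never see this type.
#[derive(Default)]
struct StatsContext {
    /// Cache keyed by node address (an assumption made for this sketch).
    cache: HashMap<usize, Arc<Stats>>,
    /// Counts how often local propagation actually ran.
    local_calls: usize,
}

impl StatsContext {
    fn compute(&mut self, plan: &Arc<dyn Plan>) -> Arc<Stats> {
        let key = Arc::as_ptr(plan) as *const () as usize;
        if let Some(hit) = self.cache.get(&key) {
            return Arc::clone(hit);
        }
        // Resolve children first, then hand their statistics to the operator.
        let inputs: Vec<Arc<Stats>> =
            plan.children().iter().map(|c| self.compute(c)).collect();
        self.local_calls += 1;
        let stats = plan.statistics_from_inputs(&inputs);
        self.cache.insert(key, Arc::clone(&stats));
        stats
    }
}

/// Builds a plan in which one scan is shared by two limits under a union.
fn demo() -> (Option<usize>, usize) {
    let scan: Arc<dyn Plan> = Arc::new(Scan { rows: 100 });
    let a: Arc<dyn Plan> = Arc::new(Limit { input: Arc::clone(&scan), fetch: 10 });
    let b: Arc<dyn Plan> = Arc::new(Limit { input: Arc::clone(&scan), fetch: 500 });
    let root: Arc<dyn Plan> = Arc::new(UnionAll { inputs: vec![a, b] });
    let mut ctx = StatsContext::default();
    let stats = ctx.compute(&root);
    (stats.num_rows, ctx.local_calls)
}
```

In this sketch, `demo()` visits four distinct nodes, and the shared scan is propagated only once because its second visit is served from the cache. Replacing the cache key or the traversal order would require changes to `StatsContext` only, which is the decoupling this issue asks for.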

Benefits

  • Non-breaking: statistics_from_inputs has a safe default; statistics_with_args is unchanged
  • Operators that override statistics_from_inputs automatically benefit from any future
    improvements to the traversal/caching strategy without code changes
  • Operators become easier to test in isolation (no need to construct StatisticsArgs or
    a plan tree)
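
The testability point can be sketched as follows. This is again a self-contained illustration rather than DataFusion code: `Stats`, `Plan`, `CrossJoin`, the `stats` helper, and the `String` error type are hypothetical, and the signature only loosely mirrors the proposed one. The operator is exercised with hand-built input statistics, with no plan tree, cache, or `StatisticsArgs` involved.

```rust
use std::sync::Arc;

/// Hypothetical, simplified stand-in for `Statistics` (row count only).
#[derive(Debug, Clone, PartialEq)]
struct Stats {
    num_rows: Option<usize>,
}

/// Hypothetical trait carrying only the proposed method.
trait Plan {
    fn statistics_from_inputs(
        &self,
        input_stats: &[Arc<Stats>],
        partition: Option<usize>,
    ) -> Result<Arc<Stats>, String>;
}

/// Hypothetical operator: output rows = left rows * right rows.
struct CrossJoin;

impl Plan for CrossJoin {
    fn statistics_from_inputs(
        &self,
        input_stats: &[Arc<Stats>],
        _partition: Option<usize>,
    ) -> Result<Arc<Stats>, String> {
        let [left, right] = input_stats else {
            return Err(format!("expected 2 inputs, got {}", input_stats.len()));
        };
        let num_rows = match (left.num_rows, right.num_rows) {
            // Overflow degrades to "unknown" instead of panicking.
            (Some(l), Some(r)) => l.checked_mul(r),
            _ => None,
        };
        Ok(Arc::new(Stats { num_rows }))
    }
}

/// Test helper: wraps a row count in the expected input type.
fn stats(num_rows: Option<usize>) -> Arc<Stats> {
    Arc::new(Stats { num_rows })
}

/// Exercises the operator directly, without building a plan tree.
fn check_cross_join() {
    let op = CrossJoin;

    let known = op
        .statistics_from_inputs(&[stats(Some(3)), stats(Some(4))], None)
        .unwrap();
    assert_eq!(known.num_rows, Some(12));

    let unknown = op
        .statistics_from_inputs(&[stats(Some(3)), stats(None)], None)
        .unwrap();
    assert_eq!(unknown.num_rows, None);

    assert!(op.statistics_from_inputs(&[stats(Some(3))], None).is_err());
}
```

Because the method is stateless, each case is a plain function call on literal inputs, so edge cases such as unknown or overflowing row counts can be covered without any traversal setup.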

Describe alternatives you've considered

Keep the current statistics_with_args design as-is, with each operator handling caching via args.compute_child_statistics(...). This works correctly, but it tightly couples propagation logic to traversal mechanics, which makes the cache model hard to evolve.

Additional context

Suggested by @2010YOUY01 in this comment as a follow-up for #21815

Labels: enhancement
