commit-reach: terminate merge-base walk when one side is exhausted #2149

Draft
spkrka wants to merge 6 commits into gitgitgadget:master from spkrka:side-exhaust-pr

spkrka commented Jun 13, 2026

Hi,

This follows up on my RFC [1] with a concrete proposal. I expect the
design to still be scrutinized, but that may be easier with actual code
to look at.

I tried to make this easier to review by splitting it into atomic
patches. The first two patches are the largest, though they are
pure refactoring. The behavior change is in patch 3 and is itself
quite small. The last patch adds technical documentation to support
future development.


Optimize paint_down_to_common() for merge-base queries that hit
large one-sided histories.

When the walk from one side reaches a commit with a very low
generation number that the other side never paints, the walk is
forced to drain most of the graph. A common trigger is a
repository import that grafts a separate history with its own root,
but any merge that introduces a low-generation commit never painted
by the other side has the same effect.

A new merge-base candidate can only be discovered when exclusive
PARENT1 and PARENT2 paint meet. This series teaches
paint_down_to_common() to stop as soon as one side has no exclusive
commits left in the queue; once one side is exhausted, no further
candidates can appear.

  origin/HEAD  o   o  PR HEAD
               |   |
     (import)  o   :
              / \ /
             |   o  merge-base
             |   |
             :   :  (~2.5M commits)
             |   |
  import root   main root

In the RFC thread [1], Derrick Stolee provided a criss-cross
counterexample that sharpened the halt condition, and Elijah Newren
independently discovered the same optimization and shared an
implementation in PR #2150 [2]. Patch 4 incorporates test cases
from Elijah's branch.

This series implements the optimization only after the walk enters
the finite-generation region, where generation ordering guarantees
that paint on visited commits is final.
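
The halt condition can be illustrated with a toy model of the walk. The sketch below is not git's C implementation: `paint_down`, `import_scenario`, and the commit names are hypothetical, the queue is recounted by scanning on every step (the series keeps counters instead), and every commit is assumed to have a finite generation number. It only aims to show why stopping once one side has no exclusive commits queued leaves the result unchanged on an import-shaped graph.

```python
import heapq

PARENT1, PARENT2, STALE = 1, 2, 4
BOTH = PARENT1 | PARENT2


def paint_down(parents, generation, one, two, stop_when_exhausted):
    """Toy merge-base walk. Returns (merge_bases, commits_popped)."""
    flags, heap, queued = {}, [], set()

    def paint(commit, new_flags):
        old = flags.get(commit, 0)
        if old | new_flags == old:
            return  # nothing new to record
        flags[commit] = old | new_flags
        if commit not in queued:
            queued.add(commit)
            # Highest generation first; the name only breaks ties.
            heapq.heappush(heap, (-generation[commit], commit))

    paint(one, PARENT1)
    paint(two, PARENT2)
    bases, popped = [], 0
    while heap:
        in_queue = [flags[c] for c in queued]
        side1 = in_queue.count(PARENT1)
        side2 = in_queue.count(PARENT2)
        pending = in_queue.count(BOTH)
        if side1 + side2 + pending == 0:
            break  # only stale commits remain (the pre-existing stop condition)
        if stop_when_exhausted and pending == 0 and (side1 == 0 or side2 == 0):
            break  # one side is exhausted, so exclusive paint can no longer meet
        _, commit = heapq.heappop(heap)
        queued.discard(commit)
        popped += 1
        current = flags[commit]
        if current == BOTH:
            bases.append(commit)
            current |= STALE  # ancestors of a merge base are not candidates
        for parent in parents[commit]:
            paint(parent, current)
    return bases, popped


def import_scenario(main_length=50, import_length=3):
    """Long main history plus a short imported history with its own root."""
    parents, generation = {}, {}

    def add(name, *ps):
        parents[name] = list(ps)
        generation[name] = 1 + max((generation[p] for p in ps), default=0)

    for prefix, length in (("m", main_length), ("i", import_length)):
        add(f"{prefix}1")
        for k in range(2, length + 1):
            add(f"{prefix}{k}", f"{prefix}{k - 1}")
    add("merge", f"i{import_length}", f"m{main_length}")  # import merged in
    add("origin", "merge")  # stands in for origin/HEAD
    add("pr", f"m{main_length}")  # stands in for PR HEAD
    return parents, generation


parents, generation = import_scenario()
slow = paint_down(parents, generation, "origin", "pr", stop_when_exhausted=False)
fast = paint_down(parents, generation, "origin", "pr", stop_when_exhausted=True)
print(slow, fast)
```

On this 56-commit toy graph the unmodified loop pops 55 commits and the early stop pops 4; both report `m50` as the only merge base.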

Patch layout:

1/6 commit-reach: decouple ahead_behind from nonstale_queue
2/6 commit-reach: introduce paint_queue and per-side counters
3/6 commit-reach: stop the walk when one side is exhausted
4/6 t6600: add side-exhaustion edge-case tests
5/6 t6099, t6600: add side-exhaustion regression tests
6/6 Documentation/technical: document paint_down_to_common()

Benchmarks

Measured on a 2.6M-commit monorepo with commit-graph (baseline
v2.55-rc1):

merge-base --all  (across import)       4.293s ->    8ms  (537x)
merge-tree        (across import)       5.345s ->   13ms  (411x)
merge-base --all  (1000 commits apart)  5.404s ->    7ms  (772x)

No regression on linux.git (1.4M commits, commit-graph):

merge-base HEAD HEAD~1000                 38ms ->   40ms
merge-base --all HEAD HEAD~1000           87ms ->   36ms
merge-base --is-ancestor HEAD~1000 HEAD   11ms ->   11ms
merge-base --all HEAD HEAD~10000         626ms ->  428ms

[1] https://lore.kernel.org/git/CAL71e4Ps-2_0+uuZu43N9pFnXBemoAohPs_eyRJf8taXHJPAXQ@mail.gmail.com/T/#u
[2] #2150

CC: Derrick Stolee <stolee@gmail.com>
CC: Elijah Newren <newren@gmail.com>

spkrka force-pushed the side-exhaust-pr branch 10 times, most recently from 7d5b1bb to 3e1315e (June 20, 2026 08:55)
spkrka changed the title from "commit-reach: terminate merge-base walk when one paint side is exhausted" to "commit-reach: terminate merge-base walk when one side is exhausted" (Jun 20, 2026)
spkrka and others added 6 commits June 20, 2026 11:09

commit-reach: decouple ahead_behind from nonstale_queue

Move ahead_behind() off the shared nonstale_queue abstraction to use
a plain prio_queue with a local max_nonstale pointer. The nonstale
tracking is inlined into insert_no_dup().

This prepares for replacing nonstale_queue with a paint_queue struct
that tracks per-side commit counts, which ahead_behind() does not
need. No behavior change.

Signed-off-by: Kristofer Karlsson <krka@spotify.com>

commit-reach: introduce paint_queue and per-side counters

Replace the nonstale_queue abstraction in paint_down_to_common() with
a new paint_queue struct that tracks per-side commit counts. Each
non-stale queued commit occupies exactly one counter bucket based on
its paint flags: PARENT1-only, PARENT2-only, or both sides (a pending
merge-base candidate).

The counters are maintained by paint_count_transition() which handles
all flag changes as bucket transfers: remove from the old bucket, add
to the new one. Either step is a no-op when the respective state has
no bucket (the commit is stale or carries no paint flags).
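
As an illustration of the bucket-transfer idea, here is a minimal sketch. The actual signature and internals of paint_count_transition() are not shown on this page, so the names `bucket`, `count_transition`, and the dictionary keys below are hypothetical, and the flag values are placeholders.

```python
PARENT1, PARENT2, STALE = 1, 2, 4  # placeholder bit values


def bucket(flags):
    """Counter bucket for a queued commit, or None if it has no bucket."""
    if flags & STALE:
        return None  # stale commits are not counted
    paint = flags & (PARENT1 | PARENT2)
    return {PARENT1: "side1", PARENT2: "side2", PARENT1 | PARENT2: "pending"}.get(paint)


def count_transition(counts, old_flags, new_flags):
    """Move one commit from the bucket of old_flags to that of new_flags."""
    old, new = bucket(old_flags), bucket(new_flags)
    if old is not None:
        counts[old] -= 1  # remove from the old bucket
    if new is not None:
        counts[new] += 1  # add to the new bucket


counts = {"side1": 0, "side2": 0, "pending": 0}
count_transition(counts, 0, PARENT1)                     # first paint
after_first = dict(counts)
count_transition(counts, PARENT1, PARENT1 | PARENT2)     # other side arrives
after_meet = dict(counts)
count_transition(counts, PARENT1 | PARENT2, PARENT1 | PARENT2 | STALE)
print(after_first, after_meet, counts)
```

Treating every flag change as "leave one bucket, enter another" keeps the invariant that each non-stale queued commit is counted exactly once, which is what lets the loop test the counters instead of scanning the queue.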

The loop now drains the queue via paint_queue_get() and breaks when
all counters reach zero, replacing the old pointer-based termination
(max_nonstale). The behavior is unchanged.

Signed-off-by: Kristofer Karlsson <krka@spotify.com>

commit-reach: stop the walk when one side is exhausted

Add an early termination check to paint_down_to_common() using the
per-side counters introduced in the previous commit. Once the walk
enters the finite-generation region, terminate early when one side's
exclusive count drops to zero -- no new merge-base can form without
both paint sides meeting.

The check also waits for pending_merge_bases to reach zero, ensuring
all merge-base candidates have been popped and recorded before
exiting.

The INFINITY gate ensures correctness: commits without a commit-graph
entry have GENERATION_NUMBER_INFINITY and are ordered by commit date,
which is not topologically reliable. The optimization only fires
once the walk enters the finite-generation region where ordering
guarantees hold.
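
The three conditions described above can be combined into one predicate. This is a sketch under assumptions: the function name and parameters are hypothetical, `math.inf` stands in for git's GENERATION_NUMBER_INFINITY constant, and the gate is assumed to be tested against the generation of the commit at the head of the queue.

```python
import math

GENERATION_NUMBER_INFINITY = math.inf  # stand-in for git's constant


def can_stop_early(head_generation, side1, side2, pending_merge_bases):
    """True when the side-exhaustion exit may fire."""
    if head_generation == GENERATION_NUMBER_INFINITY:
        return False  # date-ordered region: paint is not yet final
    if pending_merge_bases > 0:
        return False  # candidates still queued must be popped and recorded
    return side1 == 0 or side2 == 0  # one side has no exclusive commits left


print(can_stop_early(math.inf, 0, 3, 0))  # gate closed
print(can_stop_early(10, 0, 3, 1))        # a candidate is still pending
print(can_stop_early(10, 0, 3, 0))        # one side exhausted
print(can_stop_early(10, 2, 3, 0))        # both sides still active
```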

On large repositories with commit-graph, this yields 100-1000x
speedups for merge-base queries where one side (e.g. a PR branch) is
much smaller than the other.

Helped-by: Derrick Stolee <stolee@gmail.com>
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Kristofer Karlsson <krka@spotify.com>

t6600: add side-exhaustion edge-case tests

Add test cases to t6600-test-reach.sh that exercise edge cases in the
side-exhaustion optimization for paint_down_to_common():

 - in_merge_bases_many:self: commit is both A and one of the X inputs
 - get_merge_bases_many:duplicate-twos: duplicate entries in X list
 - get_merge_bases_many:pending-stale: STALE transition on an
   already-painted commit (ps-* diamond topology)
 - get_merge_bases_many:infinity-both-sides: both tips outside the
   commit-graph with non-monotonic dates (pi-* topology)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Kristofer Karlsson <krka@spotify.com>

t6099, t6600: add side-exhaustion regression tests

Add t6099 to test the case where multiple merge-base candidates exist
and one is an ancestor of another. This exercises the side-exhaustion
optimization in paint_down_to_common together with the
remove_redundant safety net in get_merge_bases_many_0.
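
To make the "one candidate is an ancestor of another" case concrete, the following brute-force sketch drops redundant candidates. It only illustrates the property being tested, not how git's remove_redundant works; `is_ancestor`, `drop_redundant`, and the three-commit graph are hypothetical.

```python
def is_ancestor(parents, ancestor, descendant):
    """True if `ancestor` is reachable from `descendant` (or equal to it)."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents[commit])
    return False


def drop_redundant(parents, candidates):
    """Keep only candidates that are not ancestors of another candidate."""
    return [
        c for c in candidates
        if not any(o != c and is_ancestor(parents, c, o) for o in candidates)
    ]


parents = {"root": [], "a": ["root"], "b": ["a"]}
kept = drop_redundant(parents, ["a", "b"])
print(kept)
```

Here `a` is an ancestor of `b`, so only `b` survives as a merge base.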

Add a mixed finite/INFINITY test to t6600 where one tip is outside
the commit-graph (INFINITY generation) and the other is inside.
This exercises the region transition: the walk starts in the
INFINITY region where side-exhaustion is disabled, then crosses
into the finite region where it can fire.

Signed-off-by: Kristofer Karlsson <krka@spotify.com>

Documentation/technical: document paint_down_to_common()

Add a technical document describing the paint_down_to_common()
algorithm used for merge-base computation.

Signed-off-by: Kristofer Karlsson <krka@spotify.com>

spkrka force-pushed the side-exhaust-pr branch from 3e1315e to 9cbfc67 (June 20, 2026 09:09)

spkrka commented Jun 20, 2026

/preview

gitgitgadget Bot commented Jun 20, 2026

Preview email sent as pull.2149.git.1781946989.gitgitgadget@gmail.com

spkrka commented Jun 20, 2026

/submit

gitgitgadget Bot commented Jun 20, 2026

Submitted as pull.2149.git.1781951820.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-2149/spkrka/side-exhaust-pr-v1

To fetch this version to local tag pr-2149/spkrka/side-exhaust-pr-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-2149/spkrka/side-exhaust-pr-v1
