refactor: Parallel PG -> Qdrant migration for improved throughput #198

Merged
Anush008 merged 4 commits into main from pg-parallel
Jan 13, 2026

@Anush008 Anush008 commented Jan 6, 2026

Description

Adds support for migrating from PG to Qdrant with parallel batches.

NOTE

#199 was also merged into this PR.

How?

The methodology is the same as in #168.

The migration samples random keys, sorts them, and uses them as boundaries to create non-overlapping key ranges. Workers then migrate the ranges independently. Each range tracks its own offset, so an interrupted migration can resume range by range.
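The approach described above can be sketched roughly as follows. This is a minimal, in-memory illustration and not the code from this PR: the source table and target collection are plain dicts keyed by integer ID, and `KeyRange`, `build_ranges`, `migrate_range`, and `migrate_parallel` are hypothetical names chosen for the example.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeyRange:
    """Half-open key interval [start, end); None means unbounded on that side."""
    start: Optional[int]
    end: Optional[int]
    offset: Optional[int] = None  # last key migrated in this range, used to resume


def build_ranges(sampled_keys, num_ranges):
    """Turn a sample of keys into contiguous, non-overlapping ranges."""
    keys = sorted(set(sampled_keys))
    if num_ranges <= 1 or not keys:
        return [KeyRange(None, None)]
    # Pick evenly spaced sampled keys as the boundaries between ranges.
    step = len(keys) / num_ranges
    bounds = sorted({keys[int(i * step)] for i in range(1, num_ranges)})
    edges = [None] + bounds + [None]
    return [KeyRange(lo, hi) for lo, hi in zip(edges, edges[1:])]


def migrate_range(source, target, rng, batch_size):
    """Copy one range in key order, recording the last migrated key as its offset."""
    keys = sorted(
        k for k in source
        if (rng.start is None or k >= rng.start)
        and (rng.end is None or k < rng.end)
        and (rng.offset is None or k > rng.offset)  # skip already-migrated keys
    )
    for i in range(0, len(keys), batch_size):
        batch = keys[i:i + batch_size]
        for k in batch:
            target[k] = source[k]
        rng.offset = batch[-1]  # persisting this is what would make a range resumable
    return len(keys)


def migrate_parallel(source, target, num_workers, batch_size=250, sample_size=1000):
    """Sample keys, split them into ranges, and migrate each range in its own worker."""
    sample = random.sample(list(source), min(sample_size, len(source)))
    ranges = build_ranges(sample, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        counts = list(
            pool.map(lambda r: migrate_range(source, target, r, batch_size), ranges)
        )
    return ranges, sum(counts)


if __name__ == "__main__":
    src = {i: [float(i)] for i in range(5000)}
    dst = {}
    ranges, migrated = migrate_parallel(src, dst, num_workers=8)
    print(f"{migrated} points migrated across {len(ranges)} ranges")
```

Because the ranges are half-open and share boundaries, every key falls into exactly one range, so workers never write the same point twice. Calling `migrate_range` again on a finished range copies nothing, since all of its keys are at or below the stored offset.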

By default --migration.num-workers is set to the number of CPU cores.

If --migration.num-workers is set to 1, the current sequential migration is used.
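The worker-count rule in the two paragraphs above can be illustrated with a short sketch. It is an assumption-laden example rather than the tool's actual code: `resolve_num_workers` and `choose_strategy` are hypothetical helpers, and rejecting values below 1 is an assumption the PR description does not state.

```python
import os
from typing import Optional


def resolve_num_workers(flag_value: Optional[int] = None) -> int:
    """Return the effective worker count for a --migration.num-workers style flag."""
    if flag_value is None:
        # Flag not set: default to the number of CPU cores (at least 1).
        return os.cpu_count() or 1
    if flag_value < 1:
        # Assumption: non-positive values are treated as a usage error.
        raise ValueError("num-workers must be at least 1")
    return flag_value


def choose_strategy(num_workers: int) -> str:
    """A single worker keeps the existing sequential path; more than one goes parallel."""
    return "sequential" if num_workers == 1 else "parallel"


if __name__ == "__main__":
    workers = resolve_num_workers()
    print(workers, choose_strategy(workers))
```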

Results

BEFORE - Migrating 500k points with a batch size of 250.

┌─────────────────────────────────────────────────────────────────────┐
| From → To: | embeddings_table@postgres  →  target-collection@qdrant |
└─────────────────────────────────────────────────────────────────────┘

INFO  Starting from the beginning                                                  
[500000/500000] ██████████████████████████████████████████████████ 100% | 4m10s
SUCCESS  Data migration finished successfully
INFO  Target collection has 500000 points

THIS PR - Migrating 500k points with a batch size of 250 and 8 workers.

┌─────────────────────────────────────────────────────────────────────┐
| From → To: | embeddings_table@postgres  →  target-collection@qdrant |
└─────────────────────────────────────────────────────────────────────┘

INFO  Using parallel migration with 8 workers
INFO  Starting from the beginning                                                  
[500000/500000] ████████████████████████████████████████████████████ 100% | 14s
SUCCESS  Data migration finished successfully
INFO  Target collection has 500000 points

About 18x faster.
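The speedup can be checked from the two elapsed times in the logs above (4m10s and 14s for 500k points). The snippet below is only a small arithmetic check; `to_seconds` is a helper written for this example.

```python
import re


def to_seconds(elapsed: str) -> int:
    """Parse an elapsed time such as '4m10s' or '14s' into seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)s)?", elapsed)
    if not elapsed or not match:
        raise ValueError(f"unrecognized duration: {elapsed!r}")
    minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return minutes * 60 + seconds


POINTS = 500_000
before = to_seconds("4m10s")  # sequential run: 250 seconds
after = to_seconds("14s")     # parallel run with 8 workers: 14 seconds
speedup = before / after

print(f"before: {POINTS / before:.0f} points/s")  # 2000 points/s
print(f"after:  {POINTS / after:.0f} points/s")   # 35714 points/s
print(f"speedup: {speedup:.1f}x")                 # 17.9x
```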

@Anush008 Anush008 requested a review from a team as a code owner January 6, 2026 13:22
@Anush008 Anush008 marked this pull request as draft January 6, 2026 13:25
@Anush008 Anush008 marked this pull request as ready for review January 7, 2026 07:21
* refactor: PG UUID handling
* chore: Misc updates

Signed-off-by: Anush008 <mail@anush.sh>
@Anush008 Anush008 enabled auto-merge (squash) January 9, 2026 18:12

Anush008 commented Jan 9, 2026

Merged #199 into this PR.

@Anush008 Anush008 requested a review from bashofmann January 9, 2026 19:03
Anush008 commented

Hey @qdrant/cloud-unit-regions-clusters
Just bumping this PR. Please take a look when possible.

@superseb superseb left a comment

Thank you

@Anush008 Anush008 merged commit 89d9993 into main Jan 13, 2026
5 of 6 checks passed
@Anush008 Anush008 deleted the pg-parallel branch January 13, 2026 22:25