Skip to content

Commit f6b2625

Browse files
newrengitster
authored andcommitted
fsck: snapshot default refs before object walk
Fsck has a race when operating on live repositories; consider the following simple script that writes new commits as fsck runs: #!/bin/bash git fsck & PID=$! while ps -p $PID >/dev/null; do sleep 3 git commit -q --allow-empty -m "Another commit" done Since fsck walks objects for connectivity and then reads the refs at the end to check, this can cause fsck to get confused and think that the new refs refer to missing commits and that new reflog entries are invalid. Running the above script in a clone of git.git results in the following (output ellipsized to remove additional errors of the same type): $ ./fsck-while-writing.sh Checking ref database: 100% (1/1), done. Checking object directories: 100% (256/256), done. warning in tag d6602ec: missingTaggerEntry: invalid format - expected 'tagger' line Checking objects: 100% (835091/835091), done. error: HEAD: invalid reflog entry 2aac9f9286e2164fbf8e4f1d1df53044ace2b310 error: HEAD: invalid reflog entry 2aac9f9286e2164fbf8e4f1d1df53044ace2b310 error: HEAD: invalid reflog entry da0f5b80d61844a6f0ad2ddfd57e4fdfa246ea68 error: HEAD: invalid reflog entry da0f5b80d61844a6f0ad2ddfd57e4fdfa246ea68 [...] error: HEAD: invalid reflog entry 87c8a5c2f6b79d9afa9e941590b9a097b6f7ac09 error: HEAD: invalid reflog entry d80887a48865e6ad165274b152cbbbed29f8a55a error: HEAD: invalid reflog entry d80887a48865e6ad165274b152cbbbed29f8a55a error: HEAD: invalid reflog entry 6724f2dfede88bfa9445a333e06e78536c0c6c0d error: refs/heads/mybranch invalid reflog entry 2aac9f9286e2164fbf8e4f1d1df53044ace2b310 error: refs/heads/mybranch: invalid reflog entry 2aac9f9286e2164fbf8e4f1d1df53044ace2b310 error: refs/heads/mybranch: invalid reflog entry da0f5b80d61844a6f0ad2ddfd57e4fdfa246ea68 error: refs/heads/mybranch: invalid reflog entry da0f5b80d61844a6f0ad2ddfd57e4fdfa246ea68 [...] error: refs/heads/mybranch: invalid reflog entry 87c8a5c2f6b79d9afa9e941590b9a097b6f7ac09 error: refs/heads/mybranch: invalid reflog entry d80887a48865e6ad165274b152cbbbed29f8a55a error: refs/heads/mybranch: invalid reflog entry d80887a48865e6ad165274b152cbbbed29f8a55a error: refs/heads/mybranch: invalid reflog entry 6724f2dfede88bfa9445a333e06e78536c0c6c0d Checking connectivity: 833846, done. missing commit 6724f2dfede88bfa9445a333e06e78536c0c6c0d Verifying commits in commit graph: 100% (242243/242243), done. We can minimize the race opportunities by taking a snapshot of refs at program invocation, doing the connectivity check, and then checking the snapshotted refs afterward. This avoids races with regular refs between fsck and adding objects to the database, though it still leaves a race between a gc and fsck. We are less concerned about folks simultaneously running gc with fsck; though, if it becomes an issue, we could lock fsck during gc. We definitely do not want to lock fsck during operations that may add objects to the object store; that would be problematic for forges. Note that refs aren't the only problem, though; reflog entries and index entries could be problematic as well. For now we punt on index entries just leaving a TODO comment, and for reflogs we use a coarse solution of taking the time at the beginning of the program and ignoring reflog entries newer than that time. That may be imperfect if dealing with a network filesystem, so we leave TODO comment for those that want to improve that handling as well. As a high level overview: * In addition to fsck_handle_ref(), which now is only a few lines long to process a ref, there's also a snapshot_ref() which is called early in the program for each ref and takes all the error checking logic. * The iterating over refs that used to be in get_default_heads() plus a loop over the arguments now appears in shapshot_refs(). * There's a new process_refs() as well that kind of looks like the old get_default_heads() though it is streamlined due to the work done by snapshot_refs(). This combination of changes modifies the output of running the script (from the beginning of this commit message) to: $ ./fsck-while-writing.sh Checking ref database: 100% (1/1), done. Checking object directories: 100% (256/256), done. warning in tag d6602ec: missingTaggerEntry: invalid format - expected 'tagger' line Checking objects: 100% (835091/835091), done. Checking connectivity: 833846, done. Verifying commits in commit graph: 100% (242243/242243), done. While worries about live updates while running fsck is likely of most interest for forge operators, it may also benefit those with automated jobs (such as git maintenance) or even casual users who want to do other work in their clone while fsck is running. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 parent 7c7698a commit f6b2625

1 file changed

Lines changed: 122 additions & 40 deletions

File tree

builtin/fsck.c

Lines changed: 122 additions & 40 deletions
Original file line numberDiff line numberDiff line change
@@ -51,6 +51,7 @@ static int show_progress = -1;
5151
static int show_dangling = 1;
5252
static int name_objects;
5353
static int check_references = 1;
54+
static timestamp_t now;
5455
#define ERROR_OBJECT 01
5556
#define ERROR_REACHABLE 02
5657
#define ERROR_PACK 04
@@ -510,6 +511,9 @@ static int fsck_handle_reflog_ent(const char *refname,
510511
timestamp_t timestamp, int tz UNUSED,
511512
const char *message UNUSED, void *cb_data UNUSED)
512513
{
514+
if (now && timestamp > now)
515+
return 0;
516+
513517
if (verbose)
514518
fprintf_ln(stderr, _("Checking reflog %s->%s"),
515519
oid_to_hex(ooid), oid_to_hex(noid));
@@ -531,8 +535,22 @@ static int fsck_handle_reflog(const char *logname, void *cb_data)
531535
return 0;
532536
}
533537

534-
static int fsck_handle_ref(const struct reference *ref, void *cb_data UNUSED)
538+
struct ref_snapshot {
539+
char *refname;
540+
struct object_id oid;
541+
/* TODO: Maybe supplement with latest reflog entry info too? */
542+
};
543+
544+
struct snapshot {
545+
size_t nr;
546+
size_t alloc;
547+
struct ref_snapshot *ref;
548+
/* TODO: Consider also snapshotting the index of each worktree. */
549+
};
550+
551+
static int snapshot_ref(const struct reference *ref, void *cb_data)
535552
{
553+
struct snapshot *snap = cb_data;
536554
struct object *obj;
537555

538556
obj = parse_object(the_repository, ref->oid);
@@ -556,6 +574,20 @@ static int fsck_handle_ref(const struct reference *ref, void *cb_data UNUSED)
556574
errors_found |= ERROR_REFS;
557575
}
558576
default_refs++;
577+
578+
ALLOC_GROW(snap->ref, snap->nr + 1, snap->alloc);
579+
snap->ref[snap->nr].refname = xstrdup(ref->name);
580+
oidcpy(&snap->ref[snap->nr].oid, ref->oid);
581+
snap->nr++;
582+
583+
return 0;
584+
}
585+
586+
static int fsck_handle_ref(const struct reference *ref, void *cb_data UNUSED)
587+
{
588+
struct object *obj;
589+
590+
obj = parse_object(the_repository, ref->oid);
559591
obj->flags |= USED;
560592
fsck_put_object_name(&fsck_walk_options,
561593
ref->oid, "%s", ref->name);
@@ -568,14 +600,35 @@ static int fsck_head_link(const char *head_ref_name,
568600
const char **head_points_at,
569601
struct object_id *head_oid);
570602

571-
static void get_default_heads(void)
603+
static void snapshot_refs(struct snapshot *snap, int argc, const char **argv)
572604
{
573605
struct worktree **worktrees, **p;
574606
const char *head_points_at;
575607
struct object_id head_oid;
576608

609+
for (int i = 0; i < argc; i++) {
610+
const char *arg = argv[i];
611+
struct object_id oid;
612+
if (!repo_get_oid(the_repository, arg, &oid)) {
613+
struct reference ref = {
614+
.name = arg,
615+
.oid = &oid,
616+
};
617+
618+
snapshot_ref(&ref, snap);
619+
continue;
620+
}
621+
error(_("invalid parameter: expected sha1, got '%s'"), arg);
622+
errors_found |= ERROR_OBJECT;
623+
}
624+
625+
if (argc) {
626+
include_reflogs = 0;
627+
return;
628+
}
629+
577630
refs_for_each_rawref(get_main_ref_store(the_repository),
578-
fsck_handle_ref, NULL);
631+
snapshot_ref, snap);
579632

580633
worktrees = get_worktrees();
581634
for (p = worktrees; *p; p++) {
@@ -590,15 +643,52 @@ static void get_default_heads(void)
590643
.oid = &head_oid,
591644
};
592645

593-
fsck_handle_ref(&ref, NULL);
646+
snapshot_ref(&ref, snap);
594647
}
595648
strbuf_release(&refname);
596649

597-
if (include_reflogs)
650+
/*
651+
* TODO: Could use refs_for_each_reflog(...) to find
652+
* latest entry instead of using a global 'now' for that
653+
* purpose.
654+
*/
655+
}
656+
free_worktrees(worktrees);
657+
658+
/* Ignore reflogs newer than now */
659+
now = time(NULL);
660+
}
661+
662+
663+
static void free_snapshot_refs(struct snapshot *snap)
664+
{
665+
for (size_t i = 0; i < snap->nr; i++)
666+
free(snap->ref[i].refname);
667+
free(snap->ref);
668+
}
669+
670+
static void process_refs(struct snapshot *snap)
671+
{
672+
struct worktree **worktrees, **p;
673+
674+
for (size_t i = 0; i < snap->nr; i++) {
675+
struct reference ref = {
676+
.name = snap->ref[i].refname,
677+
.oid = &snap->ref[i].oid,
678+
};
679+
fsck_handle_ref(&ref, NULL);
680+
}
681+
682+
if (include_reflogs) {
683+
worktrees = get_worktrees();
684+
for (p = worktrees; *p; p++) {
685+
struct worktree *wt = *p;
686+
598687
refs_for_each_reflog(get_worktree_ref_store(wt),
599688
fsck_handle_reflog, wt);
689+
}
690+
free_worktrees(worktrees);
600691
}
601-
free_worktrees(worktrees);
602692

603693
/*
604694
* Not having any default heads isn't really fatal, but
@@ -963,8 +1053,12 @@ int cmd_fsck(int argc,
9631053
const char *prefix,
9641054
struct repository *repo UNUSED)
9651055
{
966-
int i;
9671056
struct odb_source *source;
1057+
struct snapshot snap = {
1058+
.nr = 0,
1059+
.alloc = 0,
1060+
.ref = NULL
1061+
};
9681062

9691063
/* fsck knows how to handle missing promisor objects */
9701064
fetch_if_missing = 0;
@@ -1000,6 +1094,17 @@ int cmd_fsck(int argc,
10001094
if (check_references)
10011095
fsck_refs(the_repository);
10021096

1097+
/*
1098+
* Take a snapshot of the refs before walking objects to avoid looking
1099+
* at a set of refs that may be changed by the user while we are walking
1100+
* objects. We can still walk over new objects that are added during the
1101+
* execution of fsck but won't miss any objects that were reachable.
1102+
*/
1103+
snapshot_refs(&snap, argc, argv);
1104+
1105+
/* Ensure we get a "fresh" view of the odb */
1106+
odb_reprepare(the_repository->objects);
1107+
10031108
if (connectivity_only) {
10041109
for_each_loose_object(the_repository->objects,
10051110
mark_loose_for_connectivity, NULL, 0);
@@ -1041,42 +1146,18 @@ int cmd_fsck(int argc,
10411146
errors_found |= ERROR_OBJECT;
10421147
}
10431148

1044-
for (i = 0; i < argc; i++) {
1045-
const char *arg = argv[i];
1046-
struct object_id oid;
1047-
if (!repo_get_oid(the_repository, arg, &oid)) {
1048-
struct object *obj = lookup_object(the_repository,
1049-
&oid);
1050-
1051-
if (!obj || !(obj->flags & HAS_OBJ)) {
1052-
if (is_promisor_object(the_repository, &oid))
1053-
continue;
1054-
error(_("%s: object missing"), oid_to_hex(&oid));
1055-
errors_found |= ERROR_OBJECT;
1056-
continue;
1057-
}
1058-
1059-
obj->flags |= USED;
1060-
fsck_put_object_name(&fsck_walk_options, &oid,
1061-
"%s", arg);
1062-
mark_object_reachable(obj);
1063-
continue;
1064-
}
1065-
error(_("invalid parameter: expected sha1, got '%s'"), arg);
1066-
errors_found |= ERROR_OBJECT;
1067-
}
1149+
/* Process the snapshotted refs and the reflogs. */
1150+
process_refs(&snap);
10681151

1069-
/*
1070-
* If we've not been given any explicit head information, do the
1071-
* default ones from .git/refs. We also consider the index file
1072-
* in this case (ie this implies --cache).
1073-
*/
1074-
if (!argc) {
1075-
get_default_heads();
1152+
/* If not given any explicit objects, process index files too. */
1153+
if (!argc)
10761154
keep_cache_objects = 1;
1077-
}
1078-
10791155
if (keep_cache_objects) {
1156+
/*
1157+
* TODO: Consider first walking these indexes in snapshot_refs,
1158+
* to snapshot where the index entries used to point, and then
1159+
* check those snapshotted locations here.
1160+
*/
10801161
struct worktree **worktrees, **p;
10811162

10821163
verify_index_checksum = 1;
@@ -1149,5 +1230,6 @@ int cmd_fsck(int argc,
11491230
}
11501231
}
11511232

1233+
free_snapshot_refs(&snap);
11521234
return errors_found;
11531235
}

0 commit comments

Comments
 (0)