Skip to content

pack-bitmap: fix memory leak if load_bitmap failed #1962

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

brandb97
Copy link
Contributor

@brandb97 brandb97 commented May 12, 2025

Since it seems this patch has been inactive for some time, I have revised the comments according to Taylor's feedback and submitted a new version.

This patch prevents pack-bitmap.c:load_bitmap() from nulling bitmap_git->bitmap when loading failed. Thus eliminates memory leak. This patch also add a test case in t5310 which use clang leak sanitizer to detect whether leak happens when loading failed.

cc: Jeff King [email protected]
cc: Taylor Blau [email protected]

@brandb97
Copy link
Contributor Author

/submit

Copy link

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-git-1962/brandb97/fix-pack-bitmap-leak-v1

To fetch this version to local tag pr-git-1962/brandb97/fix-pack-bitmap-leak-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-git-1962/brandb97/fix-pack-bitmap-leak-v1

Copy link

On the Git mailing list, Jeff King wrote (reply to this):

On Mon, May 12, 2025 at 12:22:10PM +0000, Lidong Yan via GitGitGadget wrote:

> From: Lidong Yan <[email protected]>
> 
> In pack-bitmap.c:load_bitmap_entries_v1, the function `read_bitmap_1`
> allocates a bitmap and reads index data into it. However, if any of
> the validation checks following the allocation fail, the allocated bitmap
> is not freed, resulting in a memory leak. To avoid this, the validation
> checks should be performed before the bitmap is allocated.

Thanks, this looks correct to me.

> @@ -388,10 +388,6 @@ static int load_bitmap_entries_v1(struct bitmap_index *index)
>  			return error(_("corrupt ewah bitmap: commit index %u out of range"),
>  				     (unsigned)commit_idx_pos);
>  
> -		bitmap = read_bitmap_1(index);
> -		if (!bitmap)
> -			return -1;
> -
>  		if (xor_offset > MAX_XOR_OFFSET || xor_offset > i)
>  			return error(_("corrupted bitmap pack index"));

I noticed that this code is also within a loop, so we could still return
early on the next loop iteration. But by that point we will have called
store_bitmap() on the result, so we only have to worry about leaking the
bitmap from the current loop iteration.

-Peff

Copy link

User Jeff King <[email protected]> has been added to the cc: list.

Copy link

This patch series was integrated into seen via 5006856.

Copy link

This patch series was integrated into seen via 64852ef.

Copy link

On the Git mailing list, Taylor Blau wrote (reply to this):

On Mon, May 12, 2025 at 09:13:15AM -0400, Jeff King wrote:
> On Mon, May 12, 2025 at 12:22:10PM +0000, Lidong Yan via GitGitGadget wrote:
>
> > From: Lidong Yan <[email protected]>
> >
> > In pack-bitmap.c:load_bitmap_entries_v1, the function `read_bitmap_1`
> > allocates a bitmap and reads index data into it. However, if any of
> > the validation checks following the allocation fail, the allocated bitmap
> > is not freed, resulting in a memory leak. To avoid this, the validation
> > checks should be performed before the bitmap is allocated.
>
> Thanks, this looks correct to me.

It looks correct to me as well, and is a strict improvement. But I think
there is a leak outside of this function as well that is not touched by
this patch.

> > @@ -388,10 +388,6 @@ static int load_bitmap_entries_v1(struct bitmap_index *index)
> >  			return error(_("corrupt ewah bitmap: commit index %u out of range"),
> >  				     (unsigned)commit_idx_pos);
> >
> > -		bitmap = read_bitmap_1(index);
> > -		if (!bitmap)
> > -			return -1;
> > -
> >  		if (xor_offset > MAX_XOR_OFFSET || xor_offset > i)
> >  			return error(_("corrupted bitmap pack index"));
>
> I noticed that this code is also within a loop, so we could still return
> early on the next loop iteration. But by that point we will have called
> store_bitmap() on the result, so we only have to worry about leaking the
> bitmap from the current loop iteration.

That's right, though I think there is still a leak here.

After going through the "failed" label, load_bitmap() will return -1,
and its caller (either prepare_bitmap_walk() or prepare_bitmap_git())
will then call free_bitmap_index().

That function would have done:

    struct stored_bitmap *sb;
    kh_foreach_value(b->bitmaps, sb {
      ewah_pool_free(sb->root);
      free(sb);
    });

, but won't since load_bitmap() already called kh_destroy_oid_map() and
NULL'd the "bitmaps" pointer from within its "failed" label.

So I think if you got part of the way through loading bitmap entries and
then failed, you would leak all of the previous entries that you were
able to load successfully.

I suspect the fix looks something like:

--- 8< ---
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 5299f49d59..7f28532a69 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -631,41 +631,28 @@ static int load_bitmap(struct repository *r, struct bitmap_index *bitmap_git,
 	bitmap_git->ext_index.positions = kh_init_oid_pos();

 	if (load_reverse_index(r, bitmap_git))
-		goto failed;
+		return -1;

 	if (!(bitmap_git->commits = read_bitmap_1(bitmap_git)) ||
 		!(bitmap_git->trees = read_bitmap_1(bitmap_git)) ||
 		!(bitmap_git->blobs = read_bitmap_1(bitmap_git)) ||
 		!(bitmap_git->tags = read_bitmap_1(bitmap_git)))
-		goto failed;
+		return -1;

 	if (!bitmap_git->table_lookup && load_bitmap_entries_v1(bitmap_git) < 0)
-		goto failed;
+		return -1;

 	if (bitmap_git->base) {
 		if (!bitmap_is_midx(bitmap_git))
 			BUG("non-MIDX bitmap has non-NULL base bitmap index");
 		if (load_bitmap(r, bitmap_git->base, 1) < 0)
-			goto failed;
+			return -1;
 	}

 	if (!recursing)
 		load_all_type_bitmaps(bitmap_git);

 	return 0;
-
-failed:
-	munmap(bitmap_git->map, bitmap_git->map_size);
-	bitmap_git->map = NULL;
-	bitmap_git->map_size = 0;
-
-	kh_destroy_oid_map(bitmap_git->bitmaps);
-	bitmap_git->bitmaps = NULL;
-
-	kh_destroy_oid_pos(bitmap_git->ext_index.positions);
-	bitmap_git->ext_index.positions = NULL;
-
-	return -1;
 }

 static int open_pack_bitmap(struct repository *r,
--- >8 ---

, since all callers of load_bitmap() will themselves call
free_bitmap_index(), so there is no need for us to open-code a portion
of that function's implementation ourselves.

Thanks,
Taylor

Copy link

User Taylor Blau <[email protected]> has been added to the cc: list.

Copy link

This branch is now known as ly/pack-bitmap-load-leakfix.

Copy link

This patch series was integrated into seen via 41ab05b.

Copy link

On the Git mailing list, Junio C Hamano wrote (reply to this):

Taylor Blau <[email protected]> writes:

> After going through the "failed" label, load_bitmap() will return -1,
> and its caller (either prepare_bitmap_walk() or prepare_bitmap_git())
> will then call free_bitmap_index().
>
> That function would have done:
>
>     struct stored_bitmap *sb;
>     kh_foreach_value(b->bitmaps, sb {
>       ewah_pool_free(sb->root);
>       free(sb);
>     });
>
> , but won't since load_bitmap() already called kh_destroy_oid_map() and
> NULL'd the "bitmaps" pointer from within its "failed" label.

Yikes.

> So I think if you got part of the way through loading bitmap entries and
> then failed, you would leak all of the previous entries that you were
> able to load successfully.
>
> I suspect the fix looks something like:
> ...
> --- 8< ---
> diff --git a/pack-bitmap.c b/pack-bitmap.c
> index 5299f49d59..7f28532a69 100644
> --- a/pack-bitmap.c
> +++ b/pack-bitmap.c
> @@ -631,41 +631,28 @@ static int load_bitmap(struct repository *r, struct bitmap_index *bitmap_git,
>  	bitmap_git->ext_index.positions = kh_init_oid_pos();
>
>  	if (load_reverse_index(r, bitmap_git))
> -		goto failed;
> +		return -1;

(a lot of changes that simplifies the code snipped)

> -failed:
> -	munmap(bitmap_git->map, bitmap_git->map_size);
> -	bitmap_git->map = NULL;
> -	bitmap_git->map_size = 0;
> -
> -	kh_destroy_oid_map(bitmap_git->bitmaps);
> -	bitmap_git->bitmaps = NULL;
> -
> -	kh_destroy_oid_pos(bitmap_git->ext_index.positions);
> -	bitmap_git->ext_index.positions = NULL;
> -
> -	return -1;
>  }
>
>  static int open_pack_bitmap(struct repository *r,
> --- >8 ---
>
> , since all callers of load_bitmap() will themselves call
> free_bitmap_index(), so there is no need for us to open-code a portion
> of that function's implementation ourselves.

It is rare for a fix to be removing and simplifying this much code
;-)

Copy link

On the Git mailing list, Jeff King wrote (reply to this):

On Tue, May 13, 2025 at 01:47:21PM -0400, Taylor Blau wrote:

> > > In pack-bitmap.c:load_bitmap_entries_v1, the function `read_bitmap_1`
> > > allocates a bitmap and reads index data into it. However, if any of
> > > the validation checks following the allocation fail, the allocated bitmap
> > > is not freed, resulting in a memory leak. To avoid this, the validation
> > > checks should be performed before the bitmap is allocated.
> >
> > Thanks, this looks correct to me.
> 
> It looks correct to me as well, and is a strict improvement. But I think
> there is a leak outside of this function as well that is not touched by
> this patch.

Good catch, and your analysis looks correct to me. I don't think that
changes anything for this patch, which is fixing a more "inner" issue of
the allocated memory hitting store_bitmap() at all.

So I think this can graduate independently, and then you can prepare
your fix on top (but no rush).

It would be nice if we triggered these cases in the test suite so that
LSan could confirm that all leaks are covered. But I suspect it may not
be worth the effort to craft a bitmap file that is broken in such
particular ways.

> I suspect the fix looks something like:
> [...]
> , since all callers of load_bitmap() will themselves call
> free_bitmap_index(), so there is no need for us to open-code a portion
> of that function's implementation ourselves.

Deleting that extra code would be doubly satisfying.

-Peff

Copy link

This patch series was integrated into seen via de0228a.

Copy link

This patch series was integrated into next via 0be31ea.

Copy link

On the Git mailing list, lidongyan wrote (reply to this):

On 15/5/2025 at 02:03,Jeff King <[email protected]> 写道:

> It would be nice if we triggered these cases in the test suite so that
> LSan could confirm that all leaks are covered. But I suspect it may not
> be worth the effort to craft a bitmap file that is broken in such
> particular ways.

I'd like to try coming up with some test cases for this.

Copy link

This patch series was integrated into seen via 1da0220.

Copy link

This patch series was integrated into seen via 4aa450a.

Copy link

There was a status update in the "Cooking" section about the branch ly/pack-bitmap-load-leakfix on the Git mailing list:

Leakfix.

Will merge to 'master'.
source: <[email protected]>

Copy link

This patch series was integrated into seen via 1e3fc8a.

Copy link

There are issues in commit 72bee0e:
pack-bitmap: fix memory leak if load_bitmap_entries_v1 failed
Commit not signed off

@brandb97 brandb97 force-pushed the fix-pack-bitmap-leak branch from 36add21 to 1ba6e24 Compare May 20, 2025 07:04
Copy link

There are issues in commit 36cda10:
pack-bitmap: fix memory leak if load_bitmap_entries_v1 failed
Commit not signed off

@brandb97 brandb97 force-pushed the fix-pack-bitmap-leak branch 4 times, most recently from db3ac09 to 5be22d5 Compare May 20, 2025 08:26
@brandb97
Copy link
Contributor Author

/submit

Copy link

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-git-1962/brandb97/fix-pack-bitmap-leak-v2

To fetch this version to local tag pr-git-1962/brandb97/fix-pack-bitmap-leak-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-git-1962/brandb97/fix-pack-bitmap-leak-v2

Copy link

This patch series was integrated into seen via d853973.

Copy link

This patch series was integrated into seen via b2e72e9.

Copy link

There was a status update in the "Cooking" section about the branch ly/load-bitmap-leakfix on the Git mailing list:

Leakfix with a new and a bit invasive test.

Comments?
source: <[email protected]>

Copy link

This patch series was integrated into seen via 5961b1c.

Copy link

This patch series was integrated into seen via d44b9f0.

Copy link

This patch series was integrated into seen via 40b54d4.

Copy link

This patch series was integrated into seen via cdc295e.

Copy link

This patch series was integrated into seen via 553eacd.

Copy link

There was a status update in the "Cooking" section about the branch ly/load-bitmap-leakfix on the Git mailing list:

Leakfix with a new and a bit invasive test.

Comments?
source: <[email protected]>

Copy link

This patch series was integrated into seen via 5321b4d.

Copy link

This patch series was integrated into seen via 7b01989.

Copy link

There was a status update in the "Cooking" section about the branch ly/load-bitmap-leakfix on the Git mailing list:

Leakfix with a new and a bit invasive test.

Comments?
source: <[email protected]>

Copy link

This patch series was integrated into seen via 848bd83.

Copy link

This patch series was integrated into seen via d9827a8.

Copy link

There was a status update in the "Cooking" section about the branch ly/load-bitmap-leakfix on the Git mailing list:

Leakfix with a new and a bit invasive test.

Comments?
source: <[email protected]>

Copy link

This patch series was integrated into seen via 85a1867.

Copy link

This patch series was integrated into seen via 891a36d.

Copy link

There was a status update in the "Cooking" section about the branch ly/load-bitmap-leakfix on the Git mailing list:

Leakfix with a new and a bit invasive test.

Comments?
source: <[email protected]>

Copy link

This patch series was integrated into seen via 33cfe4f.

Copy link

There was a status update in the "Cooking" section about the branch ly/load-bitmap-leakfix on the Git mailing list:

Leakfix with a new and a bit invasive test.

Comments?
source: <[email protected]>

Copy link

This patch series was integrated into seen via dbedc7a.

Copy link

This patch series was integrated into seen via 4ac5b0b.

Copy link

There was a status update in the "Cooking" section about the branch ly/load-bitmap-leakfix on the Git mailing list:

Leakfix with a new and a bit invasive test.

Comments?
source: <[email protected]>

Copy link

This patch series was integrated into seen via afcd542.

ttaylorr and others added 3 commits July 1, 2025 13:11
After going through the "failed" label, load_bitmap() will return -1,
and its caller (either prepare_bitmap_walk() or prepare_bitmap_git())
will then call free_bitmap_index().

That function would have done:

    struct stored_bitmap *sb;
    kh_foreach_value(b->bitmaps, sb {
      ewah_pool_free(sb->root);
      free(sb);
    });

, but won't since load_bitmap() already called kh_destroy_oid_map() and
NULL'd the "bitmaps" pointer from within its "failed" label. Thus if you
got part of the way through loading bitmap entries and then failed, you
would leak all of the previous entries that you were able to load
successfully.

The solution is to remove the error handling code in load_bitmap(), because
its caller will always call free_bitmap_index() in case of an error.

Signed-off-by: Taylor Blau <[email protected]>
Signed-off-by: Lidong Yan <[email protected]>
The comment in pack-bitmap.c:test_bitmap_commits(), suggests that
we can avoid reading the commit table altogether. However, this
comment is misleading. The reason we load bitmap entries here is
because test_bitmap_commits() needs to print the commit IDs from the
bitmap, and we must read the bitmap entries to obtain those commit IDs.
So reword this comment.

Signed-off-by: Lidong Yan <[email protected]>
t5310 lacks a test to ensure git works correctly when commit bitmap
data is corrupted. So this patch add test helper in pack-bitmap.c to
list each commit bitmap position in bitmap file and `load corrupt bitmap`
test case in t/t5310 to corrupt a commit bitmap before loading it.

Signed-off-by: Lidong Yan <[email protected]>
@brandb97 brandb97 force-pushed the fix-pack-bitmap-leak branch from 05140e2 to c1b5d03 Compare July 1, 2025 05:29
@brandb97
Copy link
Contributor Author

brandb97 commented Jul 1, 2025

/submit

Copy link

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-git-1962/brandb97/fix-pack-bitmap-leak-v6

To fetch this version to local tag pr-git-1962/brandb97/fix-pack-bitmap-leak-v6:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-git-1962/brandb97/fix-pack-bitmap-leak-v6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants