-
Notifications
You must be signed in to change notification settings - Fork 26.4k
pack-bitmap: fix memory leak if load_bitmap
failed
#1962
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
/submit |
Submitted as [email protected] To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Jeff King wrote (reply to this): On Mon, May 12, 2025 at 12:22:10PM +0000, Lidong Yan via GitGitGadget wrote:
> From: Lidong Yan <[email protected]>
>
> In pack-bitmap.c:load_bitmap_entries_v1, the function `read_bitmap_1`
> allocates a bitmap and reads index data into it. However, if any of
> the validation checks following the allocation fail, the allocated bitmap
> is not freed, resulting in a memory leak. To avoid this, the validation
> checks should be performed before the bitmap is allocated.
Thanks, this looks correct to me.
> @@ -388,10 +388,6 @@ static int load_bitmap_entries_v1(struct bitmap_index *index)
> return error(_("corrupt ewah bitmap: commit index %u out of range"),
> (unsigned)commit_idx_pos);
>
> - bitmap = read_bitmap_1(index);
> - if (!bitmap)
> - return -1;
> -
> if (xor_offset > MAX_XOR_OFFSET || xor_offset > i)
> return error(_("corrupted bitmap pack index"));
I noticed that this code is also within a loop, so we could still return
early on the next loop iteration. But by that point we will have called
store_bitmap() on the result, so we only have to worry about leaking the
bitmap from the current loop iteration.
-Peff |
User |
This patch series was integrated into seen via 5006856. |
This patch series was integrated into seen via 64852ef. |
On the Git mailing list, Taylor Blau wrote (reply to this): On Mon, May 12, 2025 at 09:13:15AM -0400, Jeff King wrote:
> On Mon, May 12, 2025 at 12:22:10PM +0000, Lidong Yan via GitGitGadget wrote:
>
> > From: Lidong Yan <[email protected]>
> >
> > In pack-bitmap.c:load_bitmap_entries_v1, the function `read_bitmap_1`
> > allocates a bitmap and reads index data into it. However, if any of
> > the validation checks following the allocation fail, the allocated bitmap
> > is not freed, resulting in a memory leak. To avoid this, the validation
> > checks should be performed before the bitmap is allocated.
>
> Thanks, this looks correct to me.
It looks correct to me as well, and is a strict improvement. But I think
there is a leak outside of this function as well that is not touched by
this patch.
> > @@ -388,10 +388,6 @@ static int load_bitmap_entries_v1(struct bitmap_index *index)
> > return error(_("corrupt ewah bitmap: commit index %u out of range"),
> > (unsigned)commit_idx_pos);
> >
> > - bitmap = read_bitmap_1(index);
> > - if (!bitmap)
> > - return -1;
> > -
> > if (xor_offset > MAX_XOR_OFFSET || xor_offset > i)
> > return error(_("corrupted bitmap pack index"));
>
> I noticed that this code is also within a loop, so we could still return
> early on the next loop iteration. But by that point we will have called
> store_bitmap() on the result, so we only have to worry about leaking the
> bitmap from the current loop iteration.
That's right, though I think there is still a leak here.
After going through the "failed" label, load_bitmap() will return -1,
and its caller (either prepare_bitmap_walk() or prepare_bitmap_git())
will then call free_bitmap_index().
That function would have done:
struct stored_bitmap *sb;
kh_foreach_value(b->bitmaps, sb {
ewah_pool_free(sb->root);
free(sb);
});
, but won't since load_bitmap() already called kh_destroy_oid_map() and
NULL'd the "bitmaps" pointer from within its "failed" label.
So I think if you got part of the way through loading bitmap entries and
then failed, you would leak all of the previous entries that you were
able to load successfully.
I suspect the fix looks something like:
--- 8< ---
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 5299f49d59..7f28532a69 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -631,41 +631,28 @@ static int load_bitmap(struct repository *r, struct bitmap_index *bitmap_git,
bitmap_git->ext_index.positions = kh_init_oid_pos();
if (load_reverse_index(r, bitmap_git))
- goto failed;
+ return -1;
if (!(bitmap_git->commits = read_bitmap_1(bitmap_git)) ||
!(bitmap_git->trees = read_bitmap_1(bitmap_git)) ||
!(bitmap_git->blobs = read_bitmap_1(bitmap_git)) ||
!(bitmap_git->tags = read_bitmap_1(bitmap_git)))
- goto failed;
+ return -1;
if (!bitmap_git->table_lookup && load_bitmap_entries_v1(bitmap_git) < 0)
- goto failed;
+ return -1;
if (bitmap_git->base) {
if (!bitmap_is_midx(bitmap_git))
BUG("non-MIDX bitmap has non-NULL base bitmap index");
if (load_bitmap(r, bitmap_git->base, 1) < 0)
- goto failed;
+ return -1;
}
if (!recursing)
load_all_type_bitmaps(bitmap_git);
return 0;
-
-failed:
- munmap(bitmap_git->map, bitmap_git->map_size);
- bitmap_git->map = NULL;
- bitmap_git->map_size = 0;
-
- kh_destroy_oid_map(bitmap_git->bitmaps);
- bitmap_git->bitmaps = NULL;
-
- kh_destroy_oid_pos(bitmap_git->ext_index.positions);
- bitmap_git->ext_index.positions = NULL;
-
- return -1;
}
static int open_pack_bitmap(struct repository *r,
--- >8 ---
, since all callers of load_bitmap() will themselves call
free_bitmap_index(), so there is no need for us to open-code a portion
of that function's implementation ourselves.
Thanks,
Taylor |
User |
This branch is now known as |
This patch series was integrated into seen via 41ab05b. |
On the Git mailing list, Junio C Hamano wrote (reply to this): Taylor Blau <[email protected]> writes:
> After going through the "failed" label, load_bitmap() will return -1,
> and its caller (either prepare_bitmap_walk() or prepare_bitmap_git())
> will then call free_bitmap_index().
>
> That function would have done:
>
> struct stored_bitmap *sb;
> kh_foreach_value(b->bitmaps, sb {
> ewah_pool_free(sb->root);
> free(sb);
> });
>
> , but won't since load_bitmap() already called kh_destroy_oid_map() and
> NULL'd the "bitmaps" pointer from within its "failed" label.
Yikes.
> So I think if you got part of the way through loading bitmap entries and
> then failed, you would leak all of the previous entries that you were
> able to load successfully.
>
> I suspect the fix looks something like:
> ...
> --- 8< ---
> diff --git a/pack-bitmap.c b/pack-bitmap.c
> index 5299f49d59..7f28532a69 100644
> --- a/pack-bitmap.c
> +++ b/pack-bitmap.c
> @@ -631,41 +631,28 @@ static int load_bitmap(struct repository *r, struct bitmap_index *bitmap_git,
> bitmap_git->ext_index.positions = kh_init_oid_pos();
>
> if (load_reverse_index(r, bitmap_git))
> - goto failed;
> + return -1;
(a lot of changes that simplifies the code snipped)
> -failed:
> - munmap(bitmap_git->map, bitmap_git->map_size);
> - bitmap_git->map = NULL;
> - bitmap_git->map_size = 0;
> -
> - kh_destroy_oid_map(bitmap_git->bitmaps);
> - bitmap_git->bitmaps = NULL;
> -
> - kh_destroy_oid_pos(bitmap_git->ext_index.positions);
> - bitmap_git->ext_index.positions = NULL;
> -
> - return -1;
> }
>
> static int open_pack_bitmap(struct repository *r,
> --- >8 ---
>
> , since all callers of load_bitmap() will themselves call
> free_bitmap_index(), so there is no need for us to open-code a portion
> of that function's implementation ourselves.
It is rare for a fix to be removing and simplifying this much code
;-) |
On the Git mailing list, Jeff King wrote (reply to this): On Tue, May 13, 2025 at 01:47:21PM -0400, Taylor Blau wrote:
> > > In pack-bitmap.c:load_bitmap_entries_v1, the function `read_bitmap_1`
> > > allocates a bitmap and reads index data into it. However, if any of
> > > the validation checks following the allocation fail, the allocated bitmap
> > > is not freed, resulting in a memory leak. To avoid this, the validation
> > > checks should be performed before the bitmap is allocated.
> >
> > Thanks, this looks correct to me.
>
> It looks correct to me as well, and is a strict improvement. But I think
> there is a leak outside of this function as well that is not touched by
> this patch.
Good catch, and your analysis looks correct to me. I don't think that
changes anything for this patch, which is fixing a more "inner" issue of
the allocated memory hitting store_bitmap() at all.
So I think this can graduate independently, and then you can prepare
your fix on top (but no rush).
It would be nice if we triggered these cases in the test suite so that
LSan could confirm that all leaks are covered. But I suspect it may not
be worth the effort to craft a bitmap file that is broken in such
particular ways.
> I suspect the fix looks something like:
> [...]
> , since all callers of load_bitmap() will themselves call
> free_bitmap_index(), so there is no need for us to open-code a portion
> of that function's implementation ourselves.
Deleting that extra code would be doubly satisfying.
-Peff |
This patch series was integrated into seen via de0228a. |
This patch series was integrated into next via 0be31ea. |
On the Git mailing list, lidongyan wrote (reply to this): On 15/5/2025 at 02:03,Jeff King <[email protected]> 写道:
> It would be nice if we triggered these cases in the test suite so that
> LSan could confirm that all leaks are covered. But I suspect it may not
> be worth the effort to craft a bitmap file that is broken in such
> particular ways.
I'd like to try coming up with some test cases for this. |
This patch series was integrated into seen via 1da0220. |
This patch series was integrated into seen via 4aa450a. |
There was a status update in the "Cooking" section about the branch Leakfix. Will merge to 'master'. source: <[email protected]> |
This patch series was integrated into seen via 1e3fc8a. |
There are issues in commit 72bee0e: |
36add21
to
1ba6e24
Compare
There are issues in commit 36cda10: |
db3ac09
to
5be22d5
Compare
/submit |
Submitted as [email protected] To fetch this version into
To fetch this version to local tag
|
This patch series was integrated into seen via d853973. |
This patch series was integrated into seen via b2e72e9. |
There was a status update in the "Cooking" section about the branch Leakfix with a new and a bit invasive test. Comments? source: <[email protected]> |
This patch series was integrated into seen via 5961b1c. |
This patch series was integrated into seen via d44b9f0. |
This patch series was integrated into seen via 40b54d4. |
This patch series was integrated into seen via cdc295e. |
This patch series was integrated into seen via 553eacd. |
There was a status update in the "Cooking" section about the branch Leakfix with a new and a bit invasive test. Comments? source: <[email protected]> |
This patch series was integrated into seen via 5321b4d. |
This patch series was integrated into seen via 7b01989. |
There was a status update in the "Cooking" section about the branch Leakfix with a new and a bit invasive test. Comments? source: <[email protected]> |
This patch series was integrated into seen via 848bd83. |
This patch series was integrated into seen via d9827a8. |
There was a status update in the "Cooking" section about the branch Leakfix with a new and a bit invasive test. Comments? source: <[email protected]> |
This patch series was integrated into seen via 85a1867. |
This patch series was integrated into seen via 891a36d. |
There was a status update in the "Cooking" section about the branch Leakfix with a new and a bit invasive test. Comments? source: <[email protected]> |
This patch series was integrated into seen via 33cfe4f. |
There was a status update in the "Cooking" section about the branch Leakfix with a new and a bit invasive test. Comments? source: <[email protected]> |
This patch series was integrated into seen via dbedc7a. |
This patch series was integrated into seen via 4ac5b0b. |
There was a status update in the "Cooking" section about the branch Leakfix with a new and a bit invasive test. Comments? source: <[email protected]> |
This patch series was integrated into seen via afcd542. |
After going through the "failed" label, load_bitmap() will return -1, and its caller (either prepare_bitmap_walk() or prepare_bitmap_git()) will then call free_bitmap_index(). That function would have done: struct stored_bitmap *sb; kh_foreach_value(b->bitmaps, sb { ewah_pool_free(sb->root); free(sb); }); , but won't since load_bitmap() already called kh_destroy_oid_map() and NULL'd the "bitmaps" pointer from within its "failed" label. Thus if you got part of the way through loading bitmap entries and then failed, you would leak all of the previous entries that you were able to load successfully. The solution is to remove the error handling code in load_bitmap(), because its caller will always call free_bitmap_index() in case of an error. Signed-off-by: Taylor Blau <[email protected]> Signed-off-by: Lidong Yan <[email protected]>
The comment in pack-bitmap.c:test_bitmap_commits(), suggests that we can avoid reading the commit table altogether. However, this comment is misleading. The reason we load bitmap entries here is because test_bitmap_commits() needs to print the commit IDs from the bitmap, and we must read the bitmap entries to obtain those commit IDs. So reword this comment. Signed-off-by: Lidong Yan <[email protected]>
t5310 lacks a test to ensure git works correctly when commit bitmap data is corrupted. So this patch add test helper in pack-bitmap.c to list each commit bitmap position in bitmap file and `load corrupt bitmap` test case in t/t5310 to corrupt a commit bitmap before loading it. Signed-off-by: Lidong Yan <[email protected]>
05140e2
to
c1b5d03
Compare
/submit |
Submitted as [email protected] To fetch this version into
To fetch this version to local tag
|
Since it seems this patch has been inactive for some time, I have revised the comments according to Taylor's feedback and submitted a new version.
This patch prevents pack-bitmap.c:load_bitmap() from nulling bitmap_git->bitmap when loading failed. Thus eliminates memory leak. This patch also add a test case in t5310 which use clang leak sanitizer to detect whether leak happens when loading failed.
cc: Jeff King [email protected]
cc: Taylor Blau [email protected]