Skip to content

fix xstrdup leak in parse_short_opt #1954

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

brandb97
Copy link
Contributor

@brandb97 brandb97 commented May 6, 2025

Pass a user defined strdup-like function in parse_opt_ctx to avoid memory leak.

cc: Patrick Steinhardt [email protected]
cc: Phillip Wood [email protected]

Copy link

Welcome to GitGitGadget

Hi @brandb97, and welcome to GitGitGadget, the GitHub App to send patch series to the Git mailing list from GitHub Pull Requests.

Please make sure that either:

  • Your Pull Request has a good description, if it consists of multiple commits, as it will be used as cover letter.
  • Your Pull Request description is empty, if it consists of a single commit, as the commit message should be descriptive enough by itself.

You can CC potential reviewers by adding a footer to the PR description with the following syntax:

CC: Revi Ewer <[email protected]>, Ill Takalook <[email protected]>

NOTE: DO NOT copy/paste your CC list from a previous GGG PR's description,
because it will result in a malformed CC list on the mailing list. See
example.

Also, it is a good idea to review the commit messages one last time, as the Git project expects them in a quite specific form:

  • the lines should not exceed 76 columns,
  • the first line should be like a header and typically start with a prefix like "tests:" or "revisions:" to state which subsystem the change is about, and
  • the commit messages' body should be describing the "why?" of the change.
  • Finally, the commit messages should end in a Signed-off-by: line matching the commits' author.

It is in general a good idea to await the automated test ("Checks") in this Pull Request before contributing the patches, e.g. to avoid trivial issues such as unportable code.

Contributing the patches

Before you can contribute the patches, your GitHub username needs to be added to the list of permitted users. Any already-permitted user can do that, by adding a comment to your PR of the form /allow. A good way to find other contributors is to locate recent pull requests where someone has been /allowed:

Both the person who commented /allow and the PR author are able to /allow you.

An alternative is the channel #git-devel on the Libera Chat IRC network:

<newcontributor> I've just created my first PR, could someone please /allow me? https://github.com/gitgitgadget/git/pull/12345
<veteran> newcontributor: it is done
<newcontributor> thanks!

Once on the list of permitted usernames, you can contribute the patches to the Git mailing list by adding a PR comment /submit.

If you want to see what email(s) would be sent for a /submit request, add a PR comment /preview to have the email(s) sent to you. You must have a public GitHub email address for this. Note that any reviewers CC'd via the list in the PR description will not actually be sent emails.

After you submit, GitGitGadget will respond with another comment that contains the link to the cover letter mail in the Git mailing list archive. Please make sure to monitor the discussion in that thread and to address comments and suggestions (while the comments and suggestions will be mirrored into the PR by GitGitGadget, you will still want to reply via mail).

If you do not want to subscribe to the Git mailing list just to be able to respond to a mail, you can download the mbox from the Git mailing list archive (click the (raw) link), then import it into your mail program. If you use GMail, you can do this via:

curl -g --user "<EMailAddress>:<Password>" \
    --url "imaps://imap.gmail.com/INBOX" -T /path/to/raw.txt

To iterate on your change, i.e. send a revised patch or patch series, you will first want to (force-)push to the same branch. You probably also want to modify your Pull Request description (or title). It is a good idea to summarize the revision by adding something like this to the cover letter (read: by editing the first comment on the PR, i.e. the PR description):

Changes since v1:
- Fixed a typo in the commit message (found by ...)
- Added a code comment to ... as suggested by ...
...

To send a new iteration, just add another PR comment with the contents: /submit.

Need help?

New contributors who want advice are encouraged to join [email protected], where volunteers who regularly contribute to Git are willing to answer newbie questions, give advice, or otherwise provide mentoring to interested contributors. You must join in order to post or view messages, but anyone can join.

You may also be able to find help in real time in the developer IRC channel, #git-devel on Libera Chat. Remember that IRC does not support offline messaging, so if you send someone a private message and log out, they cannot respond to you. The scrollback of #git-devel is archived, though.

@dscho
Copy link
Member

dscho commented May 6, 2025

/allow

Copy link

User brandb97 is now allowed to use GitGitGadget.

WARNING: brandb97 has no public email address set on GitHub; GitGitGadget needs an email address to Cc: you on your contribution, so that you receive any feedback on the Git mailing list. Go to https://github.com/settings/profile to make your preferred email public to let GitGitGadget know which email address to use.

Copy link

There are issues in commit ae3f4e4:
fix whitespace issue
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit 7e31fe9:
fix size_t/int incompatible problem
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit ae3f4e4:
fix whitespace issue
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit 7e31fe9:
fix size_t/int incompatible problem
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit b80d17b:
fix clang-format error
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit ae3f4e4:
fix whitespace issue
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit 7e31fe9:
fix size_t/int incompatible problem
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit b80d17b:
fix clang-format error
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit 5f0d8e8:
fix c90 issue
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit ae3f4e4:
fix whitespace issue
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit 7e31fe9:
fix size_t/int incompatible problem
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit b80d17b:
fix clang-format error
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit 5f0d8e8:
fix c90 issue
Commit checks stopped - the message is too short
Commit not signed off

@brandb97 brandb97 force-pushed the fix-parse-option-leak branch from 25ce96a to bdb7c60 Compare May 6, 2025 09:03
Copy link

There are issues in commit ae3f4e4:
fix whitespace issue
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit 7e31fe9:
fix size_t/int incompatible problem
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit b80d17b:
fix clang-format error
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit 5f0d8e8:
fix c90 issue
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit ae3f4e4:
fix whitespace issue
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit 7e31fe9:
fix size_t/int incompatible problem
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit b80d17b:
fix clang-format error
Commit checks stopped - the message is too short
Commit not signed off

Copy link

There are issues in commit 5f0d8e8:
fix c90 issue
Commit checks stopped - the message is too short
Commit not signed off

@brandb97 brandb97 force-pushed the fix-parse-option-leak branch 2 times, most recently from c936abc to a9cbca6 Compare May 7, 2025 00:58
@brandb97
Copy link
Contributor Author

brandb97 commented May 7, 2025

/submit

Copy link

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-git-1954/brandb97/fix-parse-option-leak-v1

To fetch this version to local tag pr-git-1954/brandb97/fix-parse-option-leak-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-git-1954/brandb97/fix-parse-option-leak-v1

Makefile Outdated
@@ -822,6 +822,7 @@ TEST_BUILTINS_OBJS += test-online-cpus.o
TEST_BUILTINS_OBJS += test-pack-mtimes.o

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, May 07, 2025 at 02:33:21AM +0000, Lidong Yan via GitGitGadget wrote:
> From: Lidong Yan <[email protected]>

This commit is missing an explanation of the problem. When does it
trigger? Which subsystem triggers it? Why didn't we observe the issue
beforehand? And given that the solution seems to be a bit more complex
it should also explain how you're fixing the issue.

Furthermore, the subject of such a commit should be prefixed with the
subsystem in which you're fixing the issue. So in your case,
"parse-options:".

I'd recommend reading through "Documentation/MyFirstContribution.adoc",
which explains all of this.

> diff --git a/parse-options.c b/parse-options.c
> index a9a39ecaef6c..4279dfe4d313 100644
> --- a/parse-options.c
> +++ b/parse-options.c
> @@ -638,6 +638,16 @@ static int has_subcommands(const struct option *options)
>  	return 0;
>  }
>  
> +static void set_strdup_fn(struct parse_opt_ctx_t *ctx, const struct option *options) {

Style: the opening brace should be on its own line.

> +	for (; options->type != OPTION_END; options++)
> +		;
> +	if (options->value && options->strdup_fn) {
> +		ctx->unknown_opts = options->value;
> +		ctx->strdup_fn = options->strdup_fn;
> +		return;
> +	}

It's not really clear what this is doing. Is the intent to only change
this this value for the last option? If so, why?

> @@ -981,7 +992,11 @@ enum parse_opt_result parse_options_step(struct parse_opt_ctx_t *ctx,
>  					 *
>  					 * This is leaky, too bad.
>  					 */
> -					ctx->argv[0] = xstrdup(ctx->opt - 1);
> +					if (ctx->unknown_opts && ctx->strdup_fn) {
> +						ctx->argv[0] = ctx->strdup_fn(ctx->unknown_opts, ctx->opt - 1);
> +					} else {
> +						ctx->argv[0] = xstrdup(ctx->opt - 1);
> +					}
>  					*(char *)ctx->argv[0] = '-';
>  					goto unknown;
>  				case PARSE_OPT_NON_OPTION:

Hm. I'm at a loss what we're solving here. All of this really should be
explained in the commit message to give context to the reviewer.

> diff --git a/parse-options.h b/parse-options.h
> index 91c3e3c29b3d..af06a09abb8e 100644
> --- a/parse-options.h
> +++ b/parse-options.h
> @@ -388,6 +391,12 @@ static char *parse_options_noop_ignored_value MAYBE_UNUSED;
>  }
>  #define OPT_SUBCOMMAND(l, v, fn)    OPT_SUBCOMMAND_F((l), (v), (fn), 0)
>  
> +#define OPT_UNKNOWN(v, fn) { \
> +	.type = OPTION_END, \
> +	.value = (v), \
> +	.strdup_fn = (fn), \
> +}
> +
>  /*
>   * parse_options() will filter out the processed options and leave the
>   * non-option arguments in argv[]. argv0 is assumed program name and
> @@ -496,6 +505,9 @@ struct parse_opt_ctx_t {
>  	const char *prefix;
>  	const char **alias_groups; /* must be in groups of 3 elements! */
>  	struct parse_opt_cmdmode_list *cmdmode_list;
> +
> +	void *unknown_opts;
> +	parse_opt_strdup_fn *strdup_fn;
>  };
>  
>  void parse_options_start(struct parse_opt_ctx_t *ctx,

Okay. So the intent seems to be that we start storing any options that
we don't understand?

> diff --git a/t/helper/test-free-unknown-options.c b/t/helper/test-free-unknown-options.c
> new file mode 100644
> index 000000000000..9d658115ba8f
> --- /dev/null
> +++ b/t/helper/test-free-unknown-options.c
> @@ -0,0 +1,35 @@
> +#include "git-compat-util.h"
> +#include "parse-options.h"
> +#include "setup.h"
> +#include "strvec.h"
> +
> +static const char *const free_unknown_options_usage[] = {
> +    "test-tool free-unknown-options",
> +    NULL
> +};

Can't we expand `test-tool parse-options` instead of introducing a new
helper?

> +int cmd__free_unknown_options(int argc, const char **argv) {
> +    struct strvec *unknown_opts = xmalloc(sizeof(struct strvec));
> +    strvec_init(unknown_opts);
> +    const char *prefix = setup_git_directory();
> +
> +    bool a, b;
> +	struct option options[] = {
> +		OPT_BOOL('a', "test-a", &a, N_("option a, only for test use")),
> +	OPT_BOOL('b', "test-b", &b, N_("option b, only for test use")),
> +	OPT_UNKNOWN(unknown_opts, (parse_opt_strdup_fn *)&strvec_push),
> +	};
> +
> +    parse_options(argc, argv, prefix, options,
> +	free_unknown_options_usage, PARSE_OPT_KEEP_UNKNOWN_OPT);
> +
> +    printf("a = %s\n", a? "true": "false");
> +    printf("b = %s\n", b? "true": "false");
> +
> +    int i;
> +    for (i = 0; i < unknown_opts->nr; i++) {
> +	printf("free unknown option: %s\n", unknown_opts->v[i]);
> +    }
> +    strvec_clear(unknown_opts);
> +    free(unknown_opts);
> +}
> \ No newline at end of file

Missing newline.

> diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
> index 6d62a5b53d95..33fa7828b9f6 100644
> --- a/t/helper/test-tool.h
> +++ b/t/helper/test-tool.h
> @@ -44,6 +44,7 @@ int cmd__parse_options(int argc, const char **argv);
>  int cmd__parse_options_flags(int argc, const char **argv);
>  int cmd__parse_pathspec_file(int argc, const char** argv);
>  int cmd__parse_subcommand(int argc, const char **argv);
> +int cmd__free_unknown_options(int argc, const char **argv);
>  int cmd__partial_clone(int argc, const char **argv);
>  int cmd__path_utils(int argc, const char **argv);
>  int cmd__path_walk(int argc, const char **argv);
> diff --git a/t/t0040-parse-options.sh b/t/t0040-parse-options.sh
> index ca55ea8228c3..773f54103fd3 100755
> --- a/t/t0040-parse-options.sh
> +++ b/t/t0040-parse-options.sh
> @@ -822,4 +822,18 @@ test_expect_success 'u16 limits range' '
>  	test_grep "value 65536 for option .u16. not in range \[0,65535\]" err
>  '
>  
> +cat >expect <<\EOF
> +a = true
> +b = true
> +free unknown option: -c
> +free unknown option: -d
> +EOF

This should be done inside `test_expect_success` itself.

> +test_expect_success 'free unknown options' '
> +	test-tool free-unknown-options -ac -bd \
> +	>output 2>output.err &&
> +	test_cmp expect output &&
> +	test_must_be_empty output.err
> +'
> +
>  test_done

Okay, so the memory leak seems to happen when handling unknown options
indeed, as I started to expect after the middle of this patch. In any
case, this commit really needs to be polished to have a proper commit
message that sets the stage for the reviewer.

Patrick

Copy link

User Patrick Steinhardt <[email protected]> has been added to the cc: list.

parse-options.c Outdated
@@ -638,6 +638,18 @@ static int has_subcommands(const struct option *options)
return 0;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, May 07, 2025 at 02:33:22AM +0000, Lidong Yan via GitGitGadget wrote:
> From: Lidong Yan <[email protected]>
> 
> Signed-off-by: Lidong Yan <[email protected]>

The same is true here -- this needs to have a proper commit subject and
message.

> diff --git a/parse-options.c b/parse-options.c
> index 4279dfe4d313..12721b83000a 100644
> --- a/parse-options.c
> +++ b/parse-options.c
> @@ -638,7 +638,9 @@ static int has_subcommands(const struct option *options)
>  	return 0;
>  }
>  
> -static void set_strdup_fn(struct parse_opt_ctx_t *ctx, const struct option *options) {
> +static void set_strdup_fn(struct parse_opt_ctx_t *ctx,
> +			  const struct option *options)
> +{
>  	for (; options->type != OPTION_END; options++)
>  		;
>  	if (options->value && options->strdup_fn) {

Please refrain from doing code style cleanups of code you have just
introduced in the preceding commit.

Patrick

#include "test-tool.h"

static const char *const free_unknown_options_usage[] = {
"test-tool free-unknown-options", NULL

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, May 07, 2025 at 02:33:23AM +0000, Lidong Yan via GitGitGadget wrote:
> From: Lidong Yan <[email protected]>
> 
> Signed-off-by: Lidong Yan <[email protected]>

This looks like another fixup-style commit. I assume that all of this
really should be a single commit, as the latter two commits don't seem
to do anything new compared to the first commit.

Patrick

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, lidongyan wrote (reply to this):

Thank you for your suggestion. I will squash all the commits and include a detailed description in the next submission.

In short, calling parse_options may cause a memory leak when encountering unknown short options. To fix the leak, parse_options can use a user-provided allocation function, so that the user can later free the memory and avoid the leak. I also defined a macro called OPT_UNKNOWN to make it easier for users to pass in their own allocation function.

On the other hand, I think a new file should be added under t/helper/. The reason is that this new file not only serves as a test, but also provides a simple example of using OPT_UNKNOWN, which others can refer to when modifying their own option definitions to avoid memory leaks.

> 2025年5月7日 15:54,Patrick Steinhardt <[email protected]> 写道:
> 
> On Wed, May 07, 2025 at 02:33:23AM +0000, Lidong Yan via GitGitGadget wrote:
>> From: Lidong Yan <[email protected]>
>> 
>> Signed-off-by: Lidong Yan <[email protected]>
> 
> This looks like another fixup-style commit. I assume that all of this
> really should be a single commit, as the latter two commits don't seem
> to do anything new compared to the first commit.
> 
> Patrick
> 

@brandb97 brandb97 force-pushed the fix-parse-option-leak branch from a9cbca6 to 998ce1e Compare May 7, 2025 11:31
Copy link

There are issues in commit 998ce1e:
parse-options: fix xstrdup leak in parse_options_step parse-options:984
Lines in the body of the commit messages should be wrapped between 60 and 76 characters.
Indented lines, and lines without whitespace, are exempt

@brandb97 brandb97 force-pushed the fix-parse-option-leak branch 3 times, most recently from c456910 to e7b4465 Compare May 7, 2025 12:48
@brandb97
Copy link
Contributor Author

brandb97 commented May 7, 2025

/submit

Copy link

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-git-1954/brandb97/fix-parse-option-leak-v2

To fetch this version to local tag pr-git-1954/brandb97/fix-parse-option-leak-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-git-1954/brandb97/fix-parse-option-leak-v2

Copy link

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, May 07, 2025 at 01:24:53PM +0000, Lidong Yan via GitGitGadget wrote:
> From: Lidong Yan <[email protected]>
> 
> Most Git commands use parse_options to parse options, but parse_options
> causes a memory leak when it encounters unknown short options. The leak
> occurs at line 984 in parse-options.c, where git allocates some spaces

Again, no need to quote the exact line number.

> to store the unknown short options and will never release those spaces.
> 
> To address this issue, users can be allowed to provide a custom function
> to allocate memory for unknown options. For example, users can use
> strvec_push to allocate memory for unknown options and later call
> strvec_clear at the end of the Git command to release the memory, thereby
> avoiding the memory leak.

So does that mean that _every_ user of the "parse-options" interfaces
now has to explicitly plug this memory leak when facing unknown options?
That sounds rather undesirable, as there are so many users out there.

> This commit allows users to provide a custom allocation function for
> unknown options via the strdup_fn field in the last struct option. For
> convenience, this commit also introduces a new macro, OPT_UNKNOWN, which
> behaves like OPT_END but takes an additional argument for the allocation
> function. parse_options could get the custom allocation function in struct
> parse_opt_ctx_t by using set_strdup_fn. A simple example to use
> OPT_UNKNOWN is put into t/helper/test-free-unknown-options.c

Hm. Is there any other usecase for the `strdup_fn` field that you can
think about in the future? Otherwise it feels a bit overengineered from
my perspective.

I unfortuntaely don't think the proposed solution works well. To really
plug the memory leak everywhere, we would have to convert every single
user of `parse_options()` to pass `OPTION_UNKNOWN()` instead of
`OPTION_END()`, pass a strvec and clear that strvec at the end. It just
doesn't quite feel like it's worth it to plug a single, bounded memory
leak.

The most important problem that we're facing is that the allocated
string has a lifetime longer than `parse_options()`, because it really
only becomes relevant in the case where `PARSE_OPT_KEEP_UNKNOWN_OPT` is
enabled. If so, the _caller_ of `parse_options()` will want to process
the modified `argv`, and thus they have to be the ones responsible for
freeing potentially-allocated memory once it's unused.

But that'd require us to keep some state. You solve this via the extra
strvec, but that doesn't really look like a great solution to me. An
alternative would be to expose the `parse_options_ctx` and let the
caller eventually call `parse_options_clear()` or something like that.
This would also require us to modify every caller, but the benefit would
be that we now have a single place where we can always allocate memory
in without having to worry about its lifetime.

All of that starts to become kind of involved though. So unless others
disagree with my analysis I just don't think this edge case really is
worth the complexity.

Thanks!

Patrick

Copy link

On the Git mailing list, lidongyan wrote (reply to this):

2025年5月9日 14:19,Patrick Steinhardt <[email protected]> 写道:
> So does that mean that _every_ user of the "parse-options" interfaces
> now has to explicitly plug this memory leak when facing unknown options?
> That sounds rather undesirable, as there are so many users out there.

Since the lifetime of `argv` last until the program terminates, a memory leak
can only occur if parse_option is called multiple times and at least two of 
those calls use the `PARSE_OPT_KEEP_UNKNOWN` flag. In the other
words, the memory leak only occurs when the statement `ctx->argue[0] = xstrdup`
overwrites the result of a previous `xstrdup` call.

> Hm. Is there any other usecase for the `strdup_fn` field that you can
> think about in the future? Otherwise it feels a bit overengineered from
> my perspective.

I think a simple approach is to add a marker to the string allocated by `xstrdup`
, and before the next potential leaking `ctx->argv[0] = xstrdup`, check whether the
string needs to be freed. Like we could allocate one more byte in the end of the
string to store the marker.

Copy link

On the Git mailing list, Phillip Wood wrote (reply to this):

On 09/05/2025 07:19, Patrick Steinhardt wrote:
> > All of that starts to become kind of involved though. So unless others
> disagree with my analysis I just don't think this edge case really is
> worth the complexity.

I agree with this. We seem to be copying the string because argv is a
const char** but would it really be so bad to modify the string that
gets passed to us? We know that any bundled options that come before
an unknown option cannot have a value or else what we see as the
unknown option would be part of that value. So I think we could do


diff --git a/parse-options.c b/parse-options.c
index 35fbb3b0d6..9e6e46da27 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -924,12 +924,12 @@ enum parse_opt_result parse_options_step(struct parse_opt_ctx_t *ctx,
                                         if (internal_help && *ctx->opt == 'h')
                                                 goto show_usage;

-                                        /* fake a short option thing to hide the fact that we may have
+                                        /* move a short option thing to hide the fact that we may have
                                          * started to parse aggregated stuff
-                                         *
-                                         * This is leaky, too bad.
                                          */
-                                        ctx->argv[0] = xstrdup(ctx->opt - 1);
+                                        MOVE_ARRAY((char *)arg, ctx->opt - 1,
+                                                   strlen(ctx->opt) + 2);
+                                        ctx->argv[0] = arg;
                                         *(char *)ctx->argv[0] = '-';
                                         goto unknown;
                                 case PARSE_OPT_NON_OPTION:

Best Wishes

Phillip

Copy link

User Phillip Wood <[email protected]> has been added to the cc: list.

@brandb97 brandb97 force-pushed the fix-parse-option-leak branch 3 times, most recently from d451623 to c35722e Compare May 9, 2025 13:44
Copy link

On the Git mailing list, lidongyan wrote (reply to this):

On 09/05/2025 21:08,Phillip Wood <[email protected]> write:
> diff --git a/parse-options.c b/parse-options.c
> index 35fbb3b0d6..9e6e46da27 100644
> --- a/parse-options.c
> +++ b/parse-options.c
> @@ -924,12 +924,12 @@ enum parse_opt_result parse_options_step(struct parse_opt_ctx_t *ctx,
>                                         if (internal_help && *ctx->opt == 'h')
>                                                 goto show_usage;
> 
> -                                        /* fake a short option thing to hide the fact that we may have
> +                                        /* move a short option thing to hide the fact that we may have
>                                          * started to parse aggregated stuff
> -                                         *
> -                                         * This is leaky, too bad.
>                                          */
> -                                        ctx->argv[0] = xstrdup(ctx->opt - 1);
> +                                        MOVE_ARRAY((char *)arg, ctx->opt - 1,
> +                                                   strlen(ctx->opt) + 2);
> +                                        ctx->argv[0] = arg;
>                                         *(char *)ctx->argv[0] = '-';
>                                         goto unknown;
>                                 case PARSE_OPT_NON_OPTION:

I’m not sure why git used to use `xtrdup` here instead of modifying `arg` directly.
If modifying `arg` is safe, then this is indeed a good solution.

call parse_options twice and occasionally meet unknown option
in the same place would cause memory leak, since the second `xstrdup`
in `parse_options_step` would make the first `xstrdup` unreachable.

One solution is allocate one more magic byte for the unknown option to
indicate that argv[?] stores a heap allocated arg. Assume for all
arg, `*((char *)arg - 1)` is valid, we could put a magic number before
the unknown option. Next time calling "xstrdup" will check the magic
number first, and frees the previous heap allocated memory.

Signed-off-by: Lidong Yan <[email protected]>
@brandb97 brandb97 force-pushed the fix-parse-option-leak branch from c35722e to 475f7b5 Compare May 9, 2025 13:46
@brandb97
Copy link
Contributor Author

brandb97 commented May 9, 2025

/submit

Copy link

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-git-1954/brandb97/fix-parse-option-leak-v3

To fetch this version to local tag pr-git-1954/brandb97/fix-parse-option-leak-v3:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-git-1954/brandb97/fix-parse-option-leak-v3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants