2013-10-12 18:11:51

by Krzysztof Mazur

[permalink] [raw]
Subject: [PATCH] init: fix in-place parameter modification regression

Before commit 026cee0086fe1df4cf74691cf273062cc769617d
("params: <level>_initcall-like kernel parameters") the __setup
parameter parsing code could modify parameter in the
static_command_line buffer and such modifications were kept. After
that commit such modifications are destroyed during per-initcall level
parameter parsing because the same static_command_line buffer is used
and only parameters for appropriate initcall level are parsed.

That change broke at least parsing "ubd" parameter in the ubd driver
when the COW file is used.

Now the separate buffer is used for per-initcall parameter parsing,
like in parsing early params.

Signed-off-by: Krzysztof Mazur <[email protected]>
---
Hi,

this patch fixes an old Linux 3.4 regression in ubd parameter parsing.
It was previously reported by the David Fern?ndez in the
"ubd option parsing problem when using cow filesystems in kernel 3.4"
thread on user-mode-linux-devel mailing list
(http://marc.info/?t=134009640700003&r=1&w=2).
The bug still exists in the Linux 3.12-rc4.

I've been using a different patch that changed the ubd driver. I just
copied the parsed filename to separate buffer, but I think it's better
to fix generic code.

Maybe the tmp_cmdline should be shared with the parse_early_param().

Regards,
Krzysiek

init/main.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/init/main.c b/init/main.c
index 63d3e8f..e5b322a 100644
--- a/init/main.c
+++ b/init/main.c
@@ -742,12 +742,13 @@ static char *initcall_level_names[] __initdata = {

static void __init do_initcall_level(int level)
{
+ static __initdata char tmp_cmdline[COMMAND_LINE_SIZE];
extern const struct kernel_param __start___param[], __stop___param[];
initcall_t *fn;

- strcpy(static_command_line, saved_command_line);
+ strcpy(tmp_cmdline, saved_command_line);
parse_args(initcall_level_names[level],
- static_command_line, __start___param,
+ tmp_cmdline, __start___param,
__stop___param - __start___param,
level, level,
&repair_env_string);
--
1.8.4.652.g0d6e0ce


2013-10-14 08:57:37

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] init: fix in-place parameter modification regression

Krzysztof Mazur <[email protected]> writes:
> Before commit 026cee0086fe1df4cf74691cf273062cc769617d
> ("params: <level>_initcall-like kernel parameters") the __setup
> parameter parsing code could modify parameter in the
> static_command_line buffer and such modifications were kept. After
> that commit such modifications are destroyed during per-initcall level
> parameter parsing because the same static_command_line buffer is used
> and only parameters for appropriate initcall level are parsed.
>
> That change broke at least parsing "ubd" parameter in the ubd driver
> when the COW file is used.
>
> Now the separate buffer is used for per-initcall parameter parsing,
> like in parsing early params.

How about just removing "strcpy(static_command_line, saved_command_line);"
from do_initcall_level altogether? We already initialize it in
setup_command_line().

Cheers,
Rusty.

2013-10-14 09:28:20

by Krzysztof Mazur

[permalink] [raw]
Subject: Re: [PATCH] init: fix in-place parameter modification regression

On Mon, Oct 14, 2013 at 06:06:23PM +1030, Rusty Russell wrote:
> Krzysztof Mazur <[email protected]> writes:
> > Before commit 026cee0086fe1df4cf74691cf273062cc769617d
> > ("params: <level>_initcall-like kernel parameters") the __setup
> > parameter parsing code could modify parameter in the
> > static_command_line buffer and such modifications were kept. After
> > that commit such modifications are destroyed during per-initcall level
> > parameter parsing because the same static_command_line buffer is used
> > and only parameters for appropriate initcall level are parsed.
> >
> > That change broke at least parsing "ubd" parameter in the ubd driver
> > when the COW file is used.
> >
> > Now the separate buffer is used for per-initcall parameter parsing,
> > like in parsing early params.
>
> How about just removing "strcpy(static_command_line, saved_command_line);"
> from do_initcall_level altogether? We already initialize it in
> setup_command_line().
>

That does not work because the command line is always modified -
at least '\0' are added to split strings.

To be sure I tested that with:

diff --git a/kernel/params.c b/kernel/params.c
index c00d5b5..606058a 100644
--- a/kernel/params.c
+++ b/kernel/params.c
@@ -96,6 +96,7 @@ static int parse_one(char *param,
unsigned int i;
int err;

+ printk("param %d-%d: %s\n", min_level, max_level, param);
/* Find parameter */
for (i = 0; i < num_params; i++) {
if (parameq(param, params[i].name)) {


And without that strcpy() I get:

[...]
param 0-0: ubd0
param 1-1: ubd0
[...]

instead of:

[...]
param 0-0: ubd0
param 0-0: ubd1
param 0-0: raid
param 0-0: hostfs
param 0-0: root
param 1-1: ubd0
param 1-1: ubd1
param 1-1: raid
param 1-1: hostfs
param 1-1: root
[...]

I checked that also without udb parameters.

Thanks,
Krzysiek

2013-10-14 11:34:07

by Pawel Moll

[permalink] [raw]
Subject: Re: [PATCH] init: fix in-place parameter modification regression

On Sat, 2013-10-12 at 19:05 +0100, Krzysztof Mazur wrote:
> Before commit 026cee0086fe1df4cf74691cf273062cc769617d
> ("params: <level>_initcall-like kernel parameters") the __setup
> parameter parsing code could modify parameter in the
> static_command_line buffer and such modifications were kept. After
> that commit such modifications are destroyed during per-initcall level
> parameter parsing because the same static_command_line buffer is used
> and only parameters for appropriate initcall level are parsed.
>
> That change broke at least parsing "ubd" parameter in the ubd driver
> when the COW file is used.
>
> Now the separate buffer is used for per-initcall parameter parsing,
> like in parsing early params.
>
> Signed-off-by: Krzysztof Mazur <[email protected]>
> ---
>
> init/main.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/init/main.c b/init/main.c
> index 63d3e8f..e5b322a 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -742,12 +742,13 @@ static char *initcall_level_names[] __initdata = {
>
> static void __init do_initcall_level(int level)
> {
> + static __initdata char tmp_cmdline[COMMAND_LINE_SIZE];
> extern const struct kernel_param __start___param[], __stop___param[];
> initcall_t *fn;
>
> - strcpy(static_command_line, saved_command_line);
> + strcpy(tmp_cmdline, saved_command_line);
> parse_args(initcall_level_names[level],
> - static_command_line, __start___param,
> + tmp_cmdline, __start___param,
> __stop___param - __start___param,
> level, level,
> &repair_env_string);

Hold on. static_command_line can be only accessed within init/main.c. As
far as I can say, it is only used by unknown_bootoption() (so your
__setup callbacks) and then in the do_initcall_level().

So, assuming that it is actually legal to modify static_command_line in
__setup()-s (and I must say I have rather mixed feelings about it ;-),
the only place that will be able to make use of this changes will be
do_initcall_level().

But your change actually uses *saved*_command_line instead of
*static*_command_line as the base for parse_args.

Generally I agree that the commit in question changed the semantics in a
subtle way - it makes the do_initcalls() use saved_command_line as a
string to be parsed instead of static_command_line. I was convinced that
at this stage of execution they must be identical (and the
static_command_line is a de-facto scratch buffer). You're saying that is
may not be the case, which can be true, but you're keeping the same
behaviour :-)

So either you have some extra changes in your kernel actually using
static_command_line for some other reason, or your change makes no
difference. The third option is me being brain dead today, which is not
impossible ;-)

Paweł

2013-10-14 12:50:10

by Krzysztof Mazur

[permalink] [raw]
Subject: Re: [PATCH] init: fix in-place parameter modification regression

On Mon, Oct 14, 2013 at 12:34:02PM +0100, Pawel Moll wrote:
> On Sat, 2013-10-12 at 19:05 +0100, Krzysztof Mazur wrote:
> > Before commit 026cee0086fe1df4cf74691cf273062cc769617d
> > ("params: <level>_initcall-like kernel parameters") the __setup
> > parameter parsing code could modify parameter in the
> > static_command_line buffer and such modifications were kept. After
> > that commit such modifications are destroyed during per-initcall level
> > parameter parsing because the same static_command_line buffer is used
> > and only parameters for appropriate initcall level are parsed.
> >
> > That change broke at least parsing "ubd" parameter in the ubd driver
> > when the COW file is used.
> >
> > Now the separate buffer is used for per-initcall parameter parsing,
> > like in parsing early params.
> >
> > Signed-off-by: Krzysztof Mazur <[email protected]>
> > ---
> >
> > init/main.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/init/main.c b/init/main.c
> > index 63d3e8f..e5b322a 100644
> > --- a/init/main.c
> > +++ b/init/main.c
> > @@ -742,12 +742,13 @@ static char *initcall_level_names[] __initdata = {
> >
> > static void __init do_initcall_level(int level)
> > {
> > + static __initdata char tmp_cmdline[COMMAND_LINE_SIZE];
> > extern const struct kernel_param __start___param[], __stop___param[];
> > initcall_t *fn;
> >
> > - strcpy(static_command_line, saved_command_line);
> > + strcpy(tmp_cmdline, saved_command_line);
> > parse_args(initcall_level_names[level],
> > - static_command_line, __start___param,
> > + tmp_cmdline, __start___param,
> > __stop___param - __start___param,
> > level, level,
> > &repair_env_string);
>
> Hold on. static_command_line can be only accessed within init/main.c. As
> far as I can say, it is only used by unknown_bootoption() (so your
> __setup callbacks) and then in the do_initcall_level().

But I think it's legal to keep the pointer to the option argument
(which points to static_command_line) in __setup callback and use it later,
and assume that the argument value will never change.

However, with my patch it's no longer true for per-initcall arguments
because they share the same command line buffer, so the tmp_cmdline
have the __initdata attribute. The same restriction applies to
the "early params".

>
> So, assuming that it is actually legal to modify static_command_line in
> __setup()-s (and I must say I have rather mixed feelings about it ;-),

I also have mixed feelings about that, but the parse_args() already
does that, because some characters are replaced with '\0' to split
command line into separate strings. The ubd driver does the same
to split parameter into two strings.

So after parsing, the command line:

"ubd0=cow_file,file hostfs=/path"

is changed to:

"ubd0\0cow_file\0file\0hostfs\0path"


However after do_initcalls(), because __setup("ubd", ...) callback is not
called, it's changed to:

"ubd0\0cow_file,file\0hostfs\0path"

and the ubd driver tries to use "cow_file,file" as COW file because
it just saves pointer.


With my patch because do_initcall_level() uses different scratch buffer
this difference is not visible to the ubd driver.

> the only place that will be able to make use of this changes will be
> do_initcall_level().
>
> But your change actually uses *saved*_command_line instead of
> *static*_command_line as the base for parse_args.

Yes, because static_command_line is already destroyed by parsing
(NUL terminators are added).

>
> Generally I agree that the commit in question changed the semantics in a
> subtle way - it makes the do_initcalls() use saved_command_line as a
> string to be parsed instead of static_command_line. I was convinced that
> at this stage of execution they must be identical (and the

No, at that stage the static_command_line is already parsed by
parse_args("Booting kernel", ...).

> static_command_line is a de-facto scratch buffer). You're saying that is
> may not be the case, which can be true, but you're keeping the same
> behaviour :-)
>
> So either you have some extra changes in your kernel actually using
> static_command_line for some other reason, or your change makes no
> difference. The third option is me being brain dead today, which is not
> impossible ;-)
>

I've been using vanilla v3.12-rc4-92-gb68ba36. Now I'm using v3.12-rc5.

Thanks,
Krzysiek

2013-10-14 13:37:19

by Pawel Moll

[permalink] [raw]
Subject: Re: [PATCH] init: fix in-place parameter modification regression

On Mon, 2013-10-14 at 13:50 +0100, Krzysztof Mazur wrote:
> > Hold on. static_command_line can be only accessed within init/main.c. As
> > far as I can say, it is only used by unknown_bootoption() (so your
> > __setup callbacks) and then in the do_initcall_level().
>
> But I think it's legal to keep the pointer to the option argument
> (which points to static_command_line) in __setup callback and use it later,
> and assume that the argument value will never change.

Ah, of course. I forgot about this possibility. Yes, I think it's even
mentioned in a comment somewhere.

> However, with my patch it's no longer true for per-initcall arguments
> because they share the same command line buffer, so the tmp_cmdline
> have the __initdata attribute. The same restriction applies to
> the "early params".

Right. I'm glad you've pointed this out :-)

I'm not sure if this is acceptable. I'll stare at the code for a while,
Rusty may have some idea as well.

> > So, assuming that it is actually legal to modify static_command_line in
> > __setup()-s (and I must say I have rather mixed feelings about it ;-),
>
> I also have mixed feelings about that, but the parse_args() already
> does that, because some characters are replaced with '\0' to split
> command line into separate strings.

Yeah, that's why I referred to static_command_line as "scratch buffer".
Of course when someone keeps pointer to it, it has to be carefully
modified...

> The ubd driver does the same
> to split parameter into two strings.
>
> So after parsing, the command line:
>
> "ubd0=cow_file,file hostfs=/path"
>
> is changed to:
>
> "ubd0\0cow_file\0file\0hostfs\0path"

Ok, such kind of changes, I see... Still ugly, but something I think I
can swallow...

> > Generally I agree that the commit in question changed the semantics in a
> > subtle way - it makes the do_initcalls() use saved_command_line as a
> > string to be parsed instead of static_command_line. I was convinced that
> > at this stage of execution they must be identical (and the
>
> No, at that stage the static_command_line is already parsed by
> parse_args("Booting kernel", ...).

I meant "identical" with the '\0' approximation. I'm getting the point
though.

> > static_command_line is a de-facto scratch buffer). You're saying that is
> > may not be the case, which can be true, but you're keeping the same
> > behaviour :-)
> >
> > So either you have some extra changes in your kernel actually using
> > static_command_line for some other reason, or your change makes no
> > difference. The third option is me being brain dead today, which is not
> > impossible ;-)
> >
>
> I've been using vanilla v3.12-rc4-92-gb68ba36. Now I'm using v3.12-rc5.

Ok, I think I understand the problem now. Let me think about it for a
while.

Paweł

2013-10-18 06:09:28

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] init: fix in-place parameter modification regression

Krzysztof Mazur <[email protected]> writes:
> On Mon, Oct 14, 2013 at 12:34:02PM +0100, Pawel Moll wrote:
>> So, assuming that it is actually legal to modify static_command_line in
>> __setup()-s (and I must say I have rather mixed feelings about it ;-),
>
> I also have mixed feelings about that, but the parse_args() already
> does that, because some characters are replaced with '\0' to split
> command line into separate strings. The ubd driver does the same
> to split parameter into two strings.

Back when there was almost no parameter parsing support, everyone got
used to keeping pointers into the original. Making everyone kstrdup()
seems like gratuitous churn which is likely to make more bugs.

Your fix means __setup() gets treated specially, in that only it can
mangle the command line. That makes sense. But it introduces another
regression: normal parsing functions can't keep pointers, since that's
now __initdata.

There are two possible solutions:
(1) Audit all __setup to make sure they copy if they want to mangle.
There are about 750 of them, but many are trivial.
(2) alloc_bootmem() a third commandline for parsing.

Now, many functions of form __setup("XXX=") should be turned into
module_param anyway.

I suggest we do (2) for the moment, and start sweeping through cleaning
up __setup() in the longer term.

Cheers,
Rusty.

2013-10-18 09:19:27

by Krzysztof Mazur

[permalink] [raw]
Subject: Re: [PATCH] init: fix in-place parameter modification regression

On Fri, Oct 18, 2013 at 02:20:38PM +1030, Rusty Russell wrote:
> Back when there was almost no parameter parsing support, everyone got
> used to keeping pointers into the original. Making everyone kstrdup()
> seems like gratuitous churn which is likely to make more bugs.
>
> Your fix means __setup() gets treated specially, in that only it can
> mangle the command line. That makes sense. But it introduces another
> regression: normal parsing functions can't keep pointers, since that's
> now __initdata.
>
> There are two possible solutions:
> (1) Audit all __setup to make sure they copy if they want to mangle.
> There are about 750 of them, but many are trivial.
> (2) alloc_bootmem() a third commandline for parsing.
>
> Now, many functions of form __setup("XXX=") should be turned into
> module_param anyway.
>
> I suggest we do (2) for the moment, and start sweeping through cleaning
> up __setup() in the longer term.
>

Yes, the buffer cannot be __initdata. I'm sending an updated patch.


However, keeping pointers to buffer, that will be reinitialized
in next initcall parameters parsing pass, might cause some race
conditions.

Thanks,
Krzysiek

-- >8 --
Subject: [PATCH v2] init: fix in-place parameter modification regression

Before commit 026cee0086fe1df4cf74691cf273062cc769617d
("params: <level>_initcall-like kernel parameters") the __setup
parameter parsing code could modify parameter in the
static_command_line buffer and such modifications were kept. After
that commit such modifications are destroyed during per-initcall level
parameter parsing because the same static_command_line buffer is used
and only parameters for appropriate initcall level are parsed.

That change broke at least parsing "ubd" parameter in the ubd driver
when the COW file is used.

Now the separate buffer is used for per-initcall parameter parsing.

Signed-off-by: Krzysztof Mazur <[email protected]>
---
init/main.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/init/main.c b/init/main.c
index 63d3e8f..c093b5c 100644
--- a/init/main.c
+++ b/init/main.c
@@ -132,6 +132,8 @@ char __initdata boot_command_line[COMMAND_LINE_SIZE];
char *saved_command_line;
/* Command line for parameter parsing */
static char *static_command_line;
+/* Command line for per-initcall parameter parsing */
+static char *initcall_command_line;

static char *execute_command;
static char *ramdisk_execute_command;
@@ -348,6 +350,7 @@ static inline void smp_prepare_cpus(unsigned int maxcpus) { }
static void __init setup_command_line(char *command_line)
{
saved_command_line = alloc_bootmem(strlen (boot_command_line)+1);
+ initcall_command_line = alloc_bootmem(strlen (boot_command_line)+1);
static_command_line = alloc_bootmem(strlen (command_line)+1);
strcpy (saved_command_line, boot_command_line);
strcpy (static_command_line, command_line);
@@ -745,9 +748,9 @@ static void __init do_initcall_level(int level)
extern const struct kernel_param __start___param[], __stop___param[];
initcall_t *fn;

- strcpy(static_command_line, saved_command_line);
+ strcpy(initcall_command_line, saved_command_line);
parse_args(initcall_level_names[level],
- static_command_line, __start___param,
+ initcall_command_line, __start___param,
__stop___param - __start___param,
level, level,
&repair_env_string);
--
1.8.4.1.635.g55556a5

2013-10-21 03:50:30

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] init: fix in-place parameter modification regression

Krzysztof Mazur <[email protected]> writes:
> On Fri, Oct 18, 2013 at 02:20:38PM +1030, Rusty Russell wrote:
>> Back when there was almost no parameter parsing support, everyone got
>> used to keeping pointers into the original. Making everyone kstrdup()
>> seems like gratuitous churn which is likely to make more bugs.
>>
>> Your fix means __setup() gets treated specially, in that only it can
>> mangle the command line. That makes sense. But it introduces another
>> regression: normal parsing functions can't keep pointers, since that's
>> now __initdata.
>>
>> There are two possible solutions:
>> (1) Audit all __setup to make sure they copy if they want to mangle.
>> There are about 750 of them, but many are trivial.
>> (2) alloc_bootmem() a third commandline for parsing.
>>
>> Now, many functions of form __setup("XXX=") should be turned into
>> module_param anyway.
>>
>> I suggest we do (2) for the moment, and start sweeping through cleaning
>> up __setup() in the longer term.
>>
>
> Yes, the buffer cannot be __initdata. I'm sending an updated patch.
>
>
> However, keeping pointers to buffer, that will be reinitialized
> in next initcall parameters parsing pass, might cause some race
> conditions.

Thanks, applied.

Cheers,
Rusty.

> Thanks,
> Krzysiek
>
> -- >8 --
> Subject: [PATCH v2] init: fix in-place parameter modification regression
>
> Before commit 026cee0086fe1df4cf74691cf273062cc769617d
> ("params: <level>_initcall-like kernel parameters") the __setup
> parameter parsing code could modify parameter in the
> static_command_line buffer and such modifications were kept. After
> that commit such modifications are destroyed during per-initcall level
> parameter parsing because the same static_command_line buffer is used
> and only parameters for appropriate initcall level are parsed.
>
> That change broke at least parsing "ubd" parameter in the ubd driver
> when the COW file is used.
>
> Now the separate buffer is used for per-initcall parameter parsing.
>
> Signed-off-by: Krzysztof Mazur <[email protected]>
> ---
> init/main.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/init/main.c b/init/main.c
> index 63d3e8f..c093b5c 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -132,6 +132,8 @@ char __initdata boot_command_line[COMMAND_LINE_SIZE];
> char *saved_command_line;
> /* Command line for parameter parsing */
> static char *static_command_line;
> +/* Command line for per-initcall parameter parsing */
> +static char *initcall_command_line;
>
> static char *execute_command;
> static char *ramdisk_execute_command;
> @@ -348,6 +350,7 @@ static inline void smp_prepare_cpus(unsigned int maxcpus) { }
> static void __init setup_command_line(char *command_line)
> {
> saved_command_line = alloc_bootmem(strlen (boot_command_line)+1);
> + initcall_command_line = alloc_bootmem(strlen (boot_command_line)+1);
> static_command_line = alloc_bootmem(strlen (command_line)+1);
> strcpy (saved_command_line, boot_command_line);
> strcpy (static_command_line, command_line);
> @@ -745,9 +748,9 @@ static void __init do_initcall_level(int level)
> extern const struct kernel_param __start___param[], __stop___param[];
> initcall_t *fn;
>
> - strcpy(static_command_line, saved_command_line);
> + strcpy(initcall_command_line, saved_command_line);
> parse_args(initcall_level_names[level],
> - static_command_line, __start___param,
> + initcall_command_line, __start___param,
> __stop___param - __start___param,
> level, level,
> &repair_env_string);
> --
> 1.8.4.1.635.g55556a5