2008-03-12 05:35:42

by Gordon Farquharson

[permalink] [raw]
Subject: physmap and "request_module: runaway loop modprobe net-pf-1"

Hi All

On machines that register a physmap flash platform device (e.g. many
arm machines), parse_mtd_partitions() in drivers/mtd/mtdpart.c can
cause the following output in the boot log:

request_module: runaway loop modprobe net-pf-1
modprobe: FATAL: Could not load
/lib/modules/2.6.25-rc4-iop32x/modules.dep: No such file or directory

modprobe: FATAL: Could not load
/lib/modules/2.6.25-rc4-iop32x/modules.dep: No such file or directory

modprobe: FATAL: Could not load
/lib/modules/2.6.25-rc4-iop32x/modules.dep: No such file or directory
...

To illustrate how this can occur, I'll use the example where
CONFIG_KMOD=y, CONFIG_MTD_PARTITIONS=y, CONFIG_MTD_REDBOOT_PARTS=y,
and CONFIG_CMDLINE_PARTS is not set. I'll also refer to the following
code segment which is in drivers/mtd/mtdpart.c:parse_mtd_partitions():

for ( ; ret <= 0 && *types; types++) {
parser = get_partition_parser(*types);
#ifdef CONFIG_KMOD
if (!parser && !request_module("%s", *types))
parser = get_partition_parser(*types);
#endif
if (!parser) {
printk(KERN_NOTICE "%s partition parsing not available\n",
*types);
continue;
}
ret = (*parser->parse_fn)(master, pparts, origin);
if (ret > 0) {
printk(KERN_NOTICE "%d %s partitions found on MTD device %s\n",
ret, parser->name, master->name);
}
put_partition_parser(parser);
}

I should note that I am assuming (because I don't understand how) that
request_module() requires net-pf-1 to succeed. This assumption is
based on my observation of the request_module message
("request_module: runaway loop modprobe net-pf-1"), and that
af_unix_init() (net-pf-1 initialization) is called only after the call
to request_module() in this example.

Based on the configuration and assumptions described above, the
sequence of events that cause the boot log messages above are:

1. the first call to get_partition_parser() in parse_mtd_partitions()
does not find the parser in the the list of registered parsers, i.e.
get_partition_parser() returns a NULL pointer. In this example, the
only registered parser in the list is called Redboot;

2. parse_mtd_partitions() calls request_module() with (in this
example) the parameter "cmdlinepart";

3. request_module() complains (request_module: runaway loop modprobe
net-pf-1, etc.) because af_unix_init() has not been called. The boot
log shows that af_unix_init() is called only after
parse_mtd_partitions() finishes.

register_mtd_parser: RedBoot
physmap platform flash device: 00080000 at f0000000
Found: ST M29W400DB
physmap-flash.0: Found 1 x16 devices at 0x0 in 16-bit bank
number of JEDEC chips: 1
cfi_cmdset_0002: Disabling erase-suspend-program due to code brokenness.
parse_mtd_partitions:
get_partition_parser: name=cmdlinepart
get_partition_parser: p->name=RedBoot
get_partition_parser: p->owner=00000000
get_partition_parser: ret=00000000
parse_mtd_partitions: cmdlinepart
request_module: runaway loop modprobe net-pf-1
modprobe: FATAL: Could not load
/lib/modules/2.6.25-rc4-iop32x/modules.dep: No such file or directory

modprobe: FATAL: Could not load
/lib/modules/2.6.25-rc4-iop32x/modules.dep: No such file or directory

modprobe: FATAL: Could not load
/lib/modules/2.6.25-rc4-iop32x/modules.dep: No such file or directory

<--- 47 repeated modprobe messages deleted --->
parse_mtd_partitions: request_module called
cmdlinepart partition parsing not available
get_partition_parser: name=RedBoot
get_partition_parser: p->name=RedBoot
get_partition_parser: p->owner=00000000
get_partition_parser: ret->name=RedBoot
get_partition_parser: ret=c02b0f8c
parse_mtd_partitions: RedBoot
parse_mtd_partitions: request_module called
Searching for RedBoot partition table in physmap-flash.0 at offset 0x70000
No RedBoot partition table detected in physmap-flash.0
parse_mtd_partitions: end of function
mice: PS/2 mouse device common for all mice
i2c /dev entries driver
rtc-rs5c372 0-0032: rs5c372a found, 24hr, driver version 0.5
rtc-rs5c372 0-0032: rtc core: registered rtc-rs5c372 as rtc0
iop-adma iop-adma.0: Intel(R) IOP: ( cpy intr )
iop-adma iop-adma.1: Intel(R) IOP: ( cpy intr )
NET: Registered protocol family 26
TCP bic registered
af_unix_init:
NET: Registered protocol family 1
NET: Registered protocol family 17
XScale DSP coprocessor detected.
registered taskstats version 1
rtc-rs5c372 0-0032: setting system clock to 2008-03-11 03:16:46 UTC (1205205406)

The reason parse_mtd_partitions() tries to use cmdlinepart, even
though it is not a registered parsing scheme when CONFIG_CMDLINE_PARTS
is not set, is because cmdlinepart is hard coded in
drivers/mtd/maps/physmap.c:

#ifdef CONFIG_MTD_PARTITIONS
static const char *part_probe_types[] = { "cmdlinepart", "RedBoot", NULL };
#endif

The simple solution to eliminating the messages in the boot log
appears to be to ensure that, for machines that use a physmap flash
device, CONFIG_MTD_REDBOOT_PARTS and CONFIG_CMDLINE_PARTS are set if
CONFIG_MTD_PARTITIONS is set. However, the problem also occurs if
CONFIG_MTD_REDBOOT_PARTS is configured to be built as a module,
because again, request_module() fails because af_unix_init() has not
been called.

It would be simpler if all parsing code could only be built into the
kernel. In this case, built in parsers would be registered by the time
parse_mtd_partitions() is called, so the call to request_module() in
parse_mtd_partitions() could be removed:

#ifdef CONFIG_KMOD
if (!parser && !request_module("%s", *types))
parser = get_partition_parser(*types);
#endif

This solution would require simply changing the parser configuration
options to bool in drivers/mtd/Kconfig. Is there a reason to allow the
parsing code to be built as a module?

Alternatively, if selecting CONFIG_MTD_PARTITIONS and not selecting
both CONFIG_MTD_REDBOOT_PARTS and CONFIG_CMDLINE_PARTS, or selecting
CONFIG_MTD_PARTITIONS and not selecting CONFIG_CMDLINE_PARTS and
selecting CONFIG_MTD_REDBOOT_PARTS as a module, are valid
configurations, it seems that parse_mtd_partitions() must be called
only after af_unix_init() has been called. Is this possible?

Lastly, does anybody care because this problem does not cause the
kernel to crash? :-)

Please CC me when replying because I am not subscribed to the list.

Thanks.

Gordon

--
Gordon Farquharson
GnuPG Key ID: 32D6D676


2008-03-12 06:00:17

by Andrew Morton

[permalink] [raw]
Subject: Re: physmap and "request_module: runaway loop modprobe net-pf-1"

On Tue, 11 Mar 2008 23:35:28 -0600 "Gordon Farquharson" <[email protected]> wrote:

> Hi All

Let's cc the MTD list.

> On machines that register a physmap flash platform device (e.g. many
> arm machines), parse_mtd_partitions() in drivers/mtd/mtdpart.c can
> cause the following output in the boot log:
>
> request_module: runaway loop modprobe net-pf-1
> modprobe: FATAL: Could not load
> /lib/modules/2.6.25-rc4-iop32x/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load
> /lib/modules/2.6.25-rc4-iop32x/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load
> /lib/modules/2.6.25-rc4-iop32x/modules.dep: No such file or directory
> ...
>
> To illustrate how this can occur, I'll use the example where
> CONFIG_KMOD=y, CONFIG_MTD_PARTITIONS=y, CONFIG_MTD_REDBOOT_PARTS=y,
> and CONFIG_CMDLINE_PARTS is not set. I'll also refer to the following
> code segment which is in drivers/mtd/mtdpart.c:parse_mtd_partitions():
>
> for ( ; ret <= 0 && *types; types++) {
> parser = get_partition_parser(*types);
> #ifdef CONFIG_KMOD
> if (!parser && !request_module("%s", *types))
> parser = get_partition_parser(*types);
> #endif
> if (!parser) {
> printk(KERN_NOTICE "%s partition parsing not available\n",
> *types);
> continue;
> }
> ret = (*parser->parse_fn)(master, pparts, origin);
> if (ret > 0) {
> printk(KERN_NOTICE "%d %s partitions found on MTD device %s\n",
> ret, parser->name, master->name);
> }
> put_partition_parser(parser);
> }
>
> I should note that I am assuming (because I don't understand how) that
> request_module() requires net-pf-1 to succeed. This assumption is
> based on my observation of the request_module message
> ("request_module: runaway loop modprobe net-pf-1"), and that
> af_unix_init() (net-pf-1 initialization) is called only after the call
> to request_module() in this example.
>
> Based on the configuration and assumptions described above, the
> sequence of events that cause the boot log messages above are:
>
> 1. the first call to get_partition_parser() in parse_mtd_partitions()
> does not find the parser in the the list of registered parsers, i.e.
> get_partition_parser() returns a NULL pointer. In this example, the
> only registered parser in the list is called Redboot;
>
> 2. parse_mtd_partitions() calls request_module() with (in this
> example) the parameter "cmdlinepart";
>
> 3. request_module() complains (request_module: runaway loop modprobe
> net-pf-1, etc.) because af_unix_init() has not been called. The boot
> log shows that af_unix_init() is called only after
> parse_mtd_partitions() finishes.
>
> register_mtd_parser: RedBoot
> physmap platform flash device: 00080000 at f0000000
> Found: ST M29W400DB
> physmap-flash.0: Found 1 x16 devices at 0x0 in 16-bit bank
> number of JEDEC chips: 1
> cfi_cmdset_0002: Disabling erase-suspend-program due to code brokenness.
> parse_mtd_partitions:
> get_partition_parser: name=cmdlinepart
> get_partition_parser: p->name=RedBoot
> get_partition_parser: p->owner=00000000
> get_partition_parser: ret=00000000
> parse_mtd_partitions: cmdlinepart
> request_module: runaway loop modprobe net-pf-1
> modprobe: FATAL: Could not load
> /lib/modules/2.6.25-rc4-iop32x/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load
> /lib/modules/2.6.25-rc4-iop32x/modules.dep: No such file or directory
>
> modprobe: FATAL: Could not load
> /lib/modules/2.6.25-rc4-iop32x/modules.dep: No such file or directory
>
> <--- 47 repeated modprobe messages deleted --->
> parse_mtd_partitions: request_module called
> cmdlinepart partition parsing not available
> get_partition_parser: name=RedBoot
> get_partition_parser: p->name=RedBoot
> get_partition_parser: p->owner=00000000
> get_partition_parser: ret->name=RedBoot
> get_partition_parser: ret=c02b0f8c
> parse_mtd_partitions: RedBoot
> parse_mtd_partitions: request_module called
> Searching for RedBoot partition table in physmap-flash.0 at offset 0x70000
> No RedBoot partition table detected in physmap-flash.0
> parse_mtd_partitions: end of function
> mice: PS/2 mouse device common for all mice
> i2c /dev entries driver
> rtc-rs5c372 0-0032: rs5c372a found, 24hr, driver version 0.5
> rtc-rs5c372 0-0032: rtc core: registered rtc-rs5c372 as rtc0
> iop-adma iop-adma.0: Intel(R) IOP: ( cpy intr )
> iop-adma iop-adma.1: Intel(R) IOP: ( cpy intr )
> NET: Registered protocol family 26
> TCP bic registered
> af_unix_init:
> NET: Registered protocol family 1
> NET: Registered protocol family 17
> XScale DSP coprocessor detected.
> registered taskstats version 1
> rtc-rs5c372 0-0032: setting system clock to 2008-03-11 03:16:46 UTC (1205205406)
>
> The reason parse_mtd_partitions() tries to use cmdlinepart, even
> though it is not a registered parsing scheme when CONFIG_CMDLINE_PARTS
> is not set, is because cmdlinepart is hard coded in
> drivers/mtd/maps/physmap.c:
>
> #ifdef CONFIG_MTD_PARTITIONS
> static const char *part_probe_types[] = { "cmdlinepart", "RedBoot", NULL };
> #endif
>
> The simple solution to eliminating the messages in the boot log
> appears to be to ensure that, for machines that use a physmap flash
> device, CONFIG_MTD_REDBOOT_PARTS and CONFIG_CMDLINE_PARTS are set if
> CONFIG_MTD_PARTITIONS is set. However, the problem also occurs if
> CONFIG_MTD_REDBOOT_PARTS is configured to be built as a module,
> because again, request_module() fails because af_unix_init() has not
> been called.
>
> It would be simpler if all parsing code could only be built into the
> kernel. In this case, built in parsers would be registered by the time
> parse_mtd_partitions() is called, so the call to request_module() in
> parse_mtd_partitions() could be removed:
>
> #ifdef CONFIG_KMOD
> if (!parser && !request_module("%s", *types))
> parser = get_partition_parser(*types);
> #endif
>
> This solution would require simply changing the parser configuration
> options to bool in drivers/mtd/Kconfig. Is there a reason to allow the
> parsing code to be built as a module?

That sounds reasonable.

> Alternatively, if selecting CONFIG_MTD_PARTITIONS and not selecting
> both CONFIG_MTD_REDBOOT_PARTS and CONFIG_CMDLINE_PARTS, or selecting
> CONFIG_MTD_PARTITIONS and not selecting CONFIG_CMDLINE_PARTS and
> selecting CONFIG_MTD_REDBOOT_PARTS as a module, are valid
> configurations, it seems that parse_mtd_partitions() must be called
> only after af_unix_init() has been called. Is this possible?

Probably.

We can specify the order in which these things are called at compile-time.
See these, from include/linux/init.h:

#define core_initcall(fn) __define_initcall("1",fn,1)
#define postcore_initcall(fn) __define_initcall("2",fn,2)
#define arch_initcall(fn) __define_initcall("3",fn,3)
#define subsys_initcall(fn) __define_initcall("4",fn,4)
#define fs_initcall(fn) __define_initcall("5",fn,5)
#define rootfs_initcall(fn) __define_initcall("rootfs",fn,rootfs)
#define device_initcall(fn) __define_initcall("6",fn,6)
#define late_initcall(fn) __define_initcall("7",fn,7)

Now, af_unix_init() uses module_init(), which is really __initcall(), which
is really device_initcall() (ug, don't ask).

So you can do what you're sugesting here by locating the caller/callers of
parse_mtd_partitions() and marking them late_initcall().

> Lastly, does anybody care because this problem does not cause the
> kernel to crash? :-)

We're very caring people.

I'd suggest that you make your own decision as to what is the best fix, then
send a patch. Please cc me on it and I'll make sure that it gets
appropriately handled. This may take a bit of time, as the MTD patch
backlog is rather large at present.

2008-03-12 07:07:05

by Gordon Farquharson

[permalink] [raw]
Subject: Re: physmap and "request_module: runaway loop modprobe net-pf-1"

Hi Andrew

On Tue, Mar 11, 2008 at 11:59 PM, Andrew Morton
<[email protected]> wrote:
> On Tue, 11 Mar 2008 23:35:28 -0600 "Gordon Farquharson" <[email protected]> wrote:

> > Alternatively, if selecting CONFIG_MTD_PARTITIONS and not selecting
> > both CONFIG_MTD_REDBOOT_PARTS and CONFIG_CMDLINE_PARTS, or selecting
> > CONFIG_MTD_PARTITIONS and not selecting CONFIG_CMDLINE_PARTS and
> > selecting CONFIG_MTD_REDBOOT_PARTS as a module, are valid
> > configurations, it seems that parse_mtd_partitions() must be called
> > only after af_unix_init() has been called. Is this possible?
>
> Probably.
>
> We can specify the order in which these things are called at compile-time.
> See these, from include/linux/init.h:
>
> #define core_initcall(fn) __define_initcall("1",fn,1)
> #define postcore_initcall(fn) __define_initcall("2",fn,2)
> #define arch_initcall(fn) __define_initcall("3",fn,3)
> #define subsys_initcall(fn) __define_initcall("4",fn,4)
> #define fs_initcall(fn) __define_initcall("5",fn,5)
> #define rootfs_initcall(fn) __define_initcall("rootfs",fn,rootfs)
> #define device_initcall(fn) __define_initcall("6",fn,6)
> #define late_initcall(fn) __define_initcall("7",fn,7)
>
> Now, af_unix_init() uses module_init(), which is really __initcall(), which
> is really device_initcall() (ug, don't ask).
>
> So you can do what you're sugesting here by locating the caller/callers of
> parse_mtd_partitions() and marking them late_initcall().

Thanks very much for the explanation.

It looks like there could be up to 38 distinct callers of
parse_mtd_partitions(), and I expect that one would need to test all
of the various platforms on which these calls occur to ensure that
there were no regressions introduced by the change, so this option
scares me a little.

$ grep parse_mtd_partitions -r * | wc -l
38

> I'd suggest that you make your own decision as to what is the best fix, then
> send a patch. Please cc me on it and I'll make sure that it gets
> appropriately handled. This may take a bit of time, as the MTD patch
> backlog is rather large at present.

I'll see how the MTD guys respond today, and then submit a patch
later. I like option 1, but they may have a good reason for needing to
compile a parser as a module, or they may propose a completely
different solution. Right now, there is only one defconfig that has
CONFIG_REDBOOT_MTD_PARTS=m:

arch/mips/configs/mtx1_defconfig:CONFIG_MTD_REDBOOT_PARTS=m

Thanks very much for the input.

Gordon

--
Gordon Farquharson
GnuPG Key ID: 32D6D676

2008-03-12 07:19:59

by Andrew Morton

[permalink] [raw]
Subject: Re: physmap and "request_module: runaway loop modprobe net-pf-1"

On Wed, 12 Mar 2008 01:06:53 -0600 "Gordon Farquharson" <[email protected]> wrote:

> Hi Andrew
>
> On Tue, Mar 11, 2008 at 11:59 PM, Andrew Morton
> <[email protected]> wrote:
> > On Tue, 11 Mar 2008 23:35:28 -0600 "Gordon Farquharson" <[email protected]> wrote:
>
> > > Alternatively, if selecting CONFIG_MTD_PARTITIONS and not selecting
> > > both CONFIG_MTD_REDBOOT_PARTS and CONFIG_CMDLINE_PARTS, or selecting
> > > CONFIG_MTD_PARTITIONS and not selecting CONFIG_CMDLINE_PARTS and
> > > selecting CONFIG_MTD_REDBOOT_PARTS as a module, are valid
> > > configurations, it seems that parse_mtd_partitions() must be called
> > > only after af_unix_init() has been called. Is this possible?
> >
> > Probably.
> >
> > We can specify the order in which these things are called at compile-time.
> > See these, from include/linux/init.h:
> >
> > #define core_initcall(fn) __define_initcall("1",fn,1)
> > #define postcore_initcall(fn) __define_initcall("2",fn,2)
> > #define arch_initcall(fn) __define_initcall("3",fn,3)
> > #define subsys_initcall(fn) __define_initcall("4",fn,4)
> > #define fs_initcall(fn) __define_initcall("5",fn,5)
> > #define rootfs_initcall(fn) __define_initcall("rootfs",fn,rootfs)
> > #define device_initcall(fn) __define_initcall("6",fn,6)
> > #define late_initcall(fn) __define_initcall("7",fn,7)
> >
> > Now, af_unix_init() uses module_init(), which is really __initcall(), which
> > is really device_initcall() (ug, don't ask).
> >
> > So you can do what you're sugesting here by locating the caller/callers of
> > parse_mtd_partitions() and marking them late_initcall().
>
> Thanks very much for the explanation.
>
> It looks like there could be up to 38 distinct callers of
> parse_mtd_partitions(), and I expect that one would need to test all
> of the various platforms on which these calls occur to ensure that
> there were no regressions introduced by the change, so this option
> scares me a little.
>
> $ grep parse_mtd_partitions -r * | wc -l
> 38

err, yes, scrub that.

> > I'd suggest that you make your own decision as to what is the best fix, then
> > send a patch. Please cc me on it and I'll make sure that it gets
> > appropriately handled. This may take a bit of time, as the MTD patch
> > backlog is rather large at present.
>
> I'll see how the MTD guys respond today,

There's a good chance that there will be no response.

> and then submit a patch
> later. I like option 1, but they may have a good reason for needing to
> compile a parser as a module, or they may propose a completely
> different solution. Right now, there is only one defconfig that has
> CONFIG_REDBOOT_MTD_PARTS=m:
>
> arch/mips/configs/mtx1_defconfig:CONFIG_MTD_REDBOOT_PARTS=m
>

Just send the patch, please. Please retain you original analysis in its
covering description.

2008-03-12 07:36:57

by David Woodhouse

[permalink] [raw]
Subject: Re: physmap and "request_module: runaway loop modprobe net-pf-1"

On Tue, 2008-03-11 at 22:59 -0700, Andrew Morton wrote:
> > This solution would require simply changing the parser configuration
> > options to bool in drivers/mtd/Kconfig. Is there a reason to allow the
> > parsing code to be built as a module?
>
> That sounds reasonable.

I have no particular objection to that. But shouldn't we call
af_unix_init() earlier if modprobe doesn't run without it?

> I'd suggest that you make your own decision as to what is the best fix, then
> send a patch. Please cc me on it and I'll make sure that it gets
> appropriately handled. This may take a bit of time, as the MTD patch
> backlog is rather large at present.

Yeah, things have been interesting. But I get to spend a whole three
weeks at home, starting next week. So feel free to LART me if I don't
catch up. Thanks, again, for your help in making sure things don't fall
through the cracks.

--
dwmw2