2003-06-09 10:02:07

by Adam J. Richter

[permalink] [raw]
Subject: 2.5.70-bk1[23]: load_module crashes when aborting module load

Hi Rusty,

I thought I should report this problem to you now, as I'm
about to have to explore some code that I'm not too familiar with
(vfree) as I continue debugging it. Also note I am running a
modified kernel/module.c, so it is remotely possible that this problem
is self-inflicted, but I don't think so.

In 2.5.70-bk1[23], I get a kernel bad memory reference
when trying load a module with an undefined symbol that is not found.
The bad memory reference occurs in load_module after the call
to module_free(mod,mod->module_core), the next time that "mod" is
dereferenced. Here is a commented excerpt from load_module
in kernel/module.c:

cleanup:
module_unload_free(mod);
module_free(mod, mod->module_init);
free_core:
module_free(mod, mod->module_core);
/* The following "if" statement generates a kernel bad memory
reference. --Adam */
free_percpu:
if (mod->percpu)
percpu_modfree(mod->percpu);

For whatever reason, module->module_core (ee820000) points to
an address slightly before mod (mod = ee828780, the bad dereference
is to ee8298a4). On x86, module_free() is vfree(). I suspect that
somehow vfree() has gotten confused.

By the way, there also seems to be a bug in the
2.5.70-bk12/kernel/module.c changes where mod->percpu is left unitialized
if a module has no per-cpu data. I've verified that there really is a
junk non-zero value in mod->percpu in that case. However, fixing that
bug does not eliminate this problem.

Adam J. Richter __ ______________ 575 Oroville Road
[email protected] \ / Miplitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."


2003-06-10 04:11:51

by Rusty Russell

[permalink] [raw]
Subject: Re: 2.5.70-bk1[23]: load_module crashes when aborting module load

In message <[email protected]> you write:
> Hi Rusty,
>
> I thought I should report this problem to you now, as I'm
> about to have to explore some code that I'm not too familiar with
> (vfree) as I continue debugging it. Also note I am running a
> modified kernel/module.c, so it is remotely possible that this problem
> is self-inflicted, but I don't think so.
>
> In 2.5.70-bk1[23], I get a kernel bad memory reference
> when trying load a module with an undefined symbol that is not found.
> The bad memory reference occurs in load_module after the call
> to module_free(mod,mod->module_core), the next time that "mod" is
> dereferenced. Here is a commented excerpt from load_module
> in kernel/module.c:
>
> cleanup:
> module_unload_free(mod);
> module_free(mod, mod->module_init);
> free_core:
> module_free(mod, mod->module_core);
> /* The following "if" statement generates a kernel bad memory
> reference. --Adam */
> free_percpu:
> if (mod->percpu)
> percpu_modfree(mod->percpu);
>
> For whatever reason, module->module_core (ee820000) points to
> an address slightly before mod (mod = ee828780, the bad dereference
> is to ee8298a4). On x86, module_free() is vfree(). I suspect that
> somehow vfree() has gotten confused.

Well, mod is inside module->module_core, so that makes sense: check
the section layout, but usually the .text section is first, then mod
will be near the .data section (turn on debugging in layout_sections
to get the details).

> By the way, there also seems to be a bug in the
> 2.5.70-bk12/kernel/module.c changes where mod->percpu is left unitialized
> if a module has no per-cpu data. I've verified that there really is a
> junk non-zero value in mod->percpu in that case. However, fixing that
> bug does not eliminate this problem.

Something is badly wrong: look in include/linux/module.h and you'll
see the initialization of __this_module (which becomes mod). By
leaving the .percpu member uninitialized, it will be initialized to
NULL.

Random guess: did the build system not rebuild your modules properly
when module.h changed?

Puzzled,
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

2003-06-10 08:34:35

by Milton Miller

[permalink] [raw]
Subject: Re: 2.5.70-bk1[23]: load_module crashes when aborting module load


On Mon Jun 09 2003 - 20:39:45 EST, Rusty Russell ([email protected]) wrote:
> In message <[email protected]> you write:
> > cleanup:
> > module_unload_free(mod);
> > module_free(mod, mod->module_init);
> > free_core:
> > module_free(mod, mod->module_core);
> > /* The following "if" statement generates a kernel bad memory
> > reference. --Adam */
> > free_percpu:
> > if (mod->percpu)
> > percpu_modfree(mod->percpu);
> >
..
>
> Well, mod is inside module->module_core, so that makes sense: check
> the section layout, but usually the .text section is first, then mod
> will be near the .data section (turn on debugging in layout_sections
> to get the details).

Umm, isn't that the problem? Once we get past the point where mod points
inside the module core (ie where we would goto cleanup), we can't
reference mod->percpu after freeing mod->mod_core, since that frees mod
(and hence a the use-after-free bug).

milton

2003-06-11 03:30:26

by Randy.Dunlap

[permalink] [raw]
Subject: Re: 2.5.70-bk1[23]: load_module crashes when aborting module load

just a me too on this. I'm using 2.5.70-bk13 and if I load the usblp
module without any other USB support loaded, I get this oops.

I'll add this to my TODO list....

Rusty wrote:
| Random guess: did the build system not rebuild your modules properly
| when module.h changed?

My kernel tree is 2.5.70 plain + bk13 patch.
It never was just plain 2.5.70 or anything else.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
usblp: Unknown symbol usb_alloc_urb
usblp: Unknown symbol usb_free_urb
usblp: Unknown symbol usb_register
usblp: Unknown symbol usb_find_interface
usblp: Unknown symbol usb_submit_urb
usblp: Unknown symbol usb_control_msg
usblp: Unknown symbol usb_register_dev
usblp: Unknown symbol usb_set_interface
usblp: Unknown symbol usb_deregister
usblp: Unknown symbol usb_unlink_urb
usblp: Unknown symbol usb_deregister_dev
usblp: Unknown symbol usb_buffer_free
usblp: Unknown symbol usb_buffer_alloc
Unable to handle kernel paging request at virtual address f89ac11c
printing eip:
c013955d
*pde = 01a6e067
*pte = 00000000
Oops: 0000 [#1]
CPU: 0
EIP: 0060:[<c013955d>] Not tainted
EFLAGS: 00010286
EIP is at load_module+0x78d/0x900
eax: 0000000d ebx: f89a3000 ecx: f89abe00 edx: f7ff99f0
esi: fffffffe edi: f89abe00 ebp: f70d3f94 esp: f70d3efc
ds: 007b es: 007b ss: 0068
Process insmod (pid: 1917, threadinfo=f70d2000 task=f7119310)
Stack: f89a9000 f89a9000 f89a6c80 00000000 00000000 f89abe00 400eaf50 00000000
00000410 400eaf50 f711d448 00005000 f725d8a4 00030002 c014d57d f89ae000
f89abe00 00000000 00000000 00000000 00000000 00000000 0000000f 00000014
Call Trace:
[<c014d57d>] do_mmap_pgoff+0x3d7/0x698
[<c0139765>] sys_init_module+0x95/0x2d4
[<c0109683>] syscall_call+0x7/0xb

Code: 8b 81 1c 03 00 00 85 c0 0f 84 a7 fb ff ff 89 04 24 e8 b7 df

~Randy



2003-06-11 05:34:41

by Rusty Russell

[permalink] [raw]
Subject: Re: 2.5.70-bk1[23]: load_module crashes when aborting module load

In message <[email protected]> you write:
> Umm, isn't that the problem? Once we get past the point where mod points
> inside the module core (ie where we would goto cleanup), we can't
> reference mod->percpu after freeing mod->mod_core, since that frees mod
> (and hence a the use-after-free bug).

Doh! Good catch.

The problem is that mod is moved to point from the sucked-in
file (always freed last) to the module core, after which time the
"free(mod->core), reference mod->percpu" sequence is bogus, eg. when
the module_init function fails.

This patch keeps the ptr in a local variable, which solves the
problem simply.

Linus, please apply.

diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.70-bk15/kernel/module.c working-2.5.70-bk15-percpufree/kernel/module.c
--- linux-2.5.70-bk15/kernel/module.c 2003-06-11 12:15:34.000000000 +1000
+++ working-2.5.70-bk15-percpufree/kernel/module.c 2003-06-11 15:39:48.000000000 +1000
@@ -1366,7 +1366,7 @@ static struct module *load_module(void _
long arglen;
struct module *mod;
long err = 0;
- void *ptr = NULL; /* Stops spurious gcc uninitialized warning */
+ void *percpu = NULL, *ptr = NULL; /* Stops spurious gcc warning */

DEBUGP("load_module: umod=%p, len=%lu, uargs=%p\n",
umod, len, uargs);
@@ -1496,13 +1496,14 @@ static struct module *load_module(void _

if (pcpuindex) {
/* We have a special allocation for this section. */
- mod->percpu = percpu_modalloc(sechdrs[pcpuindex].sh_size,
- sechdrs[pcpuindex].sh_addralign);
- if (!mod->percpu) {
+ percpu = percpu_modalloc(sechdrs[pcpuindex].sh_size,
+ sechdrs[pcpuindex].sh_addralign);
+ if (!percpu) {
err = -ENOMEM;
goto free_mod;
}
sechdrs[pcpuindex].sh_flags &= ~(unsigned long)SHF_ALLOC;
+ mod->percpu = percpu;
}

/* Determine total sizes, and put offsets in sh_entsize. For now
@@ -1643,8 +1644,8 @@ static struct module *load_module(void _
free_core:
module_free(mod, mod->module_core);
free_percpu:
- if (mod->percpu)
- percpu_modfree(mod->percpu);
+ if (percpu)
+ percpu_modfree(percpu);
free_mod:
kfree(args);
free_hdr:
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

2003-06-11 23:29:02

by Adam J. Richter

[permalink] [raw]
Subject: Re: 2.5.70-bk1[23]: load_module crashes when aborting module load

On Tue, 10 Jun 2003 at 11:39:45 +1000, Rusty Russell wrote:
>In message <[email protected]> you write:
[A report of a bug in kernel/module.c, which Rusty has already
submitted a patch for, followed by a mistaken second bug report on my part:]
>> By the way, there also seems to be a bug in the
>> 2.5.70-bk12/kernel/module.c changes where mod->percpu is left unitialized
[...]

>Something is badly wrong: look in include/linux/module.h and you'll
>see the initialization of __this_module (which becomes mod). By
>leaving the .percpu member uninitialized, it will be initialized to
>NULL.

You're right. My misunderstanding was the result of flaky serial
console cable, which I've now replaced. Sorry for the confusion.

Adam J. Richter __ ______________ 575 Oroville Road
[email protected] \ / Miplitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."