2007-12-01 20:24:09

by Andrew Morton

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

(switched to email - please respond via emailed reply-to-all, not via the
bugzilla web interface)

On Sat, 1 Dec 2007 11:54:11 -0800 (PST) [email protected] wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=9482
>
> Summary: kernel GPF in 2.6.24 (g09f345da)
> Product: Other
> Version: 2.5
> KernelVersion: 2.6.24rc3 g09f345da
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Other
> AssignedTo: [email protected]
> ReportedBy: [email protected]
>
>
> Most recent kernel where this bug did not occur: 2.6.22.12
> Distribution: openSUSE 10.3
> Hardware Environment: Athlon XP 2200+
> Software Environment: openSUSE 10.3
> Problem Description: instant kernel GPF.
>
> Steps to reproduce:load module aoe (vanilla), type 'mkfs.ext3
> /dev/etherd/e0.0', and the kernel GPFs.
>
> Will be attaching jpg of screen - sorry, no other way to capture GPF for me.
>

http://bugzilla.kernel.org/attachment.cgi?id=13812&action=view

Damn that's odd. General Protection Fault in
__set_page_dirty->__percpu_counter_add(). No sign of AOE in the trace.

I assume that it is repeatable and that it doesn't occur with mkfs on
regular local disk drives?


2007-12-01 21:05:52

by Jon Nelson

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

On 12/1/07, Andrew Morton <[email protected]> wrote:
> (switched to email - please respond via emailed reply-to-all, not via the
> bugzilla web interface)
>
> On Sat, 1 Dec 2007 11:54:11 -0800 (PST) [email protected] wrote:
>
> > http://bugzilla.kernel.org/show_bug.cgi?id=9482
> >
> > Summary: kernel GPF in 2.6.24 (g09f345da)
> > Product: Other
> > Version: 2.5
> > KernelVersion: 2.6.24rc3 g09f345da
> > Platform: All
> > OS/Version: Linux
> > Tree: Mainline
> > Status: NEW
> > Severity: normal
> > Priority: P1
> > Component: Other
> > AssignedTo: [email protected]
> > ReportedBy: [email protected]
> >
> >
> > Most recent kernel where this bug did not occur: 2.6.22.12
> > Distribution: openSUSE 10.3
> > Hardware Environment: Athlon XP 2200+
> > Software Environment: openSUSE 10.3
> > Problem Description: instant kernel GPF.
> >
> > Steps to reproduce:load module aoe (vanilla), type 'mkfs.ext3
> > /dev/etherd/e0.0', and the kernel GPFs.
> >
> > Will be attaching jpg of screen - sorry, no other way to capture GPF for me.
> >
>
> http://bugzilla.kernel.org/attachment.cgi?id=13812&action=view
>
> Damn that's odd. General Protection Fault in
> __set_page_dirty->__percpu_counter_add(). No sign of AOE in the trace.
>
> I assume that it is repeatable and that it doesn't occur with mkfs on
> regular local disk drives?


This works fine:

mkfs.ext3 /dev/system/test #### a local LVM2-based logical volume

mkfs.ext3 /dev/etherd/e0.0 ## Boom.




--
Jon

2007-12-02 16:57:06

by Jon Nelson

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

I upgraded to a more recent commit,
92d499d991ec4f5cbd00d6f33967eab9d3ee8d6c
and rebuilt.

Now I get lots and lots of AoE errors, over and over again, but no GPF.

aoe: can't get a frame. This shouldn't happen.

It's possible the previous kernel had been patched, so I made really
REALLY sure that this was a 100% clean build using the above commit. I
can build using more or less any other commit or tag upon request but
it's not quick.

--
Jon

2007-12-03 16:45:48

by Ed L. Cashin

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

On Sat, Dec 01, 2007 at 12:23:02PM -0800, Andrew Morton wrote:
> (switched to email - please respond via emailed reply-to-all, not via the
> bugzilla web interface)
>
> On Sat, 1 Dec 2007 11:54:11 -0800 (PST) [email protected] wrote:
>
> > http://bugzilla.kernel.org/show_bug.cgi?id=9482
...
> Damn that's odd. General Protection Fault in
> __set_page_dirty->__percpu_counter_add(). No sign of AOE in the trace.
>
> I assume that it is repeatable and that it doesn't occur with mkfs on
> regular local disk drives?

I am encountering this same problem during testing of some patches I
would like to send to the LKML, applied to 2.6.24-rc3, and I can trip
this problem with just,

echo > /dev/etherd/e7.0

... at which point I get the trace below. (I had added a couple of
checks for 0xffffffffffffffff pointers to __percpu_counter_add.) I
haven't been able to check the unpatched aoe driver, but it looks the
same.

Unable to handle kernel paging request at ffffffffffffffff RIP:
[<ffffffff8036d597>] __percpu_counter_add+0x24/0x6d
PGD 203067 PUD 204067 PMD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in: aoe
Pid: 2860, comm: bash Not tainted 2.6.24-rc3-47dbg #5
RIP: 0010:[<ffffffff8036d597>] [<ffffffff8036d597>] __percpu_counter_add+0x24/0x6d
RSP: 0018:ffff81007a0fbaa8 EFLAGS: 00010092
RAX: ffffffffffffffff RBX: ffff81007fcc48e0 RCX: 0000000000000010
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff81007ace7240
RBP: ffff81007a0fbac8 R08: ffff81007cc077b0 R09: ffffffff802ae5ee
R10: ffff81007a0fbaa8 R11: ffff810077dd99d8 R12: ffff81007ace7240
R13: 0000000000000000 R14: 0000000000000200 R15: ffff810078473bb0
FS: 00002ba601c5cdb0(0000) GS:ffffffff8078b000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffffffffffff CR3: 0000000077c31000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process bash (pid: 2860, threadinfo ffff81007a0fa000, task ffff81007bf48040)
Stack: ffff81007a0fbac8 ffff81007fcc48e0 ffff81007c81c380 0000000000000000
ffff81007a0fbaf8 ffffffff802ae682 000010007a0fbae8 ffff810078473bb0
0000000000000200 ffff81007fcc48e0 ffff81007a0fbb18 ffffffff802ae75c
Call Trace:
[<ffffffff802ae682>] __set_page_dirty+0xdc/0x121
[<ffffffff802ae75c>] mark_buffer_dirty+0x95/0x99
[<ffffffff802ae7d2>] __block_commit_write+0x72/0xac
[<ffffffff802ae988>] block_write_end+0x4f/0x5b
[<ffffffff802b243f>] blkdev_write_end+0x1b/0x38
[<ffffffff80265d96>] generic_file_buffered_write+0x1c0/0x648
[<ffffffff8023a752>] current_fs_time+0x22/0x29
[<ffffffff80266576>] __generic_file_aio_write_nolock+0x358/0x3c2
[<ffffffff80266c84>] filemap_fault+0x1c4/0x320
[<ffffffff80264cce>] unlock_page+0x2d/0x31
[<ffffffff802666dd>] generic_file_aio_write_nolock+0x3b/0x8d
[<ffffffff8028e40f>] do_sync_write+0xe2/0x126
[<ffffffff802497d0>] autoremove_wake_function+0x0/0x38
[<ffffffff8058e705>] do_page_fault+0x3f8/0x7bb
[<ffffffff8028cae8>] fd_install+0x5f/0x68
[<ffffffff8028eb98>] vfs_write+0xae/0x137
[<ffffffff8028f102>] sys_write+0x47/0x70
[<ffffffff8020b7ae>] system_call+0x7e/0x83


Code: 4c 8b 2c d0 49 63 45 00 48 8d 1c 30 48 63 c1 48 39 c3 7d 0a
RIP [<ffffffff8036d597>] __percpu_counter_add+0x24/0x6d
RSP <ffff81007a0fbaa8>
CR2: ffffffffffffffff
BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():0, irqs_disabled():1
INFO: lockdep is turned off.

Call Trace:
[<ffffffff802518cb>] debug_show_held_locks+0x1b/0x24
[<ffffffff8022f352>] __might_sleep+0xc7/0xc9
[<ffffffff8024c4d0>] down_read+0x1d/0x4a
[<ffffffff80237bd8>] exit_mm+0x34/0xf7
[<ffffffff8023933e>] do_exit+0x247/0x75b
[<ffffffff8058e9d4>] do_page_fault+0x6c7/0x7bb
[<ffffffff8058a81a>] thread_return+0x42/0x86
[<ffffffff88002698>] :aoe:aoeblk_make_request+0x1e8/0x1f5
[<ffffffff8058cb6d>] error_exit+0x0/0x9a
[<ffffffff802ae5ee>] __set_page_dirty+0x48/0x121
[<ffffffff8036d597>] __percpu_counter_add+0x24/0x6d
[<ffffffff802ae682>] __set_page_dirty+0xdc/0x121
[<ffffffff802ae75c>] mark_buffer_dirty+0x95/0x99
[<ffffffff802ae7d2>] __block_commit_write+0x72/0xac
[<ffffffff802ae988>] block_write_end+0x4f/0x5b
[<ffffffff802b243f>] blkdev_write_end+0x1b/0x38
[<ffffffff80265d96>] generic_file_buffered_write+0x1c0/0x648
[<ffffffff8023a752>] current_fs_time+0x22/0x29
[<ffffffff80266576>] __generic_file_aio_write_nolock+0x358/0x3c2
[<ffffffff80266c84>] filemap_fault+0x1c4/0x320
[<ffffffff80264cce>] unlock_page+0x2d/0x31
[<ffffffff802666dd>] generic_file_aio_write_nolock+0x3b/0x8d
[<ffffffff8028e40f>] do_sync_write+0xe2/0x126
[<ffffffff802497d0>] autoremove_wake_function+0x0/0x38
[<ffffffff8058e705>] do_page_fault+0x3f8/0x7bb
[<ffffffff8028cae8>] fd_install+0x5f/0x68
[<ffffffff8028eb98>] vfs_write+0xae/0x137
[<ffffffff8028f102>] sys_write+0x47/0x70
[<ffffffff8020b7ae>] system_call+0x7e/0x83



--
Ed L Cashin <[email protected]>

2007-12-03 19:35:27

by Andrew Morton

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

On Mon, 3 Dec 2007 11:21:37 -0500
"Ed L. Cashin" <[email protected]> wrote:

> On Sat, Dec 01, 2007 at 12:23:02PM -0800, Andrew Morton wrote:
> > (switched to email - please respond via emailed reply-to-all, not via the
> > bugzilla web interface)
> >
> > On Sat, 1 Dec 2007 11:54:11 -0800 (PST) [email protected] wrote:
> >
> > > http://bugzilla.kernel.org/show_bug.cgi?id=9482
> ...
> > Damn that's odd. General Protection Fault in
> > __set_page_dirty->__percpu_counter_add(). No sign of AOE in the trace.
> >
> > I assume that it is repeatable and that it doesn't occur with mkfs on
> > regular local disk drives?
>
> I am encountering this same problem during testing of some patches I
> would like to send to the LKML, applied to 2.6.24-rc3, and I can trip
> this problem with just,
>
> echo > /dev/etherd/e7.0
>
> ... at which point I get the trace below. (I had added a couple of
> checks for 0xffffffffffffffff pointers to __percpu_counter_add.) I
> haven't been able to check the unpatched aoe driver, but it looks the
> same.
>
> Unable to handle kernel paging request at ffffffffffffffff RIP:
> [<ffffffff8036d597>] __percpu_counter_add+0x24/0x6d
> PGD 203067 PUD 204067 PMD 0
> Oops: 0000 [1] SMP
> CPU 0
> Modules linked in: aoe
> Pid: 2860, comm: bash Not tainted 2.6.24-rc3-47dbg #5
> RIP: 0010:[<ffffffff8036d597>] [<ffffffff8036d597>] __percpu_counter_add+0x24/0x6d
> RSP: 0018:ffff81007a0fbaa8 EFLAGS: 00010092
> RAX: ffffffffffffffff RBX: ffff81007fcc48e0 RCX: 0000000000000010
> RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff81007ace7240
> RBP: ffff81007a0fbac8 R08: ffff81007cc077b0 R09: ffffffff802ae5ee
> R10: ffff81007a0fbaa8 R11: ffff810077dd99d8 R12: ffff81007ace7240
> R13: 0000000000000000 R14: 0000000000000200 R15: ffff810078473bb0
> FS: 00002ba601c5cdb0(0000) GS:ffffffff8078b000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: ffffffffffffffff CR3: 0000000077c31000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process bash (pid: 2860, threadinfo ffff81007a0fa000, task ffff81007bf48040)
> Stack: ffff81007a0fbac8 ffff81007fcc48e0 ffff81007c81c380 0000000000000000
> ffff81007a0fbaf8 ffffffff802ae682 000010007a0fbae8 ffff810078473bb0
> 0000000000000200 ffff81007fcc48e0 ffff81007a0fbb18 ffffffff802ae75c
> Call Trace:
> [<ffffffff802ae682>] __set_page_dirty+0xdc/0x121
> [<ffffffff802ae75c>] mark_buffer_dirty+0x95/0x99
> [<ffffffff802ae7d2>] __block_commit_write+0x72/0xac
> [<ffffffff802ae988>] block_write_end+0x4f/0x5b
> [<ffffffff802b243f>] blkdev_write_end+0x1b/0x38
> [<ffffffff80265d96>] generic_file_buffered_write+0x1c0/0x648
> [<ffffffff8023a752>] current_fs_time+0x22/0x29
> [<ffffffff80266576>] __generic_file_aio_write_nolock+0x358/0x3c2
> [<ffffffff80266c84>] filemap_fault+0x1c4/0x320
> [<ffffffff80264cce>] unlock_page+0x2d/0x31
> [<ffffffff802666dd>] generic_file_aio_write_nolock+0x3b/0x8d
> [<ffffffff8028e40f>] do_sync_write+0xe2/0x126
> [<ffffffff802497d0>] autoremove_wake_function+0x0/0x38
> [<ffffffff8058e705>] do_page_fault+0x3f8/0x7bb
> [<ffffffff8028cae8>] fd_install+0x5f/0x68
> [<ffffffff8028eb98>] vfs_write+0xae/0x137
> [<ffffffff8028f102>] sys_write+0x47/0x70
> [<ffffffff8020b7ae>] system_call+0x7e/0x83

Strange. It _looks_ like we've somehow caused smp_processor_id() to return
a not-possible CPU number. This code:

void __percpu_counter_add(struct percpu_counter *fbc, s64 amount, s32 batch)
{
s64 count;
s32 *pcount;
int cpu = get_cpu();

pcount = per_cpu_ptr(fbc->counters, cpu);

grabs a null pointer out of fbc->counters and then does the
__percpu_disguise() thing on it, thus getting an address of ~0.

Which I think implies that something in AOE is scribbling on task_struct,
thread_info or a machine register (%fs). Quite an achievement if so...

Could you debug this a bit please? Find out which CPU number
__percpu_counter_add() is using, for a start? I'd do:

in __percpu_counter_add():

int foo;
EXPORT_SYMBOL(foo);

__percpu_counter_add()
{
int cpu = get_cpu();

if (foo)
printk("cpu:%d\n", cpu);
...
}

in aoe:

extern int foo;

{
...
foo = 1;
}



Alternatively, just do

if (!cpu_possible(cpu))
printk(...)

in __percpu_counter_add(). Then you can proceed to work through the
various operations which smp_processor_id() does and find out where it went
wrong: print out %fs, mainly.

If the cpu number is valid then perhaps something scribbled on the cpu's
per-cpu memory.

2007-12-03 21:01:05

by Ed L. Cashin

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

On Mon, Dec 03, 2007 at 11:34:59AM -0800, Andrew Morton wrote:
...
> Strange. It _looks_ like we've somehow caused smp_processor_id() to return
> a not-possible CPU number.
...
> Could you debug this a bit please? Find out which CPU number
> __percpu_counter_add() is using, for a start? I'd do:
...
> Alternatively, just do
>
> if (!cpu_possible(cpu))
> printk(...)
>
> in __percpu_counter_add(). Then you can proceed to work through the
> various operations which smp_processor_id() does and find out where it went
> wrong: print out %fs, mainly.
>
> If the cpu number is valid then perhaps something scribbled on the cpu's
> per-cpu memory.

I'll keep looking at this, but at a glance it looks like the cpu
number is valid, because I don't trip a BUG_ON when I make the change
below (the badval variable is noise, sorry).

--- lx/lib/percpu_counter.c.20071130 2007-12-03 15:43:19.000000000 -0500
+++ lx/lib/percpu_counter.c 2007-12-03 15:47:38.000000000 -0500
@@ -33,7 +33,9 @@ void __percpu_counter_add(struct percpu_
s64 count;
s32 *pcount;
int cpu = get_cpu();
+ u64 badval = 0xffffffffffffffffULL;

+ BUG_ON(!cpu_possible(cpu));
pcount = per_cpu_ptr(fbc->counters, cpu);
count = *pcount + amount;
if (count >= batch || count <= -batch) {

The trace is,

Unable to handle kernel paging request at ffffffffffffffff RIP:
[<ffffffff8036d602>] __percpu_counter_add+0x35/0x7f
PGD 203067 PUD 204067 PMD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in: aoe
Pid: 2777, comm: bash Not tainted 2.6.24-rc3-47dbg #9
RIP: 0010:[<ffffffff8036d602>] [<ffffffff8036d602>] __percpu_counter_add+0x35/0x7f
RSP: 0018:ffff810078d19aa8 EFLAGS: 00010086
RAX: ffffffffffffffff RBX: ffff81007fc71950 RCX: 0000000000000010
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff810078d9a250
RBP: ffff810078d19ac8 R08: ffff81007cc077b0 R09: ffffffff802ae5ee
R10: ffff810078d19aa8 R11: ffff810078cb59d8 R12: ffff81007c81c380
R13: ffff810078d9a250 R14: 0000000000000200 R15: ffff81007805f830
FS: 00002b341db94db0(0000) GS:ffffffff8078b000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffffffffffff CR3: 000000007b415000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process bash (pid: 2777, threadinfo ffff810078d18000, task ffff810078d160c0)
Stack: ffff810078d19ac8 ffff81007fc71950 ffff81007c81c380 0000000000000000
ffff810078d19af8 ffffffff802ae682 0000100078d19ae8 ffff81007805f830
0000000000000200 ffff81007fc71950 ffff810078d19b18 ffffffff802ae75c
Call Trace:
[<ffffffff802ae682>] __set_page_dirty+0xdc/0x121
[<ffffffff802ae75c>] mark_buffer_dirty+0x95/0x99
[<ffffffff802ae7d2>] __block_commit_write+0x72/0xac
[<ffffffff802ae988>] block_write_end+0x4f/0x5b
[<ffffffff802b243f>] blkdev_write_end+0x1b/0x38
[<ffffffff80265d96>] generic_file_buffered_write+0x1c0/0x648
[<ffffffff8023a752>] current_fs_time+0x22/0x29
[<ffffffff80266576>] __generic_file_aio_write_nolock+0x358/0x3c2
[<ffffffff80266c84>] filemap_fault+0x1c4/0x320
[<ffffffff80264cce>] unlock_page+0x2d/0x31
[<ffffffff802666dd>] generic_file_aio_write_nolock+0x3b/0x8d
[<ffffffff8028e40f>] do_sync_write+0xe2/0x126
[<ffffffff802497d0>] autoremove_wake_function+0x0/0x38
[<ffffffff8058e715>] do_page_fault+0x3f8/0x7bb
[<ffffffff8028cae8>] fd_install+0x5f/0x68
[<ffffffff8028eb98>] vfs_write+0xae/0x137
[<ffffffff8028f102>] sys_write+0x47/0x70
[<ffffffff8020b7ae>] system_call+0x7e/0x83


Code: 4c 8b 24 d0 49 63 04 24 48 8d 1c 30 48 63 c1 48 39 c3 7d 0a
RIP [<ffffffff8036d602>] __percpu_counter_add+0x35/0x7f
RSP <ffff810078d19aa8>
CR2: ffffffffffffffff
BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():0, irqs_disabled():1
INFO: lockdep is turned off.

Call Trace:
[<ffffffff802518cb>] debug_show_held_locks+0x1b/0x24
[<ffffffff8022f352>] __might_sleep+0xc7/0xc9
[<ffffffff8024c4d0>] down_read+0x1d/0x4a
[<ffffffff80237bd8>] exit_mm+0x34/0xf7
[<ffffffff8023933e>] do_exit+0x247/0x75b
[<ffffffff8058e9e4>] do_page_fault+0x6c7/0x7bb
[<ffffffff8058a82a>] thread_return+0x42/0x86
[<ffffffff88002544>] :aoe:aoeblk_make_request+0x1c3/0x1d0
[<ffffffff8058cb7d>] error_exit+0x0/0x9a
[<ffffffff802ae5ee>] __set_page_dirty+0x48/0x121
[<ffffffff8036d602>] __percpu_counter_add+0x35/0x7f
[<ffffffff802ae682>] __set_page_dirty+0xdc/0x121
[<ffffffff802ae75c>] mark_buffer_dirty+0x95/0x99
[<ffffffff802ae7d2>] __block_commit_write+0x72/0xac
[<ffffffff802ae988>] block_write_end+0x4f/0x5b
[<ffffffff802b243f>] blkdev_write_end+0x1b/0x38
[<ffffffff80265d96>] generic_file_buffered_write+0x1c0/0x648
[<ffffffff8023a752>] current_fs_time+0x22/0x29
[<ffffffff80266576>] __generic_file_aio_write_nolock+0x358/0x3c2
[<ffffffff80266c84>] filemap_fault+0x1c4/0x320
[<ffffffff80264cce>] unlock_page+0x2d/0x31
[<ffffffff802666dd>] generic_file_aio_write_nolock+0x3b/0x8d
[<ffffffff8028e40f>] do_sync_write+0xe2/0x126
[<ffffffff802497d0>] autoremove_wake_function+0x0/0x38
[<ffffffff8058e715>] do_page_fault+0x3f8/0x7bb
[<ffffffff8028cae8>] fd_install+0x5f/0x68
[<ffffffff8028eb98>] vfs_write+0xae/0x137
[<ffffffff8028f102>] sys_write+0x47/0x70
[<ffffffff8020b7ae>] system_call+0x7e/0x83



--
Ed L Cashin <[email protected]>

2007-12-03 21:40:56

by Ed L. Cashin

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

On Mon, Dec 03, 2007 at 04:00:05PM -0500, Ed L. Cashin wrote:
...
> I'll keep looking at this, but at a glance it looks like the cpu
> number is valid, because I don't trip a BUG_ON when I make the change
> below (the badval variable is noise, sorry).
>
> --- lx/lib/percpu_counter.c.20071130 2007-12-03 15:43:19.000000000 -0500
> +++ lx/lib/percpu_counter.c 2007-12-03 15:47:38.000000000 -0500
> @@ -33,7 +33,9 @@ void __percpu_counter_add(struct percpu_
> s64 count;
> s32 *pcount;
> int cpu = get_cpu();
> + u64 badval = 0xffffffffffffffffULL;
>
> + BUG_ON(!cpu_possible(cpu));
> pcount = per_cpu_ptr(fbc->counters, cpu);
> count = *pcount + amount;
> if (count >= batch || count <= -batch) {

It appears that the fbc->counters pointer is NULL. I added the line,

BUG_ON(!fbc->counters);

... (on line 39 in my percpu_counter.c), and it results in the trace
below. It looks like when it's NULL, percpu_ptr passes it to
__percpu_disguise, which makes it all ones and then tries to
dereference 0xffffffffffffffff to access to the "ptrs" member of the
struct percpu_data.

------------[ cut here ]------------
kernel BUG at lib/percpu_counter.c:39!
invalid opcode: 0000 [1] SMP
CPU 0
Modules linked in: aoe
Pid: 3354, comm: bash Not tainted 2.6.24-rc3-47dbg #10
RIP: 0010:[<ffffffff8036d5f7>] [<ffffffff8036d5f7>] __percpu_counter_add+0x2a/0x8f
RSP: 0018:ffff810075031aa8 EFLAGS: 00010046
RAX: 00000000ffffffff RBX: ffff81007fd19bd8 RCX: 0000000000000000
RDX: 0000000000000010 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffff810075031ac8 R08: ffff81007cc077b0 R09: ffffffff802ae5ee
R10: ffff810075031aa8 R11: ffff8100750318e8 R12: ffff81007c81c380
R13: ffff810073ce8250 R14: 0000000000000200 R15: ffff8100755016b0
FS: 00002b3e5c052db0(0000) GS:ffffffff8078b000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b7f44fb64e0 CR3: 000000007c4b1000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process bash (pid: 3354, threadinfo ffff810075030000, task ffff81007b4da040)
Stack: ffff810075031ac8 ffff81007fd19bd8 ffff81007c81c380 0000000000000000
ffff810075031af8 ffffffff802ae682 0000100075031ae8 ffff8100755016b0
0000000000000200 ffff81007fd19bd8 ffff810075031b18 ffffffff802ae75c
Call Trace:
[<ffffffff802ae682>] __set_page_dirty+0xdc/0x121
[<ffffffff802ae75c>] mark_buffer_dirty+0x95/0x99
[<ffffffff802ae7d2>] __block_commit_write+0x72/0xac
[<ffffffff802ae988>] block_write_end+0x4f/0x5b
[<ffffffff802b243f>] blkdev_write_end+0x1b/0x38
[<ffffffff80265d96>] generic_file_buffered_write+0x1c0/0x648
[<ffffffff8023a752>] current_fs_time+0x22/0x29
[<ffffffff80266576>] __generic_file_aio_write_nolock+0x358/0x3c2
[<ffffffff80266c84>] filemap_fault+0x1c4/0x320
[<ffffffff80264cce>] unlock_page+0x2d/0x31
[<ffffffff802666dd>] generic_file_aio_write_nolock+0x3b/0x8d
[<ffffffff8028e40f>] do_sync_write+0xe2/0x126
[<ffffffff802497d0>] autoremove_wake_function+0x0/0x38
[<ffffffff8058e725>] do_page_fault+0x3f8/0x7bb
[<ffffffff8028cae8>] fd_install+0x5f/0x68
[<ffffffff8028eb98>] vfs_write+0xae/0x137
[<ffffffff8028f102>] sys_write+0x47/0x70
[<ffffffff8020b7ae>] system_call+0x7e/0x83


Code: 0f 0b eb fe 0f a3 3d 7e 08 4f 00 19 c0 85 c0 75 04 0f 0b eb
RIP [<ffffffff8036d5f7>] __percpu_counter_add+0x2a/0x8f
RSP <ffff810075031aa8>
BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():0, irqs_disabled():1
INFO: lockdep is turned off.

Call Trace:
[<ffffffff802518cb>] debug_show_held_locks+0x1b/0x24
[<ffffffff8022f352>] __might_sleep+0xc7/0xc9
[<ffffffff8024c4d0>] down_read+0x1d/0x4a
[<ffffffff80237bd8>] exit_mm+0x34/0xf7
[<ffffffff8023933e>] do_exit+0x247/0x75b
[<ffffffff8020d01e>] kernel_math_error+0x0/0x7e
[<ffffffff8058d35b>] do_trap+0x101/0x110
[<ffffffff8020d4a6>] do_invalid_op+0x91/0x9a
[<ffffffff8036d5f7>] __percpu_counter_add+0x2a/0x8f
[<ffffffff88002544>] :aoe:aoeblk_make_request+0x1c3/0x1d0
[<ffffffff8058aaeb>] io_schedule+0x28/0x34
[<ffffffff8058cb8d>] error_exit+0x0/0x9a
[<ffffffff802ae5ee>] __set_page_dirty+0x48/0x121
[<ffffffff8036d5f7>] __percpu_counter_add+0x2a/0x8f
[<ffffffff802ae682>] __set_page_dirty+0xdc/0x121
[<ffffffff802ae75c>] mark_buffer_dirty+0x95/0x99
[<ffffffff802ae7d2>] __block_commit_write+0x72/0xac
[<ffffffff802ae988>] block_write_end+0x4f/0x5b
[<ffffffff802b243f>] blkdev_write_end+0x1b/0x38
[<ffffffff80265d96>] generic_file_buffered_write+0x1c0/0x648
[<ffffffff8023a752>] current_fs_time+0x22/0x29
[<ffffffff80266576>] __generic_file_aio_write_nolock+0x358/0x3c2
[<ffffffff80266c84>] filemap_fault+0x1c4/0x320
[<ffffffff80264cce>] unlock_page+0x2d/0x31
[<ffffffff802666dd>] generic_file_aio_write_nolock+0x3b/0x8d
[<ffffffff8028e40f>] do_sync_write+0xe2/0x126
[<ffffffff802497d0>] autoremove_wake_function+0x0/0x38
[<ffffffff8058e725>] do_page_fault+0x3f8/0x7bb
[<ffffffff8028cae8>] fd_install+0x5f/0x68
[<ffffffff8028eb98>] vfs_write+0xae/0x137
[<ffffffff8028f102>] sys_write+0x47/0x70
[<ffffffff8020b7ae>] system_call+0x7e/0x83



--
Ed L Cashin <[email protected]>

2007-12-03 22:48:18

by Andrew Morton

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

On Mon, 3 Dec 2007 16:38:37 -0500
"Ed L. Cashin" <[email protected]> wrote:

> > --- lx/lib/percpu_counter.c.20071130 2007-12-03 15:43:19.000000000 -0500
> > +++ lx/lib/percpu_counter.c 2007-12-03 15:47:38.000000000 -0500
> > @@ -33,7 +33,9 @@ void __percpu_counter_add(struct percpu_
> > s64 count;
> > s32 *pcount;
> > int cpu = get_cpu();
> > + u64 badval = 0xffffffffffffffffULL;
> >
> > + BUG_ON(!cpu_possible(cpu));
> > pcount = per_cpu_ptr(fbc->counters, cpu);
> > count = *pcount + amount;
> > if (count >= batch || count <= -batch) {
>
> It appears that the fbc->counters pointer is NULL.

Does this fix?

--- a/drivers/block/aoe/aoeblk.c~a
+++ a/drivers/block/aoe/aoeblk.c
@@ -6,6 +6,7 @@

#include <linux/hdreg.h>
#include <linux/blkdev.h>
+#include <linux/backing-dev.h>
#include <linux/fs.h>
#include <linux/ioctl.h>
#include <linux/genhd.h>
@@ -228,6 +229,7 @@ aoeblk_gdalloc(void *vp)

spin_lock_irqsave(&d->lock, flags);
blk_queue_make_request(&d->blkq, aoeblk_make_request);
+ bdi_init(&d->blkq.backing_dev_info);
gd->major = AOE_MAJOR;
gd->first_minor = d->sysminor * AOE_PARTITIONS;
gd->fops = &aoe_bdops;
_


<wonders whether blk_queue_make_request() should be running bdi_init()?>

2007-12-03 23:14:42

by Andrew Morton

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

On Mon, 3 Dec 2007 14:47:22 -0800
Andrew Morton <[email protected]> wrote:

> Does this fix?

Slightly more elaborate version

- handle errors

- don't do illegal things under spinlock

- clean up error unwinding

--- a/drivers/block/aoe/aoeblk.c~aoe-properly-initialise-the-request_queues-backing_dev_info
+++ a/drivers/block/aoe/aoeblk.c
@@ -6,6 +6,7 @@

#include <linux/hdreg.h>
#include <linux/blkdev.h>
+#include <linux/backing-dev.h>
#include <linux/fs.h>
#include <linux/ioctl.h>
#include <linux/genhd.h>
@@ -210,25 +211,20 @@ aoeblk_gdalloc(void *vp)
if (gd == NULL) {
printk(KERN_ERR "aoe: cannot allocate disk structure for %ld.%ld\n",
d->aoemajor, d->aoeminor);
- spin_lock_irqsave(&d->lock, flags);
- d->flags &= ~DEVFL_GDALLOC;
- spin_unlock_irqrestore(&d->lock, flags);
- return;
+ goto err;
}

d->bufpool = mempool_create_slab_pool(MIN_BUFS, buf_pool_cache);
if (d->bufpool == NULL) {
printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%ld\n",
d->aoemajor, d->aoeminor);
- put_disk(gd);
- spin_lock_irqsave(&d->lock, flags);
- d->flags &= ~DEVFL_GDALLOC;
- spin_unlock_irqrestore(&d->lock, flags);
- return;
+ goto err_disk;
}

- spin_lock_irqsave(&d->lock, flags);
blk_queue_make_request(&d->blkq, aoeblk_make_request);
+ if (bdi_init(&d->blkq.backing_dev_info))
+ goto err_mempool;
+ spin_lock_irqsave(&d->lock, flags);
gd->major = AOE_MAJOR;
gd->first_minor = d->sysminor * AOE_PARTITIONS;
gd->fops = &aoe_bdops;
@@ -246,6 +242,16 @@ aoeblk_gdalloc(void *vp)

add_disk(gd);
aoedisk_add_sysfs(d);
+ return;
+
+err_mempool:
+ mempool_destroy(d->bufpool);
+err_disk:
+ put_disk(gd);
+err:
+ spin_lock_irqsave(&d->lock, flags);
+ d->flags &= ~DEVFL_GDALLOC;
+ spin_unlock_irqrestore(&d->lock, flags);
}

void
_


It was done lackadaisically and needs checking.

2007-12-03 23:25:49

by Ed L. Cashin

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

On Mon, Dec 03, 2007 at 02:47:22PM -0800, Andrew Morton wrote:
> On Mon, 3 Dec 2007 16:38:37 -0500
> "Ed L. Cashin" <[email protected]> wrote:
...
> > It appears that the fbc->counters pointer is NULL.
>
> Does this fix?
>
> --- a/drivers/block/aoe/aoeblk.c~a
> +++ a/drivers/block/aoe/aoeblk.c
> @@ -6,6 +6,7 @@
>
> #include <linux/hdreg.h>
> #include <linux/blkdev.h>
> +#include <linux/backing-dev.h>
> #include <linux/fs.h>
> #include <linux/ioctl.h>
> #include <linux/genhd.h>
> @@ -228,6 +229,7 @@ aoeblk_gdalloc(void *vp)
>
> spin_lock_irqsave(&d->lock, flags);
> blk_queue_make_request(&d->blkq, aoeblk_make_request);
> + bdi_init(&d->blkq.backing_dev_info);
> gd->major = AOE_MAJOR;
> gd->first_minor = d->sysminor * AOE_PARTITIONS;
> gd->fops = &aoe_bdops;
> _
>
>
> <wonders whether blk_queue_make_request() should be running bdi_init()?>

No, the behavior doesn't change with this patch applied.

Meanwhile I have started a git bisect, and hopefully that will turn up
a specific patch before I hit an unbootable kernel or get my machine
in a state where it won't boot.

--
Ed L Cashin <[email protected]>

2007-12-03 23:45:50

by Ed L. Cashin

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

On Mon, Dec 03, 2007 at 03:13:49PM -0800, Andrew Morton wrote:
> On Mon, 3 Dec 2007 14:47:22 -0800
> Andrew Morton <[email protected]> wrote:
>
> > Does this fix?
>
> Slightly more elaborate version

Yes, this patch does eliminate the problem. Without it, no write can
complete, and with it I have seen many writes complete without any
trouble.

Thank you for looking into this. I will look more closely at this
patch tomorrow.

--
Ed L Cashin <[email protected]>

2007-12-08 22:59:40

by Jon Nelson

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

I can confirm that 2.6.24rc4 with the (second) patch works fine.
--
Jon

2007-12-09 03:50:54

by Andrew Morton

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

On Sat, 8 Dec 2007 16:59:30 -0600 "Jon Nelson" <[email protected]> wrote:

> I can confirm that 2.6.24rc4 with the (second) patch works fine.

OK, thanks.

We haven't heard back from Ed yet. I'll sit on this for a few more days.


From: Andrew Morton <[email protected]>

Cc: "Ed L. Cashin" <[email protected]>
Cc: <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

drivers/block/aoe/aoeblk.c | 26 ++++++++++++++++----------
1 file changed, 16 insertions(+), 10 deletions(-)

diff -puN drivers/block/aoe/aoeblk.c~aoe-properly-initialise-the-request_queues-backing_dev_info drivers/block/aoe/aoeblk.c
--- a/drivers/block/aoe/aoeblk.c~aoe-properly-initialise-the-request_queues-backing_dev_info
+++ a/drivers/block/aoe/aoeblk.c
@@ -6,6 +6,7 @@

#include <linux/hdreg.h>
#include <linux/blkdev.h>
+#include <linux/backing-dev.h>
#include <linux/fs.h>
#include <linux/ioctl.h>
#include <linux/genhd.h>
@@ -210,25 +211,20 @@ aoeblk_gdalloc(void *vp)
if (gd == NULL) {
printk(KERN_ERR "aoe: cannot allocate disk structure for %ld.%ld\n",
d->aoemajor, d->aoeminor);
- spin_lock_irqsave(&d->lock, flags);
- d->flags &= ~DEVFL_GDALLOC;
- spin_unlock_irqrestore(&d->lock, flags);
- return;
+ goto err;
}

d->bufpool = mempool_create_slab_pool(MIN_BUFS, buf_pool_cache);
if (d->bufpool == NULL) {
printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%ld\n",
d->aoemajor, d->aoeminor);
- put_disk(gd);
- spin_lock_irqsave(&d->lock, flags);
- d->flags &= ~DEVFL_GDALLOC;
- spin_unlock_irqrestore(&d->lock, flags);
- return;
+ goto err_disk;
}

- spin_lock_irqsave(&d->lock, flags);
blk_queue_make_request(&d->blkq, aoeblk_make_request);
+ if (bdi_init(&d->blkq.backing_dev_info))
+ goto err_mempool;
+ spin_lock_irqsave(&d->lock, flags);
gd->major = AOE_MAJOR;
gd->first_minor = d->sysminor * AOE_PARTITIONS;
gd->fops = &aoe_bdops;
@@ -246,6 +242,16 @@ aoeblk_gdalloc(void *vp)

add_disk(gd);
aoedisk_add_sysfs(d);
+ return;
+
+err_mempool:
+ mempool_destroy(d->bufpool);
+err_disk:
+ put_disk(gd);
+err:
+ spin_lock_irqsave(&d->lock, flags);
+ d->flags &= ~DEVFL_GDALLOC;
+ spin_unlock_irqrestore(&d->lock, flags);
}

void
_

2007-12-10 15:25:41

by Ed L. Cashin

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

On Sat, Dec 08, 2007 at 07:50:15PM -0800, Andrew Morton wrote:
> On Sat, 8 Dec 2007 16:59:30 -0600 "Jon Nelson" <[email protected]> wrote:
>
> > I can confirm that 2.6.24rc4 with the (second) patch works fine.
>
> OK, thanks.
>
> We haven't heard back from Ed yet. I'll sit on this for a few more days.

Sorry, yes, the second patch works well for me in testing. I had some
initial concerns that were unfounded.

--
Ed L Cashin <[email protected]>