2008-07-23 20:07:23

by Vegard Nossum

[permalink] [raw]
Subject: recent -git: BUG in free_thread_xstate

Hi,

I just got this on c010b2f76c3032e48097a6eef291d8593d5d79a6 (-git from
yesterday):

BUG: unable to handle kernel paging request at 00664381
IP: [<c010b274>] free_thread_xstate+0x4/0x30
*pde = 00000000
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Pid: 4, comm: ksoftirqd/0 Not tainted (2.6.26-06077-gc010b2f #100)
EIP: 0060:[<c010b274>] EFLAGS: 00010246 CPU: 0
EIP is at free_thread_xstate+0x4/0x30
EAX: 00664001 EBX: f21e0000 ECX: 00000000 EDX: f7872fd0
ESI: f221df38 EDI: c0833d00 EBP: f7889f4c ESP: f7889f48
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process ksoftirqd/0 (pid: 4, ti=f7888000 task=f7872fd0 task.ti=f7888000)
Stack: f21e0000 f7889f58 c010b2ad f221cfb0 f7889f64 c01352c9 f221cfb0 f7889f70
c0136713 f2b506cc f7889f78 c0138ea7 f7889f90 c01790ff 00000282 c0785aa0
00000001 0000000a f7889fac c013cad2 c0838c00 c0838c00 c0838c00 00000246
Call Trace:
[<c010b2ad>] ? free_thread_info+0xd/0x20
[<c01352c9>] ? free_task+0x19/0x30
[<c0136713>] ? __put_task_struct+0x53/0xb0
[<c0138ea7>] ? delayed_put_task_struct+0x27/0x30
[<c01790ff>] ? rcu_process_callbacks+0x6f/0xb0
[<c013cad2>] ? __do_softirq+0x92/0x110
[<c013cbf5>] ? do_softirq+0xa5/0xb0
[<c013cc76>] ? ksoftirqd+0x76/0x180
[<c013cc00>] ? ksoftirqd+0x0/0x180
[<c014befc>] ? kthread+0x3c/0x70
[<c014bec0>] ? kthread+0x0/0x70
[<c0104d8b>] ? kernel_thread_helper+0x7/0x1c
=======================
Code: 04 00 00 00 00 c7 04 24 00 00 04 00 e8 46 84 09 00 a3 dc 07 84 c0 c9 c3 eb
0d 90 90 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 53 <8b> 90 80 03 00 00 89 c3
85 d2 74 14 a1 dc 07 84 c0 e8 c6 88 09
EIP: [<c010b274>] free_thread_xstate+0x4/0x30 SS:ESP 0068:f7889f48
Kernel panic - not syncing: Fatal exception in interrupt

EIP is at arch/x86/kernel/process.c:36:

if (tsk->thread.xstate) {

This looks related to the recent floating-point changes and maybe RCU,
adding Ccs.

It seems quite reproducible, so I'll give it a shot with the latest
-git as well.


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036


2008-07-23 20:23:37

by Vegard Nossum

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

On Wed, Jul 23, 2008 at 10:07 PM, Vegard Nossum <[email protected]> wrote:
> Hi,
>
> I just got this on c010b2f76c3032e48097a6eef291d8593d5d79a6 (-git from
> yesterday):
>
> BUG: unable to handle kernel paging request at 00664381
> IP: [<c010b274>] free_thread_xstate+0x4/0x30
> *pde = 00000000
> Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> Pid: 4, comm: ksoftirqd/0 Not tainted (2.6.26-06077-gc010b2f #100)
> EIP: 0060:[<c010b274>] EFLAGS: 00010246 CPU: 0
> EIP is at free_thread_xstate+0x4/0x30
> EAX: 00664001 EBX: f21e0000 ECX: 00000000 EDX: f7872fd0
> ESI: f221df38 EDI: c0833d00 EBP: f7889f4c ESP: f7889f48
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> Process ksoftirqd/0 (pid: 4, ti=f7888000 task=f7872fd0 task.ti=f7888000)
> Stack: f21e0000 f7889f58 c010b2ad f221cfb0 f7889f64 c01352c9 f221cfb0 f7889f70
> c0136713 f2b506cc f7889f78 c0138ea7 f7889f90 c01790ff 00000282 c0785aa0
> 00000001 0000000a f7889fac c013cad2 c0838c00 c0838c00 c0838c00 00000246
> Call Trace:
> [<c010b2ad>] ? free_thread_info+0xd/0x20
> [<c01352c9>] ? free_task+0x19/0x30
> [<c0136713>] ? __put_task_struct+0x53/0xb0
> [<c0138ea7>] ? delayed_put_task_struct+0x27/0x30
> [<c01790ff>] ? rcu_process_callbacks+0x6f/0xb0
> [<c013cad2>] ? __do_softirq+0x92/0x110
> [<c013cbf5>] ? do_softirq+0xa5/0xb0
> [<c013cc76>] ? ksoftirqd+0x76/0x180
> [<c013cc00>] ? ksoftirqd+0x0/0x180
> [<c014befc>] ? kthread+0x3c/0x70
> [<c014bec0>] ? kthread+0x0/0x70
> [<c0104d8b>] ? kernel_thread_helper+0x7/0x1c
> =======================
> Code: 04 00 00 00 00 c7 04 24 00 00 04 00 e8 46 84 09 00 a3 dc 07 84 c0 c9 c3 eb
> 0d 90 90 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 53 <8b> 90 80 03 00 00 89 c3
> 85 d2 74 14 a1 dc 07 84 c0 e8 c6 88 09
> EIP: [<c010b274>] free_thread_xstate+0x4/0x30 SS:ESP 0068:f7889f48
> Kernel panic - not syncing: Fatal exception in interrupt
>
> EIP is at arch/x86/kernel/process.c:36:
>
> if (tsk->thread.xstate) {
>
> This looks related to the recent floating-point changes and maybe RCU,
> adding Ccs.
>
> It seems quite reproducible, so I'll give it a shot with the latest
> -git as well.

Don't know if it's related, but I got this on the same kernel:

BUG: unable to handle kernel paging request at c0817fac
IP: [<c0135bcc>] copy_process+0x8ec/0x1130
*pde = 3780e163 *pte = 00817162
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Pid: 1280, comm: udevd Not tainted (2.6.26-06077-gc010b2f #100)
EIP: 0060:[<c0135bcc>] EFLAGS: 00210086 CPU: 1
EIP is at copy_process+0x8ec/0x1130
EAX: ffffffff EBX: f799a224 ECX: 00000000 EDX: 00450008
ESI: f7999fe0 EDI: 00000000 EBP: f6f4bf44 ESP: f6f4bf08
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process udevd (pid: 1280, ti=f6f4a000 task=f6d41fe0 task.ti=f6f4a000)
Stack: 00000000 f7999fe0 f6f4bfb8 f7999fe0 f6f4bfb8 bf96c708 01200011 f799a1e4
00000000 f7918400 00000000 00000000 00000000 f6f4bfb8 01200011 f6f4bf9c
c013646d 00000000 b7e65938 f78b6900 bf96c708 00000000 f6c98900 f6f4bf9c
Call Trace:
[<c013646d>] ? do_fork+0x5d/0x2b0
[<c0191571>] ? do_munmap+0x1e1/0x240
[<c01024af>] ? sys_clone+0x2f/0x40
[<c010404f>] ? sysenter_past_esp+0x78/0xc5
=======================
Code: 00 00 64 a1 00 70 7e c0 8b 80 70 01 00 00 89 86 70 01 00 00 8b 46 04 8b 50
10 0f a3 96 8c 01 00 00 19 c0 85 c0 0f 84 db 07 00 00 <0f> a3 15 ac df 78 c0 19
c0 85 c0 0f 84 ca 07 00 00 f7 45 dc 00
EIP: [<c0135bcc>] copy_process+0x8ec/0x1130 SS:ESP 0068:f6f4bf08
---[ end trace 11ce0863bd4ff64d ]---
note: udevd[1280] exited with preempt_count 1

$ addr2line -e vmlinux -i c0135bcc
include/asm/bitops.h:305
kernel/fork.c:1151

Seems to be this block (first line):

if (unlikely(!cpu_isset(task_cpu(p), p->cpus_allowed) ||
!cpu_online(task_cpu(p))))
set_task_cpu(p, smp_processor_id());


My test is basically stressing the network and running CPU hotplug at
the same time.


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2008-07-23 20:28:33

by Vegard Nossum

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

On Wed, Jul 23, 2008 at 10:23 PM, Vegard Nossum <[email protected]> wrote:
> My test is basically stressing the network and running CPU hotplug at
> the same time.

FWIW, a third run gives us this additional clue before going down with
the first error I posted in this thread:

=============================================================================
BUG task_struct: Poison overwritten
-----------------------------------------------------------------------------
INFO: 0xf3d00000-0xf3d0006b. First byte 0x1 instead of 0x6b
INFO: Allocated in copy_process+0x68/0x1130 age=4 cpu=0 pid=4338
INFO: Freed in free_task+0x2c/0x30 age=2 cpu=0 pid=4
INFO: Slab 0xc1c25c00 objects=8 used=3 fp=0xf3d00000 flags=0x400020c3
INFO: Object 0xf3d00000 @offset=0 fp=0xf3d03fc0
Object 0xf3d00000: 01 40 66 00 00 16 ec ee ad b9 00 1c 26 8a 70 f8
.@f...<EC><U+EB79>..&.p<F8>
Object 0xf3d00010: 08 00 45 00 00 54 00 00 40 00 40 01 b7 e8 c0 a8
..E..T..@.@.<B7><E8><C0><A8>
Object 0xf3d00020: 00 c4 c0 a8 00 ac 08 00 6e c0 df 24 55 33 75 af
.<C4><C0><A8>.<AC>..n<C0><DF>$U3u<AF>
Object 0xf3d00030: 87 48 69 ec 03 00 08 09 0a 0b 0c 0d 0e 0f 10 11
.Hi<EC>............
Object 0xf3d00040: 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21
...............!
Object 0xf3d00050: 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31
"#$%&'()*+,-./01
Object 0xf3d00060: 32 33 34 35 36 37 89 e0 c8 4a fb e0 6b 6b 6b 6b
234567.<E0><C8>J<FB><E0>kkkk
Object 0xf3d00070: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
kkkkkkkkkkkkkkkk
[...]
Object 0xf3d00fb0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5
kkkkkkkkkkkkkkk<A5>
Redzone 0xf3d00fc0: bb bb bb bb
<BB><BB><BB><BB>
Padding 0xf3d00fe8: 5a 5a 5a 5a 5a 5a 5a 5a
ZZZZZZZZ
Pid: 3995, comm: bash Not tainted 2.6.26-06077-gc010b2f #100
[<c01a0363>] print_trailer+0xd3/0x120
[<c01a0485>] check_bytes_and_report+0xd5/0x100
[<c01a06b1>] check_object+0x1b1/0x200
[<c01a1f90>] __slab_alloc+0x510/0x5f0
[<c01a2449>] kmem_cache_alloc+0xd9/0xe0
[<c0135348>] ? copy_process+0x68/0x1130
[<c0135348>] ? copy_process+0x68/0x1130
[<c0135348>] copy_process+0x68/0x1130
[<c02a59b7>] ? _raw_spin_trylock+0x17/0x50
[<c013646d>] do_fork+0x5d/0x2b0
[<c015bd5b>] ? trace_hardirqs_on+0xb/0x10
[<c0297240>] ? copy_to_user+0x40/0x130
[<c01024af>] sys_clone+0x2f/0x40
[<c010404f>] sysenter_past_esp+0x78/0xc5
=======================
FIX task_struct: Restoring 0xf3d00000-0xf3d0006b=0x6b
FIX task_struct: Marking all objects used


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2008-07-23 20:31:21

by Suresh Siddha

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

On Wed, Jul 23, 2008 at 01:07:04PM -0700, Vegard Nossum wrote:
> Hi,
>
> I just got this on c010b2f76c3032e48097a6eef291d8593d5d79a6 (-git from
> yesterday):

Do you see this in 2.6.26 aswell? I suspect it is coming from post 2.6.26
changes.

>
> BUG: unable to handle kernel paging request at 00664381
> IP: [<c010b274>] free_thread_xstate+0x4/0x30
...

> EIP is at arch/x86/kernel/process.c:36:
>
> if (tsk->thread.xstate) {
>

It looks like the kernel stack of that process got corrupted, corrupting the
task pointer in thread_info. Can you send us your config file?

thanks,
suresh

2008-07-23 20:39:32

by Vegard Nossum

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

Hi!

On Wed, Jul 23, 2008 at 10:31 PM, Suresh Siddha
<[email protected]> wrote:
> On Wed, Jul 23, 2008 at 01:07:04PM -0700, Vegard Nossum wrote:
>> Hi,
>>
>> I just got this on c010b2f76c3032e48097a6eef291d8593d5d79a6 (-git from
>> yesterday):
>
> Do you see this in 2.6.26 aswell? I suspect it is coming from post 2.6.26
> changes.

Just reproduced it on latest linux-2.6.git.

>> BUG: unable to handle kernel paging request at 00664381
>> IP: [<c010b274>] free_thread_xstate+0x4/0x30
> ...
>

The funny thing is that this number (00664281) is _always_ the same,
even on the newer kernel. What's special about it?

>> EIP is at arch/x86/kernel/process.c:36:
>>
>> if (tsk->thread.xstate) {
>>
>
> It looks like the kernel stack of that process got corrupted, corrupting the
> task pointer in thread_info. Can you send us your config file?

It has been uploaded here:

http://userweb.kernel.org/~vegard/bugs/20080724-fork/config

Given the reproducible nature of this, I think I'll just go ahead and
start a bisect...

Thanks for looking :-)


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2008-07-23 20:52:52

by Vegard Nossum

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

On Wed, Jul 23, 2008 at 10:31 PM, Suresh Siddha
<[email protected]> wrote:
> On Wed, Jul 23, 2008 at 01:07:04PM -0700, Vegard Nossum wrote:
>> Hi,
>>
>> I just got this on c010b2f76c3032e48097a6eef291d8593d5d79a6 (-git from
>> yesterday):
>
> Do you see this in 2.6.26 aswell? I suspect it is coming from post 2.6.26
> changes.
>

Humm... I got something different now on plain 2.6.26:

------------[ cut here ]------------
WARNING: at kernel/sched_fair.c:815 hrtick_start_fair+0x158/0x170()
Pid: 5575, comm: migration/1 Not tainted 2.6.26 #1
[<c013471f>] warn_on_slowpath+0x4f/0x70
[<c010aea5>] ? native_sched_clock+0xb5/0x110
[<c0129aae>] ? update_curr_rt+0x10e/0x450
[<c0581427>] ? _spin_unlock+0x27/0x50
[<c0129aae>] ? update_curr_rt+0x10e/0x450
[<c012802a>] ? __dequeue_entity+0x2a/0xa0
[<c0129518>] hrtick_start_fair+0x158/0x170
[<c012958f>] pick_next_task_fair+0x5f/0x70
[<c057e2e8>] schedule+0x148/0x850
[<c0581582>] ? _spin_unlock_irq+0x22/0x60
[<c0157a24>] ? trace_hardirqs_on+0xd4/0x160
[<c012fd92>] migration_thread+0x132/0x240
[<c0127f39>] ? complete+0x49/0x60
[<c012fc60>] ? migration_thread+0x0/0x240
[<c0148722>] kthread+0x42/0x70
[<c01486e0>] ? kthread+0x0/0x70
[<c0105a53>] kernel_thread_helper+0x7/0x14
=======================
---[ end trace 3e4aafeba0960600 ]---
BUG: unable to handle kernel paging request at c0804358
IP: [<c057e38d>] schedule+0x1ed/0x850
*pde = 3780e163 *pte = 00804162
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Pid: 5575, comm: migration/1 Tainted: G W (2.6.26 #1)
EIP: 0060:[<c057e38d>] EFLAGS: 00010086 CPU: 1
EIP is at schedule+0x1ed/0x850
EAX: c0803f00 EBX: 00000001 ECX: f4d10000 EDX: 00450008
ESI: f4c91fe0 EDI: c0803f00 EBP: f4c85fa0 ESP: f4c85f3c
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process migration/1 (pid: 5575, ti=f4c84000 task=f6348000 task.ti=f4c84000)
Stack: ef927c97 f63484b8 f6348000 f63484b8 c08ba810 f6348000 c1fb3f00 f4c91fe0
00000000 00000000 f6348000 f6348274 c1fb3f00 f4c84000 00000000 f6348000
c0581582 c1fb3f00 f4c85f94 c0157a24 c1fb3f00 00000001 c1fb4378 00000001
Call Trace:
[<c0581582>] ? _spin_unlock_irq+0x22/0x60
[<c0157a24>] ? trace_hardirqs_on+0xd4/0x160
[<c012fd92>] ? migration_thread+0x132/0x240
[<c0127f39>] ? complete+0x49/0x60
[<c012fc60>] ? migration_thread+0x0/0x240
[<c0148722>] ? kthread+0x42/0x70
[<c01486e0>] ? kthread+0x0/0x70
[<c0105a53>] ? kernel_thread_helper+0x7/0x14
=======================
Code: c0 0f 84 40 02 00 00 8b 86 48 04 00 00 39 45 b8 0f 84 ad 00 00
00 8b 75 b8 b8 00 3f 80 c0 8b 4e 04 8b 51 10 03 04 95 00 af
75 c0 <8b> b0 58 04 00 00 8b 98 54 04 00 00 89 75 c0 8b 75 b8 89 5d bc
EIP: [<c057e38d>] schedule+0x1ed/0x850 SS:ESP 0068:f4c85f3c
---[ end trace 3e4aafeba0960600 ]---
note: migration/1[5575] exited with preempt_count 2

I'll give it a few more tries to see if it can come up with the same
thing as before.


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2008-07-23 21:22:36

by Dmitry Adamushko

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

2008/7/23 Vegard Nossum <[email protected]>:
> On Wed, Jul 23, 2008 at 10:31 PM, Suresh Siddha
> <[email protected]> wrote:
>> On Wed, Jul 23, 2008 at 01:07:04PM -0700, Vegard Nossum wrote:
>>> Hi,
>>>
>>> I just got this on c010b2f76c3032e48097a6eef291d8593d5d79a6 (-git from
>>> yesterday):
>>
>> Do you see this in 2.6.26 aswell? I suspect it is coming from post 2.6.26
>> changes.
>>
>
> Humm... I got something different now on plain 2.6.26:
>
> ------------[ cut here ]------------
> WARNING: at kernel/sched_fair.c:815 hrtick_start_fair+0x158/0x170()

that's interesting. As a first step and if it's easily reproducible,
would you try something like below?

(hope, it compiles :-)

If it doesn't crash immediately, then 'p' is likely to be a real task
(well, we'll see then with p->comm) and if its cpu is different from
rq->cpu (must be in this case), then we might have a funny race
somewhere...

--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -813,6 +813,9 @@ static void hrtick_start_fair(struct rq *rq,
struct task_struct *p)
struct cfs_rq *cfs_rq = cfs_rq_of(se);

WARN_ON(task_rq(p) != rq);
+ if (task_rq(p) != rq)
+ printk(KERN_ERR "task (%s)'s cpu (%d), rq's (%d)\n",
+ p->comm, task_cpu(p), rq->cpu);

if (hrtick_enabled(rq) && cfs_rq->nr_running > 1) {
u64 slice = sched_slice(cfs_rq, se);

>
> [ ... ]
>

--
Best regards,
Dmitry Adamushko

2008-07-23 21:22:52

by Vegard Nossum

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

On Wed, Jul 23, 2008 at 10:31 PM, Suresh Siddha
<[email protected]> wrote:
> On Wed, Jul 23, 2008 at 01:07:04PM -0700, Vegard Nossum wrote:
>> Hi,
>>
>> I just got this on c010b2f76c3032e48097a6eef291d8593d5d79a6 (-git from
>> yesterday):
>
> Do you see this in 2.6.26 aswell? I suspect it is coming from post 2.6.26
> changes.

Yep. Got this on 2.6.26 now:

BUG: unable to handle kernel paging request at 00664381
IP: [<c010b884>] free_thread_xstate+0x4/0x30
*pde = 00000000
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Pid: 3796, comm: bash Not tainted (2.6.26 #1)
EIP: 0060:[<c010b884>] EFLAGS: 00210246 CPU: 0
EIP is at free_thread_xstate+0x4/0x30
EAX: 00664001 EBX: f3870000 ECX: 00000004 EDX: f4b544e8
ESI: f4bdef28 EDI: c07feda0 EBP: f5325bd0 ESP: f5325bcc
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process bash (pid: 3796, ti=f5324000 task=f4b53fc0 task.ti=f5324000)
Stack: f3870000 f5325bdc c010b8bd f4bddfa0 f5325be8 c0132b89 f4bddfa0 f5325bf4
c0133fd1 f4b77e00 f5325bfc c01368a7 f5325c14 c0172b8c 00200282 c0752b40
00000001 00000009 f5325c30 c0139cd3 c0803d00 c0803d00 c0803d00 00200046
Call Trace:
[<c010b8bd>] ? free_thread_info+0xd/0x20
[<c0132b89>] ? free_task+0x19/0x30
[<c0133fd1>] ? __put_task_struct+0x51/0xa0
[<c01368a7>] ? delayed_put_task_struct+0x27/0x30
[<c0172b8c>] ? rcu_process_callbacks+0x6c/0xb0
[<c0139cd3>] ? __do_softirq+0x83/0x100
[<c0139df5>] ? do_softirq+0xa5/0xb0
[<c0139f95>] ? irq_exit+0x95/0xa0
[<c0107e4d>] ? do_IRQ+0x4d/0xa0
[<c01057b2>] ? common_interrupt+0x2e/0x34
[<c013549e>] ? vprintk+0x1be/0x420
[<c010aea5>] ? native_sched_clock+0xb5/0x110
[<c010aea5>] ? native_sched_clock+0xb5/0x110
[<c013571b>] ? printk+0x1b/0x20
[<c012cbec>] ? cpu_attach_domain+0x3ec/0x410
[<c010aea5>] ? native_sched_clock+0xb5/0x110
[<c01979e1>] ? check_bytes_and_report+0x21/0xc0
[<c0197d8f>] ? check_object+0xdf/0x1f0
[<c0126c17>] ? sd_free_ctl_entry+0x37/0x50
[<c0157895>] ? mark_held_locks+0x65/0x80
[<c0199055>] ? kfree+0xb5/0x120
[<c0157a24>] ? trace_hardirqs_on+0xd4/0x160
[<c0126c17>] ? sd_free_ctl_entry+0x37/0x50
[<c0126c17>] ? sd_free_ctl_entry+0x37/0x50
[<c0126c17>] ? sd_free_ctl_entry+0x37/0x50
[<c012cc3e>] ? detach_destroy_domains+0x2e/0x50
[<c012cc9b>] ? update_sched_domains+0x3b/0x50
[<c014d467>] ? notifier_call_chain+0x37/0x70
[<c014d4d9>] ? __raw_notifier_call_chain+0x19/0x20
[<c055c858>] ? _cpu_down+0x78/0x240
[<c015d92f>] ? cpu_maps_update_begin+0xf/0x20
[<c055ca4b>] ? cpu_down+0x2b/0x40
[<c055dc69>] ? store_online+0x39/0x80
[<c055dc30>] ? store_online+0x0/0x80
[<c02faf6b>] ? sysdev_store+0x2b/0x40
[<c01dcdd2>] ? sysfs_write_file+0xa2/0x100
[<c019eb76>] ? vfs_write+0x96/0x130
[<c01dcd30>] ? sysfs_write_file+0x0/0x100
[<c019f23d>] ? sys_write+0x3d/0x70
[<c0104cdb>] ? sysenter_past_esp+0x78/0xd1
=======================
Code: 04 00 00 00 00 c7 04 24 00 00 04 00 e8 96 f8 08 00 a3 b4 a5 80
c0 c9 c3 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 53 <8b>
90 80 03 00 00 89 c3 85 d2 74 14 a1 b4 a5 80 c0 e8 d6 e4 08
EIP: [<c010b884>] free_thread_xstate+0x4/0x30 SS:ESP 0068:f5325bcc
Kernel panic - not syncing: Fatal exception in interrupt

I'm not sure what to make of this. It looks related to the rebuilding
of sched domains that we saw earlier. But this reproduces on both
v2.6.26 and latest -git (though not with that backtrace).

Notice that the magic number is still the same -- 0x00664381. I'm curious.

Ah. The code decodes to:
mov 0x380(%rax),%edx

so the "real" magic number must be the one in %rax, 0x00664001. This
looks slightly more like a magic number. The middle two bytes may be
character codes: "f@"

I'm adding some of the people from the whole sched domain thing thread to Cc.


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2008-07-23 21:35:33

by Dmitry Adamushko

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

2008/7/23 Dmitry Adamushko <[email protected]>:
> 2008/7/23 Vegard Nossum <[email protected]>:
>> On Wed, Jul 23, 2008 at 10:31 PM, Suresh Siddha
>> <[email protected]> wrote:
>>> On Wed, Jul 23, 2008 at 01:07:04PM -0700, Vegard Nossum wrote:
>>>> Hi,
>>>>
>>>> I just got this on c010b2f76c3032e48097a6eef291d8593d5d79a6 (-git from
>>>> yesterday):
>>>
>>> Do you see this in 2.6.26 aswell? I suspect it is coming from post 2.6.26
>>> changes.
>>>
>>
>> Humm... I got something different now on plain 2.6.26:
>>
>> ------------[ cut here ]------------
>> WARNING: at kernel/sched_fair.c:815 hrtick_start_fair+0x158/0x170()
>
> that's interesting. As a first step and if it's easily reproducible,
> would you try something like below?

If not, maybe object files are still in-tact? Just to see where
exactly in schedule() it crashed so maybe we may get an idea on what
was a result of next = pick_next_task(rq, prev);


--
Best regards,
Dmitry Adamushko

2008-07-23 21:36:26

by Vegard Nossum

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

On Wed, Jul 23, 2008 at 10:28 PM, Vegard Nossum <[email protected]> wrote:
> On Wed, Jul 23, 2008 at 10:23 PM, Vegard Nossum <[email protected]> wrote:
>> My test is basically stressing the network and running CPU hotplug at
>> the same time.
>
> FWIW, a third run gives us this additional clue before going down with
> the first error I posted in this thread:
>
> =============================================================================
> BUG task_struct: Poison overwritten
> -----------------------------------------------------------------------------
> INFO: 0xf3d00000-0xf3d0006b. First byte 0x1 instead of 0x6b

Note that the number of overwritten bytes is exactly 0x6b. This sounds
VERY much like a use-after-free, e.g. maybe something loaded 0x6b into
the "size" parameter for memcpy().

> INFO: Allocated in copy_process+0x68/0x1130 age=4 cpu=0 pid=4338
> INFO: Freed in free_task+0x2c/0x30 age=2 cpu=0 pid=4

Pid 4 seems to always be ksoftirqd/0 on this machine.

> INFO: Slab 0xc1c25c00 objects=8 used=3 fp=0xf3d00000 flags=0x400020c3
> INFO: Object 0xf3d00000 @offset=0 fp=0xf3d03fc0
> Object 0xf3d00000: 01 40 66 00 00 16 ec ee ad b9 00 1c 26 8a 70 f8
> .@f...<EC><U+EB79>..&.p<F8>

That's the "magic number": 0x00664001.

Why would this always get written in this position of the task struct?

> Object 0xf3d00010: 08 00 45 00 00 54 00 00 40 00 40 01 b7 e8 c0 a8
> ..E..T..@.@.<B7><E8><C0><A8>
> Object 0xf3d00020: 00 c4 c0 a8 00 ac 08 00 6e c0 df 24 55 33 75 af
> .<C4><C0><A8>.<AC>..n<C0><DF>$U3u<AF>
> Object 0xf3d00030: 87 48 69 ec 03 00 08 09 0a 0b 0c 0d 0e 0f 10 11
> .Hi<EC>............
> Object 0xf3d00040: 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21
> ...............!
> Object 0xf3d00050: 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31
> "#$%&'()*+,-./01
> Object 0xf3d00060: 32 33 34 35 36 37 89 e0 c8 4a fb e0 6b 6b 6b 6b
> 234567.<E0><C8>J<FB><E0>kkkk

Why is it writing the sequence of numbers from 0x08 to 0x37 here?

Also, the last line disassembles to this:

0: 89 e0 mov %esp,%eax
2: c8 4a 4b e0 enterq $0x4b4a,$0xe0


...Additional clues may be found... maybe :-)


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2008-07-23 21:45:42

by Vegard Nossum

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

On Wed, Jul 23, 2008 at 11:35 PM, Dmitry Adamushko
<[email protected]> wrote:
> 2008/7/23 Dmitry Adamushko <[email protected]>:
>> 2008/7/23 Vegard Nossum <[email protected]>:
>>> ------------[ cut here ]------------
>>> WARNING: at kernel/sched_fair.c:815 hrtick_start_fair+0x158/0x170()
>>
>> that's interesting. As a first step and if it's easily reproducible,
>> would you try something like below?
>
> If not, maybe object files are still in-tact? Just to see where
> exactly in schedule() it crashed so maybe we may get an idea on what
> was a result of next = pick_next_task(rq, prev);

Yes, indeed. You mean this part?

BUG: unable to handle kernel paging request at c0804358
IP: [<c057e38d>] schedule+0x1ed/0x850
*pde = 3780e163 *pte = 00804162
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Pid: 5575, comm: migration/1 Tainted: G W (2.6.26 #1)
EIP: 0060:[<c057e38d>] EFLAGS: 00010086 CPU: 1
EIP is at schedule+0x1ed/0x850
EAX: c0803f00 EBX: 00000001 ECX: f4d10000 EDX: 00450008
ESI: f4c91fe0 EDI: c0803f00 EBP: f4c85fa0 ESP: f4c85f3c

$ addr2line -e vmlinux -i c057e38d
kernel/sched_stats.h:164
kernel/sched_stats.h:236
kernel/sched_stats.h:242
kernel/sched.c:4191

(Keep in mind that the line numbers are for v2.6.26.)

Looks to be this line (kernel/sched.c):

sched_info_switch(prev, next);

or for kernel/sched_stats.h:164:

unsigned long long now = task_rq(t)->clock, delta = 0;

..I'm not sure how much information this is?

Most likely the task_struct just got corrupted, like for the xstate
crashes... Should I still try the patch? :-)


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2008-07-23 22:01:57

by Dmitry Adamushko

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

2008/7/23 Vegard Nossum <[email protected]>:
> On Wed, Jul 23, 2008 at 11:35 PM, Dmitry Adamushko
> <[email protected]> wrote:
>> 2008/7/23 Dmitry Adamushko <[email protected]>:
>>> 2008/7/23 Vegard Nossum <[email protected]>:
>>>> ------------[ cut here ]------------
>>>> WARNING: at kernel/sched_fair.c:815 hrtick_start_fair+0x158/0x170()
>>>
>>> that's interesting. As a first step and if it's easily reproducible,
>>> would you try something like below?
>>
>> If not, maybe object files are still in-tact? Just to see where
>> exactly in schedule() it crashed so maybe we may get an idea on what
>> was a result of next = pick_next_task(rq, prev);
>
> Yes, indeed. You mean this part?
>
> BUG: unable to handle kernel paging request at c0804358
> IP: [<c057e38d>] schedule+0x1ed/0x850
> *pde = 3780e163 *pte = 00804162
> Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> Pid: 5575, comm: migration/1 Tainted: G W (2.6.26 #1)
> EIP: 0060:[<c057e38d>] EFLAGS: 00010086 CPU: 1
> EIP is at schedule+0x1ed/0x850
> EAX: c0803f00 EBX: 00000001 ECX: f4d10000 EDX: 00450008
> ESI: f4c91fe0 EDI: c0803f00 EBP: f4c85fa0 ESP: f4c85f3c
>
> $ addr2line -e vmlinux -i c057e38d
> kernel/sched_stats.h:164
> kernel/sched_stats.h:236
> kernel/sched_stats.h:242
> kernel/sched.c:4191
>
> (Keep in mind that the line numbers are for v2.6.26.)
>
> Looks to be this line (kernel/sched.c):
>
> sched_info_switch(prev, next);
>
> or for kernel/sched_stats.h:164:
>
> unsigned long long now = task_rq(t)->clock, delta = 0;
>
> ..I'm not sure how much information this is?

combining with other information from this thread (which I have
happily ignored before jumping in :-/), it's quite a lot.

't' ('next' from schedule()) is ok in this case. The problem might be
with its thread_info structure. In this case, task_rq(t) resorts to
task_cpu(t) which, in turn, comes to task_thread_info(t)->cpu.

So I guess, 'cpu' value is slightly, well, out of reality. Check the
address of "runqueues" in your kernel image...
I guess, it should be quite close to the "fault" address... then we
can even calculate 'cpu' :-)

task's thread_info is on the stack so if it gets corrupted we may get
something like this.


>
> Most likely the task_struct just got corrupted, like for the xstate
> crashes... Should I still try the patch? :-)

No. It has the same symptoms as above, namely task_cpu() is likely
wrong. So it fits in this "perhaps task's thread_info gets corrupted"
theory.


>
> Vegard
>

--
Best regards,
Dmitry Adamushko

2008-07-23 22:05:59

by Vegard Nossum

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

On Thu, Jul 24, 2008 at 12:01 AM, Dmitry Adamushko
<[email protected]> wrote:
>> BUG: unable to handle kernel paging request at c0804358
>> IP: [<c057e38d>] schedule+0x1ed/0x850

[...]

> So I guess, 'cpu' value is slightly, well, out of reality. Check the
> address of "runqueues" in your kernel image...
> I guess, it should be quite close to the "fault" address... then we
> can even calculate 'cpu' :-)

Yup, that's right.

$ nm vmlinux | grep runqueues
c0803f00 d per_cpu__runqueues


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2008-07-23 22:42:19

by Max Krasnyansky

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

Vegard Nossum wrote:
> On Wed, Jul 23, 2008 at 10:31 PM, Suresh Siddha
> <[email protected]> wrote:
>> On Wed, Jul 23, 2008 at 01:07:04PM -0700, Vegard Nossum wrote:
>>> Hi,
>>>
>>> I just got this on c010b2f76c3032e48097a6eef291d8593d5d79a6 (-git from
>>> yesterday):
>> Do you see this in 2.6.26 aswell? I suspect it is coming from post 2.6.26
>> changes.
>
> Yep. Got this on 2.6.26 now:
>
> BUG: unable to handle kernel paging request at 00664381
> IP: [<c010b884>] free_thread_xstate+0x4/0x30
> *pde = 00000000
> Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> Pid: 3796, comm: bash Not tainted (2.6.26 #1)
> EIP: 0060:[<c010b884>] EFLAGS: 00210246 CPU: 0
> EIP is at free_thread_xstate+0x4/0x30
> EAX: 00664001 EBX: f3870000 ECX: 00000004 EDX: f4b544e8
> ESI: f4bdef28 EDI: c07feda0 EBP: f5325bd0 ESP: f5325bcc
> DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> Process bash (pid: 3796, ti=f5324000 task=f4b53fc0 task.ti=f5324000)
> Stack: f3870000 f5325bdc c010b8bd f4bddfa0 f5325be8 c0132b89 f4bddfa0 f5325bf4
> c0133fd1 f4b77e00 f5325bfc c01368a7 f5325c14 c0172b8c 00200282 c0752b40
> 00000001 00000009 f5325c30 c0139cd3 c0803d00 c0803d00 c0803d00 00200046
> Call Trace:
> [<c010b8bd>] ? free_thread_info+0xd/0x20
> [<c0132b89>] ? free_task+0x19/0x30
> [<c0133fd1>] ? __put_task_struct+0x51/0xa0
> [<c01368a7>] ? delayed_put_task_struct+0x27/0x30
> [<c0172b8c>] ? rcu_process_callbacks+0x6c/0xb0
> [<c0139cd3>] ? __do_softirq+0x83/0x100
> [<c0139df5>] ? do_softirq+0xa5/0xb0
> [<c0139f95>] ? irq_exit+0x95/0xa0
> [<c0107e4d>] ? do_IRQ+0x4d/0xa0
> [<c01057b2>] ? common_interrupt+0x2e/0x34
> [<c013549e>] ? vprintk+0x1be/0x420
> [<c010aea5>] ? native_sched_clock+0xb5/0x110
> [<c010aea5>] ? native_sched_clock+0xb5/0x110
> [<c013571b>] ? printk+0x1b/0x20
> [<c012cbec>] ? cpu_attach_domain+0x3ec/0x410
> [<c010aea5>] ? native_sched_clock+0xb5/0x110
> [<c01979e1>] ? check_bytes_and_report+0x21/0xc0
> [<c0197d8f>] ? check_object+0xdf/0x1f0
> [<c0126c17>] ? sd_free_ctl_entry+0x37/0x50
> [<c0157895>] ? mark_held_locks+0x65/0x80
> [<c0199055>] ? kfree+0xb5/0x120
> [<c0157a24>] ? trace_hardirqs_on+0xd4/0x160
> [<c0126c17>] ? sd_free_ctl_entry+0x37/0x50
> [<c0126c17>] ? sd_free_ctl_entry+0x37/0x50
> [<c0126c17>] ? sd_free_ctl_entry+0x37/0x50
> [<c012cc3e>] ? detach_destroy_domains+0x2e/0x50
> [<c012cc9b>] ? update_sched_domains+0x3b/0x50
> [<c014d467>] ? notifier_call_chain+0x37/0x70
> [<c014d4d9>] ? __raw_notifier_call_chain+0x19/0x20
> [<c055c858>] ? _cpu_down+0x78/0x240
> [<c015d92f>] ? cpu_maps_update_begin+0xf/0x20
> [<c055ca4b>] ? cpu_down+0x2b/0x40
> [<c055dc69>] ? store_online+0x39/0x80
> [<c055dc30>] ? store_online+0x0/0x80
> [<c02faf6b>] ? sysdev_store+0x2b/0x40
> [<c01dcdd2>] ? sysfs_write_file+0xa2/0x100
> [<c019eb76>] ? vfs_write+0x96/0x130
> [<c01dcd30>] ? sysfs_write_file+0x0/0x100
> [<c019f23d>] ? sys_write+0x3d/0x70
> [<c0104cdb>] ? sysenter_past_esp+0x78/0xd1
> =======================
> Code: 04 00 00 00 00 c7 04 24 00 00 04 00 e8 96 f8 08 00 a3 b4 a5 80
> c0 c9 c3 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 53 <8b>
> 90 80 03 00 00 89 c3 85 d2 74 14 a1 b4 a5 80 c0 e8 d6 e4 08
> EIP: [<c010b884>] free_thread_xstate+0x4/0x30 SS:ESP 0068:f5325bcc
> Kernel panic - not syncing: Fatal exception in interrupt
>
> I'm not sure what to make of this. It looks related to the rebuilding
> of sched domains that we saw earlier. But this reproduces on both
> v2.6.26 and latest -git (though not with that backtrace).

Based on the trace above it seems that we panic even before calling into
cpusets. (ie I do not see rebuild_sched_domains() in there). Which means
it must be something different. The problem we had before was that
cpusets where screwing up domain rebuild sequence during cpu hotplug
handling.

Max

2008-07-23 22:45:51

by Vegard Nossum

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

On Thu, Jul 24, 2008 at 12:05 AM, Vegard Nossum <[email protected]> wrote:
> On Thu, Jul 24, 2008 at 12:01 AM, Dmitry Adamushko
> <[email protected]> wrote:
>> So I guess, 'cpu' value is slightly, well, out of reality. Check the
>> address of "runqueues" in your kernel image...
>> I guess, it should be quite close to the "fault" address... then we
>> can even calculate 'cpu' :-)
>
> Yup, that's right.
>
> $ nm vmlinux | grep runqueues
> c0803f00 d per_cpu__runqueues

Hey, with this patch applied:

diff --git a/include/asm-x86/string_32.h b/include/asm-x86/string_32.h
index b49369a..7bef7ea 100644
--- a/include/asm-x86/string_32.h
+++ b/include/asm-x86/string_32.h
@@ -29,9 +29,14 @@ extern char *strchr(const char *s, int c);
#define __HAVE_ARCH_STRLEN
extern size_t strlen(const char *s);

+extern void warn_on_slowpath(const char *file, int line);
+
static __always_inline void * __memcpy(void * to, const void * from, size_t n)
{
int d0, d1, d2;
+ if (n == 0x6b)
+ warn_on_slowpath(__FILE__, __LINE__);
+
__asm__ __volatile__(
"rep ; movsl\n\t"
"movl %4,%%ecx\n\t"

I have found an important clue; it seems to be my network driver's fault:

------------[ cut here ]------------
WARNING: at include2/asm/string_32.h:38 skb_copy_and_csum_dev+0xee/0x100()
Pid: 3989, comm: bash Tainted: G W 2.6.26-dirty #3
[<c013496f>] warn_on_slowpath+0x4f/0x70
[<c0198041>] ? check_bytes_and_report+0x21/0xc0
[<c04a8544>] ? __kfree_skb+0x34/0x80
[<c0198041>] ? check_bytes_and_report+0x21/0xc0
[<c01983ef>] ? check_object+0xdf/0x1f0
[<c0198041>] ? check_bytes_and_report+0x21/0xc0
[<c04a8544>] ? __kfree_skb+0x34/0x80
[<c01983ef>] ? check_object+0xdf/0x1f0
[<c04bbafc>] ? find_skb+0x3c/0x80
[<c04a9f7e>] skb_copy_and_csum_dev+0xee/0x100
[<c03539d7>] rtl8139_start_xmit+0x57/0x130
[<c019a84b>] ? __kmalloc_track_caller+0x8b/0x120
[<c04bba6e>] netpoll_send_skb+0x14e/0x1a0
[<c04bbf54>] netpoll_send_udp+0x1e4/0x210
[<c0374b0c>] write_msg+0x8c/0xc0
[<c0135053>] __call_console_drivers+0x53/0x60
[<c01350ab>] _call_console_drivers+0x4b/0x90
[<c01351f5>] release_console_sem+0xc5/0x1f0
[<c01357fe>] vprintk+0x2ce/0x420
[<c0107e7d>] ? do_IRQ+0x4d/0xa0
[<c0104de5>] ? restore_nocheck+0x12/0x15
[<c0286ae1>] ? delay_tsc+0x61/0xb8
[<c0286b06>] ? delay_tsc+0x86/0xb8
[<c013596b>] printk+0x1b/0x20
[<c0580d5d>] native_cpu_up+0x7cd/0x880
[<c01df741>] ? internal_create_group+0xd1/0x180
[<c0580470>] ? do_fork_idle+0x0/0x20
[<c014d7c9>] ? __raw_notifier_call_chain+0x19/0x20
[<c05826f3>] _cpu_up+0x83/0x100
[<c05827b9>] cpu_up+0x49/0x70
[<c05635d8>] store_online+0x58/0x80
[<c0563580>] ? store_online+0x0/0x80
[<c02fda2b>] sysdev_store+0x2b/0x40
[<c01dd7b2>] sysfs_write_file+0xa2/0x100
[<c019f156>] vfs_write+0x96/0x130
[<c01dd710>] ? sysfs_write_file+0x0/0x100
[<c019f81d>] sys_write+0x3d/0x70
[<c0104cdb>] sysenter_past_esp+0x78/0xd1
=======================
---[ end trace a7919e7f17c0a725 ]---

In particular, these are interesting:

[<c04a9f7e>] skb_copy_and_csum_dev+0xee/0x100

This is net/core/skbuff.c:1731:
skb_copy_from_linear_data(skb, to, csstart);

[<c03539d7>] rtl8139_start_xmit+0x57/0x130

This is drivers/net/8139too.c:1711:
dev_kfree_skb(skb);

(The line numbers are still from v2.6.26, but this reproduces on
current -git as well.)

Is this enough information to fix it? :-)


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2008-07-23 22:51:10

by Vegard Nossum

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

On Thu, Jul 24, 2008 at 12:45 AM, Vegard Nossum <[email protected]> wrote:
> Hey, with this patch applied:
>
> diff --git a/include/asm-x86/string_32.h b/include/asm-x86/string_32.h
> index b49369a..7bef7ea 100644
> --- a/include/asm-x86/string_32.h
> +++ b/include/asm-x86/string_32.h
> @@ -29,9 +29,14 @@ extern char *strchr(const char *s, int c);
> #define __HAVE_ARCH_STRLEN
> extern size_t strlen(const char *s);
>
> +extern void warn_on_slowpath(const char *file, int line);
> +
> static __always_inline void * __memcpy(void * to, const void * from, size_t n)
> {
> int d0, d1, d2;
> + if (n == 0x6b)
> + warn_on_slowpath(__FILE__, __LINE__);
> +
> __asm__ __volatile__(
> "rep ; movsl\n\t"
> "movl %4,%%ecx\n\t"
>
> I have found an important clue; it seems to be my network driver's fault:
>
> ------------[ cut here ]------------
> WARNING: at include2/asm/string_32.h:38 skb_copy_and_csum_dev+0xee/0x100()
> Pid: 3989, comm: bash Tainted: G W 2.6.26-dirty #3
> [<c013496f>] warn_on_slowpath+0x4f/0x70
> [<c0198041>] ? check_bytes_and_report+0x21/0xc0
> [<c04a8544>] ? __kfree_skb+0x34/0x80
> [<c0198041>] ? check_bytes_and_report+0x21/0xc0
> [<c01983ef>] ? check_object+0xdf/0x1f0
> [<c0198041>] ? check_bytes_and_report+0x21/0xc0
> [<c04a8544>] ? __kfree_skb+0x34/0x80
> [<c01983ef>] ? check_object+0xdf/0x1f0
> [<c04bbafc>] ? find_skb+0x3c/0x80
> [<c04a9f7e>] skb_copy_and_csum_dev+0xee/0x100
> [<c03539d7>] rtl8139_start_xmit+0x57/0x130
> [<c019a84b>] ? __kmalloc_track_caller+0x8b/0x120
> [<c04bba6e>] netpoll_send_skb+0x14e/0x1a0
> [<c04bbf54>] netpoll_send_udp+0x1e4/0x210
> [<c0374b0c>] write_msg+0x8c/0xc0
> [<c0135053>] __call_console_drivers+0x53/0x60
> [<c01350ab>] _call_console_drivers+0x4b/0x90
> [<c01351f5>] release_console_sem+0xc5/0x1f0
> [<c01357fe>] vprintk+0x2ce/0x420
> [<c0107e7d>] ? do_IRQ+0x4d/0xa0
> [<c0104de5>] ? restore_nocheck+0x12/0x15
> [<c0286ae1>] ? delay_tsc+0x61/0xb8
> [<c0286b06>] ? delay_tsc+0x86/0xb8
> [<c013596b>] printk+0x1b/0x20
> [<c0580d5d>] native_cpu_up+0x7cd/0x880
> [<c01df741>] ? internal_create_group+0xd1/0x180
> [<c0580470>] ? do_fork_idle+0x0/0x20
> [<c014d7c9>] ? __raw_notifier_call_chain+0x19/0x20
> [<c05826f3>] _cpu_up+0x83/0x100
> [<c05827b9>] cpu_up+0x49/0x70
> [<c05635d8>] store_online+0x58/0x80
> [<c0563580>] ? store_online+0x0/0x80
> [<c02fda2b>] sysdev_store+0x2b/0x40
> [<c01dd7b2>] sysfs_write_file+0xa2/0x100
> [<c019f156>] vfs_write+0x96/0x130
> [<c01dd710>] ? sysfs_write_file+0x0/0x100
> [<c019f81d>] sys_write+0x3d/0x70
> [<c0104cdb>] sysenter_past_esp+0x78/0xd1
> =======================
> ---[ end trace a7919e7f17c0a725 ]---
>
> In particular, these are interesting:
>
> [<c04a9f7e>] skb_copy_and_csum_dev+0xee/0x100
>
> This is net/core/skbuff.c:1731:
> skb_copy_from_linear_data(skb, to, csstart);
>
> [<c03539d7>] rtl8139_start_xmit+0x57/0x130
>
> This is drivers/net/8139too.c:1711:
> dev_kfree_skb(skb);
>

Oops, this should of course be the line just above (because the
address on the stack is the return address...), which is:

skb_copy_and_csum_dev(skb, tp->tx_buf[entry]);

(Big surprise there ;-))

> (The line numbers are still from v2.6.26, but this reproduces on
> current -git as well.)
>
> Is this enough information to fix it? :-)

I've also added Jeff Garzik to Cc since he seems to be the maintainer
of this driver.


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2008-07-23 23:20:58

by Vegard Nossum

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

On Thu, Jul 24, 2008 at 12:50 AM, Vegard Nossum <[email protected]> wrote:
> On Thu, Jul 24, 2008 at 12:45 AM, Vegard Nossum <[email protected]> wrote:
>> Hey, with this patch applied:
>>
>> diff --git a/include/asm-x86/string_32.h b/include/asm-x86/string_32.h
>> index b49369a..7bef7ea 100644
>> --- a/include/asm-x86/string_32.h
>> +++ b/include/asm-x86/string_32.h
>> @@ -29,9 +29,14 @@ extern char *strchr(const char *s, int c);
>> #define __HAVE_ARCH_STRLEN
>> extern size_t strlen(const char *s);
>>
>> +extern void warn_on_slowpath(const char *file, int line);
>> +
>> static __always_inline void * __memcpy(void * to, const void * from, size_t n)
>> {
>> int d0, d1, d2;
>> + if (n == 0x6b)
>> + warn_on_slowpath(__FILE__, __LINE__);
>> +
>> __asm__ __volatile__(
>> "rep ; movsl\n\t"
>> "movl %4,%%ecx\n\t"
>>
>> I have found an important clue; it seems to be my network driver's fault:
>>
>> ------------[ cut here ]------------
>> WARNING: at include2/asm/string_32.h:38 skb_copy_and_csum_dev+0xee/0x100()
>> Pid: 3989, comm: bash Tainted: G W 2.6.26-dirty #3
>> [<c013496f>] warn_on_slowpath+0x4f/0x70
>> [<c0198041>] ? check_bytes_and_report+0x21/0xc0
>> [<c04a8544>] ? __kfree_skb+0x34/0x80
>> [<c0198041>] ? check_bytes_and_report+0x21/0xc0
>> [<c01983ef>] ? check_object+0xdf/0x1f0
>> [<c0198041>] ? check_bytes_and_report+0x21/0xc0
>> [<c04a8544>] ? __kfree_skb+0x34/0x80
>> [<c01983ef>] ? check_object+0xdf/0x1f0
>> [<c04bbafc>] ? find_skb+0x3c/0x80
>> [<c04a9f7e>] skb_copy_and_csum_dev+0xee/0x100
>> [<c03539d7>] rtl8139_start_xmit+0x57/0x130
>> [<c019a84b>] ? __kmalloc_track_caller+0x8b/0x120
>> [<c04bba6e>] netpoll_send_skb+0x14e/0x1a0
>> [<c04bbf54>] netpoll_send_udp+0x1e4/0x210
>> [<c0374b0c>] write_msg+0x8c/0xc0
>> [<c0135053>] __call_console_drivers+0x53/0x60
>> [<c01350ab>] _call_console_drivers+0x4b/0x90
>> [<c01351f5>] release_console_sem+0xc5/0x1f0
>> [<c01357fe>] vprintk+0x2ce/0x420
>> [<c0107e7d>] ? do_IRQ+0x4d/0xa0
>> [<c0104de5>] ? restore_nocheck+0x12/0x15
>> [<c0286ae1>] ? delay_tsc+0x61/0xb8
>> [<c0286b06>] ? delay_tsc+0x86/0xb8
>> [<c013596b>] printk+0x1b/0x20
>> [<c0580d5d>] native_cpu_up+0x7cd/0x880
>> [<c01df741>] ? internal_create_group+0xd1/0x180
>> [<c0580470>] ? do_fork_idle+0x0/0x20
>> [<c014d7c9>] ? __raw_notifier_call_chain+0x19/0x20
>> [<c05826f3>] _cpu_up+0x83/0x100
>> [<c05827b9>] cpu_up+0x49/0x70
>> [<c05635d8>] store_online+0x58/0x80
>> [<c0563580>] ? store_online+0x0/0x80
>> [<c02fda2b>] sysdev_store+0x2b/0x40
>> [<c01dd7b2>] sysfs_write_file+0xa2/0x100
>> [<c019f156>] vfs_write+0x96/0x130
>> [<c01dd710>] ? sysfs_write_file+0x0/0x100
>> [<c019f81d>] sys_write+0x3d/0x70
>> [<c0104cdb>] sysenter_past_esp+0x78/0xd1
>> =======================
>> ---[ end trace a7919e7f17c0a725 ]---
>>
>> In particular, these are interesting:
>>
>> [<c04a9f7e>] skb_copy_and_csum_dev+0xee/0x100
>>
>> This is net/core/skbuff.c:1731:
>> skb_copy_from_linear_data(skb, to, csstart);
>>
>> [<c03539d7>] rtl8139_start_xmit+0x57/0x130
>>
>> This is drivers/net/8139too.c:1711:
>> dev_kfree_skb(skb);
>>
>
> Oops, this should of course be the line just above (because the
> address on the stack is the return address...), which is:
>
> skb_copy_and_csum_dev(skb, tp->tx_buf[entry]);
>
> (Big surprise there ;-))
>
>> (The line numbers are still from v2.6.26, but this reproduces on
>> current -git as well.)
>>
>> Is this enough information to fix it? :-)
>
> I've also added Jeff Garzik to Cc since he seems to be the maintainer
> of this driver.

Hm. I'm not sure it's the driver's fault after all.

Look at the skb_copy_and_csum_dev() line again:

skb_copy_from_linear_data(skb, to, csstart);

And csstart was probably loaded in this line:

csstart = skb_headlen(skb);

Which makes sense if "skb" was freed (that's the case where "csstart"
would be 0x6b). Hm, looking at skb_headlen():

static inline unsigned int skb_headlen(const struct sk_buff *skb)
{
return skb->len - skb->data_len;
}

It seems difficult for this to return 0x6b unless skb->data_len has
been set to 0 after it was freed.

In either case, rtl_8139_start_xmit() is only passing on the skb it
got from netpoll_send_skb(). The call is from net/core/netpoll.c:290:

status = dev->hard_start_xmit(skb, dev);

Looks like the skb is passed into this as well... netpoll_send_skb(), line 370:

netpoll_send_skb(np, skb);

So finally, this function is doing lots of stuff with skbs which I
have no idea what is. Seems like this one is getting an already freed
skbuff. Somehow. Or maybe it's freed while it's handling it.

Hm, seems to be no recent changes in this area. Maybe I'm on the
completely wrong track. I'll add a couple of Cc in either case.


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2008-07-24 00:02:38

by Vegard Nossum

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

On Thu, Jul 24, 2008 at 1:53 AM, Matt Mackall <[email protected]> wrote:
>
> On Thu, 2008-07-24 at 01:20 +0200, Vegard Nossum wrote:
>> >> +extern void warn_on_slowpath(const char *file, int line);
>
> Mmm, fun.
>
>> >> [<c013496f>] warn_on_slowpath+0x4f/0x70
>> >> [<c0198041>] ? check_bytes_and_report+0x21/0xc0
>> >> [<c04a8544>] ? __kfree_skb+0x34/0x80
>> >> [<c0198041>] ? check_bytes_and_report+0x21/0xc0
>> >> [<c01983ef>] ? check_object+0xdf/0x1f0
>> >> [<c0198041>] ? check_bytes_and_report+0x21/0xc0
>> >> [<c04a8544>] ? __kfree_skb+0x34/0x80
>> >> [<c01983ef>] ? check_object+0xdf/0x1f0
>> >> [<c04bbafc>] ? find_skb+0x3c/0x80
>> >> [<c04a9f7e>] skb_copy_and_csum_dev+0xee/0x100
>> >> [<c03539d7>] rtl8139_start_xmit+0x57/0x130
>> >> [<c019a84b>] ? __kmalloc_track_caller+0x8b/0x120
>> >> [<c04bba6e>] netpoll_send_skb+0x14e/0x1a0
>> >> [<c04bbf54>] netpoll_send_udp+0x1e4/0x210
>
> Trying to warn_on from inside a network driver from inside netpoll may
> cause trouble.

Yeah.. but apparently it was fine? I'll try to disable netconsole and
see if I can still get the crash.

BTW, I posted this earlier (but without all the Ccs, and without my
debugging patch):

=============================================================================
BUG task_struct: Poison overwritten
-----------------------------------------------------------------------------
INFO: 0xf3d00000-0xf3d0006b. First byte 0x1 instead of 0x6b
INFO: Allocated in copy_process+0x68/0x1130 age=4 cpu=0 pid=4338
INFO: Freed in free_task+0x2c/0x30 age=2 cpu=0 pid=4
INFO: Slab 0xc1c25c00 objects=8 used=3 fp=0xf3d00000 flags=0x400020c3
INFO: Object 0xf3d00000 @offset=0 fp=0xf3d03fc0
Object 0xf3d00000: 01 40 66 00 00 16 ec ee ad b9 00 1c 26 8a 70 f8
Object 0xf3d00010: 08 00 45 00 00 54 00 00 40 00 40 01 b7 e8 c0 a8
Object 0xf3d00020: 00 c4 c0 a8 00 ac 08 00 6e c0 df 24 55 33 75 af

And I just realized that c0 a8 00 c4 is my IP address, 192.168.0.196 :-)

So _something_ seems to be up with networking :-)

(And this is in task_struct! Wow...)


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2008-07-24 00:08:10

by Matt Mackall

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate


On Thu, 2008-07-24 at 01:20 +0200, Vegard Nossum wrote:
> >> +extern void warn_on_slowpath(const char *file, int line);

Mmm, fun.

> >> [<c013496f>] warn_on_slowpath+0x4f/0x70
> >> [<c0198041>] ? check_bytes_and_report+0x21/0xc0
> >> [<c04a8544>] ? __kfree_skb+0x34/0x80
> >> [<c0198041>] ? check_bytes_and_report+0x21/0xc0
> >> [<c01983ef>] ? check_object+0xdf/0x1f0
> >> [<c0198041>] ? check_bytes_and_report+0x21/0xc0
> >> [<c04a8544>] ? __kfree_skb+0x34/0x80
> >> [<c01983ef>] ? check_object+0xdf/0x1f0
> >> [<c04bbafc>] ? find_skb+0x3c/0x80
> >> [<c04a9f7e>] skb_copy_and_csum_dev+0xee/0x100
> >> [<c03539d7>] rtl8139_start_xmit+0x57/0x130
> >> [<c019a84b>] ? __kmalloc_track_caller+0x8b/0x120
> >> [<c04bba6e>] netpoll_send_skb+0x14e/0x1a0
> >> [<c04bbf54>] netpoll_send_udp+0x1e4/0x210

Trying to warn_on from inside a network driver from inside netpoll may
cause trouble.

--
Mathematics is the supreme nostalgia of our time.

2008-08-01 21:17:27

by Paul E. McKenney

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

On Wed, Jul 23, 2008 at 01:31:09PM -0700, Suresh Siddha wrote:
> On Wed, Jul 23, 2008 at 01:07:04PM -0700, Vegard Nossum wrote:
> > Hi,
> >
> > I just got this on c010b2f76c3032e48097a6eef291d8593d5d79a6 (-git from
> > yesterday):
>
> Do you see this in 2.6.26 aswell? I suspect it is coming from post 2.6.26
> changes.
>
> >
> > BUG: unable to handle kernel paging request at 00664381
> > IP: [<c010b274>] free_thread_xstate+0x4/0x30
> ...
>
> > EIP is at arch/x86/kernel/process.c:36:
> >
> > if (tsk->thread.xstate) {
> >
>
> It looks like the kernel stack of that process got corrupted, corrupting the
> task pointer in thread_info. Can you send us your config file?

I would also like to see the config file.

Thanx, Paul

2008-08-08 18:46:32

by Vegard Nossum

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

On Fri, Aug 1, 2008 at 11:10 PM, Paul E. McKenney
<[email protected]> wrote:
> On Wed, Jul 23, 2008 at 01:31:09PM -0700, Suresh Siddha wrote:
>> On Wed, Jul 23, 2008 at 01:07:04PM -0700, Vegard Nossum wrote:
>> > Hi,
>> >
>> > I just got this on c010b2f76c3032e48097a6eef291d8593d5d79a6 (-git from
>> > yesterday):
>>
>> Do you see this in 2.6.26 aswell? I suspect it is coming from post 2.6.26
>> changes.
>>
>> >
>> > BUG: unable to handle kernel paging request at 00664381
>> > IP: [<c010b274>] free_thread_xstate+0x4/0x30
>> ...
>>
>> > EIP is at arch/x86/kernel/process.c:36:
>> >
>> > if (tsk->thread.xstate) {
>> >
>>
>> It looks like the kernel stack of that process got corrupted, corrupting the
>> task pointer in thread_info. Can you send us your config file?
>
> I would also like to see the config file.

Hi,

I'm sorry for the late reply.

I copied you because I saw some RCU entry in the stack trace, but it
is almost definitely not a problem with (core or "leaf") RCU code.
Sometimes it also happens that people will say "oh, I recognize this
problem, the patch has been posted here and here", etc.

It seems to be a problem with either netpoll, netconsole, or the
8139too driver. I find a UDP packet in the task_struct slab, and the
stacktrace with RCU entries come from unrelated, unfortunate callbacks
that stumbled upon the corruption.

My config, if you are still interested, can be found here:
http://userweb.kernel.org/~vegard/bugs/20080724-fork/config

I don't know if the problem persists with the latest -git, it is now a
while since I last tested, but I've checked kernels back to 2.6.20, so
the problem has existed for a long time.


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2008-08-08 20:41:01

by Paul E. McKenney

[permalink] [raw]
Subject: Re: recent -git: BUG in free_thread_xstate

On Fri, Aug 08, 2008 at 08:46:21PM +0200, Vegard Nossum wrote:
> On Fri, Aug 1, 2008 at 11:10 PM, Paul E. McKenney
> <[email protected]> wrote:
> > On Wed, Jul 23, 2008 at 01:31:09PM -0700, Suresh Siddha wrote:
> >> On Wed, Jul 23, 2008 at 01:07:04PM -0700, Vegard Nossum wrote:
> >> > Hi,
> >> >
> >> > I just got this on c010b2f76c3032e48097a6eef291d8593d5d79a6 (-git from
> >> > yesterday):
> >>
> >> Do you see this in 2.6.26 aswell? I suspect it is coming from post 2.6.26
> >> changes.
> >>
> >> >
> >> > BUG: unable to handle kernel paging request at 00664381
> >> > IP: [<c010b274>] free_thread_xstate+0x4/0x30
> >> ...
> >>
> >> > EIP is at arch/x86/kernel/process.c:36:
> >> >
> >> > if (tsk->thread.xstate) {
> >> >
> >>
> >> It looks like the kernel stack of that process got corrupted, corrupting the
> >> task pointer in thread_info. Can you send us your config file?
> >
> > I would also like to see the config file.
>
> Hi,
>
> I'm sorry for the late reply.
>
> I copied you because I saw some RCU entry in the stack trace, but it
> is almost definitely not a problem with (core or "leaf") RCU code.
> Sometimes it also happens that people will say "oh, I recognize this
> problem, the patch has been posted here and here", etc.
>
> It seems to be a problem with either netpoll, netconsole, or the
> 8139too driver. I find a UDP packet in the task_struct slab, and the
> stacktrace with RCU entries come from unrelated, unfortunate callbacks
> that stumbled upon the corruption.
>
> My config, if you are still interested, can be found here:
> http://userweb.kernel.org/~vegard/bugs/20080724-fork/config
>
> I don't know if the problem persists with the latest -git, it is now a
> while since I last tested, but I've checked kernels back to 2.6.20, so
> the problem has existed for a long time.

Well, the config shows preemptable RCU, which was my concern at the time,
but there was certainly no preemptable RCU in mainline in 2.6.20, so...

There -was- a bug in 2.6.26 release candidates that would cause RCU
to fail badly on !HOTPLUG_CPU builds due to a failure to initialize,
but that is fixed in 2.6.26 (thank you, Nick!!!).

Thanx, Paul