2006-08-02 05:37:16

by Robert Crocombe

Subject: Problems with 2.6.17-rt8

Trying again. Hopefully gmail will send as plain/text this time.
Ingo CC'd on the version with the text/html subpart (sorry).

I've just stuck up the relevant stuff on my little webserver at:

http://66.93.162.103/~rcrocomb/lkml/

The files are:

config_2.6.17-rt8_local_00
config_2.6.17-rt8_local_01
dmesg_2.6.17-rt8_local_00_00
linux_version_info
output_from_lspci
scooter_log

which should largely be self-explanatory, I think. The local_00
kernel does not have much in the way of debugging info turned on,
because I was hoping for peak performance. local_01 has a bunch more
debug widgets enabled:

+CONFIG_DEBUG_PREEMPT=y
+CONFIG_PREEMPT_TRACE=y
+CONFIG_DEBUG_INFO=y
+CONFIG_FRAME_POINTER=y
+CONFIG_UNWIND_INFO=y
+CONFIG_FORCED_INLINING=y
+CONFIG_IOMMU_DEBUG=y

in the hope of providing better/more info. The machine is 'scooter',
and scooter_log is the dump from 'cu' on another machine. I decided
not to trim anything, since I wasn't sure what might/might not be
relevant. Note that at first we were running with the nvidia kernel
module, but removed that later (scooter_log6 line 7142 or thereabouts)
so as to test with an untainted kernel. I wasn't sure if the problems
w/ the module would be of interest or not.

I can pretty much guarantee problems with a kernel compile or two in parallel.

The machine is built around the new IWill H8502 motherboard. It's a
4-way 2.8GHz Opteron machine with 8GB of RAM and two 150GB SCSI disks
in a RAID5 setup. The person doing development on this machine says
the problems are basically constant. I am using a very similar
machine, but with dual video cards and 32GB of RAM, and haven't
experienced any problems. That machine, too, is using the nvidia
module. Oh, the underlying distribution is Fedora Core 5.

I'll be glad to provide other info/run any patches/etc.

Thanks.

Oh, I'm not lkml subscribed, so please CC.

--
Robert Crocombe
[email protected]


2006-08-02 17:51:30

by Steven Rostedt

Subject: Re: Problems with 2.6.17-rt8

On Tue, 2006-08-01 at 22:37 -0700, Robert Crocombe wrote:
> Trying again. Hopefully gmail will send as plain/text this time.
> Ingo CC'd on the version with the text/html subpart (sorry).
>
> I've just stuck up the relevant stuff on my little webserver at:
>
> http://66.93.162.103/~rcrocomb/lkml/
>
> The files are:
>
> config_2.6.17-rt8_local_00
> config_2.6.17-rt8_local_01
> dmesg_2.6.17-rt8_local_00_00
> linux_version_info
> output_from_lspci
> scooter_log
>
> which should largely be self-explanatory, I think. The local_00
> kernel does not have much in the way of debugging info turned on,
> because I was hoping for peak performance. local_01 has a bunch more
> debug widgets enabled:
>
> +CONFIG_DEBUG_PREEMPT=y
> +CONFIG_PREEMPT_TRACE=y
> +CONFIG_DEBUG_INFO=y
> +CONFIG_FRAME_POINTER=y
> +CONFIG_UNWIND_INFO=y
> +CONFIG_FORCED_INLINING=y
> +CONFIG_IOMMU_DEBUG=y
>
> in the hope of providing better/more info. The machine is 'scooter',
> and scooter_log is the dump from 'cu' on another machine. I decided
> not to trim anything, since I wasn't sure what might/might not be
> relevant. Note that at first we were running with the nvidia kernel
> module, but removed that later (scooter_log6 line 7142 or thereabouts)
> so as to test with an untainted kernel. I wasn't sure if the problems
> w/ the module would be of interest or not.
>
> I can pretty much guarantee problems with a kernel compile or two in parallel.
>
> The machine is built around the new IWill H8502 motherboard. It's a
> 4-way 2.8GHz Opteron machine with 8GB of RAM and two 150GB SCSI disks
> in a RAID5 setup. The person doing development on this machine says
> the problems are basically constant. I am using a very similar
> machine, but with dual video cards and 32GB of RAM, and haven't
> experienced any problems. That machine, too, is using the nvidia
> module. Oh, the underlying distribution is Fedora Core 5.
>

I'm confused.

You mention problems but I don't see you listing what exactly the
problems are. Just saying "the problems exist" doesn't tell us
anything.

Don't assume that we will go to some web site to figure out what you're
talking about. Please list the problems you are facing.

I only looked at the dmesg and saw this:

BUG: scheduling while atomic: udev_run_devd/0x00000001/1568

Call Trace:
<ffffffff8045c693>{__schedule+155}
<ffffffff8045f156>{_raw_spin_unlock_irqrestore+53}
<ffffffff80242241>{task_blocks_on_rt_mutex+518}
<ffffffff80252da0>{free_pages_bulk+39}
<ffffffff80252da0>{free_pages_bulk+39}
<ffffffff8045d3bd>{schedule+236}
<ffffffff8045e1fe>{rt_lock_slowlock+327}
<ffffffff80252da0>{free_pages_bulk+39}
<ffffffff802531f0>{__free_pages_ok+408}
<ffffffff80226665>{free_task+18}
<ffffffff8045d113>{thread_return+200}
<ffffffff8026cdba>{kfree+56}
<ffffffff8026cdba>{kfree+56}
<ffffffff8045d9a8>{preempt_schedule_irq+82}
<ffffffff80209eb4>{retint_kernel+38}
<ffffffff8026cdba>{kfree+56}
<ffffffff8045f09c>{_raw_spin_unlock+57}
<ffffffff8026ce3f>{kfree+189}
<ffffffff8028f642>{seq_release+24}
<ffffffff80272f12>{__fput+161}
<ffffffff80270664>{filp_close+89}
<ffffffff80271b63>{sys_close+143}
<ffffffff80209882>{system_call+126}
nvidia: module license 'NVIDIA' taints kernel.

Is this the problem you are talking about? Bill Huey is working on this
same thing at this very moment.

-- Steve


> I'll be glad to provide other info/run any patches/etc.
>
> Thanks.
>
> Oh, I'm not lkml subscribed, so please CC.


2006-08-03 11:48:48

by Robert Crocombe

Subject: Re: Problems with 2.6.17-rt8

On 8/2/06, Steven Rostedt <[email protected]> wrote:
> You mention problems but I don't see you listing what exactly the
> problems are. Just saying "the problems exist" doesn't tell us
> anything.
>
> Don't assume that we will go to some web site to figure out what you're
> talking about. Please list the problems you are facing.

The machine dies (no alt-sysrq, no keyboard LEDs of any kind: dead in
the water). I thought the log would provide more useful information
without potentially erroneous editorialization by myself. Here are
some highlights:

kjournald/1105[CPU#3]: BUG in debug_rt_mutex_unlock at kernel/rtmutex-debug.c:471

Call Trace:
<ffffffff8047655a>{_raw_spin_lock_irqsave+24}
<ffffffff8022b272>{__WARN_ON+100}
<ffffffff802457e4>{debug_rt_mutex_unlock+199}
<ffffffff804757b7>{rt_lock_slowunlock+25}
<ffffffff80476301>{__lock_text_start+9}
<ffffffff80271e93>{kmem_cache_alloc+202}
<ffffffff8025493b>{mempool_alloc_slab+17}
<ffffffff80254d07>{mempool_alloc+75}
<ffffffff802f2f8c>{generic_make_request+375}
<ffffffff8027b914>{bio_alloc_bioset+35}
<ffffffff8027ba2a>{bio_alloc+16}
<ffffffff802781d1>{submit_bh+137}
<ffffffff80279377>{ll_rw_block+122}
<ffffffff8027939e>{ll_rw_block+161}
<ffffffff802c85dc>{journal_commit_transaction+1011}
<ffffffff80476a5f>{_raw_spin_unlock_irqrestore+56}
<ffffffff804769ac>{_raw_spin_unlock+46}
<ffffffff804757df>{rt_lock_slowunlock+65}
<ffffffff80476301>{__lock_text_start+9}
<ffffffff802339b0>{try_to_del_timer_sync+85}
<ffffffff802cca63>{kjournald+202}
<ffffffff8023db60>{autoremove_wake_function+0}
<ffffffff802cc999>{kjournald+0}
<ffffffff8023d739>{keventd_create_kthread+0}
<ffffffff8023da2f>{kthread+219}
<ffffffff80225a23>{schedule_tail+188}
<ffffffff8020aaca>{child_rip+8}
<ffffffff8023d739>{keventd_create_kthread+0}
<ffffffff8023d954>{kthread+0}
<ffffffff8020aac2>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff80476499>] .... _raw_spin_lock+0x16/0x23
.....[<ffffffff804757af>] .. ( <= rt_lock_slowunlock+0x11/0x6b)
.. [<ffffffff8047655a>] .... _raw_spin_lock_irqsave+0x18/0x29
.....[<ffffffff8022b22d>] .. ( <= __WARN_ON+0x1f/0x82)


Somewhat later:

Kernel BUG at kernel/rtmutex.c:639
invalid opcode: 0000 [1] PREEMPT SMP
CPU 3
Modules linked in: nfsd exportfs lockd sunrpc tg3
Pid: 1105, comm: kjournald Not tainted 2.6.17-rt8_local_01 #1
RIP: 0010:[<ffffffff80475926>] <ffffffff80475926>{rt_lock_slowlock+181}
RSP: 0000:ffff810076c6db38 EFLAGS: 00010246
RAX: ffff810275696340 RBX: 0000000000000010 RCX: 0000000000000000
RDX: ffff810275696340 RSI: ffffffff80271e18 RDI: ffff8101800b9a60
RBP: ffff810076c6dbf8 R08: ffff810275696528 R09: ffff810076c6db38
R10: ffff810008003f38 R11: ffff8100cbf4de98 R12: ffff8101800b9a60
R13: ffff8100cbf99c40 R14: ffffffff80271e18 R15: 0000000000000010
FS: 00002b2838961770(0000) GS:ffff81018020b340(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000003d574460a0 CR3: 000000026e069000 CR4: 00000000000006e0
Process kjournald (pid: 1105, threadinfo ffff810076c6c000, task ffff810275696340)
Stack: 111111110000008c ffff810076c6db40 ffff810076c6db40 ffff810076c6db50
ffff810076c6db50 0000000000000000 111111110000008c ffff810076c6db70
ffff810076c6db70 ffff810076c6db80
Call Trace:
<ffffffff80476310>{rt_lock+13}
<ffffffff80271e18>{kmem_cache_alloc+79}
<ffffffff8025493b>{mempool_alloc_slab+17}
<ffffffff80254d07>{mempool_alloc+75}
<ffffffff802f2f8c>{generic_make_request+375}
<ffffffff8027b983>{bio_alloc_bioset+146}
<ffffffff8027ba2a>{bio_alloc+16}
<ffffffff802781d1>{submit_bh+137}
<ffffffff80279377>{ll_rw_block+122}
<ffffffff8027939e>{ll_rw_block+161}
<ffffffff802c85dc>{journal_commit_transaction+1011}
<ffffffff80476a5f>{_raw_spin_unlock_irqrestore+56}
<ffffffff804769ac>{_raw_spin_unlock+46}
<ffffffff804757df>{rt_lock_slowunlock+65}
<ffffffff80476301>{__lock_text_start+9}
<ffffffff802339b0>{try_to_del_timer_sync+85}
<ffffffff802cca63>{kjournald+202}
<ffffffff8023db60>{autoremove_wake_function+0}
<ffffffff802cc999>{kjournald+0}
<ffffffff8023d739>{keventd_create_kthread+0}
<ffffffff8023da2f>{kthread+219}
<ffffffff80225a23>{schedule_tail+188}
<ffffffff8020aaca>{child_rip+8}
<ffffffff8023d739>{keventd_create_kthread+0}
<ffffffff8023d954>{kthread+0}
<ffffffff8020aac2>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff80476499>] .... _raw_spin_lock+0x16/0x23
.....[<ffffffff804758a7>] .. ( <= rt_lock_slowlock+0x36/0x20e)
.. [<ffffffff804767b3>] .... _raw_spin_trylock+0x16/0x5a


and somewhat later:

<3>BUG: sleeping function called from invalid context kjournald(1105) at kernel/rtmutex.c:1030
in_atomic():1 [00000001], irqs_disabled():0

Call Trace:
<ffffffff80221cd1>{__might_sleep+271}
<ffffffff80475d8a>{rt_mutex_lock+29}
<ffffffff80245a44>{rt_down_read+71}
<ffffffff802381c5>{blocking_notifier_call_chain+27}
<ffffffff8022b950>{profile_task_exit+21}
<ffffffff8022d298>{do_exit+37}
<ffffffff80476a5f>{_raw_spin_unlock_irqrestore+56}
<ffffffff8020b3a5>{kernel_math_error+0}
<ffffffff804773b1>{do_trap+223}
<ffffffff80271e18>{kmem_cache_alloc+79}
<ffffffff8020b9bc>{do_invalid_op+167}
<ffffffff80475926>{rt_lock_slowlock+181}
<ffffffff8022acee>{printk+103}
<ffffffff8020a911>{error_exit+0}
<ffffffff80271e18>{kmem_cache_alloc+79}
<ffffffff80271e18>{kmem_cache_alloc+79}
<ffffffff80475926>{rt_lock_slowlock+181}
<ffffffff804758fe>{rt_lock_slowlock+141}
<ffffffff80476310>{rt_lock+13}
<ffffffff80271e18>{kmem_cache_alloc+79}
<ffffffff8025493b>{mempool_alloc_slab+17}
<ffffffff80254d07>{mempool_alloc+75}
<ffffffff802f2f8c>{generic_make_request+375}
<ffffffff8027b983>{bio_alloc_bioset+146}
<ffffffff8027ba2a>{bio_alloc+16}
<ffffffff802781d1>{submit_bh+137}
<ffffffff80279377>{ll_rw_block+122}
<ffffffff8027939e>{ll_rw_block+161}
<ffffffff802c85dc>{journal_commit_transaction+1011}
<ffffffff80476a5f>{_raw_spin_unlock_irqrestore+56}
<ffffffff804769ac>{_raw_spin_unlock+46}
<ffffffff804757df>{rt_lock_slowunlock+65}
<ffffffff80476301>{__lock_text_start+9}
<ffffffff802339b0>{try_to_del_timer_sync+85}
<ffffffff802cca63>{kjournald+202}
<ffffffff8023db60>{autoremove_wake_function+0}
<ffffffff802cc999>{kjournald+0}
<ffffffff8023d739>{keventd_create_kthread+0}
<ffffffff8023da2f>{kthread+219}
<ffffffff80225a23>{schedule_tail+188}
<ffffffff8020aaca>{child_rip+8}
<ffffffff8023d739>{keventd_create_kthread+0}
<ffffffff8023d954>{kthread+0}
<ffffffff8020aac2>{child_rip+0}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<ffffffff80476499>] .... _raw_spin_lock+0x16/0x23
.....[<ffffffff804758a7>] .. ( <= rt_lock_slowlock+0x36/0x20e)

or there's this:

BUG: soft lockup detected on CPU#3!

Call Trace: <IRQ>
<ffffffff8024e8ec>{softlockup_tick+212}
<ffffffff80271008>{kmem_cache_free+64}
<ffffffff80233c95>{run_local_timers+19}
<ffffffff80233daa>{update_process_times+76}
<ffffffff8021426f>{smp_local_timer_interrupt+43}
<ffffffff8021469f>{smp_apic_timer_interrupt+64}
<ffffffff8020a774>{apic_timer_interrupt+132} <EOI>
<ffffffff80271008>{kmem_cache_free+64}
<ffffffff80476af6>{.text.lock.spinlock+12}
<ffffffff80476499>{_raw_spin_lock+22}
<ffffffff804758a7>{rt_lock_slowlock+54}
<ffffffff8024e196>{add_preempt_count+36}
<ffffffff80476310>{rt_lock+13}
<ffffffff80271008>{kmem_cache_free+64}
<ffffffff80226f82>{__cleanup_sighand+32}
<ffffffff8022c563>{release_task+665}
<ffffffff8022db04>{do_exit+2193}
<ffffffff8020b3a5>{kernel_math_error+0}
<ffffffff804773b1>{do_trap+223}
<ffffffff80271e18>{kmem_cache_alloc+79}
<ffffffff8020b9bc>{do_invalid_op+167}
<ffffffff80475926>{rt_lock_slowlock+181}
<ffffffff8022acee>{printk+103}
<ffffffff8020a911>{error_exit+0}
<ffffffff80271e18>{kmem_cache_alloc+79}
<ffffffff80271e18>{kmem_cache_alloc+79}
<ffffffff80475926>{rt_lock_slowlock+181}
<ffffffff804758fe>{rt_lock_slowlock+141}
<ffffffff80476310>{rt_lock+13}
<ffffffff80271e18>{kmem_cache_alloc+79}
<ffffffff8025493b>{mempool_alloc_slab+17}
<ffffffff80254d07>{mempool_alloc+75}
<ffffffff802f2f8c>{generic_make_request+375}
<ffffffff8027b983>{bio_alloc_bioset+146}
<ffffffff8027ba2a>{bio_alloc+16}
<ffffffff802781d1>{submit_bh+137}
<ffffffff80279377>{ll_rw_block+122}
<ffffffff8027939e>{ll_rw_block+161}
<ffffffff802c85dc>{journal_commit_transaction+1011}
<ffffffff80476a5f>{_raw_spin_unlock_irqrestore+56}
<ffffffff804769ac>{_raw_spin_unlock+46}
<ffffffff804757df>{rt_lock_slowunlock+65}
<ffffffff80476301>{__lock_text_start+9}
<ffffffff802339b0>{try_to_del_timer_sync+85}
<ffffffff802cca63>{kjournald+202}
<ffffffff8023db60>{autoremove_wake_function+0}
<ffffffff802cc999>{kjournald+0}
<ffffffff8023d739>{keventd_create_kthread+0}
<ffffffff8023da2f>{kthread+219}
<ffffffff80225a23>{schedule_tail+188}
<ffffffff8020aaca>{child_rip+8}
<ffffffff8023d739>{keventd_create_kthread+0}
<ffffffff8023d954>{kthread+0}
<ffffffff8020aac2>{child_rip+0}
---------------------------
| preempt count: 00010003 ]
| 3-level deep critical section nesting:
----------------------------------------
.. [<ffffffff80476499>] .... _raw_spin_lock+0x16/0x23
.....[<ffffffff804758a7>] .. ( <= rt_lock_slowlock+0x36/0x20e)
.. [<ffffffff80476499>] .... _raw_spin_lock+0x16/0x23
.....[<ffffffff804758a7>] .. ( <= rt_lock_slowlock+0x36/0x20e)
.. [<ffffffff80476499>] .... _raw_spin_lock+0x16/0x23
.....[<ffffffff8024e8d6>] .. ( <= softlockup_tick+0xbe/0xe9)

and finally:

IRQ 1/1085[CPU#0]: BUG in set_palette at drivers/char/vt.c:2924

Call Trace:
<ffffffff8047655a>{_raw_spin_lock_irqsave+24}
<ffffffff8022b272>{__WARN_ON+100}
<ffffffff8033b0dd>{set_palette+54}
<ffffffff8033b13d>{reset_palette+66}
<ffffffff80335497>{reset_vc+111}
<ffffffff8033f0ba>{sysrq_handle_SAK+37}
<ffffffff8033f210>{__handle_sysrq+149}
<ffffffff8033f2b7>{handle_sysrq+23}
<ffffffff80339e9d>{kbd_event+737}
<ffffffff803d046e>{input_event+1067}
<ffffffff803d417a>{atkbd_report_key+80}
<ffffffff803d4ceb>{atkbd_interrupt+1230}
<ffffffff803cd520>{serio_interrupt+69}
<ffffffff803ce145>{i8042_interrupt+499}
<ffffffff8024ec6c>{handle_IRQ_event+92}
<ffffffff8023d739>{keventd_create_kthread+0}
<ffffffff8024f720>{do_irqd+350}
<ffffffff8024f5c2>{do_irqd+0}
<ffffffff8024f5c2>{do_irqd+0}
<ffffffff8023da2f>{kthread+219}
<ffffffff80225a23>{schedule_tail+188}
<ffffffff8020aaca>{child_rip+8}
<ffffffff8023d739>{keventd_create_kthread+0}
<ffffffff8023d954>{kthread+0}
<ffffffff8020aac2>{child_rip+0}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<ffffffff8047655a>] .... _raw_spin_lock_irqsave+0x18/0x29
.....[<ffffffff8022b22d>] .. ( <= __WARN_ON+0x1f/0x82)

--
Robert Crocombe
[email protected]

2006-08-03 14:27:54

by Steven Rostedt

Subject: Re: Problems with 2.6.17-rt8

Please don't trim CC lines. LKML is too big to read all emails.

On Thu, 2006-08-03 at 04:48 -0700, Robert Crocombe wrote:
> On 8/2/06, Steven Rostedt <[email protected]> wrote:
> > You mention problems but I don't see you listing what exactly the
> > problems are. Just saying "the problems exist" doesn't tell us
> > anything.
> >
> > Don't assume that we will go to some web site to figure out what you're
> > talking about. Please list the problems you are facing.
>
> The machine dies (no alt-sysrq, no keyboard LEDs of any kind: dead in
> the water). I thought the log would provide more useful information
> without potentially erroneous editorialization by myself. Here are
> some highlights:
>
> kjournald/1105[CPU#3]: BUG in debug_rt_mutex_unlock at kernel/rtmutex-debug.c:471

Ouch, that looks like kjournald is unlocking a lock that it doesn't own?

>
> Call Trace:
> <ffffffff8047655a>{_raw_spin_lock_irqsave+24}
> <ffffffff8022b272>{__WARN_ON+100}
> <ffffffff802457e4>{debug_rt_mutex_unlock+199}
> <ffffffff804757b7>{rt_lock_slowunlock+25}
> <ffffffff80476301>{__lock_text_start+9}

hmm, here we are probably having trouble with the percpu slab locks,
that is somewhat of a hack to get slabs working on a per cpu basis.

> <ffffffff80271e93>{kmem_cache_alloc+202}

It would also be nice to know exactly where ffffffff80271e93 is.

> <ffffffff8025493b>{mempool_alloc_slab+17}
> <ffffffff80254d07>{mempool_alloc+75}
> <ffffffff802f2f8c>{generic_make_request+375}
> <ffffffff8027b914>{bio_alloc_bioset+35}
> <ffffffff8027ba2a>{bio_alloc+16}
> <ffffffff802781d1>{submit_bh+137}
> <ffffffff80279377>{ll_rw_block+122}
> <ffffffff8027939e>{ll_rw_block+161}
> <ffffffff802c85dc>{journal_commit_transaction+1011}
> <ffffffff80476a5f>{_raw_spin_unlock_irqrestore+56}
> <ffffffff804769ac>{_raw_spin_unlock+46}
> <ffffffff804757df>{rt_lock_slowunlock+65}
> <ffffffff80476301>{__lock_text_start+9}
> <ffffffff802339b0>{try_to_del_timer_sync+85}
> <ffffffff802cca63>{kjournald+202}
> <ffffffff8023db60>{autoremove_wake_function+0}
> <ffffffff802cc999>{kjournald+0}
> <ffffffff8023d739>{keventd_create_kthread+0}
> <ffffffff8023da2f>{kthread+219}
> <ffffffff80225a23>{schedule_tail+188}
> <ffffffff8020aaca>{child_rip+8}
> <ffffffff8023d739>{keventd_create_kthread+0}
> <ffffffff8023d954>{kthread+0}
> <ffffffff8020aac2>{child_rip+0}
> ---------------------------
> | preempt count: 00000002 ]
> | 2-level deep critical section nesting:
> ----------------------------------------
> .. [<ffffffff80476499>] .... _raw_spin_lock+0x16/0x23
> .....[<ffffffff804757af>] .. ( <= rt_lock_slowunlock+0x11/0x6b)
> .. [<ffffffff8047655a>] .... _raw_spin_lock_irqsave+0x18/0x29
> .....[<ffffffff8022b22d>] .. ( <= __WARN_ON+0x1f/0x82)
>
>
> Somewhat later:
>
> Kernel BUG at kernel/rtmutex.c:639

The rest was probably caused as a side effect from above. The above is
already broken!

You have NUMA configured too, so this is also something to look at.

I still wouldn't ignore the first bug message you got:

----
BUG: scheduling while atomic: udev_run_devd/0x00000001/1568

Call Trace:
<ffffffff8045c693>{__schedule+155}
<ffffffff8045f156>{_raw_spin_unlock_irqrestore+53}
<ffffffff80242241>{task_blocks_on_rt_mutex+518}
<ffffffff80252da0>{free_pages_bulk+39}
<ffffffff80252da0>{free_pages_bulk+39}
...
----

This could also have a side effect that messes things up.

Unfortunately, right now I'm assigned to other tasks and I can't spend
much more time on this at the moment. So hopefully, Ingo, Thomas or
Bill, or someone else can help you find the reason for this problem.

-- Steve


2006-08-03 15:08:51

by Robert Crocombe

Subject: Re: Problems with 2.6.17-rt8

On 8/3/06, Steven Rostedt <[email protected]> wrote:
> >
> > Call Trace:
> > <ffffffff8047655a>{_raw_spin_lock_irqsave+24}
> > <ffffffff8022b272>{__WARN_ON+100}
> > <ffffffff802457e4>{debug_rt_mutex_unlock+199}
> > <ffffffff804757b7>{rt_lock_slowunlock+25}
> > <ffffffff80476301>{__lock_text_start+9}
>
> hmm, here we are probably having trouble with the percpu slab locks,
> that is somewhat of a hack to get slabs working on a per cpu basis.
>
> > <ffffffff80271e93>{kmem_cache_alloc+202}
>
> It would also be nice to know exactly where ffffffff80271e93 is.

From the System.map file:

ffffffff80271df5 t gather_stats
ffffffff80271e98 t get_zonemask
ffffffff80271f1e T __mpol_equal
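
(Since System.map is sorted by address and the addresses are fixed-width
lowercase hex, the containing symbol can also be bracketed with a
one-liner; a sketch:

$ awk '$1 <= "ffffffff80271e93"' System.map | tail -1
ffffffff80271df5 t gather_stats

i.e. the address falls between gather_stats and get_zonemask.)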

--
Robert Crocombe
[email protected]

2006-08-03 15:27:59

by Steven Rostedt

Subject: Re: Problems with 2.6.17-rt8

On Thu, 2006-08-03 at 08:08 -0700, Robert Crocombe wrote:
> On 8/3/06, Steven Rostedt <[email protected]> wrote:
> > >
> > > Call Trace:
> > > <ffffffff8047655a>{_raw_spin_lock_irqsave+24}
> > > <ffffffff8022b272>{__WARN_ON+100}
> > > <ffffffff802457e4>{debug_rt_mutex_unlock+199}
> > > <ffffffff804757b7>{rt_lock_slowunlock+25}
> > > <ffffffff80476301>{__lock_text_start+9}
> >
> > hmm, here we are probably having trouble with the percpu slab locks,
> > that is somewhat of a hack to get slabs working on a per cpu basis.
> >
> > > <ffffffff80271e93>{kmem_cache_alloc+202}
> >
> > It would also be nice to know exactly where ffffffff80271e93 is.
>
> From the System.map file:
>
> ffffffff80271df5 t gather_stats
> ffffffff80271e98 t get_zonemask
> ffffffff80271f1e T __mpol_equal
>

Actually, if you compiled with debugging information on, you can use gdb
to get a pretty good location:

$ cd path/to/build/directory
$ gdb vmlinux
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "--host=i386-linux --target=x86_64-linux".

(gdb) li *0xffffffff80271e93

That is if your gdb is also compiled for 64 bit.
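
If gdb gives you trouble, addr2line from binutils can do the same lookup
against a vmlinux that was built with CONFIG_DEBUG_INFO=y. A sketch (I
have not run this against your particular build):

$ cd path/to/build/directory
$ addr2line -e vmlinux -f -i ffffffff80271e93

The -f flag prints the function name along with the file:line, and -i
walks back through inlined call sites.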

-- Steve


2006-08-03 15:48:24

by Robert Crocombe

Subject: Re: Problems with 2.6.17-rt8

I am sad.

Okay, I couldn't use that same machine, which had to be swapped with
something else, but I have a very similar machine (32GB vs 8GB of
RAM, 1 additional video card and some extra 1394b cards) which hit the
same problem. The trace from it is:

md0_raid1/1118[CPU#3]: BUG in debug_rt_mutex_unlock at
kernel/rtmutex-debug.c:471

Call Trace:
<ffffffff80487453>{_raw_spin_lock_irqsave+34}
<ffffffff8022cc03>{__WARN_ON+105}
<ffffffff8022cbbe>{__WARN_ON+36}
<ffffffff8024880b>{debug_rt_mutex_unlock+204}
<ffffffff80486621>{rt_lock_slowunlock+30}
<ffffffff80487196>{__lock_text_start+14}
<ffffffff802792f9>{kmem_cache_alloc+207}
<ffffffff8025b394>{mempool_alloc_slab+22}
<ffffffff8025b783>{mempool_alloc+80}
<ffffffff80248e35>{constant_test_bit+9}
<ffffffff80487960>{_raw_spin_unlock+51}
<ffffffff80486649>{rt_lock_slowunlock+70}
<ffffffff802fde74>{get_request+375}
<ffffffff80209a6d>{mcount+45}
<ffffffff802fe04c>{get_request_wait+45}
<ffffffff80209a6d>{mcount+45}
<ffffffff803023c5>{as_merge+0}
<ffffffff803023db>{as_merge+22}
<ffffffff802fe425>{__make_request+750}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff802fba78>{generic_make_request+380}
<ffffffff80248e35>{constant_test_bit+9}
<ffffffff80487960>{_raw_spin_unlock+51}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff803ea43c>{raid1d+246}
<ffffffff80209a6d>{mcount+45}
<ffffffff80209a6d>{mcount+45}
<ffffffff80486649>{rt_lock_slowunlock+70}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff80248e35>{constant_test_bit+9}
<ffffffff80487960>{_raw_spin_unlock+51}
<ffffffff80486649>{rt_lock_slowunlock+70}
<ffffffff80485568>{thread_return+230}
<ffffffff80487196>{__lock_text_start+14}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff803fc3cf>{md_thread+280}
<ffffffff8023ff97>{autoremove_wake_function+0}
<ffffffff8023fe5a>{kthread+224}
<ffffffff802273bf>{schedule_tail+198}
<ffffffff8020ae12>{child_rip+8}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff8023fd7a>{kthread+0}
<ffffffff8020ae0a>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff8048736f>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff80486619>] .. ( <= rt_lock_slowunlock+0x16/0x70)
.. [<ffffffff80487453>] .... _raw_spin_lock_irqsave+0x22/0x33
.....[<ffffffff8022cbbe>] .. ( <= __WARN_ON+0x24/0x8a)

which makes:

<ffffffff802792f9>{kmem_cache_alloc+207}

the interesting part?

However:

rcrocomb@spanky:2.6.17-rt8$ grep DEBUG_INFO .config
CONFIG_DEBUG_INFO=y
rcrocomb@spanky:2.6.17-rt8$ cd arch/x86_64/boot/compressed/
rcrocomb@spanky:compressed$ gdb vmlinux
GNU gdb Red Hat Linux (6.3.0.0-1.122rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
(no debugging symbols found)
Using host libthread_db library "/lib64/libthread_db.so.1".

(gdb) li *0xffffffff802792f9
No symbol table is loaded. Use the "file" command.

Huh.

--
Robert Crocombe
[email protected]

In case this might be useful:

rcrocomb@spanky:2.6.17-rt8$ grep -i debug .config
.
.
.
CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_SLAB is not set
CONFIG_DEBUG_PREEMPT=y
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_PI_LIST=y
CONFIG_DEBUG_TRACE_IRQFLAGS=y
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_INFO=y
# CONFIG_DEBUG_FS is not set
# CONFIG_DEBUG_VM is not set
CONFIG_DEBUG_RODATA=y
CONFIG_IOMMU_DEBUG=y

2006-08-03 16:04:35

by Steven Rostedt

Subject: Re: Problems with 2.6.17-rt8

On Thu, 2006-08-03 at 08:48 -0700, Robert Crocombe wrote:
> I am sad.

Cheer up... It's OK ;)

>
> Okay, I couldn't use that same machine, which had to be swapped with
> something else, but I have a very similar machine (32GB vs 8GB of
> RAM, 1 additional video card and some extra 1394b cards) which hit the
> same problem. The trace from it is:

Are you also getting any warnings or BUG reports before this? In your
other dmesg, it showed a bug being reported. This could cause problems
later on.

That said, this is starting to look like a NUMA problem. So I would like
to know where that lock is being taken exactly.

>
> md0_raid1/1118[CPU#3]: BUG in debug_rt_mutex_unlock at
> kernel/rtmutex-debug.c:471
>
> Call Trace:
> <ffffffff80487453>{_raw_spin_lock_irqsave+34}
> <ffffffff8022cc03>{__WARN_ON+105}
> <ffffffff8022cbbe>{__WARN_ON+36}
> <ffffffff8024880b>{debug_rt_mutex_unlock+204}
> <ffffffff80486621>{rt_lock_slowunlock+30}
> <ffffffff80487196>{__lock_text_start+14}
> <ffffffff802792f9>{kmem_cache_alloc+207}
> <ffffffff8025b394>{mempool_alloc_slab+22}
> <ffffffff8025b783>{mempool_alloc+80}
> <ffffffff80248e35>{constant_test_bit+9}
> <ffffffff80487960>{_raw_spin_unlock+51}
> <ffffffff80486649>{rt_lock_slowunlock+70}
> <ffffffff802fde74>{get_request+375}
> <ffffffff80209a6d>{mcount+45}
> <ffffffff802fe04c>{get_request_wait+45}
> <ffffffff80209a6d>{mcount+45}
> <ffffffff803023c5>{as_merge+0}
> <ffffffff803023db>{as_merge+22}
> <ffffffff802fe425>{__make_request+750}
> <ffffffff803fc2b7>{md_thread+0}
> <ffffffff802fba78>{generic_make_request+380}
> <ffffffff80248e35>{constant_test_bit+9}
> <ffffffff80487960>{_raw_spin_unlock+51}
> <ffffffff803fc2b7>{md_thread+0}
> <ffffffff803ea43c>{raid1d+246}
> <ffffffff80209a6d>{mcount+45}
> <ffffffff80209a6d>{mcount+45}
> <ffffffff80486649>{rt_lock_slowunlock+70}
> <ffffffff80485568>{thread_return+230}
> <ffffffff80485568>{thread_return+230}
> <ffffffff80248e35>{constant_test_bit+9}
> <ffffffff80487960>{_raw_spin_unlock+51}
> <ffffffff80486649>{rt_lock_slowunlock+70}
> <ffffffff80485568>{thread_return+230}
> <ffffffff80487196>{__lock_text_start+14}
> <ffffffff803fc2b7>{md_thread+0}
> <ffffffff8023fb55>{keventd_create_kthread+0}
> <ffffffff803fc3cf>{md_thread+280}
> <ffffffff8023ff97>{autoremove_wake_function+0}
> <ffffffff8023fe5a>{kthread+224}
> <ffffffff802273bf>{schedule_tail+198}
> <ffffffff8020ae12>{child_rip+8}
> <ffffffff8023fb55>{keventd_create_kthread+0}
> <ffffffff80485568>{thread_return+230}
> <ffffffff80485568>{thread_return+230}
> <ffffffff80485568>{thread_return+230}
> <ffffffff8023fd7a>{kthread+0}
> <ffffffff8020ae0a>{child_rip+0}
> ---------------------------
> | preempt count: 00000002 ]
> | 2-level deep critical section nesting:
> ----------------------------------------
> .. [<ffffffff8048736f>] .... _raw_spin_lock+0x1b/0x28
> .....[<ffffffff80486619>] .. ( <= rt_lock_slowunlock+0x16/0x70)
> .. [<ffffffff80487453>] .... _raw_spin_lock_irqsave+0x22/0x33
> .....[<ffffffff8022cbbe>] .. ( <= __WARN_ON+0x24/0x8a)
>
> which makes:
>
> <ffffffff802792f9>{kmem_cache_alloc+207}
>
> the interesting part?

Yeah, I would say that. I'm sure it's also doing a NUMA alloc in there
too.

>
> However:
>
> rcrocomb@spanky:2.6.17-rt8$ grep DEBUG_INFO .config
> CONFIG_DEBUG_INFO=y
> rcrocomb@spanky:2.6.17-rt8$ cd arch/x86_64/boot/compressed/
> rcrocomb@spanky:compressed$ gdb vmlinux
> GNU gdb Red Hat Linux (6.3.0.0-1.122rh)
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu"...
> (no debugging symbols found)
> Using host libthread_db library "/lib64/libthread_db.so.1".
>
> (gdb) li *0xffffffff802792f9
> No symbol table is loaded. Use the "file" command.
>
> Huh.

??

Are you sure that vmlinux is the one created with the given config file?
There have been times when I added some configs and either forgot to
compile, or the compile failed, and I didn't notice, so the old binary
was being used.
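
One quick sanity check (just a sketch) is to look for the version banner
embedded in the image, which should name the kernel that produced the
traces (2.6.17-rt8_local_00/_01 here):

$ strings vmlinux | grep 'Linux version'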

-- Steve

>
> --
> Robert Crocombe
> [email protected]
>
> In case this might be useful:
>
> rcrocomb@spanky:2.6.17-rt8$ grep -i debug .config
> .
> .
> .
> CONFIG_DEBUG_KERNEL=y
> # CONFIG_DEBUG_SLAB is not set
> CONFIG_DEBUG_PREEMPT=y
> CONFIG_DEBUG_RT_MUTEXES=y
> CONFIG_DEBUG_PI_LIST=y
> CONFIG_DEBUG_TRACE_IRQFLAGS=y
> # CONFIG_DEBUG_KOBJECT is not set
> CONFIG_DEBUG_INFO=y
> # CONFIG_DEBUG_FS is not set
> # CONFIG_DEBUG_VM is not set
> CONFIG_DEBUG_RODATA=y
> CONFIG_IOMMU_DEBUG=y

2006-08-03 17:16:32

by Robert Crocombe

Subject: Re: Problems with 2.6.17-rt8

On 8/3/06, Steven Rostedt <[email protected]> wrote:

> Are you also getting any warnings or BUG reports before this? In your
> other dmesg, it showed a bug being reported. This could cause problems
> later on.

Well, I'm getting a "nobody cared" on the tg3 (there's a separate
email about that to the list since it seemed sorta unrelated). This
is straight from the console:

*****************************************************************************
Time: acpi_pm clocksource has been installed.
* *
* REMINDER, the following debugging options are turned on in your .config: *
* *
* CONFIG_DEBUG_RT_MUTEXES *
* CONFIG_DEBUG_PREEMPT *
* CONFIG_CRITICAL_PREEMPT_TIMING *
* CONFIG_CRITICAL_IRQSOFF_TIMING *
* CONFIG_LATENCY_TRACE *
* *
* they may increase runtime overhead and latencies. *
* *
*****************************************************************************
Freeing unused kernel memory: 240k freed
Write protecting the kernel read-only data: 1036k
input: ImPS/2 Generic Wheel Mouse as /class/input/input1
md: Autodetecting RAID arrays.
md: autorun ...
md: considering sdb1 ...
md: adding sdb1 ...
md: adding sda1 ...
md: created md0
md: bind<sda1>
md: bind<sdb1>
md: running: <sdb1><sda1>
md: md0: raid array is not clean -- starting background reconstruction
raid1: raid set md0 active with 2 out of 2 mirrors
md: ... autorun DONE.
md: syncing RAID array md0
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than
200000 KB/sec) for reconstruction.
md: using 128k window, over a total of 287836480 blocks.
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
irq 106: nobody cared (try booting with the "irqpoll" option)

Call Trace:
<ffffffff802562ad>{__report_bad_irq+61}
<ffffffff80485568>{thread_return+230}
<ffffffff802564e3>{note_interrupt+487}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff8025588f>{thread_simple_irq+137}
<ffffffff80255cb3>{do_irqd+0}
<ffffffff80255d8f>{do_irqd+220}
<ffffffff80255cb3>{do_irqd+0}
<ffffffff8023fe5a>{kthread+224}
<ffffffff802273bf>{schedule_tail+198}
<ffffffff8020ae12>{child_rip+8}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff8023fd7a>{kthread+0}
<ffffffff8020ae0a>{child_rip+0}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<ffffffff80487485>] .... _raw_spin_lock_irq+0x21/0x2e
.....[<ffffffff80255876>] .. ( <= thread_simple_irq+0x70/0x98)

handlers:
[<ffffffff88003868>] (tg3_interrupt_tagged+0x0/0xa7 [tg3])
turning off IO-APIC fast mode.
irq 106: nobody cared (try booting with the "irqpoll" option)

Call Trace:
<ffffffff802562ad>{__report_bad_irq+61}
<ffffffff802564e3>{note_interrupt+487}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff8025588f>{thread_simple_irq+137}
<ffffffff80255cb3>{do_irqd+0}
<ffffffff80255d8f>{do_irqd+220}
<ffffffff80255cb3>{do_irqd+0}
<ffffffff8023fe5a>{kthread+224}
<ffffffff802273bf>{schedule_tail+198}
<ffffffff8020ae12>{child_rip+8}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff8023fd7a>{kthread+0}
<ffffffff8020ae0a>{child_rip+0}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<ffffffff80487485>] .... _raw_spin_lock_irq+0x21/0x2e
.....[<ffffffff80255876>] .. ( <= thread_simple_irq+0x70/0x98)

handlers:
[<ffffffff88003868>] (tg3_interrupt_tagged+0x0/0xa7 [tg3])
md0_raid1/1118[CPU#3]: BUG in debug_rt_mutex_unlock at
kernel/rtmutex-debug.c:471

Call Trace:
<ffffffff80487453>{_raw_spin_lock_irqsave+34}
<ffffffff8022cc03>{__WARN_ON+105}
<ffffffff8022cbbe>{__WARN_ON+36}
<ffffffff8024880b>{debug_rt_mutex_unlock+204}
<ffffffff80486621>{rt_lock_slowunlock+30}
<ffffffff80487196>{__lock_text_start+14}
<ffffffff802792f9>{kmem_cache_alloc+207}
<ffffffff8025b394>{mempool_alloc_slab+22}
<ffffffff8025b783>{mempool_alloc+80}
<ffffffff80248e35>{constant_test_bit+9}
<ffffffff80487960>{_raw_spin_unlock+51}
<ffffffff80486649>{rt_lock_slowunlock+70}
<ffffffff802fde74>{get_request+375}
<ffffffff80209a6d>{mcount+45}
<ffffffff802fe04c>{get_request_wait+45}
<ffffffff80209a6d>{mcount+45}
<ffffffff803023c5>{as_merge+0}
<ffffffff803023db>{as_merge+22}
<ffffffff802fe425>{__make_request+750}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff802fba78>{generic_make_request+380}
<ffffffff80248e35>{constant_test_bit+9}
<ffffffff80487960>{_raw_spin_unlock+51}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff803ea43c>{raid1d+246}
<ffffffff80209a6d>{mcount+45}
<ffffffff80209a6d>{mcount+45}
<ffffffff80486649>{rt_lock_slowunlock+70}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff80248e35>{constant_test_bit+9}
<ffffffff80487960>{_raw_spin_unlock+51}
<ffffffff80486649>{rt_lock_slowunlock+70}
<ffffffff80485568>{thread_return+230}
<ffffffff80487196>{__lock_text_start+14}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff803fc3cf>{md_thread+280}
<ffffffff8023ff97>{autoremove_wake_function+0}
<ffffffff8023fe5a>{kthread+224}
<ffffffff802273bf>{schedule_tail+198}
<ffffffff8020ae12>{child_rip+8}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff8023fd7a>{kthread+0}
<ffffffff8020ae0a>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff8048736f>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff80486619>] .. ( <= rt_lock_slowunlock+0x16/0x70)
.. [<ffffffff80487453>] .... _raw_spin_lock_irqsave+0x22/0x33
.....[<ffffffff8022cbbe>] .. ( <= __WARN_ON+0x24/0x8a)

md0_raid1/1118[CPU#3]: BUG in debug_rt_mutex_unlock at
kernel/rtmutex-debug.c:472

Call Trace:
<ffffffff80487453>{_raw_spin_lock_irqsave+34}
<ffffffff8022cc03>{__WARN_ON+105}
<ffffffff8022cbbe>{__WARN_ON+36}
<ffffffff802488ad>{debug_rt_mutex_unlock+366}
<ffffffff80486621>{rt_lock_slowunlock+30}
<ffffffff80487196>{__lock_text_start+14}
<ffffffff802792f9>{kmem_cache_alloc+207}
<ffffffff8025b394>{mempool_alloc_slab+22}
<ffffffff8025b783>{mempool_alloc+80}
<ffffffff80248e35>{constant_test_bit+9}
<ffffffff80487960>{_raw_spin_unlock+51}
<ffffffff80486649>{rt_lock_slowunlock+70}
<ffffffff802fde74>{get_request+375}
<ffffffff80209a6d>{mcount+45}
<ffffffff802fe04c>{get_request_wait+45}
<ffffffff80209a6d>{mcount+45}
<ffffffff803023c5>{as_merge+0}
<ffffffff803023db>{as_merge+22}
<ffffffff802fe425>{__make_request+750}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff802fba78>{generic_make_request+380}
<ffffffff80248e35>{constant_test_bit+9}
<ffffffff80487960>{_raw_spin_unlock+51}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff803ea43c>{raid1d+246}
<ffffffff80209a6d>{mcount+45}
<ffffffff80209a6d>{mcount+45}
<ffffffff80486649>{rt_lock_slowunlock+70}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff80248e35>{constant_test_bit+9}
<ffffffff80487960>{_raw_spin_unlock+51}
<ffffffff80486649>{rt_lock_slowunlock+70}
<ffffffff80485568>{thread_return+230}
<ffffffff80487196>{__lock_text_start+14}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff803fc3cf>{md_thread+280}
<ffffffff8023ff97>{autoremove_wake_function+0}
<ffffffff8023fe5a>{kthread+224}
<ffffffff802273bf>{schedule_tail+198}
<ffffffff8020ae12>{child_rip+8}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff8023fd7a>{kthread+0}
<ffffffff8020ae0a>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff8048736f>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff80486619>] .. ( <= rt_lock_slowunlock+0x16/0x70)
.. [<ffffffff80487453>] .... _raw_spin_lock_irqsave+0x22/0x33
.....[<ffffffff8022cbbe>] .. ( <= __WARN_ON+0x24/0x8a)

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at kernel/rtmutex.c:639
invalid opcode: 0000 [1] PREEMPT SMP
CPU 3
Modules linked in: nfsd exportfs lockd sunrpc ohci1394 ieee1394 tg3
Pid: 1118, comm: md0_raid1 Not tainted 2.6.17-rt8_local_00 #3
RIP: 0010:[<ffffffff8048679a>] <ffffffff8048679a>{rt_lock_slowlock+186}
RSP: 0018:ffff8101ea5bd9e8 EFLAGS: 00010246
RAX: ffff8107eac1e340 RBX: 0000000000000010 RCX: 0000000000240180
RDX: ffff8107eac1e340 RSI: ffffffff8027927e RDI: ffff810600115ca0
RBP: ffff8101ea5bdaa8 R08: ffff8104000d7f38 R09: ffff8101ea5bd9e8
R10: ffff8104000d7f38 R11: 0000000000000023 R12: ffff810600115ca0
R13: ffff8103ead2a080 R14: ffffffff8027927e R15: ffff8103eadbe0e0
FS: 00002afcbce1c770(0000) GS:ffff810600211340(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000334d703088 CR3: 00000007e8ffa000 CR4: 00000000000006e0
Process md0_raid1 (pid: 1118, threadinfo ffff8101ea5bc000, task ffff8107eac1e340)
Stack: 111111110000008c ffff8101ea5bd9f0 ffff8101ea5bd9f0 ffff8101ea5bda00
ffff8101ea5bda00 0000000000000000 111111110000008c ffff8101ea5bda20
ffff8101ea5bda20 ffff8101ea5bda30
Call Trace:
<ffffffff804871aa>{rt_lock+18}
<ffffffff8027927e>{kmem_cache_alloc+84}
<ffffffff8025b394>{mempool_alloc_slab+22}
<ffffffff8025b783>{mempool_alloc+80}
<ffffffff80209a6d>{mcount+45}
<ffffffff80302648>{as_set_request+0}
<ffffffff8030266d>{as_set_request+37}
<ffffffff802f8fac>{elv_set_request+27}
<ffffffff802fde98>{get_request+411}
<ffffffff80209a6d>{mcount+45}
<ffffffff802fe04c>{get_request_wait+45}
<ffffffff80209a6d>{mcount+45}
<ffffffff803023c5>{as_merge+0}
<ffffffff803023db>{as_merge+22}
<ffffffff802fe425>{__make_request+750}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff802fba78>{generic_make_request+380}
<ffffffff80248e35>{constant_test_bit+9}
<ffffffff80487960>{_raw_spin_unlock+51}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff803ea43c>{raid1d+246}
<ffffffff80209a6d>{mcount+45}
<ffffffff80209a6d>{mcount+45}
<ffffffff80486649>{rt_lock_slowunlock+70}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff80248e35>{constant_test_bit+9}
<ffffffff80487960>{_raw_spin_unlock+51}
<ffffffff80486649>{rt_lock_slowunlock+70}
<ffffffff80485568>{thread_return+230}
<ffffffff80487196>{__lock_text_start+14}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff803fc3cf>{md_thread+280}
<ffffffff8023ff97>{autoremove_wake_function+0}
<ffffffff8023fe5a>{kthread+224}
<ffffffff802273bf>{schedule_tail+198}
<ffffffff8020ae12>{child_rip+8}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff8023fd7a>{kthread+0}
<ffffffff8020ae0a>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff8048736f>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff8048671b>] .. ( <= rt_lock_slowlock+0x3b/0x213)
.. [<ffffffff80487728>] .... _raw_spin_trylock+0x1b/0x5f
.....[<ffffffff804881d6>] .. ( <= oops_begin+0x28/0x77)


Code: 0f 0b 68 70 44 4c 80 c2 7f 02 65 48 8b 04 25 00 00 00 00 41
RIP <ffffffff8048679a>{rt_lock_slowlock+186} RSP <ffff8101ea5bd9e8>




etc. and so forth. Oh wait, here's one that happens straight off after boot:







*****************************************************************************
Time: acpi_pm clocksource has been installed.
* *
* REMINDER, the following debugging options are turned on in your .config: *
* *
* CONFIG_DEBUG_RT_MUTEXES *
* CONFIG_DEBUG_PREEMPT *
* CONFIG_CRITICAL_PREEMPT_TIMING *
* CONFIG_CRITICAL_IRQSOFF_TIMING *
* CONFIG_LATENCY_TRACE *
* *
* they may increase runtime overhead and latencies. *
* *
*****************************************************************************
Freeing unused kernel memory: 240k freed
Write protecting the kernel read-only data: 1036k
input: ImPS/2 Generic Wheel Mouse as /class/input/input1
md: Autodetecting RAID arrays.
md: autorun ...
md: considering sdb1 ...
md: adding sdb1 ...
md: adding sda1 ...
md: created md0
md: bind<sda1>
md: bind<sdb1>
md: running: <sdb1><sda1>
md: md0: raid array is not clean -- starting background reconstruction
raid1: raid set md0 active with 2 out of 2 mirrors
md: ... autorun DONE.
md: syncing RAID array md0
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than
200000 KB/sec) for reconstruction.
md: using 128k window, over a total of 287836480 blocks.
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
md0_resync/1119[CPU#1]: BUG in debug_rt_mutex_unlock at
kernel/rtmutex-debug.c:471

Call Trace:
<ffffffff80487453>{_raw_spin_lock_irqsave+34}
<ffffffff8022cc03>{__WARN_ON+105}
<ffffffff8022cbbe>{__WARN_ON+36}
<ffffffff8024880b>{debug_rt_mutex_unlock+204}
<ffffffff80486621>{rt_lock_slowunlock+30}
<ffffffff80487196>{__lock_text_start+14}
<ffffffff802792f9>{kmem_cache_alloc+207}
<ffffffff80304e45>{kobject_get+31}
<ffffffff80374eb7>{scsi_get_command+77}
<ffffffff8037a2a8>{scsi_prep_fn+339}
<ffffffff802f9a3a>{elv_next_request+149}
<ffffffff80304e45>{kobject_get+31}
<ffffffff80379e81>{scsi_request_fn+128}
<ffffffff802fc854>{__generic_unplug_device+45}
<ffffffff802fca71>{generic_unplug_device+37}
<ffffffff803e8cc7>{unplug_slaves+132}
<ffffffff803e8d31>{raid1_unplug+29}
<ffffffff803fbe92>{md_do_sync+1303}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff80248e35>{constant_test_bit+9}
<ffffffff80209a6d>{mcount+45}
<ffffffff80486649>{rt_lock_slowunlock+70}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff803fc3cf>{md_thread+280}
<ffffffff80209a6d>{mcount+45}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff8023fe5a>{kthread+224}
<ffffffff802273bf>{schedule_tail+198}
<ffffffff8020ae12>{child_rip+8}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff8023fd7a>{kthread+0}
<ffffffff8020ae0a>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff8048736f>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff80486619>] .. ( <= rt_lock_slowunlock+0x16/0x70)
.. [<ffffffff80487453>] .... _raw_spin_lock_irqsave+0x22/0x33
.....[<ffffffff8022cbbe>] .. ( <= __WARN_ON+0x24/0x8a)

md0_resync/1119[CPU#1]: BUG in debug_rt_mutex_unlock at
kernel/rtmutex-debug.c:472

Call Trace:
<ffffffff80487453>{_raw_spin_lock_irqsave+34}
<ffffffff8022cc03>{__WARN_ON+105}
<ffffffff8022cbbe>{__WARN_ON+36}
<ffffffff802488ad>{debug_rt_mutex_unlock+366}
<ffffffff80486621>{rt_lock_slowunlock+30}
<ffffffff80487196>{__lock_text_start+14}
<ffffffff802792f9>{kmem_cache_alloc+207}
<ffffffff80304e45>{kobject_get+31}
<ffffffff80374eb7>{scsi_get_command+77}
<ffffffff8037a2a8>{scsi_prep_fn+339}
<ffffffff802f9a3a>{elv_next_request+149}
<ffffffff80304e45>{kobject_get+31}
<ffffffff80379e81>{scsi_request_fn+128}
<ffffffff802fc854>{__generic_unplug_device+45}
<ffffffff802fca71>{generic_unplug_device+37}
<ffffffff803e8cc7>{unplug_slaves+132}
<ffffffff803e8d31>{raid1_unplug+29}
<ffffffff803fbe92>{md_do_sync+1303}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff80248e35>{constant_test_bit+9}
<ffffffff80209a6d>{mcount+45}
<ffffffff80486649>{rt_lock_slowunlock+70}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff803fc3cf>{md_thread+280}
<ffffffff80209a6d>{mcount+45}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff8023fe5a>{kthread+224}
<ffffffff802273bf>{schedule_tail+198}
<ffffffff8020ae12>{child_rip+8}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff8023fd7a>{kthread+0}
<ffffffff8020ae0a>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff8048736f>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff80486619>] .. ( <= rt_lock_slowunlock+0x16/0x70)
.. [<ffffffff80487453>] .... _raw_spin_lock_irqsave+0x22/0x33
.....[<ffffffff8022cbbe>] .. ( <= __WARN_ON+0x24/0x8a)

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at kernel/rtmutex.c:639
invalid opcode: 0000 [1] PREEMPT SMP
CPU 1
Modules linked in:
Pid: 1119, comm: md0_resync Not tainted 2.6.17-rt8_local_00 #3
RIP: 0010:[<ffffffff8048679a>] <ffffffff8048679a>{rt_lock_slowlock+186}
RSP: 0018:ffff8105eac63a58 EFLAGS: 00010246
RAX: ffff8102002b2db0 RBX: 0000000000000020 RCX: 00000000000c0080
RDX: ffff8102002b2db0 RSI: ffffffff8027927e RDI: ffff810200115ca0
RBP: ffff8105eac63b18 R08: ffff8104000d7650 R09: ffff8105eac63a58
R10: ffff8104000d7650 R11: ffff8104000d7650 R12: ffff810200115ca0
R13: ffff8107eac3f600 R14: ffffffff8027927e R15: ffff81020033a0c0
FS: 00002adbe7c426d0(0000) GS:ffff8102001e3340(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002aed3bd60770 CR3: 0000000000201000 CR4: 00000000000006e0
Process md0_resync (pid: 1119, threadinfo ffff8105eac62000, task ffff8102002b2db0)
Stack: 111111110000008c ffff8105eac63a60 ffff8105eac63a60 ffff8105eac63a70
ffff8105eac63a70 0000000000000000 111111110000008c ffff8105eac63a90
ffff8105eac63a90 ffff8105eac63aa0
Call Trace:
<ffffffff80486649>{rt_lock_slowunlock+70}
<ffffffff804871aa>{rt_lock+18}
<ffffffff8027927e>{kmem_cache_alloc+84}
<ffffffff8025b394>{mempool_alloc_slab+22}
<ffffffff8025b783>{mempool_alloc+80}
<ffffffff80487196>{__lock_text_start+14}
<ffffffff80374f67>{scsi_get_command+253}
<ffffffff8037a3cf>{scsi_prep_fn+634}
<ffffffff802f9a3a>{elv_next_request+149}
<ffffffff80304e45>{kobject_get+31}
<ffffffff80379e81>{scsi_request_fn+128}
<ffffffff802fc854>{__generic_unplug_device+45}
<ffffffff802fca71>{generic_unplug_device+37}
<ffffffff803e8cc7>{unplug_slaves+132}
<ffffffff803e8d31>{raid1_unplug+29}
<ffffffff803fbe92>{md_do_sync+1303}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff80248e35>{constant_test_bit+9}
<ffffffff80209a6d>{mcount+45}
<ffffffff80486649>{rt_lock_slowunlock+70}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff803fc3cf>{md_thread+280}
<ffffffff80209a6d>{mcount+45}
<ffffffff803fc2b7>{md_thread+0}
<ffffffff8023fe5a>{kthread+224}
<ffffffff802273bf>{schedule_tail+198}
<ffffffff8020ae12>{child_rip+8}
<ffffffff8023fb55>{keventd_create_kthread+0}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff80485568>{thread_return+230}
<ffffffff8023fd7a>{kthread+0}
<ffffffff8020ae0a>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff8048736f>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff8048671b>] .. ( <= rt_lock_slowlock+0x3b/0x213)
.. [<ffffffff80487728>] .... _raw_spin_trylock+0x1b/0x5f
.....[<ffffffff804881d6>] .. ( <= oops_begin+0x28/0x77)


Code: 0f 0b 68 70 44 4c 80 c2 7f 02 65 48 8b 04 25 00 00 00 00 41
RIP <ffffffff8048679a>{rt_lock_slowlock+186} RSP <ffff8105eac63a58>


> ??
>
> Are you sure that vmlinux is the one created with the given config file?
> There have been times when I added some configs and either forgot to
> compile, or the compile failed, and I didn't notice, so the old binary
> was being used.

I think so. I just rebuilt the thing and still no love:

rcrocomb@spanky:compressed$ cd ~/kernel/2.6.17-rt8
rcrocomb@spanky:2.6.17-rt8$ make clean
CLEAN arch/x86_64/boot/compressed/
CLEAN arch/x86_64/boot
CLEAN /home/rcrocomb/kernel/2.6.17-rt8
CLEAN arch/x86_64/ia32
CLEAN arch/x86_64/kernel
CLEAN drivers/char
CLEAN drivers/ieee1394
CLEAN drivers/md
CLEAN drivers/scsi/aic7xxx
CLEAN init
CLEAN kernel
CLEAN lib
CLEAN usr
CLEAN .tmp_versions
CLEAN vmlinux System.map .tmp_kallsyms1.o .tmp_kallsyms1.S .tmp_kallsyms2.o .tmp_kallsyms2.S .tmp_vmlinux1 .tmp_vmlinux2 .tmp_System.map
rcrocomb@spanky:2.6.17-rt8$ make -j4
.
.
.
buildy buildy buildy
.
.
.
LD [M] fs/lockd/lockd.ko
LD [M] fs/nfs/nfs.ko
LD [M] fs/nfsd/nfsd.ko
LD [M] net/sunrpc/sunrpc.ko
rcrocomb@spanky:2.6.17-rt8$ cd arch/x86_64/boot/compressed/
rcrocomb@spanky:compressed$ gdb vmlinux
GNU gdb Red Hat Linux (6.3.0.0-1.122rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib64/libthread_db.so.1".

(gdb) li *0xffffffff802792f9
No symbol table is loaded. Use the "file" command.
(gdb)

I am at [even more of] a loss.
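
One guess about the missing symbols: arch/x86_64/boot/compressed/vmlinux
is only the stripped wrapper around the compressed kernel image, so it
never carries the DWARF data that CONFIG_DEBUG_INFO generates. The debug
symbols should be in the top-level vmlinux at the root of the build tree,
so something along these lines (a sketch) ought to work instead:

$ cd ~/kernel/2.6.17-rt8
$ gdb ./vmlinux
(gdb) li *0xffffffff802792f9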

--
Robert Crocombe
[email protected]

I hope warnings like this (one for each and every file, it seems):

scripts/mod/empty.c:1: warning: -ffunction-sections disabled; it makes
profiling impossible

and

ld: warning: i386:x86-64 architecture of input file
`arch/x86_64/boot/compressed/head.o' is incompatible with i386 output

don't indicate anything particularly maleficent.

2006-08-03 20:22:20

by Bill Huey

Subject: Re: Problems with 2.6.17-rt8

On Thu, Aug 03, 2006 at 10:27:41AM -0400, Steven Rostedt wrote:
> The rest was probably caused as a side effect from above. The above is
> already broken!
>
> You have NUMA configured too, so this is also something to look at.
>
> I still wouldn't ignore the first bug message you got:
>
> ----
> BUG: scheduling while atomic: udev_run_devd/0x00000001/1568
>
> Call Trace:
> <ffffffff8045c693>{__schedule+155}
> <ffffffff8045f156>{_raw_spin_unlock_irqrestore+53}
> <ffffffff80242241>{task_blocks_on_rt_mutex+518}
> <ffffffff80252da0>{free_pages_bulk+39}
> <ffffffff80252da0>{free_pages_bulk+39}
> ...
> ----
>
> This could also have a side effect that messes things up.
>
> Unfortunately, right now I'm assigned to other tasks and I cant spend
> much more time on this at the moment. So hopefully, Ingo, Thomas or
> Bill, or someone else can help you find the reason for this problem.

free_pages_bulk is definitely being called inside of an atomic.
I force this stack trace when the in_atomic() test is true at the
beginning of the function.


[ 29.362863] Call Trace:
[ 29.367107] <ffffffff802a82ac>{free_pages_bulk+86}
[ 29.373122] <ffffffff80261726>{_raw_spin_unlock_irqrestore+44}
[ 29.380233] <ffffffff802a8778>{__free_pages_ok+428}
[ 29.386336] <ffffffff8024f101>{free_hot_page+25}
[ 29.392165] <ffffffff8022e298>{__free_pages+41}
[ 29.397898] <ffffffff806b604d>{__free_pages_bootmem+174}
[ 29.404457] <ffffffff806b5266>{free_all_bootmem_core+253}
[ 29.411112] <ffffffff806b5340>{free_all_bootmem_node+9}
[ 29.417574] <ffffffff806b254e>{numa_free_all_bootmem+61}
[ 29.424122] <ffffffff8046e96e>{_etext+0}
[ 29.429224] <ffffffff806b1392>{mem_init+128}
[ 29.434691] <ffffffff806a17ab>{start_kernel+377}
[ 29.440520] <ffffffff806a129b>{_sinittext+667}
[ 29.446669] ---------------------------
[ 29.450963] | preempt count: 00000001 ]
[ 29.455257] | 1-level deep critical section nesting:
[ 29.460732] ----------------------------------------
[ 29.466212] .. [<ffffffff806a169a>] .... start_kernel+0x68/0x221
[ 29.472815] .....[<ffffffff806a129b>] .. ( <= _sinittext+0x29b/0x2a2)
[ 29.480056]
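
The instrumentation being described is just a dump_stack() guarded by an
in_atomic() test at the top of the function. A sketch of the idea (not
Bill's actual change, and the prototype below is approximate):

	static void free_pages_bulk(struct zone *zone, int count,
				    struct list_head *list, int order)
	{
		/* report callers that reach the bulk free path while atomic;
		 * this is what produced the trace above */
		if (in_atomic())
			dump_stack();

		/* ... existing bulk-free logic continues unchanged ... */
	}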

bill

2006-08-03 20:54:25

by Steven Rostedt

Subject: Re: Problems with 2.6.17-rt8

On Thu, 2006-08-03 at 13:22 -0700, Bill Huey wrote:
> On Thu, Aug 03, 2006 at 10:27:41AM -0400, Steven Rostedt wrote:
> > The rest was probably caused as a side effect from above. The above is
> > already broken!
> >
> > You have NUMA configured too, so this is also something to look at.
> >
> > I still wouldn't ignore the first bug message you got:
> >
> > ----
> > BUG: scheduling while atomic: udev_run_devd/0x00000001/1568
> >
> > Call Trace:
> > <ffffffff8045c693>{__schedule+155}
> > <ffffffff8045f156>{_raw_spin_unlock_irqrestore+53}
> > <ffffffff80242241>{task_blocks_on_rt_mutex+518}
> > <ffffffff80252da0>{free_pages_bulk+39}
> > <ffffffff80252da0>{free_pages_bulk+39}
> > ...
> > ----
> >
> > This could also have a side effect that messes things up.
> >
> > Unfortunately, right now I'm assigned to other tasks and I can't spend
> > much more time on this at the moment. So hopefully, Ingo, Thomas or
> > Bill, or someone else can help you find the reason for this problem.
>
> free_pages_bulk is definitely being called inside of an atomic.
> I force this stack trace when the in_atomic() test is true at the
> beginning of the function.
>
>
> [ 29.362863] Call Trace:
> [ 29.367107] <ffffffff802a82ac>{free_pages_bulk+86}
> [ 29.373122] <ffffffff80261726>{_raw_spin_unlock_irqrestore+44}
> [ 29.380233] <ffffffff802a8778>{__free_pages_ok+428}
> [ 29.386336] <ffffffff8024f101>{free_hot_page+25}
> [ 29.392165] <ffffffff8022e298>{__free_pages+41}
> [ 29.397898] <ffffffff806b604d>{__free_pages_bootmem+174}
> [ 29.404457] <ffffffff806b5266>{free_all_bootmem_core+253}
> [ 29.411112] <ffffffff806b5340>{free_all_bootmem_node+9}
> [ 29.417574] <ffffffff806b254e>{numa_free_all_bootmem+61}
> [ 29.424122] <ffffffff8046e96e>{_etext+0}
> [ 29.429224] <ffffffff806b1392>{mem_init+128}
> [ 29.434691] <ffffffff806a17ab>{start_kernel+377}
> [ 29.440520] <ffffffff806a129b>{_sinittext+667}
> [ 29.446669] ---------------------------
> [ 29.450963] | preempt count: 00000001 ]
> [ 29.455257] | 1-level deep critical section nesting:
> [ 29.460732] ----------------------------------------
> [ 29.466212] .. [<ffffffff806a169a>] .... start_kernel+0x68/0x221
> [ 29.472815] .....[<ffffffff806a129b>] .. ( <= _sinittext+0x29b/0x2a2)
> [ 29.480056]

Perhaps you could put that in_atomic() check at the start of each of
these functions and pinpoint where it is a problem. Perhaps a real
spinlock is being taken somewhere, rather than one converted to a mutex.
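
Something like this at the top of each of those functions would do it
(just a sketch of the check, using the stock printk/dump_stack helpers):

        /* Under -rt the zone lock is a sleeping lock, so any caller that
         * reaches this path while atomic is a "scheduling while atomic"
         * waiting to happen; flag it and show where we came from. */
        if (in_atomic()) {
                printk(KERN_ERR "%s: called in atomic context\n", __func__);
                dump_stack();
        }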

-- Steve


2006-08-03 21:18:10

by Bill Huey

[permalink] [raw]
Subject: Re: Problems with 2.6.17-rt8

On Thu, Aug 03, 2006 at 04:54:05PM -0400, Steven Rostedt wrote:
> On Thu, 2006-08-03 at 13:22 -0700, Bill Huey wrote:
> > free_pages_bulk is definitely being called inside of an atomic.
> > I force this stack trace when the in_atomic() test is true at the
> > beginning of the function.
> >
> >
> > [ 29.362863] Call Trace:
> > [ 29.367107] <ffffffff802a82ac>{free_pages_bulk+86}
> > [ 29.373122] <ffffffff80261726>{_raw_spin_unlock_irqrestore+44}
> > [ 29.380233] <ffffffff802a8778>{__free_pages_ok+428}
> > [ 29.386336] <ffffffff8024f101>{free_hot_page+25}
> > [ 29.392165] <ffffffff8022e298>{__free_pages+41}
> > [ 29.397898] <ffffffff806b604d>{__free_pages_bootmem+174}
> > [ 29.404457] <ffffffff806b5266>{free_all_bootmem_core+253}
> > [ 29.411112] <ffffffff806b5340>{free_all_bootmem_node+9}
> > [ 29.417574] <ffffffff806b254e>{numa_free_all_bootmem+61}
> > [ 29.424122] <ffffffff8046e96e>{_etext+0}
> > [ 29.429224] <ffffffff806b1392>{mem_init+128}
> > [ 29.434691] <ffffffff806a17ab>{start_kernel+377}
> > [ 29.440520] <ffffffff806a129b>{_sinittext+667}
> > [ 29.446669] ---------------------------
> > [ 29.450963] | preempt count: 00000001 ]
> > [ 29.455257] | 1-level deep critical section nesting:
> > [ 29.460732] ----------------------------------------
> > [ 29.466212] .. [<ffffffff806a169a>] .... start_kernel+0x68/0x221
> > [ 29.472815] .....[<ffffffff806a129b>] .. ( <= _sinittext+0x29b/0x2a2)
> > [ 29.480056]
>
> Perhaps you could put in that in_atomic check at the start of each of
> these functions and point to where it is a problem. Perhaps a spinlock
> is taken that was real and not a mutex.

Yeah, that's my thought as well. I'm going to do what you suggest now.

bill

2006-08-08 02:56:25

by Bill Huey

[permalink] [raw]
Subject: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On Thu, Aug 03, 2006 at 10:27:41AM -0400, Steven Rostedt wrote:

...(output and commentary, a lot deleted)...

> This could also have a side effect that messes things up.
>
> Unfortunately, right now I'm assigned to other tasks and I cant spend
> much more time on this at the moment. So hopefully, Ingo, Thomas or
> Bill, or someone else can help you find the reason for this problem.

Steve and company,

Speaking of which, after talking to Steve about this and confirming it
with a revert of the changes: put_task_struct() can't deallocate memory
from either the zone allocator or the slab cache without taking a
sleeping lock. It can't be called directly from finish_task_switch() to
reap the thread because of that (it would sleep in atomic context).

It is for this reason that the RCU callback to delay the processing was
put into place to reap threads; it was, seemingly by accident, dropped
between patch-2.6.17-rt7 and -rt8. That is what broke it in the first
place.
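
The shape of the deferral being restored is roughly this (a sketch only;
the field and helper names are illustrative, assuming an rcu_head and the
usage refcount in task_struct, not the exact -rt7 code):

/* Runs later in a schedulable context, so the sleeping locks taken by the
 * zone/slab free paths are legal there. */
static void __put_task_struct_rcu(struct rcu_head *rhp)
{
        struct task_struct *tsk = container_of(rhp, struct task_struct, rcu);

        __put_task_struct(tsk);
}

/* Instead of freeing directly from finish_task_switch(), queue the free: */
static inline void put_task_struct_deferred(struct task_struct *tsk)
{
        if (atomic_dec_and_test(&tsk->usage))
                call_rcu(&tsk->rcu, __put_task_struct_rcu);
}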

I tested it with a "make -j4" which triggers the warning and it they all
go away now.

Reverse patch attached:

bill


Attachments:
(No filename) (1.03 kB)
t.diff (1.94 kB)
Download all attachments

2006-08-08 03:05:50

by Bill Huey

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On Mon, Aug 07, 2006 at 07:56:15PM -0700, Bill Huey wrote:
> On Thu, Aug 03, 2006 at 10:27:41AM -0400, Steven Rostedt wrote:
>
> ...(output and commentary a log deleted)...
>
> > This could also have a side effect that messes things up.
> >
> > Unfortunately, right now I'm assigned to other tasks and I cant spend
> > much more time on this at the moment. So hopefully, Ingo, Thomas or
> > Bill, or someone else can help you find the reason for this problem.
>
> Steve and company,
>
> Speaking of which, after talking to Steve about this and confirming this
> with a revert of changes. put_task_struct() can't deallocated memory from
> either the zone or SLAB cache without taking a sleeping lock. It can't
> be called directly from finish_task_switch to reap the thread because of
> that (violation in atomic).
>
> It is for this reason the RCU call back to delay processing was put into
> place to reap threads and was, seemingly by accident, missing from
> patch-2.6.17-rt7 to -rt8. That is what broke it in the first place.
>
> I tested it with a "make -j4" which triggers the warning and it they all
> go away now.
>
> Reverse patch attached:

Resend with instrumentation code removed:

bill


Attachments:
(No filename) (1.18 kB)
t.diff (1.89 kB)
Download all attachments

2006-08-08 18:46:43

by Robert Crocombe

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On 8/7/06, hui Bill Huey <[email protected]> wrote:
> > I tested it with a "make -j4" which triggers the warning and it they all
> > go away now.
> >
> > Reverse patch attached:

Unfortunately, this makes no difference on my setup. I applied the patch
and made the obvious little change:

-#include <linux/dobject.h>
+#include <linux/kobject.h>

But:


kjournald/1119[CPU#1]: BUG in debug_rt_mutex_unlock at
kernel/rtmutex-debug.c:471

Call Trace:
<ffffffff804875c3>{_raw_spin_lock_irqsave+34}
<ffffffff8022cc37>{__WARN_ON+105}
<ffffffff8022cbf2>{__WARN_ON+36}
<ffffffff80248933>{debug_rt_mutex_unlock+204}
<ffffffff8048678f>{rt_lock_slowunlock+30}
<ffffffff80487306>{__lock_text_start+14}
<ffffffff8027942d>{kmem_cache_alloc+207}
<ffffffff8025b4d0>{mempool_alloc_slab+22}
<ffffffff8025b8bf>{mempool_alloc+80}
<ffffffff80209a6d>{mcount+45}
<ffffffff802832fe>{bio_alloc_bioset+40}
<ffffffff802833e6>{bio_clone+34}
<ffffffff803eb841>{make_request+1360}
<ffffffff804867b7>{rt_lock_slowunlock+70}
<ffffffff802fbbe8>{generic_make_request+380}
<ffffffff802fd301>{submit_bio+197}
<ffffffff8027fa71>{submit_bh+267}
<ffffffff80280c9f>{ll_rw_block+166}
<ffffffff802d19d7>{journal_commit_transaction+1016}
<ffffffff804856c8>{thread_return+230}
<ffffffff804856c8>{thread_return+230}
<ffffffff80248f5d>{constant_test_bit+9}
<ffffffff80487ad0>{_raw_spin_unlock+51}
<ffffffff804867b7>{rt_lock_slowunlock+70}
<ffffffff804856c8>{thread_return+230}
<ffffffff80487306>{__lock_text_start+14}
<ffffffff80235870>{try_to_del_timer_sync+90}
<ffffffff804856c8>{thread_return+230}
<ffffffff802d5fba>{kjournald+207}
<ffffffff80240047>{autoremove_wake_function+0}
<ffffffff802d5eeb>{kjournald+0}
<ffffffff8023fc04>{keventd_create_kthread+0}
<ffffffff8023ff09>{kthread+224}
<ffffffff802273de>{schedule_tail+210}
<ffffffff8020ae12>{child_rip+8}
<ffffffff8023fc04>{keventd_create_kthread+0}
<ffffffff804856c8>{thread_return+230}
<ffffffff804856c8>{thread_return+230}
<ffffffff804856c8>{thread_return+230}
<ffffffff8023fe29>{kthread+0}
<ffffffff8020ae0a>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff804874df>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff80486787>] .. ( <= rt_lock_slowunlock+0x16/0x70)
.. [<ffffffff804875c3>] .... _raw_spin_lock_irqsave+0x22/0x33
.....[<ffffffff8022cbf2>] .. ( <= __WARN_ON+0x24/0x8a)

kjournald/1119[CPU#1]: BUG in debug_rt_mutex_unlock at
kernel/rtmutex-debug.c:472

Call Trace:
<ffffffff804875c3>{_raw_spin_lock_irqsave+34}
<ffffffff8022cc37>{__WARN_ON+105}
<ffffffff8022cbf2>{__WARN_ON+36}
<ffffffff802489d5>{debug_rt_mutex_unlock+366}
<ffffffff8048678f>{rt_lock_slowunlock+30}
<ffffffff80487306>{__lock_text_start+14}
<ffffffff8027942d>{kmem_cache_alloc+207}
<ffffffff8025b4d0>{mempool_alloc_slab+22}
<ffffffff8025b8bf>{mempool_alloc+80}
<ffffffff80209a6d>{mcount+45}
<ffffffff802832fe>{bio_alloc_bioset+40}
<ffffffff802833e6>{bio_clone+34}
<ffffffff803eb841>{make_request+1360}
<ffffffff804867b7>{rt_lock_slowunlock+70}
<ffffffff802fbbe8>{generic_make_request+380}
<ffffffff802fd301>{submit_bio+197}
<ffffffff8027fa71>{submit_bh+267}
<ffffffff80280c9f>{ll_rw_block+166}
<ffffffff802d19d7>{journal_commit_transaction+1016}
<ffffffff804856c8>{thread_return+230}
<ffffffff804856c8>{thread_return+230}
<ffffffff80248f5d>{constant_test_bit+9}
<ffffffff80487ad0>{_raw_spin_unlock+51}
<ffffffff804867b7>{rt_lock_slowunlock+70}
<ffffffff804856c8>{thread_return+230}
<ffffffff80487306>{__lock_text_start+14}
<ffffffff80235870>{try_to_del_timer_sync+90}
<ffffffff804856c8>{thread_return+230}
<ffffffff802d5fba>{kjournald+207}
<ffffffff80240047>{autoremove_wake_function+0}
<ffffffff802d5eeb>{kjournald+0}
<ffffffff8023fc04>{keventd_create_kthread+0}
<ffffffff8023ff09>{kthread+224}
<ffffffff802273de>{schedule_tail+210}
<ffffffff8020ae12>{child_rip+8}
<ffffffff8023fc04>{keventd_create_kthread+0}
<ffffffff804856c8>{thread_return+230}
<ffffffff804856c8>{thread_return+230}
<ffffffff804856c8>{thread_return+230}
<ffffffff8023fe29>{kthread+0}
<ffffffff8020ae0a>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff804874df>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff80486787>] .. ( <= rt_lock_slowunlock+0x16/0x70)
.. [<ffffffff804875c3>] .... _raw_spin_lock_irqsave+0x22/0x33
.....[<ffffffff8022cbf2>] .. ( <= __WARN_ON+0x24/0x8a)

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at kernel/rtmutex.c:639
invalid opcode: 0000 [1] PREEMPT SMP
CPU 1
Modules linked in: nfsd exportfs lockd sunrpc ohci1394 ieee1394 tg3
Pid: 1119, comm: kjournald Not tainted 2.6.17-rt8_local_02 #8
RIP: 0010:[<ffffffff80486908>] <ffffffff80486908>{rt_lock_slowlock+186}
RSP: 0018:ffff8105ea93b9e8 EFLAGS: 00010246
RAX: ffff810200220db0 RBX: 0000000000000010 RCX: 00000000000c0080
RDX: ffff810200220db0 RSI: ffffffff802793b2 RDI: ffff810200115ca0
RBP: ffff8105ea93baa8 R08: ffff8100011e5650 R09: ffff8105ea93b9e8
R10: ffff8100011e5650 R11: 0000000000000023 R12: ffff810200115ca0
R13: ffff810200031c40 R14: ffffffff802793b2 R15: 0000000000000010
FS: 00002b055e056480(0000) GS:ffff8102001e3340(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000005b755c CR3: 00000005e6e0b000 CR4: 00000000000006e0
Process kjournald (pid: 1119, threadinfo ffff8105ea93a000, task
ffff810200220db0)
Stack: 111111110000008c ffff8105ea93b9f0 ffff8105ea93b9f0 ffff8105ea93ba00
ffff8105ea93ba00 0000000000000000 111111110000008c ffff8105ea93ba20
ffff8105ea93ba20 ffff8105ea93ba30
Call Trace:
<ffffffff8048731a>{rt_lock+18}
<ffffffff802793b2>{kmem_cache_alloc+84}
<ffffffff8025b4d0>{mempool_alloc_slab+22}
<ffffffff8025b8bf>{mempool_alloc+80}
<ffffffff8028336d>{bio_alloc_bioset+151}
<ffffffff802833e6>{bio_clone+34}
<ffffffff803eb841>{make_request+1360}
<ffffffff804867b7>{rt_lock_slowunlock+70}
<ffffffff802fbbe8>{generic_make_request+380}
<ffffffff802fd301>{submit_bio+197}
<ffffffff8027fa71>{submit_bh+267}
<ffffffff80280c9f>{ll_rw_block+166}
<ffffffff802d19d7>{journal_commit_transaction+1016}
<ffffffff804856c8>{thread_return+230}
<ffffffff804856c8>{thread_return+230}
<ffffffff80248f5d>{constant_test_bit+9}
<ffffffff80487ad0>{_raw_spin_unlock+51}
<ffffffff804867b7>{rt_lock_slowunlock+70}
<ffffffff804856c8>{thread_return+230}
<ffffffff80487306>{__lock_text_start+14}
<ffffffff80235870>{try_to_del_timer_sync+90}
<ffffffff804856c8>{thread_return+230}
<ffffffff802d5fba>{kjournald+207}
<ffffffff80240047>{autoremove_wake_function+0}
<ffffffff802d5eeb>{kjournald+0}
<ffffffff8023fc04>{keventd_create_kthread+0}
<ffffffff8023ff09>{kthread+224}
<ffffffff802273de>{schedule_tail+210}
<ffffffff8020ae12>{child_rip+8}
<ffffffff8023fc04>{keventd_create_kthread+0}
<ffffffff804856c8>{thread_return+230}
<ffffffff804856c8>{thread_return+230}
<ffffffff804856c8>{thread_return+230}
<ffffffff8023fe29>{kthread+0}
<ffffffff8020ae0a>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff804874df>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff80486889>] .. ( <= rt_lock_slowlock+0x3b/0x213)
.. [<ffffffff80487898>] .... _raw_spin_trylock+0x1b/0x5f
.....[<ffffffff80488346>] .. ( <= oops_begin+0x28/0x77)


Code: 0f 0b 68 70 44 4c 80 c2 7f 02 65 48 8b 04 25 00 00 00 00 41
RIP <ffffffff80486908>{rt_lock_slowlock+186} RSP <ffff8105ea93b9e8>

Triggered by building a kernel.

--
Robert Crocombe
[email protected]

2006-08-08 19:07:04

by Steven Rostedt

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8


On Tue, 8 Aug 2006, Robert Crocombe wrote:

> On 8/7/06, hui Bill Huey <[email protected]> wrote:
> > > I tested it with a "make -j4" which triggers the warning and it they all
> > > go away now.
> > >
> > > Reverse patch attached:
>
> Unfortunately, this makes no difference on my setup. Patch applied,
> made the obvious little change:
>
> -#include <linux/dobject.h>
> +#include <linux/kobject.h>
>
> But:
>
>
> kjournald/1119[CPU#1]: BUG in debug_rt_mutex_unlock at
> kernel/rtmutex-debug.c:471
>

Robert,

How far back do you get this bug? I mean if you go back and test the
previous kernels, where did you start seeing this?

Thanks,

-- Steve

2006-08-08 21:35:28

by Robert Crocombe

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On 8/8/06, Steven Rostedt <[email protected]> wrote:
> How far back do you get this bug? I mean if you go back and test the
> previous kernels, where did you start seeing this?

I have so far been unable to provoke the problem under 2.6.16-rt29.

I was able to provoke things once easily under 17-rt1, but subsequent
attempts to trigger the bug have so far yielded no results.

I'm back to poking at 16-rt29 to see if the problem is simply somewhat
less likely.

--
Robert Crocombe
[email protected]

2006-08-08 21:44:37

by Steven Rostedt

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On Tue, 8 Aug 2006, Robert Crocombe wrote:

> On 8/8/06, Steven Rostedt <[email protected]> wrote:
> > How far back do you get this bug? I mean if you go back and test the
> > previous kernels, where did you start seeing this?
>
> I have so far been unable to provoke the problem under 2.6.16-rt29.
>
> I was able to provoke things once easily under 17-rt1, but subsequent
> attempts to trigger the bug have so far yielded no results.
>
> I'm back to poking at 16-rt29 to see if the problem is simply somewhat
> less likely.

Is it easy to provoke the problem with -rt8? I just want to know if -rt8
has caused the problem to become more prevalent.

-- Steve

2006-08-08 22:10:43

by Robert Crocombe

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On 8/8/06, Steven Rostedt <[email protected]> wrote:
> Is it easy to poke the problem with -rt8? Just want to know if -rt8 has
> caused the problem to be more prevalent.

Yes. It has been trivial to cause the problem under -rt7 and -rt8
(the upgrade from 7 was in hopes that -rt8 didn't have the problem).

--
Robert Crocombe
[email protected]

2006-08-09 00:36:12

by Bill Huey

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On Tue, Aug 08, 2006 at 11:46:40AM -0700, Robert Crocombe wrote:
> Unfortunately, this makes no difference on my setup. Patch applied,
> made the obvious little change:
...
> kjournald/1119[CPU#1]: BUG in debug_rt_mutex_unlock at
> kernel/rtmutex-debug.c:471
>
> Call Trace:
> <ffffffff804875c3>{_raw_spin_lock_irqsave+34}
> <ffffffff8022cc37>{__WARN_ON+105}
> <ffffffff8022cbf2>{__WARN_ON+36}
> <ffffffff80248933>{debug_rt_mutex_unlock+204}
...

I'll get on this as well. At least one problem was fixed with the previous
patch.

bill


2006-08-09 17:19:40

by Robert Crocombe

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On 8/8/06, Robert Crocombe <[email protected]> wrote:
> I'm back to poking at 16-rt29 to see if the problem is simply somewhat
> less likely.

Unable to crash the 2.6.16-rt29 kernel after 462 kernel compilations.
But I got about 40 of these warnings:

irq 106: nobody cared (try booting with the "irqpoll" option)

Call Trace:
<ffffffff88003c32>{:tg3:tg3_interrupt_tagged+61}
<ffffffff801523a0>{__report_bad_irq+56}
<ffffffff801525ce>{note_interrupt+479}
<ffffffff80152287>{do_irqd+514}
<ffffffff80152085>{do_irqd+0}
<ffffffff8013f454>{keventd_create_kthread+0}
<ffffffff8013f72b>{kthread+219}
<ffffffff8012719c>{schedule_tail+198}
<ffffffff8010bc6e>{child_rip+8}
<ffffffff8013f454>{keventd_create_kthread+0}
<ffffffff8013f650>{kthread+0}
<ffffffff8010bc66>{child_rip+0}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<ffffffff80373aca>] .... _raw_spin_lock_irq+0x17/0x24
.....[<ffffffff8015226e>] .. ( <= do_irqd+0x1e9/0x2e3)

handlers:
[<ffffffff88003bf5>] (tg3_interrupt_tagged+0x0/0x11b [tg3])

--
Robert Crocombe
[email protected]

2006-08-09 22:06:05

by Esben Nielsen

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8



On Mon, 7 Aug 2006, Bill Huey wrote:

> On Mon, Aug 07, 2006 at 07:56:15PM -0700, Bill Huey wrote:
>> On Thu, Aug 03, 2006 at 10:27:41AM -0400, Steven Rostedt wrote:
>>
>> ...(output and commentary a log deleted)...
>>
>>> This could also have a side effect that messes things up.
>>>
>>> Unfortunately, right now I'm assigned to other tasks and I cant spend
>>> much more time on this at the moment. So hopefully, Ingo, Thomas or
>>> Bill, or someone else can help you find the reason for this problem.
>>
>> Steve and company,
>>
>> Speaking of which, after talking to Steve about this and confirming this
>> with a revert of changes. put_task_struct() can't deallocated memory from
>> either the zone or SLAB cache without taking a sleeping lock. It can't
>> be called directly from finish_task_switch to reap the thread because of
>> that (violation in atomic).
>>
>> It is for this reason the RCU call back to delay processing was put into
>> place to reap threads and was, seemingly by accident, missing from
>> patch-2.6.17-rt7 to -rt8. That is what broke it in the first place.
>>
>> I tested it with a "make -j4" which triggers the warning and it they all
>> go away now.
>>
>> Reverse patch attached:
>
> Resend with instrumentation code removed:
>
> bill
>
>

I had a long discussion with Paul McKenney about this. I opposed the patch
from a latency point of view: suddenly a high-priority RT task could be
forced into releasing a task_struct. It would be better for latencies to
defer it to a low-priority task.

The conclusion we ended up with was that this is not a job for the RCU
system; it ought to be deferred to some other low-priority task that
frees the task_struct.

Esben

2006-08-10 00:01:24

by Steven Rostedt

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8


On Thu, 10 Aug 2006, Esben Nielsen wrote:

>
> I had a long discussion with Paul McKenney about this. I opposed the patch
> from a latency point of view: Suddenly a high-priority RT task could be
> made into releasing a task_struct. It would be better for latencies to
> defer it to a low priority task.
>
> The conclusion we ended up with was that it is not a job for the RCU
> system, but it ought to be deferred to some other low priority task to
> free the task_struct.
>

Fair enough. But by removing the rcu code that did the dirty work, we
ended up breaking the code completely. So although the rcu approach is
not the right method, it needs to stay there until we have a better
solution. This solution is better than what is in -rt8 right now; that is,
-rt8 is broken without it!

-- Steve

2006-08-10 02:21:59

by Bill Huey

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On Thu, Aug 10, 2006 at 12:05:57AM +0200, Esben Nielsen wrote:
> I had a long discussion with Paul McKenney about this. I opposed the patch
> from a latency point of view: Suddenly a high-priority RT task could be
> made into releasing a task_struct. It would be better for latencies to
> defer it to a low priority task.
>
> The conclusion we ended up with was that it is not a job for the RCU
> system, but it ought to be deferred to some other low priority task to
> free the task_struct.

I agree. It's just a hack to keep it from crashing for the time being. It
really should be done in another facility, or by utilizing another
threading context.

bill

2006-08-11 01:06:56

by Bill Huey

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On Wed, Aug 09, 2006 at 07:18:35PM -0700, Bill Huey wrote:
> On Thu, Aug 10, 2006 at 12:05:57AM +0200, Esben Nielsen wrote:
> > I had a long discussion with Paul McKenney about this. I opposed the patch
> > from a latency point of view: Suddenly a high-priority RT task could be
> > made into releasing a task_struct. It would be better for latencies to
> > defer it to a low priority task.
> >
> > The conclusion we ended up with was that it is not a job for the RCU
> > system, but it ought to be deferred to some other low priority task to
> > free the task_struct.
>
> I agree. It's just hack to get it not to crash at this time. It really
> should be done in another facility or utilizing another threading context.

Esben and company,

This is the second round of getting rid of the locking problems with free_task().

It extends the mmdrop logic with desched_thread() so that it also handles
free_task() requests. I believe this addresses your concerns, and I'm open
to review of this patch.
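
The general shape is the one below (a rough sketch with hypothetical names;
the real patch reuses the existing desched_thread machinery, and under -rt
the queue would need a non-sleeping lock or per-CPU lists rather than the
plain spinlock shown here):

/* Tasks whose final free must happen in a schedulable context. */
static LIST_HEAD(deferred_free_list);
static DEFINE_SPINLOCK(deferred_free_lock);
static struct task_struct *deferred_free_thread;

/* Called from the non-preemptible path where free_task() used to be called. */
static void queue_free_task(struct task_struct *tsk)
{
        unsigned long flags;

        spin_lock_irqsave(&deferred_free_lock, flags);
        list_add_tail(&tsk->deferred_free, &deferred_free_list); /* hypothetical field */
        spin_unlock_irqrestore(&deferred_free_lock, flags);
        wake_up_process(deferred_free_thread);
}

/* Low-priority kthread that does the actual freeing. */
static int deferred_free_fn(void *unused)
{
        while (!kthread_should_stop()) {
                struct task_struct *tsk = NULL;
                unsigned long flags;

                set_current_state(TASK_INTERRUPTIBLE);

                spin_lock_irqsave(&deferred_free_lock, flags);
                if (!list_empty(&deferred_free_list)) {
                        tsk = list_entry(deferred_free_list.next,
                                         struct task_struct, deferred_free);
                        list_del(&tsk->deferred_free);
                }
                spin_unlock_irqrestore(&deferred_free_lock, flags);

                if (tsk) {
                        __set_current_state(TASK_RUNNING);
                        free_task(tsk);         /* may take sleeping locks here */
                } else {
                        schedule();             /* until queue_free_task() wakes us */
                }
        }
        return 0;
}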

Patch included:

bill


Attachments:
(No filename) (1.00 kB)
t2.diff (5.74 kB)
Download all attachments

2006-08-11 07:47:40

by Bill Huey

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On Tue, Aug 08, 2006 at 11:46:40AM -0700, Robert Crocombe wrote:
> Unfortunately, this makes no difference on my setup. Patch applied,
> made the obvious little change:
>
> -#include <linux/dobject.h>
> +#include <linux/kobject.h>
>
> But:

[...stack trace deleted...]

I'm having a difficult time triggering your bug. Can you post your .config?

bill

2006-08-11 08:17:16

by Esben Nielsen

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8



On Thu, 10 Aug 2006, Bill Huey wrote:

> On Wed, Aug 09, 2006 at 07:18:35PM -0700, Bill Huey wrote:
>> On Thu, Aug 10, 2006 at 12:05:57AM +0200, Esben Nielsen wrote:
>>> I had a long discussion with Paul McKenney about this. I opposed the patch
>>> from a latency point of view: Suddenly a high-priority RT task could be
>>> made into releasing a task_struct. It would be better for latencies to
>>> defer it to a low priority task.
>>>
>>> The conclusion we ended up with was that it is not a job for the RCU
>>> system, but it ought to be deferred to some other low priority task to
>>> free the task_struct.
>>
>> I agree. It's just hack to get it not to crash at this time. It really
>> should be done in another facility or utilizing another threading context.
>
> Esben and company,
>
> This is the second round of getting rid of the locking problems with free_task()
>
> This extends the mmdrop logic with desched_thread() to also handle free_task()
> requests as well. I believe this address your concerns and I'm open to review
> of this patch.
>
> Patch included:
>
> bill
>
>
Without applying the patch, and only skimming it, it looks like what Paul
and I concluded :-)

But is there really no generic way of deferring this kind of thing? It
looks like a hell of a lot of work where a kind of "message" infrastructure
could have solved it in a few lines.

Esben

2006-08-11 08:46:20

by Bill Huey

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On Fri, Aug 11, 2006 at 10:16:56AM +0200, Esben Nielsen wrote:
> On Thu, 10 Aug 2006, Bill Huey wrote:
...
> >This extends the mmdrop logic with desched_thread() to also handle
> >free_task() requests as well. I believe this address your concerns and
> >I'm open to review of this patch.

> Without applying the patch and only skimming it it looks like what Paul
> and I concluded :-)
>
> But is there really no generic way of defering this kind of thing? It
> looks like a hell of a lot work where a kind of "message" infrastructure
> could have solved it in a few lines.

You can composite it out of generic kernel APIs to fake it, or special-case
these things so it behaves like a message. Doing it this way also
cleaned up the code a bit. Messaging also wouldn't have directly solved
the per-CPU aspects of this reaping, so this was the right facility to
use. The only question left in my mind is the proper choice of priority
for that thread.

Another note: the schedule() path is shortened by this as well, since this
deallocation and reaping work (calling into the mm system) used to be done
with preemption turned off, inline with the child itself exiting. This patch
should help shorten the maximal schedule path. It's one of the last
remaining long paths like this in the kernel. It seems to be an all-around
win if I'm not mistaken.

bill

2006-08-11 14:52:45

by Robert Crocombe

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On 8/11/06, hui Bill Huey <[email protected]> wrote:
> I'm have a difficult time triggering your bug. Can you post your .config ?

--
Robert Crocombe
[email protected]


Attachments:
(No filename) (176.00 B)
config (27.11 kB)
Download all attachments

2006-08-11 15:00:57

by Robert Crocombe

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On 8/10/06, hui Bill Huey <[email protected]> wrote:
> This is the second round of getting rid of the locking problems with free_task()
>
> This extends the mmdrop logic with desched_thread() to also handle free_task()
> requests as well. I believe this address your concerns and I'm open to review
> of this patch.
>
> Patch included:

Just tried it.

--
Robert Crocombe
[email protected]

kjournald/1119[CPU#3]: BUG in debug_rt_mutex_unlock at
kernel/rtmutex-debug.c:471

Call Trace:
<ffffffff80487593>{_raw_spin_lock_irqsave+34}
<ffffffff8022cd9f>{__WARN_ON+105}
<ffffffff8022cd5a>{__WARN_ON+36}
<ffffffff8024897b>{debug_rt_mutex_unlock+204}
<ffffffff80486761>{rt_lock_slowunlock+30}
<ffffffff804872d6>{__lock_text_start+14}
<ffffffff80279461>{kmem_cache_alloc+207}
<ffffffff8025b504>{mempool_alloc_slab+22}
<ffffffff8025b8f3>{mempool_alloc+80}
<ffffffff80209a6d>{mcount+45}
<ffffffff80283332>{bio_alloc_bioset+40}
<ffffffff80283452>{bio_alloc+21}
<ffffffff8027fa28>{submit_bh+142}
<ffffffff80280cd3>{ll_rw_block+166}
<ffffffff802d19cf>{journal_commit_transaction+1016}
<ffffffff80248fa5>{constant_test_bit+9}
<ffffffff80487aa0>{_raw_spin_unlock+51}
<ffffffff80486789>{rt_lock_slowunlock+70}
<ffffffff804872d6>{__lock_text_start+14}
<ffffffff80235974>{try_to_del_timer_sync+90}
<ffffffff802d5fb2>{kjournald+207}
<ffffffff80240107>{autoremove_wake_function+0}
<ffffffff802d5ee3>{kjournald+0}
<ffffffff8023fcc5>{keventd_create_kthread+0}
<ffffffff8023ffca>{kthread+224}
<ffffffff802273bf>{schedule_tail+198}
<ffffffff8020ae12>{child_rip+8}
<ffffffff8023fcc5>{keventd_create_kthread+0}
<ffffffff8023feea>{kthread+0}
<ffffffff8020ae0a>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff804874af>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff80486759>] .. ( <= rt_lock_slowunlock+0x16/0x70)
.. [<ffffffff80487593>] .... _raw_spin_lock_irqsave+0x22/0x33
.....[<ffffffff8022cd5a>] .. ( <= __WARN_ON+0x24/0x8a)

kjournald/1119[CPU#3]: BUG in debug_rt_mutex_unlock at
kernel/rtmutex-debug.c:472

Call Trace:
<ffffffff80487593>{_raw_spin_lock_irqsave+34}
<ffffffff8022cd9f>{__WARN_ON+105}
<ffffffff8022cd5a>{__WARN_ON+36}
<ffffffff80248a1d>{debug_rt_mutex_unlock+366}
<ffffffff80486761>{rt_lock_slowunlock+30}
<ffffffff804872d6>{__lock_text_start+14}
<ffffffff80279461>{kmem_cache_alloc+207}
<ffffffff8025b504>{mempool_alloc_slab+22}
<ffffffff8025b8f3>{mempool_alloc+80}
<ffffffff80209a6d>{mcount+45}
<ffffffff80283332>{bio_alloc_bioset+40}
<ffffffff80283452>{bio_alloc+21}
<ffffffff8027fa28>{submit_bh+142}
<ffffffff80280cd3>{ll_rw_block+166}
<ffffffff802d19cf>{journal_commit_transaction+1016}
<ffffffff80248fa5>{constant_test_bit+9}
<ffffffff80487aa0>{_raw_spin_unlock+51}
<ffffffff80486789>{rt_lock_slowunlock+70}
<ffffffff804872d6>{__lock_text_start+14}
<ffffffff80235974>{try_to_del_timer_sync+90}
<ffffffff802d5fb2>{kjournald+207}
<ffffffff80240107>{autoremove_wake_function+0}
<ffffffff802d5ee3>{kjournald+0}
<ffffffff8023fcc5>{keventd_create_kthread+0}
<ffffffff8023ffca>{kthread+224}
<ffffffff802273bf>{schedule_tail+198}
<ffffffff8020ae12>{child_rip+8}
<ffffffff8023fcc5>{keventd_create_kthread+0}
<ffffffff8023feea>{kthread+0}
<ffffffff8020ae0a>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff804874af>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff80486759>] .. ( <= rt_lock_slowunlock+0x16/0x70)
.. [<ffffffff80487593>] .... _raw_spin_lock_irqsave+0x22/0x33
.....[<ffffffff8022cd5a>] .. ( <= __WARN_ON+0x24/0x8a)

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at kernel/rtmutex.c:639
invalid opcode: 0000 [1] PREEMPT SMP
CPU 3
Modules linked in: ohci1394 ieee1394 tg3
Pid: 1119, comm: kjournald Not tainted 2.6.17-rt8_t2_00 #2
RIP: 0010:[<ffffffff804868da>] <ffffffff804868da>{rt_lock_slowlock+186}
RSP: 0000:ffff8107eacbdb38 EFLAGS: 00010246
RAX: ffff8105eade7100 RBX: 0000000000000010 RCX: 0000000000240180
RDX: ffff8105eade7100 RSI: ffffffff802793e6 RDI: ffff810600115ca0
RBP: ffff8107eacbdbf8 R08: ffff8102000d7f38 R09: ffff8107eacbdb38
R10: ffff8102000d7f38 R11: 0000000000000023 R12: ffff810600115ca0
R13: ffff8101eae158c0 R14: ffffffff802793e6 R15: 0000000000000010
FS: 00002ab5934811e0(0000) GS:ffff810600211340(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002ab5969c1000 CR3: 00000005e69e4000 CR4: 00000000000006e0
Process kjournald (pid: 1119, threadinfo ffff8107eacbc000, task
ffff8105eade7100)
Stack: 111111110000008c ffff8107eacbdb40 ffff8107eacbdb40 ffff8107eacbdb50
ffff8107eacbdb50 0000000000000000 111111110000008c ffff8107eacbdb70
ffff8107eacbdb70 ffff8107eacbdb80
Call Trace:
<ffffffff804872ea>{rt_lock+18}
<ffffffff802793e6>{kmem_cache_alloc+84}
<ffffffff8025b504>{mempool_alloc_slab+22}
<ffffffff8025b8f3>{mempool_alloc+80}
<ffffffff802833a1>{bio_alloc_bioset+151}
<ffffffff80283452>{bio_alloc+21}
<ffffffff8027fa28>{submit_bh+142}
<ffffffff80280cd3>{ll_rw_block+166}
<ffffffff802d19cf>{journal_commit_transaction+1016}
<ffffffff80248fa5>{constant_test_bit+9}
<ffffffff80487aa0>{_raw_spin_unlock+51}
<ffffffff80486789>{rt_lock_slowunlock+70}
<ffffffff804872d6>{__lock_text_start+14}
<ffffffff80235974>{try_to_del_timer_sync+90}
<ffffffff802d5fb2>{kjournald+207}
<ffffffff80240107>{autoremove_wake_function+0}
<ffffffff802d5ee3>{kjournald+0}
<ffffffff8023fcc5>{keventd_create_kthread+0}
<ffffffff8023ffca>{kthread+224}
<ffffffff802273bf>{schedule_tail+198}
<ffffffff8020ae12>{child_rip+8}
<ffffffff8023fcc5>{keventd_create_kthread+0}
<ffffffff8023feea>{kthread+0}
<ffffffff8020ae0a>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff804874af>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff8048685b>] .. ( <= rt_lock_slowlock+0x3b/0x213)
.. [<ffffffff80487868>] .... _raw_spin_trylock+0x1b/0x5f
.....[<ffffffff80488316>] .. ( <= oops_begin+0x28/0x77)


Code: 0f 0b 68 70 44 4c 80 c2 7f 02 65 48 8b 04 25 00 00 00 00 41
RIP <ffffffff804868da>{rt_lock_slowlock+186} RSP <ffff8107eacbdb38>

2006-08-11 21:20:47

by Bill Huey

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On Fri, Aug 11, 2006 at 08:00:50AM -0700, Robert Crocombe wrote:
> On 8/10/06, hui Bill Huey <[email protected]> wrote:
> >Patch included:
>
> Just tried it.

Yeah, this bug is different from the one I've been fixing. Looks like
it's in the softirq system. Let me see.

I'll look through your config and see if anything notable differs from my
setup.

bill


> kjournald/1119[CPU#3]: BUG in debug_rt_mutex_unlock at
> kernel/rtmutex-debug.c:471
>
> Call Trace:
> <ffffffff80487593>{_raw_spin_lock_irqsave+34}
> <ffffffff8022cd9f>{__WARN_ON+105}
> <ffffffff8022cd5a>{__WARN_ON+36}
> <ffffffff8024897b>{debug_rt_mutex_unlock+204}
> <ffffffff80486761>{rt_lock_slowunlock+30}
> <ffffffff804872d6>{__lock_text_start+14}
> <ffffffff80279461>{kmem_cache_alloc+207}
> <ffffffff8025b504>{mempool_alloc_slab+22}
> <ffffffff8025b8f3>{mempool_alloc+80}
> <ffffffff80209a6d>{mcount+45}
> <ffffffff80283332>{bio_alloc_bioset+40}
> <ffffffff80283452>{bio_alloc+21}
> <ffffffff8027fa28>{submit_bh+142}
> <ffffffff80280cd3>{ll_rw_block+166}
> <ffffffff802d19cf>{journal_commit_transaction+1016}
> <ffffffff80248fa5>{constant_test_bit+9}
> <ffffffff80487aa0>{_raw_spin_unlock+51}
> <ffffffff80486789>{rt_lock_slowunlock+70}
> <ffffffff804872d6>{__lock_text_start+14}
> <ffffffff80235974>{try_to_del_timer_sync+90}
> <ffffffff802d5fb2>{kjournald+207}
> <ffffffff80240107>{autoremove_wake_function+0}
> <ffffffff802d5ee3>{kjournald+0}
> <ffffffff8023fcc5>{keventd_create_kthread+0}
> <ffffffff8023ffca>{kthread+224}
> <ffffffff802273bf>{schedule_tail+198}
> <ffffffff8020ae12>{child_rip+8}
> <ffffffff8023fcc5>{keventd_create_kthread+0}
> <ffffffff8023feea>{kthread+0}
> <ffffffff8020ae0a>{child_rip+0}


2006-08-14 17:56:46

by Robert Crocombe

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

I assume that replies were trimmed by accident, so I have resplattered everyone.

On 8/11/06, hui Bill Huey <[email protected]> wrote:
> Can you try it without RAID ? do you have a RAID set up ? You have tons
> of options turned on in your .config, if you can isolate the problematic
> system that, would be helpful to me since I don't have a lot of hardware.

Boo, I thought my config was pretty concise :(. Okay, I really
chopped it down this time.

And yeah, it is a RAID config. But for extra bonus points, I found a
spare SCSI disk, installed Fedora Core 5, and did a 'yum upgrade' to
whatever was current as of today. So it's a single-disk config now.
The problem still occurs with the 't2'-patched kernel.

config and dmesg attached, kaboom-like stuff appended.

--
Robert Crocombe
[email protected]

BUG: scheduling while atomic: cc1/0x00000001/6787

Call Trace:
<ffffffff8024ff01>{stop_trace+9}
<ffffffff803f007a>{__schedule+165}
<ffffffff80209c99>{mcount+45}
<ffffffff80209c99>{mcount+45}
<ffffffff80249758>{atomic_dec_and_spin_lock+26}
<ffffffff803f0e2e>{schedule+246}
<ffffffff803f1dac>{rt_lock_slowlock+356}
<ffffffff803f26fa>{rt_lock+18}
<ffffffff80249758>{atomic_dec_and_spin_lock+26}
<ffffffff80237396>{free_uid+90}
<ffffffff8022b7dc>{__put_task_struct+147}
<ffffffff803f0b62>{thread_return+283}
<ffffffff80209c99>{mcount+45}
<ffffffff803f1ce6>{rt_lock_slowlock+158}
<ffffffff803f2780>{rt_read_lock+76}
<ffffffff803f0c80>{preempt_schedule+90}
<ffffffff803f2e1a>{_raw_spin_unlock+60}
<ffffffff803f1ce6>{rt_lock_slowlock+158}
<ffffffff803f2778>{rt_read_lock+68}
<ffffffff803f26fa>{rt_lock+18}
<ffffffff803f2780>{rt_read_lock+76}
<ffffffff80256f06>{find_get_page+40}
<ffffffff8025778b>{do_generic_mapping_read+371}
<ffffffff802596cb>{file_read_actor+0}
<ffffffff802582c0>{__generic_file_aio_read+344}
<ffffffff8025834e>{generic_file_aio_read+61}
<ffffffff8027c1d0>{do_sync_read+245}
<ffffffff80240c5f>{autoremove_wake_function+0}
<ffffffff80209c99>{mcount+45}
<ffffffff8027cb36>{vfs_read+176}
<ffffffff8027cf6c>{sys_read+76}
<ffffffff80209e27>{system_call+239}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f0092>] .... __schedule+0xbd/0xa72
.....[<ffffffff803f0c80>] .. ( <= preempt_schedule+0x5a/0x90)

kjournald/1061[CPU#3]: BUG in debug_rt_mutex_unlock at
kernel/rtmutex-debug.c:471

Call Trace:
<ffffffff803f29a3>{_raw_spin_lock_irqsave+34}
<ffffffff8022db19>{__WARN_ON+105}
<ffffffff8022dad4>{__WARN_ON+36}
<ffffffff802494a7>{debug_rt_mutex_unlock+204}
<ffffffff803f1b89>{rt_lock_slowunlock+30}
<ffffffff803f26e6>{__lock_text_start+14}
<ffffffff80278660>{kmem_cache_alloc+207}
<ffffffff80259cc8>{mempool_alloc_slab+22}
<ffffffff8025a0b7>{mempool_alloc+80}
<ffffffff80209c99>{mcount+45}
<ffffffff802817e2>{bio_alloc_bioset+40}
<ffffffff80281902>{bio_alloc+21}
<ffffffff8027ded8>{submit_bh+142}
<ffffffff8027f183>{ll_rw_block+166}
<ffffffff802cefae>{journal_commit_transaction+995}
<ffffffff803f2e11>{_raw_spin_unlock+51}
<ffffffff803f1bb1>{rt_lock_slowunlock+70}
<ffffffff803f0b2d>{thread_return+230}
<ffffffff803f26e6>{__lock_text_start+14}
<ffffffff802365a1>{try_to_del_timer_sync+90}
<ffffffff803f0b2d>{thread_return+230}
<ffffffff802d351d>{kjournald+204}
<ffffffff80240c5f>{autoremove_wake_function+0}
<ffffffff802d3451>{kjournald+0}
<ffffffff80240831>{keventd_create_kthread+0}
<ffffffff80240b1f>{kthread+224}
<ffffffff80228e70>{schedule_tail+190}
<ffffffff8020b03e>{child_rip+8}
<ffffffff80240831>{keventd_create_kthread+0}
<ffffffff80240a3f>{kthread+0}
<ffffffff8020b036>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f28bf>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff803f1b81>] .. ( <= rt_lock_slowunlock+0x16/0x70)
.. [<ffffffff803f29a3>] .... _raw_spin_lock_irqsave+0x22/0x33
.....[<ffffffff8022dad4>] .. ( <= __WARN_ON+0x24/0x88)

kjournald/1061[CPU#3]: BUG in debug_rt_mutex_unlock at
kernel/rtmutex-debug.c:472

Call Trace:
<ffffffff803f29a3>{_raw_spin_lock_irqsave+34}
<ffffffff8022db19>{__WARN_ON+105}
<ffffffff8022dad4>{__WARN_ON+36}
<ffffffff80249549>{debug_rt_mutex_unlock+366}
<ffffffff803f1b89>{rt_lock_slowunlock+30}
<ffffffff803f26e6>{__lock_text_start+14}
<ffffffff80278660>{kmem_cache_alloc+207}
<ffffffff80259cc8>{mempool_alloc_slab+22}
<ffffffff8025a0b7>{mempool_alloc+80}
<ffffffff80209c99>{mcount+45}
<ffffffff802817e2>{bio_alloc_bioset+40}
<ffffffff80281902>{bio_alloc+21}
<ffffffff8027ded8>{submit_bh+142}
<ffffffff8027f183>{ll_rw_block+166}
<ffffffff802cefae>{journal_commit_transaction+995}
<ffffffff803f2e11>{_raw_spin_unlock+51}
<ffffffff803f1bb1>{rt_lock_slowunlock+70}
<ffffffff803f0b2d>{thread_return+230}
<ffffffff803f26e6>{__lock_text_start+14}
<ffffffff802365a1>{try_to_del_timer_sync+90}
<ffffffff803f0b2d>{thread_return+230}
<ffffffff802d351d>{kjournald+204}
<ffffffff80240c5f>{autoremove_wake_function+0}
<ffffffff802d3451>{kjournald+0}
<ffffffff80240831>{keventd_create_kthread+0}
<ffffffff80240b1f>{kthread+224}
<ffffffff80228e70>{schedule_tail+190}
<ffffffff8020b03e>{child_rip+8}
<ffffffff80240831>{keventd_create_kthread+0}
<ffffffff80240a3f>{kthread+0}
<ffffffff8020b036>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f28bf>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff803f1b81>] .. ( <= rt_lock_slowunlock+0x16/0x70)
.. [<ffffffff803f29a3>] .... _raw_spin_lock_irqsave+0x22/0x33
.....[<ffffffff8022dad4>] .. ( <= __WARN_ON+0x24/0x88)

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at kernel/rtmutex.c:639
invalid opcode: 0000 [1] PREEMPT SMP
CPU 3
Modules linked in: tg3
Pid: 1061, comm: kjournald Not tainted 2.6.17-rt8_t2_00 #1
RIP: 0010:[<ffffffff803f1d02>] <ffffffff803f1d02>{rt_lock_slowlock+186}
RSP: 0018:ffff81040022bb98 EFLAGS: 00010246
RAX: ffff8102001c6000 RBX: 0000000000000010 RCX: 0000000000240180
RDX: ffff8102001c6000 RSI: ffffffff802785e5 RDI: ffff810600115b00
RBP: ffff81040022bc58 R08: ffff8104000d7dd8 R09: ffff81040022bb98
R10: ffff8104000d7dd8 R11: 0000000000000023 R12: ffff810600115b00
R13: ffff8101eafabb80 R14: ffffffff802785e5 R15: 0000000000000010
FS: 00002b8c5ddbd6f0(0000) GS:ffff8106001eb340(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002b8c5dc07bf0 CR3: 0000000000201000 CR4: 00000000000006e0
Process kjournald (pid: 1061, threadinfo ffff81040022a000, task
ffff8102001c6000)
Stack: 111111110000008c ffff81040022bba0 ffff81040022bba0 ffff81040022bbb0
ffff81040022bbb0 0000000000000000 111111110000008c ffff81040022bbd0
ffff81040022bbd0 ffff81040022bbe0
Call Trace:
<ffffffff803f26fa>{rt_lock+18}
<ffffffff802785e5>{kmem_cache_alloc+84}
<ffffffff80259cc8>{mempool_alloc_slab+22}
<ffffffff8025a0b7>{mempool_alloc+80}
<ffffffff80281851>{bio_alloc_bioset+151}
<ffffffff80281902>{bio_alloc+21}
<ffffffff8027ded8>{submit_bh+142}
<ffffffff8027f183>{ll_rw_block+166}
<ffffffff802cefae>{journal_commit_transaction+995}
<ffffffff803f2e11>{_raw_spin_unlock+51}
<ffffffff803f1bb1>{rt_lock_slowunlock+70}
<ffffffff803f0b2d>{thread_return+230}
<ffffffff803f26e6>{__lock_text_start+14}
<ffffffff802365a1>{try_to_del_timer_sync+90}
<ffffffff803f0b2d>{thread_return+230}
<ffffffff802d351d>{kjournald+204}
<ffffffff80240c5f>{autoremove_wake_function+0}
<ffffffff802d3451>{kjournald+0}
<ffffffff80240831>{keventd_create_kthread+0}
<ffffffff80240b1f>{kthread+224}
<ffffffff80228e70>{schedule_tail+190}
<ffffffff8020b03e>{child_rip+8}
<ffffffff80240831>{keventd_create_kthread+0}
<ffffffff80240a3f>{kthread+0}
<ffffffff8020b036>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f28bf>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff803f1c83>] .. ( <= rt_lock_slowlock+0x3b/0x213)
.. [<ffffffff803f2bd9>] .... _raw_spin_trylock+0x1b/0x5f
.....[<ffffffff8020b484>] .. ( <= oops_begin+0x28/0x77)


Code: 0f 0b 68 1b 3d 41 80 c2 7f 02 65 48 8b 04 25 00 00 00 00 41
RIP <ffffffff803f1d02>{rt_lock_slowlock+186} RSP <ffff81040022bb98>
<6>note: kjournald[1061] exited with preempt_count 1


Attachments:
(No filename) (8.95 kB)
2.6.17-rt8-t2_00 (20.75 kB)
dmesg_2.6.17-rt8-t2_00 (14.78 kB)
Download all attachments

2006-08-14 23:44:47

by Bill Huey

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On Mon, Aug 14, 2006 at 10:56:39AM -0700, Robert Crocombe wrote:
> I assume that replies were trimmed on accident, so I have resplattered
> everyone.
>
> On 8/11/06, hui Bill Huey <[email protected]> wrote:
> >Can you try it without RAID ? do you have a RAID set up ? You have tons
> >of options turned on in your .config, if you can isolate the problematic
> >system that, would be helpful to me since I don't have a lot of hardware.
>
> Boo, I thought my config was pretty concise :(. Okay, I really
> chopped it this time.
>
> And yeah, it is a RAID config. But for extra bonus points, I found a
> spare SCSI disk and installed Fedora Core 5 and did a 'yum upgrade' to
> whatever was current as of today. So it's a single disk config now.
> Problem still occurs with 't2' patched kernel.
>
> config and dmesg attached, kaboom-like stuff appended.

It looks like a screwy interaction between the latency tracer and the mutex
code that creates such a whacked-out-looking stack trace. Unfortunately,
I've been unsuccessful at reproducing it, so I'm going to focus on a partial
cleanup so that the rtmutex is a bit more friendly to the latency tracer.

This is kind of a pain.

You have all sorts of strange things going on in that config, like NUMA
emulation (which I can't imagine why you'd have on), a 1000-per-second
tick, etc. If you can do a bit of work and narrow down which option it
is, then I might be able to reproduce the situation over here. I'm
bothered by that bug, but it's going to need to be attacked from a couple
of angles, two of which I just mentioned. If I can get a bit of help from
you in isolating the subsystem, that would be helpful while I examine the
tracing here.

bill

2006-08-15 10:43:36

by Bill Huey

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On Mon, Aug 14, 2006 at 04:44:23PM -0700, Bill Huey wrote:
> On Mon, Aug 14, 2006 at 10:56:39AM -0700, Robert Crocombe wrote:
> > And yeah, it is a RAID config. But for extra bonus points, I found a
> > spare SCSI disk and installed Fedora Core 5 and did a 'yum upgrade' to
> > whatever was current as of today. So it's a single disk config now.
> > Problem still occurs with 't2' patched kernel.
> >
> > config and dmesg attached, kaboom-like stuff appended.
>
> It looks like a screw interaction between the latency tracer and the mutex
> code that creates such a wacked out looking stack trace. Unfortunately,
> I've been unsuccessful at reproducing it, so I'm going to focus on a partial
> clean up so that the rtmutex is a bit more friendly to the latency tracer.
>
> This is kind of a pain.

I'm high, the latency tracer is fine. I'm going to look at the mutex code more.

bill

2006-08-15 17:53:14

by Robert Crocombe

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On 8/14/06, hui Bill Huey <[email protected]> wrote:
> You have all sorts of strange things going on in that config like NUMA
> emulation and stuff which I can't imagine why you'd have that on, 1000
> a sec tick, etc... If you can do a bit of work and narrow which option
> is it, then I might be able to reproduce the situation over here. I'm
> bothered by that bug, but it's going to need to be attacked from a couple
> angles, two of which I just mentioned. If I can get a bit of help from
> you where I can isolate the subsystem, it would be helpful while I'll
> examine the tracing here.

NUMA emulation is off in the most recent config posted, and the high HZ
is due to 'man nanosleep':

BUGS
The current implementation of nanosleep() is based on the normal kernel
timer mechanism, which has a resolution of 1/HZ s (see time(7)).
Therefore, nanosleep() pauses always for at least the specified time,
however it can take up to 10 ms longer than specified until the process
becomes runnable again. For the same reason, the value returned in case
of a delivered signal in *rem is usually rounded to the next larger
multiple of 1/HZ s.

and Other People's Code (tm).
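
To make the resolution point concrete, a quick userspace check along these
lines shows the overshoot (illustrative only; link with -lrt on older glibc):

#include <stdio.h>
#include <time.h>

int main(void)
{
        struct timespec req = { 0, 1000000 };   /* ask for 1 ms */
        struct timespec t0, t1;
        double elapsed_ms;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        nanosleep(&req, NULL);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        elapsed_ms = (t1.tv_sec - t0.tv_sec) * 1e3 +
                     (t1.tv_nsec - t0.tv_nsec) / 1e6;
        /* On a tick-based kernel the sleep is rounded up to whole jiffies:
         * roughly an extra 1/HZ, i.e. about 4 ms at HZ=250 vs 1 ms at HZ=1000. */
        printf("asked for 1.000 ms, slept %.3f ms\n", elapsed_ms);
        return 0;
}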

Sadly, I am behind a firewall and cannot offer remote access to the
machine. If you can point out some other config options that are Not
Good, then I can see about pruning them.

I've turned HZ down to 250 (the only change from the last config), but
got a BUG immediately upon boot: the last bit of the boot output and the
trace are appended. The machine booted on a second attempt, but failed
when compiling a kernel, i.e. at the same point as previously; that trace
is after the fail-on-boot trace.

--
Robert Crocombe
[email protected]

fail on boot:


Total HugeTLB memory allocated, 0
io scheduler noop registered
io scheduler anticipatory registered (default)
pcie_portdrv_probe->Dev[005d:10de] has invalid IRQ. Check vendor BIOS
assign_interrupt_mode Found MSI capability
pcie_portdrv_probe->Dev[005d:10de] has invalid IRQ. Check vendor BIOS
assign_interrupt_mode Found MSI capability
pcie_portdrv_probe->Dev[005d:10de] has invalid IRQ. Check vendor BIOS
assign_interrupt_mode Found MSI capability
pcie_portdrv_probe->Dev[005d:10de] has invalid IRQ. Check vendor BIOS
assign_interrupt_mode Found MSI capability
pcie_portdrv_probe->Dev[005d:10de] has invalid IRQ. Check vendor BIOS
assign_interrupt_mode Found MSI capability
pcie_portdrv_probe->Dev[005d:10de] has invalid IRQ. Check vendor BIOS
assign_interrupt_mode Found MSI capability
swapper/1[CPU#3]: BUG in debug_rt_mutex_unlock at kernel/rtmutex-debug.c:471

Call Trace:
<ffffffff803f2b2b>{_raw_spin_lock_irqsave+34}
<ffffffff8022db21>{__WARN_ON+105}
<ffffffff8022dadc>{__WARN_ON+36}
<ffffffff802494f7>{debug_rt_mutex_unlock+204}
<ffffffff803f1d10>{rt_lock_slowunlock+30}
<ffffffff803f286e>{__lock_text_start+14}
<ffffffff80277337>{cache_grow+214}
<ffffffff80277730>{__cache_alloc_node+272}
<ffffffff802777cc>{alternate_node_alloc+127}
<ffffffff80278653>{kmem_cache_alloc+114}
<ffffffff802b8c09>{sysfs_new_dirent+43}
<ffffffff802b8f45>{sysfs_make_dirent+33}
<ffffffff802b9046>{create_dir+160}
<ffffffff802b97e5>{sysfs_create_dir+95}
<ffffffff802eb81e>{kobject_add+223}
<ffffffff8033ad25>{class_device_add+187}
<ffffffff8033afc3>{class_device_register+30}
<ffffffff8033b0c5>{class_device_create+253}
<ffffffff802eec48>{sprintf+109}
<ffffffff80209c89>{mcount+45}
<ffffffff803f0cb5>{thread_return+230}
<ffffffff80209c89>{mcount+45}
<ffffffff8031cf81>{tty_register_device+221}
<ffffffff8031e4c3>{tty_register_driver+512}
<ffffffff80f3f2a3>{pty_init+464}
<ffffffff802071cc>{init+343}
<ffffffff80228e7c>{schedule_tail+190}
<ffffffff8020b02e>{child_rip+8}
<ffffffff80207075>{init+0}
<ffffffff8020b026>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f2a47>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff803f1d08>] .. ( <= rt_lock_slowunlock+0x16/0x70)
.. [<ffffffff803f2b2b>] .... _raw_spin_lock_irqsave+0x22/0x33
.....[<ffffffff8022dadc>] .. ( <= __WARN_ON+0x24/0x88)

swapper/1[CPU#3]: BUG in debug_rt_mutex_unlock at kernel/rtmutex-debug.c:472

Call Trace:
<ffffffff803f2b2b>{_raw_spin_lock_irqsave+34}
<ffffffff8022db21>{__WARN_ON+105}
<ffffffff8022dadc>{__WARN_ON+36}
<ffffffff80249599>{debug_rt_mutex_unlock+366}
<ffffffff803f1d10>{rt_lock_slowunlock+30}
<ffffffff803f286e>{__lock_text_start+14}
<ffffffff80277337>{cache_grow+214}
<ffffffff80277730>{__cache_alloc_node+272}
<ffffffff802777cc>{alternate_node_alloc+127}
<ffffffff80278653>{kmem_cache_alloc+114}
<ffffffff802b8c09>{sysfs_new_dirent+43}
<ffffffff802b8f45>{sysfs_make_dirent+33}
<ffffffff802b9046>{create_dir+160}
<ffffffff802b97e5>{sysfs_create_dir+95}
<ffffffff802eb81e>{kobject_add+223}
<ffffffff8033ad25>{class_device_add+187}
<ffffffff8033afc3>{class_device_register+30}
<ffffffff8033b0c5>{class_device_create+253}
<ffffffff802eec48>{sprintf+109}
<ffffffff80209c89>{mcount+45}
<ffffffff803f0cb5>{thread_return+230}
<ffffffff80209c89>{mcount+45}
<ffffffff8031cf81>{tty_register_device+221}
<ffffffff8031e4c3>{tty_register_driver+512}
<ffffffff80f3f2a3>{pty_init+464}
<ffffffff802071cc>{init+343}
<ffffffff80228e7c>{schedule_tail+190}
<ffffffff8020b02e>{child_rip+8}
<ffffffff80207075>{init+0}
<ffffffff8020b026>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f2a47>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff803f1d08>] .. ( <= rt_lock_slowunlock+0x16/0x70)
.. [<ffffffff803f2b2b>] .... _raw_spin_lock_irqsave+0x22/0x33
.....[<ffffffff8022dadc>] .. ( <= __WARN_ON+0x24/0x88)

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at kernel/rtmutex.c:639
invalid opcode: 0000 [1] PREEMPT SMP
CPU 1
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.17-rt8_t2_01 #2
RIP: 0010:[<ffffffff803f1e89>] <ffffffff803f1e89>{rt_lock_slowlock+186}
RSP: 0000:ffff8101eac49978 EFLAGS: 00010246
RAX: ffff81060018a000 RBX: 0000000000000030 RCX: 00000000000c0080
RDX: ffff81060018a000 RSI: ffffffff80277592 RDI: ffff810200115b00
RBP: ffff8101eac49a38 R08: 0000000000000000 R09: ffff8101eac49978
R10: 0000000000000000 R11: 0000000000000000 R12: ffff810200115b00
R13: ffff8102003b50f0 R14: ffffffff80277592 R15: 0000000000000030
FS: 0000000000000000(0000) GS:ffff8102001d4340(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo ffff8101eac48000, task ffff81060018a000)
Stack: 111111110000008c ffff8101eac49980 ffff8101eac49980 ffff8101eac49990
ffff8101eac49990 0000000000000000 111111110000008c ffff8101eac499b0
ffff8101eac499b0 ffff8101eac499c0
Call Trace:
<ffffffff803f2882>{rt_lock+18}
<ffffffff80277592>{cache_grow+817}
<ffffffff80277730>{__cache_alloc_node+272}
<ffffffff802777cc>{alternate_node_alloc+127}
<ffffffff80278653>{kmem_cache_alloc+114}
<ffffffff802b8c09>{sysfs_new_dirent+43}
<ffffffff802b8f45>{sysfs_make_dirent+33}
<ffffffff802b9046>{create_dir+160}
<ffffffff802b97e5>{sysfs_create_dir+95}
<ffffffff802eb81e>{kobject_add+223}
<ffffffff8033ad25>{class_device_add+187}
<ffffffff8033afc3>{class_device_register+30}
<ffffffff8033b0c5>{class_device_create+253}
<ffffffff802eec48>{sprintf+109}
<ffffffff80209c89>{mcount+45}
<ffffffff803f0cb5>{thread_return+230}
<ffffffff80209c89>{mcount+45}
<ffffffff8031cf81>{tty_register_device+221}
<ffffffff8031e4c3>{tty_register_driver+512}
<ffffffff80f3f2a3>{pty_init+464}
<ffffffff802071cc>{init+343}
<ffffffff80228e7c>{schedule_tail+190}
<ffffffff8020b02e>{child_rip+8}
<ffffffff80207075>{init+0}
<ffffffff8020b026>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f2a47>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff803f1e0a>] .. ( <= rt_lock_slowlock+0x3b/0x213)
.. [<ffffffff803f2d61>] .... _raw_spin_trylock+0x1b/0x5f
.....[<ffffffff8020b474>] .. ( <= oops_begin+0x28/0x77)


Code: 0f 0b 68 1b 3d 41 80 c2 7f 02 65 48 8b 04 25 00 00 00 00 41
RIP <ffffffff803f1e89>{rt_lock_slowlock+186} RSP <ffff8101eac49978>
<0>Kernel panic - not syncing: Attempted to kill init!

Call Trace:
<ffffffff8022ca20>{panic+175}
<ffffffff80251c92>{__trace+148}
<ffffffff80209c89>{mcount+45}
<ffffffff8022fadc>{do_exit+156}
<ffffffff803f3077>{_raw_spin_unlock_irqrestore+89}
<ffffffff8020bb1c>{kernel_math_error+0}
<ffffffff8020bc9b>{do_trap+228}
<ffffffff80277592>{cache_grow+817}
<ffffffff8020c382>{do_invalid_op+177}
<ffffffff803f1e89>{rt_lock_slowlock+186}
<ffffffff80209c89>{mcount+45}
<ffffffff8020ae75>{error_exit+0}
<ffffffff80277592>{cache_grow+817}
<ffffffff80277592>{cache_grow+817}
<ffffffff803f1e89>{rt_lock_slowlock+186}
<ffffffff803f1e61>{rt_lock_slowlock+146}
<ffffffff803f2882>{rt_lock+18}
<ffffffff80277592>{cache_grow+817}
<ffffffff80277730>{__cache_alloc_node+272}
<ffffffff802777cc>{alternate_node_alloc+127}
<ffffffff80278653>{kmem_cache_alloc+114}
<ffffffff802b8c09>{sysfs_new_dirent+43}
<ffffffff802b8f45>{sysfs_make_dirent+33}
<ffffffff802b9046>{create_dir+160}
<ffffffff802b97e5>{sysfs_create_dir+95}
<ffffffff802eb81e>{kobject_add+223}
<ffffffff8033ad25>{class_device_add+187}
<ffffffff8033afc3>{class_device_register+30}
<ffffffff8033b0c5>{class_device_create+253}
<ffffffff802eec48>{sprintf+109}
<ffffffff80209c89>{mcount+45}
<ffffffff803f0cb5>{thread_return+230}
<ffffffff80209c89>{mcount+45}
<ffffffff8031cf81>{tty_register_device+221}
<ffffffff8031e4c3>{tty_register_driver+512}
<ffffffff80f3f2a3>{pty_init+464}
<ffffffff802071cc>{init+343}
<ffffffff80228e7c>{schedule_tail+190}
<ffffffff8020b02e>{child_rip+8}
<ffffffff80207075>{init+0}
<ffffffff8020b026>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f2a47>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff803f1e0a>] .. ( <= rt_lock_slowlock+0x3b/0x213)
.. [<ffffffff8022c9b4>] .... panic+0x43/0x16f
.....[<ffffffff8022fadc>] .. ( <= do_exit+0x9c/0x9a2)



2nd trace (fail on compile)



kjournald/1061[CPU#3]: BUG in debug_rt_mutex_unlock at
kernel/rtmutex-debug.c:471

Call Trace:
<ffffffff803f2b2b>{_raw_spin_lock_irqsave+34}
<ffffffff8022db21>{__WARN_ON+105}
<ffffffff8022dadc>{__WARN_ON+36}
<ffffffff802494f7>{debug_rt_mutex_unlock+204}
<ffffffff803f1d10>{rt_lock_slowunlock+30}
<ffffffff803f286e>{__lock_text_start+14}
<ffffffff802786b0>{kmem_cache_alloc+207}
<ffffffff80259d18>{mempool_alloc_slab+22}
<ffffffff8025a107>{mempool_alloc+80}
<ffffffff80209c89>{mcount+45}
<ffffffff80281832>{bio_alloc_bioset+40}
<ffffffff80281952>{bio_alloc+21}
<ffffffff8027df28>{submit_bh+142}
<ffffffff8027f1d3>{ll_rw_block+166}
<ffffffff802cefe6>{journal_commit_transaction+995}
<ffffffff803f2f99>{_raw_spin_unlock+51}
<ffffffff803f1d38>{rt_lock_slowunlock+70}
<ffffffff803f0cb5>{thread_return+230}
<ffffffff803f286e>{__lock_text_start+14}
<ffffffff802365dd>{try_to_del_timer_sync+90}
<ffffffff803f0cb5>{thread_return+230}
<ffffffff802d3555>{kjournald+204}
<ffffffff80240caf>{autoremove_wake_function+0}
<ffffffff802d3489>{kjournald+0}
<ffffffff80240881>{keventd_create_kthread+0}
<ffffffff80240b6f>{kthread+224}
<ffffffff80228e7c>{schedule_tail+190}
<ffffffff8020b02e>{child_rip+8}
<ffffffff80240881>{keventd_create_kthread+0}
<ffffffff80240a8f>{kthread+0}
<ffffffff8020b026>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f2a47>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff803f1d08>] .. ( <= rt_lock_slowunlock+0x16/0x70)
.. [<ffffffff803f2b2b>] .... _raw_spin_lock_irqsave+0x22/0x33
.....[<ffffffff8022dadc>] .. ( <= __WARN_ON+0x24/0x88)

kjournald/1061[CPU#3]: BUG in debug_rt_mutex_unlock at
kernel/rtmutex-debug.c:472

Call Trace:
<ffffffff803f2b2b>{_raw_spin_lock_irqsave+34}
<ffffffff8022db21>{__WARN_ON+105}
<ffffffff8022dadc>{__WARN_ON+36}
<ffffffff80249599>{debug_rt_mutex_unlock+366}
<ffffffff803f1d10>{rt_lock_slowunlock+30}
<ffffffff803f286e>{__lock_text_start+14}
<ffffffff802786b0>{kmem_cache_alloc+207}
<ffffffff80259d18>{mempool_alloc_slab+22}
<ffffffff8025a107>{mempool_alloc+80}
<ffffffff80209c89>{mcount+45}
<ffffffff80281832>{bio_alloc_bioset+40}
<ffffffff80281952>{bio_alloc+21}
<ffffffff8027df28>{submit_bh+142}
<ffffffff8027f1d3>{ll_rw_block+166}
<ffffffff802cefe6>{journal_commit_transaction+995}
<ffffffff803f2f99>{_raw_spin_unlock+51}
<ffffffff803f1d38>{rt_lock_slowunlock+70}
<ffffffff803f0cb5>{thread_return+230}
<ffffffff803f286e>{__lock_text_start+14}
<ffffffff802365dd>{try_to_del_timer_sync+90}
<ffffffff803f0cb5>{thread_return+230}
<ffffffff802d3555>{kjournald+204}
<ffffffff80240caf>{autoremove_wake_function+0}
<ffffffff802d3489>{kjournald+0}
<ffffffff80240881>{keventd_create_kthread+0}
<ffffffff80240b6f>{kthread+224}
<ffffffff80228e7c>{schedule_tail+190}
<ffffffff8020b02e>{child_rip+8}
<ffffffff80240881>{keventd_create_kthread+0}
<ffffffff80240a8f>{kthread+0}
<ffffffff8020b026>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f2a47>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff803f1d08>] .. ( <= rt_lock_slowunlock+0x16/0x70)
.. [<ffffffff803f2b2b>] .... _raw_spin_lock_irqsave+0x22/0x33
.....[<ffffffff8022dadc>] .. ( <= __WARN_ON+0x24/0x88)

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at kernel/rtmutex.c:639
invalid opcode: 0000 [1] PREEMPT SMP
CPU 3
Modules linked in: tg3
Pid: 1061, comm: kjournald Not tainted 2.6.17-rt8_t2_01 #2
RIP: 0010:[<ffffffff803f1e89>] <ffffffff803f1e89>{rt_lock_slowlock+186}
RSP: 0000:ffff8104003bdb98 EFLAGS: 00010246
RAX: ffff8102001c7540 RBX: 0000000000000010 RCX: 0000000000240180
RDX: ffff8102001c7540 RSI: ffffffff80278635 RDI: ffff810600115b00
RBP: ffff8104003bdc58 R08: ffffffff803f0c50 R09: ffff8104003bdb98
R10: ffffffff803f0c50 R11: 0000000000000023 R12: ffff810600115b00
R13: ffff8101eafa7b80 R14: ffffffff80278635 R15: 0000000000000010
FS: 00002ad25bb03200(0000) GS:ffff8106001eb340(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002ad25f085000 CR3: 00000001e68bf000 CR4: 00000000000006e0
Process kjournald (pid: 1061, threadinfo ffff8104003bc000, task
ffff8102001c7540)
Stack: 111111110000008c ffff8104003bdba0 ffff8104003bdba0 ffff8104003bdbb0
ffff8104003bdbb0 0000000000000000 111111110000008c ffff8104003bdbd0
ffff8104003bdbd0 ffff8104003bdbe0
Call Trace:
<ffffffff803f2882>{rt_lock+18}
<ffffffff80278635>{kmem_cache_alloc+84}
<ffffffff80259d18>{mempool_alloc_slab+22}
<ffffffff8025a107>{mempool_alloc+80}
<ffffffff802818a1>{bio_alloc_bioset+151}
<ffffffff80281952>{bio_alloc+21}
<ffffffff8027df28>{submit_bh+142}
<ffffffff8027f1d3>{ll_rw_block+166}
<ffffffff802cefe6>{journal_commit_transaction+995}
<ffffffff803f2f99>{_raw_spin_unlock+51}
<ffffffff803f1d38>{rt_lock_slowunlock+70}
<ffffffff803f0cb5>{thread_return+230}
<ffffffff803f286e>{__lock_text_start+14}
<ffffffff802365dd>{try_to_del_timer_sync+90}
<ffffffff803f0cb5>{thread_return+230}
<ffffffff802d3555>{kjournald+204}
<ffffffff80240caf>{autoremove_wake_function+0}
<ffffffff802d3489>{kjournald+0}
<ffffffff80240881>{keventd_create_kthread+0}
<ffffffff80240b6f>{kthread+224}
<ffffffff80228e7c>{schedule_tail+190}
<ffffffff8020b02e>{child_rip+8}
<ffffffff80240881>{keventd_create_kthread+0}
<ffffffff80240a8f>{kthread+0}
<ffffffff8020b026>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f2a47>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff803f1e0a>] .. ( <= rt_lock_slowlock+0x3b/0x213)
.. [<ffffffff803f2d61>] .... _raw_spin_trylock+0x1b/0x5f
.....[<ffffffff8020b474>] .. ( <= oops_begin+0x28/0x77)


Code: 0f 0b 68 1b 3d 41 80 c2 7f 02 65 48 8b 04 25 00 00 00 00 41
RIP <ffffffff803f1e89>{rt_lock_slowlock+186} RSP <ffff8104003bdb98>
<6>note: kjournald[1061] exited with preempt_count 1

2006-08-18 12:00:09

by Bill Huey

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On Tue, Aug 15, 2006 at 10:53:10AM -0700, Robert Crocombe wrote:
> NUMA emulation is off in the most recent one (the most recent config
> posted), and the high Hz is due to 'man nanosleep':
>
> BUGS
> The current implementation of nanosleep() is based on the normal
> kernel
> timer mechanism, which has a resolution of 1/HZ s (see
> time(7)).
> Therefore, nanosleep() pauses always for at least the specified
> time,
> however it can take up to 10 ms longer than specified until the
> process
> becomes runnable again. For the same reason, the value returned in
> case
> of a delivered signal in *rem is usually rounded to the next
> larger
> multiple of 1/HZ s.
>
> and Other People's Code (tm).
>
> Sadly, I am behind a firewall and cannot offer remote access to the
> machine. If you can point out some other config options that are Not
> Good, then I can see about pruning them.

Robert,

Sorry for the delay, been a bit distracted.

There are a lot of little problems in the stack traces you sent, and I wonder
how much they affect the rest of the system. Getting reliable debug output
has been a bit problematic.

Can you try this patch and see if it changes anything?

1) This fixes a flush_tlb_mm atomic violation by moving back to a spinlock

2) Moves the delay logic up higher in the tree from free_task() to __put_task_struct()

3) Adds a check to all calls to rt_lock() that panic()s the kernel when it
trips, so that I can get a decent stack trace (a rough sketch of the idea
follows this list)

4) Minor fixes to the original deferral code for INIT_TASK

5) Uses stop_trace() just before doing any real panic().
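
One plausible shape for the check described in 3) and 5) -- the real code
is in the attached patch; the condition and the rt_lock_check() helper
name here are guesses for illustration only:

static inline void rt_lock_check(void)
{
	if (in_atomic() || irqs_disabled()) {
		stop_trace();	/* freeze the latency tracer output first */
		panic("rt_lock() called from atomic context\n");
	}
}

That way the first bad caller stops the box with one clean backtrace
instead of burying it under a cascade of later WARN_ONs.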

This may not fix the journald code, but it should at least eliminate a
couple of stack traces that might be creating other effects in the system
and stop it dead in its tracks. Hope this is useful.

Patch attached:

bill


Attachments:
(No filename) (1.85 kB)
t.diff (9.18 kB)

2006-08-22 00:21:37

by Robert Crocombe

[permalink] [raw]
Subject: Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

On 8/18/06, hui Bill Huey <[email protected]> wrote:
> Patch attached:

The problem still appears to happen, but now I get no trace at all,
just a single line reported to the machine's console (and not on the
serial console):

pdflush/314[CPU#2]: BUG in debug_rt_mutex_unlock at kernel/rtmutex-debug.c:471

i.e., a standard statement, except at pdflush instead of swapper this time.

--
Robert Crocombe
[email protected]

2006-08-22 01:37:39

by Bill Huey

[permalink] [raw]
Subject: rtmutex assert failure (was [Patch] restore the RCU callback...)

On Mon, Aug 21, 2006 at 05:21:35PM -0700, Robert Crocombe wrote:
> On 8/18/06, hui Bill Huey <[email protected]> wrote:
> >Patch attached:
>
> The problem still appears to happen, but now I get no trace at all,
> just a single line reported to the machine's console (and not on the
> serial console):
>
> pdflush/314[CPU#2]: BUG in debug_rt_mutex_unlock at
> kernel/rtmutex-debug.c:471
>
> i.e., a standard statement, except at pdflush instead of swapper this time.

Thanks for testing. I appreciate this and I'm more than just plain willing
to help out. Anything goes to get this patch solid. :)

I'll come up with a patch tomorrow to try and get a clean stack trace. I've
made some changes to the bug dump output to make it more preemption aware,
but, as you can see, some tweaking is needed.

However, I don't think this is going to be a simple atomic violation problem
that I've seen numerous times before. That's what the previous patch tried to
address. It's seriously pointing to a problem with the rtmutex and/or how it's
used, maybe a corner case it's missing.

I'll start to look at it either tonight or tomorrow and see what's going on.
It's going to take me a bit so be patient. I'll definitely get back to you
on this. I've got a number of weeks to blow exclusively on -rt development,
so I'll be around for a little while longer, hopefully longer.

bill

2006-08-22 23:21:08

by Bill Huey

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On Mon, Aug 21, 2006 at 06:37:22PM -0700, Bill Huey wrote:
> I'll come up with a patch tomorrow to try and get a clean stack trace. I've
> made some changes to the bug dump output to make it more preemption aware,
> but, as you can see, some tweaking is needed.

I turned off the tracing in the latency tracking stuff and a relatively
small patch is here against -rt8:

http://mmlinux.sourceforge.net/public/against-2.6.17-rt8-0.diff

This might help with the stack trace. The stack trace being kind of clobbered is a
different issue entirely. I don't know since I can't replicate your problem over here.
I'm going to start looking over the rt mutex more carefully and see what case might
make this assert trigger. I'll try and deal with this one at a time.

I'm new at this bug fixing thing so give me a bit of time. Apparently Ingo's on
long vacation.

bill

2006-08-22 23:22:04

by Bill Huey

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On Tue, Aug 22, 2006 at 04:20:51PM -0700, Bill Huey wrote:
> On Mon, Aug 21, 2006 at 06:37:22PM -0700, Bill Huey wrote:
> > I'll come up with a patch tomorrow to try and get a clean stack trace. I've
> > made some changes to the bug dump output to make it more preemption aware,
> > but, as you can see, some tweaking is needed.
>
> I turned off the tracing in the latency tracking stuff and a relatively
> small patch is here against -rt8:

At __WARN_ON()... I meant.

bill


2006-08-23 17:14:29

by Robert Crocombe

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On 8/22/06, hui Bill Huey <[email protected]> wrote:
> I turned off the tracing in the latency tracking stuff and a relatively
> small patch is here against -rt8:
>
> http://mmlinux.sourceforge.net/public/against-2.6.17-rt8-0.diff

I'm going to assume that the #error here:

+#ifdef CONFIG_LATENCY_TRACE
+#error
+ stop_trace();
+#endif

is to see if I'm awake. No, but gcc is. I just removed it (?).

End result is as with the previous patch: nothing to serial console,
and just the single line moaning about line 471 in blah blah blah.

--
Robert Crocombe
[email protected]

2006-08-23 17:24:26

by Robert Crocombe

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On 8/23/06, Robert Crocombe <[email protected]> wrote:
> On 8/22/06, hui Bill Huey <[email protected]> wrote:
> > I turned off the tracing in the latency tracking stuff and a relatively
> > small patch is here against -rt8:
> >
> > http://mmlinux.sourceforge.net/public/against-2.6.17-rt8-0.diff
>
> I'm going to assume that the #error here:
>
> +#ifdef CONFIG_LATENCY_TRACE
> +#error
> + stop_trace();
> +#endif
>
> is to see if I'm awake. No, but gcc is. I just removed it (?).

Just in case it was a "hey, dummy, don't do that" kind of thing, I
built another kernel with latency tracing off, but the end result is
the same.

--
Robert Crocombe
[email protected]

2006-08-23 20:21:05

by Bill Huey

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On Wed, Aug 23, 2006 at 10:14:27AM -0700, Robert Crocombe wrote:
> On 8/22/06, hui Bill Huey <[email protected]> wrote:
> >I turned off the tracing in the latency tracking stuff and a relatively
> >small patch is here against -rt8:
> >
> > http://mmlinux.sourceforge.net/public/against-2.6.17-rt8-0.diff
>
> I'm going to assume that the #error here:
>
> +#ifdef CONFIG_LATENCY_TRACE
> +#error
> + stop_trace();
> +#endif
>
> is to see if I'm awake. No, but gcc is. I just removed it (?).
>
> End result is as with the previous patch: nothing to serial console,
> and just the single line moaning about line 471 in blah blah blah.

I still have to streamline my patch consolidation process, so you're getting
things like this as an artifact of that right now, sorry. That was the
correct change.

It might be the case that I need to remove some of that stuff and then
reupload the patch. I'm kind of guessing at the problem right now with
your stack dumps and it looks like the changes are causing more problems
than they are helping. Let me change that and upload something else to you
today.

Your case is particularly difficult since I don't have a machine as big as
yours that can replicate that problem and I can't do the normal kernel
stuff that I want to on it. However, I do have a good idea of what the
problem is and I'm trying to get reacquainted with the rt mutex code to
see if I can narrow it down.

bill

2006-08-23 21:06:21

by Bill Huey

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On Wed, Aug 23, 2006 at 01:20:46PM -0700, Bill Huey wrote:
> On Wed, Aug 23, 2006 at 10:14:27AM -0700, Robert Crocombe wrote:
...
> > End result is as with the previous patch: nothing to serial console,
> > and just the single line moaning about line 471 in blah blah blah.
...
> It might be the case that I need to remove some of that stuff and then
> reupload the patch. I'm kind of guessing at the problem right now with
> your stack dumps and it looks like the changes are causing more problems
> than it is helping. Let me change that and upload something else to you
> today.

http://mmlinux.sourceforge.net/public/against-2.6.17-rt8-1.diff

I turned off the __WARN_ON() change. I should test that more thoroughly.
I also cleaned up things a bit.

I'm on irc.oftc.net as "bh" if you want somebody that's interactive. It
might help with things here.

bill

2006-08-23 21:09:06

by Bill Huey

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On Wed, Aug 23, 2006 at 02:05:58PM -0700, Bill Huey wrote:
> I'm on irc.oftc.net as "bh" if you want somebody that's interactive. It
> might help with things here.

#kernelnewbies is the channel.

bill

2006-08-24 01:22:08

by Robert Crocombe

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On 8/23/06, hui Bill Huey <[email protected]> wrote:
> On Wed, Aug 23, 2006 at 02:05:58PM -0700, Bill Huey wrote:
> > I'm on irc.oftc.net as "bh" if you want somebody that's interactive. It
> > might help with things here.
>
> #kernelnewbies is the channel.

Firewall where only port 80 is open. I tried this thing:

http://www.polarhome.com/cgi-bin/chat/irc.cgi

but it looks like it connects to the wrong port (8888 vs 6667?), or
something. I see that some Windows *shudder* client supposedly
supports through-firewall connections, and additionally some instructions
about port-forwarding off an outside machine, but I'm kinda stuck for
the moment.

Okay, tried the patch. I actually made it through one full kernel
compile without going bang, but did crap out on the second iteration
(after a 'make clean').

I'll post the config, too, just so there's a clear understanding of
where we are.

--
Robert Crocombe
[email protected]

kjournald/1061[CPU#3]: BUG in debug_rt_mutex_unlock
at kernel/rtmutex-debug.c:471

Call Trace:
<ffffffff802616af>{_raw_spin_lock_irqsave+29}
<ffffffff80280b6f>{__WARN_ON+100}
<ffffffff80280b2a>{__WARN_ON+31}
<ffffffff80295ef9>{debug_rt_mutex_unlock+199}
<ffffffff80260920>{rt_lock_slowunlock+25}
<ffffffff80261451>{__lock_text_start+9}
<ffffffff8020a188>{kmem_cache_alloc+202}
<ffffffff8029f849>{mempool_alloc_slab+17}
<ffffffff802222f4>{mempool_alloc+75}
<ffffffff80261ac8>{_raw_spin_unlock+46}
<ffffffff80260948>{rt_lock_slowunlock+65}
<ffffffff80219d1a>{bio_alloc_bioset+35}
<ffffffff802b5654>{bio_clone+29}
<ffffffff803a581f>{clone_bio+51}
<ffffffff803a64d7>{__split_bio+399}
<ffffffff803a6cc8>{dm_request+453}
<ffffffff8021b42a>{generic_make_request+375}
<ffffffff80232b94>{submit_bio+192}
<ffffffff80219c68>{submit_bh+262}
<ffffffff802166da>{ll_rw_block+161}
<ffffffff802ec25f>{journal_commit_transaction+990}
<ffffffff80261ac8>{_raw_spin_unlock+46}
<ffffffff80260948>{rt_lock_slowunlock+65}
<ffffffff8028f22b>{autoremove_wake_function+0}
<ffffffff8024b4c7>{try_to_del_timer_sync+85}
<ffffffff802f0679>{kjournald+199}
<ffffffff8028f22b>{autoremove_wake_function+0}
<ffffffff802f05b2>{kjournald+0}
<ffffffff8028f027>{keventd_create_kthread+0}
<ffffffff802321b7>{kthread+219}
<ffffffff802268f9>{schedule_tail+185}
<ffffffff8025d31e>{child_rip+8}
<ffffffff8028f027>{keventd_create_kthread+0}
<ffffffff802320dc>{kthread+0}
<ffffffff8025d316>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff802615e9>] .... _raw_spin_lock+0x16/0x23
.....[<ffffffff80260918>] .. ( <= rt_lock_slowunlock+0x11/0x6b)
.. [<ffffffff802616af>] .... _raw_spin_lock_irqsave+0x1d/0x2e
.....[<ffffffff80280b2a>] .. ( <= __WARN_ON+0x1f/0x82)

kjournald/1061[CPU#3]: BUG in debug_rt_mutex_unlock at
kernel/rtmutex-debug.c:472

Call Trace:
<ffffffff802616af>{_raw_spin_lock_irqsave+29}
<ffffffff80280b6f>{__WARN_ON+100}
<ffffffff80280b2a>{__WARN_ON+31}
<ffffffff80295f9b>{debug_rt_mutex_unlock+361}
<ffffffff80260920>{rt_lock_slowunlock+25}
<ffffffff80261451>{__lock_text_start+9}
<ffffffff8020a188>{kmem_cache_alloc+202}
<ffffffff8029f849>{mempool_alloc_slab+17}
<ffffffff802222f4>{mempool_alloc+75}
<ffffffff80261ac8>{_raw_spin_unlock+46}
<ffffffff80260948>{rt_lock_slowunlock+65}
<ffffffff80219d1a>{bio_alloc_bioset+35}
<ffffffff802b5654>{bio_clone+29}
<ffffffff803a581f>{clone_bio+51}
<ffffffff803a64d7>{__split_bio+399}
<ffffffff803a6cc8>{dm_request+453}
<ffffffff8021b42a>{generic_make_request+375}
<ffffffff80232b94>{submit_bio+192}
<ffffffff80219c68>{submit_bh+262}
<ffffffff802166da>{ll_rw_block+161}
<ffffffff802ec25f>{journal_commit_transaction+990}
<ffffffff80261ac8>{_raw_spin_unlock+46}
<ffffffff80260948>{rt_lock_slowunlock+65}
<ffffffff8028f22b>{autoremove_wake_function+0}
<ffffffff8024b4c7>{try_to_del_timer_sync+85}
<ffffffff802f0679>{kjournald+199}
<ffffffff8028f22b>{autoremove_wake_function+0}
<ffffffff802f05b2>{kjournald+0}
<ffffffff8028f027>{keventd_create_kthread+0}
<ffffffff802321b7>{kthread+219}
<ffffffff802268f9>{schedule_tail+185}
<ffffffff8025d31e>{child_rip+8}
<ffffffff8028f027>{keventd_create_kthread+0}
<ffffffff802320dc>{kthread+0}
<ffffffff8025d316>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff802615e9>] .... _raw_spin_lock+0x16/0x23
.....[<ffffffff80260918>] .. ( <= rt_lock_slowunlock+0x11/0x6b)
.. [<ffffffff802616af>] .... _raw_spin_lock_irqsave+0x1d/0x2e
.....[<ffffffff80280b2a>] .. ( <= __WARN_ON+0x1f/0x82)

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at kernel/rtmutex.c:639
invalid opcode: 0000 [1] PREEMPT SMP
CPU 3
Modules linked in: tg3
Pid: 1061, comm: kjournald Not tainted 2.6.17-rt8_t5_00 #3
RIP: 0010:[<ffffffff80260a8f>] <ffffffff80260a8f>{rt_lock_slowlock+181}
RSP: 0018:ffff8104002319d8 EFLAGS: 00010246
RAX: ffff810200361560 RBX: 0000000000000010 RCX: ffffffff805c3800
RDX: ffff810200361560 RSI: ffffffff8020a10d RDI: ffff810600115b00
RBP: ffff810400231a98 R08: ffff8101e7c766c0 R09: ffff8104002319d8
R10: ffff8104000d74f0 R11: 0000000000000093 R12: ffff810600115b00
R13: ffff810200031c40 R14: ffffffff8020a10d R15: 0000000000000010
FS: 00002b45110964a0(0000) GS:ffff810600210340(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002b45111a10f8 CR3: 00000001e5fe1000 CR4: 00000000000006e0
Process kjournald (pid: 1061, threadinfo ffff810400230000, task
ffff810200361560)
Stack: 111111110000008c ffff8104002319e0 ffff8104002319e0 ffff8104002319f0
ffff8104002319f0 0000000000000000 111111110000008c ffff810400231a10
ffff810400231a10 ffff810400231a20
Call Trace:
<ffffffff80261460>{rt_lock+13}
<ffffffff8020a10d>{kmem_cache_alloc+79}
<ffffffff8029f849>{mempool_alloc_slab+17}
<ffffffff802222f4>{mempool_alloc+75}
<ffffffff80261ac8>{_raw_spin_unlock+46}
<ffffffff80260948>{rt_lock_slowunlock+65}
<ffffffff80219d89>{bio_alloc_bioset+146}
<ffffffff802b5654>{bio_clone+29}
<ffffffff803a581f>{clone_bio+51}
<ffffffff803a64d7>{__split_bio+399}
<ffffffff803a6cc8>{dm_request+453}
<ffffffff8021b42a>{generic_make_request+375}
<ffffffff80232b94>{submit_bio+192}
<ffffffff80219c68>{submit_bh+262}
<ffffffff802166da>{ll_rw_block+161}
<ffffffff802ec25f>{journal_commit_transaction+990}
<ffffffff80261ac8>{_raw_spin_unlock+46}
<ffffffff80260948>{rt_lock_slowunlock+65}
<ffffffff8028f22b>{autoremove_wake_function+0}
<ffffffff8024b4c7>{try_to_del_timer_sync+85}
<ffffffff802f0679>{kjournald+199}
<ffffffff8028f22b>{autoremove_wake_function+0}
<ffffffff802f05b2>{kjournald+0}
<ffffffff8028f027>{keventd_create_kthread+0}
<ffffffff802321b7>{kthread+219}
<ffffffff802268f9>{schedule_tail+185}
<ffffffff8025d31e>{child_rip+8}
<ffffffff8028f027>{keventd_create_kthread+0}
<ffffffff802320dc>{kthread+0}
<ffffffff8025d316>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff802615e9>] .... _raw_spin_lock+0x16/0x23
.....[<ffffffff80260a10>] .. ( <= rt_lock_slowlock+0x36/0x20e)
.. [<ffffffff802618a9>] .... _raw_spin_trylock+0x16/0x5a
.....[<ffffffff80265982>] .. ( <= oops_begin+0x23/0x72)


Code: 0f 0b 68 84 5b 40 80 c2 7f 02 65 48 8b 04 25 00 00 00 00 41
RIP <ffffffff80260a8f>{rt_lock_slowlock+181} RSP <ffff8104002319d8>


Attachments:
(No filename) (7.93 kB)
2.6.17-rt8-t5-config (20.75 kB)

2006-08-24 01:47:20

by Bill Huey

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On Wed, Aug 23, 2006 at 06:22:03PM -0700, Robert Crocombe wrote:
> On 8/23/06, hui Bill Huey <[email protected]> wrote:
> >On Wed, Aug 23, 2006 at 02:05:58PM -0700, Bill Huey wrote:
> >> I'm on irc.oftc.net as "bh" if you want somebody that's interactive. It
> >> might help with things here.
> >
> >#kernelnewbies is the channel.
>
> Firewall where only port 80 is open. I tried this thing:
>
> http://www.polarhome.com/cgi-bin/chat/irc.cgi
>
> but it looks like it connects to the wrong port (8888 vs 6667?), or
> something. I see that some Windows *shudder* client supposedly
> supports thru-firewill connections, and additionally some instructions
> about port-forwarding off an outside machine, but I'm kinda stuck for
> the moment.

This is IRC and the #kernelnewbies channel is where a number of us (folks
following Rik van Riel) lurk. Any client like BitchX should work if you
have those ports open from where you're coming from.

> Okay, tried the patch. I actually made it through one full kernel
> compile without going bang, but did crap out on the second iteration
> (after a 'make clean').
>
> I'll post the config, too, just so there's a clear understanding of
> where we are.

Ok, yeah, I've been trying to get a clean stack trace output and not
really focused on the core issue. The debug output stuff seemed to need
some attention because of the hanging and continuous output of stack
traces. It makes it difficult to figure out what's going on when there
are a cascade of failures going on.

I'll upload those small changes next and try to figure out what's going
on with kjournald and the rtmutex. This is going to be a pain.

bill

2006-08-25 07:20:20

by Bill Huey

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On Wed, Aug 23, 2006 at 06:46:58PM -0700, Bill Huey wrote:
> On Wed, Aug 23, 2006 at 06:22:03PM -0700, Robert Crocombe wrote:
...
> > I'll post the config, too, just so there's a clear understanding of
> > where we are.
>
> Ok, yeah, I've been trying to get a clean stack trace output and not
> really focused on the core issue. The debug output stuff seemed to need
> some attention because of the hanging and continuous output of stack
> traces. It makes it difficult to figure out what's going on when there
> are a cascade of failures going on.
>
> I'll upload those small changes next and try to figure out what's going
> on with kjournald and the rtmutex. This is going to be a pain.

I still can't replicate the problem here with my made up test suite, so
I'm going to take a guess and shove some memory barriers in the rt mutex
code. That code looked a bit suspect.

http://mmlinux.sourceforge.net/public/against-2.6.17-rt8-2.diff
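
Just to illustrate the kind of change that means (this is not the
contents of the diff above, and the names below are made up): an
smp_mb() forces the store that publishes new lock state to be visible
before the later load that decides whether a waiter needs waking, e.g.

	lock->state = UNLOCKED;		/* publish the new state */
	smp_mb();			/* order the store against the load below */
	if (lock->has_waiters)
		wake_up_waiter(lock);

Without the barrier the CPU is free to reorder the load ahead of the
store, and another CPU can miss its wakeup.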

bill

2006-08-26 01:24:32

by Robert Crocombe

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On 8/25/06, hui Bill Huey <[email protected]> wrote:
> http://mmlinux.sourceforge.net/public/against-2.6.17-rt8-2.diff

It compiled a kernel through once, but died on a 2nd try.

The trace is the usual one, I believe.

kjournald/1061[CPU#1]: BUG in debug_rt_mutex_unlock at
kernel/rtmutex-debug.c:471

Call Trace:
<ffffffff80261783>{_raw_spin_lock_irqsave+29}
<ffffffff80280e0f>{__WARN_ON+123}
<ffffffff80280dca>{__WARN_ON+54}
<ffffffff8029617d>{debug_rt_mutex_unlock+199}
<ffffffff80260ef9>{rt_lock_slowunlock+25}
<ffffffff80261489>{__lock_text_start+9}
<ffffffff8020a188>{kmem_cache_alloc+202}
<ffffffff8029facd>{mempool_alloc_slab+17}
<ffffffff802222fe>{mempool_alloc+75}
<ffffffff80261b9c>{_raw_spin_unlock+46}
<ffffffff80260f21>{rt_lock_slowunlock+65}
<ffffffff80219d12>{bio_alloc_bioset+35}
<ffffffff802b58df>{bio_clone+29}
<ffffffff803a5aaa>{clone_bio+51}
<ffffffff803a6762>{__split_bio+399}
<ffffffff803a6f53>{dm_request+453}
<ffffffff8021b422>{generic_make_request+375}
<ffffffff80232b9d>{submit_bio+192}
<ffffffff80219c60>{submit_bh+262}
<ffffffff802166d6>{ll_rw_block+161}
<ffffffff802ec4ea>{journal_commit_transaction+990}
<ffffffff80261b9c>{_raw_spin_unlock+46}
<ffffffff80260f21>{rt_lock_slowunlock+65}
<ffffffff80261489>{__lock_text_start+9}
<ffffffff8024b490>{try_to_del_timer_sync+85}
<ffffffff802f0904>{kjournald+199}
<ffffffff8028f4b1>{autoremove_wake_function+0}
<ffffffff802f083d>{kjournald+0}
<ffffffff8028f2ad>{keventd_create_kthread+0}
<ffffffff802321c0>{kthread+219}
<ffffffff80226903>{schedule_tail+185}
<ffffffff8025d2e6>{child_rip+8}
<ffffffff8028f2ad>{keventd_create_kthread+0}
<ffffffff802320e5>{kthread+0}
<ffffffff8025d2de>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff802616bd>] .... _raw_spin_lock+0x16/0x23
.....[<ffffffff80260ef1>] .. ( <= rt_lock_slowunlock+0x11/0x6b)
.. [<ffffffff80261783>] .... _raw_spin_lock_irqsave+0x1d/0x2e
.....[<ffffffff80280dca>] .. ( <= __WARN_ON+0x36/0xa2)

And I should've noted this before, but in general, I use icecream

http://en.opensuse.org/Icecream

to compile stuff. I have been able to reproduce the identical problem
without it, but it is another variable. Sorry for the omission.


I also feel like I should be doing more, so I built a vanilla
2.6.17-rt8 kernel with the more limited set of config options I've
used on the most recent patches, but I also built this as a
uni-processor config. On the uni-processor kernel with icecream
disabled and while building a kernel with a simple 'make', I received
the following BUG, but the machine kept on going:

BUG: scheduling while atomic: make/0x00000001/20884

Call Trace:
<ffffffff802fbc2e>{plist_check_head+60}
<ffffffff8025d2eb>{__schedule+155}
<ffffffff8028f06a>{task_blocks_on_rt_mutex+643}
<ffffffff8020efda>{__mod_page_state_offset+25}
<ffffffff802871de>{find_task_by_pid_type+24}
<ffffffff8020efda>{__mod_page_state_offset+25}
<ffffffff8025dad7>{schedule+236}
<ffffffff8025e96d>{rt_lock_slowlock+416}
<ffffffff8025f4b8>{rt_lock+13}
<ffffffff8020efda>{__mod_page_state_offset+25}
<ffffffff8029ab91>{__free_pages_ok+336}
<ffffffff8022c07e>{__free_pages+47}
<ffffffff80233bb0>{free_pages+128}
<ffffffff802580df>{free_task+26}
<ffffffff80245b52>{__put_task_struct+182}
<ffffffff8025d86e>{thread_return+163}
<ffffffff8025dad7>{schedule+236}
<ffffffff80245ff1>{pipe_wait+111}
<ffffffff802893ac>{autoremove_wake_function+0}
<ffffffff8025e7c8>{rt_mutex_lock+50}
<ffffffff8022ce81>{pipe_readv+785}
<ffffffff8025eaa6>{rt_lock_slowunlock+98}
<ffffffff8025f4a9>{__lock_text_start+9}
<ffffffff802aea38>{pipe_read+30}
<ffffffff8020afae>{vfs_read+171}
<ffffffff8020fdbc>{sys_read+71}
<ffffffff8025994e>{system_call+126}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<ffffffff8025d303>] .... __schedule+0xb3/0x57b
.....[<ffffffff8025dad7>] .. ( <= schedule+0xec/0x11a)

Of 4 total attempts this way, two completed with no problems, one had
the BUG above, and yet another had two BUGs (below), but also
continued.

BUG: scheduling while atomic: make/0x00000001/14632

Call Trace:
<ffffffff802fbc2e>{plist_check_head+60}
<ffffffff8025d2eb>{__schedule+155}
<ffffffff8028f06a>{task_blocks_on_rt_mutex+643}
<ffffffff8020efda>{__mod_page_state_offset+25}
<ffffffff802871de>{find_task_by_pid_type+24}
<ffffffff8020efda>{__mod_page_state_offset+25}
<ffffffff8025dad7>{schedule+236}
<ffffffff8025e96d>{rt_lock_slowlock+416}
<ffffffff8025f4b8>{rt_lock+13}
<ffffffff8020efda>{__mod_page_state_offset+25}
<ffffffff8029ab91>{__free_pages_ok+336}
<ffffffff8022c07e>{__free_pages+47}
<ffffffff80233bb0>{free_pages+128}
<ffffffff802580df>{free_task+26}
<ffffffff80245b52>{__put_task_struct+182}
<ffffffff8025d86e>{thread_return+163}
<ffffffff8025dad7>{schedule+236}
<ffffffff80245ff1>{pipe_wait+111}
<ffffffff802893ac>{autoremove_wake_function+0}
<ffffffff8025e7c8>{rt_mutex_lock+50}
<ffffffff8022ce81>{pipe_readv+785}
<ffffffff8025eaa6>{rt_lock_slowunlock+98}
<ffffffff8025f4a9>{__lock_text_start+9}
<ffffffff802aea38>{pipe_read+30}
<ffffffff8020afae>{vfs_read+171}
<ffffffff8020fdbc>{sys_read+71}
<ffffffff8025994e>{system_call+126}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<ffffffff8025d303>] .... __schedule+0xb3/0x57b
.....[<ffffffff8025dad7>] .. ( <= schedule+0xec/0x11a)

BUG: scheduling while atomic: make/0x00000001/23043

Call Trace:
<ffffffff802fbc2e>{plist_check_head+60}
<ffffffff8025d2eb>{__schedule+155}
<ffffffff8028f06a>{task_blocks_on_rt_mutex+643}
<ffffffff8029a81b>{free_pages_bulk+42}
<ffffffff802871de>{find_task_by_pid_type+24}
<ffffffff8029a81b>{free_pages_bulk+42}
<ffffffff8025dad7>{schedule+236}
<ffffffff8025e96d>{rt_lock_slowlock+416}
<ffffffff8025f4b8>{rt_lock+13}
<ffffffff8029a81b>{free_pages_bulk+42}
<ffffffff8029abf7>{__free_pages_ok+438}
<ffffffff8022c07e>{__free_pages+47}
<ffffffff80233bb0>{free_pages+128}
<ffffffff802580df>{free_task+26}
<ffffffff80245b52>{__put_task_struct+182}
<ffffffff80225900>{schedule_tail+152}
<ffffffff80259845>{ret_from_fork+5}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<ffffffff8025ea5a>] .... rt_lock_slowunlock+0x16/0xba
.....[<ffffffff8025f4a9>] .. ( <= __lock_text_start+0x9/0xb)


I also tried the compile with icecream running, and the machine simply
wedged during each of three attempts, once almost instantly. I was
able to use the sysrq keys (but nothing else: couldn't get the
capslock light when I toggled that key), and I dumped all the task
info. It's A LOT, since I was building with 'make -j40' or something.
I will probably try and put it up on my little webserver once I get
home (I have no mechanism here).

Is there anything else you want me to try: I can probably wave a
chicken over the machine or get some black candles or something?

--
Robert Crocombe
[email protected]

2006-08-26 01:28:38

by Robert Crocombe

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On 8/25/06, Robert Crocombe <[email protected]> wrote:
> Of 4 total attempts this way, two completed with no problems, one had

Drat, this isn't quite true. There was one attempt with 'make', and
that produced the 1st BUG, but the latter three attempts were with
'make -j40': two were okay, and one produced the two BUGs. Apologies.
I will make additional attempts with 'make' only.

--
Robert Crocombe
[email protected]

2006-08-26 02:37:09

by Robert Crocombe

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On 8/25/06, Robert Crocombe <[email protected]> wrote:
> Drat, this isn't quite true. There was one attempt with 'make', and
> that produced the 1st BUG, but the latter three attempts were with
> 'make -j40': two were okay, and one produced the two BUGs. Apologies.
> I will make additional attempts with 'make' only.

Well, I was just doing that, and I got this:

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at kernel/rtmutex.c:639
invalid opcode: 0000 [1] PREEMPT
CPU 0
Modules linked in: tg3
Pid: 7332, comm: ld Not tainted 2.6.17-rt8_UP_00 #3
RIP: 0010:[<ffffffff8025e89e>] <ffffffff8025e89e>{rt_lock_slowlock+209}
RSP: 0018:ffff8107ed70f9f8 EFLAGS: 00010246
RAX: ffff8107f2b13540 RBX: ffffffff804aa510 RCX: ffffffff804aacc0
RDX: ffff8107f2b13540 RSI: ffff8107ed70f9e8 RDI: ffffffff804aa510
RBP: ffff8107ed70fab8 R08: 0000000000000000 R09: ffff8107ed70f9f8
R10: 0000000000000002 R11: ffff8107fdcd8080 R12: ffff81001cbe3ad0
R13: 0000000000000000 R14: ffffffff8029a81b R15: 0000000000000001
FS: 00002b9a0c7e0d50(0000) GS:ffffffff805fe000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000005b9e5c CR3: 00000007eb86e000 CR4: 00000000000006e0
Process ld (pid: 7332, threadinfo ffff8107ed70e000, task ffff8107f2b13540)
Stack: 111111110000008c ffff8107ed70fa00 ffff8107ed70fa00 ffff8107ed70fa10
ffff8107ed70fa10 0000000000000000 111111110000008c ffff8107ed70fa30
ffff8107ed70fa30 ffff8107ed70fa40
Call Trace:
<ffffffff8025f4b8>{rt_lock+13}
<ffffffff8029a81b>{free_pages_bulk+42}
<ffffffff8029abf7>{__free_pages_ok+438}
<ffffffff8022c07e>{__free_pages+47}
<ffffffff80233bb0>{free_pages+128}
<ffffffff802580df>{free_task+26}
<ffffffff80245b52>{__put_task_struct+182}
<ffffffff8025d86e>{thread_return+163}
<ffffffff8025e876>{rt_lock_slowlock+169}
<ffffffff8025e0e3>{preempt_schedule_irq+78}
<ffffffff80259fa6>{retint_kernel+38}
<ffffffff80209d21>{get_page_from_freelist+204}
<ffffffff80209ce0>{get_page_from_freelist+139}
<ffffffff8020dee1>{__alloc_pages+116}
<ffffffff8022180a>{anon_vma_prepare+44}
<ffffffff80208468>{__handle_mm_fault+454}
<ffffffff8028f188>{rt_mutex_slowtrylock+174}
<ffffffff8020a705>{do_page_fault+1058}
<ffffffff8025eaa6>{rt_lock_slowunlock+98}
<ffffffff8025f4a9>{__lock_text_start+9}
<ffffffff8025a2f1>{error_exit+0}
---------------------------
| preempt count: 00000003 ]
| 3-level deep critical section nesting:
----------------------------------------
.. [<ffffffff8025d303>] .... __schedule+0xb3/0x57b
.....[<ffffffff8025e0e3>] .. ( <= preempt_schedule_irq+0x4e/0x84)
.. [<ffffffff8025e805>] .... rt_lock_slowlock+0x38/0x277
.....[<ffffffff8025f4b8>] .. ( <= rt_lock+0xd/0xf)
.. [<ffffffff80263268>] .... oops_begin+0x1b/0x55
.....[<ffffffff802638ff>] .. ( <= die+0x19/0x44)


Code: 0f 0b 68 71 7e 3f 80 c2 7f 02 65 48 8b 04 25 00 00 00 00 41
RIP <ffffffff8025e89e>{rt_lock_slowlock+209} RSP <ffff8107ed70f9f8>
<6>note: ld[7332] exited with preempt_count 2

A later example did the same thing, then started "wigging out",
eventually declaring:

<1>Fixing recursive fault but reboot is needed!

I have a full trace and sysrq-T for this as well.

--
Robert Crocombe
[email protected]

2006-08-26 10:28:37

by Bill Huey

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On Fri, Aug 25, 2006 at 06:24:29PM -0700, Robert Crocombe wrote:
> On 8/25/06, hui Bill Huey <[email protected]> wrote:
> > http://mmlinux.sourceforge.net/public/against-2.6.17-rt8-2.diff
> BUG: scheduling while atomic: make/0x00000001/14632
>
> Call Trace:
> <ffffffff802fbc2e>{plist_check_head+60}
> <ffffffff8025d2eb>{__schedule+155}
> <ffffffff8028f06a>{task_blocks_on_rt_mutex+643}
> <ffffffff8020efda>{__mod_page_state_offset+25}
> <ffffffff802871de>{find_task_by_pid_type+24}
> <ffffffff8020efda>{__mod_page_state_offset+25}
> <ffffffff8025dad7>{schedule+236}
> <ffffffff8025e96d>{rt_lock_slowlock+416}
> <ffffffff8025f4b8>{rt_lock+13}
> <ffffffff8020efda>{__mod_page_state_offset+25}
> <ffffffff8029ab91>{__free_pages_ok+336}
> <ffffffff8022c07e>{__free_pages+47}
> <ffffffff80233bb0>{free_pages+128}
> <ffffffff802580df>{free_task+26}
> <ffffffff80245b52>{__put_task_struct+182}
> <ffffffff8025d86e>{thread_return+163}
> <ffffffff8025dad7>{schedule+236}
> <ffffffff80245ff1>{pipe_wait+111}
> <ffffffff802893ac>{autoremove_wake_function+0}
> <ffffffff8025e7c8>{rt_mutex_lock+50}
> <ffffffff8022ce81>{pipe_readv+785}
> <ffffffff8025eaa6>{rt_lock_slowunlock+98}
> <ffffffff8025f4a9>{__lock_text_start+9}
> <ffffffff802aea38>{pipe_read+30}
> <ffffffff8020afae>{vfs_read+171}
> <ffffffff8020fdbc>{sys_read+71}
> <ffffffff8025994e>{system_call+126}
> ---------------------------
> | preempt count: 00000001 ]
> | 1-level deep critical section nesting:
> ----------------------------------------
> .. [<ffffffff8025d303>] .... __schedule+0xb3/0x57b
> .....[<ffffffff8025dad7>] .. ( <= schedule+0xec/0x11a)

Things like this make me wonder if you're even applying the patch correctly.

The occurrence of free_task() and friends should be completely gone. All
of those scheduling-in-atomic BUGs should be flat-out kernel panics and the
kernel should immediately stop. I'll look at it again to see if I missed a
compile option or something, but this is really odd because it should stop
the kernel dead in its tracks.

bill

2006-08-26 10:49:44

by Bill Huey

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On Fri, Aug 25, 2006 at 06:24:29PM -0700, Robert Crocombe wrote:
> Is there anything else you want me to try: I can probably wave a
> chicken over the machine or get some black candles or something?

Stop using icecream or anything that might do any kind of binary object
caching or header analysis to minimize compiles during the build. Make sure
you're really cleaning the entire kernel directory before each build. Make
sure it's a full build for starters.

Throw a #error in the __put_task_struct_inline(), not the alternative
__put_task_struct(). It should bark at compile time and fail to compile
kernel/fork.c.

The function __put_task_struct() should never show up in a stack trace
EVER. That function has been renamed under all things CONFIG_PREEMPT_RT
by my addendum patches. That's why I'm starting to think it's your
build environment or you're misapplying the patches.

bill

2006-08-28 18:34:02

by Robert Crocombe

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On 8/26/06, hui Bill Huey <[email protected]> wrote:

> Stop using icecream or anything that might to any kind of binary object
> caching or header analysis to minimize compiles during the build. Make sure
> you're really cleaning the entire kernel directory before each build. Make
> sure it'a full build for starters.

Yes.

> Throw a #error in the __put_task_struct_inline(), not the alternative
> __put_task_struct(). It should bark at compile time and fail to compile
> kernel/fork.c

added

#ifdef CONFIG_PREEMPT_RT
void __put_task_struct_inline(struct task_struct *tsk)
#error Boo, you can't do this.

then

[rcrocomb@spanky test_2.6.17-rt8]$ make clean ; make -j4 > /dev/null
CLEAN /home/rcrocomb/kernel/test_2.6.17-rt8
CLEAN arch/x86_64/ia32
CLEAN init
CLEAN usr
CLEAN .tmp_versions
kernel/fork.c:134:6: error: #error Boo, you can't do this.
make[1]: *** [kernel/fork.o] Error 1

> The function __put_task_struct() should never show up in a stack trace
> EVER. That function has been renamed under all things CONFIG_PREEMPT_RT
> by my addendum patches. That's why I'm starting to think it's your
> build environment or you're misapplying the patches.

from sched.h:

static inline void put_task_struct(struct task_struct *t)
{
	if (atomic_dec_and_test(&t->usage))
		__put_task_struct(t);
}

but is it used?

[rcrocomb@spanky test_2.6.17-rt8]$ find ./ -name \*.c -print0 | xargs
-0 grep -l "put_task_struct(" | wc -l
30

Yes it is. Also, beginning at line 292 in what I call t6.diff (the
most recent URL, I believe):

+#ifdef CONFIG_PREEMPT_RT
.
.
.

+
+/*
+ * We dont want to do complex work from the scheduler with preemption
+ * disabled, therefore we delay the work to a per-CPU worker thread.
+ */
+void fastcall __put_task_struct(struct task_struct *task)
+{
+ struct list_head *head;
+
+ head = &get_cpu_var(delayed_put_task_struct_list);
+ list_add_tail(&task->delayed_drop, head);
+
+ _wake_cpu_desched_task();
+
+ put_cpu_var(delayed_put_task_struct_list);
+}
+#endif
+

So I think you're mistaken. Patch is applied like this:

[rcrocomb@spanky test_2.6.17-rt8]$ patch -p0 < ../patches/t6.diff
patching file arch/x86_64/kernel/smp.c
patching file arch/x86_64/mm/numa.c
patching file fs/jbd/journal.c
patching file include/linux/hardirq.h
patching file include/linux/init_task.h
patching file include/linux/sched.h
patching file include/linux/seqlock.h
Hunk #1 succeeded at 1 with fuzz 1.
patching file kernel/exit.c
patching file kernel/fork.c
patching file kernel/panic.c
patching file kernel/printk.c
patching file kernel/rtmutex.c
patching file kernel/sched.c


Note that I've posted the sysrq stuff at:

http://66.93.162.249/~rcrocomb/2.6.16-rt8/

Sorry about screwing up your name: I turned 'hui' into your last name somehow.

--
Robert Crocombe
[email protected]

2006-08-28 20:28:52

by Bill Huey

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)


[...resend...]

On Mon, Aug 28, 2006 at 11:33:59AM -0700, Robert Crocombe wrote:
> On 8/26/06, hui Bill Huey <[email protected]> wrote:
> >The function __put_task_struct() should never show up in a stack trace
> >EVER. That function has been renamed under all things
> >CONFIG_PREEMPT_RT
> >by my addendum patches. That's why I'm starting to think it's your
> >build environment or you're misapplying the patches.
>
> but is it used?
...
> +void fastcall __put_task_struct(struct task_struct *task)
> +{
> + struct list_head *head;
> +
> + head = &get_cpu_var(delayed_put_task_struct_list);
> + list_add_tail(&task->delayed_drop, head);
> +
> + _wake_cpu_desched_task();
> +
> + put_cpu_var(delayed_put_task_struct_list);
> +}
> +#endif
> +
>
> So I think you're mistaken. Patch is applied like this:

The patch is applied correctly.

This is what I'm having a problem with in your stack trace...

I was unclear in explaining that __put_task_struct() should never
appear with free_task() in a stack trace, as you can clearly see
from the implementation. All it's supposed to do is wake a thread,
so how you're getting those things simultaneously in the stack
trace is completely baffling to me. Could you double check that
it's booting the right kernel? Maybe make sure it's calling
that version of the function with printks or something?

bill

2006-08-29 04:05:29

by Robert Crocombe

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On 8/28/06, hui Bill Huey <[email protected]> wrote:

> I was unclear in explaining that __put_task_struct() should never
> appear with free_task() in a stack trace, as you can clearly see
> from the implementation. All it's supposed to do is wake a thread,
> so how you're getting those things simultaneously in the stack
> trace is completely baffling to me. Could you double check that
> it's booting the right kernel? Maybe make sure it's calling
> that version of the function with printks or something?

So I built another kernel with the most recent "t6" patches from
scratch and stuck it in a directory 2.6.17-rt8-mrproper.

[rcrocomb@spanky ~]$ cd kernel/
[rcrocomb@spanky kernel]$ rm -Rf test_2.6.17-rt8/
[rcrocomb@spanky kernel]$ tar xf source/linux-2.6.16.tar
[rcrocomb@spanky kernel]$ mv linux-2.6.16/ 2.6.17-rt8-mrproper
[rcrocomb@spanky kernel]$ cd 2.6.17-rt8-mrproper/
[rcrocomb@spanky 2.6.17-rt8-mrproper]$ patch -s -p1 < ../patches/patch-2.6.17
[rcrocomb@spanky 2.6.17-rt8-mrproper]$ patch -s -p1 <
../patches/patch-2.6.17-rt8
[rcrocomb@spanky 2.6.17-rt8-mrproper]$ patch -s -p0 < ../patches/t6.diff
[rcrocomb@spanky 2.6.17-rt8-mrproper]$ uname -r
2.6.17-rt8_UP_00
[rcrocomb@spanky 2.6.17-rt8-mrproper]$ zcat /proc/config.gz > ./.config
[rcrocomb@spanky 2.6.17-rt8-mrproper]$ make oldconfig

I added some printk()s:

[rcrocomb@spanky 2.6.17-rt8-mrproper]$ diff -u
~/kernel/2.6.17-rt8-t6/kernel/fork.c kernel/fork.c
--- /home/rcrocomb/kernel/2.6.17-rt8-t6/kernel/fork.c 2006-08-25
13:32:43.000000000 -0700
+++ kernel/fork.c 2006-08-28 19:52:22.000000000 -0700
@@ -1753,12 +1753,16 @@
{
struct list_head *head;

+ printk(KERN_DEBUG "+");
+
head = &get_cpu_var(delayed_put_task_struct_list);
list_add_tail(&task->delayed_drop, head);

_wake_cpu_desched_task();

put_cpu_var(delayed_put_task_struct_list);
+
+ printk(KERN_DEBUG "-\n");
}
#endif

but these make the machine sad:

+<3>BUG: sleeping function called from invalid context modprobe(2927)
at kernel/rtmutex.c:777
in_atomic():1 [00000001], irqs_disabled():0

Call Trace:
<ffffffff80304596>{vgacon_cursor+470}
<ffffffff8020b2c6>{__might_sleep+271}
<ffffffff8025f276>{rt_lock+155}
<ffffffff80337cb6>{serial8250_console_write+86}
<ffffffff803310e1>{vt_console_print+550}
<ffffffff80279c24>{__call_console_drivers+75}
<ffffffff80279c9b>{_call_console_drivers+102}
<ffffffff80215e11>{release_console_sem+372}
<ffffffff8027a212>{vprintk+671}
<ffffffff802629c9>{__switch_to+42}
<ffffffff8028f2ff>{debug_rt_mutex_lock+186}
<ffffffff8027a2cd>{printk+103}
<ffffffff80244d0b>{try_to_wake_up+358}
<ffffffff802629c9>{__switch_to+42}
<ffffffff80245886>{__put_task_struct+26}
<ffffffff8025d556>{thread_return+144}
<ffffffff80278939>{rt_mutex_setprio+206}
<ffffffff8025d662>{preempt_schedule+80}
<ffffffff8028d9af>{rt_mutex_adjust_prio+80}
<ffffffff8025ec99>{rt_lock_slowunlock+181}
<ffffffff8025f1d9>{__lock_text_start+9}
<ffffffff8025f4c5>{rt_write_unlock+9}
<ffffffff802149df>{do_exit+2100}
<ffffffff8028f452>{debug_rt_mutex_unlock+293}
<ffffffff80246f79>{cpu_idle+0}
<ffffffff8024a87a>{sys_exit_group+18}
<ffffffff8025964e>{system_call+126}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<ffffffff8025d003>] .... __schedule+0xb3/0x576
.....[<ffffffff8025d662>] .. ( <= preempt_schedule+0x50/0x81)
-

I originally was using KERN_ERR, but the machine wouldn't finish
booting, so I thought that by using KERN_DEBUG, I could boot it and
then switch loglevels with Sysrq. So I think it's relatively assured
that it is indeed going into this version of __put_task_struct(). If
it's at all valuable, the last character printing before lockup was
always the "+" character, but this last char is only on the video
console and not the serial.

Err, I hope this is what you meant. I have unlimited bits, and can
sprinkle them wherever you'd like.
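
If the printk()s themselves are part of the problem (printk() ends up
taking a sleeping rt lock via the console path, which is what the
"sleeping function called from invalid context" splat above is about),
a non-sleeping way to instrument this might be a plain per-CPU counter
instead -- just a sketch, with put_task_hits as a made-up name:

static DEFINE_PER_CPU(unsigned long, put_task_hits);

void fastcall __put_task_struct(struct task_struct *task)
{
	struct list_head *head;

	head = &get_cpu_var(delayed_put_task_struct_list);
	list_add_tail(&task->delayed_drop, head);
	__get_cpu_var(put_task_hits)++;	/* bump a counter instead of printk() */

	_wake_cpu_desched_task();

	put_cpu_var(delayed_put_task_struct_list);
}

The counters could then be read back later from a context that is
allowed to sleep.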

Also, I switched up configs a bit (disabling a few debugging options)
because I was curious as to what gcc -E would show when processing
fork.c, and it appeared that lots of code could be snipped by doing
this. I kinda hoped there was some cool GNU utility to generate a
static call graph or similar (I was trying to find how free_task()
could come after __put_task_struct() as you mentioned). I've appended
a diff of the config from the original UP kernel and 'mrproper'.
Note that UP is vanilla 2.6.17-rt8 while mrproper includes your t6
patches (but its CONFIG_LOCALVERSION is UP_01 vs UP_00: consistency is
not my strong suit).

After the config diff is the result of the command:

gcc -D__KERNEL__ -I../include -E fork.c

(which I hope is reasonable) while in the kernel directory. I hoped
that a free_task() would magically appear, but was disappointed, and
tracing down and expanding all the various functions made me dizzy.
Someone needs to hook 'sparse' and graphviz together >:<.

--
Robert Crocombe
[email protected]

[rcrocomb@spanky 2.6.17-rt8-mrproper]$ diff -u
~/kernel/2.6.17-rt8-UP/.config ./.config
--- /home/rcrocomb/kernel/2.6.17-rt8-UP/.config 2006-08-25
17:14:28.000000000 -0700
+++ ./.config 2006-08-28 19:57:33.000000000 -0700
@@ -1,7 +1,7 @@
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.17-rt8
-# Fri Aug 25 17:14:28 2006
+# Mon Aug 28 19:57:33 2006
#
CONFIG_X86_64=y
CONFIG_64BIT=y
@@ -30,7 +30,7 @@
#
# General setup
#
-CONFIG_LOCALVERSION="_UP_00"
+CONFIG_LOCALVERSION="_UP_01"
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
@@ -1026,19 +1026,10 @@
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_PI_LIST=y
# CONFIG_RT_MUTEX_TESTER is not set
-CONFIG_WAKEUP_TIMING=y
-CONFIG_WAKEUP_LATENCY_HIST=y
+# CONFIG_WAKEUP_TIMING is not set
CONFIG_PREEMPT_TRACE=y
-CONFIG_CRITICAL_PREEMPT_TIMING=y
-CONFIG_PREEMPT_OFF_HIST=y
-CONFIG_CRITICAL_IRQSOFF_TIMING=y
-CONFIG_INTERRUPT_OFF_HIST=y
-CONFIG_CRITICAL_TIMING=y
-CONFIG_DEBUG_TRACE_IRQFLAGS=y
-CONFIG_LATENCY_TIMING=y
-CONFIG_CRITICAL_LATENCY_HIST=y
-CONFIG_LATENCY_HIST=y
-# CONFIG_LATENCY_TRACE is not set
+# CONFIG_CRITICAL_PREEMPT_TIMING is not set
+# CONFIG_CRITICAL_IRQSOFF_TIMING is not set
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_INFO=y
# CONFIG_DEBUG_FS is not set

All the formatting is my own, and I stripped some do-while(0)s.

void __put_task_struct(struct task_struct *task)
{
	struct list_head *head;

	head = &(*({
		add_preempt_count(1);
		__asm__ __volatile__("": : :"memory");

		&(*({
			unsigned long __ptr;
			__asm__ ("" : "=r"(__ptr) :
				"0"(&per_cpu__delayed_put_task_struct_list));
			(typeof(&per_cpu__delayed_put_task_struct_list)) (__ptr + (({
				typeof(((struct x8664_pda *)0)->data_offset) ret__;

				switch ((sizeof(((struct x8664_pda *)0)->data_offset))) {
				case 2: asm volatile(BLORP_A); break; /* it's all asm; I have it, if you care */
				case 4: asm volatile(BLORP_B); break;
				case 8: asm volatile(BLORP_C); break;
				default: __bad_pda_field();
				}

				ret__;
			})));
		}));
	}));

	list_add_tail(&task->delayed_drop, head);

	_wake_cpu_desched_task();

	__asm__ __volatile__("": : :"memory");
	sub_preempt_count(1);
	__asm__ __volatile__("": : :"memory");

	if (__builtin_expect(!!(test_ti_thread_flag(current_thread_info(), 3)), 0))
		preempt_schedule();
}

2006-08-29 17:12:22

by Bill Huey

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On Mon, Aug 28, 2006 at 09:05:27PM -0700, Robert Crocombe wrote:
> On 8/28/06, hui Bill Huey <[email protected]> wrote:
>
> >I was unclear in explaining that __put_task_struct() should never
> >appear with free_task() in a stack trace, as you can clearly see
> >from the implementation. All it's supposed to do is wake a thread,
> >so how you're getting those things simultaneously in the stack
> >trace is completely baffling to me. Could you double check that
> >it's booting the right kernel? Maybe make sure it's calling
> >that version of the function with printks or something?
>
> So I built another kernel with the most recent "t6" patches from
> scratch and stuck in a directory 2.6.17-rt8-mrproper.
>
> [rcrocomb@spanky ~]$ cd kernel/
> [rcrocomb@spanky kernel]$ rm -Rf test_2.6.17-rt8/
> [rcrocomb@spanky kernel]$ tar xf source/linux-2.6.16.tar
> [rcrocomb@spanky kernel]$ mv linux-2.6.16/ 2.6.17-rt8-mrproper
> [rcrocomb@spanky kernel]$ cd 2.6.17-rt8-mrproper/
> [rcrocomb@spanky 2.6.17-rt8-mrproper]$ patch -s -p1 <
> ../patches/patch-2.6.17
> [rcrocomb@spanky 2.6.17-rt8-mrproper]$ patch -s -p1 <
> ../patches/patch-2.6.17-rt8
> [rcrocomb@spanky 2.6.17-rt8-mrproper]$ patch -s -p0 < ../patches/t6.diff
> [rcrocomb@spanky 2.6.17-rt8-mrproper]$ uname -r
> 2.6.17-rt8_UP_00

I'm going to ask what seems like a really stupid question. What's "t6"?
And what's the relationship of that to my patches?

[side note: I'm headed to Burning Man tomorrow and won't be on a computer
for 6 days until I come back]

bill

2006-08-29 17:19:44

by Robert Crocombe

[permalink] [raw]
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)

On 8/29/06, hui Bill Huey <[email protected]> wrote:
> I'm going to ask what seems like a really stupid question. What's "t6" ?
> and what's the relationship of that to my patches ?

Oh, sorry, thought I'd made that clear. Since the first patches you
put up were "t.diff" and "t2.diff", I've just been incrementing a
counter:

-rw-r--r-- 1 rcrocomb rcrocomb 1941 Aug 14 08:28 t.diff
-rw-r--r-- 1 rcrocomb rcrocomb 5875 Aug 14 08:28 t2.diff
-rw-r--r-- 1 rcrocomb rcrocomb 9397 Aug 21 15:25 t3.diff
-rw-r--r-- 1 rcrocomb rcrocomb 11289 Aug 23 09:55 t4.diff
-rw-r--r-- 1 rcrocomb rcrocomb 15460 Aug 23 17:37 t5.diff
-rw-r--r-- 1 rcrocomb rcrocomb 18119 Aug 25 13:31 t6.diff

[rcrocomb@spanky patches]$ head t6.diff
#
# old_revision [c9c382f0a12ab75bee370938f9b7ad5b582cf39a]
#
# patch "arch/x86_64/kernel/smp.c"
# from [b354a6e4a080ad49f4213ab4334ca2e57ddf1bdc]
# to [a76a90aaad275ab3775d32ac3ae500fe120c5b8c]
#
# patch "arch/x86_64/mm/numa.c"
# from [72976a7a4aff795f020f34e91e98e0defa280d3b]
# to [9f2858d19a369e3aceaf084796837bd02e9bc6b7]

> [side note: I'm headed to Burning Man tomorrow and won't be on a computer
> for 6 days until I come back]

Nice!

--
Robert Crocombe
[email protected]