Dear RT folks!
I'm pleased to announce the v3.14-rt1 patch set.
Changes since v3.12.15-rt25
- I dropped the sparc64 patches I had in the queue. They did not apply
cleanly because the code in v3.14 changed in the MMU area. That was also
when I remembered that it was not working perfectly either.
- Scott Wood pointed out that I forgot a return value in the latest
gianfar fixup for RT. Return value added, thanks Scott.
This -RT series didn't crash within ~4h of testing on my ARM and
x86-32.
x86-64 crashed after I started hackbench. I figured out that the crash
does not happen with lazy-preempt disabled. Therefore the last but one
patch in the queue disables lazy preempt on x86-64. With this change the
test box survived ~2h without a crash. I will look at this later, but it looks
good for now.
I decided to release it now because otherwise it would be lying around
for probably the next two weeks or so. So here it is, enjoy.
Known issues:
- bcache is disabled.
The RT patch against 3.14 can be found here:
https://www.kernel.org/pub/linux/kernel/projects/rt/3.14/patch-3.14.0-rt1.patch.xz
The split quilt queue is available at:
https://www.kernel.org/pub/linux/kernel/projects/rt/3.14/patches-3.14.0-rt1.tar.xz
Sebastian
On 11.04.2014 22:57, Sebastian Andrzej Siewior wrote:
> Dear RT folks!
>
> I'm pleased to announce the v3.14-rt1 patch set.
Hooooooooooooooooray!
--
Pavel.
Hi Sebastian,
On Fri, 2014-04-11 at 20:57 +0200, Sebastian Andrzej Siewior wrote:
> Dear RT folks!
>
> I'm pleased to announce the v3.14-rt1 patch set.
This hunk in hotplug-light-get-online-cpus.patch looks like a bug.
@@ -333,7 +449,7 @@ static int __ref _cpu_down(unsigned int
/* CPU didn't die: tell everyone. Can't complain. */
smpboot_unpark_threads(cpu);
cpu_notify_nofail(CPU_DOWN_FAILED | mod, hcpu);
- goto out_release;
+ goto out_cancel;
}
BUG_ON(cpu_online(cpu));
> x86-64 crashed after I started hackbench. I figured out that the crash
> does not happen with lazy-preempt disabled. Therefore the last but one
> patch in the queue disables lazy preempt on x86-64. With this change the
> test box survived ~2h without a crash. I will look at this later, but it looks
> good for now.
Ah, I had trouble there a while back too. I'll try to scrape up cycles
for round 2, and see who begs for mercy this time, it or me again.
-Mike
On Sat, 2014-04-19 at 16:46 +0200, Mike Galbraith wrote:
> Hi Sebastian,
>
> On Fri, 2014-04-11 at 20:57 +0200, Sebastian Andrzej Siewior wrote:
> > Dear RT folks!
> >
> > I'm pleased to announce the v3.14-rt1 patch set.
>
> This hunk in hotplug-light-get-online-cpus.patch looks like a bug.
>
> @@ -333,7 +449,7 @@ static int __ref _cpu_down(unsigned int
> /* CPU didn't die: tell everyone. Can't complain. */
> smpboot_unpark_threads(cpu);
> cpu_notify_nofail(CPU_DOWN_FAILED | mod, hcpu);
> - goto out_release;
> + goto out_cancel;
> }
> BUG_ON(cpu_online(cpu));
>
Another little bug: this hunk of patches/stomp-machine-raw-lock.patch
should be while (atomic_read(&done.nr_todo)) instead.
@@ -647,7 +671,7 @@ int stop_machine_from_inactive_cpu(int (
ret = multi_cpu_stop(&msdata);
/* Busy wait for completion. */
- while (!completion_done(&done.completion))
+ while (!atomic_read(&done.nr_todo))
cpu_relax();
mutex_unlock(&stop_cpus_mutex);
On Fri, 2014-04-11 at 20:57 +0200, Sebastian Andrzej Siewior wrote:
> This -RT series didn't crash within ~4h of testing on my ARM and
> x86-32.
> x86-64 crashed after I started hackbench. I figured out that the crash
> does not happen with lazy-preempt disabled. Therefore the last but one
> patch in the queue disables lazy preempt on x86-64. With this change the
> test box survived ~2h without a crash. I will look at this later, but it looks
> good for now.
I think the below fixes it (in a more or less minimalist way), but it's
not very pretty. Methinks it would be prettier to either clone the x86
percpu + fold logic, or neutralize that optimization completely when
PREEMPT_LAZY is enabled.
The x86_32 bits are completely untested; x86_64 hasn't exploded... yet :)
---
include/linux/preempt.h | 3 +--
arch/x86/include/asm/preempt.h | 8 ++++++++
arch/x86/kernel/asm-offsets.c | 1 +
arch/x86/kernel/entry_32.S | 9 ++++++---
arch/x86/kernel/entry_64.S | 7 +++++--
5 files changed, 21 insertions(+), 7 deletions(-)
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -126,8 +126,7 @@ do { \
#define preempt_enable_notrace() \
do { \
barrier(); \
- if (unlikely(__preempt_count_dec_and_test() || \
- test_thread_flag(TIF_NEED_RESCHED_LAZY))) \
+ if (unlikely(__preempt_count_dec_and_test())) \
__preempt_schedule_context(); \
} while (0)
#else
--- a/arch/x86/include/asm/preempt.h
+++ b/arch/x86/include/asm/preempt.h
@@ -94,7 +94,11 @@ static __always_inline bool __preempt_co
{
if (____preempt_count_dec_and_test())
return true;
+#ifdef CONFIG_PREEMPT_LAZY
return test_thread_flag(TIF_NEED_RESCHED_LAZY);
+#else
+ return false;
+#endif
}
/*
@@ -102,8 +106,12 @@ static __always_inline bool __preempt_co
*/
static __always_inline bool should_resched(void)
{
+#ifdef CONFIG_PREEMPT_LAZY
return unlikely(!__this_cpu_read_4(__preempt_count) || \
test_thread_flag(TIF_NEED_RESCHED_LAZY));
+#else
+ return unlikely(!__this_cpu_read_4(__preempt_count));
+#endif
}
#ifdef CONFIG_PREEMPT
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -72,4 +72,5 @@ void common(void) {
BLANK();
DEFINE(PTREGS_SIZE, sizeof(struct pt_regs));
+ DEFINE(_PREEMPT_ENABLED, PREEMPT_ENABLED);
}
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -365,19 +365,22 @@ ENTRY(resume_kernel)
need_resched:
# preempt count == 0 + NEED_RS set?
cmpl $0,PER_CPU_VAR(__preempt_count)
+#ifndef CONFIG_PREEMPT_LAZY
+ jnz restore_all
+#else
jz test_int_off
# atleast preempt count == 0 ?
- cmpl $_TIF_NEED_RESCHED,PER_CPU_VAR(__preempt_count)
+ cmpl $_PREEMPT_ENABLED,PER_CPU_VAR(__preempt_count)
jne restore_all
cmpl $0,TI_preempt_lazy_count(%ebp) # non-zero preempt_lazy_count ?
jnz restore_all
- testl $_TIF_NEED_RESCHED_LAZY, %ecx
+ testl $_TIF_NEED_RESCHED_LAZY, TI_flags(%ebp)
jz restore_all
-
test_int_off:
+#endif
testl $X86_EFLAGS_IF,PT_EFLAGS(%esp) # interrupts off (exception path) ?
jz restore_all
call preempt_schedule_irq
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1104,10 +1104,13 @@ ENTRY(native_iret)
/* rcx: threadinfo. interrupts off. */
ENTRY(retint_kernel)
cmpl $0,PER_CPU_VAR(__preempt_count)
+#ifndef CONFIG_PREEMPT_LAZY
+ jnz retint_restore_args
+#else
jz check_int_off
# atleast preempt count == 0 ?
- cmpl $_TIF_NEED_RESCHED,PER_CPU_VAR(__preempt_count)
+ cmpl $_PREEMPT_ENABLED,PER_CPU_VAR(__preempt_count)
jnz retint_restore_args
cmpl $0, TI_preempt_lazy_count(%rcx)
@@ -1115,8 +1118,8 @@ ENTRY(retint_kernel)
bt $TIF_NEED_RESCHED_LAZY,TI_flags(%rcx)
jnc retint_restore_args
-
check_int_off:
+#endif
bt $9,EFLAGS-ARGOFFSET(%rsp) /* interrupts off? */
jnc retint_restore_args
call preempt_schedule_irq
On Wed, 23 Apr 2014 12:37:05 +0200
Mike Galbraith <[email protected]> wrote:
> On Fri, 2014-04-11 at 20:57 +0200, Sebastian Andrzej Siewior wrote:
>
> > This -RT series didn't crash within ~4h of testing on my ARM and
> > x86-32.
> > x86-64 crashed after I started hackbench. I figured out that the crash
> > does not happen with lazy-preempt disabled. Therefore the last but one
> > patch in the queue disables lazy preempt on x86-64. With this change the
> > test box survived ~2h without a crash. I will look at this later, but it looks
> > good for now.
>
> I think the below fixes it (in a more or less minimalist way), but it's
> not very pretty. Methinks it would be prettier to either clone the x86
> percpu + fold logic, or neutralize that optimization completely when
> PREEMPT_LAZY is enabled.
>
> The x86_32 bits are completely untested; x86_64 hasn't exploded... yet :)
>
This patch makes sense to me.
Acked-by: Steven Rostedt <[email protected]>
-- Steve
Turning lockdep on, it says it's busted.
(I'll go stare at it, maybe the beast will blink first for a change)
[ 0.000000] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[ 0.000000] ... MAX_LOCKDEP_SUBCLASSES: 8
[ 0.000000] ... MAX_LOCK_DEPTH: 48
[ 0.000000] ... MAX_LOCKDEP_KEYS: 8191
[ 0.000000] ... CLASSHASH_SIZE: 4096
[ 0.000000] ... MAX_LOCKDEP_ENTRIES: 16384
[ 0.000000] ... MAX_LOCKDEP_CHAINS: 32768
[ 0.000000] ... CHAINHASH_SIZE: 16384
[ 0.000000] memory used by lock dependency info: 6367 kB
[ 0.000000] per task-struct memory footprint: 2688 bytes
[ 0.000000] ------------------------
[ 0.000000] | Locking API testsuite:
[ 0.000000] ----------------------------------------------------------------------------
[ 0.000000] | spin |wlock |rlock |mutex | wsem | rsem |
[ 0.000000] --------------------------------------------------------------------------
[ 0.000000] A-A deadlock: ok | ok |FAILED|
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.1-rt1 #16
[ 0.000000] Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007
[ 0.000000] 0000000000000002 ffffffff81a01f28 ffffffff815e12a5 ffffffff810b7727
[ 0.000000] 0000000000000001 ffffffff81a01f58 ffffffff815e1db2 0000000000000000
[ 0.000000] 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01f68
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff815e12a5>] dump_stack+0x4f/0x7c
[ 0.000000] [<ffffffff810b7727>] ? console_trylock_for_printk+0x37/0xf0
[ 0.000000] [<ffffffff815e1db2>] dotest+0x5f/0xc7
[ 0.000000] [<ffffffff812e710f>] locking_selftest+0xdf/0xb30
[ 0.000000] [<ffffffff81d51d8f>] start_kernel+0x215/0x327
[ 0.000000] [<ffffffff81d51a0c>] ? repair_env_string+0x5a/0x5a
[ 0.000000] [<ffffffff81d9f4c1>] ? memblock_reserve+0x49/0x4e
[ 0.000000] [<ffffffff81d515b2>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81d516a4>] x86_64_start_kernel+0xf0/0xf7
[ 0.000000] ok | ok | ok |
[ 0.000000] A-B-B-A deadlock: ok | ok |FAILED|
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.1-rt1 #16
[ 0.000000] Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007
[ 0.000000] 0000000000000002 ffffffff81a01f28 ffffffff815e12a5 ffffffff810b7727
[ 0.000000] 0000000000000001 ffffffff81a01f58 ffffffff815e1db2 0000000000000000
[ 0.000000] 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01f68
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff815e12a5>] dump_stack+0x4f/0x7c
[ 0.000000] [<ffffffff810b7727>] ? console_trylock_for_printk+0x37/0xf0
[ 0.000000] [<ffffffff815e1db2>] dotest+0x5f/0xc7
[ 0.000000] [<ffffffff812e719e>] locking_selftest+0x16e/0xb30
[ 0.000000] [<ffffffff81d51d8f>] start_kernel+0x215/0x327
[ 0.000000] [<ffffffff81d51a0c>] ? repair_env_string+0x5a/0x5a
[ 0.000000] [<ffffffff81d9f4c1>] ? memblock_reserve+0x49/0x4e
[ 0.000000] [<ffffffff81d515b2>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81d516a4>] x86_64_start_kernel+0xf0/0xf7
[ 0.000000] ok | ok | ok |
[ 0.000000] A-B-B-C-C-A deadlock: ok | ok |FAILED|
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.1-rt1 #16
[ 0.000000] Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007
[ 0.000000] 0000000000000002 ffffffff81a01f28 ffffffff815e12a5 ffffffff810b7727
[ 0.000000] 0000000000000001 ffffffff81a01f58 ffffffff815e1db2 0000000000000000
[ 0.000000] 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01f68
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff815e12a5>] dump_stack+0x4f/0x7c
[ 0.000000] [<ffffffff810b7727>] ? console_trylock_for_printk+0x37/0xf0
[ 0.000000] [<ffffffff815e1db2>] dotest+0x5f/0xc7
[ 0.000000] [<ffffffff812e722d>] locking_selftest+0x1fd/0xb30
[ 0.000000] [<ffffffff81d51d8f>] start_kernel+0x215/0x327
[ 0.000000] [<ffffffff81d51a0c>] ? repair_env_string+0x5a/0x5a
[ 0.000000] [<ffffffff81d9f4c1>] ? memblock_reserve+0x49/0x4e
[ 0.000000] [<ffffffff81d515b2>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81d516a4>] x86_64_start_kernel+0xf0/0xf7
[ 0.000000] ok | ok | ok |
[ 0.000000] A-B-C-A-B-C deadlock: ok | ok |FAILED|
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.1-rt1 #16
[ 0.000000] Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007
[ 0.000000] 0000000000000002 ffffffff81a01f28 ffffffff815e12a5 ffffffff810b7727
[ 0.000000] 0000000000000001 ffffffff81a01f58 ffffffff815e1db2 0000000000000000
[ 0.000000] 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01f68
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff815e12a5>] dump_stack+0x4f/0x7c
[ 0.000000] [<ffffffff810b7727>] ? console_trylock_for_printk+0x37/0xf0
[ 0.000000] [<ffffffff815e1db2>] dotest+0x5f/0xc7
[ 0.000000] [<ffffffff812e72bc>] locking_selftest+0x28c/0xb30
[ 0.000000] [<ffffffff81d51d8f>] start_kernel+0x215/0x327
[ 0.000000] [<ffffffff81d51a0c>] ? repair_env_string+0x5a/0x5a
[ 0.000000] [<ffffffff81d9f4c1>] ? memblock_reserve+0x49/0x4e
[ 0.000000] [<ffffffff81d515b2>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81d516a4>] x86_64_start_kernel+0xf0/0xf7
[ 0.000000] ok | ok | ok |
[ 0.000000] A-B-B-C-C-D-D-A deadlock: ok | ok |FAILED|
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.1-rt1 #16
[ 0.000000] Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007
[ 0.000000] 0000000000000002 ffffffff81a01f28 ffffffff815e12a5 ffffffff810b7727
[ 0.000000] 0000000000000001 ffffffff81a01f58 ffffffff815e1db2 0000000000000000
[ 0.000000] 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01f68
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff815e12a5>] dump_stack+0x4f/0x7c
[ 0.000000] [<ffffffff810b7727>] ? console_trylock_for_printk+0x37/0xf0
[ 0.000000] [<ffffffff815e1db2>] dotest+0x5f/0xc7
[ 0.000000] [<ffffffff812e734b>] locking_selftest+0x31b/0xb30
[ 0.000000] [<ffffffff81d51d8f>] start_kernel+0x215/0x327
[ 0.000000] [<ffffffff81d51a0c>] ? repair_env_string+0x5a/0x5a
[ 0.000000] [<ffffffff81d9f4c1>] ? memblock_reserve+0x49/0x4e
[ 0.000000] [<ffffffff81d515b2>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81d516a4>] x86_64_start_kernel+0xf0/0xf7
[ 0.000000] ok | ok | ok |
[ 0.000000] A-B-C-D-B-D-D-A deadlock: ok | ok |FAILED|
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.1-rt1 #16
[ 0.000000] Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007
[ 0.000000] 0000000000000002 ffffffff81a01f28 ffffffff815e12a5 ffffffff810b7727
[ 0.000000] 0000000000000001 ffffffff81a01f58 ffffffff815e1db2 0000000000000000
[ 0.000000] 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01f68
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff815e12a5>] dump_stack+0x4f/0x7c
[ 0.000000] [<ffffffff810b7727>] ? console_trylock_for_printk+0x37/0xf0
[ 0.000000] [<ffffffff815e1db2>] dotest+0x5f/0xc7
[ 0.000000] [<ffffffff812e73da>] locking_selftest+0x3aa/0xb30
[ 0.000000] [<ffffffff81d51d8f>] start_kernel+0x215/0x327
[ 0.000000] [<ffffffff81d51a0c>] ? repair_env_string+0x5a/0x5a
[ 0.000000] [<ffffffff81d9f4c1>] ? memblock_reserve+0x49/0x4e
[ 0.000000] [<ffffffff81d515b2>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81d516a4>] x86_64_start_kernel+0xf0/0xf7
[ 0.000000] ok | ok | ok |
[ 0.000000] A-B-C-D-B-C-D-A deadlock: ok | ok |FAILED|
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.1-rt1 #16
[ 0.000000] Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007
[ 0.000000] 0000000000000002 ffffffff81a01f28 ffffffff815e12a5 ffffffff810b7727
[ 0.000000] 0000000000000001 ffffffff81a01f58 ffffffff815e1db2 0000000000000000
[ 0.000000] 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01f68
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff815e12a5>] dump_stack+0x4f/0x7c
[ 0.000000] [<ffffffff810b7727>] ? console_trylock_for_printk+0x37/0xf0
[ 0.000000] [<ffffffff815e1db2>] dotest+0x5f/0xc7
[ 0.000000] [<ffffffff812e7469>] locking_selftest+0x439/0xb30
[ 0.000000] [<ffffffff81d51d8f>] start_kernel+0x215/0x327
[ 0.000000] [<ffffffff81d51a0c>] ? repair_env_string+0x5a/0x5a
[ 0.000000] [<ffffffff81d9f4c1>] ? memblock_reserve+0x49/0x4e
[ 0.000000] [<ffffffff81d515b2>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81d516a4>] x86_64_start_kernel+0xf0/0xf7
[ 0.000000] ok | ok | ok |
[ 0.000000] double unlock: ok |
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] WARNING: CPU: 0 PID: 0 at kernel/sched/core.c:2660 migrate_disable+0xbd/0xd0()
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.1-rt1 #16
[ 0.000000] Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007
[ 0.000000] 0000000000000a64 ffffffff81a01d38 ffffffff815e12a5 ffffffff810b7727
[ 0.000000] 0000000000000000 ffffffff81a01d78 ffffffff8104f0cc ffffffff81a01d78
[ 0.000000] ffffffff81a194c0 0000000000000027 0000000000000006 0000000000000001
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff815e12a5>] dump_stack+0x4f/0x7c
[ 0.000000] [<ffffffff810b7727>] ? console_trylock_for_printk+0x37/0xf0
[ 0.000000] [<ffffffff8104f0cc>] warn_slowpath_common+0x8c/0xc0
[ 0.000000] [<ffffffff8104f11a>] warn_slowpath_null+0x1a/0x20
[ 0.000000] [<ffffffff8108795d>] migrate_disable+0xbd/0xd0
[ 0.000000] [<ffffffff810b7a7f>] call_console_drivers.constprop.20+0x4f/0x140
[ 0.000000] [<ffffffff810b823f>] console_unlock.part.15+0x1df/0x2e0
[ 0.000000] [<ffffffff815ec835>] ? _raw_spin_unlock+0x35/0x60
[ 0.000000] [<ffffffff810b8358>] console_unlock+0x18/0x30
[ 0.000000] [<ffffffff810b8771>] vprintk_emit+0x231/0x400
[ 0.000000] [<ffffffff815e9aca>] ? preempt_schedule+0x4a/0x70
[ 0.000000] [<ffffffff812e1320>] ? bad_unlock_order_spin+0x40/0x40
[ 0.000000] [<ffffffff815d8519>] printk+0x4d/0x4f
[ 0.000000] [<ffffffff815f11d9>] ? preempt_count_sub+0x29/0x70
[ 0.000000] [<ffffffff812e1348>] ? double_unlock_spin+0x28/0x30
[ 0.000000] [<ffffffff815e1dc8>] dotest+0x75/0xc7
[ 0.000000] [<ffffffff812e74cf>] locking_selftest+0x49f/0xb30
[ 0.000000] [<ffffffff81d51d8f>] start_kernel+0x215/0x327
[ 0.000000] [<ffffffff81d51a0c>] ? repair_env_string+0x5a/0x5a
[ 0.000000] [<ffffffff81d9f4c1>] ? memblock_reserve+0x49/0x4e
[ 0.000000] [<ffffffff81d515b2>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81d516a4>] x86_64_start_kernel+0xf0/0xf7
[ 0.000000] ---[ end trace 0000000000000001 ]---
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] WARNING: CPU: 0 PID: 0 at kernel/sched/core.c:2693 migrate_enable+0xf6/0x1a0()
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.14.1-rt1 #16
[ 0.000000] Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007
[ 0.000000] 0000000000000a85 ffffffff81a01cb8 ffffffff815e12a5 ffffffff810b7727
[ 0.000000] 0000000000000000 ffffffff81a01cf8 ffffffff8104f0cc ffffffff81a01d18
[ 0.000000] ffffffff81a194c0 0000000000000000 000000000000000a ffffffff8275f367
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff815e12a5>] dump_stack+0x4f/0x7c
[ 0.000000] [<ffffffff810b7727>] ? console_trylock_for_printk+0x37/0xf0
[ 0.000000] [<ffffffff8104f0cc>] warn_slowpath_common+0x8c/0xc0
[ 0.000000] [<ffffffff8104f11a>] warn_slowpath_null+0x1a/0x20
[ 0.000000] [<ffffffff810877f6>] migrate_enable+0xf6/0x1a0
[ 0.000000] [<ffffffff813aab31>] vt_console_print+0x2f1/0x3d0
[ 0.000000] [<ffffffff810b7ae7>] call_console_drivers.constprop.20+0xb7/0x140
[ 0.000000] [<ffffffff810b823f>] console_unlock.part.15+0x1df/0x2e0
[ 0.000000] [<ffffffff815ec835>] ? _raw_spin_unlock+0x35/0x60
[ 0.000000] [<ffffffff810b8358>] console_unlock+0x18/0x30
[ 0.000000] [<ffffffff810b8771>] vprintk_emit+0x231/0x400
[ 0.000000] [<ffffffff815e9aca>] ? preempt_schedule+0x4a/0x70
[ 0.000000] [<ffffffff812e1320>] ? bad_unlock_order_spin+0x40/0x40
[ 0.000000] [<ffffffff815d8519>] printk+0x4d/0x4f
[ 0.000000] [<ffffffff815f11d9>] ? preempt_count_sub+0x29/0x70
[ 0.000000] [<ffffffff812e1348>] ? double_unlock_spin+0x28/0x30
[ 0.000000] [<ffffffff815e1dc8>] dotest+0x75/0xc7
[ 0.000000] [<ffffffff812e74cf>] locking_selftest+0x49f/0xb30
[ 0.000000] [<ffffffff81d51d8f>] start_kernel+0x215/0x327
[ 0.000000] [<ffffffff81d51a0c>] ? repair_env_string+0x5a/0x5a
[ 0.000000] [<ffffffff81d9f4c1>] ? memblock_reserve+0x49/0x4e
[ 0.000000] [<ffffffff81d515b2>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81d516a4>] x86_64_start_kernel+0xf0/0xf7
[ 0.000000] ---[ end trace 0000000000000002 ]---
[ 0.000000] ok |FAILED|
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.14.1-rt1 #16
[ 0.000000] Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007
[ 0.000000] 0000000000000002 ffffffff81a01f28 ffffffff815e12a5 0000000000000046
[ 0.000000] 0000000000000001 ffffffff81a01f58 ffffffff815e1db2 0000000000000000
[ 0.000000] 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01f68
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff815e12a5>] dump_stack+0x4f/0x7c
[ 0.000000] [<ffffffff815e1db2>] dotest+0x5f/0xc7
[ 0.000000] [<ffffffff812e74f5>] locking_selftest+0x4c5/0xb30
[ 0.000000] [<ffffffff81d51d8f>] start_kernel+0x215/0x327
[ 0.000000] [<ffffffff81d51a0c>] ? repair_env_string+0x5a/0x5a
[ 0.000000] [<ffffffff81d9f4c1>] ? memblock_reserve+0x49/0x4e
[ 0.000000] [<ffffffff81d515b2>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81d516a4>] x86_64_start_kernel+0xf0/0xf7
[ 0.000000] ok | ok |FAILED|
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.14.1-rt1 #16
[ 0.000000] Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007
[ 0.000000] 0000000000000008 ffffffff81a01f28 ffffffff815e12a5 0000000000000046
[ 0.000000] 0000000000000001 ffffffff81a01f58 ffffffff815e1db2 0000000000000000
[ 0.000000] 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01f68
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff815e12a5>] dump_stack+0x4f/0x7c
[ 0.000000] [<ffffffff815e1db2>] dotest+0x5f/0xc7
[ 0.000000] [<ffffffff812e752e>] locking_selftest+0x4fe/0xb30
[ 0.000000] [<ffffffff81d51d8f>] start_kernel+0x215/0x327
[ 0.000000] [<ffffffff81d51a0c>] ? repair_env_string+0x5a/0x5a
[ 0.000000] [<ffffffff81d9f4c1>] ? memblock_reserve+0x49/0x4e
[ 0.000000] [<ffffffff81d515b2>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81d516a4>] x86_64_start_kernel+0xf0/0xf7
[ 0.000000]
[ 0.000000] initialize held: ok | ok | ok | ok | ok | ok |
[ 0.000000] bad unlock order: ok | ok | ok | ok | ok | ok |
[ 0.000000] --------------------------------------------------------------------------
[ 0.000000] recursive read-lock: | ok | |FAILED|
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.14.1-rt1 #16
[ 0.000000] Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007
[ 0.000000] 0000000000000008 ffffffff81a01f28 ffffffff815e12a5 0000000000000006
[ 0.000000] 0000000000000001 ffffffff81a01f58 ffffffff815e1db2 0000000000000000
[ 0.000000] 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01f68
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff815e12a5>] dump_stack+0x4f/0x7c
[ 0.000000] [<ffffffff815e1db2>] dotest+0x5f/0xc7
[ 0.000000] [<ffffffff812e76c5>] locking_selftest+0x695/0xb30
[ 0.000000] [<ffffffff81d51d8f>] start_kernel+0x215/0x327
[ 0.000000] [<ffffffff81d51a0c>] ? repair_env_string+0x5a/0x5a
[ 0.000000] [<ffffffff81d9f4c1>] ? memblock_reserve+0x49/0x4e
[ 0.000000] [<ffffffff81d515b2>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81d516a4>] x86_64_start_kernel+0xf0/0xf7
[ 0.000000]
[ 0.000000] recursive read-lock #2: |FAILED|
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.14.1-rt1 #16
[ 0.000000] Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007
[ 0.000000] 0000000000000002 ffffffff81a01f28 ffffffff815e12a5 ffffffff810b7727
[ 0.000000] 0000000000000001 ffffffff81a01f58 ffffffff815e1db2 0000000000000000
[ 0.000000] 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01f68
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff815e12a5>] dump_stack+0x4f/0x7c
[ 0.000000] [<ffffffff810b7727>] ? console_trylock_for_printk+0x37/0xf0
[ 0.000000] [<ffffffff815e1db2>] dotest+0x5f/0xc7
[ 0.000000] [<ffffffff812e7703>] locking_selftest+0x6d3/0xb30
[ 0.000000] [<ffffffff81d51d8f>] start_kernel+0x215/0x327
[ 0.000000] [<ffffffff81d51a0c>] ? repair_env_string+0x5a/0x5a
[ 0.000000] [<ffffffff81d9f4c1>] ? memblock_reserve+0x49/0x4e
[ 0.000000] [<ffffffff81d515b2>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81d516a4>] x86_64_start_kernel+0xf0/0xf7
[ 0.000000] | ok |
[ 0.000000] mixed read-write-lock: | ok | | ok |
[ 0.000000] mixed write-read-lock: | ok | | ok |
[ 0.000000] --------------------------------------------------------------------------
[ 0.000000] hard-irqs-on + irq-safe-A/12: ok |
[ 0.000000] hard-irqs-on + irq-safe-A/21: ok |
[ 0.000000] hard-safe-A + irqs-on/12: ok |
[ 0.000000] hard-safe-A + irqs-on/21: ok |
[ 0.000000] hard-safe-A + unsafe-B #1/123: ok |
[ 0.000000] hard-safe-A + unsafe-B #1/132: ok |
[ 0.000000] hard-safe-A + unsafe-B #1/213: ok |
[ 0.000000] hard-safe-A + unsafe-B #1/231: ok |
[ 0.000000] hard-safe-A + unsafe-B #1/312: ok |
[ 0.000000] hard-safe-A + unsafe-B #1/321: ok |
[ 0.000000] hard-safe-A + unsafe-B #2/123: ok |
[ 0.000000] hard-safe-A + unsafe-B #2/132: ok |
[ 0.000000] hard-safe-A + unsafe-B #2/213: ok |
[ 0.000000] hard-safe-A + unsafe-B #2/231: ok |
[ 0.000000] hard-safe-A + unsafe-B #2/312: ok |
[ 0.000000] hard-safe-A + unsafe-B #2/321: ok |
[ 0.000000] --------------------------------------------------------------------------
[ 0.000000] | Wound/wait tests |
[ 0.000000] ---------------------
[ 0.000000] ww api failures: ok | ok | ok |
[ 0.000000] ww contexts mixing: ok | ok |
[ 0.000000] finishing ww context: ok | ok | ok | ok |
[ 0.000000] locking mismatches: ok | ok | ok |
[ 0.000000] EDEADLK handling: ok | ok | ok | ok | ok | ok | ok | ok | ok | ok |
[ 0.000000] spinlock nest unlocked: ok |
[ 0.000000] -----------------------------------------------------
[ 0.000000] |block | try |context|
[ 0.000000] -----------------------------------------------------
[ 0.000000] context: ok | ok | ok |
[ 0.000000] try: ok | ok | ok |
[ 0.000000] block: ok | ok | ok |
[ 0.000000] spinlock: ok | ok | ok |
[ 0.000000] -----------------------------------------------------------------
[ 0.000000] BUG: 11 unexpected failures (out of 119) - debugging disabled! |
[ 0.000000] -----------------------------------------------------------------
On 04/24/2014 06:06 AM, Mike Galbraith wrote:
> Turning lockdep on, it says it's busted.
http://www.spinics.net/lists/linux-rt-users/msg11179.html
Sebastian
On Thu, 2014-04-24 at 09:12 +0200, Sebastian Andrzej Siewior wrote:
> On 04/24/2014 06:06 AM, Mike Galbraith wrote:
> > Turning lockdep on, it says it's busted.
>
> http://www.spinics.net/lists/linux-rt-users/msg11179.html
I was heading toward the same conclusion while regression testing.
Guess I can stop that.
-Mike
On Sat, 2014-04-19 at 16:46 +0200, Mike Galbraith wrote:
> Hi Sebastian,
>
> On Fri, 2014-04-11 at 20:57 +0200, Sebastian Andrzej Siewior wrote:
> > Dear RT folks!
> >
> > I'm pleased to announce the v3.14-rt1 patch set.
>
> This hunk in hotplug-light-get-online-cpus.patch looks like a bug.
>
> @@ -333,7 +449,7 @@ static int __ref _cpu_down(unsigned int
> /* CPU didn't die: tell everyone. Can't complain. */
> smpboot_unpark_threads(cpu);
> cpu_notify_nofail(CPU_DOWN_FAILED | mod, hcpu);
> - goto out_release;
> + goto out_cancel;
> }
> BUG_ON(cpu_online(cpu));
...
BTW, the reason I was eyeballing this stuff is that I was highly
interested in what you were going to do here...
# XXX stomp-machine-deal-clever-with-stopper-lock.patch
...with that bloody lglock. What I did is attached for your amusement.
(warning: viewing may induce "Medusa" syndrome :)
Hotplug can still deadlock in rt trees too, and will if you beat it
hard. The splat below is virgin 3.12-rt (where the wonderful lock doesn't
yet exist) while running Steven's stress-cpu-hotplug.sh, which is still
plenty deadly when liberally applied.
[ 161.951908] CPU0 attaching NULL sched-domain.
[ 161.970417] CPU2 attaching NULL sched-domain.
[ 161.976594] CPU3 attaching NULL sched-domain.
[ 161.981044] CPU0 attaching sched-domain:
[ 161.985010] domain 0: span 0,3 level CPU
[ 161.990627] groups: 0 (cpu_power = 997) 3 (cpu_power = 1021)
[ 162.000609] CPU3 attaching sched-domain:
[ 162.007723] domain 0: span 0,3 level CPU
[ 162.012756] groups: 3 (cpu_power = 1021) 0 (cpu_power = 997)
[ 162.025533] smpboot: CPU 2 is now offline
[ 162.036113]
[ 162.036114] ======================================================
[ 162.036115] [ INFO: possible circular locking dependency detected ]
[ 162.036116] 3.12.17-rt25 #14 Not tainted
[ 162.036117] -------------------------------------------------------
[ 162.036118] boot.kdump/6853 is trying to acquire lock:
[ 162.036126] (&hp->lock){+.+...}, at: [<ffffffff81044974>] pin_current_cpu+0x84/0x1d0
[ 162.036126]
[ 162.036126] but task is already holding lock:
[ 162.036131] (&mm->mmap_sem){+++++.}, at: [<ffffffff8156285c>] __do_page_fault+0x14c/0x5d0
[ 162.036132]
[ 162.036132] which lock already depends on the new lock.
[ 162.036132]
[ 162.036133]
[ 162.036133] the existing dependency chain (in reverse order) is:
[ 162.036135]
[ 162.036135] -> #2 (&mm->mmap_sem){+++++.}:
[ 162.036138] [<ffffffff810ae4a8>] check_prevs_add+0xf8/0x180
[ 162.036140] [<ffffffff810aeada>] validate_chain.isra.45+0x5aa/0x750
[ 162.036142] [<ffffffff810af4f6>] __lock_acquire+0x3f6/0x9f0
[ 162.036143] [<ffffffff810b01bc>] lock_acquire+0x8c/0x160
[ 162.036146] [<ffffffff8112df03>] might_fault+0x83/0xb0
[ 162.036149] [<ffffffff81341851>] sel_loadlut+0x11/0x70
[ 162.036152] [<ffffffff8134aa1d>] tioclinux+0x23d/0x2c0
[ 162.036153] [<ffffffff8133f88c>] vt_ioctl+0x86c/0x11f0
[ 162.036155] [<ffffffff81333cf8>] tty_ioctl+0x2a8/0x940
[ 162.036158] [<ffffffff8116c161>] do_vfs_ioctl+0x81/0x340
[ 162.036159] [<ffffffff8116c46b>] SyS_ioctl+0x4b/0x90
[ 162.036162] [<ffffffff81566c22>] system_call_fastpath+0x16/0x1b
[ 162.036164]
[ 162.036164] -> #1 (console_lock){+.+.+.}:
[ 162.036165] [<ffffffff810ae4a8>] check_prevs_add+0xf8/0x180
[ 162.036167] [<ffffffff810aeada>] validate_chain.isra.45+0x5aa/0x750
[ 162.036169] [<ffffffff810af4f6>] __lock_acquire+0x3f6/0x9f0
[ 162.036171] [<ffffffff810b01bc>] lock_acquire+0x8c/0x160
[ 162.036173] [<ffffffff810957bf>] console_lock+0x6f/0x80
[ 162.036174] [<ffffffff8109673d>] console_cpu_notify+0x1d/0x30
[ 162.036176] [<ffffffff81562d3d>] notifier_call_chain+0x4d/0x70
[ 162.036179] [<ffffffff81070b49>] __raw_notifier_call_chain+0x9/0x10
[ 162.036181] [<ffffffff8104443b>] __cpu_notify+0x1b/0x30
[ 162.036182] [<ffffffff81044650>] cpu_notify_nofail+0x10/0x20
[ 162.036185] [<ffffffff815480fd>] _cpu_down+0x20d/0x440
[ 162.036186] [<ffffffff81548360>] cpu_down+0x30/0x50
[ 162.036188] [<ffffffff8137118c>] cpu_subsys_offline+0x1c/0x30
[ 162.036191] [<ffffffff8136c285>] device_offline+0x95/0xc0
[ 162.036192] [<ffffffff8136c390>] online_store+0x40/0x80
[ 162.036194] [<ffffffff813697c3>] dev_attr_store+0x13/0x30
[ 162.036197] [<ffffffff811c8820>] sysfs_write_file+0xf0/0x170
[ 162.036200] [<ffffffff8115a068>] vfs_write+0xc8/0x1d0
[ 162.036202] [<ffffffff8115a500>] SyS_write+0x50/0xa0
[ 162.036203] [<ffffffff81566c22>] system_call_fastpath+0x16/0x1b
[ 162.036205]
[ 162.036205] -> #0 (&hp->lock){+.+...}:
[ 162.036207] [<ffffffff810ae39d>] check_prev_add+0x7bd/0x7d0
[ 162.036209] [<ffffffff810ae4a8>] check_prevs_add+0xf8/0x180
[ 162.036210] [<ffffffff810aeada>] validate_chain.isra.45+0x5aa/0x750
[ 162.036212] [<ffffffff810af4f6>] __lock_acquire+0x3f6/0x9f0
[ 162.036214] [<ffffffff810b01bc>] lock_acquire+0x8c/0x160
[ 162.036216] [<ffffffff8155e645>] rt_spin_lock+0x55/0x70
[ 162.036218] [<ffffffff81044974>] pin_current_cpu+0x84/0x1d0
[ 162.036220] [<ffffffff81079ef1>] migrate_disable+0x81/0x100
[ 162.036222] [<ffffffff8112fd38>] handle_pte_fault+0xf8/0x1c0
[ 162.036223] [<ffffffff81131646>] __handle_mm_fault+0x106/0x1b0
[ 162.036225] [<ffffffff81131712>] handle_mm_fault+0x22/0x30
[ 162.036227] [<ffffffff815628c1>] __do_page_fault+0x1b1/0x5d0
[ 162.036229] [<ffffffff81562ce9>] do_page_fault+0x9/0x10
[ 162.036230] [<ffffffff8155f9d2>] page_fault+0x22/0x30
[ 162.036232] [<ffffffff81566b0f>] ret_from_fork+0xf/0xb0
[ 162.036233]
[ 162.036233] other info that might help us debug this:
[ 162.036233]
[ 162.036235] Chain exists of:
[ 162.036235] &hp->lock --> console_lock --> &mm->mmap_sem
[ 162.036235]
[ 162.036236] Possible unsafe locking scenario:
[ 162.036236]
[ 162.036236] CPU0 CPU1
[ 162.036237] ---- ----
[ 162.036238] lock(&mm->mmap_sem);
[ 162.036239] lock(console_lock);
[ 162.036241] lock(&mm->mmap_sem);
[ 162.036242] lock(&hp->lock);
[ 162.036242]
[ 162.036242] *** DEADLOCK ***
[ 162.036242]
[ 162.036243] 1 lock held by boot.kdump/6853:
[ 162.036247] #0: (&mm->mmap_sem){+++++.}, at: [<ffffffff8156285c>] __do_page_fault+0x14c/0x5d0
[ 162.036247]
[ 162.036247] stack backtrace:
[ 162.036250] CPU: 0 PID: 6853 Comm: boot.kdump Not tainted 3.12.17-rt25 #14
[ 162.036251] Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007
[ 162.036253] ffff8801fd0e6d58 ffff8800bbe85918 ffffffff8155532c 0000000000000000
[ 162.036255] 0000000000000000 ffff8800bbe85968 ffffffff8154d07f ffff8800bbe85958
[ 162.036257] ffffffff82350640 ffff8801fd0e6d58 ffff8801fd0e6d20 ffff8801fd0e6d58
[ 162.036258] Call Trace:
[ 162.036261] [<ffffffff8155532c>] dump_stack+0x4f/0x91
[ 162.036263] [<ffffffff8154d07f>] print_circular_bug+0xd3/0xe4
[ 162.036265] [<ffffffff810ae39d>] check_prev_add+0x7bd/0x7d0
[ 162.036268] [<ffffffff8107e1f5>] ? sched_clock_local+0x25/0x90
[ 162.036270] [<ffffffff8107e388>] ? sched_clock_cpu+0xa8/0x120
[ 162.036272] [<ffffffff810ae4a8>] check_prevs_add+0xf8/0x180
[ 162.036273] [<ffffffff810aeada>] validate_chain.isra.45+0x5aa/0x750
[ 162.036275] [<ffffffff810af4f6>] __lock_acquire+0x3f6/0x9f0
[ 162.036277] [<ffffffff8155d9b1>] ? rt_spin_lock_slowlock+0x231/0x280
[ 162.036279] [<ffffffff8155d8b1>] ? rt_spin_lock_slowlock+0x131/0x280
[ 162.036281] [<ffffffff81044974>] ? pin_current_cpu+0x84/0x1d0
[ 162.036282] [<ffffffff810b01bc>] lock_acquire+0x8c/0x160
[ 162.036284] [<ffffffff81044974>] ? pin_current_cpu+0x84/0x1d0
[ 162.036286] [<ffffffff8155e645>] rt_spin_lock+0x55/0x70
[ 162.036288] [<ffffffff81044974>] ? pin_current_cpu+0x84/0x1d0
[ 162.036289] [<ffffffff81044974>] pin_current_cpu+0x84/0x1d0
[ 162.036291] [<ffffffff81079ef1>] migrate_disable+0x81/0x100
[ 162.036293] [<ffffffff8112fd38>] handle_pte_fault+0xf8/0x1c0
[ 162.036295] [<ffffffff8156285c>] ? __do_page_fault+0x14c/0x5d0
[ 162.036296] [<ffffffff81131646>] __handle_mm_fault+0x106/0x1b0
[ 162.036298] [<ffffffff81131712>] handle_mm_fault+0x22/0x30
[ 162.036300] [<ffffffff815628c1>] __do_page_fault+0x1b1/0x5d0
[ 162.036302] [<ffffffff8107e1f5>] ? sched_clock_local+0x25/0x90
[ 162.036304] [<ffffffff810790d1>] ? get_parent_ip+0x11/0x50
[ 162.036306] [<ffffffff81562f7d>] ? add_preempt_count.part.93+0x5d/0xb0
[ 162.036307] [<ffffffff810aa2c2>] ? get_lock_stats+0x22/0x70
[ 162.036309] [<ffffffff810aa7ce>] ? put_lock_stats.isra.26+0xe/0x40
[ 162.036311] [<ffffffff812841ed>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[ 162.036313] [<ffffffff81562ce9>] do_page_fault+0x9/0x10
[ 162.036315] [<ffffffff8155f9d2>] page_fault+0x22/0x30
[ 162.036317] [<ffffffff81284130>] ? __put_user_4+0x20/0x30
[ 162.036319] [<ffffffff81078c47>] ? schedule_tail+0x67/0xb0
[ 162.036321] [<ffffffff81566b0f>] ret_from_fork+0xf/0xb0
On Fri, 2014-04-25 at 09:40 +0200, Mike Galbraith wrote:
> Hotplug can still deadlock in rt trees too, and will if you beat it
> hard.
Box actually deadlocks like so.
CPU3 boot.kdump
sys_wait4
do_wait
read_lock(&tasklist_lock)
rt_read_lock
__rt_spin_lock(lock)
migrate_disable()
pin_current_cpu()
if (hp->grab_lock) {
preempt_enable(); <== hmm
hotplug_lock(hp);
hp = &__get_cpu_var(hotplug_pcp); <== hmm
struct hotplug_pcp {
unplug = 0xffff8800b7d0e540,
sync_tsk = 0x0,
refcount = 0, <== hmm
grab_lock = 1,
...
lock = {
...
owner = 0xffff8802039f0001,
stress-cpu-hotplug_stress.sh?!?
<=== he's way over yonder.
Yo, dude, would you please NOT
take percpu locks with you?
CPU0 stress-cpu-hotplug_stress.sh
sysfs_write_file
dev_attr_store
online_store
device_offline
cpu_subsys_offline
cpu_down
_cpu_down
cpu_hotplug_begin
mutex_lock(&cpu_hotplug.lock);
...
check_for_tasks
write_lock_irq(&tasklist_lock);
held by CPU3 boot.kdump over there ===>
CPU0 kworker/0:0
cpuset_hotplug_workfn+0x23e/0x380
rebuild_sched_domains+0x15/0x30
rebuild_sched_domains_locked+0x17/0x80
get_online_cpus+0x35/0x50
mutex_lock(&cpu_hotplug.lock);
held by stress-cpu-hotplug_stress.sh
twiddle twiddle twiddle...
INFO: task kworker/0:0:4 blocked for more than 120 seconds.
On Sat, 2014-04-26 at 10:38 +0200, Mike Galbraith wrote:
> On Fri, 2014-04-25 at 09:40 +0200, Mike Galbraith wrote:
>
> > Hotplug can still deadlock in rt trees too, and will if you beat it
> > hard.
>
> Box actually deadlocks like so.
...
3.12-rt looks a bit busted migrate_disable/enable() wise.
/me eyeballs 3.10-rt (looks better), confirms 3.10-rt hotplug works,
swipes working code, confirms 3.12-rt now works. Yup, that was it.
When I fix lg_global_lock() (I think it and Medusa are both busted) I
bet a nickel 3.14-rt will work.
Hm, actually, rt_write_trylock() in the swiped 3.10-rt code below (and some
others) look busted to me. migrate_disable() _after_ grabbing a lock is
too late, no?
---
include/linux/rwlock_rt.h | 32 ++++++++++++++++++++++++++++----
kernel/rt.c | 21 +++++++++++----------
2 files changed, 39 insertions(+), 14 deletions(-)
--- a/include/linux/rwlock_rt.h
+++ b/include/linux/rwlock_rt.h
@@ -33,50 +33,72 @@ extern void __rt_rwlock_init(rwlock_t *r
#define read_lock_irqsave(lock, flags) \
do { \
typecheck(unsigned long, flags); \
+ migrate_disable(); \
flags = rt_read_lock_irqsave(lock); \
} while (0)
#define write_lock_irqsave(lock, flags) \
do { \
typecheck(unsigned long, flags); \
+ migrate_disable(); \
flags = rt_write_lock_irqsave(lock); \
} while (0)
-#define read_lock(lock) rt_read_lock(lock)
+#define read_lock(lock) \
+ do { \
+ migrate_disable(); \
+ rt_read_lock(lock); \
+ } while (0)
#define read_lock_bh(lock) \
do { \
local_bh_disable(); \
+ migrate_disable(); \
rt_read_lock(lock); \
} while (0)
#define read_lock_irq(lock) read_lock(lock)
-#define write_lock(lock) rt_write_lock(lock)
+#define write_lock(lock) \
+ do { \
+ migrate_disable(); \
+ rt_write_lock(lock); \
+ } while (0)
#define write_lock_bh(lock) \
do { \
local_bh_disable(); \
+ migrate_disable(); \
rt_write_lock(lock); \
} while (0)
#define write_lock_irq(lock) write_lock(lock)
-#define read_unlock(lock) rt_read_unlock(lock)
+#define read_unlock(lock) \
+ do { \
+ rt_read_unlock(lock); \
+ migrate_enable(); \
+ } while (0)
#define read_unlock_bh(lock) \
do { \
rt_read_unlock(lock); \
+ migrate_enable(); \
local_bh_enable(); \
} while (0)
#define read_unlock_irq(lock) read_unlock(lock)
-#define write_unlock(lock) rt_write_unlock(lock)
+#define write_unlock(lock) \
+ do { \
+ rt_write_unlock(lock); \
+ migrate_enable(); \
+ } while (0)
#define write_unlock_bh(lock) \
do { \
rt_write_unlock(lock); \
+ migrate_enable(); \
local_bh_enable(); \
} while (0)
@@ -87,6 +109,7 @@ extern void __rt_rwlock_init(rwlock_t *r
typecheck(unsigned long, flags); \
(void) flags; \
rt_read_unlock(lock); \
+ migrate_enable(); \
} while (0)
#define write_unlock_irqrestore(lock, flags) \
@@ -94,6 +117,7 @@ extern void __rt_rwlock_init(rwlock_t *r
typecheck(unsigned long, flags); \
(void) flags; \
rt_write_unlock(lock); \
+ migrate_enable(); \
} while (0)
#endif
--- a/kernel/rt.c
+++ b/kernel/rt.c
@@ -182,10 +182,11 @@ int __lockfunc rt_write_trylock(rwlock_t
{
int ret = rt_mutex_trylock(&rwlock->lock);
- if (ret) {
+ migrate_disable();
+ if (ret)
rwlock_acquire(&rwlock->dep_map, 0, 1, _RET_IP_);
- migrate_disable();
- }
+ else
+ migrate_enable();
return ret;
}
@@ -196,7 +197,10 @@ int __lockfunc rt_write_trylock_irqsave(
int ret;
*flags = 0;
+ migrate_disable();
ret = rt_write_trylock(rwlock);
+ if (!ret)
+ migrate_enable();
return ret;
}
EXPORT_SYMBOL(rt_write_trylock_irqsave);
@@ -211,18 +215,19 @@ int __lockfunc rt_read_trylock(rwlock_t
* but not when read_depth == 0 which means that the lock is
* write locked.
*/
+ migrate_disable();
if (rt_mutex_owner(lock) != current) {
ret = rt_mutex_trylock(lock);
- if (ret) {
+ if (ret)
rwlock_acquire(&rwlock->dep_map, 0, 1, _RET_IP_);
- migrate_disable();
- }
} else if (!rwlock->read_depth) {
ret = 0;
}
if (ret)
rwlock->read_depth++;
+ else
+ migrate_enable();
return ret;
}
@@ -231,7 +236,6 @@ EXPORT_SYMBOL(rt_read_trylock);
void __lockfunc rt_write_lock(rwlock_t *rwlock)
{
rwlock_acquire(&rwlock->dep_map, 0, 0, _RET_IP_);
- migrate_disable();
__rt_spin_lock(&rwlock->lock);
}
EXPORT_SYMBOL(rt_write_lock);
@@ -246,7 +250,6 @@ void __lockfunc rt_read_lock(rwlock_t *r
if (rt_mutex_owner(lock) != current) {
rwlock_acquire(&rwlock->dep_map, 0, 0, _RET_IP_);
__rt_spin_lock(lock);
- migrate_disable();
}
rwlock->read_depth++;
}
@@ -258,7 +261,6 @@ void __lockfunc rt_write_unlock(rwlock_t
/* NOTE: we always pass in '1' for nested, for simplicity */
rwlock_release(&rwlock->dep_map, 1, _RET_IP_);
__rt_spin_unlock(&rwlock->lock);
- migrate_enable();
}
EXPORT_SYMBOL(rt_write_unlock);
@@ -268,7 +270,6 @@ void __lockfunc rt_read_unlock(rwlock_t
if (--rwlock->read_depth == 0) {
rwlock_release(&rwlock->dep_map, 1, _RET_IP_);
__rt_spin_unlock(&rwlock->lock);
- migrate_enable();
}
}
EXPORT_SYMBOL(rt_read_unlock);
On 04/11/2014 11:57 AM, Sebastian Andrzej Siewior wrote:
> Dear RT folks!
>
> I'm pleased to announce the v3.14-rt1 patch set.
>
> Changes since v3.12.15-rt25
> - I dropped the sparc64 patches I had in the queue. They did not apply
> cleanly because the code in v3.14 changed in the MMU area. That was also
> when I remembered that it was not working perfectly either.
Saw this a moment ago (3.14.1 + rt1, Fedora 19 laptop - I think I have
seen something similar in 3.12.x-rt):
Apr 26 11:16:11 localhost kernel: [ 96.323248] ------------[ cut here
]------------
Apr 26 11:16:11 localhost kernel: [ 96.323262] WARNING: CPU: 0 PID:
2051 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0()
Apr 26 11:16:11 localhost kernel: [ 96.323264] list_del corruption.
prev->next should be ffff8802101196a0, but was 0000000000000001
Apr 26 11:16:11 localhost kernel: [ 96.323266] Modules linked in: fuse
ipt_MASQUERADE xt_CHECKSUM tun ip6t_rpfilter ip6t_REJECT xt_conntrack
ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables
ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
ip6table_mangle ip6table_security ip6table_raw rfcomm ip6table_filter
bnep ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iTCO_wdt
iTCO_vendor_support coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul
crc32c_intel ghash_clmulni_intel uvcvideo videobuf2_vmalloc microcode
videobuf2_memops snd_hda_codec_hdmi videobuf2_core videodev media
serio_raw btusb bluetooth intel_ips i2c_i801 6lowpan_iphc
snd_hda_codec_conexant snd_hda_codec_generic arc4 iwldvm mac80211
iwlwifi lpc_ich sdhci_pci mfd_core sdhci cfg80211 mmc_core snd_hda_intel
snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm e1000e snd_timer
ptp mei_me pps_core mei shpchp thinkpad_acpi snd ppdev soundcore rfkill
parport_pc parport acpi_cpufreq uinput firewire_ohci nouveau
firewire_core crc_itu_t i2c_algo_bit drm_kms_helper ttm drm mxm_wmi
i2c_core wmi video
Apr 26 11:16:11 localhost kernel: [ 96.323331] CPU: 0 PID: 2051 Comm:
cinnamon Not tainted 3.14.1-200.rt1.1.fc19.ccrma.x86_64+rt #1
Apr 26 11:16:11 localhost kernel: [ 96.323332] Hardware name: LENOVO
4313CTO/4313CTO, BIOS 6MET64WW (1.27 ) 07/15/2010
Apr 26 11:16:11 localhost kernel: [ 96.323334] 0000000000000000
000000008a5c11dc ffff8800ae715a88 ffffffff81707fca
Apr 26 11:16:11 localhost kernel: [ 96.323336] ffff8800ae715ad0
ffff8800ae715ac0 ffffffff8108d03d ffff8802101196a0
Apr 26 11:16:11 localhost kernel: [ 96.323337] ffff880210119b50
ffff880210119b50 ffff880210119b40 ffff88021a615648
Apr 26 11:16:11 localhost kernel: [ 96.323338] Call Trace:
Apr 26 11:16:11 localhost kernel: [ 96.323345] [<ffffffff81707fca>]
dump_stack+0x4d/0x82
Apr 26 11:16:11 localhost kernel: [ 96.323351] [<ffffffff8108d03d>]
warn_slowpath_common+0x7d/0xc0
Apr 26 11:16:11 localhost kernel: [ 96.323352] [<ffffffff8108d0dc>]
warn_slowpath_fmt+0x5c/0x80
Apr 26 11:16:11 localhost kernel: [ 96.323354] [<ffffffff8137c551>]
__list_del_entry+0xa1/0xd0
Apr 26 11:16:11 localhost kernel: [ 96.323355] [<ffffffff8137c58d>]
list_del+0xd/0x30
Apr 26 11:16:11 localhost kernel: [ 96.323393] [<ffffffffa0135593>]
nouveau_fence_signal+0x53/0x80 [nouveau]
Apr 26 11:16:11 localhost kernel: [ 96.323414] [<ffffffffa0135678>]
nouveau_fence_update+0x48/0xa0 [nouveau]
Apr 26 11:16:11 localhost kernel: [ 96.323435] [<ffffffffa0135f85>]
nouveau_fence_sync+0x45/0x80 [nouveau]
Apr 26 11:16:11 localhost kernel: [ 96.323456] [<ffffffffa013aea8>]
validate_list+0xd8/0x2e0 [nouveau]
Apr 26 11:16:11 localhost kernel: [ 96.323478] [<ffffffffa013c3d3>]
nouveau_gem_ioctl_pushbuf+0xaa3/0x13e0 [nouveau]
Apr 26 11:16:11 localhost kernel: [ 96.323500] [<ffffffffa002ad02>]
drm_ioctl+0x4f2/0x620 [drm]
Apr 26 11:16:11 localhost kernel: [ 96.323506] [<ffffffff810c1af4>] ?
migrate_enable+0x94/0x1c0
Apr 26 11:16:11 localhost kernel: [ 96.323527] [<ffffffffa0132cfe>]
nouveau_drm_ioctl+0x4e/0x90 [nouveau]
Apr 26 11:16:11 localhost kernel: [ 96.323530] [<ffffffff81203480>]
do_vfs_ioctl+0x2e0/0x4c0
Apr 26 11:16:11 localhost kernel: [ 96.323533] [<ffffffff812fd8d6>] ?
file_has_perm+0xa6/0xb0
Apr 26 11:16:11 localhost kernel: [ 96.323535] [<ffffffff812036e1>]
SyS_ioctl+0x81/0xa0
Apr 26 11:16:11 localhost kernel: [ 96.323538] [<ffffffff81716769>]
system_call_fastpath+0x16/0x1b
Apr 26 11:16:11 localhost kernel: [ 96.323569] ---[ end trace
0000000000000002 ]---
-- Fernando
Hi Nicholas,
On Sat, 2014-04-26 at 15:58 +0200, Mike Galbraith wrote:
> On Sat, 2014-04-26 at 10:38 +0200, Mike Galbraith wrote:
> > On Fri, 2014-04-25 at 09:40 +0200, Mike Galbraith wrote:
> >
> > > Hotplug can still deadlock in rt trees too, and will if you beat it
> > > hard.
> >
> > Box actually deadlocks like so.
>
> ...
>
> 3.12-rt looks a bit busted migrate_disable/enable() wise.
>
> /me eyeballs 3.10-rt (looks better), confirms 3.10-rt hotplug works,
> swipes working code, confirms 3.12-rt now works. Yup, that was it.
My boxen, including a 64 core DL980 that ran hotplug stress for 3 hours
yesterday with pre-pushdown rwlocks, say the migrate_disable/enable
pushdown patches are very definitely busted.
Instead of whacking selective bits, as I did to verify that the rwlock
changes were indeed causing hotplug stress deadlock woes, I'm eyeballing
the lot, twiddling primitives to look like I think they should, after
which I'll let my boxen express their opinions of the result.
-Mike
On Mon, 2014-04-28 at 07:09 +0200, Mike Galbraith wrote:
> Hi Nicholas,
>
> On Sat, 2014-04-26 at 15:58 +0200, Mike Galbraith wrote:
> > On Sat, 2014-04-26 at 10:38 +0200, Mike Galbraith wrote:
> > > On Fri, 2014-04-25 at 09:40 +0200, Mike Galbraith wrote:
> > >
> > > > Hotplug can still deadlock in rt trees too, and will if you beat it
> > > > hard.
> > >
> > > Box actually deadlocks like so.
> >
> > ...
> >
> > 3.12-rt looks a bit busted migrate_disable/enable() wise.
> >
> > /me eyeballs 3.10-rt (looks better), confirms 3.10-rt hotplug works,
> > swipes working code, confirms 3.12-rt now works. Yup, that was it.
>
> My boxen, including a 64 core DL980 that ran hotplug stress for 3 hours
> yesterday with pre-pushdown rwlocks, say the migrate_disable/enable
> pushdown patches are very definitely busted.
migrate_disable-pushd-down-in-atomic_dec_and_spin_lo.patch
bug: migrate_disable() after blocking is too late.
@@ -1028,12 +1028,12 @@ int atomic_dec_and_spin_lock(atomic_t *a
/* Subtract 1 from counter unless that drops it to 0 (ie. it was 1) */
if (atomic_add_unless(atomic, -1, 1))
return 0;
- migrate_disable();
rt_spin_lock(lock);
- if (atomic_dec_and_test(atomic))
+ if (atomic_dec_and_test(atomic)){
+ migrate_disable();
return 1;
+ }
rt_spin_unlock(lock);
- migrate_enable();
return 0;
}
EXPORT_SYMBOL(atomic_dec_and_spin_lock);
read_lock-migrate_disable-pushdown-to-rt_read_lock.patch
bug: ditto.
@@ -244,8 +246,10 @@ void __lockfunc rt_read_lock(rwlock_t *r
/*
* recursive read locks succeed when current owns the lock
*/
- if (rt_mutex_owner(lock) != current)
+ if (rt_mutex_owner(lock) != current) {
__rt_spin_lock(lock);
+ migrate_disable();
+ }
rwlock->read_depth++;
}
Moving that migrate_disable() up will likely fix my hotplug troubles.
I'll find out when I get back from physical torture (therapy) session.
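For reference, here's a rough sketch of what rt_read_lock() would look like
with that migrate_disable() moved up ahead of the lock (reconstructed from
the hunks above; untested, so details may not match the tree exactly):

void __lockfunc rt_read_lock(rwlock_t *rwlock)
{
	struct rt_mutex *lock = &rwlock->lock;

	/*
	 * recursive read locks succeed when current owns the lock
	 */
	if (rt_mutex_owner(lock) != current) {
		rwlock_acquire(&rwlock->dep_map, 0, 0, _RET_IP_);
		/* pin to this CPU before we can block on the lock */
		migrate_disable();
		__rt_spin_lock(lock);
	}
	rwlock->read_depth++;
}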
-Mike
On Mon, 28 Apr 2014 11:09:46 +0200
Mike Galbraith <[email protected]> wrote:
> migrate_disable-pushd-down-in-atomic_dec_and_spin_lo.patch
>
> bug: migrate_disable() after blocking is too late.
>
> @@ -1028,12 +1028,12 @@ int atomic_dec_and_spin_lock(atomic_t *a
> /* Subtract 1 from counter unless that drops it to 0 (ie. it was 1) */
> if (atomic_add_unless(atomic, -1, 1))
> return 0;
> - migrate_disable();
> rt_spin_lock(lock);
> - if (atomic_dec_and_test(atomic))
> + if (atomic_dec_and_test(atomic)){
> + migrate_disable();
Makes sense, as the CPU can go offline right after the lock is grabbed
and before the migrate_disable() is called.
Seems that migrate_disable() must be called before taking the lock as
it is done in every other location.
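IOW, a rough sketch of atomic_dec_and_spin_lock() with the pre-pushdown
ordering (reconstructed from the hunk above; the signature is from memory,
so treat this as illustration rather than a tested patch):

int atomic_dec_and_spin_lock(atomic_t *atomic, spinlock_t *lock)
{
	/* Subtract 1 from counter unless that drops it to 0 (ie. it was 1) */
	if (atomic_add_unless(atomic, -1, 1))
		return 0;
	/* pin to this CPU before we can block on the lock */
	migrate_disable();
	rt_spin_lock(lock);
	if (atomic_dec_and_test(atomic))
		return 1;	/* locked and pinned; the caller undoes both */
	rt_spin_unlock(lock);
	migrate_enable();
	return 0;
}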
-- Steve
> return 1;
> + }
> rt_spin_unlock(lock);
> - migrate_enable();
> return 0;
> }
> EXPORT_SYMBOL(atomic_dec_and_spin_lock);
>
> read_lock-migrate_disable-pushdown-to-rt_read_lock.patch
>
> bug: ditto.
>
> @@ -244,8 +246,10 @@ void __lockfunc rt_read_lock(rwlock_t *r
> /*
> * recursive read locks succeed when current owns the lock
> */
> - if (rt_mutex_owner(lock) != current)
> + if (rt_mutex_owner(lock) != current) {
> __rt_spin_lock(lock);
> + migrate_disable();
> + }
> rwlock->read_depth++;
> }
>
> Moving that migrate_disable() up will likely fix my hotplug troubles.
> I'll find out when I get back from physical torture (therapy) session.
>
> -Mike
On Mon, 2014-04-28 at 10:18 -0400, Steven Rostedt wrote:
> On Mon, 28 Apr 2014 11:09:46 +0200
> Mike Galbraith <[email protected]> wrote:
>
> > migrate_disable-pushd-down-in-atomic_dec_and_spin_lo.patch
> >
> > bug: migrate_disable() after blocking is too late.
> >
> > @@ -1028,12 +1028,12 @@ int atomic_dec_and_spin_lock(atomic_t *a
> > /* Subtract 1 from counter unless that drops it to 0 (ie. it was 1) */
> > if (atomic_add_unless(atomic, -1, 1))
> > return 0;
> > - migrate_disable();
> > rt_spin_lock(lock);
> > - if (atomic_dec_and_test(atomic))
> > + if (atomic_dec_and_test(atomic)){
> > + migrate_disable();
>
> Makes sense, as the CPU can go offline right after the lock is grabbed
> and before the migrate_disable() is called.
>
> Seems that migrate_disable() must be called before taking the lock as
> it is done in every other location.
And for tasklist_lock, seems you also MUST do that prior to trylock as
well, else you'll run afoul of the hotplug beast.
-Mike
On Mon, 2014-04-28 at 16:37 +0200, Mike Galbraith wrote:
> On Mon, 2014-04-28 at 10:18 -0400, Steven Rostedt wrote:
> > On Mon, 28 Apr 2014 11:09:46 +0200
> > Mike Galbraith <[email protected]> wrote:
> >
> > > migrate_disable-pushd-down-in-atomic_dec_and_spin_lo.patch
> > >
> > > bug: migrate_disable() after blocking is too late.
> > >
> > > @@ -1028,12 +1028,12 @@ int atomic_dec_and_spin_lock(atomic_t *a
> > > /* Subtract 1 from counter unless that drops it to 0 (ie. it was 1) */
> > > if (atomic_add_unless(atomic, -1, 1))
> > > return 0;
> > > - migrate_disable();
> > > rt_spin_lock(lock);
> > > - if (atomic_dec_and_test(atomic))
> > > + if (atomic_dec_and_test(atomic)){
> > > + migrate_disable();
> >
> > Makes sense, as the CPU can go offline right after the lock is grabbed
> > and before the migrate_disable() is called.
> >
> > Seems that migrate_disable() must be called before taking the lock as
> > it is done in every other location.
>
> And for tasklist_lock, seems you also MUST do that prior to trylock as
> well, else you'll run afoul of the hotplug beast.
This lockdep gripe is from the deadlocked crashdump with only the
clearly busted bits patched up.
[ 193.033224] ======================================================
[ 193.033225] [ INFO: possible circular locking dependency detected ]
[ 193.033227] 3.12.18-rt25 #19 Not tainted
[ 193.033227] -------------------------------------------------------
[ 193.033228] boot.kdump/5422 is trying to acquire lock:
[ 193.033237] (&hp->lock){+.+...}, at: [<ffffffff81044974>] pin_current_cpu+0x84/0x1d0
[ 193.033238]
but task is already holding lock:
[ 193.033241] (tasklist_lock){+.+...}, at: [<ffffffff81046a5b>] do_wait+0xbb/0x2a0
[ 193.033242]
which lock already depends on the new lock.
[ 193.033242]
the existing dependency chain (in reverse order) is:
[ 193.033244]
-> #1 (tasklist_lock){+.+...}:
[ 193.033248] [<ffffffff810ae4a8>] check_prevs_add+0xf8/0x180
[ 193.033250] [<ffffffff810aeada>] validate_chain.isra.45+0x5aa/0x750
[ 193.033252] [<ffffffff810af4f6>] __lock_acquire+0x3f6/0x9f0
[ 193.033253] [<ffffffff810b01bc>] lock_acquire+0x8c/0x160
[ 193.033257] [<ffffffff8155e99c>] rt_write_lock+0x2c/0x40
[ 193.033260] [<ffffffff81548169>] _cpu_down+0x219/0x440
[ 193.033261] [<ffffffff815483c0>] cpu_down+0x30/0x50
[ 193.033264] [<ffffffff813711dc>] cpu_subsys_offline+0x1c/0x30
[ 193.033267] [<ffffffff8136c2d5>] device_offline+0x95/0xc0
[ 193.033269] [<ffffffff8136c3e0>] online_store+0x40/0x80
[ 193.033271] [<ffffffff81369813>] dev_attr_store+0x13/0x30
[ 193.033274] [<ffffffff811c8820>] sysfs_write_file+0xf0/0x170
[ 193.033277] [<ffffffff8115a068>] vfs_write+0xc8/0x1d0
[ 193.033279] [<ffffffff8115a500>] SyS_write+0x50/0xa0
[ 193.033282] [<ffffffff81566ca2>] system_call_fastpath+0x16/0x1b
[ 193.033284]
-> #0 (&hp->lock){+.+...}:
[ 193.033286] [<ffffffff810ae39d>] check_prev_add+0x7bd/0x7d0
[ 193.033287] [<ffffffff810ae4a8>] check_prevs_add+0xf8/0x180
[ 193.033289] [<ffffffff810aeada>] validate_chain.isra.45+0x5aa/0x750
[ 193.033291] [<ffffffff810af4f6>] __lock_acquire+0x3f6/0x9f0
[ 193.033293] [<ffffffff810b01bc>] lock_acquire+0x8c/0x160
[ 193.033295] [<ffffffff8155e6a5>] rt_spin_lock+0x55/0x70
[ 193.033296] [<ffffffff81044974>] pin_current_cpu+0x84/0x1d0
[ 193.033299] [<ffffffff81079ef1>] migrate_disable+0x81/0x100
[ 193.033301] [<ffffffff8155e947>] rt_read_lock+0x47/0x60
[ 193.033303] [<ffffffff81046a5b>] do_wait+0xbb/0x2a0
[ 193.033305] [<ffffffff8104777e>] SyS_wait4+0x9e/0x100
[ 193.033307] [<ffffffff81566ca2>] system_call_fastpath+0x16/0x1b
[ 193.033307]
other info that might help us debug this:
[ 193.033308] Possible unsafe locking scenario:
[ 193.033309] CPU0 CPU1
[ 193.033309] ---- ----
[ 193.033310] lock(tasklist_lock);
[ 193.033312] lock(&hp->lock);
[ 193.033313] lock(tasklist_lock);
[ 193.033314] lock(&hp->lock);
[ 193.033315]
*** DEADLOCK ***
[ 193.033316] 1 lock held by boot.kdump/5422:
[ 193.033319] #0: (tasklist_lock){+.+...}, at: [<ffffffff81046a5b>] do_wait+0xbb/0x2a0
[ 193.033320]
stack backtrace:
[ 193.033322] CPU: 0 PID: 5422 Comm: boot.kdump Not tainted 3.12.18-rt25 #19
[ 193.033323] Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007
[ 193.033326] ffff880200550818 ffff8802004e5ad8 ffffffff8155538c 0000000000000000
[ 193.033328] 0000000000000000 ffff8802004e5b28 ffffffff8154d0df ffff8802004e5b18
[ 193.033330] ffff8802004e5b50 ffff880200550818 ffff8802005507e0 ffff880200550818
[ 193.033331] Call Trace:
[ 193.033335] [<ffffffff8155538c>] dump_stack+0x4f/0x91
[ 193.033337] [<ffffffff8154d0df>] print_circular_bug+0xd3/0xe4
[ 193.033339] [<ffffffff810ae39d>] check_prev_add+0x7bd/0x7d0
[ 193.033342] [<ffffffff8107e1f5>] ? sched_clock_local+0x25/0x90
[ 193.033344] [<ffffffff8107e388>] ? sched_clock_cpu+0xa8/0x120
[ 193.033346] [<ffffffff810ae4a8>] check_prevs_add+0xf8/0x180
[ 193.033348] [<ffffffff810aeada>] validate_chain.isra.45+0x5aa/0x750
[ 193.033350] [<ffffffff810af4f6>] __lock_acquire+0x3f6/0x9f0
[ 193.033352] [<ffffffff8155da11>] ? rt_spin_lock_slowlock+0x231/0x280
[ 193.033354] [<ffffffff8155d911>] ? rt_spin_lock_slowlock+0x131/0x280
[ 193.033356] [<ffffffff81044974>] ? pin_current_cpu+0x84/0x1d0
[ 193.033358] [<ffffffff810b01bc>] lock_acquire+0x8c/0x160
[ 193.033360] [<ffffffff81044974>] ? pin_current_cpu+0x84/0x1d0
[ 193.033362] [<ffffffff8155e6a5>] rt_spin_lock+0x55/0x70
[ 193.033363] [<ffffffff81044974>] ? pin_current_cpu+0x84/0x1d0
[ 193.033365] [<ffffffff81044974>] pin_current_cpu+0x84/0x1d0
[ 193.033367] [<ffffffff81079ef1>] migrate_disable+0x81/0x100
[ 193.033369] [<ffffffff8155e947>] rt_read_lock+0x47/0x60
[ 193.033371] [<ffffffff81046a5b>] ? do_wait+0xbb/0x2a0
[ 193.033373] [<ffffffff8155cd39>] ? schedule+0x29/0x90
[ 193.033374] [<ffffffff81046a5b>] do_wait+0xbb/0x2a0
[ 193.033378] [<ffffffff8112ded6>] ? might_fault+0x56/0xb0
[ 193.033380] [<ffffffff8104777e>] SyS_wait4+0x9e/0x100
[ 193.033382] [<ffffffff81566cc7>] ? sysret_check+0x1b/0x56
[ 193.033384] [<ffffffff81045d50>] ? task_stopped_code+0xa0/0xa0
[ 193.033386] [<ffffffff81566ca2>] system_call_fastpath+0x16/0x1b
[ 193.033845] SMP alternatives: lockdep: fixing up alternatives
On Mon, 2014-04-28 at 16:37 +0200, Mike Galbraith wrote:
> > Seems that migrate_disable() must be called before taking the lock as
> > it is done in every other location.
>
> And for tasklist_lock, seems you also MUST do that prior to trylock as
> well, else you'll run afoul of the hotplug beast.
Bah. Futzing with dmesg while stress script is running is either a very
bad idea, or a very good test. Both virgin 3.10-rt and 3.12-rt with new
bugs squashed will deadlock.
Too bad I kept on testing, I liked the notion that hotplug was solid ;-)
-Mike
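In code terms, the ordering being discussed looks roughly like this (a
sketch based on the rt_read_lock() hunk posted further down the thread,
not a patch by itself; all names are taken from the trace and diff above):

	/* Before: rt_read_lock() took the lock first, then pinned the CPU. */
	__rt_spin_lock(lock);	/* acquires tasklist_lock */
	migrate_disable();	/* pin_current_cpu() takes hp->lock, so
				 * hp->lock now nests inside tasklist_lock,
				 * the reverse of the hotplug path's order
				 * and the source of the splat above */

	/* After: pin the CPU before taking the lock. */
	migrate_disable();	/* hp->lock is taken with nothing else held */
	__rt_spin_lock(lock);	/* tasklist_lock acquired afterwards */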
On Tue, 29 Apr 2014 07:21:09 +0200
Mike Galbraith <[email protected]> wrote:
> On Mon, 2014-04-28 at 16:37 +0200, Mike Galbraith wrote:
>
> > > Seems that migrate_disable() must be called before taking the lock as
> > > it is done in every other location.
> >
> > And for tasklist_lock, seems you also MUST do that prior to trylock as
> > well, else you'll run afoul of the hotplug beast.
>
> Bah. Futzing with dmesg while stress script is running is either a very
> bad idea, or a very good test. Both virgin 3.10-rt and 3.12-rt with new
> bugs squashed will deadlock.
>
> Too bad I kept on testing, I liked the notion that hotplug was solid ;-)
I was able to stress cpu hotplug on 3.12-rt after applying the
following patch.
If there are no complaints about it, I'm going to add this to the 3.12-rt
stable tree. Without it, it fails horribly with the cpu hotplug
stress test, and I won't release a stable kernel that does that.
-- Steve
Signed-off-by: Steven Rostedt <[email protected]>
diff --git a/kernel/rt.c b/kernel/rt.c
index bb72347..4f2a613 100644
--- a/kernel/rt.c
+++ b/kernel/rt.c
@@ -180,12 +180,15 @@ EXPORT_SYMBOL(_mutex_unlock);
*/
int __lockfunc rt_write_trylock(rwlock_t *rwlock)
{
- int ret = rt_mutex_trylock(&rwlock->lock);
+ int ret;
+
+ migrate_disable();
+ ret = rt_mutex_trylock(&rwlock->lock);
- if (ret) {
+ if (ret)
rwlock_acquire(&rwlock->dep_map, 0, 1, _RET_IP_);
- migrate_disable();
- }
+ else
+ migrate_enable();
return ret;
}
@@ -212,11 +215,12 @@ int __lockfunc rt_read_trylock(rwlock_t *rwlock)
* write locked.
*/
if (rt_mutex_owner(lock) != current) {
+ migrate_disable();
ret = rt_mutex_trylock(lock);
- if (ret) {
+ if (ret)
rwlock_acquire(&rwlock->dep_map, 0, 1, _RET_IP_);
- migrate_disable();
- }
+ else
+ migrate_enable();
} else if (!rwlock->read_depth) {
ret = 0;
}
@@ -245,8 +249,8 @@ void __lockfunc rt_read_lock(rwlock_t *rwlock)
*/
if (rt_mutex_owner(lock) != current) {
rwlock_acquire(&rwlock->dep_map, 0, 0, _RET_IP_);
- __rt_spin_lock(lock);
migrate_disable();
+ __rt_spin_lock(lock);
}
rwlock->read_depth++;
}
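For readability, here is how rt_write_trylock() reads with the first hunk
applied (reconstructed from the diff above; the rt_read_trylock() and
rt_read_lock() hunks follow the same migrate_disable()-before-lock
pattern):

	int __lockfunc rt_write_trylock(rwlock_t *rwlock)
	{
		int ret;

		/* Pin the CPU before trying the lock, as the other lock paths do. */
		migrate_disable();
		ret = rt_mutex_trylock(&rwlock->lock);

		if (ret)
			rwlock_acquire(&rwlock->dep_map, 0, 1, _RET_IP_);
		else
			migrate_enable();

		return ret;
	}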
On Tue, 2014-04-29 at 20:13 -0400, Steven Rostedt wrote:
> On Tue, 29 Apr 2014 07:21:09 +0200
> Mike Galbraith <[email protected]> wrote:
>
> > On Mon, 2014-04-28 at 16:37 +0200, Mike Galbraith wrote:
> >
> > > > Seems that migrate_disable() must be called before taking the lock as
> > > > it is done in every other location.
> > >
> > > And for tasklist_lock, seems you also MUST do that prior to trylock as
> > > well, else you'll run afoul of the hotplug beast.
> >
> > Bah. Futzing with dmesg while stress script is running is either a very
> > bad idea, or a very good test. Both virgin 3.10-rt and 3.12-rt with new
> > bugs squashed will deadlock.
> >
> > Too bad I kept on testing, I liked the notion that hotplug was solid ;-)
>
> I was able to stress cpu hotplug on 3.12-rt after applying the
> following patch.
>
> If there are no complaints about it, I'm going to add this to the 3.12-rt
> stable tree. Without it, it fails horribly with the cpu hotplug
> stress test, and I won't release a stable kernel that does that.
My local boxen are happy, 64 core box with 14-rt seems happy as well,
though I couldn't let it burn for long.
BTW, that dmesg business went into hiding. I didn't have time to put
virgin 10-rt back on and play around poking both kernels this, that, and
the other way again, but it seems there's some phase-of-moon factor there.
-Mike
On Wed, 2014-04-30 at 09:43 +0200, Mike Galbraith wrote:
> On Tue, 2014-04-29 at 20:13 -0400, Steven Rostedt wrote:
> > On Tue, 29 Apr 2014 07:21:09 +0200
> > Mike Galbraith <[email protected]> wrote:
> >
> > > On Mon, 2014-04-28 at 16:37 +0200, Mike Galbraith wrote:
> > >
> > > > > Seems that migrate_disable() must be called before taking the lock as
> > > > > it is done in every other location.
> > > >
> > > > And for tasklist_lock, seems you also MUST do that prior to trylock as
> > > > well, else you'll run afoul of the hotplug beast.
> > >
> > > Bah. Futzing with dmesg while stress script is running is either a very
> > > bad idea, or a very good test. Both virgin 3.10-rt and 3.12-rt with new
> > > bugs squashed will deadlock.
> > >
> > > Too bad I kept on testing, I liked the notion that hotplug was solid ;-)
> >
> > I was able to stress cpu hotplug on 3.12-rt after applying the
> > following patch.
> >
> > If there are no complaints about it, I'm going to add this to the 3.12-rt
> > stable tree. Without it, it fails horribly with the cpu hotplug
> > stress test, and I won't release a stable kernel that does that.
>
> My local boxen are happy, 64 core box with 14-rt seems happy as well,
> though I couldn't let it burn for long.
And 3.12 looks stable on the 64-core DL980 as well. (If it survived a
24-hour busy+stress session I'd still likely fall outta my chair, though.)
My kinda-sorta 3.12-rt enterprise-to-be kernel wasn't stable on the DL980,
while appearing just fine on small boxen, which made me suspect that
there was still a big-box something lurking, only raising its ugly head
in the fatter kernel. That wasn't an rt problem after all; someone in
enterprise land just didn't stack their goody pile quite high enough
while wedging upstream into the stable base kernel, which bent rt.
The End.. I hope. I've had enough hotplug entertainment for a while.
-Mike
On Wed, 30 Apr 2014 15:06:29 +0200
Mike Galbraith <[email protected]> wrote:
> The End.. I hope. I've had enough hotplug entertainment for a while.
Not for me. 3.14-rt stress-cpu-hotplug crashes quickly. But it's a
different issue than what my patch addressed. I'm still debugging it.
-- Steve
On Wed, 2014-04-30 at 09:15 -0400, Steven Rostedt wrote:
> On Wed, 30 Apr 2014 15:06:29 +0200
> Mike Galbraith <[email protected]> wrote:
>
>
> > The End.. I hope. I've had enough hotplug entertainment for a while.
>
> Not for me. 3.14-rt stress-cpu-hotplug crashes quickly. But it's a
> different issue than what my patch addressed. I'm still debugging it.
If you didn't fix the two bugs I showed, and (wisely) didn't look at the
beautiful lglock patches I posted (no frozen shark, I'm disappointed;),
your patch won't help.
-Mike
On Wed, 30 Apr 2014 16:00:03 +0200
Mike Galbraith <[email protected]> wrote:
> On Wed, 2014-04-30 at 09:15 -0400, Steven Rostedt wrote:
> > On Wed, 30 Apr 2014 15:06:29 +0200
> > Mike Galbraith <[email protected]> wrote:
> >
> >
> > > The End.. I hope. I've had enough hotplug entertainment for a while.
> >
> > Not for me. 3.14-rt stress-cpu-hotplug crashes quickly. But it's a
> > different issue than what my patch addressed. I'm still debugging it.
>
> If you didn't fix the two bugs I showed, and (wisely) didn't look at the
> beautiful lglock patches I posted (no frozen shark, I'm disappointed;),
> your patch won't help.
Mike,
I'm testing it now. But could you please post them as regular patches.
They were attachments to this thread, and were not something that stood
out.
Thanks,
-- Steve
On Wed, 30 Apr 2014 10:19:19 -0400
Steven Rostedt <[email protected]> wrote:
> I'm testing it now. But could you please post them as regular patches.
> They were attachments to this thread, and were not something that stood
> out.
With your two patches, it still crashes exactly the same way. I
probably should remove my debug just in case, but I think this box has
another problem with it.
-- Steve
On Wed, 2014-04-30 at 10:19 -0400, Steven Rostedt wrote:
> On Wed, 30 Apr 2014 16:00:03 +0200
> Mike Galbraith <[email protected]> wrote:
>
> > On Wed, 2014-04-30 at 09:15 -0400, Steven Rostedt wrote:
> > > On Wed, 30 Apr 2014 15:06:29 +0200
> > > Mike Galbraith <[email protected]> wrote:
> > >
> > >
> > > > The End.. I hope. I've had enough hotplug entertainment for a while.
> > >
> > > Not for me. 3.14-rt stress-cpu-hotplug crashes quickly. But it's a
> > > different issue than what my patch addressed. I'm still debugging it.
> >
> > If you didn't fix the two bugs I showed, and (wisely) didn't look at the
> > beautiful lglock patches I posted (no frozen shark, I'm disappointed;),
> > your patch won't help.
>
> Mike,
>
> I'm testing it now. But could you please post them as regular patches.
> They were attachments to this thread, and were not something that stood
> out.
They were meant to not stick out :) I showed what I did to deal with
that damn lglock, but showing them at all felt more akin to chumming the
waters for frozen sharks than posting patches.
'spose I could try to muster up some courage, showing them put a pretty
big dent in my supply.
-Mike
On Wed, 2014-04-30 at 10:33 -0400, Steven Rostedt wrote:
> On Wed, 30 Apr 2014 10:19:19 -0400
> Steven Rostedt <[email protected]> wrote:
>
> > I'm testing it now. But could you please post them as regular patches.
> > They were attachments to this thread, and were not something that stood
> > out.
>
> With your two patches, it still crashes exactly the same way. I
> probably should remove my debug just in case, but I think this box has
> another problem with it.
You killed this hunk of hotplug-light-get-online-cpus.patch
@@ -333,7 +449,7 @@ static int __ref _cpu_down(unsigned int
/* CPU didn't die: tell everyone. Can't complain. */
smpboot_unpark_threads(cpu);
cpu_notify_nofail(CPU_DOWN_FAILED | mod, hcpu);
- goto out_release;
+ goto out_cancel;
}
BUG_ON(cpu_online(cpu));
..and fixed this too?
Another little bug. This hunk of patches/stomp-machine-raw-lock.patch
should be while (atomic_read(&done.nr_todo))
@@ -647,7 +671,7 @@ int stop_machine_from_inactive_cpu(int (
ret = multi_cpu_stop(&msdata);
/* Busy wait for completion. */
- while (!completion_done(&done.completion))
+ while (!atomic_read(&done.nr_todo))
cpu_relax();
mutex_unlock(&stop_cpus_mutex);
On Wed, 30 Apr 2014 16:54:46 +0200
Mike Galbraith <[email protected]> wrote:
> On Wed, 2014-04-30 at 10:33 -0400, Steven Rostedt wrote:
> > On Wed, 30 Apr 2014 10:19:19 -0400
> > Steven Rostedt <[email protected]> wrote:
> >
> > > I'm testing it now. But could you please post them as regular patches.
> > > They were attachments to this thread, and were not something that stood
> > > out.
> >
> > With your two patches, it still crashes exactly the same way. I
> > probably should remove my debug just in case, but I think this box has
> > another problem with it.
>
> You killed this hunk of hotplug-light-get-online-cpus.patch
>
> @@ -333,7 +449,7 @@ static int __ref _cpu_down(unsigned int
> /* CPU didn't die: tell everyone. Can't complain. */
> smpboot_unpark_threads(cpu);
> cpu_notify_nofail(CPU_DOWN_FAILED | mod, hcpu);
> - goto out_release;
> + goto out_cancel;
I added this, but it only happens in the failure case, which I don't
think is related to the issue I'm dealing with.
> }
> BUG_ON(cpu_online(cpu));
>
> ..and fixed this too?
>
> Another little bug. This hunk of patches/stomp-machine-raw-lock.patch
> should be while (atomic_read(&done.nr_todo))
>
> @@ -647,7 +671,7 @@ int stop_machine_from_inactive_cpu(int (
> ret = multi_cpu_stop(&msdata);
>
> /* Busy wait for completion. */
> - while (!completion_done(&done.completion))
> + while (!atomic_read(&done.nr_todo))
I don't see this in the code. That is, there is no "completion_done()"
in stop_machine_from_inactive_cpu(). It is already an atomic_read().
-- Steve
> cpu_relax();
>
> mutex_unlock(&stop_cpus_mutex);
On Wed, 2014-04-30 at 11:11 -0400, Steven Rostedt wrote:
> > Another little bug. This hunk of patches/stomp-machine-raw-lock.patch
> > should be while (atomic_read(&done.nr_todo))
> >
> > @@ -647,7 +671,7 @@ int stop_machine_from_inactive_cpu(int (
> > ret = multi_cpu_stop(&msdata);
> >
> > /* Busy wait for completion. */
> > - while (!completion_done(&done.completion))
> > + while (!atomic_read(&done.nr_todo))
^--- that ! needs to go away
>
> I don't see this in the code. That is, there is no "completion_done()"
> in stop_machine_from_inactive_cpu(). It is already an atomic_read().
Yes, but it should read "while (atomic_read(&done.nr_todo))"
-Mike
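Spelled out, the busy-wait in stop_machine_from_inactive_cpu() should end
up reading:

	/* Busy wait for completion. */
	while (atomic_read(&done.nr_todo))
		cpu_relax();

	mutex_unlock(&stop_cpus_mutex);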
I fired off a 100-iteration run on the 64-core box. If it's still alive in
the morning, it should still be busy as hell.
-Mike
On Wed, 30 Apr 2014 17:15:57 +0200
Mike Galbraith <[email protected]> wrote:
> On Wed, 2014-04-30 at 11:11 -0400, Steven Rostedt wrote:
>
> > > Another little bug. This hunk of patches/stomp-machine-raw-lock.patch
> > > should be while (atomic_read(&done.nr_todo))
> > >
> > > @@ -647,7 +671,7 @@ int stop_machine_from_inactive_cpu(int (
> > > ret = multi_cpu_stop(&msdata);
> > >
> > > /* Busy wait for completion. */
> > > - while (!completion_done(&done.completion))
> > > + while (!atomic_read(&done.nr_todo))
> ^--- that ! needs to go away
> >
> > I don't see this in the code. That is, there is no "completion_done()"
> > in stop_machine_from_inactive_cpu(). It is already an atomic_read().
>
> Yes, but it should read "while (atomic_read(&done.nr_todo))"
Ah, this would have been better if you had sent a patch. I misread what
you talked about.
Yes, this was the culprit of my failures. After removing the '!', it
worked.
Care to send a patch :-)
-- Steve
On Wed, 2014-04-30 at 11:48 -0400, Steven Rostedt wrote:
> On Wed, 30 Apr 2014 17:15:57 +0200
> Mike Galbraith <[email protected]> wrote:
>
> > On Wed, 2014-04-30 at 11:11 -0400, Steven Rostedt wrote:
> >
> > > > Another little bug. This hunk of patches/stomp-machine-raw-lock.patch
> > > > should be while (atomic_read(&done.nr_todo))
> > > >
> > > > @@ -647,7 +671,7 @@ int stop_machine_from_inactive_cpu(int (
> > > > ret = multi_cpu_stop(&msdata);
> > > >
> > > > /* Busy wait for completion. */
> > > > - while (!completion_done(&done.completion))
> > > > + while (!atomic_read(&done.nr_todo))
> > ^--- that ! needs to go away
> > >
> > > I don't see this in the code. That is, there is no "completion_done()"
> > > in stop_machine_from_inactive_cpu(). It is already an atomic_read().
> >
> > Yes, but it should read "while (atomic_read(&done.nr_todo))"
>
> Ah, this would have been better if you had sent a patch. I misread what
> you talked about.
>
> Yes, this was the culprit of my failures. After removing the '!', it
> worked.
>
> Care to send a patch :-)
I figured those two were just "edit patch, done", but yeah, I can do that.
-Mike