Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753780AbbDSDkC (ORCPT ); Sat, 18 Apr 2015 23:40:02 -0400 Received: from mail-wi0-f179.google.com ([209.85.212.179]:34768 "EHLO mail-wi0-f179.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752068AbbDSDj7 (ORCPT ); Sat, 18 Apr 2015 23:39:59 -0400 Date: Sun, 19 Apr 2015 05:39:40 +0200 From: Rabin Vincent To: Guenter Roeck Cc: Linus Torvalds , Linux Kernel Mailing List , Peter Zijlstra , Ingo Molnar Subject: Re: qemu:arm test failure due to commit 8053871d0f7f (smp: Fix smp_call_function_single_async() locking) Message-ID: <20150419033940.GA25145@debian> References: <20150418232325.GA22411@roeck-us.net> <20150418234050.GA5987@roeck-us.net> <55330B32.4010907@roeck-us.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <55330B32.4010907@roeck-us.net> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3918 Lines: 100 On Sat, Apr 18, 2015 at 06:56:02PM -0700, Guenter Roeck wrote: > Further debugging (with added WARN_ON if cpu != 0 in smp_call_function_single) shows: > > [<800157ec>] (unwind_backtrace) from [<8001250c>] (show_stack+0x10/0x14) > [<8001250c>] (show_stack) from [<80494cb4>] (dump_stack+0x88/0x98) > [<80494cb4>] (dump_stack) from [<80024058>] (warn_slowpath_common+0x84/0xb4) > [<80024058>] (warn_slowpath_common) from [<80024124>] (warn_slowpath_null+0x1c/0x24) > [<80024124>] (warn_slowpath_null) from [<80078fc8>] (smp_call_function_single+0x170/0x178) > [<80078fc8>] (smp_call_function_single) from [<80090024>] (perf_event_exit_cpu+0x80/0xf0) > [<80090024>] (perf_event_exit_cpu) from [<80090110>] (perf_cpu_notify+0x30/0x48) > [<80090110>] (perf_cpu_notify) from [<8003d340>] (notifier_call_chain+0x44/0x84) > [<8003d340>] (notifier_call_chain) from [<8002451c>] (_cpu_up+0x120/0x168) > [<8002451c>] (_cpu_up) from [<800245d4>] (cpu_up+0x70/0x94) > [<800245d4>] (cpu_up) from [<80624234>] (smp_init+0xac/0xb0) > [<80624234>] (smp_init) from [<80618d84>] (kernel_init_freeable+0x118/0x268) > [<80618d84>] (kernel_init_freeable) from [<8049107c>] (kernel_init+0x8/0xe8) > [<8049107c>] (kernel_init) from [<8000f320>] (ret_from_fork+0x14/0x34) > ---[ end trace 2f9f1bb8a47b3a1b ]--- > smp_call_function_single, cpu=1, wait=1, csd_stack=87825ea0 > generic_exec_single, cpu=1, smp_processor_id()=0 > csd_lock_wait: csd=87825ea0, flags=0x3 > > This is repeated for each secondary CPU. But the secondary CPUs don't respond because > they are not enabled, which I guess explains why the lock is never released. > > So, in other words, this happens because the system believes (presumably per configuration > / fdt data) that there are four CPU cores, but that is not really the case. Previously that > did not matter, and was handled correctly. Now it is fatal. > > Does this help ? 8053871d0f7f67c7efb7f226ef031f78877d6625 moved the csd locking to the callers, but the offline check, which was earlier done before the csd locking, was not moved. The following moves the checks to the earlier point fixes your boot: diff --git a/kernel/smp.c b/kernel/smp.c index 2aaac2c..ba1fb01 100644 --- a/kernel/smp.c +++ b/kernel/smp.c @@ -159,9 +159,6 @@ static int generic_exec_single(int cpu, struct call_single_data *csd, } - if ((unsigned)cpu >= nr_cpu_ids || !cpu_online(cpu)) - return -ENXIO; - csd->func = func; csd->info = info; @@ -289,6 +286,12 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info, WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled() && !oops_in_progress); + if (cpu != smp_processor_id() + && ((unsigned)cpu >= nr_cpu_ids || !cpu_online(cpu))) { + err = -ENXIO; + goto out; + } + csd = &csd_stack; if (!wait) { csd = this_cpu_ptr(&csd_data); @@ -300,6 +303,7 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info, if (wait) csd_lock_wait(csd); +out: put_cpu(); return err; @@ -328,6 +332,12 @@ int smp_call_function_single_async(int cpu, struct call_single_data *csd) preempt_disable(); + if (cpu != smp_processor_id() + && ((unsigned)cpu >= nr_cpu_ids || !cpu_online(cpu))) { + err = -ENXIO; + goto out; + } + /* We could deadlock if we have to wait here with interrupts disabled! */ if (WARN_ON_ONCE(csd->flags & CSD_FLAG_LOCK)) csd_lock_wait(csd); @@ -336,8 +346,9 @@ int smp_call_function_single_async(int cpu, struct call_single_data *csd) smp_wmb(); err = generic_exec_single(cpu, csd, csd->func, csd->info); - preempt_enable(); +out: + preempt_enable(); return err; } EXPORT_SYMBOL_GPL(smp_call_function_single_async); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/