Date: Mon, 20 Apr 2015 08:53:58 -0700
Subject: Re: qemu:arm test failure due to commit 8053871d0f7f (smp: Fix smp_call_function_single_async() locking)
From: Linus Torvalds
To: Ingo Molnar
Cc: Guenter Roeck, Rabin Vincent, Linux Kernel Mailing List, Peter Zijlstra, Thomas Gleixner, "Paul E. McKenney"
In-Reply-To: <20150420053954.GA9923@gmail.com>
References: <20150418234050.GA5987@roeck-us.net> <55330B32.4010907@roeck-us.net> <20150419033940.GA25145@debian> <20150419093112.GA6139@gmail.com> <5533B6D7.9050101@roeck-us.net> <20150419180140.GA8934@gmail.com> <20150420053954.GA9923@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Apr 19, 2015 at 10:39 PM, Ingo Molnar wrote:
>
>>
>> So I _could_ imagine that somebody would want to do optimistic "prod
>> other cpu" calls that in all normal cases are for existing cpus, but
>> could be racy in theory.
>
> Yes, and I don't disagree with such optimizations in principle (it
> allows less references to be taken in the fast path), but is it really
> safe?
>
> If a CPU is going down and we potentially race against that, and send
> off an IPI, the IPI might be 'in flight' for an indeterminate amount
> of time, especially on wildly non-deterministic hardware like virtual
> platforms.

Well, it should be easy enough to handle that race in the cpu
offlining: after the cpu is marked "not present", just call
flush_smp_call_function_queue(). In fact, I thought we did exactly
that - it's the reason for the "warn_cpu_offline" argument, isn't it?

So I don't think there should be any real race. Sure, the HW IPI
itself might be in flight, but from a sw perspective it's all done.

No, I was talking about something even more optimistic - the CPU
number we optimistically loaded and sent an IPI to might be completely
bogus just because we loaded it using some unlocked sequence, and
maybe the memory got re-assigned. So it might not even be a CPU number
that is "stale", it could be entirely invalid.

And no, I don't claim that we should do this, I'm just saying that I
could imagine this being a valid thing to do.

But it might be a good idea to add a WARN_ON_ONCE() for now to find
the users that are not being clever like this, but are just being
stupid and wrong-headed, and sending IPIs to bogus CPUs not because
they are doing really subtle smart stuff, but just because they never
noticed how stupid they are.

               Linus
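
To make the WARN_ON_ONCE() idea above concrete, here is a minimal
user-space sketch in C. It is not kernel code and not the change that
was actually merged: NR_CPUS_SKETCH, cpu_online_sketch[],
warn_on_once_sketch() and send_ipi_sketch() are hypothetical stand-ins
invented for illustration. Warning only on an entirely invalid CPU
number, while rejecting an offline one silently, is likewise just one
possible policy assumed for the sketch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel state; not real kernel symbols. */
#define NR_CPUS_SKETCH 4
static bool cpu_online_sketch[NR_CPUS_SKETCH] = { true, true, false, true };

/* Counts how many times the warning was actually printed. */
static int warn_count;

/*
 * User-space imitation of the WARN_ON_ONCE() idea: print a warning the
 * first time the condition is true, stay quiet afterwards, and always
 * return the condition so the caller can branch on it.
 */
static bool warn_on_once_sketch(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Returns 0 if the "IPI" would be sent, -1 if the target is rejected. */
static int send_ipi_sketch(int cpu)
{
	/* Entirely invalid number, e.g. loaded from re-assigned memory. */
	if (warn_on_once_sketch(cpu < 0 || cpu >= NR_CPUS_SKETCH,
				"IPI to bogus CPU number"))
		return -1;

	/* Valid number, but the CPU is offline: the merely "stale" case. */
	if (!cpu_online_sketch[cpu])
		return -1;

	return 0;
}
```

With this, send_ipi_sketch(0) succeeds, send_ipi_sketch(2) is rejected
quietly because that CPU is marked offline, and repeated calls with an
out-of-range number are all rejected but warn only once.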