Date: Mon, 27 May 2019 12:53:39 +0200
From: Peter Zijlstra
To: Rik van Riel
Cc: Andrew Murray, Thomas Gleixner, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] smp,cpumask: Don't call functions on offline CPUs
Message-ID: <20190527105339.GZ2623@hirez.programming.kicks-ass.net>
References: <20190522111537.27815-1-andrew.murray@arm.com> <20190522140921.GD16275@worktop.programming.kicks-ass.net> <20190522143711.GC8268@e119886-lin.cambridge.arm.com> <20190522144918.GH16275@worktop.programming.kicks-ass.net>

On Wed, May 22, 2019 at 02:23:47PM -0400, Rik van Riel wrote:
> On Wed, 2019-05-22 at 16:49 +0200, Peter Zijlstra wrote:
> > On Wed, May 22, 2019 at 03:37:11PM +0100, Andrew Murray wrote:
> > > > Is perhaps the problem that on_each_cpu_cond() uses
> > > > cpu_online_mask without protection?
> > >
> > > Does this prevent racing with a CPU going offline? I guess this
> > > prevents the warning at the expense of a lock - but is only
> > > beneficial in the unlikely path. (In the likely path this prevents
> > > new CPUs going offline but we don't care because we don't WARN if
> > > they aren't there when we attempt to call functions).
> > >
> > > At least this is my limited understanding.
> >
> > Hmm.. I don't think it could matter, we only use the mask when
> > preempt_disable(), which would already block offline, due to it using
> > stop_machine().
> >
> > So the patch is a no-op.
> >
> > What's the WARN you see? TLB invalidation should pass mm_cpumask(),
> > which similarly should not contain offline CPUs I'm thinking.
>
> Does the TLB invalidation code have anything in it
> to prevent it from racing with the CPU offline code?
>
> In other words, could we end up with the TLB
> invalidation code building its bitmask, getting
> interrupted (eg. hypervisor preemption, NMI),
> and not sending out the IPI to that bitmask of
> CPUs until after one of the CPUs in the bitmap
> has gotten offlined?

One possibility is that cpu-offline didn't remove the bit from
mm_cpumask() because the CPU entered lazy state earlier, or something
like that. Then mm_cpumask() would contain an offline CPU and the WARN
could trigger. I don't _think_ we do that, but I didn't check.
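
To make the hypothetical above concrete, here is a toy userspace model of
it. This is only a sketch and not kernel code: every toy_* name is made up
for illustration, `online` merely stands in for cpu_online_mask and
`mm_mask` for mm_cpumask(), and the clear_mm_bit flag is an assumption used
to model the "offline forgot to clear the bit" case that nobody in this
thread has confirmed actually happens.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy userspace model; none of these names are real kernel APIs. */
#define TOY_NR_CPUS 8u

typedef uint32_t toy_cpumask_t;	/* one bit per CPU */

struct toy_system {
	toy_cpumask_t online;	/* stands in for cpu_online_mask */
	toy_cpumask_t mm_mask;	/* stands in for mm_cpumask(mm) */
};

static toy_cpumask_t toy_bit(unsigned int cpu)
{
	assert(cpu < TOY_NR_CPUS);
	return (toy_cpumask_t)1u << cpu;
}

/*
 * Take a CPU offline. With clear_mm_bit == false the CPU's bit is left
 * behind in mm_mask, which models the hypothesised stale-bit case.
 */
static void toy_cpu_offline(struct toy_system *sys, unsigned int cpu,
			    bool clear_mm_bit)
{
	sys->online &= ~toy_bit(cpu);
	if (clear_mm_bit)
		sys->mm_mask &= ~toy_bit(cpu);
}

/*
 * CPUs that a cross-call over mm_mask would target although they are
 * offline. A non-zero result models the condition under which the WARN
 * could trigger.
 */
static toy_cpumask_t toy_stale_targets(const struct toy_system *sys)
{
	return sys->mm_mask & ~sys->online;
}
```

Starting from online = 0x0f and mm_mask = 0x06, offlining CPU 1 with
clear_mm_bit = true leaves toy_stale_targets() at 0, while then offlining
CPU 2 with clear_mm_bit = false leaves it at 0x04, i.e. the mask now names
an offline CPU.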