Received: by 2002:a25:8b91:0:0:0:0:0 with SMTP id j17csp941963ybl; Wed, 11 Dec 2019 09:53:24 -0800 (PST) X-Google-Smtp-Source: APXvYqxVX5WX4DRUOziKOSkagf7/YFQdqKTX+ZIz5Q/odJESiMB4UWOLHBBGtvhrMs+2oTZRKJxV X-Received: by 2002:aca:c354:: with SMTP id t81mr3832481oif.149.1576086804252; Wed, 11 Dec 2019 09:53:24 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1576086804; cv=none; d=google.com; s=arc-20160816; b=p+J+0MOlPXKe+sbPTjtQja6lRJbu3l2V6nGNlMJ1Fz11YpCitVc7HfnoMXa22NGNma 6VJ0L585pEnIakINxq1904ZGtutUuMWG24lnju604sCR2V5PIQBF1BLw47b43RzFZcuK x63WZ8FGhpzoFyEQn6HXZWCdjENehpCId7qFzv7ADfXT5G2hiM+HoXFOABkFVLQXGpTu dvtZLHu1u9at64rRWHn01TsXzmm3SmP2f1qMkBntBDvFbkPJeoJdTZ0IV28yQjLxbXYq c+k6ZximcnkFgYXC5dCMsb5yqZl+LBuRUgkAAmQkGz5ts2I6xENHyrcZNWQoGBL1zTSs Cg9g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=NNmicuFL5ok36X4eq37Ayq5N4dbbpy3GcLqAwcxHD20=; b=NFBUZoJfOFAcxa6pOo06A8mLKZS8111NAuII6sh8NRY3zK7lvoJLzNovQEsvc0iZb8 VKyMpSl/ebNS5sMV5UqR11g28F6bJZY+hp+qw6RPrBI8YdTrSR5upRsBhuWiWKUepA6P YFoo4wPq5kl2Vin5PHbn1O7jl5nntgHjZOwJmoB8AughGLu9Wnt0tSVxY/StXZlwoT+0 Cgp3uptw+y36xTEZwJYufGBCSueE7QoesZ7KcbOLxHjnm8spmBHJxqxdp1kTbJRmPCXr a85MqU71Tq5H+JTwd8GZ2tFhpIg3ROR+My560nQjiGH/tnoh57n1YMjV7DQJFMKGdP4f 8i5Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@infradead.org header.s=bombadil.20170209 header.b=nxZLRJlP; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id k8si1809987otf.76.2019.12.11.09.53.07; Wed, 11 Dec 2019 09:53:24 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@infradead.org header.s=bombadil.20170209 header.b=nxZLRJlP; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730431AbfLKRwS (ORCPT + 99 others); Wed, 11 Dec 2019 12:52:18 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:57090 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726856AbfLKRwR (ORCPT ); Wed, 11 Dec 2019 12:52:17 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=In-Reply-To:Content-Type:MIME-Version :References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=NNmicuFL5ok36X4eq37Ayq5N4dbbpy3GcLqAwcxHD20=; b=nxZLRJlP62cN7e2AvJqHBD1EX Q/9Rmt2XXKWYsEOK6WOOI53eUGjAIOGVvwN0nddevYkw8poXEYzXZs+E3qVPJ4Pd36PX5lkaM66Un 7b7oJCljoeOVnfR6w7trjP/m8yGhpq50mIg+FzkznyOSZhTCwq5XMrLvKpvOsWh3GpAVVfZj69O/t ZGXJ7nnLaE7ipj3XR2lqTH3Po8hHa6dRokw3KBXD8DwZENktcWAjE3Hp+U8L1IxOfuhlU8GaD7xld R165MNTTuuHxCWLDJEaU/QRpgV1VoecoWc1CtlWN9J+aA9kQPOa19x1oHYxYZq2tJFX4QqNsXrBrP AEjE3uiQA==; Received: from j217100.upc-j.chello.nl ([24.132.217.100] helo=noisy.programming.kicks-ass.net) by bombadil.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1if69d-000597-VL; Wed, 11 Dec 2019 17:52:06 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id CAEEA300F29; Wed, 11 Dec 2019 18:50:42 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 1000) id C1C5920120E4D; Wed, 11 Dec 2019 18:52:02 +0100 (CET) Date: Wed, 11 Dec 2019 18:52:02 +0100 From: Peter Zijlstra To: Andy Lutomirski Cc: "Luck, Tony" , "Yu, Fenghua" , David Laight , Ingo Molnar , Thomas Gleixner , Ingo Molnar , Borislav Petkov , H Peter Anvin , "Raj, Ashok" , "Shankar, Ravi V" , linux-kernel , x86 , Will Deacon Subject: Re: [PATCH v10 6/6] x86/split_lock: Enable split lock detection by kernel parameter Message-ID: <20191211175202.GQ2827@hirez.programming.kicks-ass.net> References: <20191121171214.GD12042@gmail.com> <3481175cbe14457a947f934343946d52@AcuMS.aculab.com> <20191121185303.GB199273@romley-ivt3.sc.intel.com> <20191121202508.GZ4097@hirez.programming.kicks-ass.net> <20191122092555.GA4097@hirez.programming.kicks-ass.net> <3908561D78D1C84285E8C5FCA982C28F7F4DD19F@ORSMSX115.amr.corp.intel.com> <20191122203105.GE2844@hirez.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Nov 22, 2019 at 01:23:30PM -0800, Andy Lutomirski wrote: > On Fri, Nov 22, 2019 at 12:31 PM Peter Zijlstra wrote: > > > > On Fri, Nov 22, 2019 at 05:48:14PM +0000, Luck, Tony wrote: > > > > When we use byte ops, we must consider the word as 4 independent > > > > variables. And in that case the later load might observe the lock-byte > > > > state from 3, because the modification to the lock byte from 4 is in > > > > CPU2's store-buffer. > > > > > > So we absolutely violate this with the optimization for constant arguments > > > to set_bit(), clear_bit() and change_bit() that are implemented as byte ops. > > > > > > So is code that does: > > > > > > set_bit(0, bitmap); > > > > > > on one CPU. While another is doing: > > > > > > set_bit(mybit, bitmap); > > > > > > on another CPU safe? The first operates on just one byte, the second on 8 bytes. > > > > It is safe if all you care about is the consistency of that one bit. > > > > I'm still lost here. Can you explain how one could write code that > observes an issue? My trusty SDM, Vol 3 8.2.2 says "Locked > instructions have a total order." This is the thing I don't fully believe. Per this thread the bus-lock is *BAD* and not used for normal LOCK prefixed operations. But without the bus-lock it becomes very hard to guarantee total order. After all, if some CPU doesn't observe a specific variable, it doesn't care where in the order it fell. So I'm thinking they punted and went with some partial order that is near enough that it becomes very hard to tell the difference the moment you actually do observe stuff. > 8.2.3.9 says "Loads and Stores Are > Not Reordered with Locked Instructions." Admittedly, the latter is an > "example", but the section is very clear about the fact that a locked > instruction prevents reordering of a load or a store issued by the > same CPU relative to the locked instruction *regardless of whether > they overlap*. IIRC this rule is CPU-local. Sure, but we're talking two cpus here. u32 var = 0; u8 *ptr = &var; CPU0 CPU1 xchg(ptr, 1) xchg((ptr+1, 1); r = READ_ONCE(var); AFAICT nothing guarantees r == 0x0101. The CPU1 store can be stuck in CPU1's store-buffer. CPU0's xchg() does not overlap and therefore doesn't force a snoop or forward. From the perspective of the LOCK prefixed instructions CPU0 never observes the variable @ptr. And therefore doesn't need to provide order. Note how the READ_ONCE() is a normal load on CPU0, and per the rules is only forced to happen after it's own LOCK prefixed instruction, but it is free to observe ptr[0,2,3] from before, only ptr[1] will be forwarded from its own store-buffer. This is exactly the one reorder TSO allows. > I understand that the CPU is probably permitted to optimize a LOCK RMW > operation such that it retires before the store buffers of earlier > instructions are fully flushed, but only if the store buffer and cache > coherency machinery work together to preserve the architecturally > guaranteed ordering. Maybe, maybe not. I'm very loathe to trust this without things being better specified. Like I said, it is possible that it all works, but the way I understand things I _really_ don't want to rely on it. Therefore, I've written: u32 var = 0; u8 *ptr = &var; CPU0 CPU1 xchg(ptr, 1) set_bit(8, ptr); r = READ_ONCE(var); Because then the LOCK BTSL overlaps with the LOCK XCHGB and CPU0 now observes the variable @ptr and therefore must force order. Did this clarify, or confuse more?