Date: Thu, 12 Dec 2019 14:04:21 +0100
From: Peter Zijlstra
To: David Laight
Cc: "Luck, Tony", Andy Lutomirski, "Yu, Fenghua", Ingo Molnar,
    Thomas Gleixner, Borislav Petkov, H Peter Anvin, "Raj, Ashok",
    "Shankar, Ravi V", linux-kernel, x86, Will Deacon
Subject: Re: [PATCH v10 6/6] x86/split_lock: Enable split lock detection by kernel parameter

On Thu, Dec 12, 2019 at 10:36:27AM +0000, David Laight wrote:

> On x86 'xchg' is always 'locked' regardless of whether there is a 'lock' prefix.

Sure, irrelevant here though.

> set_bit() (etc) include the 'lock' prefix (dunno why this decision was made...).

Because it is the atomic set-bit function; we have __set_bit() if you
want the non-atomic one. Atomic bitops are (obviously) useful if you
have concurrent changes to your bitmap.

Lots of people seem confused on this though, as evidenced by a lot of
the broken crap we keep finding (then again, them using __set_bit()
would still be broken due to the endian thing).
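To make the distinction concrete -- an illustrative kernel-style
sketch, not code from this series:

#include <linux/bitops.h>

static unsigned long flags[BITS_TO_LONGS(64)];

static void mark_ready(int n)
{
	/* Atomic RMW (a LOCK-prefixed op on x86): safe vs concurrent updaters. */
	set_bit(n, flags);
}

static void mark_ready_unlocked(int n)
{
	/*
	 * Plain load/modify/store: two CPUs doing this concurrently on
	 * bits of the same word can lose an update.  Only valid when
	 * the caller already serialises access to the bitmap.
	 */
	__set_bit(n, flags);
}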
> For locked operations (including misaligned ones) that don't cross cache-line
> boundaries the read operation almost certainly locks the cache line (against
> a snoop) until the write has updated the cache line.

Note your use of 'almost'. Almost isn't good enough.

Note that other architectures allow the store from atomic operations to
hit the store buffer, and I strongly suspect x86 does the same. Waiting
for a store-buffer drain is *expensive*.

Try timing:

	LOCK INC (ptr);

vs

	LOCK INC (ptr);
	MFENCE

My guess is that the second one is *far* more expensive. MFENCE drains
the store buffer (and waits for completion thereof) -- it must, since
it fences against non-coherent stuff. I suppose ARM's DMB vs DSB is a
similar distinction.
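Something like the following should show it -- an untested userspace
sketch, assuming GCC/Clang on x86_64; raw TSC counts are rough, but
plenty to expose a store-buffer drain:

#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>

#define ITERS 10000000UL

static long counter;

static uint64_t bench(int with_mfence)
{
	uint64_t t0 = __rdtsc();
	unsigned long i;

	for (i = 0; i < ITERS; i++) {
		asm volatile("lock incq %0" : "+m" (counter));
		if (with_mfence)
			asm volatile("mfence" ::: "memory");
	}
	return __rdtsc() - t0;
}

int main(void)
{
	printf("lock inc        : %lu cycles/op\n", bench(0) / ITERS);
	printf("lock inc; mfence: %lu cycles/op\n", bench(1) / ITERS);
	return 0;
}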
> This won't happen until the write 'drains' from the store buffer.
> (I suspect that locked read requests act like write requests in ensuring
> that no other cpu has a dirty copy of the cache line, and also marking it dirty.)
> Although this will delay the response to the snoop it will only
> stall the cpu (or other bus master), not the entire memory 'bus'.

I really don't think so. The commit I pointed to earlier in the thread,
the one that replaced MFENCE with LOCK ADD $0, -4(%RSP) for smp_mb(),
strongly indicates LOCK-prefixed instructions do _NOT_ flush the store
buffer.

All barriers impose is order; if your store buffer can preserve order,
it should all just work. One possible way would be to tag each entry
and increment the tag on a barrier, then ensure that all smaller tags
are flushed before allowing a higher-tagged entry to leave.

> If you read the description of 'lock btr' you'll see that it always does the
> write cycle (to complete the atomic RMW expected by the memory
> subsystem) even when the bit is clear.

I know it does, but I don't see how that is relevant here.

> Remote store buffers are irrelevant to locked accesses.

They are not in general, and I've seen nothing to indicate they are on
x86.

> (If you are doing concurrent locked and unlocked accesses to the same
> memory location something is badly broken.)

It is actually quite common.

> It really can't matter whether one access is a mis-aligned 64bit word
> and the other a byte. Both do atomic RMW updates so the result
> cannot be unexpected.

Expectations are often violated, especially when talking about memory
ordering.

> In principle two separate 8 bit RMW cycles could be done concurrently
> to two halves of a 16 bit 'flag' word without losing any bits or any reads
> returning any of the expected 4 values.
> Not that any memory system would support such updates.

I'm thinking you ought to go read that paper on mixed-size concurrency
I referenced earlier in this thread. IIRC the conclusion was that
PowerPC does exactly that, and ARM64 allows for it but it hasn't been
observed yet.

Anyway, I'm not saying x86 behaves this way; I'm saying that I have
lots of questions and very few answers. I'm also saying that the
variant with non-overlapping atomics could conceivably misbehave, while
the variant with overlapping atomics is guaranteed not to.

Specifically, smp_mb()/SYNC on PowerPC cannot restore Sequential
Consistency under mixed-size operations. How's that for expectations?
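Purely for illustration -- this is not the code under discussion -- the
non-overlapping vs overlapping shapes, written with GCC's __atomic
builtins, look something like:

#include <stdint.h>

union flags {
	uint16_t word;
	uint8_t  byte[2];
};

static union flags f;

/*
 * Non-overlapping: each thread does an atomic RMW on its own byte.
 * The two accesses never hit the same location, so how other
 * observers see the combined 16-bit word is a mixed-size question
 * the architecture may leave open.
 */
static void a_nonoverlapping(void)
{
	__atomic_fetch_or(&f.byte[0], 0x01, __ATOMIC_RELAXED);
}

static void b_nonoverlapping(void)
{
	__atomic_fetch_or(&f.byte[1], 0x01, __ATOMIC_RELAXED);
}

/*
 * Overlapping: both threads do an atomic RMW on the full word, so the
 * two operations contend on one location and must serialise; cache
 * coherence alone gives a single order everyone agrees on.
 */
static void a_overlapping(void)
{
	__atomic_fetch_or(&f.word, 0x0001, __ATOMIC_RELAXED);
}

static void b_overlapping(void)
{
	__atomic_fetch_or(&f.word, 0x0100, __ATOMIC_RELAXED);
}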