From: Andy Lutomirski
Date: Thu, 12 Dec 2019 11:40:48 -0800
Subject: Re: [PATCH v10 6/6] x86/split_lock: Enable split lock detection by kernel parameter
To: Peter Zijlstra
Cc: Andy Lutomirski, "Luck, Tony", "Yu, Fenghua", David Laight,
    Ingo Molnar, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
    H Peter Anvin, "Raj, Ashok", "Shankar, Ravi V", linux-kernel, x86,
    Will Deacon
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Dec 11, 2019 at 2:34 PM Peter Zijlstra wrote:
>
> On Wed, Dec 11, 2019 at 10:12:56AM -0800, Andy Lutomirski wrote:
> > > Sure, but we're talking two cpus here.
> > >
> > >         u32 var = 0;
> > >         u8 *ptr = &var;
> > >
> > >         CPU0                            CPU1
> > >
> > >                                         xchg(ptr, 1)
> > >
> > >         xchg(ptr+1, 1);
> > >         r = READ_ONCE(var);
> > >
> > > AFAICT nothing guarantees r == 0x0101. The CPU1 store can be stuck in
> > > CPU1's store-buffer. CPU0's xchg() does not overlap and therefore
> > > doesn't force a snoop or forward.
> >
> > I think I don't quite understand.  The final value of var had better
> > be 0x0101 or something is severely wrong.  But r can be 0x0100 because
> > nothing in this example guarantees that the total order of the locked
> > instructions has CPU 1's instruction first.
>
> Assuming CPU1 goes first, why would the load from CPU0 see CPU1's
> ptr[0]? It can be in CPU1 store buffer, and TSO allows regular reads to
> ignore (remote) store-buffers.

What I'm saying is: if CPU0 goes first, then the three operations order as:

        xchg(ptr+1, 1);
        r = READ_ONCE(var);  /* 0x0100 */
        xchg(ptr, 1);

Anyway, this is all a bit too hypothetical for me.  Is there a clear
example where the total ordering of LOCKed instructions is observable?
That is, is there a sequence of operations on, presumably, two or
three CPUs, such that LOCKed instructions being only partially ordered
allows an outcome that is disallowed by a total ordering?  I suspect
there is, but I haven't come up with it yet.  (I mean in an x86-like
memory model.  Getting this in a relaxed atomic model is easy.)

As a probably bad example:

        u32 x0, x1, a1, b0, b1;  /* all initially 0 */

        CPU 0:
        xchg(&x0, 1);
        barrier();
        a1 = READ_ONCE(x1);

        CPU 1:
        xchg(&x1, 1);

        CPU 2:
        b1 = READ_ONCE(x1);
        smp_rmb();  /* which is just barrier() on x86 */
        b0 = READ_ONCE(x0);

Suppose a1 == 0 and b1 == 1.  Then we know that CPU0's READ_ONCE
happened before CPU1's xchg and hence CPU0's xchg happened before
CPU1's xchg.  We also know that CPU2's first read observed the write
from CPU1's xchg, which means that CPU2's second read should have been
after CPU0's xchg (because the xchg operations have a total order
according to the SDM).  This means that b0 can't be 0.  Hence the
outcome (a1, b1, b0) == (0, 1, 0) is disallowed.

It's entirely possible that I screwed up the analysis.  But I think
this means that the cache coherency mechanism is doing something more
intelligent than just shoving the x0=1 write into the store buffer and
letting it hang out there.  Something needs to make sure that CPU 2
observes everything in the same order that CPU 0 observes, and, as far
as I know, there is a considerable amount of complexity in the CPUs
that makes sure this happens.

So here's my question: do you have a concrete example of a series of
operations and an outcome that you suspect Intel CPUs allow but that
is disallowed in the SDM?

--Andy
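
The three-CPU example above can be checked mechanically under one simplifying assumption: every operation takes effect as a single indivisible step in one global order, and each CPU's operations stay in program order. That is stricter than real x86-TSO, but the only stores in the example are LOCKed xchg operations and x86 does not reorder loads with older loads, so it is probably an adequate stand-in for this particular test. The sketch below is plain userspace C, not kernel code, and the names struct state, step(), explore() and outcome_reachable() are made up for illustration. It walks every such interleaving and records which (a1, b1, b0) results can occur.

```c
#include <assert.h>
#include <stdbool.h>

/* Shared variables and result registers from the three-CPU example. */
struct state {
    int x0, x1;      /* shared memory, initially 0 */
    int a1, b1, b0;  /* values observed by the READ_ONCE()s */
    int pc[3];       /* index of the next operation on CPU 0, 1, 2 */
};

/* Number of operations each CPU executes (barriers are not modelled). */
static const int n_ops[3] = { 2, 1, 2 };

/* Execute the next operation of `cpu` as one indivisible step (assumption). */
static void step(struct state *s, int cpu)
{
    assert(s->pc[cpu] < n_ops[cpu]);
    switch (cpu) {
    case 0:
        if (s->pc[0] == 0)
            s->x0 = 1;          /* xchg(&x0, 1) */
        else
            s->a1 = s->x1;      /* a1 = READ_ONCE(x1) */
        break;
    case 1:
        s->x1 = 1;              /* xchg(&x1, 1) */
        break;
    case 2:
        if (s->pc[2] == 0)
            s->b1 = s->x1;      /* b1 = READ_ONCE(x1) */
        else
            s->b0 = s->x0;      /* b0 = READ_ONCE(x0) */
        break;
    }
    s->pc[cpu]++;
}

/* Depth-first walk over every interleaving that respects program order. */
static void explore(struct state s, bool seen[8])
{
    bool done = true;

    for (int cpu = 0; cpu < 3; cpu++) {
        if (s.pc[cpu] == n_ops[cpu])
            continue;           /* this CPU has finished */
        done = false;
        struct state next = s;
        step(&next, cpu);
        explore(next, seen);
    }
    if (done)                   /* all CPUs finished: record the outcome */
        seen[(s.a1 << 2) | (s.b1 << 1) | s.b0] = true;
}

/* True if some interleaving ends with the given (a1, b1, b0), each 0 or 1. */
bool outcome_reachable(int a1, int b1, int b0)
{
    bool seen[8] = { false };
    struct state init = { 0 };

    explore(init, seen);
    return seen[(a1 << 2) | (b1 << 1) | b0];
}
```

Under this model outcome_reachable(0, 1, 0) is false while the other seven combinations are reachable, which matches the argument that a total order on the xchg operations rules out (a1, b1, b0) == (0, 1, 0). It says nothing about what hardware with only partially ordered LOCKed instructions would do; that would need a model with explicit store buffers.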