Subject: Re: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks
To: Will Deacon, "Paul E. McKenney"
CC: Alan Stern, Andrea Parri, LKMM Maintainers -- Akira Yokosawa,
    Boqun Feng, David Howells, Jade Alglave, Luc Maranget,
    Nicholas Piggin, Peter Zijlstra, Kernel development list
References: <20180705150945.GA3699@andrea> <20180706211055.GN3593@linux.vnet.ibm.com> <20180709165200.GA4689@arm.com>
From: Daniel Lustig
Message-ID: <01c35480-e207-c916-078b-de53df0e2645@nvidia.com>
Date: Mon, 9 Jul 2018 10:29:54 -0700
In-Reply-To: <20180709165200.GA4689@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 7/9/2018 9:52 AM, Will Deacon wrote:
> On Fri, Jul 06, 2018 at 02:10:55PM -0700, Paul E. McKenney wrote:
>> On Fri, Jul 06, 2018 at 04:37:21PM -0400, Alan Stern wrote:
>>> On Thu, 5 Jul 2018, Andrea Parri wrote:
>>>
>>>>> At any rate, it looks like instead of strengthening the relation, I
>>>>> should write a patch that removes it entirely. I also will add new,
>>>>> stronger relations for use with locking, essentially making spin_lock
>>>>> and spin_unlock be RCsc.
>>>>
>>>> Thank you.
>>>>
>>>> Ah let me put this forward: please keep an eye on the (generic)
>>>>
>>>>   queued_spin_lock()
>>>>   queued_spin_unlock()
>>>>
>>>> (just to point out an example). Their implementation (in part.,
>>>> the fast-path) suggests that if we will stick to RCsc lock then
>>>> we should also stick to RCsc acq. load from RMW and rel. store.

Just to be clear, this is "RCsc with W->R exception" again, right?

>>> A very good point. The implementation of those routines uses
>>> atomic_cmpxchg_acquire() to acquire the lock. Unless this is
>>> implemented with an operation or fence that provides write-write
>>> ordering (in conjunction with a suitable release), qspinlocks won't
>>> have the ordering properties that we want.
>>>
>>> I'm going to assume that the release operations used for unlocking
>>> don't need to have any extra properties; only the lock-acquire
>>> operations need to be special (i.e., stronger than a normal
>>> smp_load_acquire). This suggests that atomic RMW functions with acquire
>>> semantics should also use this stronger form of acquire.

It's not clear to me that the burden of enforcing "RCsc with W->R
ordering" should always be placed only on the acquire half. RISC-V
currently places some of the burden on the release half, as we
discussed last week. Specifically, there are a few cases where
fence.tso is used instead of fence rw,w on the release side.

If we always use fence.tso here, following the current recommendation,
we'll still be fine. If LKMM introduces an RCpc vs. RCsc distinction
of some kind, though, I think we would want to distinguish the two
types of release accordingly as well.

>>> Does anybody have a different suggestion?
>>
>> The approach you suggest makes sense to me. Will, Peter, Daniel, any
>> reasons why this approach would be a problem for you guys?
>
> qspinlock is very much opt-in per arch, so we can simply require that
> an architecture must have RCsc RmW atomics if they want to use qspinlock.
> Should an architecture arise where that isn't the case, then we could
> consider an arch hook in the qspinlock code, but I don't think we have
> to solve that yet.
>
> Will

This sounds reasonable to me.

Dan
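
P.S. For readers following the qspinlock point quoted above, here is a
minimal userspace sketch in C11 of the acquire/release pairing under
discussion. The lock side is a compare-and-swap with acquire ordering
(roughly analogous to atomic_cmpxchg_acquire()), and the unlock side is
a store with release ordering. The names sketch_lock, sketch_trylock(),
sketch_lock_acquire() and sketch_unlock() are hypothetical, and plain
spinning stands in for the queued slow path, so this is an illustration
only and not the kernel's implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a lock word (not struct qspinlock). */
struct sketch_lock {
	atomic_uint val;	/* 0 = unlocked, 1 = locked */
};

/*
 * One lock attempt: compare-and-swap 0 -> 1, with acquire ordering on
 * success and relaxed ordering on failure.  Returns true if the lock
 * was taken.
 */
static bool sketch_trylock(struct sketch_lock *lock)
{
	unsigned int expected = 0;

	return atomic_compare_exchange_strong_explicit(&lock->val, &expected, 1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Assumption: simple spinning replaces the queued slow path. */
static void sketch_lock_acquire(struct sketch_lock *lock)
{
	while (!sketch_trylock(lock))
		;
}

/* Unlock is a plain store with release ordering. */
static void sketch_unlock(struct sketch_lock *lock)
{
	atomic_store_explicit(&lock->val, 0, memory_order_release);
}
```

As far as I understand it, C11 acquire/release on its own does not
promise the write-write ordering (as seen by a third, unrelated thread)
that this thread is about. Whether that ordering holds depends on how
the architecture implements the acquire and release halves, which is
the question being discussed for qspinlock.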