Received: by 2002:a05:7412:d1aa:b0:fc:a2b0:25d7 with SMTP id ba42csp1185568rdb; Tue, 30 Jan 2024 10:14:33 -0800 (PST) X-Google-Smtp-Source: AGHT+IH1tvE4TD48t7/5IfzNWThsuoRvfZ7XLLnhF67HqSxt214P4yIzSaj+ivzz2ht3l1Wk3R+d X-Received: by 2002:aa7:df96:0:b0:55f:5e7b:7c6c with SMTP id b22-20020aa7df96000000b0055f5e7b7c6cmr499433edy.32.1706638473075; Tue, 30 Jan 2024 10:14:33 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1706638473; cv=pass; d=google.com; s=arc-20160816; b=xkink025K3kcRqpaPTs6xiVEzQsUGZnsdithL+LTwcTp2TMmAN3EAkBTdap8LDoXJB GSlxRCrOKFw19tF3nSff6RbSp+a9FH4lqjRwDnE+vrBthBqF8gfrlovltEV09wl+kzFd hReGwv+LV+iQWVd9qqAgDFn7O1YSEooil22j0yPasRFhPgAH3zK7EuLdH7LYotWIVNzt xTUqA2+iiwdP2A3UwsPO2LMfSN3BUtA9BuN4WyLSFwPhi6jUdwi11l0zVuJ8DLC/OSBD GltG5w2tmz75h8DobIzTzKQPMeNjmbXMBcL4gzqsB9w0cPrecvxqd8eIlvey8AbdIORA ozZA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=in-reply-to:content-disposition:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:message-id:subject:cc :to:from:date; bh=neyD+VgCfoxF0786R2lmXUONkxPdGWxybpJ2MCGSQJM=; fh=uDGV37n6EnVQGZVMjZ68hgHBUSrnM8fmu+TQ3o8FtvI=; b=cLbC2a/Qk7iEiFGrzsReA/cEZV+O9Xk9BKVBDZq++NvF9ErUl6BgecG8m1bRSlGi6u rPdpLM5d37aTXNUcLPPqlRDCsw8rkTZ7P3h0DDfR3D0vFbBACfJyk2w075yZt2IvT5Ac URJZFXY+QuVmeSvW8Fmn9kIBITySoaNBZyzwV8A6qs/adyZ/eadLH8QSk4WRpwB7aN4f JcvQdeMsO6dsUFoO/CfAybPRscOOMjQoAOrtpeykENtMLVFvGnQZK4wqQnjq/nVf2TCX gy2H9mCudzC2XFPTr6C0Bqm5wr0sUC9B6on0NZbXhv13payT8KHa+WUf7gQ2oapvCEiD gujg== ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=arm.com dmarc=pass fromdomain=arm.com); spf=pass (google.com: domain of linux-kernel+bounces-45083-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-45083-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [2604:1380:4601:e00::3]) by mx.google.com with ESMTPS id d10-20020a056402516a00b0055f4a49fa3esi609502ede.127.2024.01.30.10.14.33 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Jan 2024 10:14:33 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-45083-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) client-ip=2604:1380:4601:e00::3; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=arm.com dmarc=pass fromdomain=arm.com); spf=pass (google.com: domain of linux-kernel+bounces-45083-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-45083-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id CFBFF1F2612A for ; Tue, 30 Jan 2024 18:14:32 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id ED03C157056; Tue, 30 Jan 2024 18:14:21 +0000 (UTC) Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 6578D1552EC; Tue, 30 Jan 2024 18:14:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706638461; cv=none; b=IcqAukhc1I0N5NCR6nP1sMM0Wvz5bSQmvuoC71R5S165g9m8jy/6Q+5VOgHhLwOHqAC1QMFj6DmKFYvb5HUSvHJIB2+f3NCucb1SQEUO7SnY0kox1Q9OkRdC20sY43ABVr18X+ds6K13I1/XkAgzYNuFfFDQXD+NNCa6Uq1+7PA= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706638461; c=relaxed/simple; bh=/tqXGxwF3b22rn3wgrCe0BaGaBi/qiWgcoDjGcjSu3g=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=F6lwPSgClyT1VU+qon9gZJC4mOu8UtTjBiDKt6eQ8RqxXnBNIIWyw9jJeIf3J98oWct9dUtWjLtg1YzY9Voj/dPVPVQGOqyGYxkv+BYj/ZBZsVIRtst88ymCOef4ndXUUeVXYyp8UHPwZlNO/TU/z98RbD9IFpxIOpeirf8CW2k= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id A28E6DA7; Tue, 30 Jan 2024 10:14:59 -0800 (PST) Received: from FVFF77S0Q05N (unknown [10.57.45.140]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 7811B3F762; Tue, 30 Jan 2024 10:14:14 -0800 (PST) Date: Tue, 30 Jan 2024 18:14:11 +0000 From: Mark Rutland To: Fenghua Yu Cc: Vinod Koul , Dave Jiang , dmaengine@vger.kernel.org, linux-kernel , Nikhil Rao , Tony Zhu , Mathieu Desnoyers Subject: Re: [PATCH] dmaengine: idxd: Change wmb() to smp_wmb() when copying completion record to user space Message-ID: References: <20240130025806.2027284-1-fenghua.yu@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: On Tue, Jan 30, 2024 at 05:58:24PM +0000, Mark Rutland wrote: > This patch might be ok (it looks reasonable as an optimization), but I think > the description of wmb() and smp_wmb() is incorrect. I also think that you're > missing an rmb()/smp_rmb()eor equivalent on the reader side. Sorry, the above should have said: an rmb()/smp_rmb() *or* equivalent > > On Mon, Jan 29, 2024 at 06:58:06PM -0800, Fenghua Yu wrote: > > wmb() is used to ensure status in the completion record is written > > after the rest of the completion record, making it visible to the user. > > However, on SMP systems, this may not guarantee visibility across > > different CPUs. > > > > Considering this scenario that event log handler is running on CPU1 while > > user app is polling completion record (cr) status on CPU2: > > > > CPU1 CPU2 > > event log handler user app > > > > 1. cr = 0 (status = 0) > > 2. copy X to user cr except "status" > > 3. wmb() > > 4. copy Y to user cr "status" > > 5. poll status value Y > > 6. read rest cr which is still 0. > > cr handling fails > > 7. cr value X visible now > > > > Although wmb() ensure value Y is written and visible after X is written > > on CPU1, the order is not guaranteed on CPU2. So user app may see status > > value Y while cr value X is still not visible yet on CPU2. This will > > cause reading 0 from the rest of cr and cr handling fails. > > The wmb() on CPU1 ensures the order of the reads, but you need an rmb() on CPU2 Sorry again, the above should have said: The wmb() on CPU1 ensures the order of the *writes* Apologies for any confusion resulting from those mistakes. Mark. > between reading the 'status' and 'rest' parts; otherwise CPU2 (or the > compiler!) is permitted to hoist the read of 'rest' early, before reading from > 'status', and hence you can end up with a sequence that is effectively: > > CPU1 CPU2 > event log handler user app > > 1. cr = 0 (status = 0) > 6a. read rest cr which is still 0. > 2. copy X to user cr except "status" > 3. wmb() > 4. copy Y to user cr "status" > 5. poll status value Y > 6b. cr handling fails > 7. cr value X visible now > > Since this is all to regular cacheable memory, it's *sufficient* to use > smp_wmb() and smp_rmb(), but that's an optimization rather than an ordering > fix. > > Note that on x86_64, TSO means that the stores are in-order (and so smp_wmb() > is just a compiler barrier), and IIUC loads are not reordered w.r.t. other > loads (and so smp_rmb() is also just a compiler barrier). > > > Changing wmb() to smp_wmb() ensures Y is written after X on both CPU1 > > and CPU2. This guarantees that user app can consume cr in right order. > > This implies that smp_wmb() is *stronger* than wmb(), whereas smp_wmb() is > actually *weaker* (e.g. on x86_64 wmb() is an sfence, whereas smp_wmb() is a > barrier()). > > Thanks, > Mark. > > > > > Fixes: b022f59725f0 ("dmaengine: idxd: add idxd_copy_cr() to copy user completion record during page fault handling") > > Suggested-by: Nikhil Rao > > Tested-by: Tony Zhu > > Signed-off-by: Fenghua Yu > > --- > > drivers/dma/idxd/cdev.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/dma/idxd/cdev.c b/drivers/dma/idxd/cdev.c > > index 77f8885cf407..9b7388a23cbe 100644 > > --- a/drivers/dma/idxd/cdev.c > > +++ b/drivers/dma/idxd/cdev.c > > @@ -681,9 +681,10 @@ int idxd_copy_cr(struct idxd_wq *wq, ioasid_t pasid, unsigned long addr, > > * Ensure that the completion record's status field is written > > * after the rest of the completion record has been written. > > * This ensures that the user receives the correct completion > > - * record information once polling for a non-zero status. > > + * record information on any CPU once polling for a non-zero > > + * status. > > */ > > - wmb(); > > + smp_wmb(); > > status = *(u8 *)cr; > > if (put_user(status, (u8 __user *)addr)) > > left += status_size; > > -- > > 2.37.1 > > > > >