From: Dan Williams
Date: Thu, 4 Nov 2021 09:08:41 -0700
Subject: Re: [dm-devel] [PATCH 0/6] dax poison recovery with RWF_RECOVERY_DATA flag
To: Christoph Hellwig
Cc: Jane Chu, "Darrick J. Wong", david@fromorbit.com, vishal.l.verma@intel.com, dave.jiang@intel.com, agk@redhat.com, snitzer@redhat.com, dm-devel@redhat.com, ira.weiny@intel.com, willy@infradead.org, vgoyal@redhat.com, linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev, linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org
References: <20211021001059.438843-1-jane.chu@oracle.com> <2102a2e6-c543-2557-28a2-8b0bdc470855@oracle.com> <20211028002451.GB2237511@magnolia>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Nov 4, 2021 at 1:36 AM Christoph Hellwig wrote:
>
> On Wed, Nov 03, 2021 at 11:21:39PM -0700, Dan Williams wrote:
> > The concern I have with dax_clear_poison() is that it precludes atomic
> > error clearing.
>
> atomic as in clear poison and write the actual data?  Yes, that would
> be useful, but it is not how the Intel pmem support actually works, right?

Yes, atomic clear+write new data.
The ability to atomically clear requires either a CPU with the ability
to overwrite cachelines without doing a RMW cycle (MOVDIR64B), or it
requires a device with a suitable slow-path mailbox command like the
one defined for CXL devices (see section 8.2.9.5.4.3 Clear Poison in
CXL 2.0).

I don't know why you think these devices don't perform wear-leveling
with spare blocks?

> > Also, as Boris and I discussed, poisoned pages should
> > be marked NP (not present) rather than UC (uncacheable) [1].
>
> This would not really have an affect on the series, right?  But yes,
> that seems like the right thing to do.

It would, because the implementations would need to be careful to clear
poison in an entire page before any of it could be accessed. With an
enlightened write-path RWF flag or custom fault handler it could do
sub-page overwrites of poison. Not that I think the driver should
optimize for multiple failed cachelines in a page, but it does mean
dax_clear_poison() fails in more theoretical scenarios.

> > With
> > those 2 properties combined I think that wants a custom pmem fault
> > handler that knows how to carefully write to pmem pages with poison
> > present, rather than an additional explicit dax-operation. That also
> > meets Christoph's requirement of "works with the intended direct
> > memory map use case".
>
> So we have 3 kinds of accesses to DAX memory:
>
>  (1) user space mmap direct access.
>  (2) iov_iter based access (could be from kernel or userspace)
>  (3) open coded kernel access using ->direct_access
>
> One thing I noticed:  (2) could also work with kernel memory or pages,
> but that doesn't use MC safe access.

Yes, but after the fight to even get copy_mc_to_kernel() to exist for
pmem_copy_to_iter() I did not have the nerve to push for wider usage.

> Which seems like a major independent
> of this discussion.
>
> I suspect all kernel access could work fine with a copy_mc_to_kernel
> helper as long as everyone actually uses it,

All kernel accesses do use it.
They either route to pmem_copy_to_iter() or, like dm-writecache, call
it directly. Do you see a kernel path that does not use that helper?

> missing required bits of (2) and (3) together with something like the
> ->clear_poison series from Jane.  We just need to think hard what we
> want to do for userspace mmap access.

dax_clear_poison() is at least ready to go today and does not preclude
adding the atomic and finer-grained support later.
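
As an illustration of the distinction discussed above (page-granular
clear followed by a later write, versus an atomic per-cacheline
clear+write), here is a minimal userspace toy model in C. It is only a
sketch under stated assumptions: struct model_page and every model_*
function are hypothetical names invented for this example, not kernel
APIs. The read helper only imitates the documented return convention of
copy_mc_to_kernel() (number of bytes not copied, 0 on success), and
modeling cleared lines as zero-filled is an assumption, not a statement
about what any device does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_LINE_SIZE 64u     /* cacheline size assumed by the model */
#define MODEL_PAGE_SIZE 4096u   /* page size assumed by the model */
#define MODEL_LINES (MODEL_PAGE_SIZE / MODEL_LINE_SIZE)

/* Hypothetical toy model of one pmem page; not a kernel structure. */
struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
	bool poison[MODEL_LINES];       /* one poison flag per cacheline */
};

/*
 * Imitates a machine-check-safe read: copy until a poisoned cacheline
 * is reached, then stop. Returns the number of bytes NOT copied, so 0
 * means the whole range was read.
 */
static size_t model_copy_mc(void *dst, const struct model_page *pg,
			    size_t off, size_t len)
{
	size_t done = 0;

	assert(off <= MODEL_PAGE_SIZE && len <= MODEL_PAGE_SIZE - off);
	while (done < len) {
		size_t pos = off + done;
		size_t line = pos / MODEL_LINE_SIZE;
		size_t chunk = MODEL_LINE_SIZE - pos % MODEL_LINE_SIZE;

		if (pg->poison[line])
			break;  /* real hardware would signal a machine check */
		if (chunk > len - done)
			chunk = len - done;
		memcpy((unsigned char *)dst + done, pg->data + pos, chunk);
		done += chunk;
	}
	return len - done;
}

/*
 * Atomic clear+write in the model: a full, aligned cacheline overwrite
 * installs the new data and drops the poison flag in one step, without
 * reading the old contents.
 */
static bool model_write_line_atomic(struct model_page *pg, size_t line,
				    const void *src)
{
	if (line >= MODEL_LINES)
		return false;
	memcpy(pg->data + line * MODEL_LINE_SIZE, src, MODEL_LINE_SIZE);
	pg->poison[line] = false;
	return true;
}

/*
 * Page-granular clear in the model: every poisoned line in the page is
 * cleared before any data is written. Cleared lines are zero-filled
 * here (an assumption), so a reader that runs before the follow-up
 * write sees placeholder contents rather than the new data.
 */
static void model_clear_poison_page(struct model_page *pg)
{
	for (size_t i = 0; i < MODEL_LINES; i++) {
		if (pg->poison[i]) {
			memset(pg->data + i * MODEL_LINE_SIZE, 0,
			       MODEL_LINE_SIZE);
			pg->poison[i] = false;
		}
	}
}
```

In this model a read that crosses a poisoned line returns a short count,
a single model_write_line_atomic() call repairs exactly that line with
the new data, and model_clear_poison_page() makes the whole page
readable but leaves a window in which the repaired lines hold
placeholder data until the caller writes them.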