In-Reply-To: <20180605215926.GA16066@linux.intel.com>
References: <20180605205841.15878-1-ross.zwisler@linux.intel.com>
 <20180605205841.15878-2-ross.zwisler@linux.intel.com>
 <20180605215926.GA16066@linux.intel.com>
From: Dan Williams
Date: Tue, 5 Jun 2018 15:12:20 -0700
Subject: Re: [PATCH 2/2] libnvdimm: don't flush power-fail protected CPU caches
To: Ross Zwisler
Cc: Linux Kernel Mailing List, Dave Jiang, linux-nvdimm

On Tue, Jun 5, 2018 at 2:59 PM, Ross Zwisler wrote:
> On Tue, Jun 05, 2018 at 02:20:38PM -0700, Dan Williams wrote:
>> On Tue, Jun 5, 2018 at 1:58 PM, Ross Zwisler wrote:
>> > This commit:
>> >
>> > 5fdf8e5ba566 ("libnvdimm: re-enable deep flush for pmem devices via fsync()")
>> >
>> > intended to make sure that deep flush was always available even on
>> > platforms which support a power-fail protected CPU cache. An unintended
>> > side effect of this change was that we also lost the ability to skip
>> > flushing CPU caches on those platforms with power-fail protected CPU
>> > caches.
>> >
>> > Signed-off-by: Ross Zwisler
>> > Fixes: 5fdf8e5ba566 ("libnvdimm: re-enable deep flush for pmem devices via fsync()")
>> > ---
>> >  drivers/dax/super.c   | 20 +++++++++++++++++++-
>> >  drivers/nvdimm/pmem.c |  2 ++
>> >  include/linux/dax.h   |  9 +++++++++
>> >  3 files changed, 30 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/drivers/dax/super.c b/drivers/dax/super.c
>> > index c2c46f96b18c..457e0bb6c936 100644
>> > --- a/drivers/dax/super.c
>> > +++ b/drivers/dax/super.c
>> > @@ -152,6 +152,8 @@ enum dax_device_flags {
>> >          DAXDEV_ALIVE,
>> >          /* gate whether dax_flush() calls the low level flush routine */
>> >          DAXDEV_WRITE_CACHE,
>> > +        /* only flush the CPU caches if they are not power fail protected */
>> > +        DAXDEV_FLUSH_ON_SYNC,
>> >  };
>> >
>> >  /**
>> > @@ -283,7 +285,8 @@ EXPORT_SYMBOL_GPL(dax_copy_from_iter);
>> >  void arch_wb_cache_pmem(void *addr, size_t size);
>> >  void dax_flush(struct dax_device *dax_dev, void *addr, size_t size)
>> >  {
>> > -        if (unlikely(!dax_write_cache_enabled(dax_dev)))
>> > +        if (unlikely(!dax_write_cache_enabled(dax_dev)) ||
>> > +            !dax_flush_on_sync_enabled(dax_dev))
>>
>> This seems backwards. I think we should teach the pmem driver to still
>> issue deep flush even when dax_write_cache_enabled() is false.
>
> That does still happen. Deep flush is essentially controlled by the 'wbc'
> variable in pmem_attach_disk(), which we use to set blk_queue_write_cache().

Right, what I'm trying to kill is the need to add
dax_flush_on_sync_enabled(). I think we can handle this local to the
pmem driver and not extend the 'dax' api.
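
E.g. something along these lines in region_devs.c (untested sketch,
assuming we can key off the existing ND_REGION_PERSIST_CACHE region
flag), so that pmem_attach_disk() never turns on DAXDEV_WRITE_CACHE for
a power-fail protected cache in the first place:

int nvdimm_has_cache(struct nd_region *nd_region)
{
        /*
         * Sketch: report "no CPU cache to flush" when the platform
         * persists CPU caches across power loss, so the pmem driver
         * leaves DAXDEV_WRITE_CACHE disabled without any new dax api.
         */
        return is_nd_pmem(&nd_region->dev) &&
                !test_bit(ND_REGION_PERSIST_CACHE, &nd_region->flags);
}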

> My understanding is that this causes the block layer to send down
> REQ_FUA/REQ_PREFLUSH BIOs, and it's in response to these that we do a deep
> flush via nvdimm_flush(). Whether this happens is totally up to the device's
> write cache setting, and doesn't look at whether the platform has
> flush-on-fail CPU caches.
>
> This does bring up another wrinkle, though: we export a write_cache sysfs
> entry that you can use to change the write cache setting of a namespace,
> i.e.:
>
>   /sys/bus/nd/devices/pfn0.1/block/pmem0/dax/write_cache
>
> This changes whether or not the DAXDEV_WRITE_CACHE flag is set, but does
> *not* change whether the block queue says it supports a write cache
> (blk_queue_write_cache()). So, the sysfs entry ends up controlling whether
> or not we do CPU cache flushing via DAX, but does not do anything with the
> deep flush code.
>
> I'm guessing this should be fixed? I'll go take a look...

I think we need to disconnect DAXDEV_WRITE_CACHE from the indication of
the filesystem triggering nvdimm_flush() via REQ_{FUA,FLUSH}.
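
I.e. deep flush stays keyed off the bio flags in the pmem driver,
independent of the dax write-cache flag. Roughly what
pmem_make_request() already does (abbreviated, untested sketch with the
data-transfer loop elided):

static blk_qc_t pmem_make_request(struct request_queue *q, struct bio *bio)
{
        ...
        /* deep flush: always honor the filesystem's flush request */
        if (bio->bi_opf & REQ_PREFLUSH)
                nvdimm_flush(nd_region);

        /* ... transfer the bio segments ... */

        if (bio->bi_opf & REQ_FUA)
                nvdimm_flush(nd_region);
        ...
}

...while DAXDEV_WRITE_CACHE only decides whether dax_flush() touches
the CPU caches.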