Received: by 2002:a05:6a10:5bc5:0:0:0:0 with SMTP id os5csp1176361pxb; Tue, 9 Nov 2021 06:22:11 -0800 (PST) X-Google-Smtp-Source: ABdhPJx6Fwk9pUXlUsnez/+u5nltnohQqaPL5dj4kDn/EmZmkZFp747oqC+4Kq6C3o9sf3woxWal X-Received: by 2002:a92:ca0d:: with SMTP id j13mr2910715ils.178.1636467731317; Tue, 09 Nov 2021 06:22:11 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1636467731; cv=none; d=google.com; s=arc-20160816; b=bQlTEXlax1laFMrv09K8fAHDQM/WyB2EKStDOdbHtUIzC/gI4CmJJorV9cuYyVguOQ 90JoBgHBLBaAfsAMgha43xmrcDRbEpK/OJzrXn5NCTousF2uGbcthO1pa+yxqjpUrP9p hrg4H3KBCKtb4s/UcrhfV4PIjlLF3Qi3y57SaHJywG5Cxr9VXImXt7mWOzSbq967kk6A AfrCfS9lyzPNvcrY38yIolNIudjixha6sQR00fbmmVRCXgoTlEC3on5ark8vosDRXmlU QPCrDC5RzyjGWRV2CeuxqnvuhVNXIHuYk/7PYa3hIaKsC8ztpEmGIHddhzvCcg4TbEnM GkWA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=NVge1Iw2DLICm/edhKMuLv0kwlblMX536VglZLVGuRY=; b=cN0KSBQT0f3sPHolHGJ9QDf4UKGIgmBIU+AV8grRDhRq4JDCarmttbMCdT+vE9v5O6 KWXzy0reauUfDEl5lS+MvAYsZeM9PCdwsRxoDAIzNRr1tpSe1Jn+Lf6JWG65WaASA1Ht Nxx7eO4G+PwU+FxjYmDCXtXcb/rTvZ3RW8qQk7HzhHw1JreM5aoWQfEn5bMFDQoS3eLj ze7NUMhF1VVJm8qP+r6k9AYYFZMqCOeIr62MYjVsvYm60XmkpU36doV1rcXewhKatrYk BGvxsjXlowuEPyHO0Si8pBDz0/a44vEkqPz2i+Q4eV5s+m9Mu+dPh8kXfUEP3d++4Aov Ra0Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel-com.20210112.gappssmtp.com header.s=20210112 header.b="0y9m/3Tm"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id q2si35642944ilv.130.2021.11.09.06.21.56; Tue, 09 Nov 2021 06:22:11 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@intel-com.20210112.gappssmtp.com header.s=20210112 header.b="0y9m/3Tm"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240691AbhKIGHV (ORCPT + 99 others); Tue, 9 Nov 2021 01:07:21 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54830 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240711AbhKIGHU (ORCPT ); Tue, 9 Nov 2021 01:07:20 -0500 Received: from mail-pl1-x629.google.com (mail-pl1-x629.google.com [IPv6:2607:f8b0:4864:20::629]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 02337C061767 for ; Mon, 8 Nov 2021 22:04:35 -0800 (PST) Received: by mail-pl1-x629.google.com with SMTP id u17so18978437plg.9 for ; Mon, 08 Nov 2021 22:04:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel-com.20210112.gappssmtp.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=NVge1Iw2DLICm/edhKMuLv0kwlblMX536VglZLVGuRY=; b=0y9m/3Tm3ZPhkRfOO6g6G8z3si/DzsgoP/MCPFixBiyqHbS5GCigEp/dxfNfAseuxS xThX8p4IEReKP+5oxM6JzYGIeRyClmu/fx6c5Rx+NDh3ZfmAXahyg+B0H8EoEGRZElbf YGDSosivvUKzfEReFxYNIPktibOlqP0MH+sG6GINqABRAZqmPBABbEWU9mnCx2uqUvjn vsaInhxhJX8T0ab9HkmF6f4okUnSUBvkCjzj7VDW63dScslQ3yBjiipLTGyDwHa0VayU 4Z4YaHdEJZqVrovaz24ifQc0maLsszlZ3AxdC8bbzM8uN1EXsV/QYBf/NjWXNGmSE3dZ 9OMA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=NVge1Iw2DLICm/edhKMuLv0kwlblMX536VglZLVGuRY=; b=YZIt9uyjgV+Osr3VgZJmDNqMyCRlZJGSJ3MzYc/hKhxbIk9pqsTUtXlYtJmdMpdmj1 e2dqAHLPBLuwzmtWzwNAgaKhyyvdY7b6DLwoyfRH3PFdHSMAmnHQnhcVW7Bk7lqTwvOt HAMGQPrMAEAbIuld2lpccwOv5A7SoVWyGjkR+nRcTF3RRPqjzXE/sDB8dusD0Yyq9Ea+ M+kRFVVbuG0Eo3X5US+GXJmoXCVrXwAhQkonf3yQQPQeF3ivSBUXjoa10QwRXaUpVobn 3QCrzF5cH6C9jC+hBq4ZHkymq30aZqATgdoiCooTmhogq64oq+s2CngLaMw7Sgl3Nvia kxlQ== X-Gm-Message-State: AOAM530SilsF9eL8Ypt/rsnUYLsSasI77gSuiMyj6jR8j+uxi8iIaocM FrhZUmGxOMwxuyAkfVhaRbfe/JxcCfKThD/+Tm9Oxg== X-Received: by 2002:a17:90b:1e07:: with SMTP id pg7mr4381475pjb.93.1636437874506; Mon, 08 Nov 2021 22:04:34 -0800 (PST) MIME-Version: 1.0 References: <20211106011638.2613039-1-jane.chu@oracle.com> <20211106011638.2613039-2-jane.chu@oracle.com> <63f89475-7a1f-e79e-7785-ba996211615b@oracle.com> <20211109052640.GG3538886@iweiny-DESK2.sc.intel.com> In-Reply-To: <20211109052640.GG3538886@iweiny-DESK2.sc.intel.com> From: Dan Williams Date: Mon, 8 Nov 2021 22:04:23 -0800 Message-ID: Subject: Re: [PATCH v2 1/2] dax: Introduce normal and recovery dax operation modes To: Ira Weiny Cc: Jane Chu , david , "Darrick J. Wong" , Christoph Hellwig , Vishal L Verma , Dave Jiang , Alasdair Kergon , Mike Snitzer , device-mapper development , Matthew Wilcox , Vivek Goyal , linux-fsdevel , Linux NVDIMM , Linux Kernel Mailing List , linux-xfs Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Nov 8, 2021 at 9:26 PM Ira Weiny wrote: > > On Mon, Nov 08, 2021 at 09:02:29PM +0000, Jane Chu wrote: > > On 11/6/2021 9:48 AM, Dan Williams wrote: > > > On Fri, Nov 5, 2021 at 6:17 PM Jane Chu wrote: > > >> > > >> Introduce DAX_OP_NORMAL and DAX_OP_RECOVERY operation modes to > > >> {dax_direct_access, dax_copy_from_iter, dax_copy_to_iter}. > > >> DAX_OP_NORMAL is the default or the existing mode, and > > >> DAX_OP_RECOVERY is a new mode for data recovery purpose. > > >> > > >> When dax-FS suspects dax media error might be encountered > > >> on a read or write, it can enact the recovery mode read or write > > >> by setting DAX_OP_RECOVERY in the aforementioned APIs. A read > > >> in recovery mode attempts to fetch as much data as possible > > >> until the first poisoned page is encountered. A write in recovery > > >> mode attempts to clear poison(s) in a page-aligned range and > > >> then write the user provided data over. > > >> > > >> DAX_OP_NORMAL should be used for all non-recovery code path. > > >> > > >> Signed-off-by: Jane Chu > > > [..] > > >> diff --git a/include/linux/dax.h b/include/linux/dax.h > > >> index 324363b798ec..931586df2905 100644 > > >> --- a/include/linux/dax.h > > >> +++ b/include/linux/dax.h > > >> @@ -9,6 +9,10 @@ > > >> /* Flag for synchronous flush */ > > >> #define DAXDEV_F_SYNC (1UL << 0) > > >> > > >> +/* dax operation mode dynamically set by caller */ > > >> +#define DAX_OP_NORMAL 0 > > > > > > Perhaps this should be called DAX_OP_FAILFAST? > > > > Sure. > > > > > > > >> +#define DAX_OP_RECOVERY 1 > > >> + > > >> typedef unsigned long dax_entry_t; > > >> > > >> struct dax_device; > > >> @@ -22,8 +26,8 @@ struct dax_operations { > > >> * logical-page-offset into an absolute physical pfn. Return the > > >> * number of pages available for DAX at that pfn. > > >> */ > > >> - long (*direct_access)(struct dax_device *, pgoff_t, long, > > >> - void **, pfn_t *); > > >> + long (*direct_access)(struct dax_device *, pgoff_t, long, int, > > > > > > Would be nice if that 'int' was an enum, but I'm not sure a new > > > parameter is needed at all, see below... > > > > Let's do your suggestion below. :) > > > > > > > >> + void **, pfn_t *); > > >> /* > > >> * Validate whether this device is usable as an fsdax backing > > >> * device. > > >> @@ -32,10 +36,10 @@ struct dax_operations { > > >> sector_t, sector_t); > > >> /* copy_from_iter: required operation for fs-dax direct-i/o */ > > >> size_t (*copy_from_iter)(struct dax_device *, pgoff_t, void *, size_t, > > >> - struct iov_iter *); > > >> + struct iov_iter *, int); > > > > > > I'm not sure the flag is needed here as the "void *" could carry a > > > flag in the pointer to indicate that is a recovery kaddr. > > > > Agreed. > > Not sure if this is implied but I would like some macros or other helper > functions to check these flags hidden in the addresses. > > For me I'm a bit scared about having flags hidden in the address like this > because I can't lead to some confusions IMO. > > But if we have some macros or other calls which can make this more obvious of > what is going on I think that would help. You could go further and mark it as an 'unsigned long __bitwise' type to get the compiler to help with enforcing accessors to strip off the flag bits.