From: Dan Williams
Date: Thu, 3 Feb 2022 22:03:36 -0800
Subject: Re: [PATCH v5 4/7] dax: add dax_recovery_write to dax_op and dm target type
To: Jane Chu
Cc: david, "Darrick J. Wong", Christoph Hellwig, Vishal L Verma,
    Dave Jiang, Alasdair Kergon, Mike Snitzer,
    device-mapper development, "Weiny, Ira", Matthew Wilcox,
    Vivek Goyal, linux-fsdevel, Linux NVDIMM,
    Linux Kernel Mailing List, linux-xfs
References: <20220128213150.1333552-1-jane.chu@oracle.com>
    <20220128213150.1333552-5-jane.chu@oracle.com>
In-Reply-To: <20220128213150.1333552-5-jane.chu@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jan 28, 2022 at 1:32 PM Jane Chu wrote:
>
> dax_recovery_write() dax op is only required for DAX device that
> export DAXDEV_RECOVERY indicating its capability to recover from
> poisons.
>
> DM may be nested, if part of the base dax devices forming a DM
> device support dax recovery, the DM device is marked with such
> capability.
>
> Signed-off-by: Jane Chu
[..]
> diff --git a/include/linux/dax.h b/include/linux/dax.h
> index 2fc776653c6e..1b3d6ebf3e49 100644
> --- a/include/linux/dax.h
> +++ b/include/linux/dax.h
> @@ -30,6 +30,9 @@ struct dax_operations {
>                         sector_t, sector_t);
>         /* zero_page_range: required operation. Zero page range */
>         int (*zero_page_range)(struct dax_device *, pgoff_t, size_t);
> +       /* recovery_write: optional operation. */
> +       size_t (*recovery_write)(struct dax_device *, pgoff_t, void *, size_t,
> +                               struct iov_iter *);

The removal of the ->copy_{to,from}_iter() operations set the precedent
that dax ops should not be needed when the operation can be carried out
generically. The only need to call back to the pmem driver is so that
it can call nvdimm_clear_poison(). nvdimm_clear_poison() in turn only
needs the 'struct device' hosting the pmem and the physical address to
be cleared. The physical address is already returned by
dax_direct_access(). The device is something that could be added to
dax_device, and the pgmap could host the callback that pmem fills in.
Something like:

diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index 58eda16f5c53..36486ba4753a 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -694,6 +694,7 @@ static int __nvdimm_setup_pfn(struct nd_pfn *nd_pfn, struct dev_pagemap *pgmap)
                .end = nsio->res.end - end_trunc,
        };
        pgmap->nr_range = 1;
+       pgmap->owner = &nd_pfn->dev;
        if (nd_pfn->mode == PFN_MODE_RAM) {
                if (offset < reserve)
                        return -EINVAL;
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 58d95242a836..95e1b6326f88 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -481,6 +481,7 @@ static int pmem_attach_disk(struct device *dev,
        }
        set_dax_nocache(dax_dev);
        set_dax_nomc(dax_dev);
+       set_dax_pgmap(dax_dev, &pmem->pgmap);
        if (is_nvdimm_sync(nd_region))
                set_dax_synchronous(dax_dev);
        rc = dax_add_host(dax_dev, disk);
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 1fafcc38acba..8cb59b5df38b 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -81,6 +81,11 @@ struct dev_pagemap_ops {

 #define PGMAP_ALTMAP_VALID     (1 << 0)

+struct dev_pagemap_operations {
+       size_t (*recovery_write)(struct dev_pagemap *pgmap, void *, size_t,
+                                struct iov_iter *);
+};
+
 /**
  * struct dev_pagemap - metadata for ZONE_DEVICE mappings
  * @altmap: pre-allocated/reserved memory for vmemmap allocations
@@ -111,12 +116,15 @@ struct dev_pagemap {
        const struct dev_pagemap_ops *ops;
        void *owner;
        int nr_range;
+       struct dev_pagemap_operations ops;
        union {
                struct range range;
                struct range ranges[0];
        };
 };

...then DM does not need to be involved in the recovery path, fs/dax.c
just does dax_direct_access(..., DAX_RECOVERY, ...) and then looks up
the pgmap to generically coordinate the recovery_write(). The pmem
driver would be responsible for setting pgmap->recovery_write() to a
function that calls nvdimm_clear_poison().

This architecture works for anything that can be described by a pgmap
and supports error clearing; it need not be limited to the pmem block
driver.
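
To make the control flow concrete, here is a rough userspace model of
the idea, not kernel code. Every model_* name is hypothetical and only
stands in for the kernel object named in its comment; the callback
member is called "rops" here purely so it does not clash with an
existing member, and the real signatures (iov_iter, pgoff_t, and so on)
are deliberately simplified away.

```c
/* Userspace model only: none of these types or functions are the kernel's. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGES     4
#define MODEL_PAGE_SIZE 64

struct model_pgmap;

/* Stand-in for the proposed per-pgmap operations table. */
struct model_pgmap_operations {
	size_t (*recovery_write)(struct model_pgmap *pgmap, size_t off,
				 const void *src, size_t len);
};

/* Stand-in for struct dev_pagemap: an owner plus the recovery callback. */
struct model_pgmap {
	void *owner;
	struct model_pgmap_operations rops;	/* hypothetical member name */
};

/* Stand-in for a pmem device: backing memory and a per-page poison flag. */
struct model_pmem {
	unsigned char mem[MODEL_PAGES * MODEL_PAGE_SIZE];
	bool poisoned[MODEL_PAGES];
	struct model_pgmap pgmap;
};

static bool model_range_ok(size_t off, size_t len)
{
	size_t size = MODEL_PAGES * MODEL_PAGE_SIZE;

	return len > 0 && off < size && len <= size - off;
}

/* Plays the role of nvdimm_clear_poison(): needs only the device and range. */
static void model_clear_poison(struct model_pmem *pmem, size_t off, size_t len)
{
	size_t last = (off + len - 1) / MODEL_PAGE_SIZE;

	for (size_t pg = off / MODEL_PAGE_SIZE; pg <= last; pg++)
		pmem->poisoned[pg] = false;
}

/* The callback the driver fills in: clear poison, then copy the data. */
static size_t model_pmem_recovery_write(struct model_pgmap *pgmap, size_t off,
					const void *src, size_t len)
{
	struct model_pmem *pmem = pgmap->owner;

	assert(model_range_ok(off, len));
	model_clear_poison(pmem, off, len);
	memcpy(pmem->mem + off, src, len);
	return len;
}

/* Driver setup, analogous to filling in pgmap->owner and the callback. */
static void model_pmem_init(struct model_pmem *pmem)
{
	memset(pmem, 0, sizeof(*pmem));
	pmem->pgmap.owner = pmem;
	pmem->pgmap.rops.recovery_write = model_pmem_recovery_write;
}

/* Stand-in for dax_direct_access(): refuses ranges that contain poison. */
static unsigned char *model_direct_access(struct model_pmem *pmem, size_t off,
					  size_t len)
{
	if (!model_range_ok(off, len))
		return NULL;
	for (size_t pg = off / MODEL_PAGE_SIZE;
	     pg <= (off + len - 1) / MODEL_PAGE_SIZE; pg++)
		if (pmem->poisoned[pg])
			return NULL;
	return pmem->mem + off;
}

/*
 * Generic write path: plain copy when the range is clean, otherwise defer
 * to the pgmap callback if the driver provided one. Returns bytes written.
 */
static size_t model_dax_write(struct model_pmem *dev, size_t off,
			      const void *src, size_t len)
{
	unsigned char *kaddr;

	if (!model_range_ok(off, len))
		return 0;
	kaddr = model_direct_access(dev, off, len);
	if (kaddr) {
		memcpy(kaddr, src, len);
		return len;
	}
	if (!dev->pgmap.rops.recovery_write)
		return 0;
	return dev->pgmap.rops.recovery_write(&dev->pgmap, off, src, len);
}
```

The point of the model is that model_dax_write() never calls into a
driver-specific dax op: it only consults the pgmap, so a stacking layer
such as DM would have nothing to forward. A device whose pgmap leaves
the callback unset simply gets a short write on a poisoned range.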