From: Dan Williams
Date: Fri, 20 Aug 2021 13:19:10 -0700
Subject: Re: [PATCH RESEND v6 2/9] dax: Introduce holder for dax_device
To: Shiyang Ruan
Cc: Linux Kernel Mailing List, linux-xfs, Linux NVDIMM, Linux MM,
    linux-fsdevel, device-mapper development, "Darrick J. Wong", david,
    Christoph Hellwig, Alasdair Kergon, Mike Snitzer
References: <20210730100158.3117319-1-ruansy.fnst@fujitsu.com>
    <20210730100158.3117319-3-ruansy.fnst@fujitsu.com>
In-Reply-To: <20210730100158.3117319-3-ruansy.fnst@fujitsu.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jul 30, 2021 at 3:02 AM Shiyang Ruan wrote:
>
> To easily track filesystem from a pmem device, we introduce a holder for
> dax_device structure, and also its operation. This holder is used to
> remember who is using this dax_device:
>  - When it is the backend of a filesystem, the holder will be the
>    superblock of this filesystem.
>  - When this pmem device is one of the targets in a mapped device, the
>    holder will be this mapped device. In this case, the mapped device
>    has its own dax_device and it will follow the first rule.
>    So that we can finally track to the filesystem we needed.
>
> The holder and holder_ops will be set when filesystem is being mounted,
> or an target device is being activated.
>
> Signed-off-by: Shiyang Ruan
> ---
>  drivers/dax/super.c | 46 +++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/dax.h | 17 +++++++++++++++++
>  2 files changed, 63 insertions(+)
>
> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> index 5fa6ae9dbc8b..00c32dfa5665 100644
> --- a/drivers/dax/super.c
> +++ b/drivers/dax/super.c
> @@ -214,6 +214,8 @@ enum dax_device_flags {
>   * @cdev: optional character interface for "device dax"
>   * @host: optional name for lookups where the device path is not available
>   * @private: dax driver private data
> + * @holder_rwsem: prevent unregistration while holder_ops is in progress
> + * @holder_data: holder of a dax_device: could be filesystem or mapped device
>   * @flags: state and boolean properties
>   */
>  struct dax_device {
> @@ -222,8 +224,11 @@ struct dax_device {
>  	struct cdev cdev;
>  	const char *host;
>  	void *private;
> +	struct rw_semaphore holder_rwsem;
> +	void *holder_data;
>  	unsigned long flags;
>  	const struct dax_operations *ops;
> +	const struct dax_holder_operations *holder_ops;
>  };
>
>  static ssize_t write_cache_show(struct device *dev,
> @@ -373,6 +378,25 @@ int dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,
>  }
>  EXPORT_SYMBOL_GPL(dax_zero_page_range);
>
> +int dax_holder_notify_failure(struct dax_device *dax_dev, loff_t offset,
> +			      size_t size, void *data)

I took a look at patch3 and had some questions about the api. Can you
add kernel-doc for this api and specifically clarify what @data is used
for vs dax_dev->holder_data? I also think the holder needs to know
whether this failure is being signaled synchronously or asynchronously.
In the synchronous case a process has consumed poison and action needs
to be taken immediately.
In the asynchronous case the driver stack has encountered failed
address ranges and is notifying the holder to avoid those ranges, but no
immediate action needs to be taken to shoot down mappings. For example,
I would use the synchronous notification when memory_failure() is
invoked with the "action required" indication, and the asynchronous
notification when an NVDIMM_REVALIDATE_POISON event fires, or the
"action optional" memory_failure() case.

In short I think the interface just needs a flags argument.

> +{
> +	int rc;
> +
> +	if (!dax_dev)
> +		return -ENXIO;
> +
> +	if (!dax_dev->holder_data)
> +		return -EOPNOTSUPP;
> +
> +	down_read(&dax_dev->holder_rwsem);
> +	rc = dax_dev->holder_ops->notify_failure(dax_dev, offset,
> +						 size, data);
> +	up_read(&dax_dev->holder_rwsem);
> +	return rc;
> +}
> +EXPORT_SYMBOL_GPL(dax_holder_notify_failure);
> +
>  #ifdef CONFIG_ARCH_HAS_PMEM_API
>  void arch_wb_cache_pmem(void *addr, size_t size);
>  void dax_flush(struct dax_device *dax_dev, void *addr, size_t size)
> @@ -603,6 +627,7 @@ struct dax_device *alloc_dax(void *private, const char *__host,
>  	dax_add_host(dax_dev, host);
>  	dax_dev->ops = ops;
>  	dax_dev->private = private;
> +	init_rwsem(&dax_dev->holder_rwsem);
>  	if (flags & DAXDEV_F_SYNC)
>  		set_dax_synchronous(dax_dev);
>
> @@ -624,6 +649,27 @@ void put_dax(struct dax_device *dax_dev)
>  }
>  EXPORT_SYMBOL_GPL(put_dax);
>
> +void dax_set_holder(struct dax_device *dax_dev, void *holder,
> +		const struct dax_holder_operations *ops)
> +{
> +	if (!dax_dev)
> +		return;
> +	down_write(&dax_dev->holder_rwsem);
> +	dax_dev->holder_data = holder;
> +	dax_dev->holder_ops = ops;
> +	up_write(&dax_dev->holder_rwsem);
> +}
> +EXPORT_SYMBOL_GPL(dax_set_holder);
> +
> +void *dax_get_holder(struct dax_device *dax_dev)
> +{
> +	if (!dax_dev)
> +		return NULL;
> +
> +	return dax_dev->holder_data;
> +}
> +EXPORT_SYMBOL_GPL(dax_get_holder);
> +
>  /**
>   * dax_get_by_host() - temporary lookup mechanism for filesystem-dax
>   * @host: alternate name for the device registered by a dax driver
> diff --git a/include/linux/dax.h b/include/linux/dax.h
> index b52f084aa643..6f4b5c97ceb0 100644
> --- a/include/linux/dax.h
> +++ b/include/linux/dax.h
> @@ -38,10 +38,17 @@ struct dax_operations {
>  	int (*zero_page_range)(struct dax_device *, pgoff_t, size_t);
>  };
>
> +struct dax_holder_operations {
> +	int (*notify_failure)(struct dax_device *, loff_t, size_t, void *);
> +};
> +
>  extern struct attribute_group dax_attribute_group;
>
>  #if IS_ENABLED(CONFIG_DAX)
>  struct dax_device *dax_get_by_host(const char *host);
> +void dax_set_holder(struct dax_device *dax_dev, void *holder,
> +		const struct dax_holder_operations *ops);
> +void *dax_get_holder(struct dax_device *dax_dev);
>  struct dax_device *alloc_dax(void *private, const char *host,
>  		const struct dax_operations *ops, unsigned long flags);
>  void put_dax(struct dax_device *dax_dev);
> @@ -77,6 +84,14 @@ static inline struct dax_device *dax_get_by_host(const char *host)
>  {
>  	return NULL;
>  }
> +static inline void dax_set_holder(struct dax_device *dax_dev, void *holder,
> +		const struct dax_holder_operations *ops)
> +{
> +}
> +static inline void *dax_get_holder(struct dax_device *dax_dev)
> +{
> +	return NULL;
> +}
>  static inline struct dax_device *alloc_dax(void *private, const char *host,
>  		const struct dax_operations *ops, unsigned long flags)
>  {
> @@ -226,6 +241,8 @@ size_t dax_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
>  		size_t bytes, struct iov_iter *i);
>  int dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,
>  		size_t nr_pages);
> +int dax_holder_notify_failure(struct dax_device *dax_dev, loff_t offset,
> +		size_t size, void *data);
>  void dax_flush(struct dax_device *dax_dev, void *addr, size_t size);
>
>  ssize_t dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
> --
> 2.32.0
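
To make the flags suggestion concrete, here is a rough userspace sketch
of one possible shape for the interface. It is not kernel code and not
a proposal for exact naming: DAX_NOTIFY_SYNC, struct demo_holder,
demo_notify_failure() and example_usage() are hypothetical and do not
appear in the patch. The sketch also assumes the flags argument takes
the place of @data, uses long long in place of loff_t, and leaves out
the holder_rwsem locking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag, not in the patch: set when a process has consumed
 * poison and the holder must act immediately; clear when the driver is
 * only reporting bad ranges to be avoided later. */
#define DAX_NOTIFY_SYNC (1UL << 0)

struct dax_device;

struct dax_holder_operations {
	/* long long stands in for the kernel's loff_t */
	int (*notify_failure)(struct dax_device *dax_dev, long long offset,
			      size_t size, unsigned long flags);
};

/* Reduced to the two holder fields; holder_rwsem locking is omitted. */
struct dax_device {
	void *holder_data;
	const struct dax_holder_operations *holder_ops;
};

static int dax_holder_notify_failure(struct dax_device *dax_dev,
				     long long offset, size_t size,
				     unsigned long flags)
{
	if (!dax_dev)
		return -ENXIO;
	if (!dax_dev->holder_data || !dax_dev->holder_ops)
		return -EOPNOTSUPP;
	return dax_dev->holder_ops->notify_failure(dax_dev, offset, size,
						   flags);
}

/* Hypothetical holder that only counts what it was asked to do. */
struct demo_holder {
	int shootdowns;	/* synchronous: mappings must be torn down now */
	int recorded;	/* asynchronous: range noted for later avoidance */
};

static int demo_notify_failure(struct dax_device *dax_dev, long long offset,
			       size_t size, unsigned long flags)
{
	struct demo_holder *h = dax_dev->holder_data;

	(void)offset;
	(void)size;
	if (flags & DAX_NOTIFY_SYNC)
		h->shootdowns++;
	else
		h->recorded++;
	return 0;
}

static const struct dax_holder_operations demo_holder_ops = {
	.notify_failure = demo_notify_failure,
};

static void example_usage(void)
{
	struct demo_holder h = { 0, 0 };
	struct dax_device dev = { &h, &demo_holder_ops };

	/* "action required" style report */
	assert(dax_holder_notify_failure(&dev, 0, 4096, DAX_NOTIFY_SYNC) == 0);
	/* "action optional" style report */
	assert(dax_holder_notify_failure(&dev, 4096, 4096, 0) == 0);
	assert(h.shootdowns == 1 && h.recorded == 1);
}
```

The point of the sketch is only that a single flags word lets the
holder tell the two cases apart without a second callback. Whether
@data is still needed alongside it is the open question raised above.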