2021-03-09 06:56:14

by Dan Williams

[permalink] [raw]
Subject: [PATCH] libnvdimm: Let revalidate_disk() revalidate region read-only

Previous kernels allowed the BLKROSET to override the disk's read-only
status. With that situation fixed the pmem driver needs to rely on
revalidate_disk() to clear the disk read-only status after the host
region has been marked read-write.

Recall that when libnvdimm determines that the persistent memory has
lost persistence (for example lack of energy to flush from DRAM to FLASH
on an NVDIMM-N device) it marks the region read-only, but that state can
be overridden by the user via:

echo 0 > /sys/bus/nd/devices/regionX/read_only

...to date there is no notification that the region has restored
persistence, so the user override is the only recovery.

Fixes: 52f019d43c22 ("block: add a hard-readonly flag to struct gendisk")
Cc: Christoph Hellwig <[email protected]>
Cc: Ming Lei <[email protected]>
Cc: Martin K. Petersen <[email protected]>
Cc: Hannes Reinecke <[email protected]>
Cc: Jens Axboe <[email protected]>
Reported-by: kernel test robot <[email protected]>
Reported-by: Vishal Verma <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/nvdimm/btt.c | 7 +++++++
drivers/nvdimm/bus.c | 14 ++++++--------
drivers/nvdimm/pmem.c | 7 +++++++
3 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index 41aa1f01fc07..73d3bf5aa208 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -1508,11 +1508,18 @@ static int btt_getgeo(struct block_device *bd, struct hd_geometry *geo)
return 0;
}

+static int btt_revalidate(struct gendisk *disk)
+{
+ nvdimm_check_and_set_ro(disk);
+ return 0;
+}
+
static const struct block_device_operations btt_fops = {
.owner = THIS_MODULE,
.submit_bio = btt_submit_bio,
.rw_page = btt_rw_page,
.getgeo = btt_getgeo,
+ .revalidate_disk = btt_revalidate,
};

static int btt_blk_init(struct btt *btt)
diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index 48f0985ca8a0..3a777d0073b7 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -631,16 +631,14 @@ void nvdimm_check_and_set_ro(struct gendisk *disk)
struct nd_region *nd_region = to_nd_region(dev->parent);
int disk_ro = get_disk_ro(disk);

- /*
- * Upgrade to read-only if the region is read-only preserve as
- * read-only if the disk is already read-only.
- */
- if (disk_ro || nd_region->ro == disk_ro)
+ /* catch the disk up with the region ro state */
+ if (disk_ro == nd_region->ro)
return;

- dev_info(dev, "%s read-only, marking %s read-only\n",
- dev_name(&nd_region->dev), disk->disk_name);
- set_disk_ro(disk, 1);
+ dev_info(dev, "%s read-%s, marking %s read-%s\n",
+ dev_name(&nd_region->dev), nd_region->ro ? "only" : "write",
+ disk->disk_name, nd_region->ro ? "only" : "write");
+ set_disk_ro(disk, nd_region->ro);
}
EXPORT_SYMBOL(nvdimm_check_and_set_ro);

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index b8a85bfb2e95..af204fce1b1c 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -276,10 +276,17 @@ __weak long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
return PHYS_PFN(pmem->size - pmem->pfn_pad - offset);
}

+static int pmem_revalidate(struct gendisk *disk)
+{
+ nvdimm_check_and_set_ro(disk);
+ return 0;
+}
+
static const struct block_device_operations pmem_fops = {
.owner = THIS_MODULE,
.submit_bio = pmem_submit_bio,
.rw_page = pmem_rw_page,
+ .revalidate_disk = pmem_revalidate,
};

static int pmem_dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,


2021-03-09 07:35:13

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] libnvdimm: Let revalidate_disk() revalidate region read-only

On Mon, Mar 08, 2021 at 10:54:22PM -0800, Dan Williams wrote:
> Previous kernels allowed the BLKROSET to override the disk's read-only
> status. With that situation fixed the pmem driver needs to rely on
> revalidate_disk() to clear the disk read-only status after the host
> region has been marked read-write.
>
> Recall that when libnvdimm determines that the persistent memory has
> lost persistence (for example lack of energy to flush from DRAM to FLASH
> on an NVDIMM-N device) it marks the region read-only, but that state can
> be overridden by the user via:
>
> echo 0 > /sys/bus/nd/devices/regionX/read_only
>
> ...to date there is no notification that the region has restored
> persistence, so the user override is the only recovery.

I've just resent my series to kill of ->revalidate_disk for good, so this
obvious makes me a little unhappy. Given that ->revalidate_disk
only ends up beeing called from the same path that ->open is called,
why can't you just hook this up from the open method?

Also any reason the sysfs attribute can't just directly propagate the
information to the disk?

2021-03-09 16:38:58

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH] libnvdimm: Let revalidate_disk() revalidate region read-only

On Mon, Mar 8, 2021 at 11:31 PM Christoph Hellwig <[email protected]> wrote:
>
> On Mon, Mar 08, 2021 at 10:54:22PM -0800, Dan Williams wrote:
> > Previous kernels allowed the BLKROSET to override the disk's read-only
> > status. With that situation fixed the pmem driver needs to rely on
> > revalidate_disk() to clear the disk read-only status after the host
> > region has been marked read-write.
> >
> > Recall that when libnvdimm determines that the persistent memory has
> > lost persistence (for example lack of energy to flush from DRAM to FLASH
> > on an NVDIMM-N device) it marks the region read-only, but that state can
> > be overridden by the user via:
> >
> > echo 0 > /sys/bus/nd/devices/regionX/read_only
> >
> > ...to date there is no notification that the region has restored
> > persistence, so the user override is the only recovery.
>
> I've just resent my series to kill of ->revalidate_disk for good, so this
> obvious makes me a little unhappy. Given that ->revalidate_disk
> only ends up beeing called from the same path that ->open is called,
> why can't you just hook this up from the open method?
>
> Also any reason the sysfs attribute can't just directly propagate the
> information to the disk?

I should have assumed that revalidate_disk() was on the chopping
block. Let me take a look at just propagating from the region update
down to all affected disks. There's already a notification path for
regions to communicate badblocks, should be straightforward to reuse
for read-only updates.