2021-10-20 01:57:32

by Ming Lei

Subject: [PATCH V2 0/4] zram: fix two races and one zram leak

Hello,

This series fixes three issues reported by Luis Chamberlain with one
simpler approach:

- race between zram_reset_device() and disksize_store() (1/4)

- zram device leak when unloading the module, caused by a race between
resetting and removing the device (2/4)

- race between zram_remove and disksize_store (3/4)

Also replace fsync_bdev with sync_blockdev since no one opens the bdev. (4/4)

V2:
- take another approach to avoid failing of zram_remove()
- add patch to address race between zram_reset_device() and
disksize_store()


Ming Lei (4):
zram: fix race between zram_reset_device() and disksize_store()
zram: don't fail to remove zram during unloading module
zram: avoid race between zram_remove and disksize_store
zram: replace fsync_bdev with sync_blockdev

drivers/block/zram/zram_drv.c | 39 ++++++++++++++++++++++++++++-------
1 file changed, 31 insertions(+), 8 deletions(-)

--
2.31.1


2021-10-20 01:57:48

by Ming Lei

Subject: [PATCH V2 1/4] zram: fix race between zram_reset_device() and disksize_store()

When ->init_lock is released in zram_reset_device(), disksize_store()
can come in and try to allocate meta while zram_reset_device() is still
freeing it, so the two race. Fix it by releasing ->init_lock only after
the reset has finished freeing everything.

Link: https://lore.kernel.org/linux-block/[email protected]/T/#mc617f865a3fa2778e40f317ddf48f6447c20c073
Reported-by: Luis Chamberlain <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
---
drivers/block/zram/zram_drv.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index a68297fb51a2..25d781dc5fef 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1704,12 +1704,13 @@ static void zram_reset_device(struct zram *zram)
set_capacity_and_notify(zram->disk, 0);
part_stat_set_all(zram->disk->part0, 0);

- up_write(&zram->init_lock);
/* I/O operation under all of CPU are done so let's free */
zram_meta_free(zram, disksize);
memset(&zram->stats, 0, sizeof(zram->stats));
zcomp_destroy(comp);
reset_bdev(zram);
+
+ up_write(&zram->init_lock);
}

static ssize_t disksize_store(struct device *dev,
--
2.31.1
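
The locking pattern the fix establishes can be sketched in userspace (a
minimal pthreads analogue for illustration only, not the kernel code;
"meta" stands in for the per-device resources that
zram_meta_alloc()/zram_meta_free() manage):

#include <pthread.h>
#include <stdlib.h>

static pthread_rwlock_t init_lock = PTHREAD_RWLOCK_INITIALIZER;
static char *meta;      /* NULL means "device not initialized" */

static void reset_device(void)
{
        pthread_rwlock_wrlock(&init_lock);
        /*
         * Before the fix, the write lock was dropped at this point, so
         * a concurrent disksize_store() could observe the half-torn-down
         * state and allocate new resources while the old ones were still
         * being freed.
         */
        free(meta);                             /* zram_meta_free() */
        meta = NULL;
        pthread_rwlock_unlock(&init_lock);      /* fixed: unlock last */
}

static int disksize_store(size_t disksize)
{
        int err = 0;

        pthread_rwlock_wrlock(&init_lock);
        if (meta)
                err = -1;                       /* -EBUSY: already initialized */
        else
                meta = malloc(disksize);        /* zram_meta_alloc() */
        pthread_rwlock_unlock(&init_lock);
        return err;
}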

2021-10-20 01:57:52

by Ming Lei

Subject: [PATCH V2 2/4] zram: don't fail to remove zram during unloading module

When the zram module is being unloaded, no one should be using the zram
disks. However, a disk's show() or store() attribute method may still be
running; this is expected, because device_del() (called from
del_gendisk()) will drain all pending show()/store() calls.

But reset_store() may set ->claim, which will make zram_remove() fail.
When this happens, the zram device is leaked and the warning 'Error:
Removing state 63 which has instances left.' is triggered when unloading
the module.

Fix the issue by not failing zram_remove() if ->claim is set. We
actually need to do nothing in case reset_store() is running, since
del_gendisk() will wait until reset_store() is done.

Reported-by: Luis Chamberlain <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
---
drivers/block/zram/zram_drv.c | 27 +++++++++++++++++++++------
1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 25d781dc5fef..8883de7aa3d7 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1968,25 +1968,40 @@ static int zram_add(void)
static int zram_remove(struct zram *zram)
{
struct block_device *bdev = zram->disk->part0;
+ bool claimed;

mutex_lock(&bdev->bd_disk->open_mutex);
- if (bdev->bd_openers || zram->claim) {
+ if (bdev->bd_openers) {
mutex_unlock(&bdev->bd_disk->open_mutex);
return -EBUSY;
}

- zram->claim = true;
+ claimed = zram->claim;
+ if (!claimed)
+ zram->claim = true;
mutex_unlock(&bdev->bd_disk->open_mutex);

zram_debugfs_unregister(zram);

- /* Make sure all the pending I/O are finished */
- fsync_bdev(bdev);
- zram_reset_device(zram);
+ if (claimed) {
+ /*
+ * If we were claimed by reset_store(), del_gendisk() will
+ * wait until reset_store() is done, so there is nothing to do.
+ */
+ ;
+ } else {
+ /* Make sure all the pending I/O are finished */
+ fsync_bdev(bdev);
+ zram_reset_device(zram);
+ }

pr_info("Removed device: %s\n", zram->disk->disk_name);

del_gendisk(zram->disk);
+
+ /* del_gendisk drains pending reset_store */
+ WARN_ON_ONCE(claimed && zram->claim);
+
blk_cleanup_disk(zram->disk);
kfree(zram);
return 0;
@@ -2063,7 +2078,7 @@ static struct class zram_control_class = {

static int zram_remove_cb(int id, void *ptr, void *data)
{
- zram_remove(ptr);
+ WARN_ON_ONCE(zram_remove(ptr));
return 0;
}

--
2.31.1
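
For readability, this is how zram_remove() reads with the patch applied
(a sketch reconstructed from the diff above; whitespace and context may
differ from the actual tree):

static int zram_remove(struct zram *zram)
{
        struct block_device *bdev = zram->disk->part0;
        bool claimed;

        mutex_lock(&bdev->bd_disk->open_mutex);
        if (bdev->bd_openers) {
                mutex_unlock(&bdev->bd_disk->open_mutex);
                return -EBUSY;
        }

        claimed = zram->claim;
        if (!claimed)
                zram->claim = true;
        mutex_unlock(&bdev->bd_disk->open_mutex);

        zram_debugfs_unregister(zram);

        if (claimed) {
                /*
                 * If we were claimed by reset_store(), del_gendisk() will
                 * wait until reset_store() is done, so there is nothing
                 * to do.
                 */
                ;
        } else {
                /* Make sure all the pending I/O are finished */
                fsync_bdev(bdev);
                zram_reset_device(zram);
        }

        pr_info("Removed device: %s\n", zram->disk->disk_name);

        del_gendisk(zram->disk);

        /* del_gendisk drains pending reset_store */
        WARN_ON_ONCE(claimed && zram->claim);

        blk_cleanup_disk(zram->disk);
        kfree(zram);
        return 0;
}

The lone ';' in the claimed branch keeps that case and its comment
visually paired with the else branch that does the actual teardown.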

2021-10-20 01:58:07

by Ming Lei

Subject: [PATCH V2 3/4] zram: avoid race between zram_remove and disksize_store

After the device is reset in zram_remove(), disksize_store() may still
come in and allocate resources again before the gendisk is deleted. Fix
the race by resetting the zram device once more after del_gendisk()
returns, at which point disksize_store() can't come in any more.

Reported-by: Luis Chamberlain <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
---
drivers/block/zram/zram_drv.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 8883de7aa3d7..6078d1dae44a 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -2002,6 +2002,13 @@ static int zram_remove(struct zram *zram)
/* del_gendisk drains pending reset_store */
WARN_ON_ONCE(claimed && zram->claim);

+ /*
+ * disksize store may come between zram_reset_device and del_gendisk, so
+ * run the last reset for avoiding leak anything allocated in
+ * disksize_store().
+ */
+ zram_reset_device(zram);
+
blk_cleanup_disk(zram->disk);
kfree(zram);
return 0;
--
2.31.1
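
The window this patch closes can be sketched as an interleaving (an
illustration only; the CPU labels are arbitrary):

/*
 * CPU0: zram_remove()                 CPU1: disksize_store()
 * -------------------                 ----------------------
 * zram_reset_device()
 *                                     down_write(&zram->init_lock)
 *                                     zram_meta_alloc(), zcomp_create()
 *                                     up_write(&zram->init_lock)
 * del_gendisk()
 *   ... no sysfs store can start or finish past this point ...
 * zram_reset_device()   <- added by this patch; frees what CPU1 allocated
 * blk_cleanup_disk(), kfree(zram)
 */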

2021-10-20 01:59:35

by Ming Lei

Subject: [PATCH V2 4/4] zram: replace fsync_bdev with sync_blockdev

When the zram driver calls fsync_bdev(), it guarantees that the bdev is
not opened by anyone, so there can't be an active fs/superblock on top
of the zram bdev. Replace fsync_bdev with sync_blockdev.

Signed-off-by: Ming Lei <[email protected]>
---
drivers/block/zram/zram_drv.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 6078d1dae44a..563af3aa4f5e 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1790,7 +1790,7 @@ static ssize_t reset_store(struct device *dev,
mutex_unlock(&bdev->bd_disk->open_mutex);

/* Make sure all the pending I/O are finished */
- fsync_bdev(bdev);
+ sync_blockdev(bdev);
zram_reset_device(zram);

mutex_lock(&bdev->bd_disk->open_mutex);
@@ -1991,7 +1991,7 @@ static int zram_remove(struct zram *zram)
;
} else {
/* Make sure all the pending I/O are finished */
- fsync_bdev(bdev);
+ sync_blockdev(bdev);
zram_reset_device(zram);
}

--
2.31.1
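
The equivalence is visible in fsync_bdev() itself, which at the time of
this series was essentially the following (paraphrased from the kernel
source of that era; worth double-checking against the exact tree):

int fsync_bdev(struct block_device *bdev)
{
        struct super_block *sb = get_super(bdev);

        if (sb) {
                int res = sync_filesystem(sb);

                drop_super(sb);
                return res;
        }
        return sync_blockdev(bdev);
}

With no opener there can be no superblock on the bdev, get_super()
returns NULL, and fsync_bdev() degenerates to sync_blockdev(); calling
sync_blockdev() directly just states that intent.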

2021-10-20 21:42:59

by Minchan Kim

Subject: Re: [PATCH V2 0/4] zram: fix two races and one zram leak

On Wed, Oct 20, 2021 at 09:55:44AM +0800, Ming Lei wrote:
> Hello,
>
> This series fixes three issues reported by Luis Chamberlain with one
> simpler approach:
>
> - race between zram_reset_device() and disksize_store() (1/4)
>
> - zram device leak when unloading the module, caused by a race between
> resetting and removing the device (2/4)
>
> - race between zram_remove and disksize_store (3/4)
>
> Also replace fsync_bdev with sync_blockdev since no one opens the bdev. (4/4)
>
> V2:
> - take another approach to avoid failing of zram_remove()
> - add patch to address race between zram_reset_device() and
> disksize_store()
>

Thanks for breaking the problems down, Ming.

The whole patchset looks good to me, since each patch solves the
problem step by step and finally fixes it.

Luis, do you have any concerns about this patchset as a solution to the
cpuhp problem? (Sorry in advance if I missed any concerns you raised in
a different thread. I'm totally lost.)


2021-10-21 17:41:29

by Luis Chamberlain

Subject: Re: [PATCH V2 0/4] zram: fix two races and one zram leak

On Wed, Oct 20, 2021 at 02:40:05PM -0700, Minchan Kim wrote:
> On Wed, Oct 20, 2021 at 09:55:44AM +0800, Ming Lei wrote:
> > [...]
>
> Thanks for breaking the problems down, Ming.
>
> The whole patchset looks good to me, since each patch solves the
> problem step by step and finally fixes it.
>
> Luis, do you have any concerns about this patchset as a solution to the
> cpuhp problem? (Sorry in advance if I missed any concerns you raised in
> a different thread. I'm totally lost.)

Running tests against this now. Will report back!

Luis

2021-10-21 17:48:39

by Luis Chamberlain

Subject: Re: [PATCH V2 0/4] zram: fix two races and one zram leak

On Thu, Oct 21, 2021 at 10:39:46AM -0700, Luis Chamberlain wrote:
> On Wed, Oct 20, 2021 at 02:40:05PM -0700, Minchan Kim wrote:
> > On Wed, Oct 20, 2021 at 09:55:44AM +0800, Ming Lei wrote:
> > > [...]
> >
> > Thanks for breaking the problems down, Ming.
> >
> > The whole patchset looks good to me, since each patch solves the
> > problem step by step and finally fixes it.
> >
> > Luis, do you have any concerns about this patchset as a solution to the
> > cpuhp problem? (Sorry in advance if I missed any concerns you raised in
> > a different thread. I'm totally lost.)
>
> Running tests against this now. Will report back!

So indeed, with these patches I end up in the situation where, if we
spawn two LTP zram02.sh runs, cancel them randomly, and start them
again, we get:

zram: Can't change algorithm for initialized device

After that, only once you do:

swapoff /dev/zram0

can you restart the tests again.

I had not seen that with my patch fix, but Ming noted that he did see
it, and I trust him, although I can't reproduce that issue myself.

So, at least from a testing perspective:

Tested-by: Luis Chamberlain <[email protected]>

I'll go and do the line-by-line code review now.

Luis

2021-10-21 23:05:58

by Luis Chamberlain

Subject: Re: [PATCH V2 1/4] zram: fix race between zram_reset_device() and disksize_store()

On Wed, Oct 20, 2021 at 09:55:45AM +0800, Ming Lei wrote:
> When ->init_lock is released in zram_reset_device(), disksize_store()
> can come in and try to allocate meta while zram_reset_device() is still
> freeing it, so the two race. Fix it by releasing ->init_lock only after
> the reset has finished freeing everything.
>
> Link: https://lore.kernel.org/linux-block/[email protected]/T/#mc617f865a3fa2778e40f317ddf48f6447c20c073
> Reported-by: Luis Chamberlain <[email protected]>
> Signed-off-by: Ming Lei <[email protected]>

Reviewed-by: Luis Chamberlain <[email protected]>

Luis

2021-10-21 23:53:20

by Luis Chamberlain

Subject: Re: [PATCH V2 2/4] zram: don't fail to remove zram during unloading module

On Wed, Oct 20, 2021 at 09:55:46AM +0800, Ming Lei wrote:
> When the zram module is being unloaded, no one should be using the zram
> disks. However, a disk's show() or store() attribute method may still be
> running; this is expected, because device_del() (called from
> del_gendisk()) will drain all pending show()/store() calls.

How about:

----------
When the zram module is being unloaded, no one should be using the
zram disks. However even while being unloaded the zram module's
sysfs attributes might be poked at to re-configure the zram module.
This is expected, and kernfs ensures that these operations complete
before device_del() completes.
----------

> But reset_store() may set ->claim, which will make zram_remove() fail.
> When this happens, the zram device is leaked and the warning 'Error:
> Removing state 63 which has instances left.' is triggered when unloading
> the module.

Note: the "Removing state 63 which has instances left" error does not
necessarily mean the struct zram is leaked. It just means that the per
cpu struct zcomp instances are leaked, given the CPU hotplug removal
callback is in charge of cleaning them up. That this gave us a hint that
we do in fact leak a struct zram device as well is separate from what
the message means, but I do agree it should be *that*, because, as you
noted, LTP does not yet try to make spaghetti with hot_add_show().

So, how about:

----------
When the reset sysfs op (reset_store()) gets called, zram->claim will
be set, and this prevents zram_remove() from removing a zram device. If one
is unloading the module and one races to run the reset sysfs op, we end
up leaking the struct zram device. We learned about this issue through
the error "Error: Removing state 63 which has instances left". While
this just means the per cpu struct zcomp instances are
leaked when we're removing the CPU hotplug multistate callback, the
test case (running LTP zram02.sh twice in different terminals) that
allows us to reproduce this issue only mucks with resetting the devices,
not hot_add_show(). Setting zram->claim will prevent zram_remove() from
completing successfully, and so on module removal zram_remove_cb() does
not tell us when it failed to remove the full struct zram device, and
it leaks.
----------

This begs the question, should we not then also make zram_remove_cb()
chatty on failure?

> Fix the issue by not failing zram_remove() if ->claim is set. We
> actually need to do nothing in case reset_store() is running, since
> del_gendisk() will wait until reset_store() is done.

Sure, this all looks like a reasonable alternative fix for the issue
without using a generic lock. It does beg the question of whether this
suffices for the other oddball sysfs ops, but we can take that up with
time.

> Reported-by: Luis Chamberlain <[email protected]>
> Signed-off-by: Ming Lei <[email protected]>

In so far as the code is concerned:

Reviewed-by: Luis Chamberlain <[email protected]>

Luis

2021-10-22 00:40:06

by Ming Lei

Subject: Re: [PATCH V2 2/4] zram: don't fail to remove zram during unloading module

On Thu, Oct 21, 2021 at 04:50:55PM -0700, Luis Chamberlain wrote:
> On Wed, Oct 20, 2021 at 09:55:46AM +0800, Ming Lei wrote:
> > When the zram module is being unloaded, no one should be using the zram
> > disks. However, a disk's show() or store() attribute method may still be
> > running; this is expected, because device_del() (called from
> > del_gendisk()) will drain all pending show()/store() calls.
>
> How about:
>
> ----------
> When the zram module is being unloaded, no one should be using the
> zram disks. However even while being unloaded the zram module's
> sysfs attributes might be poked at to re-configure the zram module.

're-configure the zram module' is confusing; it should be 're-configure
the zram device'.

> This is expected, and kernfs ensures that these operations complete
> before device_del() completes.

Otherwise, the above is pretty good, and I will take that in V3.

> ----------
>
> > But reset_store() may set ->claim, which will make zram_remove() fail.
> > When this happens, the zram device is leaked and the warning 'Error:
> > Removing state 63 which has instances left.' is triggered when unloading
> > the module.
>
> Note: the "Removing state 63 which has instances left" does not
> necessarily mean the struct zram is leaked. It just means that the per
> cpu struct zcomp instances are leaked, given the CPU hotplug removal
> callback is in charge of cleaning them up. That this gave us a hint that
> we do in fact leak a struct zram device as well is separate from what
> the message means, but I do agree it should be *that* because as you
> noted, LTP does not yet try to make spaghetti with hot_add_show().
>
> So, how about:
>
> ----------
> When the reset sysfs op (reset_store()) gets called, zram->claim will
> be set, and this prevents zram_remove() from removing a zram device. If one
> is unloading the module and one races to run the reset sysfs op, we end
> up leaking the struct zram device. We learned about this issue through
> the error "Error: Removing state 63 which has instances left". While
> this just means the per cpu struct zcomp instances are
> leaked when we're removing the CPU hotplug multistate callback, the
> test case (running LTP zram02.sh twice in different terminals) that
> allows us to reproduce this issue only mucks with resetting the devices,
> not hot_add_show(). Setting zram->claim will prevent zram_remove() from
> completing successfully, and so on module removal zram_remove_cb() does
> not tell us when it failed to remove the full struct zram device, and
> it leaks.
> ----------

A commit log is for helping people understand the change; making it too
long or exposing too many details may not serve that purpose, since we
also talk with code. I believe the following words are clear enough to
explain the problem, and they are short & straightforward, and won't
make people terrified :-)

But reset_store() may set ->claim, which will make zram_remove() fail.
When this happens, zram_reset_device() is bypassed and zram->comp can't
be destroyed, so the warning 'Error: Removing state 63 which has
instances left.' is triggered during module unload.

>
> This begs the question, should we not then also make zram_remove_cb()
> chatty on failure?

When zram_remove() is run from the module unloading path, it shouldn't
fail, because no one owns any zram disk while the module is being
unloaded; that is why I added WARN_ON_ONCE() in zram_remove_cb(), for
documentation benefit, since zram_remove() is called from two contexts.
But it can fail in the case of hot removal.

>
> > Fix the issue by not failing zram_remove() if ->claim is set. We
> > actually need to do nothing in case reset_store() is running, since
> > del_gendisk() will wait until reset_store() is done.
>
> Sure, this all looks like a reasonable alternative fix for the issue
> without using a generic lock. It does beg the question of whether this
> suffices for the other oddball sysfs ops, but we can take that up with
> time.

->claim is only set in reset_store(), so this is enough to avoid
zram_remove() failing while the module is being unloaded.

>
> > Reported-by: Luis Chamberlain <[email protected]>
> > Signed-off-by: Ming Lei <[email protected]>
>
> In so far as the code is concerned:
>
> Reviewed-by: Luis Chamberlain <[email protected]>

Thanks for the review!

--
Ming

2021-10-22 19:05:33

by Luis Chamberlain

Subject: Re: [PATCH V2 3/4] zram: avoid race between zram_remove and disksize_store

On Wed, Oct 20, 2021 at 09:55:47AM +0800, Ming Lei wrote:
> After the device is reset in zram_remove(), disksize_store() may still
> come in and allocate resources again before the gendisk is deleted. Fix
> the race by resetting the zram device once more after del_gendisk()
> returns, at which point disksize_store() can't come in any more.
>
> Reported-by: Luis Chamberlain <[email protected]>
> Signed-off-by: Ming Lei <[email protected]>
> ---
> drivers/block/zram/zram_drv.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 8883de7aa3d7..6078d1dae44a 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -2002,6 +2002,13 @@ static int zram_remove(struct zram *zram)
> /* del_gendisk drains pending reset_store */
> WARN_ON_ONCE(claimed && zram->claim);
>
> + /*
> + * disksize store may come between zram_reset_device and del_gendisk, so
> + * run the last reset for avoiding leak anything allocated in
> + * disksize_store().

The above is not clear English; how about:

disksize_store() may be called in between zram_reset_device() and
del_gendisk(), so run the last reset to avoid leaking anything allocated
with disksize_store()

Reviewed-by: Luis Chamberlain <[email protected]>

Luis

2021-10-22 19:09:39

by Luis Chamberlain

Subject: Re: [PATCH V2 4/4] zram: replace fsync_bdev with sync_blockdev

On Wed, Oct 20, 2021 at 09:55:48AM +0800, Ming Lei wrote:
> When the zram driver calls fsync_bdev(), it guarantees that the bdev is
> not opened by anyone, so there can't be an active fs/superblock on top
> of the zram bdev. Replace fsync_bdev with sync_blockdev.
>
> Signed-off-by: Ming Lei <[email protected]>

Reviewed-by: Luis Chamberlain <[email protected]>

Luis