2021-10-25 02:59:36

by Ming Lei

[permalink] [raw]
Subject: [PATCH V3 0/4] zram: fix two races and one zram leak

Hello,

Fixes three issues reported by Luis Chamberlain with one simpler approach:

- race between between zram_reset_device() and disksize_store() (1/4)

- zram leak during unloading module, which is one race between resetting
and removing device (2/4)

- race between zram_remove and disksize_store (3/4)

Also replace replace fsync_bdev with sync_blockdev since no one opens
it.(4/4)

V3:
- no code change
- update commit log or comment as Luis suggested
- add reviewed-by tag

V2:
- take another approach to avoid failing of zram_remove()
- add patch to address race between zram_reset_device() and
disksize_store()


Ming Lei (4):
zram: fix race between zram_reset_device() and disksize_store()
zram: don't fail to remove zram during unloading module
zram: avoid race between zram_remove and disksize_store
zram: replace fsync_bdev with sync_blockdev

drivers/block/zram/zram_drv.c | 39 ++++++++++++++++++++++++++++-------
1 file changed, 31 insertions(+), 8 deletions(-)

--
2.31.1


2021-10-25 03:00:10

by Ming Lei

[permalink] [raw]
Subject: [PATCH V3 1/4] zram: fix race between zram_reset_device() and disksize_store()

When the ->init_lock is released in zram_reset_device(), disksize_store()
can come in and try to allocate meta, but zram_reset_device() is freeing
free meta, so cause races.

Link: https://lore.kernel.org/linux-block/[email protected]/T/#mc617f865a3fa2778e40f317ddf48f6447c20c073
Reported-by: Luis Chamberlain <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
---
drivers/block/zram/zram_drv.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index a68297fb51a2..25d781dc5fef 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1704,12 +1704,13 @@ static void zram_reset_device(struct zram *zram)
set_capacity_and_notify(zram->disk, 0);
part_stat_set_all(zram->disk->part0, 0);

- up_write(&zram->init_lock);
/* I/O operation under all of CPU are done so let's free */
zram_meta_free(zram, disksize);
memset(&zram->stats, 0, sizeof(zram->stats));
zcomp_destroy(comp);
reset_bdev(zram);
+
+ up_write(&zram->init_lock);
}

static ssize_t disksize_store(struct device *dev,
--
2.31.1

2021-10-25 03:00:33

by Ming Lei

[permalink] [raw]
Subject: [PATCH V3 4/4] zram: replace fsync_bdev with sync_blockdev

When calling fsync_bdev(), zram driver guarantees that the bdev won't be
opened by anyone, then there can't be one active fs/superblock over the
zram bdev, so replace fsync_bdev with sync_blockdev.

Reviewed-by: Luis Chamberlain <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
---
drivers/block/zram/zram_drv.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index dba93b8ce511..9609e2b31d5a 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1790,7 +1790,7 @@ static ssize_t reset_store(struct device *dev,
mutex_unlock(&bdev->bd_disk->open_mutex);

/* Make sure all the pending I/O are finished */
- fsync_bdev(bdev);
+ sync_blockdev(bdev);
zram_reset_device(zram);

mutex_lock(&bdev->bd_disk->open_mutex);
@@ -1991,7 +1991,7 @@ static int zram_remove(struct zram *zram)
;
} else {
/* Make sure all the pending I/O are finished */
- fsync_bdev(bdev);
+ sync_blockdev(bdev);
zram_reset_device(zram);
}

--
2.31.1

2021-10-25 03:12:14

by Ming Lei

[permalink] [raw]
Subject: [PATCH V3 2/4] zram: don't fail to remove zram during unloading module

When the zram module is being unloaded, no one should be using the
zram disks. However even while being unloaded the zram module's
sysfs attributes might be poked at to re-configure zram devices.
This is expected, and kernfs ensures that these operations complete
before device_del() completes.

But reset_store() may set ->claim which will fail zram_remove(), when
this happens, zram_reset_device() is bypassed, and zram->comp can't
be destroyed, so the warning of 'Error: Removing state 63 which has
instances left.' is triggered during unloading module, together with
memory leak and sort of thing.

Fixes the issue by not failing zram_remove() if ->claim is set, and
we actually need to do nothing in case that zram_reset() is running
since del_gendisk() will wait until zram_reset() is done.

Reported-by: Luis Chamberlain <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
---
drivers/block/zram/zram_drv.c | 27 +++++++++++++++++++++------
1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 25d781dc5fef..8883de7aa3d7 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1968,25 +1968,40 @@ static int zram_add(void)
static int zram_remove(struct zram *zram)
{
struct block_device *bdev = zram->disk->part0;
+ bool claimed;

mutex_lock(&bdev->bd_disk->open_mutex);
- if (bdev->bd_openers || zram->claim) {
+ if (bdev->bd_openers) {
mutex_unlock(&bdev->bd_disk->open_mutex);
return -EBUSY;
}

- zram->claim = true;
+ claimed = zram->claim;
+ if (!claimed)
+ zram->claim = true;
mutex_unlock(&bdev->bd_disk->open_mutex);

zram_debugfs_unregister(zram);

- /* Make sure all the pending I/O are finished */
- fsync_bdev(bdev);
- zram_reset_device(zram);
+ if (claimed) {
+ /*
+ * If we were claimed by reset_store(), del_gendisk() will
+ * wait until reset_store() is done, so nothing need to do.
+ */
+ ;
+ } else {
+ /* Make sure all the pending I/O are finished */
+ fsync_bdev(bdev);
+ zram_reset_device(zram);
+ }

pr_info("Removed device: %s\n", zram->disk->disk_name);

del_gendisk(zram->disk);
+
+ /* del_gendisk drains pending reset_store */
+ WARN_ON_ONCE(claimed && zram->claim);
+
blk_cleanup_disk(zram->disk);
kfree(zram);
return 0;
@@ -2063,7 +2078,7 @@ static struct class zram_control_class = {

static int zram_remove_cb(int id, void *ptr, void *data)
{
- zram_remove(ptr);
+ WARN_ON_ONCE(zram_remove(ptr));
return 0;
}

--
2.31.1

2021-10-25 03:12:21

by Ming Lei

[permalink] [raw]
Subject: [PATCH V3 3/4] zram: avoid race between zram_remove and disksize_store

After resetting device in zram_remove(), disksize_store still may come and
allocate resources again before deleting gendisk, fix the race by resetting
zram after del_gendisk() returns. At that time, disksize_store can't come
any more.

Reported-by: Luis Chamberlain <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
---
drivers/block/zram/zram_drv.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 8883de7aa3d7..dba93b8ce511 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -2002,6 +2002,13 @@ static int zram_remove(struct zram *zram)
/* del_gendisk drains pending reset_store */
WARN_ON_ONCE(claimed && zram->claim);

+ /*
+ * disksize_store() may be called in between zram_reset_device()
+ * and del_gendisk(), so run the last reset to avoid leaking
+ * anything allocated with disksize_store()
+ */
+ zram_reset_device(zram);
+
blk_cleanup_disk(zram->disk);
kfree(zram);
return 0;
--
2.31.1

2021-10-25 16:32:45

by Minchan Kim

[permalink] [raw]
Subject: Re: [PATCH V3 0/4] zram: fix two races and one zram leak

On Mon, Oct 25, 2021 at 10:54:22AM +0800, Ming Lei wrote:
> Hello,
>
> Fixes three issues reported by Luis Chamberlain with one simpler approach:
>
> - race between between zram_reset_device() and disksize_store() (1/4)
>
> - zram leak during unloading module, which is one race between resetting
> and removing device (2/4)
>
> - race between zram_remove and disksize_store (3/4)
>
> Also replace replace fsync_bdev with sync_blockdev since no one opens
> it.(4/4)
>
> V3:
> - no code change
> - update commit log or comment as Luis suggested
> - add reviewed-by tag
>
> V2:
> - take another approach to avoid failing of zram_remove()
> - add patch to address race between zram_reset_device() and
> disksize_store()
>
>
> Ming Lei (4):
> zram: fix race between zram_reset_device() and disksize_store()
> zram: don't fail to remove zram during unloading module
> zram: avoid race between zram_remove and disksize_store
> zram: replace fsync_bdev with sync_blockdev

Andrew Morton usually takes zram patches so Ccing him.

Acked-by: Minchan Kim <[email protected]>

for all patches in this thread.

2021-11-01 00:33:48

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH V3 0/4] zram: fix two races and one zram leak

Hello Andrew Morton and Jens,

On Mon, Oct 25, 2021 at 09:30:07AM -0700, Minchan Kim wrote:
> On Mon, Oct 25, 2021 at 10:54:22AM +0800, Ming Lei wrote:
> > Hello,
> >
> > Fixes three issues reported by Luis Chamberlain with one simpler approach:
> >
> > - race between between zram_reset_device() and disksize_store() (1/4)
> >
> > - zram leak during unloading module, which is one race between resetting
> > and removing device (2/4)
> >
> > - race between zram_remove and disksize_store (3/4)
> >
> > Also replace replace fsync_bdev with sync_blockdev since no one opens
> > it.(4/4)
> >
> > V3:
> > - no code change
> > - update commit log or comment as Luis suggested
> > - add reviewed-by tag
> >
> > V2:
> > - take another approach to avoid failing of zram_remove()
> > - add patch to address race between zram_reset_device() and
> > disksize_store()
> >
> >
> > Ming Lei (4):
> > zram: fix race between zram_reset_device() and disksize_store()
> > zram: don't fail to remove zram during unloading module
> > zram: avoid race between zram_remove and disksize_store
> > zram: replace fsync_bdev with sync_blockdev
>
> Andrew Morton usually takes zram patches so Ccing him.
>
> Acked-by: Minchan Kim <[email protected]>
>
> for all patches in this thread.

Any chance to make it in v5.16?

Thanks,
Ming

2021-11-02 20:45:54

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH V3 0/4] zram: fix two races and one zram leak

On Mon, 25 Oct 2021 10:54:22 +0800, Ming Lei wrote:
> Fixes three issues reported by Luis Chamberlain with one simpler approach:
>
> - race between between zram_reset_device() and disksize_store() (1/4)
>
> - zram leak during unloading module, which is one race between resetting
> and removing device (2/4)
>
> [...]

Applied, thanks!

[1/4] zram: fix race between zram_reset_device() and disksize_store()
commit: 6f1637795f2827d36aec9e0246487f5852e8abf7
[2/4] zram: don't fail to remove zram during unloading module
commit: 8c54499a59b026a3dc2afccf6e1b36d5700d2fef
[3/4] zram: avoid race between zram_remove and disksize_store
commit: 5a4b653655d554b5f51a5d2252882708c56a6f7e
[4/4] zram: replace fsync_bdev with sync_blockdev
commit: 00c5495c54f785beb0f6a34f7a3674d3ea0997d5

Best regards,
--
Jens Axboe