2017-06-24 10:08:05

by Sami Kerola

Subject: zram hot_add device busy

Hello,

While checking whether any new util-linux bugs had been reported, I came
across this: https://bugs.launchpad.net/ubuntu/+source/util-linux/+bug/1645846

A simple way to reproduce the issue is:
d=$(cat /sys/class/zram-control/hot_add) && zramctl --size 256M /dev/zram$d

I am not entirely sure, but I think the zram_add() function in
drivers/block/zram/zram_drv.c should block until the device is usable.
Looking at the code, it might be device_add_disk() from block/genhd.c
that should do the blocking. But perhaps it's best if I leave such
details to people who know the code a bit better.

One thing annoys me. I expected 'zramctl --find --size 256M' to suffer
from the same issue, but it does not. I can only reproduce the issue when
triggering hot_add separately and then using the device path as quickly
as possible. Note that it sometimes takes a second try before hot_add
followed by immediate use triggers the issue. That is almost certainly
down to the speed of the system at hand, e.g., the quicker the computer,
the less likely the issue is to trigger.

--
Sami Kerola
http://www.iki.fi/kerolasa/


2017-06-26 02:23:13

by Minchan Kim

Subject: Re: zram hot_add device busy

Hello,

On Sat, Jun 24, 2017 at 11:08:01AM +0100, Sami Kerola wrote:
> Hello,
>
> While going through if there are new util-linux bugs reported I came a
> cross this https://bugs.launchpad.net/ubuntu/+source/util-linux/+bug/1645846
>
> Simple way to reproduce the issue is:
> d=$(cat /sys/class/zram-control/hot_add) && zramctl --size 256M /dev/zram$d

To find out which side the problem comes from, could you test it without
the zramctl command?

IOW,
d=$(cat /sys/class/zram-control/hot_add) && echo $((256<<20)) > /sys/block/zram$d/disksize

If it still has a problem, please show your test code; that would help a
lot in understanding the fundamental problem. ;-)

>
> I am not entirely sure, but drivers/block/zram/zram_drv.c function
> zram_add() should block until the device is usable. Looking the code
> that it might be the device_add_disk() from block/genhd.c that should
> do the blocking. But perhaps it's best if I leave such detail to
> people who know the code a bit better.

I might be missing something, but I believe the device is in a usable state
once zram_add() is done. Just in case, please check the return value after
each operation.

if [ $? -ne 0 ]; then
	echo "some op failed" >&2
	exit 1
fi
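
The test without zramctl, with the return value checked after each step,
could be sketched roughly as below. This is only an illustration: it
assumes the usual zram sysfs attributes "reset" and "disksize", the
function name zram_setup and its return codes are made up, and the sysfs
directory is passed as an argument so the function can be tried against an
ordinary directory first.

```shell
#!/bin/sh
# Hypothetical helper, not part of zram or util-linux.
# zram_setup <sysfs-block-dir> <size-bytes>
# Writes 1 to "reset" and then the size to "disksize", checking each step.
# Returns 0 on success, 1 if the reset write failed, 2 if the disksize
# write failed.
zram_setup() {
    dir=$1
    size=$2

    if ! echo 1 > "$dir/reset"; then
        echo "reset failed for $dir" >&2
        return 1
    fi

    if ! echo "$size" > "$dir/disksize"; then
        echo "disksize failed for $dir" >&2
        return 2
    fi

    return 0
}
```

On a real system this would be run as root right after hot_add, for
example: zram_setup "/sys/block/zram$d" $((256<<20)). A non-zero status
then tells which of the two writes failed.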

Thanks.

>
> One thing annoys me. I expected 'zramctl --find --size 256M' to suffer
> from same issue but it does not. I can only reproduce the issue when
> triggering hot_add separately, and as quick as possibly using the
> path. Notice that sometimes it takes second try before the hot_add and
> use triggers the issue. That is almost certainly down to speed the
> system in hand, e.g., quicker the computer less likely to trigger.
>
> --
> Sami Kerola
> http://www.iki.fi/kerolasa/

2017-06-26 02:39:11

by Sergey Senozhatsky

Subject: Re: zram hot_add device busy

Hello,

(Cc Andrew, Karel)

On (06/24/17 11:08), Sami Kerola wrote:
> Hello,
>
> While going through if there are new util-linux bugs reported I came a
> cross this https://bugs.launchpad.net/ubuntu/+source/util-linux/+bug/1645846
>
> Simple way to reproduce the issue is:
> d=$(cat /sys/class/zram-control/hot_add) && zramctl --size 256M /dev/zram$d
>
> I am not entirely sure, but drivers/block/zram/zram_drv.c function
> zram_add() should block until the device is usable. Looking the code
> that it might be the device_add_disk() from block/genhd.c that should
> do the blocking. But perhaps it's best if I leave such detail to
> people who know the code a bit better.
>
> One thing annoys me. I expected 'zramctl --find --size 256M' to suffer
> from same issue but it does not. I can only reproduce the issue when
> triggering hot_add separately, and as quick as possibly using the
> path. Notice that sometimes it takes second try before the hot_add and
> use triggers the issue. That is almost certainly down to speed the
> system in hand, e.g., quicker the computer less likely to trigger.


ok... I don't think I see what we can do in zram about the
issue in question. what I see on my system is that
systemd-udevd is racing with zramctl.


so, the "failed" case looks like this


process action

[ 761.191697] zram: >>[systemd-udevd]>>> zram_open
^^^^^^^ systemd opens the device. we have ->bd_openers now

[ 761.194057] zram: >>[zramctl]>>> reset_store
^^^^^^^ fails, because device is opened by systemd.
if (bdev->bd_openers || zram->claim) is true, we can't reset

[ 761.195105] zram: >>[systemd-udevd]>>> zram_release
^^^^^^^ now systemd releases the device

[ 761.198655] zram: >>[zramctl]>>> reset_store
^^^^^^^ succeeds, because we don't have ->bd_openers anymore

[ 761.198669] zram: >>[zramctl]>>> zram_reset_device
[ 761.198721] zram: >>[zramctl]>>> disksize_store
[ 761.199279] zram0: detected capacity change from 0 to 268435456


normal case,

process action


[ 761.203989] zram: >>[systemd-udevd]>>> zram_open
[ 761.204461] zram: >>[systemd-udevd]>>> zram_release
[ 761.204940] zram: >>[zramctl]>>> reset_store
[ 761.204948] zram: >>[zramctl]>>> zram_reset_device
[ 761.204972] zram: >>[zramctl]>>> disksize_store
[ 761.205409] zram1: detected capacity change from 0 to 268435456


as you can see, systemd-udevd releases the device before zramctl
attempts to reset it.
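
given the race above, a caller that triggers hot_add itself could simply
retry for a short while until udevd has released the device. a minimal
sketch of that idea follows; retry_cmd is a made-up helper, not something
zramctl provides, and the attempt count and delay are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: run a command, retrying if it fails.
# retry_cmd <attempts> <delay-seconds> <command> [args...]
# Returns 0 as soon as the command succeeds, 1 if every attempt failed.
retry_cmd() {
    attempts=$1
    delay=$2
    shift 2

    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt is coming.
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}
```

usage would be something like
d=$(cat /sys/class/zram-control/hot_add) && retry_cmd 5 1 zramctl --size 256M /dev/zram$d
waiting for the udev event queue with 'udevadm settle' before calling
zramctl might be an alternative, but I have not verified that it closes
the window.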

-ss

2017-06-26 02:58:19

by Sergey Senozhatsky

Subject: Re: zram hot_add device busy

On (06/26/17 11:39), Sergey Senozhatsky wrote:
[..]
> ok... I don't think I see what we can do in zram about the
> issue in question.

... check init_done() in reset_store() and avoid the whole ->bd_openers
branch if the device is already reset?

// not compile tested. just a sketch. //

---

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index debee952dcc1..79b1a957a6bd 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1047,6 +1047,14 @@ static ssize_t reset_store(struct device *dev,
 		return -EINVAL;

 	zram = dev_to_zram(dev);
+
+	down_write(&zram->init_lock);
+	if (!init_done(zram)) {
+		up_write(&zram->init_lock);
+		return len;
+	}
+	up_write(&zram->init_lock);
+
 	bdev = bdget_disk(zram->disk, 0);
 	if (!bdev)
 		return -ENOMEM;

---

-ss