2017-03-20 22:59:04

by Ming Lin

Subject: [RFC PATCH 0/1] nbd: fix crash when unmapping nbd device with fs still mounted

Hi all,

I ran into a BUG_ON(!buffer_mapped(bh)) crash with the script below.

$ rbd-nbd map mypool/myimg
$ mkfs.ext4 /dev/nbd0
$ mount /dev/nbd0 /mnt/
$ rbd-nbd unmap /dev/nbd0
$ umount /mnt

[ 1248.870131] kernel BUG at /home/mlin/linux/fs/buffer.c:3103!
[ 1248.871214] invalid opcode: 0000 [#1] SMP
[ 1248.879468] CPU: 0 PID: 2450 Comm: umount Tainted: G E 4.11.0-rc2+ #2
[ 1248.896579] Call Trace:
[ 1248.897056] __sync_dirty_buffer+0x6e/0xe0
[ 1248.897870] ext4_commit_super+0x1eb/0x290 [ext4]
[ 1248.898795] ext4_put_super+0x2fa/0x3c0 [ext4]
[ 1248.899662] generic_shutdown_super+0x6f/0x100
[ 1248.900525] kill_block_super+0x27/0x70
[ 1248.901257] deactivate_locked_super+0x43/0x70
[ 1248.902112] deactivate_super+0x46/0x60
[ 1248.902869] cleanup_mnt+0x3f/0x80
[ 1248.903526] __cleanup_mnt+0x12/0x20
[ 1248.904218] task_work_run+0x83/0xb0
[ 1248.904941] exit_to_usermode_loop+0x59/0x7b
[ 1248.905769] do_syscall_64+0x165/0x180
[ 1248.907603] entry_SYSCALL64_slow_path+0x25/0x25

Last year, Ratna posted a patch to fix it.
https://lkml.org/lkml/2016/4/20/257

Ratna's script to reproduce the bug.

$ qemu-img create -f qcow2 f.img 1G
$ mkfs.ext4 f.img
$ qemu-nbd -c /dev/nbd0 f.img
$ mount /dev/nbd0 dir
$ killall -KILL qemu-nbd
$ sleep 1
$ ls dir
$ umount dir

I ported Ratna's patch to 4.11-rc2 and confirmed that it fixes the crash.

Jan Kara had some comments about this bug:
http://www.kernelhub.org/?p=2&msg=361407

I hope to get this bug fixed in the upstream kernel first and then backport
the fix to our production system.

Please see "PATCH 1/1" for details.

Thanks,
Ming
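
For readers less familiar with the buffer cache, the state change behind the
trace above can be modelled in userspace roughly as follows. This is only a
sketch under stated assumptions: toy_bh, toy_discard_buffer() and
toy_sync_dirty_buffer() are made-up names for illustration, not kernel APIs,
and the model returns an error where the kernel would hit the BUG_ON.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct buffer_head: only the flag we care about. */
struct toy_bh {
	bool mapped;	/* models buffer_mapped(bh) */
};

/* Models the effect of discard_buffer(): invalidation drops the mapping. */
static void toy_discard_buffer(struct toy_bh *bh)
{
	bh->mapped = false;
}

/*
 * Models the check that fires on the __sync_dirty_buffer() path.
 * The kernel does BUG_ON(!buffer_mapped(bh)); this toy returns -1 instead.
 */
static int toy_sync_dirty_buffer(const struct toy_bh *bh)
{
	if (!bh->mapped)
		return -1;
	return 0;
}
```

Calling toy_sync_dirty_buffer() on a freshly mapped toy_bh returns 0; calling
it after toy_discard_buffer() returns -1, which corresponds to the umount
after "rbd-nbd unmap" in the script.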


2017-03-20 22:59:12

by Ming Lin

Subject: [RFC PATCH 1/1] nbd: replace kill_bdev() with __invalidate_device()

From: Ratna Manoj Bolla <[email protected]>

When a filesystem is mounted on an nbd device and the device is disconnected,
kill_bdev() and the reset of the bdev size to zero destroy buffer_head
mappings underneath the mounted filesystem.

After a bdev size reset (i.e. bdev->bd_inode->i_size = 0) on a disconnect,
followed by a sys_umount(),
generic_shutdown_super()->...
->__sync_blockdev()->...
->blkdev_writepages()->...
->do_invalidatepage()->...
->discard_buffer() discards the superblock buffer_head, which
ext4_commit_super() assumes to be in the mapped state.

[mlin: ported to 4.11-rc2]
Signed-off-by: Ratna Manoj Bolla <[email protected]>
---
drivers/block/nbd.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index cb4ccfc..a6a3643 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -125,7 +125,8 @@ static const char *nbdcmd_to_ascii(int cmd)

 static int nbd_size_clear(struct nbd_device *nbd, struct block_device *bdev)
 {
-	bd_set_size(bdev, 0);
+	if (bdev->bd_openers <= 1)
+		bd_set_size(bdev, 0);
 	set_capacity(nbd->disk, 0);
 	kobject_uevent(&nbd_to_dev(nbd)->kobj, KOBJ_CHANGE);

@@ -603,6 +604,8 @@ static void nbd_reset(struct nbd_device *nbd)

 static void nbd_bdev_reset(struct block_device *bdev)
 {
+	if (bdev->bd_openers > 1)
+		return;
 	set_device_ro(bdev, false);
 	bdev->bd_inode->i_size = 0;
 	if (max_part > 0) {
@@ -666,7 +669,8 @@ static int nbd_clear_sock(struct nbd_device *nbd, struct block_device *bdev)
 {
 	sock_shutdown(nbd);
 	nbd_clear_que(nbd);
-	kill_bdev(bdev);
+
+	__invalidate_device(bdev, true);
 	nbd_bdev_reset(bdev);
 	/*
 	 * We want to give the run thread a chance to wait for everybody
--
1.8.3.1
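
The effect of the bd_openers checks in the diff above can be illustrated with
a small userspace model. This is a sketch, not kernel code: toy_bdev and
toy_bdev_reset() are hypothetical names, the struct keeps only the fields the
guard touches, and the switch from kill_bdev() to __invalidate_device() is not
modelled at all.

```c
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for struct block_device. */
struct toy_bdev {
	int openers;		/* models bdev->bd_openers */
	long long size;		/* models bdev->bd_inode->i_size */
	bool read_only;		/* models the flag cleared by set_device_ro() */
};

/*
 * Models the patched nbd_bdev_reset(): leave the device alone while
 * anyone besides the disconnecting caller (e.g. a mounted filesystem)
 * still holds it open.
 */
static void toy_bdev_reset(struct toy_bdev *bdev)
{
	if (bdev->openers > 1)
		return;
	bdev->read_only = false;
	bdev->size = 0;
}
```

With openers == 2 (the disconnecting caller plus a mount) the size is left
untouched, so in this model nothing later sees a zero-sized device; with
openers == 1 the reset proceeds as before.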

2017-03-22 20:48:15

by Ming Lin

Subject: Re: [Nbd] [RFC PATCH 1/1] nbd: replace kill_bdev() with __invalidate_device()

Hi Josef,

Any comments?

Thanks,
Ming

2017-03-22 21:44:02

by Josef Bacik

Subject: Re: [Nbd] [RFC PATCH 1/1] nbd: replace kill_bdev() with __invalidate_device()

Hey sorry I just got back from LSF, I’ll look at this in the morning. Thanks,

Josef

2017-03-23 17:51:34

by Josef Bacik

Subject: Re: [RFC PATCH 1/1] nbd: replace kill_bdev() with __invalidate_device()

Yeah I think this is ok, I’ll throw it on my queue for fixes for this cycle. Thanks,

Josef

2017-03-23 20:59:06

by Ming Lin

Subject: Re: [RFC PATCH 1/1] nbd: replace kill_bdev() with __invalidate_device()

On Thu, Mar 23, 2017 at 10:51 AM, Josef Bacik <[email protected]> wrote:
> Yeah I think this is ok, I’ll throw it on my queue for fixes for this cycle. Thanks,

Great. Thanks.
