2019-07-30 12:16:35

by Pankaj Gupta

Subject: [PATCH] dm: fix dax_dev NULL dereference


Murphy Zhou reports [1] hitting a panic when running xfstests
generic/108 on a pmem ramdisk. In his words:

This test is simulating partial disk error when calling fsync():
  create a lvm vg which consists of 2 disks:
    one scsi_debug disk; one other disk I specified, pmem ramdisk in this case.
  create lv in this vg and write to it, make sure writing across 2 disks;
  offline scsi_debug disk;
  write again to allocated area;
  expect fsync: IO error.
If one of the disks is pmem ramdisk, it reproduces every time on my setup,
on v5.3-rc2+.
The mount -o dax option is not required to reproduce this panic.
...

Fix this by returning false from device_synchronous() when dev->dax_dev
is NULL, instead of passing the NULL pointer on to dax_synchronous().

[ 1984.878208] BUG: kernel NULL pointer dereference, address: 00000000000002d0
[ 1984.882546] #PF: supervisor read access in kernel mode
[ 1984.885664] #PF: error_code(0x0000) - not-present page
[ 1984.888626] PGD 0 P4D 0
[ 1984.890140] Oops: 0000 [#1] SMP PTI
...
...
[ 1984.943682] Call Trace:
[ 1984.945007] device_synchronous+0xe/0x20 [dm_mod]
[ 1984.947328] stripe_iterate_devices+0x48/0x60 [dm_mod]
[ 1984.949947] ? dm_set_device_limits+0x130/0x130 [dm_mod]
[ 1984.952516] dm_table_supports_dax+0x39/0x90 [dm_mod]
[ 1984.954989] dm_table_set_restrictions+0x248/0x5d0 [dm_mod]
[ 1984.957685] dm_setup_md_queue+0x66/0x110 [dm_mod]
[ 1984.960280] table_load+0x1e3/0x390 [dm_mod]
[ 1984.962491] ? retrieve_status+0x1c0/0x1c0 [dm_mod]
[ 1984.964910] ctl_ioctl+0x1d3/0x550 [dm_mod]
[ 1984.967006] ? path_lookupat+0xf4/0x200
[ 1984.968890] dm_ctl_ioctl+0xa/0x10 [dm_mod]
[ 1984.970920] do_vfs_ioctl+0xa9/0x630
[ 1984.972701] ksys_ioctl+0x60/0x90
[ 1984.974335] __x64_sys_ioctl+0x16/0x20
[ 1984.976221] do_syscall_64+0x5b/0x1d0
[ 1984.978091] entry_SYSCALL_64_after_hwframe+0x44/0xa9

[1] https://lore.kernel.org/linux-fsdevel/[email protected]/T/#mac662eb50b9d7bd282b23e6e8625a3f7a4687506

Fixes: 2e9ee0955d3c ("dm: enable synchronous dax")
Reported-by: [email protected]
Tested-by: [email protected]
Signed-off-by: Pankaj Gupta <[email protected]>
---
drivers/md/dm-table.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index caaee8032afe..b065845c1bdd 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -894,6 +894,9 @@ int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
 static int device_synchronous(struct dm_target *ti, struct dm_dev *dev,
 			      sector_t start, sector_t len, void *data)
 {
+	if (!dev->dax_dev)
+		return false;
+
 	return dax_synchronous(dev->dax_dev);
 }

--
2.20.1


2019-07-30 12:16:52

by Pankaj Gupta

Subject: Re: [PATCH] dm: fix dax_dev NULL dereference



+CC [[email protected]]

2019-07-30 22:42:24

by Mike Snitzer

Subject: Re: dm: fix dax_dev NULL dereference

I staged the fix (which I tweaked) here:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.3&id=95b9ebb78c4c733f8912a195fbd0bc19960e726e

Also, please note this additional related commit that just serves to
improve a related function name and clean up some whitespace:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.3&id=f965f935a89bb174fd3f6d6b51bba91c1ed258c5

I'll likely send these to Linus for 5.3-rc3 later this week.

Thanks,
Mike

2019-07-31 02:24:29

by Dan Williams

Subject: Re: dm: fix dax_dev NULL dereference

On Tue, Jul 30, 2019 at 12:07 PM Mike Snitzer <[email protected]> wrote:
>
> I staged the fix (which I tweaked) here:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.3&id=95b9ebb78c4c733f8912a195fbd0bc19960e726e

Thanks for picking this up, Mike, but I'd prefer to just teach
dax_synchronous() to return false if the passed-in dax_dev is NULL.
Thoughts?

2019-07-31 05:44:32

by Pankaj Gupta

Subject: Re: dm: fix dax_dev NULL dereference


>
> I staged the fix (which I tweaked) here:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.3&id=95b9ebb78c4c733f8912a195fbd0bc19960e726e
>
> Also, please note this additional related commit that just serves to
> improve a related function name and clean up some whitespace:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.3&id=f965f935a89bb174fd3f6d6b51bba91c1ed258c5
>
> I'll likely send these to Linus for 5.3-rc3 later this week.

OK.

Thank you,
Pankaj
