2024-01-13 21:34:56

by Fedor Pchelkin

Subject: [PATCH] drm/ttm: fix ttm pool initialization for no-dma-device drivers

The QXL driver doesn't use any device for DMA mappings or allocations, so
it passes a NULL dev to ttm_device_init(). dev_to_node() then dereferences
that NULL pointer and panics on NUMA systems:

general protection fault, probably for non-canonical address 0xdffffc000000007a: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x00000000000003d0-0x00000000000003d7]
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.7.0+ #9
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
RIP: 0010:ttm_device_init+0x10e/0x340
Call Trace:
<TASK>
qxl_ttm_init+0xaa/0x310
qxl_device_init+0x1071/0x2000
qxl_pci_probe+0x167/0x3f0
local_pci_probe+0xe1/0x1b0
pci_device_probe+0x29d/0x790
really_probe+0x251/0x910
__driver_probe_device+0x1ea/0x390
driver_probe_device+0x4e/0x2e0
__driver_attach+0x1e3/0x600
bus_for_each_dev+0x12d/0x1c0
bus_add_driver+0x25a/0x590
driver_register+0x15c/0x4b0
qxl_pci_driver_init+0x67/0x80
do_one_initcall+0xf5/0x5d0
kernel_init_freeable+0x637/0xb10
kernel_init+0x1c/0x2e0
ret_from_fork+0x48/0x80
ret_from_fork_asm+0x1b/0x30
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:ttm_device_init+0x10e/0x340

Fall back to NUMA_NO_NODE if there is no device for DMA.

Found by Linux Verification Center (linuxtesting.org).

Fixes: b0a7ce53d494 ("drm/ttm: Schedule delayed_delete worker closer")
Signed-off-by: Fedor Pchelkin <[email protected]>
---
drivers/gpu/drm/ttm/ttm_device.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index f5187b384ae9..4130945052ed 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -195,7 +195,7 @@ int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *func
 		    bool use_dma_alloc, bool use_dma32)
 {
 	struct ttm_global *glob = &ttm_glob;
-	int ret;
+	int ret, nid;

 	if (WARN_ON(vma_manager == NULL))
 		return -EINVAL;
@@ -215,7 +215,12 @@ int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *func

 	ttm_sys_man_init(bdev);

-	ttm_pool_init(&bdev->pool, dev, dev_to_node(dev), use_dma_alloc, use_dma32);
+	if (dev)
+		nid = dev_to_node(dev);
+	else
+		nid = NUMA_NO_NODE;
+
+	ttm_pool_init(&bdev->pool, dev, nid, use_dma_alloc, use_dma32);

 	bdev->vma_manager = vma_manager;
 	spin_lock_init(&bdev->lru_lock);
--
2.43.0
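
For readers following the thread, the guard added by the patch can be sketched in userspace C. This is only an illustration, not kernel code: struct device and dev_to_node() below are simplified stand-ins (assumed here to mirror the CONFIG_NUMA variant, which reads dev->numa_node), NUMA_NO_NODE is redefined locally, and pool_nid_for() and pool_nid_selfcheck() are hypothetical helper names that do not exist in TTM.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel constant of the same name. */
#define NUMA_NO_NODE (-1)

/* Simplified stand-in for the kernel's struct device. */
struct device {
	int numa_node;
};

/*
 * Stand-in modelled on the CONFIG_NUMA variant of dev_to_node():
 * it reads a field of *dev, so calling it with NULL is invalid.
 */
static int dev_to_node(const struct device *dev)
{
	return dev->numa_node;
}

/*
 * Hypothetical helper showing the same decision the patch makes
 * inline: use the device's node if there is a device, otherwise
 * fall back to NUMA_NO_NODE.
 */
static int pool_nid_for(const struct device *dev)
{
	return dev ? dev_to_node(dev) : NUMA_NO_NODE;
}

/* Small self-check exercising both paths. */
static void pool_nid_selfcheck(void)
{
	struct device gpu = { .numa_node = 1 };

	assert(pool_nid_for(&gpu) == 1);
	assert(pool_nid_for(NULL) == NUMA_NO_NODE);
}
```

Calling pool_nid_selfcheck() from a test program exercises both branches. The second assertion corresponds to the QXL case, where no DMA device is supplied.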



2024-01-15 10:08:24

by Christian König

Subject: Re: [PATCH] drm/ttm: fix ttm pool initialization for no-dma-device drivers

On 13.01.24 at 22:33, Fedor Pchelkin wrote:
> The QXL driver doesn't use any device for DMA mappings or allocations, so
> it passes a NULL dev to ttm_device_init(). dev_to_node() then dereferences
> that NULL pointer and panics on NUMA systems:
>
> general protection fault, probably for non-canonical address 0xdffffc000000007a: 0000 [#1] PREEMPT SMP KASAN NOPTI
> KASAN: null-ptr-deref in range [0x00000000000003d0-0x00000000000003d7]
> CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.7.0+ #9
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
> RIP: 0010:ttm_device_init+0x10e/0x340
> Call Trace:
> <TASK>
> qxl_ttm_init+0xaa/0x310
> qxl_device_init+0x1071/0x2000
> qxl_pci_probe+0x167/0x3f0
> local_pci_probe+0xe1/0x1b0
> pci_device_probe+0x29d/0x790
> really_probe+0x251/0x910
> __driver_probe_device+0x1ea/0x390
> driver_probe_device+0x4e/0x2e0
> __driver_attach+0x1e3/0x600
> bus_for_each_dev+0x12d/0x1c0
> bus_add_driver+0x25a/0x590
> driver_register+0x15c/0x4b0
> qxl_pci_driver_init+0x67/0x80
> do_one_initcall+0xf5/0x5d0
> kernel_init_freeable+0x637/0xb10
> kernel_init+0x1c/0x2e0
> ret_from_fork+0x48/0x80
> ret_from_fork_asm+0x1b/0x30
> </TASK>
> Modules linked in:
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:ttm_device_init+0x10e/0x340
>
> Fall back to NUMA_NO_NODE if there is no device for DMA.
>
> Found by Linux Verification Center (linuxtesting.org).
>
> Fixes: b0a7ce53d494 ("drm/ttm: Schedule delayed_delete worker closer")
> Signed-off-by: Fedor Pchelkin <[email protected]>

Oh, thanks for that fix.

Reviewed-by: Christian König <[email protected]>

Going to push that into -fixes in a minute.

Regards,
Christian.

> ---
> drivers/gpu/drm/ttm/ttm_device.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index f5187b384ae9..4130945052ed 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -195,7 +195,7 @@ int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *func
>  		    bool use_dma_alloc, bool use_dma32)
>  {
>  	struct ttm_global *glob = &ttm_glob;
> -	int ret;
> +	int ret, nid;
>
>  	if (WARN_ON(vma_manager == NULL))
>  		return -EINVAL;
> @@ -215,7 +215,12 @@ int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *func
>
>  	ttm_sys_man_init(bdev);
>
> -	ttm_pool_init(&bdev->pool, dev, dev_to_node(dev), use_dma_alloc, use_dma32);
> +	if (dev)
> +		nid = dev_to_node(dev);
> +	else
> +		nid = NUMA_NO_NODE;
> +
> +	ttm_pool_init(&bdev->pool, dev, nid, use_dma_alloc, use_dma32);
>
>  	bdev->vma_manager = vma_manager;
>  	spin_lock_init(&bdev->lru_lock);