2022-10-25 03:11:46

by Chen Zhongjin

Subject: [PATCH] RDMA/core: Fix null-ptr-deref in ib_core_cleanup()

KASAN reported a null-ptr-deref error:

KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
CPU: 1 PID: 379
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
RIP: 0010:destroy_workqueue+0x2f/0x740
RSP: 0018:ffff888016137df8 EFLAGS: 00000202
...
Call Trace:
<TASK>
ib_core_cleanup+0xa/0xa1 [ib_core]
__do_sys_delete_module.constprop.0+0x34f/0x5b0
do_syscall_64+0x3a/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fa1a0d221b7
...
</TASK>

This happens because the failure of roce_gid_mgmt_init() is ignored:

ib_core_init()
  roce_gid_mgmt_init()
    gid_cache_wq = alloc_ordered_workqueue # fail
...
ib_core_cleanup()
  roce_gid_mgmt_cleanup()
    destroy_workqueue(gid_cache_wq)
    # destroy an unallocated wq

Fix this by checking the return value of roce_gid_mgmt_init() in
ib_core_init() and unwinding the earlier setup steps on failure.
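
For illustration only, the failure mode described above can be modelled
in userspace. This is a minimal sketch, not the kernel code: every
model_* name is a hypothetical stand-in (model_alloc_wq() for
alloc_ordered_workqueue(), model_gid_mgmt_init() for
roce_gid_mgmt_init(), and so on), the -ENOMEM return value is an
assumption of the model, and the NULL dereference in destroy_workqueue()
is represented by an assert().

```c
/* Userspace model only: all model_* names are hypothetical stand-ins. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct model_wq { int pending; };

static struct model_wq *model_gid_cache_wq;

/* Stand-in for alloc_ordered_workqueue(); returns NULL when told to fail. */
static struct model_wq *model_alloc_wq(int simulate_failure)
{
	if (simulate_failure)
		return NULL;
	return calloc(1, sizeof(struct model_wq));
}

/* Stand-in for roce_gid_mgmt_init(): reports allocation failure. */
static int model_gid_mgmt_init(int simulate_failure)
{
	model_gid_cache_wq = model_alloc_wq(simulate_failure);
	if (!model_gid_cache_wq)
		return -ENOMEM;	/* assumed error code for this model */
	return 0;
}

/*
 * Stand-in for roce_gid_mgmt_cleanup(). Like destroy_workqueue(), it is
 * only valid on an allocated queue; the assert models the crash.
 */
static void model_gid_mgmt_cleanup(void)
{
	assert(model_gid_cache_wq != NULL);
	free(model_gid_cache_wq);
	model_gid_cache_wq = NULL;
}

/*
 * Stand-in for the fixed ib_core_init(): the error is propagated, so the
 * module load fails and the cleanup path is never reached with a NULL wq.
 */
static int model_core_init(int simulate_failure)
{
	int ret = model_gid_mgmt_init(simulate_failure);

	if (ret)
		return ret;
	return 0;
}
```

In this model, model_core_init(1) returns an error and leaves
model_gid_cache_wq as NULL, so a caller that respects the return value
never calls model_gid_mgmt_cleanup() on an unallocated queue.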

Fixes: 03db3a2d81e6 ("IB/core: Add RoCE GID table management")
Signed-off-by: Chen Zhongjin <[email protected]>
---
drivers/infiniband/core/device.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index ae60c73babcc..b45431e6817b 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -2815,10 +2815,18 @@ static int __init ib_core_init(void)

 	nldev_init();
 	rdma_nl_register(RDMA_NL_LS, ibnl_ls_cb_table);
-	roce_gid_mgmt_init();
+	ret = roce_gid_mgmt_init();
+	if (ret) {
+		pr_warn("Couldn't init RoCE GID management\n");
+		goto err_parent;
+	}
 
 	return 0;
 
+err_parent:
+	nldev_exit();
+	rdma_nl_unregister(RDMA_NL_LS);
+	unregister_pernet_device(&rdma_dev_net_ops);
 err_compat:
 	unregister_blocking_lsm_notifier(&ibdev_lsm_nb);
 err_sa:
--
2.17.1


2022-10-25 07:36:16

by Leon Romanovsky

Subject: Re: [PATCH] RDMA/core: Fix null-ptr-deref in ib_core_cleanup()

On Tue, Oct 25, 2022 at 10:41:46AM +0800, Chen Zhongjin wrote:
> KASAN reported a null-ptr-deref error:
>
> KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
> CPU: 1 PID: 379
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
> RIP: 0010:destroy_workqueue+0x2f/0x740
> RSP: 0018:ffff888016137df8 EFLAGS: 00000202
> ...
> Call Trace:
> <TASK>
> ib_core_cleanup+0xa/0xa1 [ib_core]
> __do_sys_delete_module.constprop.0+0x34f/0x5b0
> do_syscall_64+0x3a/0x90
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7fa1a0d221b7
> ...
> </TASK>
>
> This happens because the failure of roce_gid_mgmt_init() is ignored:
>
> ib_core_init()
>   roce_gid_mgmt_init()
>     gid_cache_wq = alloc_ordered_workqueue # fail
> ...
> ib_core_cleanup()
>   roce_gid_mgmt_cleanup()
>     destroy_workqueue(gid_cache_wq)
>     # destroy an unallocated wq
>
> Fix this by checking the return value of roce_gid_mgmt_init() in
> ib_core_init() and unwinding the earlier setup steps on failure.
>
> Fixes: 03db3a2d81e6 ("IB/core: Add RoCE GID table management")
> Signed-off-by: Chen Zhongjin <[email protected]>
> ---
> drivers/infiniband/core/device.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> index ae60c73babcc..b45431e6817b 100644
> --- a/drivers/infiniband/core/device.c
> +++ b/drivers/infiniband/core/device.c
> @@ -2815,10 +2815,18 @@ static int __init ib_core_init(void)
>
>  	nldev_init();
>  	rdma_nl_register(RDMA_NL_LS, ibnl_ls_cb_table);
> -	roce_gid_mgmt_init();
> +	ret = roce_gid_mgmt_init();
> +	if (ret) {
> +		pr_warn("Couldn't init RoCE GID management\n");
> +		goto err_parent;
> +	}
>
>  	return 0;
>
> +err_parent:
> +	nldev_exit();
> +	rdma_nl_unregister(RDMA_NL_LS);
> +	unregister_pernet_device(&rdma_dev_net_ops);

I fixed the release flow and applied it to -rc.

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index b45431e6817b..b69e2c4e4d2a 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -2824,8 +2824,8 @@ static int __init ib_core_init(void)
 	return 0;
 
 err_parent:
-	nldev_exit();
 	rdma_nl_unregister(RDMA_NL_LS);
+	nldev_exit();
 	unregister_pernet_device(&rdma_dev_net_ops);
 err_compat:
 	unregister_blocking_lsm_notifier(&ibdev_lsm_nb);

Thanks
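
The reordering above follows the usual goto-unwind convention: teardown
mirrors setup in reverse. A minimal userspace sketch of that ordering is
given below. The unwind_* names are hypothetical, the steps are only
logged rather than performed, and the position of the pernet
registration before nldev_init() is an assumption inferred from the
err_parent label falling through to err_compat.

```c
/* Userspace model only: unwind_* names are hypothetical, steps are logged. */
#include <assert.h>
#include <string.h>

static char unwind_log[64];

/* Append one step tag to the log, checking that it fits. */
static void unwind_note(const char *tag)
{
	assert(strlen(unwind_log) + strlen(tag) < sizeof(unwind_log));
	strcat(unwind_log, tag);
}

/*
 * Models the tail of an init function: three setup steps, then a final
 * step that may fail. On failure, the error label undoes the earlier
 * steps in the reverse of the order in which they were set up.
 */
static int unwind_init(int fail_last_step)
{
	unwind_log[0] = '\0';

	unwind_note("+pernet ");	/* assumed to come first */
	unwind_note("+nldev ");		/* models nldev_init() */
	unwind_note("+nl_ls ");		/* models rdma_nl_register() */

	if (fail_last_step)
		goto err_parent;

	return 0;

err_parent:
	unwind_note("-nl_ls ");		/* models rdma_nl_unregister() */
	unwind_note("-nldev ");		/* models nldev_exit() */
	unwind_note("-pernet ");	/* models unregister_pernet_device() */
	return -1;
}
```

After unwind_init(1), unwind_log lists the three setup tags followed by
the same three tags undone last-to-first, which is the order the
corrected err_parent label uses.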


>  err_compat:
>  	unregister_blocking_lsm_notifier(&ibdev_lsm_nb);
>  err_sa:
> --
> 2.17.1
>

2022-10-25 08:09:28

by Leon Romanovsky

Subject: Re: [PATCH] RDMA/core: Fix null-ptr-deref in ib_core_cleanup()

On Tue, 25 Oct 2022 10:41:46 +0800, Chen Zhongjin wrote:
> KASAN reported a null-ptr-deref error:
>
> KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
> CPU: 1 PID: 379
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
> RIP: 0010:destroy_workqueue+0x2f/0x740
> RSP: 0018:ffff888016137df8 EFLAGS: 00000202
> ...
> Call Trace:
> <TASK>
> ib_core_cleanup+0xa/0xa1 [ib_core]
> __do_sys_delete_module.constprop.0+0x34f/0x5b0
> do_syscall_64+0x3a/0x90
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7fa1a0d221b7
> ...
> </TASK>
>
> [...]

Applied, thanks!

[1/1] RDMA/core: Fix null-ptr-deref in ib_core_cleanup()
https://git.kernel.org/rdma/rdma/c/ad9394a3da3399

Best regards,
--
Leon Romanovsky <[email protected]>