2024-01-24 10:21:47

by Vasiliy Kovalev

[permalink] [raw]
Subject: [PATCH 1/1] gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp()

From: Vasiliy Kovalev <[email protected]>

After unloading the module, an instance continues to exist that accesses
outdated memory addresses.

To prevent this, the dump_pdp_en flag has been added, which blocks the
dump of pdp contexts by a false value. And only after these checks can
the net_generic() function be called.

These errors were found using the syzkaller program:

Syzkaller hit 'general protection fault in gtp_genl_dump_pdp' bug.
gtp: GTP module loaded (pdp ctx size 104 bytes)
gtp: GTP module unloaded
general protection fault, probably for non-canonical address
0xdffffc0000000001:0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 0 PID: 2782 Comm: syz-executor139 Not tainted 5.10.200-std-def-alt1 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-alt1
RIP: 0010:gtp_genl_dump_pdp+0x1b1/0x790 [gtp]
..
Call Trace:
genl_lock_dumpit+0x6b/0xa0 net/netlink/genetlink.c:623
netlink_dump+0x575/0xc70 net/netlink/af_netlink.c:2271
__netlink_dump_start+0x64e/0x910 net/netlink/af_netlink.c:2376
genl_family_rcv_msg_dumpit+0x2b8/0x310 net/netlink/genetlink.c:686
genl_family_rcv_msg net/netlink/genetlink.c:780 [inline]
genl_rcv_msg+0x450/0x5a0 net/netlink/genetlink.c:800
netlink_rcv_skb+0x150/0x440 net/netlink/af_netlink.c:2497
genl_rcv+0x29/0x40 net/netlink/genetlink.c:811
netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline]
netlink_unicast+0x54e/0x800 net/netlink/af_netlink.c:1348
netlink_sendmsg+0x914/0xe00 net/netlink/af_netlink.c:1916
sock_sendmsg_nosec net/socket.c:651 [inline]
__sock_sendmsg+0x159/0x190 net/socket.c:663
____sys_sendmsg+0x712/0x870 net/socket.c:2376
___sys_sendmsg+0xf8/0x170 net/socket.c:2430
__sys_sendmsg+0xea/0x1b0 net/socket.c:2459
do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x62/0xc7
RIP: 0033:0x7f2ea16c2d49

Fixes: 94a6d9fb88df ("gtp: fix wrong condition in gtp_genl_dump_pdp()")
Signed-off-by: Vasiliy Kovalev <[email protected]>
---
drivers/net/gtp.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index 477b4d4f860bd3..3fc4639711cd83 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -1675,6 +1675,8 @@ static int gtp_genl_get_pdp(struct sk_buff *skb, struct genl_info *info)
return err;
}

+static bool dump_pdp_en;
+
static int gtp_genl_dump_pdp(struct sk_buff *skb,
struct netlink_callback *cb)
{
@@ -1684,12 +1686,19 @@ static int gtp_genl_dump_pdp(struct sk_buff *skb,
struct pdp_ctx *pctx;
struct gtp_net *gn;

- gn = net_generic(net, gtp_net_id);
-
- if (cb->args[4])
+ /* Do not allow further operations if the module is
+ * unloaded before or after the process is blocked.
+ */
+ if (!dump_pdp_en)
return 0;

rcu_read_lock();
+ if (!dump_pdp_en || cb->args[4]) {
+ rcu_read_unlock();
+ return 0;
+ }
+ gn = net_generic(net, gtp_net_id);
+
list_for_each_entry_rcu(gtp, &gn->gtp_dev_list, list) {
if (last_gtp && last_gtp != gtp)
continue;
@@ -1914,6 +1923,8 @@ static int __init gtp_init(void)
if (err < 0)
goto unreg_genl_family;

+ dump_pdp_en = true;
+
pr_info("GTP module loaded (pdp ctx size %zd bytes)\n",
sizeof(struct pdp_ctx));
return 0;
@@ -1930,6 +1941,7 @@ late_initcall(gtp_init);

static void __exit gtp_fini(void)
{
+ dump_pdp_en = false;
genl_unregister_family(&gtp_genl_family);
rtnl_link_unregister(&gtp_link_ops);
unregister_pernet_subsys(&gtp_net_ops);
--
2.33.8



2024-01-24 10:45:45

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH 1/1] gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp()

On Wed, Jan 24, 2024 at 11:14 AM <[email protected]> wrote:
>
> From: Vasiliy Kovalev <[email protected]>
>
> After unloading the module, an instance continues to exist that accesses
> outdated memory addresses.
>
> To prevent this, the dump_pdp_en flag has been added, which blocks the
> dump of pdp contexts by a false value. And only after these checks can
> the net_generic() function be called.
>
> These errors were found using the syzkaller program:
>
> Syzkaller hit 'general protection fault in gtp_genl_dump_pdp' bug.
> gtp: GTP module loaded (pdp ctx size 104 bytes)
> gtp: GTP module unloaded
> general protection fault, probably for non-canonical address
> 0xdffffc0000000001:0000 [#1] SMP KASAN NOPTI
> KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
> CPU: 0 PID: 2782 Comm: syz-executor139 Not tainted 5.10.200-std-def-alt1 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-alt1
> RIP: 0010:gtp_genl_dump_pdp+0x1b1/0x790 [gtp]
> ...
> Call Trace:
> genl_lock_dumpit+0x6b/0xa0 net/netlink/genetlink.c:623
> netlink_dump+0x575/0xc70 net/netlink/af_netlink.c:2271
> __netlink_dump_start+0x64e/0x910 net/netlink/af_netlink.c:2376
> genl_family_rcv_msg_dumpit+0x2b8/0x310 net/netlink/genetlink.c:686
> genl_family_rcv_msg net/netlink/genetlink.c:780 [inline]
> genl_rcv_msg+0x450/0x5a0 net/netlink/genetlink.c:800
> netlink_rcv_skb+0x150/0x440 net/netlink/af_netlink.c:2497
> genl_rcv+0x29/0x40 net/netlink/genetlink.c:811
> netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline]
> netlink_unicast+0x54e/0x800 net/netlink/af_netlink.c:1348
> netlink_sendmsg+0x914/0xe00 net/netlink/af_netlink.c:1916
> sock_sendmsg_nosec net/socket.c:651 [inline]
> __sock_sendmsg+0x159/0x190 net/socket.c:663
> ____sys_sendmsg+0x712/0x870 net/socket.c:2376
> ___sys_sendmsg+0xf8/0x170 net/socket.c:2430
> __sys_sendmsg+0xea/0x1b0 net/socket.c:2459
> do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
> entry_SYSCALL_64_after_hwframe+0x62/0xc7
> RIP: 0033:0x7f2ea16c2d49
>
> Fixes: 94a6d9fb88df ("gtp: fix wrong condition in gtp_genl_dump_pdp()")
> Signed-off-by: Vasiliy Kovalev <[email protected]>
> ---
> drivers/net/gtp.c | 18 +++++++++++++++---
> 1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
> index 477b4d4f860bd3..3fc4639711cd83 100644
> --- a/drivers/net/gtp.c
> +++ b/drivers/net/gtp.c
> @@ -1675,6 +1675,8 @@ static int gtp_genl_get_pdp(struct sk_buff *skb, struct genl_info *info)
> return err;
> }
>
> +static bool dump_pdp_en;
> +

Hmm, it seems there is a missing try_module_get() somewhere...

__netlink_dump_start() does one, so perhaps we reach __netlink_dump_start()
with a NULL in control->module ?

2024-01-24 10:58:13

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH 1/1] gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp()

On Wed, Jan 24, 2024 at 11:14 AM <[email protected]> wrote:
>
> From: Vasiliy Kovalev <[email protected]>
>
> After unloading the module, an instance continues to exist that accesses
> outdated memory addresses.
>
> To prevent this, the dump_pdp_en flag has been added, which blocks the
> dump of pdp contexts by a false value. And only after these checks can
> the net_generic() function be called.
>
> These errors were found using the syzkaller program:
>
> Syzkaller hit 'general protection fault in gtp_genl_dump_pdp' bug.
> gtp: GTP module loaded (pdp ctx size 104 bytes)
> gtp: GTP module unloaded
> general protection fault, probably for non-canonical address
> 0xdffffc0000000001:0000 [#1] SMP KASAN NOPTI
> KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
> CPU: 0 PID: 2782 Comm: syz-executor139 Not tainted 5.10.200-std-def-alt1 #1

Oh wait, this is a 5.10 kernel ?

Please generate a stack trace using a recent tree, it is possible the
bug has been fixed already.

2024-01-24 11:20:33

by Vasiliy Kovalev

[permalink] [raw]
Subject: Re: [PATCH 1/1] gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp()

24.01.2024 13:57, Eric Dumazet wrote:
> Oh wait, this is a 5.10 kernel ?
Yes, but the bug is reproduced on the latest stable kernels.
> Please generate a stack trace using a recent tree, it is possible the
> bug has been fixed already.

See [PATCH 0/1] above, there's a stack for the 6.6.13 kernel at the
bottom of the message.

[ 523.915255] Call Trace:
[ 523.915255] <TASK>
[ 523.915255] ? __die+0x1f/0x70
[ 523.915255] ? page_fault_oops+0x14d/0x4a0
[ 523.915255] ? exc_page_fault+0x7b/0x180
[ 523.915255] ? asm_exc_page_fault+0x22/0x30
[ 523.915255] ? gtp_genl_dump_pdp+0x82/0x190 [gtp]
[ 523.915255] ? gtp_genl_dump_pdp+0x82/0x190 [gtp]
[ 523.915255] genl_dumpit+0x2f/0x90
[ 523.915255] netlink_dump+0x126/0x320
[ 523.915255] __netlink_dump_start+0x1da/0x2a0
[ 523.915255] genl_family_rcv_msg_dumpit+0x93/0x100
[ 523.915255] ? __pfx_genl_start+0x10/0x10
[ 523.915255] ? __pfx_genl_dumpit+0x10/0x10
[ 523.915255] ? __pfx_genl_done+0x10/0x10
[ 523.915255] genl_rcv_msg+0x112/0x2a0
[ 523.915255] ? __pfx_gtp_genl_dump_pdp+0x10/0x10 [gtp]
[ 523.915255] ? __pfx_genl_rcv_msg+0x10/0x10
[ 523.915255] netlink_rcv_skb+0x54/0x110
[ 523.915255] genl_rcv+0x24/0x40
[ 523.915255] netlink_unicast+0x19f/0x290
[ 523.915255] netlink_sendmsg+0x250/0x4e0
[ 523.915255] ____sys_sendmsg+0x376/0x3b0
[ 523.915255] ? copy_msghdr_from_user+0x6d/0xb0
[ 523.915255] ___sys_sendmsg+0x86/0xe0
[ 523.915255] ? do_fault+0x296/0x470
[ 523.915255] ? __handle_mm_fault+0x771/0xda0
[ 523.915255] __sys_sendmsg+0x57/0xb0
[ 523.915255] do_syscall_64+0x59/0x90
[ 523.915255] ? ct_kernel_exit.isra.0+0x71/0x90
[ 523.915255] ? __ct_user_enter+0x5a/0xd0
[ 523.915255] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 523.915255] RIP: 0033:0x7f2bcb93cd49

--
Regards,
Vasiliy Kovalev


2024-01-24 11:59:07

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH 1/1] gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp()

On Wed, Jan 24, 2024 at 12:20 PM <[email protected]> wrote:
>
> 24.01.2024 13:57, Eric Dumazet wrote:
> > Oh wait, this is a 5.10 kernel ?
> Yes, but the bug is reproduced on the latest stable kernels.
> > Please generate a stack trace using a recent tree, it is possible the
> > bug has been fixed already.
>
> See [PATCH 0/1] above, there's a stack for the 6.6.13 kernel at the
> bottom of the message.

Ah, ok. Not sure why you sent a cover letter for a single patch...

Setting a boolean, in a module that can disappear will not prevent the
module from disappearing.

This work around might work, or might not work, depending on timing,
preemptions, ....

Thanks.

2024-01-29 12:08:47

by Pablo Neira Ayuso

[permalink] [raw]
Subject: Re: [PATCH 1/1] gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp()

Hi,

On Wed, Jan 24, 2024 at 02:20:04PM +0300, [email protected] wrote:
> 24.01.2024 13:57, Eric Dumazet wrote:
> > Oh wait, this is a 5.10 kernel ?
>
> Yes, but the bug is reproduced on the latest stable kernels.
>
> > Please generate a stack trace using a recent tree, it is possible the
> > bug has been fixed already.

__netlink_dump_start() is called at the beginning of the dump, which is
grabbing a reference on this module.

do you have a reproducer?

> See [PATCH 0/1] above, there's a stack for the 6.6.13 kernel at the bottom
> of the message.
>
> [ 523.915255] Call Trace:
> [ 523.915255] <TASK>
> [ 523.915255] ? __die+0x1f/0x70
> [ 523.915255] ? page_fault_oops+0x14d/0x4a0
> [ 523.915255] ? exc_page_fault+0x7b/0x180
> [ 523.915255] ? asm_exc_page_fault+0x22/0x30
> [ 523.915255] ? gtp_genl_dump_pdp+0x82/0x190 [gtp]
> [ 523.915255] ? gtp_genl_dump_pdp+0x82/0x190 [gtp]
> [ 523.915255] genl_dumpit+0x2f/0x90
> [ 523.915255] netlink_dump+0x126/0x320
> [ 523.915255] __netlink_dump_start+0x1da/0x2a0
> [ 523.915255] genl_family_rcv_msg_dumpit+0x93/0x100
> [ 523.915255] ? __pfx_genl_start+0x10/0x10
> [ 523.915255] ? __pfx_genl_dumpit+0x10/0x10
> [ 523.915255] ? __pfx_genl_done+0x10/0x10
> [ 523.915255] genl_rcv_msg+0x112/0x2a0
> [ 523.915255] ? __pfx_gtp_genl_dump_pdp+0x10/0x10 [gtp]
> [ 523.915255] ? __pfx_genl_rcv_msg+0x10/0x10
> [ 523.915255] netlink_rcv_skb+0x54/0x110
> [ 523.915255] genl_rcv+0x24/0x40
> [ 523.915255] netlink_unicast+0x19f/0x290
> [ 523.915255] netlink_sendmsg+0x250/0x4e0
> [ 523.915255] ____sys_sendmsg+0x376/0x3b0
> [ 523.915255] ? copy_msghdr_from_user+0x6d/0xb0
> [ 523.915255] ___sys_sendmsg+0x86/0xe0
> [ 523.915255] ? do_fault+0x296/0x470
> [ 523.915255] ? __handle_mm_fault+0x771/0xda0
> [ 523.915255] __sys_sendmsg+0x57/0xb0
> [ 523.915255] do_syscall_64+0x59/0x90
> [ 523.915255] ? ct_kernel_exit.isra.0+0x71/0x90
> [ 523.915255] ? __ct_user_enter+0x5a/0xd0
> [ 523.915255] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> [ 523.915255] RIP: 0033:0x7f2bcb93cd49
>
> --
> Regards,
> Vasiliy Kovalev
>

2024-01-29 17:13:18

by Vasiliy Kovalev

[permalink] [raw]
Subject: Re: [PATCH 1/1] gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp()

29.01.2024 15:02, Pablo Neira Ayuso wrote:
> __netlink_dump_start() is called at the beginning of the dump, which is
> grabbing a reference on this module.
>
> do you have a reproducer?
Hi, yes, in the cover letter [PATCH 0/1]
https://lore.kernel.org/all/[email protected]/

--
Regards,
Vasiliy Kovalev


2024-02-09 18:16:33

by Vasiliy Kovalev

[permalink] [raw]
Subject: Re: [PATCH 1/1] gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp()

Hi,

24.01.2024 14:52, Eric Dumazet wrote:
> On Wed, Jan 24, 2024 at 12:20 PM <[email protected]> wrote:
>> 24.01.2024 13:57, Eric Dumazet wrote:
>>> Oh wait, this is a 5.10 kernel ?
>> Yes, but the bug is reproduced on the latest stable kernels.
>>> Please generate a stack trace using a recent tree, it is possible the
>>> bug has been fixed already.
>> See [PATCH 0/1] above, there's a stack for the 6.6.13 kernel at the
>> bottom of the message.
> Ah, ok. Not sure why you sent a cover letter for a single patch...
>
> Setting a boolean, in a module that can disappear will not prevent the
> module from disappearing.
>
> This work around might work, or might not work, depending on timing,
> preemptions, ....
>
> Thanks.

I tested running the reproducer [1] on the 6.8-rc3 kernel, the crash
occurs in less than 10 seconds and the qemu VM restarts:

dmesg -w:

[  106.941736] gtp: GTP module unloaded
[  106.962548] gtp: GTP module loaded (pdp ctx size 104 bytes)
[  107.014691] gtp: GTP module unloaded
[  107.041554] gtp: GTP module loaded (pdp ctx size 104 bytes)
[  107.082283] gtp: GTP module unloaded
[  107.123268] general protection fault, probably for non-canonical
address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN NOPTI
[  107.124050] KASAN: null-ptr-deref in range
[0x0000000000000010-0x0000000000000017]
[  107.124339] CPU: 1 PID: 5826 Comm: gtp Not tainted
6.8.0-rc3-std-def-alt1 #1
[  107.124604] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
1.16.0-alt1 04/01/2014
[  107.124916] RIP: 0010:gtp_genl_dump_pdp+0x1be/0x800 [gtp]
[  107.125141] Code: c6 89 c6 e8 64 e9 86 df 58 45 85 f6 0f 85 4e 04 00
00 e8 c5 ee 86 df 48 8b 54 24 18 48 b8 00 00 00 00 00 fc ff df 48 c1 ea
03 <80> 3c 02 00 0f 85 de 05 00 00 48 8b 44 24 18 4c 8b 30 4c 39 f0 74
[  107.125960] RSP: 0018:ffff888014107220 EFLAGS: 00010202
[  107.126164] RAX: dffffc0000000000 RBX: 0000000000000000 RCX:
0000000000000000
[  107.126434] RDX: 0000000000000002 RSI: 0000000000000000 RDI:
0000000000000000
[  107.126707] RBP: 0000000000000000 R08: 0000000000000000 R09:
0000000000000000
[  107.126976] R10: 0000000000000000 R11: 0000000000000000 R12:
0000000000000000
[  107.127245] R13: ffff88800fcda588 R14: 0000000000000001 R15:
0000000000000000
[  107.127515] FS:  00007f1be4eb05c0(0000) GS:ffff88806ce80000(0000)
knlGS:0000000000000000
[  107.127955] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  107.128177] CR2: 00007f1be4e766cf CR3: 000000000c33e000 CR4:
0000000000750ef0
[  107.128450] PKRU: 55555554
[  107.128577] Call Trace:
[  107.128699]  <TASK>
[  107.128790]  ? show_regs+0x90/0xa0
[  107.128935]  ? die_addr+0x50/0xd0
[  107.129075]  ? exc_general_protection+0x148/0x220
[  107.129267]  ? asm_exc_general_protection+0x22/0x30
[  107.129469]  ? gtp_genl_dump_pdp+0x1be/0x800 [gtp]
[  107.129677]  ? __alloc_skb+0x1dd/0x350
[  107.129831]  ? __pfx___alloc_skb+0x10/0x10
[  107.129999]  genl_dumpit+0x11d/0x230
[  107.130150]  netlink_dump+0x5b9/0xce0
[  107.130301]  ? lockdep_hardirqs_on_prepare+0x253/0x430
[  107.130503]  ? __pfx_netlink_dump+0x10/0x10
[  107.130686]  ? kasan_save_track+0x10/0x40
[  107.130849]  ? __kasan_kmalloc+0x9b/0xa0
[  107.131009]  ? genl_start+0x675/0x970
[  107.131162]  __netlink_dump_start+0x6fc/0x9f0
[  107.131341]  genl_family_rcv_msg_dumpit+0x1bb/0x2d0
[  107.131538]  ? __pfx_genl_family_rcv_msg_dumpit+0x10/0x10
[  107.131754]  ? genl_op_from_small+0x2a/0x440
[  107.131972]  ? cap_capable+0x1d0/0x240
[  107.132127]  ? __pfx_genl_start+0x10/0x10
[  107.132292]  ? __pfx_genl_dumpit+0x10/0x10
[  107.132461]  ? __pfx_genl_done+0x10/0x10
[  107.132645]  ? security_capable+0x9d/0xe0

With the proposed patch applied, such a crash is not observed during
long-term testing.

[1]
https://lore.kernel.org/lkml/[email protected]/T/#mf9b411baec52858b1c9118c671f26a6dc424e7b4

--
Regards,
Vasiliy Kovalev


2024-02-09 19:21:35

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH 1/1] gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp()

On Fri, Feb 9, 2024 at 7:16 PM <[email protected]> wrote:
>
> Hi,
>
> 24.01.2024 14:52, Eric Dumazet wrote:
> > On Wed, Jan 24, 2024 at 12:20 PM <[email protected]> wrote:
> >> 24.01.2024 13:57, Eric Dumazet wrote:
> >>> Oh wait, this is a 5.10 kernel ?
> >> Yes, but the bug is reproduced on the latest stable kernels.
> >>> Please generate a stack trace using a recent tree, it is possible the
> >>> bug has been fixed already.
> >> See [PATCH 0/1] above, there's a stack for the 6.6.13 kernel at the
> >> bottom of the message.
> > Ah, ok. Not sure why you sent a cover letter for a single patch...
> >
> > Setting a boolean, in a module that can disappear will not prevent the
> > module from disappearing.
> >
> > This work around might work, or might not work, depending on timing,
> > preemptions, ....
> >
> > Thanks.
>
> I tested running the reproducer [1] on the 6.8-rc3 kernel, the crash
> occurs in less than 10 seconds and the qemu VM restarts:
>
> dmesg -w:
>
> [ 106.941736] gtp: GTP module unloaded
> [ 106.962548] gtp: GTP module loaded (pdp ctx size 104 bytes)
> [ 107.014691] gtp: GTP module unloaded
> [ 107.041554] gtp: GTP module loaded (pdp ctx size 104 bytes)
> [ 107.082283] gtp: GTP module unloaded
> [ 107.123268] general protection fault, probably for non-canonical
> address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN NOPTI
> [ 107.124050] KASAN: null-ptr-deref in range
> [0x0000000000000010-0x0000000000000017]
> [ 107.124339] CPU: 1 PID: 5826 Comm: gtp Not tainted
> 6.8.0-rc3-std-def-alt1 #1
> [ 107.124604] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> 1.16.0-alt1 04/01/2014
> [ 107.124916] RIP: 0010:gtp_genl_dump_pdp+0x1be/0x800 [gtp]
> [ 107.125141] Code: c6 89 c6 e8 64 e9 86 df 58 45 85 f6 0f 85 4e 04 00
> 00 e8 c5 ee 86 df 48 8b 54 24 18 48 b8 00 00 00 00 00 fc ff df 48 c1 ea
> 03 <80> 3c 02 00 0f 85 de 05 00 00 48 8b 44 24 18 4c 8b 30 4c 39 f0 74
> [ 107.125960] RSP: 0018:ffff888014107220 EFLAGS: 00010202
> [ 107.126164] RAX: dffffc0000000000 RBX: 0000000000000000 RCX:
> 0000000000000000
> [ 107.126434] RDX: 0000000000000002 RSI: 0000000000000000 RDI:
> 0000000000000000
> [ 107.126707] RBP: 0000000000000000 R08: 0000000000000000 R09:
> 0000000000000000
> [ 107.126976] R10: 0000000000000000 R11: 0000000000000000 R12:
> 0000000000000000
> [ 107.127245] R13: ffff88800fcda588 R14: 0000000000000001 R15:
> 0000000000000000
> [ 107.127515] FS: 00007f1be4eb05c0(0000) GS:ffff88806ce80000(0000)
> knlGS:0000000000000000
> [ 107.127955] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 107.128177] CR2: 00007f1be4e766cf CR3: 000000000c33e000 CR4:
> 0000000000750ef0
> [ 107.128450] PKRU: 55555554
> [ 107.128577] Call Trace:
> [ 107.128699] <TASK>
> [ 107.128790] ? show_regs+0x90/0xa0
> [ 107.128935] ? die_addr+0x50/0xd0
> [ 107.129075] ? exc_general_protection+0x148/0x220
> [ 107.129267] ? asm_exc_general_protection+0x22/0x30
> [ 107.129469] ? gtp_genl_dump_pdp+0x1be/0x800 [gtp]
> [ 107.129677] ? __alloc_skb+0x1dd/0x350
> [ 107.129831] ? __pfx___alloc_skb+0x10/0x10
> [ 107.129999] genl_dumpit+0x11d/0x230
> [ 107.130150] netlink_dump+0x5b9/0xce0
> [ 107.130301] ? lockdep_hardirqs_on_prepare+0x253/0x430
> [ 107.130503] ? __pfx_netlink_dump+0x10/0x10
> [ 107.130686] ? kasan_save_track+0x10/0x40
> [ 107.130849] ? __kasan_kmalloc+0x9b/0xa0
> [ 107.131009] ? genl_start+0x675/0x970
> [ 107.131162] __netlink_dump_start+0x6fc/0x9f0
> [ 107.131341] genl_family_rcv_msg_dumpit+0x1bb/0x2d0
> [ 107.131538] ? __pfx_genl_family_rcv_msg_dumpit+0x10/0x10
> [ 107.131754] ? genl_op_from_small+0x2a/0x440
> [ 107.131972] ? cap_capable+0x1d0/0x240
> [ 107.132127] ? __pfx_genl_start+0x10/0x10
> [ 107.132292] ? __pfx_genl_dumpit+0x10/0x10
> [ 107.132461] ? __pfx_genl_done+0x10/0x10
> [ 107.132645] ? security_capable+0x9d/0xe0
>
> With the proposed patch applied, such a crash is not observed during
> long-term testing.

Maybe, but the patch is not good, I think I and Pablo gave feedback on this ?

Please trace __netlink_dump_start() content of control->module

gtp_genl_family.module should be set, and we should get it.

Otherwise, if the bug is in the core, we would need a dozen of 'work
arounds because it is better than nothing'

Thank you.

2024-02-14 16:51:11

by Vasiliy Kovalev

[permalink] [raw]
Subject: Re: [PATCH 1/1] gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp()

09.02.2024 22:21, Eric Dumazet wrote:

> Maybe, but the patch is not good, I think I and Pablo gave feedback on this ?
>
> Please trace __netlink_dump_start() content of control->module
>
> gtp_genl_family.module should be set, and we should get it.
>
> Otherwise, if the bug is in the core, we would need a dozen of 'work
> arounds because it is better than nothing'
>
> Thank you.

Thanks.

I tracked the moment when the __netlink_dump_start() function was
called, it turned out that in the gtp_init() initialization function
before registering pernet subsystem (gtp_net_ops), therefore, outdated
data is used, which leads to a crash.

The documentation says that ops structure must be assigned before
registering a generic netlink family [1].

I have fixed and sent a new patch [2].

[1]
https://elixir.bootlin.com/linux/v6.8-rc4/source/net/netlink/genetlink.c#L773

[2]
https://lore.kernel.org/netdev/[email protected]/T/#u

--
Regards,
Vasiliy Kovalev


2024-02-15 20:33:11

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH 1/1] gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp()

On Wed, Feb 14, 2024 at 5:50 PM <[email protected]> wrote:
>
> 09.02.2024 22:21, Eric Dumazet wrote:
>
> > Maybe, but the patch is not good, I think I and Pablo gave feedback on this ?
> >
> > Please trace __netlink_dump_start() content of control->module
> >
> > gtp_genl_family.module should be set, and we should get it.
> >
> > Otherwise, if the bug is in the core, we would need a dozen of 'work
> > arounds because it is better than nothing'
> >
> > Thank you.
>
> Thanks.
>
> I tracked the moment when the __netlink_dump_start() function was
> called, it turned out that in the gtp_init() initialization function
> before registering pernet subsystem (gtp_net_ops), therefore, outdated
> data is used, which leads to a crash.
>
> The documentation says that ops structure must be assigned before
> registering a generic netlink family [1].
>
> I have fixed and sent a new patch [2].
>
> [1]
> https://elixir.bootlin.com/linux/v6.8-rc4/source/net/netlink/genetlink.c#L773
>
> [2]
> https://lore.kernel.org/netdev/[email protected]/T/#u
>

Excellent detective work, thanks a lot !