2023-12-27 07:41:10

by Wen Gu

Subject: [PATCH net] net/smc: fix invalid link access in dumping SMC-R connections

A crash was found when dumping SMC-R connections. It can be reproduced
by the following steps:

- environment: two RNICs on both sides.
- run SMC-R between the two sides; a link group of type SMC_LGR_SYMMETRIC
will be created.
- set the first RNIC down on either side; the link group will then turn to
SMC_LGR_ASYMMETRIC_LOCAL.
- run 'smcss -R' and the crash will be triggered.

BUG: kernel NULL pointer dereference, address: 0000000000000010
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 8000000101fdd067 P4D 8000000101fdd067 PUD 10ce46067 PMD 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 3 PID: 1810 Comm: smcss Kdump: loaded Tainted: G W E 6.7.0-rc6+ #51
RIP: 0010:__smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
Call Trace:
<TASK>
? __die+0x24/0x70
? page_fault_oops+0x66/0x150
? exc_page_fault+0x69/0x140
? asm_exc_page_fault+0x26/0x30
? __smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
smc_diag_dump_proto+0xd0/0xf0 [smc_diag]
smc_diag_dump+0x26/0x60 [smc_diag]
netlink_dump+0x19f/0x320
__netlink_dump_start+0x1dc/0x300
smc_diag_handler_dump+0x6a/0x80 [smc_diag]
? __pfx_smc_diag_dump+0x10/0x10 [smc_diag]
sock_diag_rcv_msg+0x121/0x140
? __pfx_sock_diag_rcv_msg+0x10/0x10
netlink_rcv_skb+0x5a/0x110
sock_diag_rcv+0x28/0x40
netlink_unicast+0x22a/0x330
netlink_sendmsg+0x240/0x4a0
__sock_sendmsg+0xb0/0xc0
____sys_sendmsg+0x24e/0x300
? copy_msghdr_from_user+0x62/0x80
___sys_sendmsg+0x7c/0xd0
? __do_fault+0x34/0x1a0
? do_read_fault+0x5f/0x100
? do_fault+0xb0/0x110
__sys_sendmsg+0x4d/0x80
do_syscall_64+0x45/0xf0
entry_SYSCALL_64_after_hwframe+0x6e/0x76

When the first RNIC is set down, lgr->lnk[0] will be cleared and an
asymmetric link will be allocated in lgr->lnk[SMC_LINKS_PER_LGR_MAX - 1]
by smc_llc_alloc_alt_link(). Then, when we try to dump SMC-R connections
in __smc_diag_dump(), the invalid lgr->lnk[0] will be accessed, resulting
in this issue. So fix it by reading the device name through the same link
pointer that is used for the rest of the dump.

Fixes: f16a7dd5cf27 ("smc: netlink interface for SMC sockets")
Reported-by: henaumars <[email protected]>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7616
Signed-off-by: Wen Gu <[email protected]>
---
net/smc/smc_diag.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/smc/smc_diag.c b/net/smc/smc_diag.c
index a584613aca12..5cc376834c57 100644
--- a/net/smc/smc_diag.c
+++ b/net/smc/smc_diag.c
@@ -153,8 +153,7 @@ static int __smc_diag_dump(struct sock *sk, struct sk_buff *skb,
 			.lnk[0].link_id = link->link_id,
 		};
 
-		memcpy(linfo.lnk[0].ibname,
-		       smc->conn.lgr->lnk[0].smcibdev->ibdev->name,
+		memcpy(linfo.lnk[0].ibname, link->smcibdev->ibdev->name,
 		       sizeof(link->smcibdev->ibdev->name));
 		smc_gid_be16_convert(linfo.lnk[0].gid, link->gid);
 		smc_gid_be16_convert(linfo.lnk[0].peer_gid, link->peer_gid);
--
2.43.0
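
The failure mode described in the commit message can be illustrated with a
small user-space model. This is only a sketch under stated assumptions, not
the kernel code: the struct names, IBNAME_LEN, the link IDs, and the helper
functions are hypothetical, LINKS_PER_LGR_MAX is assumed to be 3, and which
link the connection ends up on after the RNIC goes down is also an
assumption. Where the kernel would dereference a NULL pointer, the model
returns -1 instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LINKS_PER_LGR_MAX 3  /* stands in for SMC_LINKS_PER_LGR_MAX; value assumed */
#define IBNAME_LEN 64        /* hypothetical buffer size for this sketch */

/* Simplified stand-ins for the kernel structures, not their real layouts. */
struct ibdev_model { char name[IBNAME_LEN]; };
struct link_model  { struct ibdev_model *ibdev; int link_id; };
struct lgr_model   { struct link_model lnk[LINKS_PER_LGR_MAX]; };
struct conn_model  { struct lgr_model *lgr; struct link_model *lnk; };

/* Copy a device name into out, always NUL-terminating the result. */
static void copy_name(char *out, size_t len, const struct ibdev_model *dev)
{
	assert(len > 0);
	strncpy(out, dev->name, len - 1);
	out[len - 1] = '\0';
}

/* Pre-patch pattern: always read slot 0 of the link group.
 * Returns -1 where the kernel would hit the NULL pointer. */
static int dump_ibname_old(const struct conn_model *conn, char *out, size_t len)
{
	const struct link_model *l = &conn->lgr->lnk[0];

	if (!l->ibdev)
		return -1;
	copy_name(out, len, l->ibdev);
	return 0;
}

/* Post-patch pattern: read through the connection's own link.
 * The NULL checks exist only in this sketch; the patch adds none. */
static int dump_ibname_fixed(const struct conn_model *conn, char *out, size_t len)
{
	const struct link_model *l = conn->lnk;

	if (!l || !l->ibdev)
		return -1;
	copy_name(out, len, l->ibdev);
	return 0;
}

/* Model "first RNIC set down": clear slot 0, place an alternate link in
 * the last slot, and move the connection to the surviving link in slot 1
 * (the choice of slot 1 for the connection is an assumption). */
static void first_rnic_down(struct lgr_model *lgr, struct conn_model *conn)
{
	struct ibdev_model *surviving = lgr->lnk[1].ibdev;

	memset(&lgr->lnk[0], 0, sizeof(lgr->lnk[0]));
	lgr->lnk[LINKS_PER_LGR_MAX - 1].ibdev = surviving;
	lgr->lnk[LINKS_PER_LGR_MAX - 1].link_id = 3; /* hypothetical ID */
	conn->lnk = &lgr->lnk[1];
}
```

In this model, dump_ibname_old() fails once first_rnic_down() has run, while
dump_ibname_fixed() still returns the name of the surviving device, which is
the behaviour the one-line change above aims for.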



2023-12-28 09:32:53

by Tony Lu

Subject: Re: [PATCH net] net/smc: fix invalid link access in dumping SMC-R connections

On Wed, Dec 27, 2023 at 03:40:35PM +0800, Wen Gu wrote:
> A crash was found when dumping SMC-R connections. It can be reproduced
> by following steps:
>
> - environment: two RNICs on both sides.
> - run SMC-R between two sides, now a SMC_LGR_SYMMETRIC type link group
> will be created.
> - set the first RNIC down on either side and link group will turn to
> SMC_LGR_ASYMMETRIC_LOCAL then.
> - run 'smcss -R' and the crash will be triggered.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000010
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 8000000101fdd067 P4D 8000000101fdd067 PUD 10ce46067 PMD 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 3 PID: 1810 Comm: smcss Kdump: loaded Tainted: G W E 6.7.0-rc6+ #51
> RIP: 0010:__smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
> Call Trace:
> <TASK>
> ? __die+0x24/0x70
> ? page_fault_oops+0x66/0x150
> ? exc_page_fault+0x69/0x140
> ? asm_exc_page_fault+0x26/0x30
> ? __smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
> smc_diag_dump_proto+0xd0/0xf0 [smc_diag]
> smc_diag_dump+0x26/0x60 [smc_diag]
> netlink_dump+0x19f/0x320
> __netlink_dump_start+0x1dc/0x300
> smc_diag_handler_dump+0x6a/0x80 [smc_diag]
> ? __pfx_smc_diag_dump+0x10/0x10 [smc_diag]
> sock_diag_rcv_msg+0x121/0x140
> ? __pfx_sock_diag_rcv_msg+0x10/0x10
> netlink_rcv_skb+0x5a/0x110
> sock_diag_rcv+0x28/0x40
> netlink_unicast+0x22a/0x330
> netlink_sendmsg+0x240/0x4a0
> __sock_sendmsg+0xb0/0xc0
> ____sys_sendmsg+0x24e/0x300
> ? copy_msghdr_from_user+0x62/0x80
> ___sys_sendmsg+0x7c/0xd0
> ? __do_fault+0x34/0x1a0
> ? do_read_fault+0x5f/0x100
> ? do_fault+0xb0/0x110
> __sys_sendmsg+0x4d/0x80
> do_syscall_64+0x45/0xf0
> entry_SYSCALL_64_after_hwframe+0x6e/0x76
>
> When the first RNIC is set down, the lgr->lnk[0] will be cleared and an
> asymmetric link will be allocated in lgr->lnk[SMC_LINKS_PER_LGR_MAX - 1]
> by smc_llc_alloc_alt_link(). Then when we try to dump SMC-R connections
> in __smc_diag_dump(), the invalid lgr->lnk[0] will be accessed, resulting
> in this issue. So fix it by accessing the right link.
>
> Fixes: f16a7dd5cf27 ("smc: netlink interface for SMC sockets")
> Reported-by: henaumars <[email protected]>
> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7616

What about using Link: http... here?

> Signed-off-by: Wen Gu <[email protected]>

Reviewed-by: Tony Lu <[email protected]>

> ---
> net/smc/smc_diag.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/net/smc/smc_diag.c b/net/smc/smc_diag.c
> index a584613aca12..5cc376834c57 100644
> --- a/net/smc/smc_diag.c
> +++ b/net/smc/smc_diag.c
> @@ -153,8 +153,7 @@ static int __smc_diag_dump(struct sock *sk, struct sk_buff *skb,
> .lnk[0].link_id = link->link_id,
> };
>
> - memcpy(linfo.lnk[0].ibname,
> - smc->conn.lgr->lnk[0].smcibdev->ibdev->name,
> + memcpy(linfo.lnk[0].ibname, link->smcibdev->ibdev->name,
> sizeof(link->smcibdev->ibdev->name));
> smc_gid_be16_convert(linfo.lnk[0].gid, link->gid);
> smc_gid_be16_convert(linfo.lnk[0].peer_gid, link->peer_gid);
> --
> 2.43.0

2023-12-28 11:03:06

by Wen Gu

Subject: Re: [PATCH net] net/smc: fix invalid link access in dumping SMC-R connections



On 2023/12/28 17:32, Tony Lu wrote:
> On Wed, Dec 27, 2023 at 03:40:35PM +0800, Wen Gu wrote:
>> A crash was found when dumping SMC-R connections. It can be reproduced
>> by following steps:
>>
>> - environment: two RNICs on both sides.
>> - run SMC-R between two sides, now a SMC_LGR_SYMMETRIC type link group
>> will be created.
>> - set the first RNIC down on either side and link group will turn to
>> SMC_LGR_ASYMMETRIC_LOCAL then.
>> - run 'smcss -R' and the crash will be triggered.
>>
>> BUG: kernel NULL pointer dereference, address: 0000000000000010
>> #PF: supervisor read access in kernel mode
>> #PF: error_code(0x0000) - not-present page
>> PGD 8000000101fdd067 P4D 8000000101fdd067 PUD 10ce46067 PMD 0
>> Oops: 0000 [#1] PREEMPT SMP PTI
>> CPU: 3 PID: 1810 Comm: smcss Kdump: loaded Tainted: G W E 6.7.0-rc6+ #51
>> RIP: 0010:__smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
>> Call Trace:
>> <TASK>
>> ? __die+0x24/0x70
>> ? page_fault_oops+0x66/0x150
>> ? exc_page_fault+0x69/0x140
>> ? asm_exc_page_fault+0x26/0x30
>> ? __smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
>> smc_diag_dump_proto+0xd0/0xf0 [smc_diag]
>> smc_diag_dump+0x26/0x60 [smc_diag]
>> netlink_dump+0x19f/0x320
>> __netlink_dump_start+0x1dc/0x300
>> smc_diag_handler_dump+0x6a/0x80 [smc_diag]
>> ? __pfx_smc_diag_dump+0x10/0x10 [smc_diag]
>> sock_diag_rcv_msg+0x121/0x140
>> ? __pfx_sock_diag_rcv_msg+0x10/0x10
>> netlink_rcv_skb+0x5a/0x110
>> sock_diag_rcv+0x28/0x40
>> netlink_unicast+0x22a/0x330
>> netlink_sendmsg+0x240/0x4a0
>> __sock_sendmsg+0xb0/0xc0
>> ____sys_sendmsg+0x24e/0x300
>> ? copy_msghdr_from_user+0x62/0x80
>> ___sys_sendmsg+0x7c/0xd0
>> ? __do_fault+0x34/0x1a0
>> ? do_read_fault+0x5f/0x100
>> ? do_fault+0xb0/0x110
>> __sys_sendmsg+0x4d/0x80
>> do_syscall_64+0x45/0xf0
>> entry_SYSCALL_64_after_hwframe+0x6e/0x76
>>
>> When the first RNIC is set down, the lgr->lnk[0] will be cleared and an
>> asymmetric link will be allocated in lgr->lnk[SMC_LINKS_PER_LGR_MAX - 1]
>> by smc_llc_alloc_alt_link(). Then when we try to dump SMC-R connections
>> in __smc_diag_dump(), the invalid lgr->lnk[0] will be accessed, resulting
>> in this issue. So fix it by accessing the right link.
>>
>> Fixes: f16a7dd5cf27 ("smc: netlink interface for SMC sockets")
>> Reported-by: henaumars <[email protected]>
>> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7616
>
> What about using Link: http... here?
>

Thank you, Tony.

According to [1],

"
The Reported-by tag gives credit to people who find bugs and report them and it
hopefully inspires them to help us again in the future. The tag is intended for
bugs; please do not use it to credit feature requests. The tag should be followed
by a Closes: tag pointing to the report, unless the report is not available on
the web. The Link: tag can be used instead of Closes: if the patch fixes a part
of the issue(s) being reported.
"

So I guess the Closes: tag is fine here.

[1] https://docs.kernel.org/process/submitting-patches.html

>> Signed-off-by: Wen Gu <[email protected]>
>
> Reviewed-by: Tony Lu <[email protected]>
>
>> ---
>> net/smc/smc_diag.c | 3 +--
>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/net/smc/smc_diag.c b/net/smc/smc_diag.c
>> index a584613aca12..5cc376834c57 100644
>> --- a/net/smc/smc_diag.c
>> +++ b/net/smc/smc_diag.c
>> @@ -153,8 +153,7 @@ static int __smc_diag_dump(struct sock *sk, struct sk_buff *skb,
>> .lnk[0].link_id = link->link_id,
>> };
>>
>> - memcpy(linfo.lnk[0].ibname,
>> - smc->conn.lgr->lnk[0].smcibdev->ibdev->name,
>> + memcpy(linfo.lnk[0].ibname, link->smcibdev->ibdev->name,
>> sizeof(link->smcibdev->ibdev->name));
>> smc_gid_be16_convert(linfo.lnk[0].gid, link->gid);
>> smc_gid_be16_convert(linfo.lnk[0].peer_gid, link->peer_gid);
>> --
>> 2.43.0

2024-01-03 09:33:47

by Wenjia Zhang

Subject: Re: [PATCH net] net/smc: fix invalid link access in dumping SMC-R connections



On 27.12.23 08:40, Wen Gu wrote:
> A crash was found when dumping SMC-R connections. It can be reproduced
> by following steps:
>
> - environment: two RNICs on both sides.
> - run SMC-R between two sides, now a SMC_LGR_SYMMETRIC type link group
> will be created.
> - set the first RNIC down on either side and link group will turn to
> SMC_LGR_ASYMMETRIC_LOCAL then.
> - run 'smcss -R' and the crash will be triggered.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000010
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 8000000101fdd067 P4D 8000000101fdd067 PUD 10ce46067 PMD 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 3 PID: 1810 Comm: smcss Kdump: loaded Tainted: G W E 6.7.0-rc6+ #51
> RIP: 0010:__smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
> Call Trace:
> <TASK>
> ? __die+0x24/0x70
> ? page_fault_oops+0x66/0x150
> ? exc_page_fault+0x69/0x140
> ? asm_exc_page_fault+0x26/0x30
> ? __smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
> smc_diag_dump_proto+0xd0/0xf0 [smc_diag]
> smc_diag_dump+0x26/0x60 [smc_diag]
> netlink_dump+0x19f/0x320
> __netlink_dump_start+0x1dc/0x300
> smc_diag_handler_dump+0x6a/0x80 [smc_diag]
> ? __pfx_smc_diag_dump+0x10/0x10 [smc_diag]
> sock_diag_rcv_msg+0x121/0x140
> ? __pfx_sock_diag_rcv_msg+0x10/0x10
> netlink_rcv_skb+0x5a/0x110
> sock_diag_rcv+0x28/0x40
> netlink_unicast+0x22a/0x330
> netlink_sendmsg+0x240/0x4a0
> __sock_sendmsg+0xb0/0xc0
> ____sys_sendmsg+0x24e/0x300
> ? copy_msghdr_from_user+0x62/0x80
> ___sys_sendmsg+0x7c/0xd0
> ? __do_fault+0x34/0x1a0
> ? do_read_fault+0x5f/0x100
> ? do_fault+0xb0/0x110
> __sys_sendmsg+0x4d/0x80
> do_syscall_64+0x45/0xf0
> entry_SYSCALL_64_after_hwframe+0x6e/0x76
>
> When the first RNIC is set down, the lgr->lnk[0] will be cleared and an
> asymmetric link will be allocated in lgr->lnk[SMC_LINKS_PER_LGR_MAX - 1]
> by smc_llc_alloc_alt_link(). Then when we try to dump SMC-R connections
> in __smc_diag_dump(), the invalid lgr->lnk[0] will be accessed, resulting
> in this issue. So fix it by accessing the right link.
>
> Fixes: f16a7dd5cf27 ("smc: netlink interface for SMC sockets")
> Reported-by: henaumars <[email protected]>
> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7616
> Signed-off-by: Wen Gu <[email protected]>

That is a really good catch and a good description! Thank you, Wen Gu, for
fixing it!

Reviewed-and-tested-by: Wenjia Zhang <[email protected]>

2024-01-04 01:01:02

by patchwork-bot+netdevbpf

Subject: Re: [PATCH net] net/smc: fix invalid link access in dumping SMC-R connections

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <[email protected]>:

On Wed, 27 Dec 2023 15:40:35 +0800 you wrote:
> A crash was found when dumping SMC-R connections. It can be reproduced
> by following steps:
>
> - environment: two RNICs on both sides.
> - run SMC-R between two sides, now a SMC_LGR_SYMMETRIC type link group
> will be created.
> - set the first RNIC down on either side and link group will turn to
> SMC_LGR_ASYMMETRIC_LOCAL then.
> - run 'smcss -R' and the crash will be triggered.
>
> [...]

Here is the summary with links:
- [net] net/smc: fix invalid link access in dumping SMC-R connections
https://git.kernel.org/netdev/net/c/9dbe086c69b8

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html