2024-06-05 07:29:35

by Jijie Shao

Subject: [PATCH net 0/2] There are some bugfix for the HNS3 ethernet driver

This series contains two bug fixes for the HNS3 ethernet driver.

Jie Wang (1):
net: hns3: add cond_resched() to hns3 ring buffer init process

Yonglong Liu (1):
net: hns3: fix kernel crash problem in concurrent scenario

.../net/ethernet/hisilicon/hns3/hns3_enet.c | 4 ++++
.../net/ethernet/hisilicon/hns3/hns3_enet.h | 2 ++
.../hisilicon/hns3/hns3pf/hclge_main.c | 21 ++++++++++++++-----
3 files changed, 22 insertions(+), 5 deletions(-)

--
2.30.0



2024-06-05 07:29:46

by Jijie Shao

Subject: [PATCH net 1/2] net: hns3: fix kernel crash problem in concurrent scenario

From: Yonglong Liu <[email protected]>

When the link status changes, the nic driver needs to notify the
roce driver so that it can handle the event. If the roce driver is
being uninitialized at that moment, the notification can crash the
kernel.

To fix the problem, check whether the roce client is registered
before notifying it of a link status change, and make uninit wait
until any in-progress link update has finished.

Fixes: 45e92b7e4e27 ("net: hns3: add calling roce callback function when link status change")
Signed-off-by: Yonglong Liu <[email protected]>
Signed-off-by: Jijie Shao <[email protected]>
---
.../hisilicon/hns3/hns3pf/hclge_main.c | 21 ++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 43cc6ee4d87d..82574ce0194f 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -3086,9 +3086,7 @@ static void hclge_push_link_status(struct hclge_dev *hdev)

 static void hclge_update_link_status(struct hclge_dev *hdev)
 {
-	struct hnae3_handle *rhandle = &hdev->vport[0].roce;
 	struct hnae3_handle *handle = &hdev->vport[0].nic;
-	struct hnae3_client *rclient = hdev->roce_client;
 	struct hnae3_client *client = hdev->nic_client;
 	int state;
 	int ret;
@@ -3112,8 +3110,15 @@ static void hclge_update_link_status(struct hclge_dev *hdev)

 		client->ops->link_status_change(handle, state);
 		hclge_config_mac_tnl_int(hdev, state);
-		if (rclient && rclient->ops->link_status_change)
-			rclient->ops->link_status_change(rhandle, state);
+
+		if (test_bit(HCLGE_STATE_ROCE_REGISTERED, &hdev->state)) {
+			struct hnae3_handle *rhandle = &hdev->vport[0].roce;
+			struct hnae3_client *rclient = hdev->roce_client;
+
+			if (rclient && rclient->ops->link_status_change)
+				rclient->ops->link_status_change(rhandle,
+								 state);
+		}

 		hclge_push_link_status(hdev);
 	}
@@ -11319,6 +11324,12 @@ static int hclge_init_client_instance(struct hnae3_client *client,
 	return ret;
 }

+static bool hclge_uninit_need_wait(struct hclge_dev *hdev)
+{
+	return test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state) ||
+	       test_bit(HCLGE_STATE_LINK_UPDATING, &hdev->state);
+}
+
 static void hclge_uninit_client_instance(struct hnae3_client *client,
 					 struct hnae3_ae_dev *ae_dev)
 {
@@ -11327,7 +11338,7 @@ static void hclge_uninit_client_instance(struct hnae3_client *client,

 	if (hdev->roce_client) {
 		clear_bit(HCLGE_STATE_ROCE_REGISTERED, &hdev->state);
-		while (test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state))
+		while (hclge_uninit_need_wait(hdev))
 			msleep(HCLGE_WAIT_RESET_DONE);

 		hdev->roce_client->ops->uninit_instance(&vport->roce, 0);
--
2.30.0
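
The handshake this patch relies on can be sketched outside the kernel. The following is a minimal userspace illustration, not the driver code: it uses C11 atomics in place of the kernel bitops, and every name in it (roce_registered, link_updating, struct client, update_link_status, uninit_client) is a hypothetical stand-in. It also assumes that the link update path marks itself busy on entry and clears the mark on exit; the hunks above only show the uninit side testing HCLGE_STATE_LINK_UPDATING, so that part is an assumption about the surrounding code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two state bits involved. */
static atomic_bool roce_registered;
static atomic_bool link_updating;

/* Hypothetical client object; only valid until uninit_client() runs. */
struct client {
	void (*link_status_change)(int state);
};

static struct client *roce_client;
static atomic_int notify_count;

/* Example callback that just counts how often it was invoked. */
static void on_link_change(int state)
{
	(void)state;
	atomic_fetch_add(&notify_count, 1);
}

/* Link update path: notify the client only while it is registered. */
static void update_link_status(int state)
{
	/* Assumed: the updater marks itself busy and bails out if one is running. */
	if (atomic_exchange(&link_updating, true))
		return;

	if (atomic_load(&roce_registered)) {
		struct client *c = roce_client;

		if (c && c->link_status_change)
			c->link_status_change(state);
	}

	atomic_store(&link_updating, false);
}

/* Uninit path: unregister first, then wait for any in-flight update. */
static void uninit_client(void)
{
	atomic_store(&roce_registered, false);

	/* The driver sleeps with msleep() here; this sketch simply spins. */
	while (atomic_load(&link_updating))
		;

	roce_client = NULL;
}
```

The ordering is the point of the sketch: uninit clears the registered flag before it starts waiting, and the updater sets its busy flag before it tests the registered flag. With sequentially consistent atomics, an updater that still sees the client as registered is guaranteed to be seen as busy by uninit, so the client is not torn down underneath the callback.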


2024-06-06 16:27:11

by Simon Horman

Subject: Re: [PATCH net 1/2] net: hns3: fix kernel crash problem in concurrent scenario

On Wed, Jun 05, 2024 at 03:20:57PM +0800, Jijie Shao wrote:
> From: Yonglong Liu <[email protected]>
>
> When link status change, the nic driver need to notify the roce
> driver to handle this event, but at this time, the roce driver
> may uninit, then cause kernel crash.
>
> To fix the problem, when link status change, need to check
> whether the roce registered, and when uninit, need to wait link
> update finish.
>
> Fixes: 45e92b7e4e27 ("net: hns3: add calling roce callback function when link status change")
> Signed-off-by: Yonglong Liu <[email protected]>
> Signed-off-by: Jijie Shao <[email protected]>

Reviewed-by: Simon Horman <[email protected]>


2024-06-07 11:30:46

by patchwork-bot+netdevbpf

Subject: Re: [PATCH net 0/2] There are some bugfix for the HNS3 ethernet driver

Hello:

This series was applied to netdev/net.git (main)
by David S. Miller <[email protected]>:

On Wed, 5 Jun 2024 15:20:56 +0800 you wrote:
> There are some bugfix for the HNS3 ethernet driver
>
> Jie Wang (1):
> net: hns3: add cond_resched() to hns3 ring buffer init process
>
> Yonglong Liu (1):
> net: hns3: fix kernel crash problem in concurrent scenario
>
> [...]

Here is the summary with links:
- [net,1/2] net: hns3: fix kernel crash problem in concurrent scenario
https://git.kernel.org/netdev/net/c/12cda920212a
- [net,2/2] net: hns3: add cond_resched() to hns3 ring buffer init process
https://git.kernel.org/netdev/net/c/968fde83841a

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html