2024-02-15 12:29:55

by Bernard Metzler

Subject: [PATCH] RDMA/siw: Fix handling netdev going down event

siw uses the NETDEV_GOING_DOWN event to schedule work which
gracefully clears all connections of the affected siw device.
This fix avoids re-initializing and re-scheduling this work if
it is still pending from a previous invocation.

Fixes: bdcf26bf9b3a ("rdma/siw: network and RDMA core interface")
Reported-by: [email protected]
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
---
drivers/infiniband/sw/siw/siw_main.c | 56 ++++++++++++++--------------
1 file changed, 28 insertions(+), 28 deletions(-)

diff --git a/drivers/infiniband/sw/siw/siw_main.c b/drivers/infiniband/sw/siw/siw_main.c
index 723903bd30c5..6c61f62b322c 100644
--- a/drivers/infiniband/sw/siw/siw_main.c
+++ b/drivers/infiniband/sw/siw/siw_main.c
@@ -276,6 +276,31 @@ static const struct ib_device_ops siw_device_ops = {
 	INIT_RDMA_OBJ_SIZE(ib_ucontext, siw_ucontext, base_ucontext),
 };

+/*
+ * Network link becomes unavailable. Mark all
+ * affected QP's accordingly.
+ */
+static void siw_netdev_down(struct work_struct *work)
+{
+	struct siw_device *sdev =
+		container_of(work, struct siw_device, netdev_down);
+
+	struct siw_qp_attrs qp_attrs;
+	struct list_head *pos, *tmp;
+
+	memset(&qp_attrs, 0, sizeof(qp_attrs));
+	qp_attrs.state = SIW_QP_STATE_ERROR;
+
+	list_for_each_safe(pos, tmp, &sdev->qp_list) {
+		struct siw_qp *qp = list_entry(pos, struct siw_qp, devq);
+
+		down_write(&qp->state_lock);
+		WARN_ON(siw_qp_modify(qp, &qp_attrs, SIW_QP_ATTR_STATE));
+		up_write(&qp->state_lock);
+	}
+	ib_device_put(&sdev->base_dev);
+}
+
 static struct siw_device *siw_device_create(struct net_device *netdev)
 {
 	struct siw_device *sdev;
@@ -319,6 +344,7 @@ static struct siw_device *siw_device_create(struct net_device *netdev)
 	xa_init_flags(&sdev->mem_xa, XA_FLAGS_ALLOC1);

 	ib_set_device_ops(base_dev, &siw_device_ops);
+	INIT_WORK(&sdev->netdev_down, siw_netdev_down);
 	rv = ib_device_set_netdev(base_dev, netdev, 1);
 	if (rv)
 		goto error;
@@ -364,37 +390,11 @@ static struct siw_device *siw_device_create(struct net_device *netdev)
 	return ERR_PTR(rv);
 }

-/*
- * Network link becomes unavailable. Mark all
- * affected QP's accordingly.
- */
-static void siw_netdev_down(struct work_struct *work)
-{
-	struct siw_device *sdev =
-		container_of(work, struct siw_device, netdev_down);
-
-	struct siw_qp_attrs qp_attrs;
-	struct list_head *pos, *tmp;
-
-	memset(&qp_attrs, 0, sizeof(qp_attrs));
-	qp_attrs.state = SIW_QP_STATE_ERROR;
-
-	list_for_each_safe(pos, tmp, &sdev->qp_list) {
-		struct siw_qp *qp = list_entry(pos, struct siw_qp, devq);
-
-		down_write(&qp->state_lock);
-		WARN_ON(siw_qp_modify(qp, &qp_attrs, SIW_QP_ATTR_STATE));
-		up_write(&qp->state_lock);
-	}
-	ib_device_put(&sdev->base_dev);
-}
-
 static void siw_device_goes_down(struct siw_device *sdev)
 {
-	if (ib_device_try_get(&sdev->base_dev)) {
-		INIT_WORK(&sdev->netdev_down, siw_netdev_down);
+	if (ib_device_try_get(&sdev->base_dev) &&
+	    !work_pending(&sdev->netdev_down))
 		schedule_work(&sdev->netdev_down);
-	}
 }

 static int siw_netdev_event(struct notifier_block *nb, unsigned long event,
--
2.38.1
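
The guard added in the last hunk can be illustrated outside the kernel. The following is a minimal userspace sketch, not kernel code: `model_work`, `model_dev`, and every `model_*` function are hypothetical stand-ins for `struct work_struct`, `struct siw_device`, and the `ib_device_try_get()` / `work_pending()` / `schedule_work()` calls, and the model assumes that taking a reference always succeeds. It only shows that, when the work item is initialized once at device creation and a pending check precedes scheduling, repeated "going down" events result in a single handler run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct work_struct. */
struct model_work {
	bool pending;                      /* set while queued, cleared when run */
	void (*fn)(struct model_work *w);  /* handler, set once at init */
};

/* Hypothetical stand-in for struct siw_device. */
struct model_dev {
	int refs;                          /* models the device reference count */
	int runs;                          /* how often the handler executed */
	struct model_work down_work;
};

/* Models siw_netdev_down(): do the work, then drop the reference. */
static void model_down_handler(struct model_work *w)
{
	struct model_dev *d = (struct model_dev *)
		((char *)w - offsetof(struct model_dev, down_work));

	d->runs++;
	d->refs--;
}

/* Models device creation: the work item is initialized exactly once. */
static void model_dev_init(struct model_dev *d)
{
	d->refs = 1;
	d->runs = 0;
	d->down_work.pending = false;
	d->down_work.fn = model_down_handler;
}

/* Models the guarded scheduling path; returns true if work was queued. */
static bool model_goes_down(struct model_dev *d)
{
	d->refs++;                         /* assumption: try_get always succeeds */
	if (d->down_work.pending) {
		d->refs--;                 /* sketch-only choice: keep refs balanced */
		return false;
	}
	d->down_work.pending = true;
	return true;
}

/* Models the worker thread picking up the queued item. */
static void model_run_pending(struct model_dev *d)
{
	if (!d->down_work.pending)
		return;
	d->down_work.pending = false;
	d->down_work.fn(&d->down_work);
}

/* Three events arrive before the worker runs; the handler runs once. */
static int model_demo(void)
{
	struct model_dev d;

	model_dev_init(&d);
	assert(model_goes_down(&d));
	assert(!model_goes_down(&d));
	assert(!model_goes_down(&d));
	model_run_pending(&d);
	assert(d.refs == 1);
	return d.runs;
}
```

Calling `model_demo()` returns 1. The model releases its extra reference on the skip path so that the count stays balanced; that is a choice made for this sketch, and the thread itself does not discuss that path.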



2024-02-19 07:26:10

by Leon Romanovsky

Subject: Re: [PATCH] RDMA/siw: Fix handling netdev going down event

On Thu, Feb 15, 2024 at 12:55:24PM +0100, Bernard Metzler wrote:
> siw uses the NETDEV_GOING_DOWN event to schedule work which
> gracefully clears all connections of the affected siw device.
> This fix avoids re-initializing and re-scheduling this work if
> it is still pending from a previous invocation.
>
> Fixes: bdcf26bf9b3a ("rdma/siw: network and RDMA core interface")
> Reported-by: [email protected]
> Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
> ---
> drivers/infiniband/sw/siw/siw_main.c | 56 ++++++++++++++--------------
> 1 file changed, 28 insertions(+), 28 deletions(-)

It doesn't apply. I still think that you should simply delete this NETDEV_GOING_DOWN event.

Thanks

Grabbing thread from lore.kernel.org/all/20240215115524.126477-1-bmt%40zurich.ibm.com/t.mbox.gz
Checking for newer revisions
Grabbing search results from lore.kernel.org
Analyzing 1 messages in the thread
Checking attestation on all messages, may take a moment...
---
[PATCH] RDMA/siw: Fix handling netdev going down event
+ Link: https://lore.kernel.org/r/20240215115524.126477-1-bmt@zurich.ibm.com
+ Signed-off-by: Leon Romanovsky <[email protected]>
---
NOTE: install dkimpy for DKIM signature verification
---
Total patches: 1
---
Applying: RDMA/siw: Fix handling netdev going down event
Patch failed at 0001 RDMA/siw: Fix handling netdev going down event
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
error: patch failed: drivers/infiniband/sw/siw/siw_main.c:276
error: drivers/infiniband/sw/siw/siw_main.c: patch does not apply
hint: Use 'git am --show-current-patch=diff' to see the failed patch

Thanks