2023-06-26 09:30:34

by Souradeep Chakrabarti

Subject: [PATCH 0/2 V3 net] net: mana: Fix MANA VF unload when host is unresponsive

From: Souradeep Chakrabarti <[email protected]>

VF unload gets stuck in the MANA driver when the host is not responding.
mana_dealloc_queues() tries to drain the in-flight packets and gets stuck
in a while loop. A second problem in this scenario is the timeout of the
HWC send requests.
These patches fix both problems.
In the MANA driver we add a timeout to the while loop.
We also add a new attribute to mana_context, which is set when
mana_hwc_send_request() hits a timeout because the host is unresponsive.
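
For illustration only, the bounded wait described above can be sketched in
user-space C. This is not the driver code: struct demo_txq, now_ms() and
drain_with_deadline() are hypothetical names, and clock_gettime()/nanosleep()
stand in for jiffies/usleep_range(). The caller computes one absolute
deadline and passes it for every queue, so the total wait stays bounded.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a TX queue; only the counter matters here. */
struct demo_txq {
	atomic_int pending_sends;
};

/* Monotonic clock in milliseconds (user-space substitute for jiffies). */
static long long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Poll until the queue has no pending sends or the absolute deadline
 * passes. Returns true if the queue drained, false on timeout.
 */
static bool drain_with_deadline(struct demo_txq *q, long long deadline_ms)
{
	const struct timespec nap = { .tv_sec = 0, .tv_nsec = 1000000 };

	assert(q != NULL);
	while (atomic_load(&q->pending_sends) > 0) {
		if (now_ms() >= deadline_ms)
			return false;	/* peer never completed the sends */
		nanosleep(&nap, NULL);	/* sleep about 1 ms between polls */
	}
	return true;
}
```

A caller would compute the deadline once, for example now_ms() + 120000,
and treat a false return as a signal to clean up the remaining packets
itself.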

Souradeep Chakrabarti (2):
net: mana: Fix MANA VF unload when host is unresponsive
net: mana: Fix MANA VF unload when host is unresponsive

.../net/ethernet/microsoft/mana/gdma_main.c | 4 +++-
.../net/ethernet/microsoft/mana/hw_channel.c | 12 +++++++++++-
drivers/net/ethernet/microsoft/mana/mana_en.c | 19 +++++++++++++++++--
include/net/mana/mana.h | 2 ++
4 files changed, 33 insertions(+), 4 deletions(-)

--
2.34.1



2023-06-26 10:20:42

by Souradeep Chakrabarti

Subject: [PATCH 1/2 V3 net] net: mana: Fix MANA VF unload when host is unresponsive

From: Souradeep Chakrabarti <[email protected]>

This patch addresses the VF unload issue, where mana_dealloc_queues()
gets stuck in an infinite while loop because the host is unresponsive.
It adds a timeout to the while loop to fix this.

Fixes: ca9c54d2d6a5ab2430c4eda364c77125d62e5e0f (net: mana: Add a driver for
Microsoft Azure Network Adapter)
Signed-off-by: Souradeep Chakrabarti <[email protected]>
---
V2 -> V3:
* Splitted the patch in two parts.
* Removed the unnecessary braces from mana_dealloc_queues().
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index d907727c7b7a..cb5c43c3c47e 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -2329,7 +2329,10 @@ static int mana_dealloc_queues(struct net_device *ndev)
{
struct mana_port_context *apc = netdev_priv(ndev);
struct gdma_dev *gd = apc->ac->gdma_dev;
+ unsigned long timeout;
struct mana_txq *txq;
+ struct sk_buff *skb;
+ struct mana_cq *cq;
int i, err;

if (apc->port_is_up)
@@ -2348,13 +2351,25 @@ static int mana_dealloc_queues(struct net_device *ndev)
*
* Drain all the in-flight TX packets
*/
+
+ timeout = jiffies + 120 * HZ;
for (i = 0; i < apc->num_queues; i++) {
txq = &apc->tx_qp[i].txq;
-
- while (atomic_read(&txq->pending_sends) > 0)
+ while (atomic_read(&txq->pending_sends) > 0 &&
+ time_before(jiffies, timeout))
usleep_range(1000, 2000);
}

+ for (i = 0; i < apc->num_queues; i++) {
+ txq = &apc->tx_qp[i].txq;
+ cq = &apc->tx_qp[i].tx_cq;
+ while (atomic_read(&txq->pending_sends)) {
+ skb = skb_dequeue(&txq->pending_skbs);
+ mana_unmap_skb(skb, apc);
+ napi_consume_skb(skb, cq->budget);
+ atomic_sub(1, &txq->pending_sends);
+ }
+ }
/* We're 100% sure the queues can no longer be woken up, because
* we're sure now mana_poll_tx_cq() can't be running.
*/
--
2.34.1


2023-06-26 13:20:29

by Simon Horman

Subject: Re: [PATCH 1/2 V3 net] net: mana: Fix MANA VF unload when host is unresponsive

On Mon, Jun 26, 2023 at 02:18:18AM -0700, souradeep chakrabarti wrote:
> From: Souradeep Chakrabarti <[email protected]>
>
> This patch addresses the VF unload issue, where mana_dealloc_queues()
> gets stuck in infinite while loop, because of host unresponsiveness.
> It adds a timeout in the while loop, to fix it.
>
> Fixes: ca9c54d2d6a5ab2430c4eda364c77125d62e5e0f (net: mana: Add a driver for
> Microsoft Azure Network Adapter)

nit: The correct format for this Fixes tag is:

Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")

In particular:
* All on one line.
* Description in double quotes.
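
The tag format can be sketched as a small C helper, purely as an
illustration: format_fixes_tag() and FIXES_ABBREV are hypothetical names,
not part of git or the kernel tree. It assumes a full hexadecimal hash as
input and simply truncates it to 12 characters; git itself may print a
longer abbreviation when 12 characters are ambiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define FIXES_ABBREV 12	/* abbreviation length used in the tag above */

/*
 * Write 'Fixes: <12-char hash> ("<subject>")' on a single line into buf.
 * Returns the tag length, or -1 if the hash is too short or buf too small.
 */
static int format_fixes_tag(const char *sha, const char *subject,
			    char *buf, size_t len)
{
	int n;

	assert(sha && subject && buf);
	if (strlen(sha) < FIXES_ABBREV)
		return -1;

	n = snprintf(buf, len, "Fixes: %.*s (\"%s\")",
		     FIXES_ABBREV, sha, subject);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* output error or truncated */
	return n;
}
```

Passing the full hash and subject of the commit being fixed yields the
one-line tag shown above.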

> Signed-off-by: Souradeep Chakrabarti <[email protected]>
> ---
> V2 -> V3:
> * Splitted the patch in two parts.
> * Removed the unnecessary braces from mana_dealloc_queues().

2023-06-26 20:09:24

by Dexuan Cui

Subject: RE: [PATCH 1/2 V3 net] net: mana: Fix MANA VF unload when host is unresponsive

> From: Simon Horman
> Sent: Monday, June 26, 2023 6:05 AM
> > ...
> > Fixes: ca9c54d2d6a5ab2430c4eda364c77125d62e5e0f (net: mana: Add a
> > driver for
> > Microsoft Azure Network Adapter)
>
> nit: A correct format of this fixes tag is:
>
> In particular:
> * All on lone line
> * Description in double quotes.
>
> Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network
> Adapter (MANA)")

Hi Souradeep, FYI I often refer to:
https://marc.info/?l=linux-pci&m=150905742808166&w=2

The link mentions:
alias gsr='git --no-pager show -s --abbrev-commit --abbrev=12 --pretty=format:"%h (\"%s\")%n"'

"gsr ca9c54d2d6a5ab2430c4eda364c77125d62e5e0f" produces:
ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")

2023-06-26 21:27:54

by Stephen Hemminger

Subject: Re: [PATCH 1/2 V3 net] net: mana: Fix MANA VF unload when host is unresponsive

On Mon, 26 Jun 2023 20:06:48 +0000
Dexuan Cui <[email protected]> wrote:

> > From: Simon Horman
> > Sent: Monday, June 26, 2023 6:05 AM
> > > ...
> > > Fixes: ca9c54d2d6a5ab2430c4eda364c77125d62e5e0f (net: mana: Add a
> > > driver for
> > > Microsoft Azure Network Adapter)
> >
> > nit: A correct format of this fixes tag is:
> >
> > In particular:
> > * All on lone line
> > * Description in double quotes.
> >
> > Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network
> > Adapter (MANA)")
>
> Hi Souradeep, FYI I often refer to:
> https://marc.info/?l=linux-pci&m=150905742808166&w=2
>
> The link mentions:
> alias gsr='git --no-pager show -s --abbrev-commit --abbrev=12 --pretty=format:"%h (\"%s\")%n"'
>
> "gsr ca9c54d2d6a5ab2430c4eda364c77125d62e5e0f" produces:
> ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")

You can do the same thing without a shell alias by using git-config:

[alias]
fixes = log -1 --format=fixes
gsr = log -1 --format=gsr

[pretty]
fixes = Fixes: %h (\"%s\")
gsr = %h (\"%s\")

Then:
$ git gsr 1919b39fc6eabb9a6f9a51706ff6d03865f5df29
1919b39fc6ea ("net: mana: Fix perf regression: remove rx_cqes, tx_cqes counters")


2023-06-27 08:42:22

by Souradeep Chakrabarti

Subject: Re: [PATCH 1/2 V3 net] net: mana: Fix MANA VF unload when host is unresponsive

On Mon, Jun 26, 2023 at 01:47:21PM -0700, Stephen Hemminger wrote:
> On Mon, 26 Jun 2023 20:06:48 +0000
> Dexuan Cui <[email protected]> wrote:
>
> > > From: Simon Horman
> > > Sent: Monday, June 26, 2023 6:05 AM
> > > > ...
> > > > Fixes: ca9c54d2d6a5ab2430c4eda364c77125d62e5e0f (net: mana: Add a
> > > > driver for
> > > > Microsoft Azure Network Adapter)
> > >
> > > nit: A correct format of this fixes tag is:
> > >
> > > In particular:
> > > * All on lone line
> > > * Description in double quotes.
> > >
> > > Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network
> > > Adapter (MANA)")
> >
> > Hi Souradeep, FYI I often refer to:
> > https://marc.info/?l=linux-pci&m=150905742808166&w=2
> >
> > The link mentions:
> > alias gsr='git --no-pager show -s --abbrev-commit --abbrev=12 --pretty=format:"%h (\"%s\")%n"'
> >

Thank you for the advice. I will use it from now on.

> > "gsr ca9c54d2d6a5ab2430c4eda364c77125d62e5e0f" produces:
> > ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
>
> You can do same thing without shell alias by using git-config
>
> [alias]
> fixes = log -1 --format=fixes
> gsr = log -1 --format=gsr
>
> [pretty]
> fixes = Fixes: %h (\"%s\")
> gsr = %h (\"%s\")
>
> Then:
> $ git gsr 1919b39fc6eabb9a6f9a51706ff6d03865f5df29
> 1919b39fc6ea ("net: mana: Fix perf regression: remove rx_cqes, tx_cqes counters")

Thank you for the suggestion.