2021-06-03 06:36:33

by Íñigo Huguet

[permalink] [raw]
Subject: [PATCH 1/2] net:cxgb3: replace tasklets with works

OFLD and CTRL TX queues can be stopped if there is no room in
their DMA rings. If this happens, they're tried to be restarted
later after having made some room in the corresponding ring.

The tasks of restarting these queues were triggered using
tasklets, but they can be replaced for workqueue works, getting
them out of softirq context.

This queues stop/restart probably doesn't happen often and they
can be quite lengthy because they try to send all pending skbs.
Moreover, given that probably the ring is not empty yet, so the
DMA still has work to do, we don't need to be so fast to justify
using tasklets/softirq instead of running in a thread.

Signed-off-by: Íñigo Huguet <[email protected]>
---
drivers/net/ethernet/chelsio/cxgb3/adapter.h | 2 +-
drivers/net/ethernet/chelsio/cxgb3/common.h | 2 ++
drivers/net/ethernet/chelsio/cxgb3/sge.c | 38 +++++++++++---------
3 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb3/adapter.h b/drivers/net/ethernet/chelsio/cxgb3/adapter.h
index f80fbd81b609..6d682b7c7aac 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/adapter.h
+++ b/drivers/net/ethernet/chelsio/cxgb3/adapter.h
@@ -178,7 +178,7 @@ struct sge_txq { /* state for an SGE Tx queue */
unsigned int token; /* WR token */
dma_addr_t phys_addr; /* physical address of the ring */
struct sk_buff_head sendq; /* List of backpressured offload packets */
- struct tasklet_struct qresume_tsk; /* restarts the queue */
+ struct work_struct qresume_task; /* restarts the queue */
unsigned int cntxt_id; /* SGE context id for the Tx q */
unsigned long stops; /* # of times q has been stopped */
unsigned long restarts; /* # of queue restarts */
diff --git a/drivers/net/ethernet/chelsio/cxgb3/common.h b/drivers/net/ethernet/chelsio/cxgb3/common.h
index 1bd7d89666c4..b706f2fbe4f4 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/common.h
+++ b/drivers/net/ethernet/chelsio/cxgb3/common.h
@@ -770,4 +770,6 @@ int t3_xaui_direct_phy_prep(struct cphy *phy, struct adapter *adapter,
int phy_addr, const struct mdio_ops *mdio_ops);
int t3_aq100x_phy_prep(struct cphy *phy, struct adapter *adapter,
int phy_addr, const struct mdio_ops *mdio_ops);
+
+extern struct workqueue_struct *cxgb3_wq;
#endif /* __CHELSIO_COMMON_H */
diff --git a/drivers/net/ethernet/chelsio/cxgb3/sge.c b/drivers/net/ethernet/chelsio/cxgb3/sge.c
index 1cc3c51eff71..fa91aa57b50a 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/sge.c
@@ -1518,14 +1518,15 @@ static int ctrl_xmit(struct adapter *adap, struct sge_txq *q,

/**
* restart_ctrlq - restart a suspended control queue
- * @t: pointer to the tasklet associated with this handler
+ * @w: pointer to the work associated with this handler
*
* Resumes transmission on a suspended Tx control queue.
*/
-static void restart_ctrlq(struct tasklet_struct *t)
+static void restart_ctrlq(struct work_struct *w)
{
struct sk_buff *skb;
- struct sge_qset *qs = from_tasklet(qs, t, txq[TXQ_CTRL].qresume_tsk);
+ struct sge_qset *qs = container_of(w, struct sge_qset,
+ txq[TXQ_CTRL].qresume_task);
struct sge_txq *q = &qs->txq[TXQ_CTRL];

spin_lock(&q->lock);
@@ -1736,14 +1737,15 @@ again: reclaim_completed_tx(adap, q, TX_RECLAIM_CHUNK);

/**
* restart_offloadq - restart a suspended offload queue
- * @t: pointer to the tasklet associated with this handler
+ * @w: pointer to the work associated with this handler
*
* Resumes transmission on a suspended Tx offload queue.
*/
-static void restart_offloadq(struct tasklet_struct *t)
+static void restart_offloadq(struct work_struct *w)
{
struct sk_buff *skb;
- struct sge_qset *qs = from_tasklet(qs, t, txq[TXQ_OFLD].qresume_tsk);
+ struct sge_qset *qs = container_of(w, struct sge_qset,
+ txq[TXQ_OFLD].qresume_task);
struct sge_txq *q = &qs->txq[TXQ_OFLD];
const struct port_info *pi = netdev_priv(qs->netdev);
struct adapter *adap = pi->adapter;
@@ -1998,13 +2000,17 @@ static void restart_tx(struct sge_qset *qs)
should_restart_tx(&qs->txq[TXQ_OFLD]) &&
test_and_clear_bit(TXQ_OFLD, &qs->txq_stopped)) {
qs->txq[TXQ_OFLD].restarts++;
- tasklet_schedule(&qs->txq[TXQ_OFLD].qresume_tsk);
+
+ /* The work can be quite lengthy so we use driver's own queue */
+ queue_work(cxgb3_wq, &qs->txq[TXQ_OFLD].qresume_task);
}
if (test_bit(TXQ_CTRL, &qs->txq_stopped) &&
should_restart_tx(&qs->txq[TXQ_CTRL]) &&
test_and_clear_bit(TXQ_CTRL, &qs->txq_stopped)) {
qs->txq[TXQ_CTRL].restarts++;
- tasklet_schedule(&qs->txq[TXQ_CTRL].qresume_tsk);
+
+ /* The work can be quite lengthy so we use driver's own queue */
+ queue_work(cxgb3_wq, &qs->txq[TXQ_CTRL].qresume_task);
}
}

@@ -3085,8 +3091,8 @@ int t3_sge_alloc_qset(struct adapter *adapter, unsigned int id, int nports,
skb_queue_head_init(&q->txq[i].sendq);
}

- tasklet_setup(&q->txq[TXQ_OFLD].qresume_tsk, restart_offloadq);
- tasklet_setup(&q->txq[TXQ_CTRL].qresume_tsk, restart_ctrlq);
+ INIT_WORK(&q->txq[TXQ_OFLD].qresume_task, restart_offloadq);
+ INIT_WORK(&q->txq[TXQ_CTRL].qresume_task, restart_ctrlq);

q->fl[0].gen = q->fl[1].gen = 1;
q->fl[0].size = p->fl_size;
@@ -3276,11 +3282,11 @@ void t3_sge_start(struct adapter *adap)
*
* Can be invoked from interrupt context e.g. error handler.
*
- * Note that this function cannot disable the restart of tasklets as
+ * Note that this function cannot disable the restart of works as
* it cannot wait if called from interrupt context, however the
- * tasklets will have no effect since the doorbells are disabled. The
+ * works will have no effect since the doorbells are disabled. The
* driver will call tg3_sge_stop() later from process context, at
- * which time the tasklets will be stopped if they are still running.
+ * which time the works will be stopped if they are still running.
*/
void t3_sge_stop_dma(struct adapter *adap)
{
@@ -3292,7 +3298,7 @@ void t3_sge_stop_dma(struct adapter *adap)
* @adap: the adapter
*
* Called from process context. Disables the DMA engine and any
- * pending queue restart tasklets.
+ * pending queue restart works.
*/
void t3_sge_stop(struct adapter *adap)
{
@@ -3303,8 +3309,8 @@ void t3_sge_stop(struct adapter *adap)
for (i = 0; i < SGE_QSETS; ++i) {
struct sge_qset *qs = &adap->sge.qs[i];

- tasklet_kill(&qs->txq[TXQ_OFLD].qresume_tsk);
- tasklet_kill(&qs->txq[TXQ_CTRL].qresume_tsk);
+ cancel_work_sync(&qs->txq[TXQ_OFLD].qresume_task);
+ cancel_work_sync(&qs->txq[TXQ_OFLD].qresume_task);
}
}

--
2.31.1


2021-06-03 06:37:25

by Íñigo Huguet

[permalink] [raw]
Subject: [PATCH 2/2] net:cxgb3: fix code style issues

Signed-off-by: Íñigo Huguet <[email protected]>
---
.../net/ethernet/chelsio/cxgb3/cxgb3_main.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
index 84ad7261e243..57f210c53afc 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
@@ -1273,14 +1273,14 @@ static int cxgb_up(struct adapter *adap)
free_irq(adap->msix_info[0].vec, adap);
goto irq_err;
}
- } else if ((err = request_irq(adap->pdev->irq,
- t3_intr_handler(adap,
- adap->sge.qs[0].rspq.
- polling),
- (adap->flags & USING_MSI) ?
- 0 : IRQF_SHARED,
- adap->name, adap)))
- goto irq_err;
+ } else {
+ err = request_irq(adap->pdev->irq,
+ t3_intr_handler(adap, adap->sge.qs[0].rspq.polling),
+ (adap->flags & USING_MSI) ? 0 : IRQF_SHARED,
+ adap->name, adap);
+ if (err)
+ goto irq_err;
+ }

enable_all_napi(adap);
t3_sge_start(adap);
@@ -3098,8 +3098,9 @@ static void set_nqsets(struct adapter *adap)
nqsets = num_cpus;
if (nqsets < 1 || hwports == 4)
nqsets = 1;
- } else
+ } else {
nqsets = 1;
+ }

for_each_port(adap, i) {
struct port_info *pi = adap2pinfo(adap, i);
--
2.31.1

2021-06-03 12:37:15

by Íñigo Huguet

[permalink] [raw]
Subject: Re: [PATCH 1/2] net:cxgb3: replace tasklets with works

On Thu, Jun 3, 2021 at 9:47 AM Hillf Danton <[email protected]> wrote:
>
> On Thu, 3 Jun 2021 08:34:29 +0200 Inigo Huguet wrote:
> >
> > Moreover, given that probably the ring is not empty yet, so the
> > DMA still has work to do, we don't need to be so fast to justify
> > using tasklets/softirq instead of running in a thread.
>
> [...]
>
> > - tasklet_kill(&qs->txq[TXQ_OFLD].qresume_tsk);
> > - tasklet_kill(&qs->txq[TXQ_CTRL].qresume_tsk);
> > + cancel_work_sync(&qs->txq[TXQ_OFLD].qresume_task);
> > + cancel_work_sync(&qs->txq[TXQ_OFLD].qresume_task);
>
> This is the last minute mark that figures are needed to support your
> reasoning above.

OFLD queue has length=1024, and CTRL queue length=256. If a queue is
so full that the next packet doesn't fit, driver "stop" that queue.
That means that it adds the new outcoming packets to a list of packets
which are pending from being enqueued. Packets which were already in
the queue keep being processed by the DMA. When the queue has half or
more of free space, pending packets are moved from the "pending" list
to the queue.

If the list got full in the first place, it was because the NIC was
processing the packets slower than the CPU was trying to send them.
Given that, it is reasonable to think that there is enough time to
enqueue the pending packets before the DMA and the NIC are able to
process the other half of data still in the queue. If for some reason
the situation has changed in the meanwhile, and now the NIC is capable
of sending the packets REALLY fast, and the queue gets empty before
the qresume_task is run, it will be a small delay only once, not many
times, so it is not a big problem. But honestly, I don't know if this
situation can really happen in practice.

In my opinion there are no drawbacks moving these tasks to threads,
and in exchange we avoid increasing latencies in softirq context.
Consider that there might be a high amount of packets pending of being
enqueued, and in these "resume" tasks pending packets keep being
enqueued until there are no more of them, or until the queue gets full
again. The "resume" task might run for quite a long time.

cancel_work_sync is only called when the interface is set to down
state or the module removed.

--
Íñigo Huguet

2021-06-03 22:04:00

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH 1/2] net:cxgb3: replace tasklets with works

Hello:

This series was applied to netdev/net-next.git (refs/heads/master):

On Thu, 3 Jun 2021 08:34:29 +0200 you wrote:
> OFLD and CTRL TX queues can be stopped if there is no room in
> their DMA rings. If this happens, they're tried to be restarted
> later after having made some room in the corresponding ring.
>
> The tasks of restarting these queues were triggered using
> tasklets, but they can be replaced for workqueue works, getting
> them out of softirq context.
>
> [...]

Here is the summary with links:
- [1/2] net:cxgb3: replace tasklets with works
https://git.kernel.org/netdev/net-next/c/5e0b8928927f
- [2/2] net:cxgb3: fix code style issues
https://git.kernel.org/netdev/net-next/c/6a8dd8b2fa5b

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


2021-06-04 11:21:44

by Íñigo Huguet

[permalink] [raw]
Subject: Re: [PATCH 1/2] net:cxgb3: replace tasklets with works

On Fri, Jun 4, 2021 at 5:15 AM Hillf Danton <[email protected]> wrote:
> Good material to be shoehorned into commit message though without sheding
> any light first on the reasons why CTRL is so special its work wont bother
> being canceled.

Not sure what you mean. If it's that CTRL work isn't being cancelled,
but OFLD is being cancelled twice, thanks for pointing it. I'm sending
a new patch.

> How long? Long enough for the kworker to become a CPU hog?
> What is it in your opinion to cut the chance for that risk?

I don't think so, not by its own, but the maybe yes the sum of many
tasks in softirq context (it will depend on what else is being
executed in the system). As far as I know, moving work from softirq to
threads to reduce latencies is seen as a good thing, unless it's
really necessary.
--
Íñigo Huguet