Prompted by Vinicius' request to consolidate some child Qdisc
dereferences in taprio:
https://lore.kernel.org/netdev/[email protected]/
I remembered that I had left some unfinished work in this Qdisc, namely
commit af7b29b1deaa ("Revert "net/sched: taprio: make qdisc_leaf() see
the per-netdev-queue pfifo child qdiscs"").
This patch set represents another stab at, essentially, what's in the
title. Not only does taprio not properly detect when it's grafted as a
non-root qdisc, but it also returns incorrect per-class stats.
Eventually, Vinicius' request is addressed too, although in a different
form than the one he requested (which was purely cosmetic).
Review from people more experienced with Qdiscs than me would be
appreciated. I tried my best to explain what I consider to be problems.
I am deliberately targeting net-next because the changes are too
invasive for net - they were reverted from stable once already.
Vladimir Oltean (5):
net/sched: taprio: don't access q->qdiscs[] in unoffloaded mode during
attach()
net/sched: taprio: keep child Qdisc refcount elevated at 2 in offload
mode
net/sched: taprio: try again to report q->qdiscs[] to qdisc_leaf()
net/sched: taprio: delete misleading comment about preallocating child
qdiscs
net/sched: taprio: dump class stats for the actual q->qdiscs[]
net/sched/sch_taprio.c | 60 ++++++++++++++++++++++++------------------
1 file changed, 35 insertions(+), 25 deletions(-)
--
2.34.1
This is a simple code transformation with no intended behavior change,
just to make it absolutely clear that q->qdiscs[] is only attached to
the child taprio classes in full offload mode.
Right now we use the q->qdiscs[] variable in taprio_attach() for
software mode too, but that is quite confusing and avoidable. We use
it only to reach the netdev TX queue, but we could as well just use
netdev_get_tx_queue() for that.
Signed-off-by: Vladimir Oltean <[email protected]>
---
net/sched/sch_taprio.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index 3c4c2c334878..b1c611c72aa4 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -2130,14 +2130,20 @@ static void taprio_attach(struct Qdisc *sch)
/* Attach underlying qdisc */
for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
- struct Qdisc *qdisc = q->qdiscs[ntx];
+ struct netdev_queue *dev_queue = netdev_get_tx_queue(dev, ntx);
struct Qdisc *old;
if (FULL_OFFLOAD_IS_ENABLED(q->flags)) {
+ struct Qdisc *qdisc = q->qdiscs[ntx];
+
qdisc->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
- old = dev_graft_qdisc(qdisc->dev_queue, qdisc);
+ old = dev_graft_qdisc(dev_queue, qdisc);
} else {
- old = dev_graft_qdisc(qdisc->dev_queue, sch);
+ /* In software mode, attach the root taprio qdisc
+ * to all netdev TX queues, so that dev_qdisc_enqueue()
+ * goes through taprio_enqueue().
+ */
+ old = dev_graft_qdisc(dev_queue, sch);
qdisc_refcount_inc(sch);
}
if (old)
--
2.34.1
This makes a difference for the software scheduling mode, where
dev_queue->qdisc_sleeping is the same as the taprio root Qdisc itself,
but when we're talking about what Qdisc and stats get reported for a
traffic class, the root taprio isn't what comes to mind, but q->qdiscs[]
is.
To understand the difference, I've attempted to send 100 packets in
software mode through traffic class 0 (they are in the Qdisc's backlog),
and recorded the stats before and after the change.
Here is before:
$ tc -s class show dev eth0
class taprio 8001:1 root leaf 8001:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 9400b 100p requeues 0
Window drops: 0
class taprio 8001:2 root leaf 8001:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 9400b 100p requeues 0
Window drops: 0
class taprio 8001:3 root leaf 8001:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 9400b 100p requeues 0
Window drops: 0
class taprio 8001:4 root leaf 8001:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 9400b 100p requeues 0
Window drops: 0
class taprio 8001:5 root leaf 8001:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 9400b 100p requeues 0
Window drops: 0
class taprio 8001:6 root leaf 8001:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 9400b 100p requeues 0
Window drops: 0
class taprio 8001:7 root leaf 8001:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 9400b 100p requeues 0
Window drops: 0
class taprio 8001:8 root leaf 8001:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 9400b 100p requeues 0
Window drops: 0
and here is after:
class taprio 8001:1 root
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 9400b 100p requeues 0
Window drops: 0
class taprio 8001:2 root
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
Window drops: 0
class taprio 8001:3 root
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
Window drops: 0
class taprio 8001:4 root
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
Window drops: 0
class taprio 8001:5 root
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
Window drops: 0
class taprio 8001:6 root
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
Window drops: 0
class taprio 8001:7 root leaf 8010:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
Window drops: 0
class taprio 8001:8 root
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
Window drops: 0
The most glaring (and expected) difference is that before, all class
stats reported the global stats, whereas now, they really report just
the counters for that traffic class.
Signed-off-by: Vladimir Oltean <[email protected]>
---
net/sched/sch_taprio.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index cc7ff98e5e86..23b98c3af8b2 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -2452,11 +2452,11 @@ static unsigned long taprio_find(struct Qdisc *sch, u32 classid)
static int taprio_dump_class(struct Qdisc *sch, unsigned long cl,
struct sk_buff *skb, struct tcmsg *tcm)
{
- struct netdev_queue *dev_queue = taprio_queue_get(sch, cl);
+ struct Qdisc *child = taprio_leaf(sch, cl);
tcm->tcm_parent = TC_H_ROOT;
tcm->tcm_handle |= TC_H_MIN(cl);
- tcm->tcm_info = dev_queue->qdisc_sleeping->handle;
+ tcm->tcm_info = child->handle;
return 0;
}
@@ -2466,8 +2466,7 @@ static int taprio_dump_class_stats(struct Qdisc *sch, unsigned long cl,
__releases(d->lock)
__acquires(d->lock)
{
- struct netdev_queue *dev_queue = taprio_queue_get(sch, cl);
- struct Qdisc *child = dev_queue->qdisc_sleeping;
+ struct Qdisc *child = taprio_leaf(sch, cl);
struct tc_taprio_qopt_offload offload = {
.cmd = TAPRIO_CMD_TC_STATS,
.tc_stats = {
--
2.34.1
This is another stab at commit 1461d212ab27 ("net/sched: taprio: make
qdisc_leaf() see the per-netdev-queue pfifo child qdiscs"), later
reverted in commit af7b29b1deaa ("Revert "net/sched: taprio: make
qdisc_leaf() see the per-netdev-queue pfifo child qdiscs"").
I believe that the problems that caused the revert were fixed, and thus,
this change is identical to the original patch.
Its purpose is to properly reject attaching a software taprio child
qdisc to a software taprio parent. Because unoffloaded taprio currently
reports itself (the root Qdisc) as the return value from qdisc_leaf(),
then the process of attaching another taprio as child to a Qdisc class
of the root will just result in a Qdisc_ops :: change() call for the
root. Whereas that's not we want. We want Qdisc_ops :: init() to be
called for the taprio child, in order to give the taprio child a chance
to check whether its sch->parent is TC_H_ROOT or not (and reject this
configuration).
Signed-off-by: Vladimir Oltean <[email protected]>
---
net/sched/sch_taprio.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index 8807fc915b79..263306fe38d8 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -2433,12 +2433,14 @@ static int taprio_dump(struct Qdisc *sch, struct sk_buff *skb)
static struct Qdisc *taprio_leaf(struct Qdisc *sch, unsigned long cl)
{
- struct netdev_queue *dev_queue = taprio_queue_get(sch, cl);
+ struct taprio_sched *q = qdisc_priv(sch);
+ struct net_device *dev = qdisc_dev(sch);
+ unsigned int ntx = cl - 1;
- if (!dev_queue)
+ if (ntx >= dev->num_tx_queues)
return NULL;
- return dev_queue->qdisc_sleeping;
+ return q->qdiscs[ntx];
}
static unsigned long taprio_find(struct Qdisc *sch, u32 classid)
--
2.34.1