2023-07-13 06:26:29

Subject: [net-next Patchv2 0/3] support Round Robin scheduling

OcteonTx2 and CN10K silicon support Round Robin scheduling. When multiple
traffic flows with the same priority reach the transmit level, Round Robin
scheduling picks the traffic flow with the highest quantum value. With this
support, the user can add multiple classes with the same priority and
different quantum values in HTB offload.

This series of patches adds support for the same.
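
The thread does not spell out how the hardware arbitrates between
same-priority flows. As a rough illustration only, the sketch below shows
how a quantum acts as a weight in a generic deficit-weighted round robin
(DWRR) scheduler, the term the driver patch uses for these child nodes.
struct flow, dwrr_run and the fixed packet length are assumptions made for
this example and are not part of the series.

```c
#include <stddef.h>

/* Hypothetical model of one always-backlogged flow. */
struct flow {
	unsigned int quantum;	/* bytes credited to the flow per round */
	unsigned int deficit;	/* unused credit carried between rounds */
	unsigned int pkt_len;	/* fixed packet size, in bytes */
	unsigned long sent;	/* total bytes transmitted so far */
};

/* Run a generic DWRR loop for the given number of rounds. */
static void dwrr_run(struct flow *flows, size_t n, unsigned int rounds)
{
	unsigned int r;
	size_t i;

	for (r = 0; r < rounds; r++) {
		for (i = 0; i < n; i++) {
			struct flow *f = &flows[i];

			if (f->pkt_len == 0)
				continue;	/* nothing to send */

			/* Credit the flow, then send while credit lasts. */
			f->deficit += f->quantum;
			while (f->deficit >= f->pkt_len) {
				f->deficit -= f->pkt_len;
				f->sent += f->pkt_len;
			}
		}
	}
}
```

In this model, two flows with quantum 2048 and 4096 and 1024-byte packets
send bytes in a 1:2 ratio, which is the effect a user would expect from
giving same-priority classes different quantum values.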

Patch1: implement the transmit scheduler allocation algorithm in
preparation for Round Robin scheduling support.

Patch2: allow the quantum parameter in HTB offload mode.

Patch3: extend octeontx2 HTB offload support for Round Robin scheduling.

Naveen Mamindlapalli (3):
octeontx2-pf: implement transmit schedular allocation algorithm
sch_htb: Allow HTB quantum parameter in offload mode
octeontx2-pf: htb offload support for Round Robin scheduling
--
v2 * change the data type of schq_index_used to reduce the size of
struct otx2_qos_cfg

.../marvell/octeontx2/nic/otx2_common.c | 1 +
.../marvell/octeontx2/nic/otx2_common.h | 1 +
.../net/ethernet/marvell/octeontx2/nic/qos.c | 347 ++++++++++++++++--
.../net/ethernet/marvell/octeontx2/nic/qos.h | 11 +-
.../net/ethernet/mellanox/mlx5/core/en/qos.c | 4 +-
include/net/pkt_cls.h | 1 +
net/sched/sch_htb.c | 7 +-
7 files changed, 329 insertions(+), 43 deletions(-)

--
2.17.1

2023-07-13 06:39:06

by Hariprasad Kelam

Subject: [net-next Patchv2 1/3] octeontx2-pf: implement transmit schedular allocation algorithm

From: Naveen Mamindlapalli <[email protected]>

Unlike strict priority, where the number of classes is limited to a
maximum of 8, there is no restriction on the number of DWRR child nodes
unless the count exceeds the maximum number of child nodes supported.

Hardware expects strict priority transmit scheduler indexes to be mapped
to their priority. This patch defines a transmit scheduler allocation
algorithm such that the above requirement is honored.
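
As a rough illustration of the rule described above, the standalone sketch
below mirrors the logic of __otx2_qos_assign_base_idx_tl() from the diff
that follows: static (strict priority) children take the slot equal to
their priority, and the remaining children take the first free slot at or
above their priority. struct child, assign_one, assign_indexes and
MAX_SLOTS are hypothetical names for this example, and a plain bool array
stands in for the kernel bitmap.

```c
#include <stdbool.h>
#include <stddef.h>

#define INVALID_IDX	0xFFFF	/* same value as OTX2_QOS_INVALID_TXSCHQ_IDX */
#define MAX_SLOTS	64	/* hypothetical upper bound for this sketch */

/* Hypothetical stand-in for the relevant fields of struct otx2_qos_node. */
struct child {
	unsigned int prio;
	bool is_static;
	unsigned int idx;	/* assigned scheduler slot */
};

/* Assign one child to a free slot, if it does not have one yet. */
static void assign_one(struct child *c, bool *used, size_t nslots)
{
	size_t i;

	if (c->idx != INVALID_IDX)
		return;

	for (i = 0; i < nslots; i++) {
		if (used[i])
			continue;
		/* static: slot must equal prio; otherwise: first slot >= prio */
		if ((c->is_static && c->prio == i) ||
		    (!c->is_static && i >= c->prio)) {
			c->idx = (unsigned int)i;
			used[i] = true;
			return;
		}
	}
}

/* Returns 0 on success, -1 if nslots exceeds the sketch's fixed bound. */
static int assign_indexes(struct child *kids, size_t n, size_t nslots)
{
	bool used[MAX_SLOTS] = { false };
	size_t i;

	if (nslots > MAX_SLOTS)
		return -1;

	for (i = 0; i < n; i++)
		kids[i].idx = INVALID_IDX;

	/* First pass: static children claim their 1:1 priority slots. */
	for (i = 0; i < n; i++)
		if (kids[i].is_static)
			assign_one(&kids[i], used, nslots);

	/* Second pass: remaining (DWRR) children fill the free slots. */
	for (i = 0; i < n; i++)
		assign_one(&kids[i], used, nslots);

	return 0;
}
```

With static children at priority 0 and 2 plus two DWRR children at
priority 1, the patch's slot count is child_dwrr_cnt + max_static_prio + 1
= 2 + 2 + 1 = 5. The static children land on slots 0 and 2, the DWRR
children on slots 1 and 3, and slot 4 stays unused, which is the case that
otx2_qos_free_unused_txschq() in the diff is there to release.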

Signed-off-by: Naveen Mamindlapalli <[email protected]>
Signed-off-by: Hariprasad Kelam <[email protected]>
---
.../net/ethernet/marvell/octeontx2/nic/qos.c | 129 +++++++++++++++++-
.../net/ethernet/marvell/octeontx2/nic/qos.h | 6 +
2 files changed, 129 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/qos.c b/drivers/net/ethernet/marvell/octeontx2/nic/qos.c
index d3a76c5ccda8..51e9be55d5f5 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/qos.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/qos.c
@@ -19,6 +19,7 @@
#define OTX2_QOS_CLASS_NONE 0
#define OTX2_QOS_DEFAULT_PRIO 0xF
#define OTX2_QOS_INVALID_SQ 0xFFFF
+#define OTX2_QOS_INVALID_TXSCHQ_IDX 0xFFFF

static void otx2_qos_update_tx_netdev_queues(struct otx2_nic *pfvf)
{
@@ -315,9 +316,12 @@ static void otx2_qos_fill_cfg_tl(struct otx2_qos_node *parent,

list_for_each_entry(node, &parent->child_list, list) {
otx2_qos_fill_cfg_tl(node, cfg);
- cfg->schq_contig[node->level]++;
otx2_qos_fill_cfg_schq(node, cfg);
}
+
+ /* Assign the required number of transmit schedular queues under the given class */
+ cfg->schq_contig[parent->level - 1] += parent->child_dwrr_cnt +
+ parent->max_static_prio + 1;
}

static void otx2_qos_prepare_txschq_cfg(struct otx2_nic *pfvf,
@@ -401,9 +405,13 @@ static int otx2_qos_add_child_node(struct otx2_qos_node *parent,
struct otx2_qos_node *tmp_node;
struct list_head *tmp;

+ if (node->prio > parent->max_static_prio)
+ parent->max_static_prio = node->prio;
+
for (tmp = head->next; tmp != head; tmp = tmp->next) {
tmp_node = list_entry(tmp, struct otx2_qos_node, list);
- if (tmp_node->prio == node->prio)
+ if (tmp_node->prio == node->prio &&
+ tmp_node->is_static)
return -EEXIST;
if (tmp_node->prio > node->prio) {
list_add_tail(&node->list, tmp);
@@ -476,6 +484,7 @@ otx2_qos_sw_create_leaf_node(struct otx2_nic *pfvf,
node->rate = otx2_convert_rate(rate);
node->ceil = otx2_convert_rate(ceil);
node->prio = prio;
+ node->is_static = true;

__set_bit(qid, pfvf->qos.qos_sq_bmap);

@@ -628,6 +637,20 @@ static int otx2_qos_txschq_alloc(struct otx2_nic *pfvf,
return rc;
}

+static void otx2_qos_free_unused_txschq(struct otx2_nic *pfvf, struct otx2_qos_cfg *cfg)
+{
+ int lvl, idx, schq;
+
+ for (lvl = 0; lvl < NIX_TXSCH_LVL_CNT; lvl++) {
+ for (idx = 0; idx < cfg->schq_contig[lvl]; idx++) {
+ if (!cfg->schq_index_used[lvl][idx]) {
+ schq = cfg->schq_contig_list[lvl][idx];
+ otx2_txschq_free_one(pfvf, lvl, schq);
+ }
+ }
+ }
+}
+
static void otx2_qos_txschq_fill_cfg_schq(struct otx2_nic *pfvf,
struct otx2_qos_node *node,
struct otx2_qos_cfg *cfg)
@@ -652,9 +675,10 @@ static void otx2_qos_txschq_fill_cfg_tl(struct otx2_nic *pfvf,
list_for_each_entry(tmp, &node->child_list, list) {
otx2_qos_txschq_fill_cfg_tl(pfvf, tmp, cfg);
cnt = cfg->static_node_pos[tmp->level];
- tmp->schq = cfg->schq_contig_list[tmp->level][cnt];
+ tmp->schq = cfg->schq_contig_list[tmp->level][tmp->txschq_idx];
+ cfg->schq_index_used[tmp->level][tmp->txschq_idx] = true;
if (cnt == 0)
- node->prio_anchor = tmp->schq;
+ node->prio_anchor = cfg->schq_contig_list[tmp->level][0];
cfg->static_node_pos[tmp->level]++;
otx2_qos_txschq_fill_cfg_schq(pfvf, tmp, cfg);
}
@@ -667,9 +691,84 @@ static void otx2_qos_txschq_fill_cfg(struct otx2_nic *pfvf,
mutex_lock(&pfvf->qos.qos_lock);
otx2_qos_txschq_fill_cfg_tl(pfvf, node, cfg);
otx2_qos_txschq_fill_cfg_schq(pfvf, node, cfg);
+ otx2_qos_free_unused_txschq(pfvf, cfg);
mutex_unlock(&pfvf->qos.qos_lock);
}

+static void __otx2_qos_assign_base_idx_tl(struct otx2_nic *pfvf,
+ struct otx2_qos_node *tmp,
+ unsigned long *child_idx_bmap,
+ int child_cnt)
+{
+ int idx;
+
+ if (tmp->txschq_idx != OTX2_QOS_INVALID_TXSCHQ_IDX)
+ return;
+
+ /* assign static nodes 1:1 prio mapping first, then remaining nodes */
+ for (idx = 0; idx < child_cnt; idx++) {
+ if (tmp->is_static && tmp->prio == idx &&
+ !test_bit(idx, child_idx_bmap)) {
+ tmp->txschq_idx = idx;
+ set_bit(idx, child_idx_bmap);
+ return;
+ } else if (!tmp->is_static && idx >= tmp->prio &&
+ !test_bit(idx, child_idx_bmap)) {
+ tmp->txschq_idx = idx;
+ set_bit(idx, child_idx_bmap);
+ return;
+ }
+ }
+}
+
+static int otx2_qos_assign_base_idx_tl(struct otx2_nic *pfvf,
+ struct otx2_qos_node *node)
+{
+ unsigned long *child_idx_bmap;
+ struct otx2_qos_node *tmp;
+ int child_cnt;
+
+ list_for_each_entry(tmp, &node->child_list, list)
+ tmp->txschq_idx = OTX2_QOS_INVALID_TXSCHQ_IDX;
+
+ /* allocate child index array */
+ child_cnt = node->child_dwrr_cnt + node->max_static_prio + 1;
+ child_idx_bmap = kcalloc(BITS_TO_LONGS(child_cnt), sizeof(unsigned long),
+ GFP_KERNEL);
+ if (!child_idx_bmap)
+ return -ENOMEM;
+
+ list_for_each_entry(tmp, &node->child_list, list)
+ otx2_qos_assign_base_idx_tl(pfvf, tmp);
+
+ /* assign base index of static priority children first */
+ list_for_each_entry(tmp, &node->child_list, list) {
+ if (!tmp->is_static)
+ continue;
+ __otx2_qos_assign_base_idx_tl(pfvf, tmp, child_idx_bmap, child_cnt);
+ }
+
+ /* assign base index of dwrr priority children */
+ list_for_each_entry(tmp, &node->child_list, list)
+ __otx2_qos_assign_base_idx_tl(pfvf, tmp, child_idx_bmap, child_cnt);
+
+ kfree(child_idx_bmap);
+
+ return 0;
+}
+
+static int otx2_qos_assign_base_idx(struct otx2_nic *pfvf,
+ struct otx2_qos_node *node)
+{
+ int ret = 0;
+
+ mutex_lock(&pfvf->qos.qos_lock);
+ ret = otx2_qos_assign_base_idx_tl(pfvf, node);
+ mutex_unlock(&pfvf->qos.qos_lock);
+
+ return ret;
+}
+
static int otx2_qos_txschq_push_cfg_schq(struct otx2_nic *pfvf,
struct otx2_qos_node *node,
struct otx2_qos_cfg *cfg)
@@ -761,8 +860,10 @@ static void otx2_qos_free_cfg(struct otx2_nic *pfvf, struct otx2_qos_cfg *cfg)

for (lvl = 0; lvl < NIX_TXSCH_LVL_CNT; lvl++) {
for (idx = 0; idx < cfg->schq_contig[lvl]; idx++) {
- schq = cfg->schq_contig_list[lvl][idx];
- otx2_txschq_free_one(pfvf, lvl, schq);
+ if (cfg->schq_index_used[lvl][idx]) {
+ schq = cfg->schq_contig_list[lvl][idx];
+ otx2_txschq_free_one(pfvf, lvl, schq);
+ }
}
}
}
@@ -838,6 +939,10 @@ static int otx2_qos_push_txschq_cfg(struct otx2_nic *pfvf,
if (ret)
return -ENOSPC;

+ ret = otx2_qos_assign_base_idx(pfvf, node);
+ if (ret)
+ return -ENOMEM;
+
if (!(pfvf->netdev->flags & IFF_UP)) {
otx2_qos_txschq_fill_cfg(pfvf, node, cfg);
return 0;
@@ -995,6 +1100,7 @@ static int otx2_qos_leaf_alloc_queue(struct otx2_nic *pfvf, u16 classid,
if (ret)
goto out;

+ parent->child_static_cnt++;
set_bit(prio, parent->prio_bmap);

/* read current txschq configuration */
@@ -1067,6 +1173,7 @@ static int otx2_qos_leaf_alloc_queue(struct otx2_nic *pfvf, u16 classid,
free_old_cfg:
kfree(old_cfg);
reset_prio:
+ parent->child_static_cnt--;
clear_bit(prio, parent->prio_bmap);
out:
return ret;
@@ -1105,6 +1212,7 @@ static int otx2_qos_leaf_to_inner(struct otx2_nic *pfvf, u16 classid,
goto out;
}

+ node->child_static_cnt++;
set_bit(prio, node->prio_bmap);

/* store the qid to assign to leaf node */
@@ -1178,6 +1286,7 @@ static int otx2_qos_leaf_to_inner(struct otx2_nic *pfvf, u16 classid,
free_old_cfg:
kfree(old_cfg);
reset_prio:
+ node->child_static_cnt--;
clear_bit(prio, node->prio_bmap);
out:
return ret;
@@ -1207,6 +1316,10 @@ static int otx2_qos_leaf_del(struct otx2_nic *pfvf, u16 *classid,
otx2_qos_destroy_node(pfvf, node);
pfvf->qos.qid_to_sqmap[qid] = OTX2_QOS_INVALID_SQ;

+ parent->child_static_cnt--;
+ if (!parent->child_static_cnt)
+ parent->max_static_prio = 0;
+
clear_bit(prio, parent->prio_bmap);

return 0;
@@ -1245,6 +1358,10 @@ static int otx2_qos_leaf_del_last(struct otx2_nic *pfvf, u16 classid, bool force
otx2_qos_destroy_node(pfvf, node);
pfvf->qos.qid_to_sqmap[qid] = OTX2_QOS_INVALID_SQ;

+ parent->child_static_cnt--;
+ if (!parent->child_static_cnt)
+ parent->max_static_prio = 0;
+
clear_bit(prio, parent->prio_bmap);

/* create downstream txschq entries to parent */
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/qos.h b/drivers/net/ethernet/marvell/octeontx2/nic/qos.h
index 19773284be27..faa7c24675d1 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/qos.h
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/qos.h
@@ -35,6 +35,7 @@ struct otx2_qos_cfg {
int dwrr_node_pos[NIX_TXSCH_LVL_CNT];
u16 schq_contig_list[NIX_TXSCH_LVL_CNT][MAX_TXSCHQ_PER_FUNC];
u16 schq_list[NIX_TXSCH_LVL_CNT][MAX_TXSCHQ_PER_FUNC];
+ bool schq_index_used[NIX_TXSCH_LVL_CNT][MAX_TXSCHQ_PER_FUNC];
};

struct otx2_qos {
@@ -62,7 +63,12 @@ struct otx2_qos_node {
u16 schq; /* hw txschq */
u16 qid;
u16 prio_anchor;
+ u16 max_static_prio;
+ u16 child_dwrr_cnt;
+ u16 child_static_cnt;
+ u16 txschq_idx; /* txschq allocation index */
u8 level;
+ bool is_static;
};

--
2.17.1

2023-07-13 06:43:04

by Hariprasad Kelam

Subject: [net-next Patchv2 2/3] sch_htb: Allow HTB quantum parameter in offload mode

From: Naveen Mamindlapalli <[email protected]>

The current implementation of HTB offload returns the EINVAL error for
the quantum parameter. This patch removes the error-returning checks for
the 'quantum' parameter and populates its value in the tc_htb_qopt_offload
structure so that the driver can use it.

Add a quantum parameter check in the mlx5 driver, as mlx5 devices are not
capable of supporting the quantum parameter when HTB offload is used.
Report an error if the quantum parameter is set to a non-default value.
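
The driver-side check described above can be sketched in standalone C as
follows. struct htb_offload_req and check_no_prio_quantum are hypothetical
names for this example; the struct only imitates the prio and quantum
fields of tc_htb_qopt_offload, and the sketch assumes that a zero quantum
means the user did not set one.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical subset of the fields carried by tc_htb_qopt_offload. */
struct htb_offload_req {
	uint32_t quantum;	/* assumed to be 0 when not set by the user */
	uint8_t prio;
};

/*
 * Reject requests that use parameters the device cannot honor.
 * Returns 0 when the request is acceptable, -EOPNOTSUPP otherwise.
 */
static int check_no_prio_quantum(const struct htb_offload_req *req)
{
	if (req->prio || req->quantum)
		return -EOPNOTSUPP;

	return 0;
}
```

A driver that does support both parameters would skip this check and read
the quantum value from the request instead.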

Signed-off-by: Naveen Mamindlapalli <[email protected]>
Signed-off-by: Hariprasad Kelam <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/en/qos.c | 4 ++--
include/net/pkt_cls.h | 1 +
net/sched/sch_htb.c | 7 +++----
3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c b/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c
index 1874c2f0587f..244bc15a42ab 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c
@@ -379,9 +379,9 @@ int mlx5e_htb_setup_tc(struct mlx5e_priv *priv, struct tc_htb_qopt_offload *htb_
if (!htb && htb_qopt->command != TC_HTB_CREATE)
return -EINVAL;

- if (htb_qopt->prio) {
+ if (htb_qopt->prio || htb_qopt->quantum) {
NL_SET_ERR_MSG_MOD(htb_qopt->extack,
- "prio parameter is not supported by device with HTB offload enabled.");
+ "prio and quantum parameters are not supported by device with HTB offload enabled.");
return -EOPNOTSUPP;
}

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index a2ea45c7b53e..139cd09828af 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -866,6 +866,7 @@ struct tc_htb_qopt_offload {
u32 parent_classid;
u16 classid;
u16 qid;
+ u32 quantum;
u64 rate;
u64 ceil;
u8 prio;
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 325c29041c7d..333800a7d4eb 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -1810,10 +1810,6 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
NL_SET_ERR_MSG(extack, "HTB offload doesn't support the mpu parameter");
goto failure;
}
- if (hopt->quantum) {
- NL_SET_ERR_MSG(extack, "HTB offload doesn't support the quantum parameter");
- goto failure;
- }
}

/* Keeping backward compatible with rate_table based iproute2 tc */
@@ -1910,6 +1906,7 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
.rate = max_t(u64, hopt->rate.rate, rate64),
.ceil = max_t(u64, hopt->ceil.rate, ceil64),
.prio = hopt->prio,
+ .quantum = hopt->quantum,
.extack = extack,
};
err = htb_offload(dev, &offload_opt);
@@ -1931,6 +1928,7 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
.rate = max_t(u64, hopt->rate.rate, rate64),
.ceil = max_t(u64, hopt->ceil.rate, ceil64),
.prio = hopt->prio,
+ .quantum = hopt->quantum,
.extack = extack,
};
err = htb_offload(dev, &offload_opt);
@@ -2017,6 +2015,7 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
.rate = max_t(u64, hopt->rate.rate, rate64),
.ceil = max_t(u64, hopt->ceil.rate, ceil64),
.prio = hopt->prio,
+ .quantum = hopt->quantum,
.extack = extack,
};
err = htb_offload(dev, &offload_opt);
--
2.17.1

2023-07-14 04:10:29

by Jakub Kicinski

Subject: Re: [net-next Patchv2 0/3] support Round Robin scheduling

On Thu, 13 Jul 2023 11:31:08 +0530 Hariprasad Kelam wrote:
> octeontx2 and CN10K silicons support Round Robin scheduling. When multiple
> traffic flows reach transmit level with the same priority, with Round Robin
> scheduling traffic flow with the highest quantum value is picked. With this
> support, the user can add multiple classes with the same priority and
> different quantum in htb offload.

Please extend the driver documentation appropriately, there's
a "Setup HTB offload" section which only shows strict prio now.
--
pw-bot: cr

2023-07-14 07:06:53

by Hariprasad Kelam

Subject: Re: [net-next Patchv2 0/3] support Round Robin scheduling

> -----Original Message-----
> From: Jakub Kicinski <[email protected]>
> Sent: Friday, July 14, 2023 8:58 AM
> To: Hariprasad Kelam <[email protected]>
> On Thu, 13 Jul 2023 11:31:08 +0530 Hariprasad Kelam wrote:
> > octeontx2 and CN10K silicons support Round Robin scheduling. When
> > multiple traffic flows reach transmit level with the same priority,
> > with Round Robin scheduling traffic flow with the highest quantum
> > value is picked. With this support, the user can add multiple classes
> > with the same priority and different quantum in htb offload.
>
> Please extend the driver documentation appropriately, there's a "Setup HTB
> offload" section which only shows strict prio now.

Thanks for the review. Will add the changes in the next version.

Thanks,
Hariprasad k

> --
> pw-bot: cr