Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754910Ab2J3GBa (ORCPT ); Tue, 30 Oct 2012 02:01:30 -0400 Received: from smtp25.sms.unimo.it ([155.185.44.25]:48538 "EHLO smtp25.sms.unimo.it" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753027Ab2J3GB3 (ORCPT ); Tue, 30 Oct 2012 02:01:29 -0400 Date: Tue, 30 Oct 2012 07:00:56 +0100 From: Paolo Valente To: jhs@mojatatu.com, davem@davemloft.net, shemminger@vyatta.com Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, rizzo@iet.unipi.it, fchecconi@gmail.com, paolo.valente@unimore.it Subject: [PATCH RFC] pkt_sched: enable QFQ to support TSO/GSO Message-ID: <20121030060056.GA3373@paolo-ThinkPad-W520> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) UNIMORE-X-SA-Score: -2.9 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 8505 Lines: 227 [new version modified according to the suggestions received] Hi, if the max packet size for some class (configured through tc) is violated by the actual size of the packets of that class, then QFQ would not schedule classes correctly, and the data structures implementing the bucket lists may get corrupted. This problem occurs with TSO/GSO even if the max packet size is set to the MTU, and is, e.g., the cause of the failure reported in [1]. Two patches have been proposed to solve this problem in [2], one of them is a preliminary version of this patch. This patch addresses the above issues by: 1) setting QFQ parameters to proper values for supporting TSO/GSO (in particular, setting the maximum possible packet size to 64KB), 2) automatically increasing the max packet size for a class, lmax, when a packet with a larger size than the current value of lmax arrives. The drawback of the first point is that the maximum weight for a class is now limited to 4096, which is equal to 1/16 of the maximum weight sum. Finally, this patch also forcibly caps the timestamps of a class if they are too high to be stored in the bucket list. This capping, taken from QFQ+ [3], handles the unfrequent case described in the comment to the function slot_insert. [1] http://marc.info/?l=linux-netdev&m=134968777902077&w=2 [2] http://marc.info/?l=linux-netdev&m=135096573507936&w=2 [3] http://marc.info/?l=linux-netdev&m=134902691421670&w=2 Signed-off-by: Paolo Valente Tested-by: Cong Wang --- net/sched/sch_qfq.c | 109 +++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 79 insertions(+), 30 deletions(-) diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c index f0dd83c..9687fa1 100644 --- a/net/sched/sch_qfq.c +++ b/net/sched/sch_qfq.c @@ -84,18 +84,19 @@ * grp->index is the index of the group; and grp->slot_shift * is the shift for the corresponding (scaled) sigma_i. */ -#define QFQ_MAX_INDEX 19 -#define QFQ_MAX_WSHIFT 16 +#define QFQ_MAX_INDEX 24 +#define QFQ_MAX_WSHIFT 12 #define QFQ_MAX_WEIGHT (1<wsum += delta_w; } +static void qfq_update_reactivate_class(struct qfq_sched *q, + struct qfq_class *cl, + u32 inv_w, u32 lmax, int delta_w) +{ + bool need_reactivation = false; + int i = qfq_calc_index(inv_w, lmax); + + if (&q->groups[i] != cl->grp && cl->qdisc->q.qlen > 0) { + /* + * shift cl->F back, to not charge the + * class for the not-yet-served head + * packet + */ + cl->F = cl->S; + /* remove class from its slot in the old group */ + qfq_deactivate_class(q, cl); + need_reactivation = true; + } + + qfq_update_class_params(q, cl, lmax, inv_w, delta_w); + + if (need_reactivation) /* activate in new group */ + qfq_activate_class(q, cl, qdisc_peek_len(cl->qdisc)); +} + + static int qfq_change_class(struct Qdisc *sch, u32 classid, u32 parentid, struct nlattr **tca, unsigned long *arg) { @@ -238,7 +265,7 @@ static int qfq_change_class(struct Qdisc *sch, u32 classid, u32 parentid, struct qfq_class *cl = (struct qfq_class *)*arg; struct nlattr *tb[TCA_QFQ_MAX + 1]; u32 weight, lmax, inv_w; - int i, err; + int err; int delta_w; if (tca[TCA_OPTIONS] == NULL) { @@ -270,16 +297,14 @@ static int qfq_change_class(struct Qdisc *sch, u32 classid, u32 parentid, if (tb[TCA_QFQ_LMAX]) { lmax = nla_get_u32(tb[TCA_QFQ_LMAX]); - if (!lmax || lmax > (1UL << QFQ_MTU_SHIFT)) { + if (lmax < QFQ_MIN_LMAX || lmax > (1UL << QFQ_MTU_SHIFT)) { pr_notice("qfq: invalid max length %u\n", lmax); return -EINVAL; } } else - lmax = 1UL << QFQ_MTU_SHIFT; + lmax = psched_mtu(qdisc_dev(sch)); if (cl != NULL) { - bool need_reactivation = false; - if (tca[TCA_RATE]) { err = gen_replace_estimator(&cl->bstats, &cl->rate_est, qdisc_root_sleeping_lock(sch), @@ -291,24 +316,8 @@ static int qfq_change_class(struct Qdisc *sch, u32 classid, u32 parentid, if (lmax == cl->lmax && inv_w == cl->inv_w) return 0; /* nothing to update */ - i = qfq_calc_index(inv_w, lmax); sch_tree_lock(sch); - if (&q->groups[i] != cl->grp && cl->qdisc->q.qlen > 0) { - /* - * shift cl->F back, to not charge the - * class for the not-yet-served head - * packet - */ - cl->F = cl->S; - /* remove class from its slot in the old group */ - qfq_deactivate_class(q, cl); - need_reactivation = true; - } - - qfq_update_class_params(q, cl, lmax, inv_w, delta_w); - - if (need_reactivation) /* activate in new group */ - qfq_activate_class(q, cl, qdisc_peek_len(cl->qdisc)); + qfq_update_reactivate_class(q, cl, inv_w, lmax, delta_w); sch_tree_unlock(sch); return 0; @@ -663,15 +672,48 @@ static void qfq_make_eligible(struct qfq_sched *q, u64 old_V) /* - * XXX we should make sure that slot becomes less than 32. - * This is guaranteed by the input values. - * roundedS is always cl->S rounded on grp->slot_shift bits. + * If the weight and lmax (max_pkt_size) of the classes do not change, + * then QFQ guarantees that the slot index is never higher than + * 2 + ((1<S) >> grp->slot_shift; - unsigned int i = (grp->front + slot) % QFQ_MAX_SLOTS; + unsigned int i; /* slot index in the bucket list */ + + if (unlikely(slot > QFQ_MAX_SLOTS - 2)) { + u64 deltaS = roundedS - grp->S - + ((u64)(QFQ_MAX_SLOTS - 2)<slot_shift); + cl->S -= deltaS; + cl->F -= deltaS; + slot = QFQ_MAX_SLOTS - 2; + } + + i = (grp->front + slot) % QFQ_MAX_SLOTS; hlist_add_head(&cl->next, &grp->slots[i]); __set_bit(slot, &grp->full_slots); @@ -892,6 +934,13 @@ static int qfq_enqueue(struct sk_buff *skb, struct Qdisc *sch) } pr_debug("qfq_enqueue: cl = %x\n", cl->common.classid); + if (unlikely(cl->lmax < qdisc_pkt_len(skb))) { + pr_debug("qfq: increasing maxpkt from %u to %u for class %u", + cl->lmax, qdisc_pkt_len(skb), cl->common.classid); + qfq_update_reactivate_class(q, cl, cl->inv_w, + qdisc_pkt_len(skb), 0); + } + err = qdisc_enqueue(skb, cl->qdisc); if (unlikely(err != NET_XMIT_SUCCESS)) { pr_debug("qfq_enqueue: enqueue failed %d\n", err); -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/