2004-03-03 12:31:26

by Jesper Dangaard Brouer

Subject: [NET_SCHED] BUG in qdisc TBF (token bucket filter)


BUG in qdisc TBF (token bucket filter).

Problem in : Kernel 2.4.22 (and newer, tested till 2.4.25-rc2)
Problem NOT in: kernel 2.4.21 (and older)

Problem:
--------
After I add a tbf qdisc to an htb class, the htb class disappears
from the output listing of "tc -s class ls dev ethX".

tc-versions:
I have tested with different versions of the "tc" utility
and with different versions of the htb patch:

"tc utility, iproute2-ss010824" (debian's iproute)
"tc utility, iproute2-ss020116" (with htb3.6-020525)

I have not tested whether the tbf qdisc and htb class still work, only
that the output listing is wrong. (I'm running 2.4.21 on my
router/firewall to be on the safe side.)

How to reproduce:
-----------------
#Removing previous 'root' handle/classification
/sbin/tc qdisc del dev eth0 root
#
/sbin/tc qdisc add dev eth0 root handle 1: htb default 10
#
/sbin/tc class add dev eth0 parent 1: classid 1:10 htb rate 500kbit

# output-listing of class'es
tc -s class ls dev eth0
#
# output:
class htb 1:10 root prio 0 rate 500Kbit ceil 500Kbit burst 2239b cburst 2239b
Sent 812 bytes 6 pkts (dropped 0, overlimits 0)
lended: 6 borrowed: 0 giants: 0
tokens: 27239 ctokens: 27239

# the tbf line
/sbin/tc qdisc add dev eth0 parent 1:10 handle 4210: tbf rate 500kbit \
latency 50ms burst 2239b

# output-listing of class'es
tc -s class ls dev eth0
#
# output:
<NOTHING>

The class listing comes back if I remove the tbf qdisc again.

A diff of net/sched/sch_tbf.c between kernels 2.4.21 and 2.4.22 is
attached.


Regards
Jesper Brouer

--
-------------------------------------------------------------------
System Administrator
Dept. of Computer Science, University of Copenhagen
E-mail: [email protected], Direct Tel.: 353 21464
-------------------------------------------------------------------


Attachments:
sch_tbf.2.4.21-22.diff (8.43 kB)

2004-03-05 06:48:45

by Dmitry Torokhov

Subject: [PATCH 1/2] NET: fix class reporting in TBF qdisc


===================================================================


[email protected], 2004-03-05 01:02:36-05:00, [email protected]
NET: Fix class reporting in TBF qdisc


sch_tbf.c | 9 +++------
1 files changed, 3 insertions(+), 6 deletions(-)


===================================================================



diff -Nru a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c
--- a/net/sched/sch_tbf.c Fri Mar 5 01:26:33 2004
+++ b/net/sched/sch_tbf.c Fri Mar 5 01:26:33 2004
@@ -434,8 +434,7 @@
if (cl != 1) /* only one class */
return -ENOENT;

- tcm->tcm_parent = TC_H_ROOT;
- tcm->tcm_handle = 1;
+ tcm->tcm_handle |= TC_H_MIN(1);
tcm->tcm_info = q->qdisc->handle;

return 0;
@@ -486,11 +485,9 @@

static void tbf_walk(struct Qdisc *sch, struct qdisc_walker *walker)
{
- struct tbf_sched_data *q = (struct tbf_sched_data *)sch->data;
-
if (!walker->stop) {
- if (walker->count >= walker->skip)
- if (walker->fn(sch, (unsigned long)q, walker) < 0) {
+ if (walker->count >= walker->skip)
+ if (walker->fn(sch, 1, walker) < 0) {
walker->stop = 1;
return;
}
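
A note on the tbf_walk hunk, as a minimal user-space sketch rather than kernel code. One plausible reading of the patch is that the old walker handed the private data pointer to the callback as the class argument, while tbf_dump_class only accepts class ID 1 and returns -ENOENT for anything else. The types and the demo_* names below are hypothetical stand-ins, and the callback signature is simplified (the real one also takes the qdisc).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct qdisc_walker. */
struct demo_walker {
	int stop;
	int skip;
	int count;
	int (*fn)(unsigned long cl, struct demo_walker *w);
};

/* Mirrors the check in tbf_dump_class: TBF has one class, ID 1. */
int demo_dump_class(unsigned long cl, struct demo_walker *w)
{
	(void)w;
	if (cl != 1)	/* only one class */
		return -ENOENT;
	return 0;
}

/* Same control flow as tbf_walk; `cl` is what gets passed to fn. */
void demo_walk(struct demo_walker *w, unsigned long cl)
{
	assert(w != NULL && w->fn != NULL);
	if (!w->stop) {
		if (w->count >= w->skip)
			if (w->fn(cl, w) < 0) {
				w->stop = 1;
				return;
			}
		w->count++;
	}
}

/* Returns 1 if the walk reported the class, 0 if it was aborted. */
int demo_walk_reports_class(unsigned long cl)
{
	struct demo_walker w = { 0, 0, 0, demo_dump_class };

	demo_walk(&w, cl);
	return w.stop == 0 && w.count == 1;
}
```

Under these assumptions, demo_walk_reports_class(1) succeeds, while passing a pointer value (as the pre-patch code did with q) makes the callback fail and sets walker->stop, which would cut the class dump short.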

2004-03-05 06:51:17

by Dmitry Torokhov

Subject: Re: [PATCH 1/2] NET: fix class reporting in TBF qdisc


===================================================================


[email protected], 2004-03-05 01:18:18-05:00, [email protected]
NET: TBF trailing whitespace cleanup


sch_tbf.c | 68 +++++++++++++++++++++++++++++++-------------------------------
1 files changed, 34 insertions(+), 34 deletions(-)


===================================================================



diff -Nru a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c
--- a/net/sched/sch_tbf.c Fri Mar 5 01:26:58 2004
+++ b/net/sched/sch_tbf.c Fri Mar 5 01:26:58 2004
@@ -62,7 +62,7 @@

Algorithm.
----------
-
+
Let N(t_i) be B/R initially and N(t) grow continuously with time as:

N(t+delta) = min{B/R, N(t) + delta}
@@ -146,15 +146,15 @@
if (sch->reshape_fail == NULL || sch->reshape_fail(skb, sch))
#endif
kfree_skb(skb);
-
+
return NET_XMIT_DROP;
}
-
+
if ((ret = q->qdisc->enqueue(skb, q->qdisc)) != 0) {
sch->stats.drops++;
return ret;
- }
-
+ }
+
sch->q.qlen++;
sch->stats.bytes += skb->len;
sch->stats.packets++;
@@ -165,10 +165,10 @@
{
struct tbf_sched_data *q = (struct tbf_sched_data *)sch->data;
int ret;
-
+
if ((ret = q->qdisc->ops->requeue(skb, q->qdisc)) == 0)
- sch->q.qlen++;
-
+ sch->q.qlen++;
+
return ret;
}

@@ -176,7 +176,7 @@
{
struct tbf_sched_data *q = (struct tbf_sched_data *)sch->data;
unsigned int len;
-
+
if ((len = q->qdisc->ops->drop(q->qdisc)) != 0) {
sch->q.qlen--;
sch->stats.drops++;
@@ -196,7 +196,7 @@
{
struct tbf_sched_data *q = (struct tbf_sched_data *)sch->data;
struct sk_buff *skb;
-
+
skb = q->qdisc->dequeue(q->qdisc);

if (skb) {
@@ -204,7 +204,7 @@
long toks;
long ptoks = 0;
unsigned int len = skb->len;
-
+
PSCHED_GET_TIME(now);

toks = PSCHED_TDIFF_SAFE(now, q->t_c, q->buffer, 0);
@@ -248,13 +248,13 @@
This is the main idea of all FQ algorithms
(cf. CSZ, HPFQ, HFSC)
*/
-
+
if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
- /* When requeue fails skb is dropped */
+ /* When requeue fails skb is dropped */
sch->q.qlen--;
sch->stats.drops++;
- }
-
+ }
+
sch->flags |= TCQ_F_THROTTLED;
sch->stats.overlimits++;
}
@@ -279,24 +279,24 @@
struct Qdisc *q = qdisc_create_dflt(dev, &bfifo_qdisc_ops);
struct rtattr *rta;
int ret;
-
+
if (q) {
rta = kmalloc(RTA_LENGTH(sizeof(struct tc_fifo_qopt)), GFP_KERNEL);
if (rta) {
rta->rta_type = RTM_NEWQDISC;
rta->rta_len = RTA_LENGTH(sizeof(struct tc_fifo_qopt));
((struct tc_fifo_qopt *)RTA_DATA(rta))->limit = limit;
-
+
ret = q->ops->change(q, rta);
kfree(rta);
-
+
if (ret == 0)
return q;
}
qdisc_destroy(q);
}

- return NULL;
+ return NULL;
}

static int tbf_change(struct Qdisc* sch, struct rtattr *opt)
@@ -340,7 +340,7 @@
}
if (max_size < 0)
goto done;
-
+
if (q->qdisc == &noop_qdisc) {
if ((child = tbf_create_dflt_qdisc(sch->dev, qopt->limit)) == NULL)
goto done;
@@ -369,17 +369,17 @@
static int tbf_init(struct Qdisc* sch, struct rtattr *opt)
{
struct tbf_sched_data *q = (struct tbf_sched_data *)sch->data;
-
+
if (opt == NULL)
return -EINVAL;
-
+
PSCHED_GET_TIME(q->t_c);
init_timer(&q->wd_timer);
q->wd_timer.function = tbf_watchdog;
q->wd_timer.data = (unsigned long)sch;

q->qdisc = &noop_qdisc;
-
+
return tbf_change(sch, opt);
}

@@ -393,7 +393,7 @@
qdisc_put_rtab(q->P_tab);
if (q->R_tab)
qdisc_put_rtab(q->R_tab);
-
+
qdisc_destroy(q->qdisc);
q->qdisc = &noop_qdisc;
}
@@ -404,10 +404,10 @@
unsigned char *b = skb->tail;
struct rtattr *rta;
struct tc_tbf_qopt opt;
-
+
rta = (struct rtattr*)b;
RTA_PUT(skb, TCA_OPTIONS, 0, NULL);
-
+
opt.limit = q->limit;
opt.rate = q->R_tab->rate;
if (q->P_tab)
@@ -427,13 +427,13 @@
}

static int tbf_dump_class(struct Qdisc *sch, unsigned long cl,
- struct sk_buff *skb, struct tcmsg *tcm)
+ struct sk_buff *skb, struct tcmsg *tcm)
{
struct tbf_sched_data *q = (struct tbf_sched_data*)sch->data;

- if (cl != 1) /* only one class */
+ if (cl != 1) /* only one class */
return -ENOENT;
-
+
tcm->tcm_handle |= TC_H_MIN(1);
tcm->tcm_info = q->qdisc->handle;

@@ -448,12 +448,12 @@
if (new == NULL)
new = &noop_qdisc;

- sch_tree_lock(sch);
+ sch_tree_lock(sch);
*old = xchg(&q->qdisc, new);
qdisc_reset(*old);
sch->q.qlen = 0;
sch_tree_unlock(sch);
-
+
return 0;
}

@@ -473,7 +473,7 @@
}

static int tbf_change_class(struct Qdisc *sch, u32 classid, u32 parentid,
- struct rtattr **tca, unsigned long *arg)
+ struct rtattr **tca, unsigned long *arg)
{
return -ENOSYS;
}
@@ -497,7 +497,7 @@

static struct Qdisc_class_ops tbf_class_ops =
{
- .graft = tbf_graft,
+ .graft = tbf_graft,
.leaf = tbf_leaf,
.get = tbf_get,
.put = tbf_put,
@@ -529,7 +529,7 @@
return register_qdisc(&tbf_qdisc_ops);
}

-static void __exit tbf_module_exit(void)
+static void __exit tbf_module_exit(void)
{
unregister_qdisc(&tbf_qdisc_ops);
}

2004-03-05 06:49:26

by Dmitry Torokhov

Subject: Re: [NET_SCHED] BUG in qdisc TBF (token bucket filter)

On Wednesday 03 March 2004 07:31 am, Jesper Dangaard Brouer wrote:
>
> BUG in qdisc TBF (token bucket filter).
>
> Problem in : Kernel 2.4.22 (and newer, tested till 2.4.25-rc2)
> Problem NOT in: kernel 2.4.21 (and older)
>
> Problem:
> --------
> After I add an tbf qdisc to an htb class, then the htb class disappear
> from the output-listing "tc -s class ls dev ethX".
>

Yeah, I botched class reporting in TBF. I am sending two patches as follow-ups
to this message:

01-tbf-class-reporting.patch - actual fix
02-tbf-trailing-whitespace.patch - removes trailing whitespace from TBF code

The patches are against 2.6 but I am pretty sure they will apply to 2.4.
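
For readers following the fix, here is a small user-space sketch of what the tcm_handle change in the first patch does to the reported class handle. The TC_H_* macros follow the definitions in <linux/pkt_sched.h>; struct demo_tcmsg, the dump_class_* helpers and format_handle are hypothetical names used only for illustration. It is an assumption of this sketch that the generic dump code pre-fills tcm_parent and tcm_handle with the qdisc's own handle before calling the qdisc's dump hook.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Handle layout: upper 16 bits are the major, lower 16 the minor. */
#define TC_H_MAJ_MASK	0xFFFF0000U
#define TC_H_MIN_MASK	0x0000FFFFU
#define TC_H_MAJ(h)	((h) & TC_H_MAJ_MASK)
#define TC_H_MIN(h)	((h) & TC_H_MIN_MASK)
#define TC_H_ROOT	0xFFFFFFFFU

/* Hypothetical stand-in for the two struct tcmsg fields involved. */
struct demo_tcmsg {
	uint32_t tcm_parent;
	uint32_t tcm_handle;
};

/* Pre-patch behaviour: overwrite the pre-filled values. */
void dump_class_old(struct demo_tcmsg *tcm)
{
	tcm->tcm_parent = TC_H_ROOT;
	tcm->tcm_handle = 1;
}

/* Patched behaviour: keep the pre-filled major, set minor 1. */
void dump_class_new(struct demo_tcmsg *tcm)
{
	tcm->tcm_handle |= TC_H_MIN(1);
}

/* Render a handle the way tc prints it: hex "major:minor". */
void format_handle(uint32_t h, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "%x:%x",
		 (unsigned int)(TC_H_MAJ(h) >> 16),
		 (unsigned int)TC_H_MIN(h));
}
```

With the tbf qdisc from the report (handle 4210:, i.e. 0x42100000), the patched helper yields 0x42100001, which formats as "4210:1". The old helper yields handle 1 with a root parent, which does not identify the class as belonging to qdisc 4210:.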

--
Dmitry

2004-03-05 15:41:08

by Matthias Jim Knopf

Subject: Problems with WLAN orinoco_pci

Hi!

My WLAN network goes down from time to time. I cannot ping the WLAN/DSL router
(T-Sinus 111), and the only thing that helps is reloading the kernel driver
(ifconfig eth1 down; modprobe -r orinoco_pci;
sleep 8; ifconfig eth1 up; iwconfig ...)

Lowering the bit rate (down to 1 Mb/s) does not help.

Besides this, I have another problem: the f***ing router mentioned
above cannot sustain one or more connections per second for long
(>30 minutes). It then crashes in such a way that it can still be pinged, but
I cannot access the router's web interface or reach the internet,
and I have to power-cycle it (official advice from the company selling it!).
I tried hard to see these problems as one, but it seems they ARE
two (see logs).


Here is the information I can offer:
WLAN is a Netgear PCI MA311 (using hermes, orinoco)

# iwconfig
eth1 IEEE 802.11-DS ESSID:"WLAN" Nickname:"jim"
Mode:Managed Frequency:2.437GHz Access Point: 00:30:F1:xx:xx:xx
Bit Rate:11Mb/s Tx-Power=15 dBm Sensitivity:1/3
Retry min limit:8 RTS thr:off Fragment thr:off
Encryption key:xxxx-xxxx-xx Security mode:open
Power Management:off
Link Quality:20/92 Signal level:-76 dBm Noise level:-136 dBm
Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:1
Tx excessive retries:2 Invalid misc:0 Missed beacon:0

# ping -c 5 router
PING wlan-router (192.168.2.1): 56 data bytes
--- wlan-router ping statistics ---
5 packets transmitted, 0 packets received, 100% packet loss

# uname -a
Linux laura 2.4.22 #2 Sun Dec 21 17:14:01 MET 2003 i686 unknown

# lsmod
Module Size Used by
orinoco_pci 3056 1 (autoclean)
orinoco 31808 0 (autoclean) [orinoco_pci]
hermes 5232 0 (autoclean) [orinoco_pci orinoco]
[...]

---{ /var/log/warn }----------------------------------------------------------------------------
Mar 5 15:41:43 laura kernel: eth1: Error -110 writing Tx descriptor to BAP
Mar 5 15:42:15 laura last message repeated 59 times
Mar 5 15:42:22 laura last message repeated 12 times
[...]
Mar 5 15:44:34 laura kernel: eth1: error -110 reading info frame. Frame dropped.
Mar 5 15:44:34 laura kernel: eth1: Error -110 writing Tx descriptor to BAP
Mar 5 15:44:41 laura last message repeated 14 times
Mar 5 15:45:33 laura kernel: .....<7>orinoco_lock() called with hw_unavailable (dev=e3c82800)
Mar 5 15:45:33 laura kernel: ......<7>orinoco_lock() called with hw_unavailable (dev=e3c82800)
Mar 5 15:45:33 laura last message repeated 4 times
Mar 5 15:45:33 laura kernel: .....<7>orinoco_lock() called with hw_unavailable (dev=e3c82800)
Mar 5 15:45:33 laura kernel: ......<7>orinoco_lock() called with hw_unavailable (dev=e3c82800)
Mar 5 15:45:33 laura last message repeated 4 times
Mar 5 15:45:34 laura kernel: .........;
[...]
------------------------------------------------------------------------------------------------


Thanks in advance for any help you can give me!

Matthias


--
EOF


2004-03-05 20:58:18

by Mike Fedyk

Subject: Re: [PATCH 1/2] NET: fix class reporting in TBF qdisc

Dmitry Torokhov wrote:
> ===================================================================
>
>
> [email protected], 2004-03-05 01:18:18-05:00, [email protected]
> NET: TBF trailing whitespace cleanup
>
>
> sch_tbf.c | 68 +++++++++++++++++++++++++++++++-------------------------------
> 1 files changed, 34 insertions(+), 34 deletions(-)
>
>
> ===================================================================
>
>
>
> diff -Nru a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c
> --- a/net/sched/sch_tbf.c Fri Mar 5 01:26:58 2004
> +++ b/net/sched/sch_tbf.c Fri Mar 5 01:26:58 2004
> @@ -62,7 +62,7 @@
>
> Algorithm.
> ----------
> -
> +

That's a lot of whitespace cleanup in there too.

2004-03-05 23:01:29

by Peter Williams

Subject: Re: Problems with WLAN orinoco_pci

Matthias Jim Knopf wrote:
> Hi!
>
> My WLAN net goes down from time to time. I cannot ping the wlan/dsl-router
> (T-Sinus 111) and the only thing that helps is re-load the kernel-driver
> (ifconfig eth1 down; modprobe -r orinoco_pci;
> sleep 8; ifconfig eth1 up; iwconfig ...)
>
> Changing speed (down to 1 Mb/s) does no better
>
> Bisides this, I have annother problem: This f***ing router mentioned
> above cannot handle at least one connection per second in the long run
> (>30 minutes) and crashes in the way, that it still may be pinged, but
> I cannot access the router's web-interface, nor can I get to the internet
> and have to power-cyle it (official advice from the company selling it!)
> I tried hard to see these problems as one, but it seems, they ARE
> two (see logs)
>
>
> Here is, what I can offer you as info:
> WLAN is a Netgear PCI MA311 (using hermes, orinoco)
>
> # iwconfig
> eth1 IEEE 802.11-DS ESSID:"WLAN" Nickname:"jim"
> Mode:Managed Frequency:2.437GHz Access Point: 00:30:F1:xx:xx:xx
> Bit Rate:11Mb/s Tx-Power=15 dBm Sensitivity:1/3
> Retry min limit:8 RTS thr:off Fragment thr:off
> Encryption key:xxxx-xxxx-xx Security mode:open
> Power Management:off
> Link Quality:20/92 Signal level:-76 dBm Noise level:-136 dBm
> Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:1
> Tx excessive retries:2 Invalid misc:0 Missed beacon:0
>
> # ping -c 5 router
> PING wlan-router (192.168.2.1): 56 data bytes
> --- wlan-router ping statistics ---
> 5 packets transmitted, 0 packets received, 100% packet loss
>
> # uname -a
> Linux laura 2.4.22 #2 Sun Dec 21 17:14:01 MET 2003 i686 unknown
>
> # lsmod
> Module Size Used by
> orinoco_pci 3056 1 (autoclean)
> orinoco 31808 0 (autoclean) [orinoco_pci]
> hermes 5232 0 (autoclean) [orinoco_pci orinoco]
> [...]
>
> ---{ /var/log/warn }----------------------------------------------------------------------------
> Mar 5 15:41:43 laura kernel: eth1: Error -110 writing Tx descriptor to BAP
> Mar 5 15:42:15 laura last message repeated 59 times
> Mar 5 15:42:22 laura last message repeated 12 times
> [...]
> Mar 5 15:44:34 laura kernel: eth1: error -110 reading info frame. Frame dropped.
> Mar 5 15:44:34 laura kernel: eth1: Error -110 writing Tx descriptor to BAP
> Mar 5 15:44:41 laura last message repeated 14 times
> Mar 5 15:45:33 laura kernel: .....<7>orinoco_lock() called with hw_unavailable (dev=e3c82800)
> Mar 5 15:45:33 laura kernel: ......<7>orinoco_lock() called with hw_unavailable (dev=e3c82800)
> Mar 5 15:45:33 laura last message repeated 4 times
> Mar 5 15:45:33 laura kernel: .....<7>orinoco_lock() called with hw_unavailable (dev=e3c82800)
> Mar 5 15:45:33 laura kernel: ......<7>orinoco_lock() called with hw_unavailable (dev=e3c82800)
> Mar 5 15:45:33 laura last message repeated 4 times
> Mar 5 15:45:34 laura kernel: .........;
> [...]
> ------------------------------------------------------------------------------------------------
>
>
> Thanks in advance for any help you can give me!

I'm having similar problems with a PCMCIA orinoco wlan device. Works
perfectly with various 2.4.X kernels but keeps (partially) dropping the
connection with 2.6.X kernels. By partially, I mean connections are
lost and pings report target hosts as unreachable but netstat -r reports
make it look like the connection is still valid. Doing an ifup (without
a preceding ifdown) seems to fix the problem.

If more information would be helpful or testing is required don't
hesitate to ask.

Peter
--
Dr Peter Williams, Chief Scientist [email protected]
Aurema Pty Limited Tel:+61 2 9698 2322
PO Box 305, Strawberry Hills NSW 2012, Australia Fax:+61 2 9699 9174
79 Myrtle Street, Chippendale NSW 2008, Australia http://www.aurema.com

2004-03-06 04:33:29

by Dmitry Torokhov

Subject: Re: [PATCH 1/2] NET: fix class reporting in TBF qdisc

On Friday 05 March 2004 03:57 pm, Mike Fedyk wrote:
> Dmitry Torokhov wrote:
> > ===================================================================
> >
> >
> > [email protected], 2004-03-05 01:18:18-05:00, [email protected]
> > NET: TBF trailing whitespace cleanup
> >
> >
> > sch_tbf.c | 68 +++++++++++++++++++++++++++++++-------------------------------
> > 1 files changed, 34 insertions(+), 34 deletions(-)
> >
> >
> > ===================================================================
> >
> >
> >
> > diff -Nru a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c
> > --- a/net/sched/sch_tbf.c Fri Mar 5 01:26:58 2004
> > +++ b/net/sched/sch_tbf.c Fri Mar 5 01:26:58 2004
> > @@ -62,7 +62,7 @@
> >
> > Algorithm.
> > ----------
> > -
> > +
>
> That's a lot of whitespace cleanup in there too.
>

It's only whitespace cleanup... See the changeset comment. Ahh, I see...
I have the same subject on both emails. Doh! Well, it was almost 2AM...

--
Dmitry

2004-04-06 14:17:35

by Jesper Dangaard Brouer

Subject: Re: [NET_SCHED] BUG in qdisc TBF (token bucket filter)

(...let's keep Google updated ;-))

The error has been corrected in kernel 2.4.26-rc1 by

Dmitry Torokhov:
o [NET_SCHED]: Fix class reporting in TBF qdisc

--Hawk

On Wed, 3 Mar 2004, Jesper Dangaard Brouer wrote:

> BUG in qdisc TBF (token bucket filter).
>
> Problem in : Kernel 2.4.22 (and newer, tested till 2.4.25-rc2)
> Problem NOT in: kernel 2.4.21 (and older)
>
> Problem:
> --------
> After I add an tbf qdisc to an htb class, then the htb class disappear
> from the output-listing "tc -s class ls dev ethX".
>
> tc-versions:
> I have tested with different version of the "tc" utility.
> And different versions of the htb patch.
>
> "tc utility, iproute2-ss010824" (debian's iproute)
> "tc utility, iproute2-ss020116" (with htb3.6-020525)
>
> I've not tested if the tfb qdisc and htb class still works, but only
> that the output-listing is wrong. (I'm running 2.4.21 on my
> router/firewall to be on the safe side.)
>
> Howto reproduce:
> ----------------
> #Removing previous 'root' handle/classification
> /sbin/tc qdisc del dev eth0 root
> #
> /sbin/tc qdisc add dev eth0 root handle 1: htb default 10
> #
> /sbin/tc class add dev eth0 parent 1: classid 1:10 htb rate 500kbit
>
> # output-listing of class'es
> tc -s class ls dev eth0
> #
> # output:
> class htb 1:10 root prio 0 rate 500Kbit ceil 500Kbit burst 2239b cburst 2239b
> Sent 812 bytes 6 pkts (dropped 0, overlimits 0)
> lended: 6 borrowed: 0 giants: 0
> tokens: 27239 ctokens: 27239
>
> # the tbf line
> /sbin/tc qdisc add dev eth0 parent 1:10 handle 4210: tbf rate 500kbit \
> latency 50ms burst 2239b
>
> # output-listing of class'es
> tc -s class ls dev eth0
> #
> # output:
> <NOTHING>
>
> The output-listing of class'es returns, if I remove the tbf qdisc again.
>
> Diff between file /net/sched/sch_tbf.c in kernel 2.4.21 and 2.4.22 is
> attached.
>
>
> Hilsen
> Jesper Brouer
>
> --
> -------------------------------------------------------------------
> System Administrator
> Dept. of Computer Science, University of Copenhagen
> E-mail: [email protected], Direct Tel.: 353 21464
> -------------------------------------------------------------------
>

Regards
Jesper Brouer

--
-------------------------------------------------------------------
System Administrator / Research Assistant
Dept. of Computer Science, University of Copenhagen
E-mail: [email protected], Direct Tel.: 353 21464
-------------------------------------------------------------------