2008-02-14 23:30:34

by Martin Devera

[permalink] [raw]
Subject: [PATCH 2.6.24 1/1] sch_htb: fix "too many events" situation

From: Martin Devera <[email protected]>

HTB is event driven algorithm and part of its work is to apply
scheduled events at proper times. It tried to defend itself from
livelock by processing only limited number of events per dequeue.
Because of faster computers some users already hit this hardcoded
limit.
This patch uses loops_per_jiffy variable to limit event processing
up to single jiffy interval and then delay remainder to other
jiffy.

Signed-off-by: Martin Devera <[email protected]>
---
BTW, from my measurement is seems that value 500 was good one
for my first 600MHz machine :-)
Maybe I can make something self-converging (using tasklets probably)
but I'm not sure if it is worth of the complexity.

--- a/net/sched/sch_htb.c 2008-02-14 22:56:48.000000000 +0100
+++ b/net/sched/sch_htb.c 2008-02-14 23:37:02.000000000 +0100
@@ -704,13 +704,17 @@ static void htb_charge_class(struct htb_
*
* Scans event queue for pending events and applies them. Returns time of
* next pending event (0 for no event in pq).
+ * One event costs about 1300 cycles on x86_64, let's be conservative
+ * and round it to 4096. We will allow only loops_per_jiffy/4096 events
+ * in one call to prevent us from livelock.
* Note: Applied are events whose have cl->pq_key <= q->now.
*/
+#define HTB_EVENT_COST_SHIFTS 12
static psched_time_t htb_do_events(struct htb_sched *q, int level)
{
- int i;
-
- for (i = 0; i < 500; i++) {
+ int i, max_events = loops_per_jiffy >> HTB_EVENT_COST_SHIFTS;
+ /* <= below is just for case where max_events==0 (unlikely) */
+ for (i = 0; i <= max_events; i++) {
struct htb_class *cl;
long diff;
struct rb_node *p = rb_first(&q->wait_pq[level]);
@@ -728,9 +732,8 @@ static psched_time_t htb_do_events(struc
if (cl->cmode != HTB_CAN_SEND)
htb_add_to_wait_tree(q, cl, diff);
}
- if (net_ratelimit())
- printk(KERN_WARNING "htb: too many events !\n");
- return q->now + PSCHED_TICKS_PER_SEC / 10;
+ /* too much load - let's continue on next tick */
+ return q->now + PSCHED_TICKS_PER_SEC / HZ;
}

/* Returns class->node+prio from id-tree where classe's id is >= id. NULL


2008-02-18 07:28:06

by David Miller

[permalink] [raw]
Subject: Re: [PATCH 2.6.24 1/1] sch_htb: fix "too many events" situation

From: Martin Devera <[email protected]>
Date: Fri, 15 Feb 2008 00:02:56 +0100

> From: Martin Devera <[email protected]>
>
> HTB is event driven algorithm and part of its work is to apply
> scheduled events at proper times. It tried to defend itself from
> livelock by processing only limited number of events per dequeue.
> Because of faster computers some users already hit this hardcoded
> limit.
> This patch uses loops_per_jiffy variable to limit event processing
> up to single jiffy interval and then delay remainder to other
> jiffy.
>
> Signed-off-by: Martin Devera <[email protected]>

I think we would be wise to use something other than loops_per_jiffy.

Depending upon the loop calibration method used by a particular
architecture it can me one of many different things.

Some platforms don't even make use of it and thus leave it at it's
default value of "1<<12", so using it as a heuristic here is arbitrary
at best.

2008-02-18 08:04:46

by Martin Devera

[permalink] [raw]
Subject: Re: [PATCH 2.6.24 1/1] sch_htb: fix "too many events" situation

>> up to single jiffy interval and then delay remainder to other
>> jiffy.
>>
>> Signed-off-by: Martin Devera <[email protected]>
>
> I think we would be wise to use something other than loops_per_jiffy.
>
> Depending upon the loop calibration method used by a particular
> architecture it can me one of many different things.
>
> Some platforms don't even make use of it and thus leave it at it's

aha, ok, I'm not so informed about crossplatform issues.
I was also thining about looking at jiffies value and stop once
it is startjiffy+2, but with NO_HZ introduction, are jiffies
still incremented ?

2008-02-18 08:17:17

by David Miller

[permalink] [raw]
Subject: Re: [PATCH 2.6.24 1/1] sch_htb: fix "too many events" situation

From: Martin Devera <[email protected]>
Date: Mon, 18 Feb 2008 09:03:52 +0100

> aha, ok, I'm not so informed about crossplatform issues.
> I was also thining about looking at jiffies value and stop once
> it is startjiffy+2, but with NO_HZ introduction, are jiffies
> still incremented ?

There should always be at least once cpu tasked with incrementing
jiffies. Lots of stuff would break if not :-)

2008-02-18 10:08:25

by Martin Devera

[permalink] [raw]
Subject: Re: [PATCH 2.6.24 1/1] sch_htb: fix "too many events" situation

David Miller wrote:
> From: Martin Devera <[email protected]>
> Date: Mon, 18 Feb 2008 09:03:52 +0100
>
>> aha, ok, I'm not so informed about crossplatform issues.
>> I was also thining about looking at jiffies value and stop once
>> it is startjiffy+2, but with NO_HZ introduction, are jiffies
>> still incremented ?
>
> There should always be at least once cpu tasked with incrementing
> jiffies. Lots of stuff would break if not :-)
>

Aha ok, so that when (at least one) cpu is busy then I can count on
jiffies incrementing via do_timer, can't I ?
So that I'd remove the loop limit altogether but limiting it to
1 or 2 jiffies to prevent livelock.
Like
max_jiff = jiffies+2; /* not +1 at we could be at +0.9999 now */
while (jiffies<max_jiff) do_hard_potentionaly_long_work();
if (more_work) schedule_to_next_jiffie();

This will keep event queue work load under 66% of system load which
seems reasonable to me.

Would you accept such solution ?

2008-02-18 10:43:22

by David Miller

[permalink] [raw]
Subject: Re: [PATCH 2.6.24 1/1] sch_htb: fix "too many events" situation

From: Martin Devera <[email protected]>
Date: Mon, 18 Feb 2008 11:08:09 +0100

> Like
> max_jiff = jiffies+2; /* not +1 at we could be at +0.9999 now */
> while (jiffies<max_jiff) do_hard_potentionaly_long_work();
> if (more_work) schedule_to_next_jiffie();
>
> This will keep event queue work load under 66% of system load which
> seems reasonable to me.
>
> Would you accept such solution ?

Sure.