On Wed, Oct 22, 2008 at 10:06:58AM +0400, Badalian Vyacheslav wrote:
> Hello!
Hi!
> I have gathered more information.
>
> Statistics of the PCs:
> 1. 2.6.26.6: Dynamic Timer, HiResTimer, 1000HZ, htb_hysteresis=0 -
> crashed 1d 18h ago
> 2. 2.6.26.5: HZ=300, no Dynamic Timer, no HiResTimer, htb_hysteresis=0 -
> uptime 5d 17h (no crashes so far, but it crashed some time ago with
> htb_hysteresis=1)
So it looks like the htb_hysteresis change may not be enough, which is
a pity because it would have been the easiest thing to "reverse" in the kernel.
> 3. 2.6.27: 1000HZ, no Dynamic Timer, no HiResTimer, htb_hysteresis=0 +
> PATCH - uptime 5d 17h (no crashes so far, but it crashed some time ago
> without the patch)
So this patch seems to make some difference, but it could also be hard
to convince people to fix it this way, because it makes scheduling
less exact.
Alas, I still have no idea of the real reason. IMHO this should rather
be debugged by the hrtimers/NMI people, but there was no response last
time I Cc-ed them - anyway, I have added linux-kernel to Cc again here.
I attach below a slightly modified version of the previous patch,
which allows for more exact scheduling but could be more vulnerable to
this bug. Alas, even if it works, it may still not be the final
solution to this problem.
> Also attached: the crash log of the last crash on PC 1:
>
> [10610.110729] BUG: NMI Watchdog detected LOCKUP on CPU1, ip c01fd939,
> registers:
> [10610.110729] Modules linked in: netconsole e1000e i2c_i801 i2c_core e1000
> [10610.110729]
> [10610.110729] Pid: 0, comm: swapper Not tainted (2.6.26.6-fw #1)
> [10610.110729] EIP: 0060:[<c01fd939>] EFLAGS: 00000082 CPU: 1
> [10610.110729] EIP is at rb_insert_color+0x19/0xc0
Thanks,
Jarek P.
--- (testing patch #2)
net/sched/sch_htb.c | 8 +++++++-
1 files changed, 7 insertions(+), 1 deletions(-)
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 30c999c..ff9e965 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -162,6 +162,7 @@ struct htb_sched {
int rate2quantum; /* quant = rate / rate2quantum */
psched_time_t now; /* cached dequeue time */
+ psched_time_t next_watchdog;
struct qdisc_watchdog watchdog;
/* non shaped skbs; let them go directly thru */
@@ -920,7 +921,11 @@ static struct sk_buff *htb_dequeue(struct Qdisc *sch)
}
}
sch->qstats.overlimits++;
- qdisc_watchdog_schedule(&q->watchdog, next_event);
+ if (q->next_watchdog < q->now || next_event <=
+ q->next_watchdog - PSCHED_TICKS_PER_SEC / (10 * HZ)) {
+ qdisc_watchdog_schedule(&q->watchdog, next_event);
+ q->next_watchdog = next_event;
+ }
fin:
return skb;
}
@@ -973,6 +978,7 @@ static void htb_reset(struct Qdisc *sch)
}
}
qdisc_watchdog_cancel(&q->watchdog);
+ q->next_watchdog = 0;
__skb_queue_purge(&q->direct_queue);
sch->q.qlen = 0;
memset(q->row, 0, sizeof(q->row));
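For readers skimming the diff: the guard in this patch only re-arms the qdisc watchdog when the pending one is stale, or when the new event beats it by more than roughly a tenth of a tick. Here is a minimal user-space model of that logic (all names and constant values are hypothetical, chosen for HZ=1000; this is a sketch of the idea, not kernel code):

```c
typedef long long psched_time_t;

#define PSCHED_TICKS_PER_SEC 1000000LL  /* model value, not the kernel's */
#define MODEL_HZ 1000LL

struct model_sched {
	psched_time_t now;           /* cached dequeue time */
	psched_time_t next_watchdog; /* expiry of the pending watchdog */
	int rearms;                  /* how often we really re-armed */
};

/* Re-arm only if the pending watchdog is stale (in the past) or the new
 * event beats it by more than a ~1/10-tick fudge; otherwise keep the
 * already-armed hrtimer and skip one rb-tree remove/insert cycle. */
static void model_watchdog_schedule(struct model_sched *q,
				    psched_time_t next_event)
{
	if (q->next_watchdog < q->now ||
	    next_event <= q->next_watchdog -
			  PSCHED_TICKS_PER_SEC / (10 * MODEL_HZ)) {
		q->rearms++;  /* stands in for qdisc_watchdog_schedule() */
		q->next_watchdog = next_event;
	}
}
```

The point of the guard is simply to reduce how often the hrtimer is re-queued, which both saves work and narrows the window for whatever race corrupts the rb-tree.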
Hello again! Sorry for being away so long - I was pulled off this work
for quite a while.
May we return to this bug?
The servers are on the latest stable kernel, 2.6.27.8:
HZ=1000, HR=off, DynamicTicks=off, hysteresis=1
Sorry - they are not patched; I have not applied the updates. Do you have
fresh patches or ideas for tests?
Also, the "Debug object operations" and "Debug timer objects" options are
enabled, but I don't see any additional information in the dump.
Thanks!
The dump:
[ 1500.932813] BUG: NMI Watchdog detected LOCKUP on CPU3, ip c0208ed9,
registers:
[ 1500.932813] Modules linked in: cls_u32 sch_sfq sch_htb netconsole
e1000e e1000 i2c_i801
[ 1500.932813]
[ 1500.932813] Pid: 0, comm: swapper Not tainted (2.6.27-gentoo-r5-fw #1)
[ 1500.932813] EIP: 0060:[<c0208ed9>] EFLAGS: 00000082 CPU: 3
[ 1500.932813] EIP is at rb_insert_color+0x29/0xf0
[ 1500.932813] EAX: f3391c68 EBX: f3391c68 ECX: 00000000 EDX: f3391c68
[ 1500.932813] ESI: 00000000 EDI: f3391c68 EBP: f3391c68 ESP: f785fc1c
[ 1500.932813] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 1500.932813] Process swapper (pid: 0, ti=f785e000 task=f7832940
task.ti=f785e000)
[ 1500.932813] Stack: c202c134 f3391c68 f3391c6c 00000000 c202c134
c013d5a2 c202c12c f3391c68
[ 1500.932813] c202c12c c202212c c0480100 c013dba9 00000000
002dcd3c 76886800 0000015d
[ 1500.932813] 00000001 00000282 0000015d f3391c68 f3391800
f3391880 c02ff53d 00000000
[ 1500.932813] Call Trace:
[ 1500.932813] [<c013d5a2>] enqueue_hrtimer+0x72/0x90
[ 1500.932813] [<c013dba9>] hrtimer_start+0x99/0x150
[ 1500.932813] [<c02ff53d>] qdisc_watchdog_schedule+0x3d/0x50
[ 1500.932813] [<f88a379a>] htb_dequeue+0x6aa/0x7d0 [sch_htb]
[ 1500.932813] [<c02ee6a2>] dev_hard_start_xmit+0x242/0x2d0
[ 1500.932813] [<c02fccfc>] __qdisc_run+0x13c/0x1e0
[ 1500.932813] [<c02eea81>] dev_queue_xmit+0x241/0x570
[ 1500.932813] [<c03133a4>] ip_finish_output+0x184/0x260
[ 1500.932813] [<c030f6a6>] ip_forward_finish+0x26/0x40
[ 1500.932813] [<c030e1a6>] ip_rcv_finish+0x156/0x340
[ 1500.932813] [<c02e93e2>] __netdev_alloc_skb+0x22/0x50
[ 1500.932813] [<c02e8934>] __alloc_skb+0x34/0x120
[ 1500.932813] [<c02e93e2>] __netdev_alloc_skb+0x22/0x50
[ 1500.932813] [<c02edef7>] netif_receive_skb+0x227/0x4c0
[ 1500.932813] [<f8866db6>] e1000_receive_skb+0x46/0x80 [e1000e]
[ 1500.932813] [<f8867030>] e1000_clean_rx_irq+0x240/0x340 [e1000e]
[ 1500.932813] [<f8865277>] e1000_clean+0x1f7/0x5b0 [e1000e]
[ 1500.932813] [<c0121daa>] run_rebalance_domains+0x8a/0x520
[ 1500.932813] [<c011d6b7>] __dequeue_entity+0x57/0xc0
[ 1500.932813] [<c02ec69c>] net_rx_action+0xcc/0x1e0
[ 1500.932813] [<c012b842>] __do_softirq+0x82/0xf0
[ 1500.932813] [<c012b8ed>] do_softirq+0x3d/0x50
[ 1500.932813] [<c0106090>] do_IRQ+0x40/0x70
[ 1500.932813] [<c010462b>] common_interrupt+0x23/0x28
[ 1500.932813] [<c01099ef>] mwait_idle+0x2f/0x40
[ 1500.932813] [<c0102c13>] cpu_idle+0x73/0xf0
[ 1500.932813] =======================
[ 1500.932813] Code: 76 00 55 89 c5 57 56 53 83 ec 04 89 14 24 8d 74 26
00 8b 45 00 89 c3 83 e3 fc 74 36 8b 13 f6 c2 01 75 2f 89 d7 83 e7 fc 8b
77 08 <39> de 74 53 85 f6 74 2f 8b 06 a8 01 75 29 83 c8 01 89 fd 89 06
On Wed, Dec 10, 2008 at 06:14:28PM +0300, Badalian Vyacheslav wrote:
> Hello again! Sorry for being away so long.
Hi!
> I was pulled off this work for quite a while.
>
> May we return to this bug?
> The servers are on the latest stable kernel, 2.6.27.8:
> HZ=1000, HR=off, DynamicTicks=off, hysteresis=1
> Sorry - they are not patched; I have not applied the updates. Do you have
> fresh patches or ideas for tests?
Not much, but I can come up with some if you are willing to test them...
I attach below a patch that combines two patches I sent to netdev
yesterday (PATCH 7/6 and 8/6), against 2.6.27.7 (named testing patch #3 here).
You can still try the testing patch #2 I sent previously (quoted below),
with or without this new #3 patch.
Thanks,
Jarek P.
> Also, the "Debug object operations" and "Debug timer objects" options are
> enabled, but I don't see any additional information in the dump.
>
> Thanks!
>
> The dump:
> [ 1500.932813] BUG: NMI Watchdog detected LOCKUP on CPU3, ip c0208ed9,
> registers:
> [ 1500.932813] Modules linked in: cls_u32 sch_sfq sch_htb netconsole
> e1000e e1000 i2c_i801
> [ 1500.932813]
> [ 1500.932813] Pid: 0, comm: swapper Not tainted (2.6.27-gentoo-r5-fw #1)
> [ 1500.932813] EIP: 0060:[<c0208ed9>] EFLAGS: 00000082 CPU: 3
> [ 1500.932813] EIP is at rb_insert_color+0x29/0xf0
...
> > --- (testing patch #2)
> >
> > net/sched/sch_htb.c | 8 +++++++-
> > 1 files changed, 7 insertions(+), 1 deletions(-)
> >
> > diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
> > index 30c999c..ff9e965 100644
-----------> (testing patch #3)
diff -Nurp a2.6.27.7/net/sched/sch_htb.c b2.6.27.7/net/sched/sch_htb.c
--- a2.6.27.7/net/sched/sch_htb.c 2008-12-11 08:16:16.000000000 +0000
+++ b2.6.27.7/net/sched/sch_htb.c 2008-12-11 08:20:27.000000000 +0000
@@ -696,12 +696,13 @@ static void htb_charge_class(struct htb_
* next pending event (0 for no event in pq).
* Note: Applied are events whose have cl->pq_key <= q->now.
*/
-static psched_time_t htb_do_events(struct htb_sched *q, int level)
+static psched_time_t htb_do_events(struct htb_sched *q, int level,
+ unsigned long start)
{
/* don't run for longer than 2 jiffies; 2 is used instead of
1 to simplify things when jiffy is going to be incremented
too soon */
- unsigned long stop_at = jiffies + 2;
+ unsigned long stop_at = start + 2;
while (time_before(jiffies, stop_at)) {
struct htb_class *cl;
long diff;
@@ -720,8 +721,8 @@ static psched_time_t htb_do_events(struc
if (cl->cmode != HTB_CAN_SEND)
htb_add_to_wait_tree(q, cl, diff);
}
- /* too much load - let's continue on next jiffie */
- return q->now + PSCHED_TICKS_PER_SEC / HZ;
+ /* too much load - let's continue on next jiffie (including above) */
+ return q->now + 2 * PSCHED_TICKS_PER_SEC / HZ;
}
/* Returns class->node+prio from id-tree where classe's id is >= id. NULL
@@ -880,6 +881,7 @@ static struct sk_buff *htb_dequeue(struc
struct htb_sched *q = qdisc_priv(sch);
int level;
psched_time_t next_event;
+ unsigned long start_at;
/* try to dequeue direct packets as high prio (!) to minimize cpu work */
skb = __skb_dequeue(&q->direct_queue);
@@ -892,6 +894,7 @@ static struct sk_buff *htb_dequeue(struc
if (!sch->q.qlen)
goto fin;
q->now = psched_get_time();
+ start_at = jiffies;
next_event = q->now + 5 * PSCHED_TICKS_PER_SEC;
q->nwc_hit = 0;
@@ -901,14 +904,14 @@ static struct sk_buff *htb_dequeue(struc
psched_time_t event;
if (q->now >= q->near_ev_cache[level]) {
- event = htb_do_events(q, level);
+ event = htb_do_events(q, level, start_at);
if (!event)
event = q->now + PSCHED_TICKS_PER_SEC;
q->near_ev_cache[level] = event;
} else
event = q->near_ev_cache[level];
- if (event && next_event > event)
+ if (next_event > event)
next_event = event;
m = ~q->row_mask[level];
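The two hunks above share one purpose: previously each level restarted its own 2-jiffy budget inside htb_do_events(), so a dequeue under heavy load could loop for far longer than 2 jiffies in total; passing `start_at` in makes all levels honor a single deadline. A toy, single-file model of that difference (hypothetical names, a fake jiffies clock, and a deliberately pessimistic "one jiffy per batch" load assumption):

```c
#include <stdbool.h>

#define MODEL_LEVELS 8          /* stands in for TC_HTB_MAXDEPTH */

static unsigned long model_jiffies;     /* fake clock */

/* mirrors the kernel's wrap-safe time_before() macro */
static bool time_before_model(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Process event batches for one level until the deadline. Under heavy
 * load every batch is modeled as consuming a whole jiffy. */
static int do_events(unsigned long start)
{
	unsigned long stop_at = start + 2;  /* same 2-jiffy bound as the patch */
	int batches = 0;

	while (time_before_model(model_jiffies, stop_at)) {
		batches++;
		model_jiffies++;
	}
	return batches;
}

/* shared_start=true models the patched code (one deadline for the whole
 * dequeue); false models the old code (fresh budget per level). */
static int dequeue_work(bool shared_start)
{
	unsigned long start = model_jiffies;
	int level, total = 0;

	for (level = 0; level < MODEL_LEVELS; level++)
		total += do_events(shared_start ? start : model_jiffies);
	return total;
}
```

With a shared start the total work stays bounded by the single 2-jiffy budget; with a per-level budget the worst case multiplies by the number of levels, which is why the patch also returns `q->now + 2 * PSCHED_TICKS_PER_SEC / HZ` to defer the continuation past the whole window.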
On Thu, Dec 11, 2008 at 08:46:06AM +0000, Jarek Poplawski wrote:
> ...
Here is another idea worth checking (instead of patch #2).
Jarek P.
--- (testing patch #4)
diff -Nurp a2.6.27.7/net/sched/sch_htb.c b2.6.27.7/net/sched/sch_htb.c
--- a2.6.27.7/net/sched/sch_htb.c 2008-12-11 08:16:16.000000000 +0000
+++ b2.6.27.7/net/sched/sch_htb.c 2008-12-15 10:44:32.000000000 +0000
@@ -924,6 +924,7 @@ static struct sk_buff *htb_dequeue(struc
}
}
sch->qstats.overlimits++;
+ qdisc_watchdog_cancel(&q->watchdog);
qdisc_watchdog_schedule(&q->watchdog, next_event);
fin:
return skb;
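Patch #4 tests a different hypothesis: that the rb_insert_color() lockups come from re-arming an hrtimer whose node is, through some race, still linked into the rb-tree, so cancelling before scheduling would detach it first. The following is a deliberately simplified, single-threaded illustration of that idea only (purely hypothetical names; the real qdisc_watchdog uses hrtimer_start(), which normally removes a queued timer by itself, and the suspected problem is an SMP race this sketch does not reproduce):

```c
/* Illustrative model: a timer node must be detached before re-insert,
 * or the (modeled) rb-tree gets a double insert - the kind of corruption
 * consistent with the rb_insert_color() lockups in the dumps above. */

enum timer_state { T_IDLE, T_ENQUEUED };

struct model_timer {
	enum timer_state state;
	int corruptions;   /* times we inserted an already-enqueued node */
};

static void timer_schedule(struct model_timer *t)
{
	if (t->state == T_ENQUEUED)
		t->corruptions++;   /* double rb-tree insert: the suspected bug */
	t->state = T_ENQUEUED;
}

static void timer_cancel(struct model_timer *t)
{
	t->state = T_IDLE;          /* detach from the rb-tree if pending */
}
```

In these terms, patch #4 changes every "schedule" into "cancel, then schedule", which cannot fix a genuine hrtimers race but can make the corrupt state much harder to reach - hence "temporary hack" rather than fix.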
I am now testing patches #2+#3. Five PCs have been working without an oops for 5 days.
Hello!
Result: patches #2+#3 = uptime of 7 days without crashes.
May I revert these patches and try the single new patch?
Thanks!
On Thu, Dec 18, 2008 at 09:43:51AM +0300, Badalian Vyacheslav wrote:
> Hello!
> Result: patches #2+#3 = uptime of 7 days without crashes.
> May I revert these patches and try the single new patch?
Here is my current opinion on this bug:
1) I'm almost sure it's not an htb bug but an hrtimers bug (some race),
2) the htb patches you've tested are not "the proper" way of fixing
it; I see substantial changes to the hrtimers code in the "-tip" tree
(probably for 2.6.29), which the hrtimers maintainers will probably
advise you to try, and I guess that's not easy on a production
system.
So, it's up to you:
1) since these patches work for you, you can stop testing and stay
with these patched kernels until 2.6.29 (I can then propose the
#2 patch as a temporary fix),
2) out of curiosity, you could try patch #4 alone on one box first
(after reverting at least patch #2), but again: even if it works, it
can only be treated as a temporary hack and an alternative to #2.
Thanks,
Jarek P.
Thanks for everything, Jarek!
Vyacheslav Badalian
On Thu, Dec 18, 2008 at 02:23:33PM +0300, Badalian Vyacheslav wrote:
> Thanks for everything, Jarek!
The same to you for the very valuable reporting and testing, Vyacheslav!
Jarek P.
On Thu, 18 Dec 2008, Jarek Poplawski wrote:
> ...
I think I am hitting the same issue discussed in this thread and:
http://bugzilla.kernel.org/show_bug.cgi?id=11718
and wanted to share my data.
My system:
2x Xeon @ 3.06GHz
CONFIG_HZ=250
CONFIG_HIGH_RES_TIMERS=y
CONFIG_NO_HZ=y
(please let me know if more details are useful)
With 2.6.27.10, after around 30 minutes to an hour, the server tends to
hang with:
Pid: 0, comm: swapper Not tainted (2.6.27.10 #1)
EIP: 0060:[<c023199f>] EFLAGS: 00000286 CPU: 0
EIP is at run_hrtimer_pending+0x31/0xb8
EAX: f6c1d468 EBX: c039b2ae ECX: 00000002 EDX: 00000001
ESI: f6c1d468 EDI: c1807e20 EBP: c0587f28 ESP: c0587f18
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 09a36358 CR3: 36d64000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
[<c0231a3c>] run_hrtimer_softirq+0x16/0x18
[<c0223e6d>] __do_softirq+0x64/0xcd
[<c0223f0b>] do_softirq+0x35/0x3a
[<c0224199>] irq_exit+0x38/0x6d
[<c020eefb>] smp_apic_timer_interrupt+0x71/0x82
[<c0203640>] apic_timer_interrupt+0x28/0x30
[<c0207f9d>] ? default_idle+0x2d/0x42
[<c0201946>] cpu_idle+0xca/0xea
[<c045194e>] rest_init+0x4e/0x50
One time with 2.6.27.10:
BUG: unable to handle kernel NULL pointer dereference at 000000
[<c0239b6b>] smp_call_function+0x12/0x14
[<c020df20>] native_smp_send_stop+0x1b/0x28
[<c0220304>] panic+0x47/0xdf
[<c0203ac2>] oops_end+0x5d/0x71
[<c0203f9f>] die+0x57/0x5f
[<c02129a7>] do_page_fault+0x474/0x529
[<c0212533>] ? do_page_fault+0x0/0x529
[<c045eb3a>] error_code+0x72/0x78
[<c024007b>] ? cgroup_get_sb+0x20/0x2b4
[<c02f223c>] ? rb_erase+0x118/0x241
[<c023177d>] __remove_hrtimer+0x5f/0x67
[<c039b2ae>] ? qdisc_watchdog+0x0/0x1c
[<c023199a>] run_hrtimer_pending+0x2c/0xb8
[<c0231a3c>] run_hrtimer_softirq+0x16/0x18
[<c0223e6d>] __do_softirq+0x64/0xcd
[<c0223f0b>] do_softirq+0x35/0x3a
[<c0224199>] irq_exit+0x38/0x6d
[<c020eefb>] smp_apic_timer_interrupt+0x71/0x82
[<c0203640>] apic_timer_interrupt+0x28/0x30
[<c0207f9d>] ? default_idle+0x2d/0x42
[<c0201946>] cpu_idle+0xca/0xea
[<c045194e>] rest_init+0x4e/0x50
=======================
---[ end trace 818de8d3237477a5 ]---
Rebooting in 10 seconds..
With 2.6.27.10 plus what is being called patches #2 and #3 in this thread,
the server ran for a day or so before hanging again:
Pid: 0, comm: swapper Not tainted (2.6.27.10 #3)
EIP: 0060:[<c023199f>] EFLAGS: 00000206 CPU: 0
EIP is at run_hrtimer_pending+0x31/0xb8
EAX: f73f8470 EBX: c039b312 ECX: 00000002 EDX: 00000001
ESI: f73f8470 EDI: c1807e20 EBP: c0589f20 ESP: c0589f10
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 08ca3358 CR3: 005ca000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
[<c0231a3c>] run_hrtimer_softirq+0x16/0x18
[<c0223e6d>] __do_softirq+0x64/0xcd
[<c0223f0b>] do_softirq+0x35/0x3a
[<c0224199>] irq_exit+0x38/0x6d
[<c02052b3>] do_IRQ+0x5c/0x75
[<c0203553>] common_interrupt+0x23/0x28
[<c0207f9d>] ? default_idle+0x2d/0x42
[<c0201946>] cpu_idle+0xca/0xea
[<c045225e>] rest_init+0x4e/0x50
With 2.6.28 (without the patches from this thread), after about an hour the
system hung again. "Show Regs" indicated:
Pid: 0, comm: swapper Not tainted (2.6.28 #1)
EIP: 0060:[<c03977bd>] EFLAGS: 00000247 CPU: 0
EIP is at __netif_schedule+0xc/0x39
EAX: f6934600 EBX: 00000000 ECX: f6934600 EDX: f6864000
ESI: f6864488 EDI: c1807480 EBP: c05c1f08 ESP: c05c1f04
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 09968358 CR3: 365cd000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
Call Trace:
[<c03a669d>] qdisc_watchdog+0x18/0x1c
[<c02335c3>] run_hrtimer_pending+0x56/0xda
[<c03a6685>] ? qdisc_watchdog+0x0/0x1c
[<c023365d>] run_hrtimer_softirq+0x16/0x18
[<c0225b1a>] __do_softirq+0x7a/0x11c
[<c0225bf1>] do_softirq+0x35/0x3a
[<c0225eb5>] irq_exit+0x38/0x6d
[<c020f719>] smp_apic_timer_interrupt+0x71/0x7f
[<c0203858>] apic_timer_interrupt+0x28/0x30
[<c0208360>] ? default_idle+0x2d/0x42
[<c0201938>] cpu_idle+0x6b/0x84
[<c046b292>] rest_init+0x4e/0x50
Per Jarek's suggestion in bugzilla, I ran 2.6.28 plus Peter Zijlstra's
"hrtimer: removing all ur callback modes" patches dated 2008-11-25,
2008-12-04 and 2008-12-08. Uptime was 2 days 22 hours before I hit what
appears to be an unrelated bug related to the IPv6 FIB. (Separately
reported with subject 'panic with 2.6.28 while doing "ip -6 route"'.)
Chris
On Wed, Jan 14, 2009 at 02:43:13AM +0000, Chris Caputo wrote:
...
> Per Jarek's suggestion in bugzilla, I ran 2.6.28 plus Peter Zijlstra's
> "hrtimer: removing all ur callback modes" patches dated 2008-11-25,
> 2008-12-04 and 2008-12-08. Uptime was 2 days 22 hours before I hit what
> appears to be an unrelated bug related to the IPv6 FIB. (Separately
> reported with subject 'panic with 2.6.28 while doing "ip -6 route"'.)
Chris, this is very good news - thanks for testing this!
Peter, great work! IMHO these patches should go to -stable (at
least 2.6.28).
Thanks,
Jarek P.
I will try those patches too - 3 minutes after I got this message, my
router crashed :-) after working for around 17 hours :-(
There are 3 of them, by the way - 2 of them are fixes:
hrtimer: removing all ur callback modes
hrtimer: removing all ur callback modes, fix hotplug
hrtimer: removing all ur callback modes, fix
On Wed, Jan 14, 2009 at 02:17:58PM +0200, Denys Fedoryschenko wrote:
> I will try those patches too - 3 minutes after I got this message, my
> router crashed :-) after working for around 17 hours :-(
> There are 3 of them, by the way - 2 of them are fixes:
>
> hrtimer: removing all ur callback modes
> hrtimer: removing all ur callback modes, fix hotplug
> hrtimer: removing all ur callback modes, fix
Yes, it's a very nasty bug which, BTW, you were the first to report
AFAIK, but I hope it's really gone with these patches.
Thanks,
Jarek P.
On Wednesday 14 January 2009 14:36:23 Jarek Poplawski wrote:
> Yes, it's a very nasty bug which, BTW, you were the first to report
> AFAIK, but I hope it's really gone with these patches.
>
If it is really a fix, it must go to mainline ASAP, because many people
are hitting this bug... and complaining that "Linux with htb is not
stable".
But I think it is not just a simple fix :-(
On Wed, 2009-01-14 at 14:17 +0200, Denys Fedoryschenko wrote:
> I will try those patches too - 3 minutes after I got this message, my
> router crashed :-) after working for around 17 hours :-(
> There are 3 of them, by the way - 2 of them are fixes:
>
> hrtimer: removing all ur callback modes
> hrtimer: removing all ur callback modes, fix hotplug
> hrtimer: removing all ur callback modes, fix
I'm afraid it's a bit more than that:
ca109491f612aab5c8152207631c0444f63da97f
37810659ea7d9572c5ac284ade272f806ef8f788
a0a99b227da57f81319dd239bc4de811b0f530ec
b2e3c0adec918ea22b6c9d7c76193dd3aaba9bd4
8bdec955b0da2ffbd10eb9b200651dd1f9e366f2
d5fd43c4ae04523e1dcd7794f9c511b289851350
731a55ba0f17064f85903b7bf8e24849ec6cfa20
a6037b61c2f5fc99c57c15b26d7cfa58bbb34008
e3f1d883740b09e5116d4d4e30a6a6987264a83c
82c5b7b527ccc4b5d3cf832437e842f9d2920a79
and
6e5c172cf7ca1ab878cc6a6a4c1d52fef60f3ee0
Also, all this is rather invasive and large, so I'm not sure it will
meet the -stable criteria.
On Wed, Jan 14, 2009 at 01:50:04PM +0100, Peter Zijlstra wrote:
> I'm afraid it's a bit more than that:
> ...
>
> Also, all this is rather invasive and large, so I'm not sure it will
> meet the -stable criteria.
>
Earlier I thought it was very hard to hit, but after Chris reported that it
triggers within hours, IMHO some minimal fix is necessary.
Jarek P.
> I'm afraid its a bit more than that:
>
> ...
>
> Also, all this is rather invasive and large, so I'm not sure it will
> meet the -stable criteria.
Then I will try to run 2.6.29-rc1-git4. It's kind of risky to run rc1 on a loaded
router, but I am hitting this hrtimers bug hard, so it seems I have no choice.
And I cannot disable hrtimers because shaper precision gets too bad.
Thanks a lot for info and patches :-)
On Wed, Jan 14, 2009 at 03:05:46PM +0200, Denys Fedoryschenko wrote:
> > I'm afraid its a bit more than that:
> >
> > ...
> >
> > Also, all this is rather invasive and large, so I'm not sure it will
> > meet the -stable criteria.
>
> Then i will try to run 2.6.29-rc1-git4. Kind of risky to run rc1 on loaded
> router, but i am hitting this bug with hrtimers hardly, so no choice for me
> seems. And i cannot disable hrtimers because of bad shaper precision.
> Thanks a lot for info and patches :-)
>
Denys, since these three patches worked for Chris, I guess it's safer
to stay with 2.6.28 anyway (or my patches mentioned earlier in this
thread). rc1 is really not stable...
Jarek P.
On Wed, 2009-01-14 at 13:12 +0000, Jarek Poplawski wrote:
> On Wed, Jan 14, 2009 at 03:05:46PM +0200, Denys Fedoryschenko wrote:
> > > I'm afraid its a bit more than that:
> > >
> > > ...
> > >
> > > Also, all this is rather invasive and large, so I'm not sure it will
> > > meet the -stable criteria.
> >
> > Then i will try to run 2.6.29-rc1-git4. Kind of risky to run rc1 on loaded
> > router, but i am hitting this bug with hrtimers hardly, so no choice for me
> > seems. And i cannot disable hrtimers because of bad shaper precision.
> > Thanks a lot for info and patches :-)
> >
>
> Denys, since these three patches worked for Chris, I guess it's safer
> to stay with 2.6.28 anyway (or my patches mentioned earlier in this
> thread). rc1 is really not stable...
Yeah, but those three patches have some holes in them still, so I cannot
recommend just patching in those three.
On Wednesday 14 January 2009 15:15:29 Peter Zijlstra wrote:
> >
> > Denys, since these three patches worked for Chris, I guess it's safer
> > to stay with 2.6.28 anyway (or my patches mentioned earlier in this
> > thread). rc1 is really not stable...
>
> Yeah, but those three patches have some holes in them still, so I cannot
> recommend just patching in those three.
I will test rc1-git4 on a few less important units which have redundancy,
and then decide. I have already applied these 3 patches, but I prefer a complete
solution. If I hit any bugs, at least I will do my best to report them and
that way help fix them :-)
On Wed, Jan 14, 2009 at 02:15:29PM +0100, Peter Zijlstra wrote:
> On Wed, 2009-01-14 at 13:12 +0000, Jarek Poplawski wrote:
> > On Wed, Jan 14, 2009 at 03:05:46PM +0200, Denys Fedoryschenko wrote:
> > > > I'm afraid its a bit more than that:
> > > >
> > > > ...
> > > >
> > > > Also, all this is rather invasive and large, so I'm not sure it will
> > > > meet the -stable criteria.
> > >
> > > Then i will try to run 2.6.29-rc1-git4. Kind of risky to run rc1 on loaded
> > > router, but i am hitting this bug with hrtimers hardly, so no choice for me
> > > seems. And i cannot disable hrtimers because of bad shaper precision.
> > > Thanks a lot for info and patches :-)
> > >
> >
> > Denys, since these three patches worked for Chris, I guess it's safer
> > to stay with 2.6.28 anyway (or my patches mentioned earlier in this
> > thread). rc1 is really not stable...
>
> Yeah, but those three patches have some holes in them still, so I cannot
> recommend just patching in those three.
>
OK, I hope Denys can apply more, but what about others? Without any
patches the hole seems to be much bigger.
Jarek P.
On Wed, 2009-01-14 at 13:26 +0000, Jarek Poplawski wrote:
> OK, I hope Denys can apply more, but what about others? Without any
> patches the hole seems to be much bigger.
OK, I read most of this thread on netdev, but didn't find a clear clue
on the specific hrtimer insertion race.
Do you have any clear ideas or should I poke at the htb/hrtimer code a
little?
On Wed, Jan 14, 2009 at 02:32:26PM +0100, Peter Zijlstra wrote:
> On Wed, 2009-01-14 at 13:26 +0000, Jarek Poplawski wrote:
>
> > OK, I hope Denys can apply more, but what about others? Without any
> > patches the hole seems to be much bigger.
>
> OK, I read most of this thread on netdev, but didn't find a clear clue
> on the specific hrtimer insertion race.
>
> Do you have any clear ideas or should I poke at the htb/hrtimer code a
> little?
>
I didn't diagnose this entirely, and it has been "a few" days since I
wrote my opinion on this, so I have forgotten the details by now, but I
guess the most suspicious place is adding an hrtimer while it's on the
list, maybe even twice in a row. I guess it's not removed from the
rbtree in some case. (I stopped searching after seeing you removed
this list or something...)
Jarek P.
On Wed, Jan 14, 2009 at 02:32:26PM +0100, Peter Zijlstra wrote:
> On Wed, 2009-01-14 at 13:26 +0000, Jarek Poplawski wrote:
>
> > OK, I hope Denys can apply more, but what about others? Without any
> > patches the hole seems to be much bigger.
>
> OK, I read most of this thread on netdev, but didn't find a clear clue
> on the specific hrtimer insertion race.
There is something at the beginning of this thread, plus earlier
threads mostly with Denys as sender, and "htb bug" in the subject.
>
> Do you have any clear ideas or should I poke at the htb/hrtimer code a
> little?
>
...And htb code is htb_dequeue(): qdisc_watchdog_schedule().
Jarek P.
On Wed, 2009-01-14 at 14:13 +0000, Jarek Poplawski wrote:
> On Wed, Jan 14, 2009 at 02:32:26PM +0100, Peter Zijlstra wrote:
> > On Wed, 2009-01-14 at 13:26 +0000, Jarek Poplawski wrote:
> >
> > > OK, I hope Denys can apply more, but what about others? Without any
> > > patches the hole seems to be much bigger.
> >
> > OK, I read most of this thread on netdev, but didn't find a clear clue
> > on the specific hrtimer insertion race.
>
> There is something at the beginning of this thread, plus earlier
> threads mostly with Denys as sender, and "htb bug" in the subject.
>
> >
> > Do you have any clear ideas or should I poke at the htb/hrtimer code a
> > little?
> >
>
> ....And htb code is htb_dequeue(): qdisc_watchdog_schedule().
Right, found all that...
Can't spot anything obviously wrong though.. hrtimer_start*() does
remove_hrtimer() which checks STATE_ENQUEUED, STATE_PENDING and pulls it
off the relevant list before it continues the enqueue.
However a loop in enqueue_hrtimer() would suggest a corrupted RB-tree,
but I cannot find an RB-op that doesn't hold base-lock.
Hohumm..
On Wed, Jan 14, 2009 at 03:28:03PM +0100, Peter Zijlstra wrote:
> On Wed, 2009-01-14 at 14:13 +0000, Jarek Poplawski wrote:
> > On Wed, Jan 14, 2009 at 02:32:26PM +0100, Peter Zijlstra wrote:
> > > On Wed, 2009-01-14 at 13:26 +0000, Jarek Poplawski wrote:
> > >
> > > > OK, I hope Denys can apply more, but what about others? Without any
> > > > patches the hole seems to be much bigger.
> > >
> > > OK, I read most of this thread on netdev, but didn't find a clear clue
> > > on the specific hrtimer insertion race.
> >
> > There is something at the beginning of this thread, plus earlier
> > threads mostly with Denys as sender, and "htb bug" in the subject.
> >
> > >
> > > Do you have any clear ideas or should I poke at the htb/hrtimer code a
> > > little?
> > >
> >
> > ....And htb code is htb_dequeue(): qdisc_watchdog_schedule().
>
> Right, found all that...
>
> Can't spot anything obviously wrong though.. hrtimer_start*() does
> remove_hrtimer() which checks STATE_ENQUEUED, STATE_PENDING and pulls it
> off the relevant list before it continues the enqueue.
>
> However a loop in enqueue_hrtimer() would suggest a corrupted RB-tree,
> but I cannot find an RB-op that doesn't hold base-lock.
>
Anyway, I think one of the rbtree traces seemed to show the parent being
the same as the node, if I didn't mix anything up.
Jarek P.
On Wed, 14 Jan 2009, Peter Zijlstra wrote:
> On Wed, 2009-01-14 at 14:17 +0200, Denys Fedoryschenko wrote:
> > I will try that patches too, when i got this message, after 3 minutes i got
> > crash of my router :-) after working around 17 hours :-(
> > There is 3 of them by the way, 2 fixes also.
> >
> > hrtimer: removing all ur callback modes
> > hrtimer: removing all ur callback modes, fix hotplug
> > hrtimer: removing all ur callback modes, fix
>
> I'm afraid its a bit more than that:
>
> ...
>
> Also, all this is rather invasive and large, so I'm not sure it will
> meet the -stable criteria.
In my testing (another 12 hours of uptime, BTW), the three patches I am
using are:
ca109491f612aab5c8152207631c0444f63da97f
37810659ea7d9572c5ac284ade272f806ef8f788
a0a99b227da57f81319dd239bc4de811b0f530ec
I have not tried the other patches.
That said, I would not recommend just the three for -stable unless they
get a much wider amount of testing, on multiple platforms. I don't see
that as likely to happen, plus Peter says they are incomplete, so maybe it
is just best to recommend that 2.6.28 users getting crashes while using
HTB try these specific patches at first, and then the rest of the patches
if they do not work.
Thanks,
Chris
On Wed, Jan 14, 2009 at 06:02:04PM +0000, Chris Caputo wrote:
...
> That said, I would not recommend just the three for -stable unless they
> get a much wider amount of testing, on multiple platforms. I don't see
> that as likely to happen, plus Peter says they are incomplete, so maybe it
> is just best to recommend that 2.6.28 users getting crashes while using
> HTB try these specific patches at first, and then the rest of the patches
> if they do not work.
The main problem is that my patches, at least the tested ones, harm HTB's
exactness, and I doubt I could convince anybody to merge them, at
least before your case. It was only reported by two users here (plus
one more in private), and looked like something very rare. After your
report it looks much more necessary.
If there is nothing better, I can recommend it, but IMHO the best
candidate for this is the testing patch #4 from this thread, which
alas wasn't even tested... So, Chris, could you give it a try in
the meantime (without any other patches)?
Thanks,
Jarek P.
(resend testing patch #4 - for 2.6.27 or 2.6.28)
---
diff -Nurp a2.6.27.7/net/sched/sch_htb.c b2.6.27.7/net/sched/sch_htb.c
--- a2.6.27.7/net/sched/sch_htb.c 2008-12-11 08:16:16.000000000 +0000
+++ b2.6.27.7/net/sched/sch_htb.c 2008-12-15 10:44:32.000000000 +0000
@@ -924,6 +924,7 @@ static struct sk_buff *htb_dequeue(struc
 		}
 	}
 	sch->qstats.overlimits++;
+	qdisc_watchdog_cancel(&q->watchdog);
 	qdisc_watchdog_schedule(&q->watchdog, next_event);
 fin:
 	return skb;
> On Wed, Jan 14, 2009 at 06:02:04PM +0000, Chris Caputo wrote:
> ...
>
>> That said, I would not recommend just the three for -stable unless they
>> get a much wider amount of testing, on multiple platforms. I don't see
>> that as likely to happen, plus Peter says they are incomplete, so maybe it
>> is just best to recommend that 2.6.28 users getting crashes while using
>> HTB try these specific patches at first, and then the rest of the patches
>> if they do not work.
>>
>
> The main problem is my patches, at least the tested ones, harm htb's
> exactness, and I doubt I could convince anybody to merege them, at
> least before your case. It was only reported by two users here (plus
> one more on private), and looked like something very rare. After your
> report it looks much more necessary.
>
> If there is nothing better, I can recommend it, but IMHO the best
> candidate for this is the testing patch #4 from this thread, which
> alas wasn't even tested... So, Chris, if you could give it a try in
> the meantime (without any other patches)?
>
> Thanks,
> Jarek P.
>
> (resend testing patch #4 - for 2.6.27 or 2.6.28)
> ---
>
> ...
>
>
Hello all.
I can also add this, maybe it helps:
On old kernels my servers hit 100% soft interrupt load when traffic exceeds
600 mbit/s. Without your patches on the new kernel I only get crashes on
heavily loaded PCs (more than 400-500 mbit/s). Servers that carry 100-200
mbit/s have not crashed for a long time.
I remember that I did not test patch #4, because you said it is only
another temporary workaround and the mainline problem is in hrtimer; but I
did try turning on HiRes and Dynamic Ticks in the kernel - it did not help for me.
Best Regards, Slavon
On Thu, 15 Jan 2009, Jarek Poplawski wrote:
> On Wed, Jan 14, 2009 at 06:02:04PM +0000, Chris Caputo wrote:
> ...
> > That said, I would not recommend just the three for -stable unless they
> > get a much wider amount of testing, on multiple platforms. I don't see
> > that as likely to happen, plus Peter says they are incomplete, so maybe it
> > is just best to recommend that 2.6.28 users getting crashes while using
> > HTB try these specific patches at first, and then the rest of the patches
> > if they do not work.
>
> The main problem is my patches, at least the tested ones, harm htb's
> exactness, and I doubt I could convince anybody to merege them, at
> least before your case. It was only reported by two users here (plus
> one more on private), and looked like something very rare. After your
> report it looks much more necessary.
>
> If there is nothing better, I can recommend it, but IMHO the best
> candidate for this is the testing patch #4 from this thread, which
> alas wasn't even tested... So, Chris, if you could give it a try in
> the meantime (without any other patches)?
>
> Thanks,
> Jarek P.
>
> (resend testing patch #4 - for 2.6.27 or 2.6.28)
> ---
>
> ...
I hope you can forgive me, but since I have something that works on a
production machine, I am not in a position in which I can test this.
Chris
On Thu, Jan 15, 2009 at 07:26:28AM +0000, Chris Caputo wrote:
> On Thu, 15 Jan 2009, Jarek Poplawski wrote:
> > On Wed, Jan 14, 2009 at 06:02:04PM +0000, Chris Caputo wrote:
> > ...
> > > That said, I would not recommend just the three for -stable unless they
> > > get a much wider amount of testing, on multiple platforms. I don't see
> > > that as likely to happen, plus Peter says they are incomplete, so maybe it
> > > is just best to recommend that 2.6.28 users getting crashes while using
> > > HTB try these specific patches at first, and then the rest of the patches
> > > if they do not work.
...
> I hope you can forgive me, but since I have something that works on a
> production machine, I am not in a position in which I can test this.
So, IOW, you don't recommend Peter's patches but prefer to use them,
and recommend my patches, but you are afraid of using them...
Of course I forgive you!-)
Jarek P.
On Thu, Jan 15, 2009 at 10:12:20AM +0300, Badalian Vyacheslav wrote:
...
> Hello all.
> I also can say this, maybe its help:
> At old kernels my servers do 100% soft interupt if traffic more 600mbs.
> Without your patches at new kernel i get crash only at heavy network
> load PCs (more then 400mbs-500mbs). Servers that get 100-200 mbs not
> crashed long time.
> I remember that i not test patch #4, because you sat what its only
> another way to temporary fix and mainline problem in hrtimer , but i try
> turn on HiRes and Dynamic Tics in kernel - its not help for me.
Slavon, I'm very glad of your testing and reporting, and I never
expected you to test anything on production when something else works
for you. On the other hand, what works for you could be unacceptable
for other users, who don't have such crashes and prefer better
exactness. So it's hard to recommend such patches if there are not
enough people interested in them.
Best regards,
Jarek P.
On Thursday 15 January 2009 10:09:21 Jarek Poplawski wrote:
> Slavon, I'm very glad of your testing and reporting, and I never
> expected you test anything on production, when something else works
> for you. On the other hand, what works for you could be unacceptable
> for other users, who don't have such crashes, and prefer better
> exactness. So it's hard to recommend such patches if there is not
> enough people interested in them.
>
> Best regards,
> Jarek P.
Jarek, can you give me the exact name of the patch or a link to it?
I will test it on production.
And I am interested in finding out what the problem is.
On Wed, Jan 14, 2009 at 03:28:03PM +0100, Peter Zijlstra wrote:
...
> Right, found all that...
>
> Can't spot anything obviously wrong though.. hrtimer_start*() does
> remove_hrtimer() which checks STATE_ENQUEUED, STATE_PENDING and pulls it
> off the relevant list before it continues the enqueue.
>
> However a loop in enqueue_hrtimer() would suggest a corrupted RB-tree,
> but I cannot find an RB-op that doesn't hold base-lock.
>
I've revisited it yesterday, and if I don't miss something, there is
possible a scenario similar to this:
cpu1:                                 cpu2:

run_hrtimer_pending
  spin_unlock
    restart = fn(timer)
                                      hrtimer_start
                                        enqueue_hrtimer

                                      hrtimer_start
                                        remove_hrtimer
                                        (the HRTIMER_STATE_CALLBACK is removed)
                                        switch_hrtimer_base
spin_lock
(not this hrtimer's anymore)
__remove_hrtimer
list_add_tail                           enqueue_hrtimer
Jarek P.
On Thu, Jan 15, 2009 at 11:01:14AM +0200, Denys Fedoryschenko wrote:
...
> Jarek, can you give me exact name of patch or link to it?
> I will test it on production.
> And i am interested in searching, what is a problem.
If there is nothing better I could recommend the patch below for
-stable (2.6.28), when it's tested.
Thanks,
Jarek P.
---
diff -Nurp a2.6.27.7/net/sched/sch_htb.c b2.6.27.7/net/sched/sch_htb.c
--- a2.6.27.7/net/sched/sch_htb.c 2008-12-11 08:16:16.000000000 +0000
+++ b2.6.27.7/net/sched/sch_htb.c 2008-12-15 10:44:32.000000000 +0000
@@ -924,6 +924,7 @@ static struct sk_buff *htb_dequeue(struc
 		}
 	}
 	sch->qstats.overlimits++;
+	qdisc_watchdog_cancel(&q->watchdog);
 	qdisc_watchdog_schedule(&q->watchdog, next_event);
 fin:
 	return skb;
Jarek Poplawski writes:
> On Thu, Jan 15, 2009 at 11:01:14AM +0200, Denys Fedoryschenko wrote:
> ...
>
>> Jarek, can you give me exact name of patch or link to it?
>> I will test it on production.
>> And i am interested in searching, what is a problem.
>>
>
> If there is nothing better I could recommend the patch below for
> -stable (2.6.28), when it's tested.
>
> Thanks,
> Jarek P.
> ---
>
> ...
>
>
Jarek, I can easily test patch #4 without 2+3 if needed, even on a heavily
loaded production server. I use dynamic routing, and if a server crashes, traffic
moves to another PC after a few seconds. I did not test it before because you said
it was not needed since 3+2 helped me. I will apply it today and test it on a few servers.
On Thu, Jan 15, 2009 at 07:54:50AM +0000, Jarek Poplawski wrote:
> On Thu, Jan 15, 2009 at 07:26:28AM +0000, Chris Caputo wrote:
> > On Thu, 15 Jan 2009, Jarek Poplawski wrote:
> > > On Wed, Jan 14, 2009 at 06:02:04PM +0000, Chris Caputo wrote:
> > > ...
> > > > ...
> ...
> > I hope you can forgive me, but since I have something that works on a
> > production machine, I am not in a position in which I can test this.
>
> So, IOW, you don't recommend Peter's patches but prefere to use them,
> and recommend my patches, but you are afraid of using them...
Sorry, Chris! It looks like I misunderstood, and you do recommend Peter's
patches after all. (But I guess there are users relying on official kernel
versions.)
I hope you can forgive me,
Jarek P.
On Thu, Jan 15, 2009 at 12:40:59PM +0300, Badalian Vyacheslav wrote:
...
> Jarek, i easy can test patch 4 without 2+3 if it needed at heavy
> production server. I use dynamic routing and if server crashed - traffic
> go to another pc after few seconds. I not test it because you say that
> its not needed if for me help 3+2. I apply it today and test at few servers.
Yes, I thought it was so rare we could wait till 2.6.29, while you had this
patched, and Denys seemed not to see it after changing his server. But
I have changed my mind after seeing Chris's report...
I'm not sure if Peter will recommend something better yet, so I guess
you could wait with this testing for his opinion.
Thanks,
Jarek P.
I didn't notice the bug just because all that time I was using it without
hrtimers (because of nmi_watchdog).
On Thursday 15 January 2009 11:54:54 Jarek Poplawski wrote:
> On Thu, Jan 15, 2009 at 12:40:59PM +0300, Badalian Vyacheslav wrote:
> ...
>
> > Jarek, i easy can test patch 4 without 2+3 if it needed at heavy
> > production server. I use dynamic routing and if server crashed - traffic
> > go to another pc after few seconds. I not test it because you say that
> > its not needed if for me help 3+2. I apply it today and test at few
> > servers.
>
> Yes, I thought it's so rare we can wait till 2.6.29, while you had this
> patched, and Denys seemed to not see it after changing his sever. But
> I have changed my mind after seeing Chris's report...
>
> I'm not sure if Peter would recomend something better yet, so I guess
> you could wait with this testing for his opinion.
>
> Thanks,
> Jarek P.
On Thu, Jan 15, 2009 at 11:57:13AM +0200, Denys Fedoryschenko wrote:
> I didn't notice a bug, just because all that time i was using it without
> hrtimers (because of nmi_watchdog).
>
Yes, it's strange: I think Slavon reported it happened without hrtimers
too.
Jarek P.
On Thu, Jan 15, 2009 at 11:57:13AM +0200, Denys Fedoryschenko wrote:
> I didn't notice a bug, just because all that time i was using it without
> hrtimers (because of nmi_watchdog).
>
Denys, maybe I forgot something, but I think you reported this bug
with nmi_watchdog too...
Jarek P.
Yes, you are correct - on different hardware. That's strange.
On Thursday 15 January 2009 12:10:15 Jarek Poplawski wrote:
>
> Denys, maybe I forgot something, but I think you reported this bug
> with nmi_watchdog too...
>
> Jarek P.
On Thu, 2009-01-15 at 09:01 +0000, Jarek Poplawski wrote:
> On Wed, Jan 14, 2009 at 03:28:03PM +0100, Peter Zijlstra wrote:
> ....
> > Right, found all that...
> >
> > Can't spot anything obviously wrong though.. hrtimer_start*() does
> > remove_hrtimer() which checks STATE_ENQUEUED, STATE_PENDING and pulls it
> > off the relevant list before it continues the enqueue.
> >
> > However a loop in enqueue_hrtimer() would suggest a corrupted RB-tree,
> > but I cannot find an RB-op that doesn't hold base-lock.
> >
>
> I've revisited it yesterday, and if I don't miss something, there is
> possible a scenario similar to this:
>
> cpu1:                                 cpu2:
>
> run_hrtimer_pending
>   spin_unlock
>     restart = fn(timer)
>                                       hrtimer_start
>                                         enqueue_hrtimer
>
>                                       hrtimer_start
>                                         remove_hrtimer
>                                         (the HRTIMER_STATE_CALLBACK is removed)
>                                         switch_hrtimer_base
> spin_lock
> (not this hrtimer's anymore)
> __remove_hrtimer
> list_add_tail                           enqueue_hrtimer
>
(looking at .28 code)
run_hrtimer_pending() reads like:
while (pending timers) {
	__remove_hrtimer(timer, HRTIMER_STATE_CALLBACK);
	spin_unlock(&cpu_base->lock);

	fn(timer);

	spin_lock(&cpu_base->lock);
	timer->state &= ~HRTIMER_STATE_CALLBACK; // _should_ result in HRTIMER_STATE_INACTIVE
	if (HRTIMER_RESTART)
		re-queue
	else if (timer->state != INACTIVE) {
		// so another cpu re-queued this timer _while_ we were executing it.
		if (timer is first && !reprogram) {
			__remove_hrtimer(timer, HRTIMER_STATE_PENDING);
			list_add_tail(timer, &cb_pending);
		}
	}
}
So in the window where we drop the lock, one can, as you said, have
another cpu requeue the timer, but the rb_entry and list_entry are free,
so it should not cause the data corruption we're seeing.
On Thu, Jan 15, 2009 at 11:46:48AM +0100, Peter Zijlstra wrote:
> On Thu, 2009-01-15 at 09:01 +0000, Jarek Poplawski wrote:
...
> > spin_lock
> > (not this hrtimer's anymore)
> > __remove_hrtimer
> > list_add_tail enqueue_hrtimer
> >
>
> (looking at .28 code)
>
> run_hrtimer_pending() reads like:
>
> ...
>
> So in the window where we drop the lock, one can, as you said, have
> another cpu requeue the timer, but the rb_entry and list_entry are free,
> so it should not cause the data corruption we're seeing.
>
Can't it be enqueued to the list (without a lock) and to the rbtree at the
same time? And then the removal is done for the list only?
Jarek P.
On Thu, 15 Jan 2009, Jarek Poplawski wrote:
> On Thu, Jan 15, 2009 at 07:54:50AM +0000, Jarek Poplawski wrote:
> > On Thu, Jan 15, 2009 at 07:26:28AM +0000, Chris Caputo wrote:
> > > On Thu, 15 Jan 2009, Jarek Poplawski wrote:
> > > > On Wed, Jan 14, 2009 at 06:02:04PM +0000, Chris Caputo wrote:
> > > > ...
> > > > > ...
> > ...
> > > I hope you can forgive me, but since I have something that works on a
> > > production machine, I am not in a position in which I can test this.
> >
> > So, IOW, you don't recommend Peter's patches but prefere to use them,
> > and recommend my patches, but you are afraid of using them...
>
> Sorry, Chris! It looks like I misunderstood, and you recomend Peter's
> patches yet. (But I guess there are users relying on official kernel
> versions.)
>
> I hope you can forgive me,
:-) No worries.
Just to be clear, by "these specific patches at first", I mean the
"hrtimer: removing all ur callback modes" ones:
ca109491f612aab5c8152207631c0444f63da97f
37810659ea7d9572c5ac284ade272f806ef8f788
a0a99b227da57f81319dd239bc4de811b0f530ec
By "rest of the patches", I mean the above plus the others created by
Peter for hrtimers. Full list:
ca109491f612aab5c8152207631c0444f63da97f
37810659ea7d9572c5ac284ade272f806ef8f788
a0a99b227da57f81319dd239bc4de811b0f530ec
b2e3c0adec918ea22b6c9d7c76193dd3aaba9bd4
8bdec955b0da2ffbd10eb9b200651dd1f9e366f2
d5fd43c4ae04523e1dcd7794f9c511b289851350
731a55ba0f17064f85903b7bf8e24849ec6cfa20
a6037b61c2f5fc99c57c15b26d7cfa58bbb34008
e3f1d883740b09e5116d4d4e30a6a6987264a83c
82c5b7b527ccc4b5d3cf832437e842f9d2920a79
6e5c172cf7ca1ab878cc6a6a4c1d52fef60f3ee0
The first 3 git patches listed above are working for me so far, so I have
not had a need to try the others.
On Thu, Jan 15, 2009 at 11:57:13AM +0200, Denys Fedoryschenko wrote:
> I didn't notice a bug, just because all that time i was using it without
> hrtimers (because of nmi_watchdog).
nmi_watchdog 1 or 2? I'm running with 2 which I've read does not disable
hrtimers, right?
I mention it because my original HTB/hrtimers-related crash on .28 was
without nmi_watchdog altogether. I added both "nmi_watchdog=panic,2" and
Peter's first 3 patches (ca1, 378, a0a) at the same time, to produce an
apparently stable system.
I wish I had a repro scenario in a non-production environment, so I could
help out further with this. If I did, I would test without nmi_watchdog
while trying just 2.6.28 and Jarek's #4 patch.
Chris
On Thu, Jan 15, 2009 at 12:00:50PM +0000, Chris Caputo wrote:
...
> I wish I had a repro scenario in a non-production environment, so I could
> help out further with this. If I did, I would test without nmi_watchdog
> while trying just 2.6.28 and Jarek's #4 patch.
I think this watchdog doesn't matter too much yet. Probably it's more
about hardware (smp, maybe kind of hw timer), and high traffic vs. htb
rules.
Jarek P.
On Thu, 15 Jan 2009, Jarek Poplawski wrote:
> On Thu, Jan 15, 2009 at 12:00:50PM +0000, Chris Caputo wrote:
> ...
> > I wish I had a repro scenario in a non-production environment, so I could
> > help out further with this. If I did, I would test without nmi_watchdog
> > while trying just 2.6.28 and Jarek's #4 patch.
>
> I think this watchdog doesn't matter too much yet. Probably it's more
> about hardware (smp, maybe kind of hw timer), and high traffic vs. htb
> rules.
Per Thomas' comment in http://bugzilla.kernel.org/show_bug.cgi?id=10944
regarding broken hrtimers:
--
nmi_watchdog=1 disables the local apic timers and therefore highres/nohz
Try to use nmi_watchdog=2 instead.
--
Chris
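As a hedged aside (not part of the thread): the switch Thomas suggests is a
boot-time setting, so this is a sketch of a config fragment, assuming a
procfs-based Linux where these files are exposed (layout varies by kernel
version and config):

```shell
# Kernel command line fragment: use the IO-APIC based watchdog (mode 2)
# instead of the local-APIC one (mode 1), so highres/nohz stay usable:
#     nmi_watchdog=2
#
# Sanity checks after boot:
cat /proc/sys/kernel/nmi_watchdog       # nonzero when the watchdog is on
grep -m1 hres_active /proc/timer_list   # ".hres_active : 1" => highres on
```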
Chris Caputo writes:
> On Thu, 15 Jan 2009, Jarek Poplawski wrote:
>
>> On Thu, Jan 15, 2009 at 12:00:50PM +0000, Chris Caputo wrote:
>> ...
>>
>>> I wish I had a repro scenario in a non-production environment, so I could
>>> help out further with this. If I did, I would test without nmi_watchdog
>>> while trying just 2.6.28 and Jarek's #4 patch.
>>>
>> I think this watchdog doesn't matter too much yet. Probably it's more
>> about hardware (smp, maybe kind of hw timer), and high traffic vs. htb
>> rules.
>>
>
> Per Thomas' comment in http://bugzilla.kernel.org/show_bug.cgi?id=10944
> regarding broken hrtimers:
>
> --
> nmi_watchdog=1 disables the local apic timers and therefore highres/nohz
>
> Try to use nmi_watchdog=2 instead.
> --
>
> Chris
>
>
>
Oh... after the first deadlocks I added nmi_watchdog=1 to the kernel and
tested with it. Does this mean that highres/nohz was in fact always
disabled, and that my earlier message, where I said I had tested the bug
with highres/nohz on and off, is wrong?
From: Jarek Poplawski <[email protected]>
Date: Thu, 15 Jan 2009 06:53:22 +0000
> (resend testing patch #4 - for 2.6.27 or 2.6.28)
Jarek, if you deem that this is in fact what we should
submit for -stable please give me a submission with
a suitable commit message and signoffs, and I will queue
it up for -stable.
Thanks.
> diff -Nurp a2.6.27.7/net/sched/sch_htb.c b2.6.27.7/net/sched/sch_htb.c
> --- a2.6.27.7/net/sched/sch_htb.c 2008-12-11 08:16:16.000000000 +0000
> +++ b2.6.27.7/net/sched/sch_htb.c 2008-12-15 10:44:32.000000000 +0000
> @@ -924,6 +924,7 @@ static struct sk_buff *htb_dequeue(struc
> }
> }
> sch->qstats.overlimits++;
> + qdisc_watchdog_cancel(&q->watchdog);
> qdisc_watchdog_schedule(&q->watchdog, next_event);
> fin:
> return skb;
On Sun, Jan 18, 2009 at 09:46:04PM -0800, David Miller wrote:
> From: Jarek Poplawski <[email protected]>
> Date: Thu, 15 Jan 2009 06:53:22 +0000
>
> > (resend testing patch #4 - for 2.6.27 or 2.6.28)
>
> Jarek, if you deem that this is in fact what we should
> submit for -stable please give me a submission with
> a suitable commit message and signoffs, and I will queue
> it up for -stable.
It looks like this should be needed, but I think it's better to wait
2 or 3 days for "Tested-by" from Denys and/or maybe Vyacheslav yet.
(I hoped they would rather test some hrtimers patch, but it looks like
Peter was busy.)
Thanks,
Jarek P.
-----------------> (needed only for -stables: 2.6.28 and older)
pkt_sched: sch_htb: Fix deadlock in hrtimers triggered by HTB
Most probably there is a (still unproven) race in hrtimers (before
2.6.29 kernels), which causes a corruption of hrtimers rbtree. This
patch doesn't fix it, but should let HTB avoid triggering the bug.
Reported-by: Denys Fedoryschenko <[email protected]>
Reported-by: Badalian Vyacheslav <[email protected]>
Reported-by: Chris Caputo <[email protected]>
Signed-off-by: Jarek Poplawski <[email protected]>
---
diff -Nurp a2.6.27.7/net/sched/sch_htb.c b2.6.27.7/net/sched/sch_htb.c
--- a2.6.27.7/net/sched/sch_htb.c 2008-12-11 08:16:16.000000000 +0000
+++ b2.6.27.7/net/sched/sch_htb.c 2008-12-15 10:44:32.000000000 +0000
@@ -924,6 +924,7 @@ static struct sk_buff *htb_dequeue(struc
}
}
sch->qstats.overlimits++;
+ qdisc_watchdog_cancel(&q->watchdog);
qdisc_watchdog_schedule(&q->watchdog, next_event);
fin:
return skb;
Tested for 2 days (weekends are the stress days for this bug, because of
the heavy traffic) on 3 servers. Everything runs fine. A complete test
needs up to 7 days, but I think (IMHO) this patch doesn't break any
functionality and may be safely added to stable. Patches #2+3 were a
100% help for me.
On Mon, Jan 19, 2009 at 10:42:29AM +0300, Badalian Vyacheslav wrote:
> Tested for 2 days (weekends are the stress days for this bug, because of
> the heavy traffic) on 3 servers. Everything runs fine. A complete test
> needs up to 7 days, but I think (IMHO) this patch doesn't break any
> functionality and may be safely added to stable. Patches #2+3 were a
> 100% help for me.
Thank you very much, Slavon! (2+3 are of course better tested, but
I hope this #4 is more appropriate for -stable.)
Jarek P.
From: Jarek Poplawski <[email protected]>
Date: Mon, 19 Jan 2009 07:57:33 +0000
> On Mon, Jan 19, 2009 at 10:42:29AM +0300, Badalian Vyacheslav wrote:
> > Tested for 2 days (weekends are the stress days for this bug, because of
> > the heavy traffic) on 3 servers. Everything runs fine. A complete test
> > needs up to 7 days, but I think (IMHO) this patch doesn't break any
> > functionality and may be safely added to stable. Patches #2+3 were a
> > 100% help for me.
>
> Thank you very much, Slavon! (2+3 are of course better tested, but
> I hope this #4 is more appropriate for -stable.)
I've queued this up, thanks everyone.