2007-05-29 21:58:08

by Ian Kumlien

[permalink] [raw]
Subject: [BUG] Something goes wrong with timer statistics.

Hi,

As the daystar sets, i try to play some with my new would be
firewall/server, but since this will be running for quite some time i
have been experimenting with powertop to find out what i can do to limit
it's power usage.

But, if i run powertop for too long or a few times to many... this
happens:
http://pomac.netswarm.net/pics/kernel_panic.jpg

If i don't run powertop, it is rock solid... Compiling for hours,
running memtest for hours etc etc...

--
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net


2007-05-29 22:35:57

by Ian Kumlien

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On ons, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote:
> Hi Ian,
>
> (Thomas "The Wizard of Time" added to CC :))
>
> On 29/05/07, Ian Kumlien <[email protected]> wrote:
> > Hi,
> >
> > As the daystar sets, i try to play some with my new would be
> > firewall/server, but since this will be running for quite some time i
> > have been experimenting with powertop to find out what i can do to limit
> > it's power usage.
> >
> > But, if i run powertop for too long or a few times to many... this
> > happens:
> > http://pomac.netswarm.net/pics/kernel_panic.jpg
> >
> > If i don't run powertop, it is rock solid... Compiling for hours,
> > running memtest for hours etc etc...
>
> (Where is my magic ball?)

(Lost? =))

> Let me guess - Pentium 4 with HT?

Core 2 Duo, without HT =)

> Also, please try the latest hrt tree
> http://tglx.de/projects/hrtimers/2.6.22-rc3/

If you still think i should try that, i will, this box is not in
production yet for a while so... =)

I switched to using 64 bit hpet after the last crash, will this
influence the results?

> Regards,
> Michal
>
--
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net

2007-05-29 22:37:24

by David Miller

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

From: Ian Kumlien <[email protected]>
Date: Tue, 29 May 2007 23:38:48 +0200

> As the daystar sets, i try to play some with my new would be
> firewall/server, but since this will be running for quite some time i
> have been experimenting with powertop to find out what i can do to limit
> it's power usage.
>
> But, if i run powertop for too long or a few times to many... this
> happens:
> http://pomac.netswarm.net/pics/kernel_panic.jpg
>
> If i don't run powertop, it is rock solid... Compiling for hours,
> running memtest for hours etc etc...

I see this same exact problem on sparc64.

2007-05-29 22:38:22

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On Wed, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote:
> Hi Ian,
>
> (Thomas "The Wizard of Time" added to CC :))

Added more wizards :)

> On 29/05/07, Ian Kumlien <[email protected]> wrote:
> > Hi,
> >
> > As the daystar sets, i try to play some with my new would be
> > firewall/server, but since this will be running for quite some time i
> > have been experimenting with powertop to find out what i can do to limit
> > it's power usage.
> >
> > But, if i run powertop for too long or a few times to many... this
> > happens:
> > http://pomac.netswarm.net/pics/kernel_panic.jpg

This was reported before. It's incredibly hard to reproduce.

Ian, is this happening reproducible ?

> > If i don't run powertop, it is rock solid... Compiling for hours,
> > running memtest for hours etc etc...
>
> (Where is my magic ball?)
>
> Let me guess - Pentium 4 with HT?

I might be wrong, but it's a newer core duo thingy.

tglx


2007-05-29 22:42:23

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On Tue, 2007-05-29 at 15:37 -0700, David Miller wrote:
> From: Ian Kumlien <[email protected]>
> Date: Tue, 29 May 2007 23:38:48 +0200
>
> > As the daystar sets, i try to play some with my new would be
> > firewall/server, but since this will be running for quite some time i
> > have been experimenting with powertop to find out what i can do to limit
> > it's power usage.
> >
> > But, if i run powertop for too long or a few times to many... this
> > happens:
> > http://pomac.netswarm.net/pics/kernel_panic.jpg
> >
> > If i don't run powertop, it is rock solid... Compiling for hours,
> > running memtest for hours etc etc...
>
> I see this same exact problem on sparc64.

100% reproducible ?

tglx



2007-05-29 22:44:04

by David Miller

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

From: "Michal Piotrowski" <[email protected]>
Date: Wed, 30 May 2007 00:41:46 +0200

> On 30/05/07, David Miller <[email protected]> wrote:
> > From: Ian Kumlien <[email protected]>
> > Date: Tue, 29 May 2007 23:38:48 +0200
> >
> > > As the daystar sets, i try to play some with my new would be
> > > firewall/server, but since this will be running for quite some time i
> > > have been experimenting with powertop to find out what i can do to limit
> > > it's power usage.
> > >
> > > But, if i run powertop for too long or a few times to many... this
> > > happens:
> > > http://pomac.netswarm.net/pics/kernel_panic.jpg
> > >
> > > If i don't run powertop, it is rock solid... Compiling for hours,
> > > running memtest for hours etc etc...
> >
> > I see this same exact problem on sparc64.
>
> Have you tried this patch?
> http://lkml.org/lkml/2007/5/29/392

Of course, I worked with Thomas on the fix :-)

2007-05-29 22:44:26

by David Miller

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

From: Thomas Gleixner <[email protected]>
Date: Wed, 30 May 2007 00:42:12 +0200

> On Tue, 2007-05-29 at 15:37 -0700, David Miller wrote:
> > I see this same exact problem on sparc64.
>
> 100% reproducible ?

I think so, I'll try to find time to tinker with it a bit
more.

2007-05-29 22:51:36

by Ian Kumlien

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On ons, 2007-05-30 at 00:38 +0200, Thomas Gleixner wrote:
> On Wed, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote:
> > Hi Ian,
> >
> > (Thomas "The Wizard of Time" added to CC :))
>
> Added more wizards :)
>
> > On 29/05/07, Ian Kumlien <[email protected]> wrote:
> > > Hi,
> > >
> > > As the daystar sets, i try to play some with my new would be
> > > firewall/server, but since this will be running for quite some time i
> > > have been experimenting with powertop to find out what i can do to limit
> > > it's power usage.
> > >
> > > But, if i run powertop for too long or a few times to many... this
> > > happens:
> > > http://pomac.netswarm.net/pics/kernel_panic.jpg
>
> This was reported before. It's incredibly hard to reproduce.
>
> Ian, is this happening reproducible ?

Seems like it, i haven't actually TRIED to crash yet...
But i can try if i don't have to crash in the future =)

(The machine is just a motherboard lying on the floor with some disks
attached, so i have to bring out my pocket knife and short circuit the
right pins =))

> > > If i don't run powertop, it is rock solid... Compiling for hours,
> > > running memtest for hours etc etc...
> >
> > (Where is my magic ball?)
> >
> > Let me guess - Pentium 4 with HT?
>
> I might be wrong, but it's a newer core duo thingy.
>
> tglx
>
--
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net

2007-05-29 22:52:53

by Ian Kumlien

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On tis, 2007-05-29 at 15:44 -0700, David Miller wrote:
> From: "Michal Piotrowski" <[email protected]>
> Date: Wed, 30 May 2007 00:41:46 +0200
>
> > On 30/05/07, David Miller <[email protected]> wrote:
> > > From: Ian Kumlien <[email protected]>
> > > Date: Tue, 29 May 2007 23:38:48 +0200
> > >
> > > > As the daystar sets, i try to play some with my new would be
> > > > firewall/server, but since this will be running for quite some time i
> > > > have been experimenting with powertop to find out what i can do to limit
> > > > it's power usage.
> > > >
> > > > But, if i run powertop for too long or a few times to many... this
> > > > happens:
> > > > http://pomac.netswarm.net/pics/kernel_panic.jpg
> > > >
> > > > If i don't run powertop, it is rock solid... Compiling for hours,
> > > > running memtest for hours etc etc...
> > >
> > > I see this same exact problem on sparc64.
> >
> > Have you tried this patch?
> > http://lkml.org/lkml/2007/5/29/392
>
> Of course, I worked with Thomas on the fix :-)

I dunno if this applies to me, i run a 32 bit userspace...

--
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net

2007-05-29 22:53:44

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On Tue, 2007-05-29 at 15:44 -0700, David Miller wrote:
> From: Thomas Gleixner <[email protected]>
> Date: Wed, 30 May 2007 00:42:12 +0200
>
> > On Tue, 2007-05-29 at 15:37 -0700, David Miller wrote:
> > > I see this same exact problem on sparc64.
> >
> > 100% reproducible ?
>
> I think so, I'll try to find time to tinker with it a bit
> more.

It seems to be a subtle bug in the timer stats code, but I did not spot
it.

tglx


2007-05-29 23:03:33

by David Miller

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

From: Ian Kumlien <[email protected]>
Date: Wed, 30 May 2007 00:51:52 +0200

> On tis, 2007-05-29 at 15:44 -0700, David Miller wrote:
> > From: "Michal Piotrowski" <[email protected]>
> > Date: Wed, 30 May 2007 00:41:46 +0200
> >
> > > On 30/05/07, David Miller <[email protected]> wrote:
> > > > From: Ian Kumlien <[email protected]>
> > > > Date: Tue, 29 May 2007 23:38:48 +0200
> > > >
> > > > > As the daystar sets, i try to play some with my new would be
> > > > > firewall/server, but since this will be running for quite some time i
> > > > > have been experimenting with powertop to find out what i can do to limit
> > > > > it's power usage.
> > > > >
> > > > > But, if i run powertop for too long or a few times to many... this
> > > > > happens:
> > > > > http://pomac.netswarm.net/pics/kernel_panic.jpg
> > > > >
> > > > > If i don't run powertop, it is rock solid... Compiling for hours,
> > > > > running memtest for hours etc etc...
> > > >
> > > > I see this same exact problem on sparc64.
> > >
> > > Have you tried this patch?
> > > http://lkml.org/lkml/2007/5/29/392
> >
> > Of course, I worked with Thomas on the fix :-)
>
> I dunno if this applies to me, i run a 32 bit userspace...

That patch fixes a problem on 64-bit kernels, regardless of
userspace, when NOHZ is enabled.

2007-05-30 00:05:05

by Ian Kumlien

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On tis, 2007-05-29 at 16:03 -0700, David Miller wrote:
> From: Ian Kumlien <[email protected]>
> Date: Wed, 30 May 2007 00:51:52 +0200
>
> > On tis, 2007-05-29 at 15:44 -0700, David Miller wrote:
> > > From: "Michal Piotrowski" <[email protected]>
> > > Date: Wed, 30 May 2007 00:41:46 +0200
> > >
> > > > On 30/05/07, David Miller <[email protected]> wrote:
> > > > > From: Ian Kumlien <[email protected]>
> > > > > Date: Tue, 29 May 2007 23:38:48 +0200
> > > > >
> > > > > > As the daystar sets, i try to play some with my new would be
> > > > > > firewall/server, but since this will be running for quite some time i
> > > > > > have been experimenting with powertop to find out what i can do to limit
> > > > > > it's power usage.
> > > > > >
> > > > > > But, if i run powertop for too long or a few times to many... this
> > > > > > happens:
> > > > > > http://pomac.netswarm.net/pics/kernel_panic.jpg
> > > > > >
> > > > > > If i don't run powertop, it is rock solid... Compiling for hours,
> > > > > > running memtest for hours etc etc...
> > > > >
> > > > > I see this same exact problem on sparc64.
> > > >
> > > > Have you tried this patch?
> > > > http://lkml.org/lkml/2007/5/29/392
> > >
> > > Of course, I worked with Thomas on the fix :-)
> >
> > I dunno if this applies to me, i run a 32 bit userspace...
>
> That patch fixes a problem on 64-bit kernels, regardless of
> userspace, when NOHZ is enabled.

Sorry, i run all 32 bit, i was on the phone while typing that, so a wee
bit distracted. Anyways, i run a nohz 32 bit kernel.

I dunno how good a intel T7200 is when it comes to 64 bit
kernel/userspace.. (Haven't really read up on it)

--
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net

2007-05-30 00:45:09

by Stephane Casset

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

Le Tue, May 29, 2007 at 11:38:48PM +0200, Ian Kumlien écrivait :
> Hi,
>
> As the daystar sets, i try to play some with my new would be
> firewall/server, but since this will be running for quite some time i
> have been experimenting with powertop to find out what i can do to limit
> it's power usage.
>
> But, if i run powertop for too long or a few times to many... this
> happens:
> http://pomac.netswarm.net/pics/kernel_panic.jpg
>
> If i don't run powertop, it is rock solid... Compiling for hours,
> running memtest for hours etc etc...

Same here, P4 HT, if I run powertop for 5-10 minutes the system crash :(
It happens with 2.6.22-rc{2,3} at least :( I didn't try before.

I don't have a oops since the last crash was in X, but next time I will
try to catch it. If there is any patch around to fix this I am willing
to test.

Regards
--
Stéphane Casset LOGIDÉE sàrl Se faire plaisir d'apprendre
1a, rue Pasteur Tel : +33 388 23 69 77 [email protected]
F-67540 OSTWALD Fax : +33 388 23 69 77 http://logidee.com

2007-05-30 06:52:18

by Björn Steinbrink

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On 2007.05.30 00:38:08 +0200, Thomas Gleixner wrote:
> On Wed, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote:
> > Hi Ian,
> >
> > (Thomas "The Wizard of Time" added to CC :))
>
> Added more wizards :)
>
> > On 29/05/07, Ian Kumlien <[email protected]> wrote:
> > > Hi,
> > >
> > > As the daystar sets, i try to play some with my new would be
> > > firewall/server, but since this will be running for quite some time i
> > > have been experimenting with powertop to find out what i can do to limit
> > > it's power usage.
> > >
> > > But, if i run powertop for too long or a few times to many... this
> > > happens:
> > > http://pomac.netswarm.net/pics/kernel_panic.jpg
>
> This was reported before. It's incredibly hard to reproduce.

Hm, tstats_write() only grabs show_mutex, but it calls reset_entries()
which happily messes with the entries table, shouldn't the table_lock
also be held in that case?

Bj?rn

2007-05-30 07:10:16

by Björn Steinbrink

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On 2007.05.30 08:52:04 +0200, Bj?rn Steinbrink wrote:
> On 2007.05.30 00:38:08 +0200, Thomas Gleixner wrote:
> > On Wed, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote:
> > > Hi Ian,
> > >
> > > (Thomas "The Wizard of Time" added to CC :))
> >
> > Added more wizards :)
> >
> > > On 29/05/07, Ian Kumlien <[email protected]> wrote:
> > > > Hi,
> > > >
> > > > As the daystar sets, i try to play some with my new would be
> > > > firewall/server, but since this will be running for quite some time i
> > > > have been experimenting with powertop to find out what i can do to limit
> > > > it's power usage.
> > > >
> > > > But, if i run powertop for too long or a few times to many... this
> > > > happens:
> > > > http://pomac.netswarm.net/pics/kernel_panic.jpg
> >
> > This was reported before. It's incredibly hard to reproduce.
>
> Hm, tstats_write() only grabs show_mutex, but it calls reset_entries()
> which happily messes with the entries table, shouldn't the table_lock
> also be held in that case?

Hrmpf, nevermind, I missed the active flag...

Sorry,
Bj?rn

2007-05-30 11:38:36

by Björn Steinbrink

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On 2007.05.30 00:38:08 +0200, Thomas Gleixner wrote:
> On Wed, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote:
> > Hi Ian,
> >
> > (Thomas "The Wizard of Time" added to CC :))
>
> Added more wizards :)
>
> > On 29/05/07, Ian Kumlien <[email protected]> wrote:
> > > Hi,
> > >
> > > As the daystar sets, i try to play some with my new would be
> > > firewall/server, but since this will be running for quite some time i
> > > have been experimenting with powertop to find out what i can do to limit
> > > it's power usage.
> > >
> > > But, if i run powertop for too long or a few times to many... this
> > > happens:
> > > http://pomac.netswarm.net/pics/kernel_panic.jpg
>
> This was reported before. It's incredibly hard to reproduce.

OK, second try, this time with a patch. In timer_stats_update_stats,
input is allocated on the stack, so it is uninitialized, in particular
the "next" field is random. Now in tstat_lookup, the new entry "curr" is
initialized with the values from "input" (passed as "entry") and "next"
is set to NULL _after_ it is added to the list, so if a second CPU is
running the fastpath lookup while we're inserting the new element, it
might get the garbage value in "next". The patch below fixes that.

Bj?rn



Initialize the "next" field of a timer stats entry before it is inserted
into the list to avoid a race with the fastpath lookup.

Signed-off-by: Bj?rn Steinbrink <[email protected]>
---
diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c
index 868f1bc..5bc8f91 100644
--- a/kernel/time/timer_stats.c
+++ b/kernel/time/timer_stats.c
@@ -202,12 +202,12 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm)
if (curr) {
*curr = *entry;
curr->count = 0;
+ curr->next = NULL;
memcpy(curr->comm, comm, TASK_COMM_LEN);
if (prev)
prev->next = curr;
else
*head = curr;
- curr->next = NULL;
}
out_unlock:
spin_unlock(&table_lock);

2007-05-30 12:44:58

by Eric Dumazet

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On Wed, 30 May 2007 13:38:25 +0200
Bj?rn Steinbrink <[email protected]> wrote:

> On 2007.05.30 00:38:08 +0200, Thomas Gleixner wrote:
> > On Wed, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote:
> > > Hi Ian,
> > >
> > > (Thomas "The Wizard of Time" added to CC :))
> >
> > Added more wizards :)
> >
> > > On 29/05/07, Ian Kumlien <[email protected]> wrote:
> > > > Hi,
> > > >
> > > > As the daystar sets, i try to play some with my new would be
> > > > firewall/server, but since this will be running for quite some time i
> > > > have been experimenting with powertop to find out what i can do to limit
> > > > it's power usage.
> > > >
> > > > But, if i run powertop for too long or a few times to many... this
> > > > happens:
> > > > http://pomac.netswarm.net/pics/kernel_panic.jpg
> >
> > This was reported before. It's incredibly hard to reproduce.
>
> OK, second try, this time with a patch. In timer_stats_update_stats,
> input is allocated on the stack, so it is uninitialized, in particular
> the "next" field is random. Now in tstat_lookup, the new entry "curr" is
> initialized with the values from "input" (passed as "entry") and "next"
> is set to NULL _after_ it is added to the list, so if a second CPU is
> running the fastpath lookup while we're inserting the new element, it
> might get the garbage value in "next". The patch below fixes that.
>
> Bj?rn
>
>
>
> Initialize the "next" field of a timer stats entry before it is inserted
> into the list to avoid a race with the fastpath lookup.
>
> Signed-off-by: Bj?rn Steinbrink <[email protected]>
> ---
> diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c
> index 868f1bc..5bc8f91 100644
> --- a/kernel/time/timer_stats.c
> +++ b/kernel/time/timer_stats.c
> @@ -202,12 +202,12 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm)
> if (curr) {
> *curr = *entry;
> curr->count = 0;
> + curr->next = NULL;
> memcpy(curr->comm, comm, TASK_COMM_LEN);
> if (prev)
> prev->next = curr;
> else
> *head = curr;
> - curr->next = NULL;
> }
> out_unlock:
> spin_unlock(&table_lock);
>

Your analysis might be right, not the fix.

You *cannot* assume curr->next = NULL will be done before insert.

You probably also need a memory barrier.

curr->next = NULL;
memcpy(curr->comm, comm, TASK_COMM_LEN);

smp_mb(); /* commit all writes to curr before insert */

if (prev)
prev->next = curr;
else
*head = curr;

2007-05-30 13:15:11

by Björn Steinbrink

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On 2007.05.30 14:44:49 +0200, Eric Dumazet wrote:
> On Wed, 30 May 2007 13:38:25 +0200
> Bj?rn Steinbrink <[email protected]> wrote:
>
> > On 2007.05.30 00:38:08 +0200, Thomas Gleixner wrote:
> > > On Wed, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote:
> > > > On 29/05/07, Ian Kumlien <[email protected]> wrote:
> > > > > Hi,
> > > > >
> > > > > As the daystar sets, i try to play some with my new would be
> > > > > firewall/server, but since this will be running for quite some time i
> > > > > have been experimenting with powertop to find out what i can do to limit
> > > > > it's power usage.
> > > > >
> > > > > But, if i run powertop for too long or a few times to many... this
> > > > > happens:
> > > > > http://pomac.netswarm.net/pics/kernel_panic.jpg
> > >
> > > This was reported before. It's incredibly hard to reproduce.
> >
> > OK, second try, this time with a patch. In timer_stats_update_stats,
> > input is allocated on the stack, so it is uninitialized, in particular
> > the "next" field is random. Now in tstat_lookup, the new entry "curr" is
> > initialized with the values from "input" (passed as "entry") and "next"
> > is set to NULL _after_ it is added to the list, so if a second CPU is
> > running the fastpath lookup while we're inserting the new element, it
> > might get the garbage value in "next". The patch below fixes that.
> >
> > Bj?rn
> >
> >
> >
> > Initialize the "next" field of a timer stats entry before it is inserted
> > into the list to avoid a race with the fastpath lookup.
> >
> > Signed-off-by: Bj?rn Steinbrink <[email protected]>
> > ---
> > diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c
> > index 868f1bc..5bc8f91 100644
> > --- a/kernel/time/timer_stats.c
> > +++ b/kernel/time/timer_stats.c
> > @@ -202,12 +202,12 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm)
> > if (curr) {
> > *curr = *entry;
> > curr->count = 0;
> > + curr->next = NULL;
> > memcpy(curr->comm, comm, TASK_COMM_LEN);
> > if (prev)
> > prev->next = curr;
> > else
> > *head = curr;
> > - curr->next = NULL;
> > }
> > out_unlock:
> > spin_unlock(&table_lock);
> >
>
> Your analysis might be right, not the fix.
>
> You *cannot* assume curr->next = NULL will be done before insert.
>
> You probably also need a memory barrier.

Ehrm, right. I somehow thought of the spinlock being enough to satisfy
that constraint, but if that was the case, the whole problem wouldn't
exist in the first place. D'oh!

Thanks,
Bj?rn


Initialize the "next" field of a timer stats entry before it is inserted
into the list to avoid a race with the fastpath lookup.

Thanks to Eric Dumazet for reminding me of the memory barrier.

Signed-off-by: Bj?rn Steinbrink <[email protected]>
---
diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c
index 868f1bc..ab0ba6c 100644
--- a/kernel/time/timer_stats.c
+++ b/kernel/time/timer_stats.c
@@ -202,12 +202,15 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm)
if (curr) {
*curr = *entry;
curr->count = 0;
+ curr->next = NULL;
memcpy(curr->comm, comm, TASK_COMM_LEN);
+
+ smp_mb(); /* Ensure that curr is initialized before insert */
+
if (prev)
prev->next = curr;
else
*head = curr;
- curr->next = NULL;
}
out_unlock:
spin_unlock(&table_lock);

2007-05-31 14:25:25

by Björn Steinbrink

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On 2007.05.31 12:20:47 +0200, [email protected] wrote:
> On Wed, May 30, 2007 at 03:14:58PM +0200, Bj?rn Steinbrink wrote:
> > Initialize the "next" field of a timer stats entry before it is inserted
> > into the list to avoid a race with the fastpath lookup.
>
> Sorry to say... but this does not fix my problem, however, i can't reach
> that machine at all atm, but i will post the oops later today.

Ok, three times is a charm, right? ;-) The previous patch did fix a
race, but there's still one left. The hash table is never cleared, so it
can still point into the entries array.

As long as we do not race with the insertion, that's not a directly
visible problem as all fields are 0, so we get no match and the lookup
finishes right away. We probably _can_ get weird results though, as the
all-zero entry isn't marked as used, but is used as "head" and "prev" in
this case and even multiple hash entries might point to it in some
cases. That would eventually lead to lost entries because the "next"
field gets overwritten when alloc_entry() finally claims the entry.

But when we race, we can still end up between "*curr = *entry" (makes
"next" contain garbage) and "curr->next = NULL", while we're doing the
fastpath lookup, because the insertion basically already took place!
Replacing "curr->next = NULL" with "entry->next = NULL" and moving it up
is pointless, as "*curr = *entry" isn't atomic, so instead, we have to
simply clean the hash table when the entries are cleared.

Bj?rn


Fix two races in the timer stats lookup code. One by ensuring that the
initialization of a new entry is finished upon insertion of that entry.
The other by cleaning up the hash table when the entries array is
cleared, so that we don't have "pre-inserted" entries.

Thanks to Eric Dumazet for reminding me of the memory barrier.

Signed-off-by: Bj?rn Steinbrink <[email protected]>
---
diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c
index 868f1bc..611b844 100644
--- a/kernel/time/timer_stats.c
+++ b/kernel/time/timer_stats.c
@@ -117,21 +117,6 @@ static struct entry entries[MAX_ENTRIES];

static atomic_t overflow_count;

-static void reset_entries(void)
-{
- nr_entries = 0;
- memset(entries, 0, sizeof(entries));
- atomic_set(&overflow_count, 0);
-}
-
-static struct entry *alloc_entry(void)
-{
- if (nr_entries >= MAX_ENTRIES)
- return NULL;
-
- return entries + nr_entries++;
-}
-
/*
* The entries are in a hash-table, for fast lookup:
*/
@@ -149,6 +134,22 @@ static struct entry *alloc_entry(void)

static struct entry *tstat_hash_table[TSTAT_HASH_SIZE] __read_mostly;

+static void reset_entries(void)
+{
+ nr_entries = 0;
+ memset(entries, 0, sizeof(entries));
+ memset(tstat_hash_table, 0, sizeof(tstat_hash_table));
+ atomic_set(&overflow_count, 0);
+}
+
+static struct entry *alloc_entry(void)
+{
+ if (nr_entries >= MAX_ENTRIES)
+ return NULL;
+
+ return entries + nr_entries++;
+}
+
static int match_entries(struct entry *entry1, struct entry *entry2)
{
return entry1->timer == entry2->timer &&
@@ -202,12 +203,15 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm)
if (curr) {
*curr = *entry;
curr->count = 0;
+ curr->next = NULL;
memcpy(curr->comm, comm, TASK_COMM_LEN);
+
+ smp_mb(); /* Ensure that curr is initialized before insert */
+
if (prev)
prev->next = curr;
else
*head = curr;
- curr->next = NULL;
}
out_unlock:
spin_unlock(&table_lock);

2007-05-31 15:10:38

by Eric Dumazet

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On Thu, 31 May 2007 16:25:22 +0200
Bj?rn Steinbrink <[email protected]> wrote:

> On 2007.05.31 12:20:47 +0200, [email protected] wrote:
> > On Wed, May 30, 2007 at 03:14:58PM +0200, Bj?rn Steinbrink wrote:
> > > Initialize the "next" field of a timer stats entry before it is inserted
> > > into the list to avoid a race with the fastpath lookup.
> >
> > Sorry to say... but this does not fix my problem, however, i can't reach
> > that machine at all atm, but i will post the oops later today.
>
> Ok, three times is a charm, right? ;-) The previous patch did fix a
> race, but there's still one left. The hash table is never cleared, so it
> can still point into the entries array.
>
> As long as we do not race with the insertion, that's not a directly
> visible problem as all fields are 0, so we get no match and the lookup
> finishes right away. We probably _can_ get weird results though, as the
> all-zero entry isn't marked as used, but is used as "head" and "prev" in
> this case and even multiple hash entries might point to it in some
> cases. That would eventually lead to lost entries because the "next"
> field gets overwritten when alloc_entry() finally claims the entry.
>
> But when we race, we can still end up between "*curr = *entry" (makes
> "next" contain garbage) and "curr->next = NULL", while we're doing the
> fastpath lookup, because the insertion basically already took place!
> Replacing "curr->next = NULL" with "entry->next = NULL" and moving it up
> is pointless, as "*curr = *entry" isn't atomic, so instead, we have to
> simply clean the hash table when the entries are cleared.
>
> Bj?rn
>
>
> Fix two races in the timer stats lookup code. One by ensuring that the
> initialization of a new entry is finished upon insertion of that entry.
> The other by cleaning up the hash table when the entries array is
> cleared, so that we don't have "pre-inserted" entries.
>
> Thanks to Eric Dumazet for reminding me of the memory barrier.
>
> Signed-off-by: Bj?rn Steinbrink <[email protected]>
> ---
> diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c
> index 868f1bc..611b844 100644
> --- a/kernel/time/timer_stats.c
> +++ b/kernel/time/timer_stats.c
> @@ -117,21 +117,6 @@ static struct entry entries[MAX_ENTRIES];
>
> static atomic_t overflow_count;
>
> -static void reset_entries(void)
> -{
> - nr_entries = 0;
> - memset(entries, 0, sizeof(entries));
> - atomic_set(&overflow_count, 0);
> -}
> -
> -static struct entry *alloc_entry(void)
> -{
> - if (nr_entries >= MAX_ENTRIES)
> - return NULL;
> -
> - return entries + nr_entries++;
> -}
> -
> /*
> * The entries are in a hash-table, for fast lookup:
> */
> @@ -149,6 +134,22 @@ static struct entry *alloc_entry(void)
>
> static struct entry *tstat_hash_table[TSTAT_HASH_SIZE] __read_mostly;
>
> +static void reset_entries(void)
> +{
> + nr_entries = 0;
> + memset(entries, 0, sizeof(entries));
> + memset(tstat_hash_table, 0, sizeof(tstat_hash_table));
> + atomic_set(&overflow_count, 0);
> +}
> +
> +static struct entry *alloc_entry(void)
> +{
> + if (nr_entries >= MAX_ENTRIES)
> + return NULL;
> +
> + return entries + nr_entries++;
> +}
> +
> static int match_entries(struct entry *entry1, struct entry *entry2)
> {
> return entry1->timer == entry2->timer &&
> @@ -202,12 +203,15 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm)
> if (curr) {
> *curr = *entry;
> curr->count = 0;
> + curr->next = NULL;
> memcpy(curr->comm, comm, TASK_COMM_LEN);
> +
> + smp_mb(); /* Ensure that curr is initialized before insert */
> +
> if (prev)
> prev->next = curr;
> else
> *head = curr;
> - curr->next = NULL;
> }
> out_unlock:
> spin_unlock(&table_lock);
> -

Well... :) , there is still a memory barrier missing it seems.

Another cpu might see a bad value if 'active=1' is set before tstat_hash_table is really cleared.


diff -urp linux/kernel/time/timer_stats.c.orig linux/kernel/time/timer_stats.c
--- kernel/time/timer_stats.c.orig
+++ kernel/time/timer_stats.c
@@ -361,6 +361,7 @@ static ssize_t tstats_write(struct file
if (!active) {
reset_entries();
time_start = ktime_get();
+ smp_mb();
active = 1;
}
break;


2007-05-31 15:27:15

by Björn Steinbrink

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On 2007.05.31 17:10:07 +0200, Eric Dumazet wrote:
> Well... :) , there is still a memory barrier missing it seems.
>
> Another cpu might see a bad value if 'active=1' is set before tstat_hash_table is really cleared.

Hm, that even makes the assumption in my first try valid ;-)
Just for the record, this time I thought that the barrier from the
spinlock in timer_stats_update_stats (right before the check for active)
would be enough, but that's obviously running on the wrong cpu if we
race... *sigh*

Thanks,
Bj?rn



Fix two races in the timer stats lookup code. One by ensuring that the
initialization of a new entry is finished upon insertion of that entry.
The other by cleaning up the hash table when the entries array is
cleared, so that we don't have "pre-inserted" entries.

Thanks to Eric Dumazet for reminding me of the memory barriers.

Signed-off-by: Bj?rn Steinbrink <[email protected]>
---
diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c
index 868f1bc..63d00f9 100644
--- a/kernel/time/timer_stats.c
+++ b/kernel/time/timer_stats.c
@@ -117,21 +117,6 @@ static struct entry entries[MAX_ENTRIES];

static atomic_t overflow_count;

-static void reset_entries(void)
-{
- nr_entries = 0;
- memset(entries, 0, sizeof(entries));
- atomic_set(&overflow_count, 0);
-}
-
-static struct entry *alloc_entry(void)
-{
- if (nr_entries >= MAX_ENTRIES)
- return NULL;
-
- return entries + nr_entries++;
-}
-
/*
* The entries are in a hash-table, for fast lookup:
*/
@@ -149,6 +134,22 @@ static struct entry *alloc_entry(void)

static struct entry *tstat_hash_table[TSTAT_HASH_SIZE] __read_mostly;

+static void reset_entries(void)
+{
+ nr_entries = 0;
+ memset(entries, 0, sizeof(entries));
+ memset(tstat_hash_table, 0, sizeof(tstat_hash_table));
+ atomic_set(&overflow_count, 0);
+}
+
+static struct entry *alloc_entry(void)
+{
+ if (nr_entries >= MAX_ENTRIES)
+ return NULL;
+
+ return entries + nr_entries++;
+}
+
static int match_entries(struct entry *entry1, struct entry *entry2)
{
return entry1->timer == entry2->timer &&
@@ -202,12 +203,15 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm)
if (curr) {
*curr = *entry;
curr->count = 0;
+ curr->next = NULL;
memcpy(curr->comm, comm, TASK_COMM_LEN);
+
+ smp_mb(); /* Ensure that curr is initialized before insert */
+
if (prev)
prev->next = curr;
else
*head = curr;
- curr->next = NULL;
}
out_unlock:
spin_unlock(&table_lock);
@@ -360,6 +364,7 @@ static ssize_t tstats_write(struct file *file, const char __user *buf,
if (!active) {
reset_entries();
time_start = ktime_get();
+ smb_mb();
active = 1;
}
break;

2007-05-31 22:59:33

by Ian Kumlien

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On tor, 2007-05-31 at 17:27 +0200, Björn Steinbrink wrote:
> On 2007.05.31 17:10:07 +0200, Eric Dumazet wrote:
> > Well... :) , there is still a memory barrier missing it seems.
> >
> > Another cpu might see a bad value if 'active=1' is set before tstat_hash_table is really cleared.
>
> Hm, that even makes the assumption in my first try valid ;-)
> Just for the record, this time I thought that the barrier from the
> spinlock in timer_stats_update_stats (right before the check for active)
> would be enough, but that's obviously running on the wrong cpu if we
> race... *sigh*
>
> Thanks,
> Björn
>
>
>
> Fix two races in the timer stats lookup code. One by ensuring that the
> initialization of a new entry is finished upon insertion of that entry.
> The other by cleaning up the hash table when the entries array is
> cleared, so that we don't have "pre-inserted" entries.
>
> Thanks to Eric Dumazet for reminding me of the memory barriers.

Fix the comment below and you can add:
Signed-off-by: Ian Kumlien <[email protected]>

It's currently been running for the longest period ever, ie, 11 minutes
=)

I'm gonna leave it running during the night and send a status update
when the evil daystar reaches it's peak CET. (i haven't been able to
stop since linus mentioned it... damn it... =))

> Signed-off-by: Björn Steinbrink <[email protected]>
> ---

> @@ -360,6 +364,7 @@ static ssize_t tstats_write(struct file *file, const char __user *buf,
> if (!active) {
> reset_entries();
> time_start = ktime_get();
> + smb_mb();

smb? you mean smp =)

> active = 1;
> }
> break;

--
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net

2007-05-31 23:09:00

by Björn Steinbrink

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On 2007.06.01 00:59:11 +0200, Ian Kumlien wrote:
> On tor, 2007-05-31 at 17:27 +0200, Bj?rn Steinbrink wrote:
> > On 2007.05.31 17:10:07 +0200, Eric Dumazet wrote:
> > > Well... :) , there is still a memory barrier missing it seems.
> > >
> > > Another cpu might see a bad value if 'active=1' is set before tstat_hash_table is really cleared.
> >
> > Hm, that even makes the assumption in my first try valid ;-)
> > Just for the record, this time I thought that the barrier from the
> > spinlock in timer_stats_update_stats (right before the check for active)
> > would be enough, but that's obviously running on the wrong cpu if we
> > race... *sigh*
> >
>
> Fix the comment below and you can add:
> Signed-off-by: Ian Kumlien <[email protected]>
>
> It's currently been running for the longest period ever, ie, 11 minutes
> =)

Finally! :-)

> I'm gonna leave it running during the night and send a status update
> when the evil daystar reaches it's peak CET. (i haven't been able to
> stop since linus mentioned it... damn it... =))
>
> > @@ -360,6 +364,7 @@ static ssize_t tstats_write(struct file *file, const char __user *buf,
> > if (!active) {
> > reset_entries();
> > time_start = ktime_get();
> > + smb_mb();
>
> smb? you mean smp =)

No idea how I managed to break that after test compiling it... :-/
Thanks for catching it :-)

So unless I messed up something again, here's the final patch.

Thanks,
Bj?rn


Fix two races in the timer stats lookup code. One by ensuring that the
initialization of a new entry is finished upon insertion of that entry.
The other by cleaning up the hash table when the entries array is
cleared, so that we don't have any "pre-inserted" entries.

Thanks to Eric Dumazet for reminding me of the memory barriers.

Signed-off-by: Bj?rn Steinbrink <[email protected]>
Signed-off-by: Ian Kumlien <[email protected]>
---
diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c
index 868f1bc..fa3d380 100644
--- a/kernel/time/timer_stats.c
+++ b/kernel/time/timer_stats.c
@@ -117,21 +117,6 @@ static struct entry entries[MAX_ENTRIES];

static atomic_t overflow_count;

-static void reset_entries(void)
-{
- nr_entries = 0;
- memset(entries, 0, sizeof(entries));
- atomic_set(&overflow_count, 0);
-}
-
-static struct entry *alloc_entry(void)
-{
- if (nr_entries >= MAX_ENTRIES)
- return NULL;
-
- return entries + nr_entries++;
-}
-
/*
* The entries are in a hash-table, for fast lookup:
*/
@@ -149,6 +134,22 @@ static struct entry *alloc_entry(void)

static struct entry *tstat_hash_table[TSTAT_HASH_SIZE] __read_mostly;

+static void reset_entries(void)
+{
+ nr_entries = 0;
+ memset(entries, 0, sizeof(entries));
+ memset(tstat_hash_table, 0, sizeof(tstat_hash_table));
+ atomic_set(&overflow_count, 0);
+}
+
+static struct entry *alloc_entry(void)
+{
+ if (nr_entries >= MAX_ENTRIES)
+ return NULL;
+
+ return entries + nr_entries++;
+}
+
static int match_entries(struct entry *entry1, struct entry *entry2)
{
return entry1->timer == entry2->timer &&
@@ -202,12 +203,15 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm)
if (curr) {
*curr = *entry;
curr->count = 0;
+ curr->next = NULL;
memcpy(curr->comm, comm, TASK_COMM_LEN);
+
+ smp_mb(); /* Ensure that curr is initialized before insert */
+
if (prev)
prev->next = curr;
else
*head = curr;
- curr->next = NULL;
}
out_unlock:
spin_unlock(&table_lock);
@@ -360,6 +364,7 @@ static ssize_t tstats_write(struct file *file, const char __user *buf,
if (!active) {
reset_entries();
time_start = ktime_get();
+ smp_mb();
active = 1;
}
break;

2007-05-31 23:53:15

by Björn Steinbrink

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.

On 2007.05.30 02:13:50 +0200, Stephane Casset wrote:
> Le Tue, May 29, 2007 at 11:38:48PM +0200, Ian Kumlien ?crivait :
> > Hi,
> >
> > As the daystar sets, i try to play some with my new would be
> > firewall/server, but since this will be running for quite some time i
> > have been experimenting with powertop to find out what i can do to limit
> > it's power usage.
> >
> > But, if i run powertop for too long or a few times to many... this
> > happens:
> > http://pomac.netswarm.net/pics/kernel_panic.jpg
> >
> > If i don't run powertop, it is rock solid... Compiling for hours,
> > running memtest for hours etc etc...
>
> Same here, P4 HT, if I run powertop for 5-10 minutes the system crash :(
> It happens with 2.6.22-rc{2,3} at least :( I didn't try before.
>
> I don't have a oops since the last crash was in X, but next time I will
> try to catch it. If there is any patch around to fix this I am willing
> to test.

There is finally one that seems to work:
http://lkml.org/lkml/2007/5/31/394

I'd be glad if you could give it a try.

Thanks,
Bj?rn

2007-06-01 06:23:08

by Ingo Molnar

[permalink] [raw]
Subject: Re: [BUG] Something goes wrong with timer statistics.


* Bj?rn Steinbrink <[email protected]> wrote:

> > Fix the comment below and you can add:
> > Signed-off-by: Ian Kumlien <[email protected]>
> >
> > It's currently been running for the longest period ever, ie, 11 minutes
> > =)
>
> Finally! :-)

cool! Great detective work Bj?rn!

> Thanks to Eric Dumazet for reminding me of the memory barriers.
>
> Signed-off-by: Bj?rn Steinbrink <[email protected]>
> Signed-off-by: Ian Kumlien <[email protected]>

Acked-by: Ingo Molnar <[email protected]>

Ingo