2001-10-31 21:57:36

by Bob Matthews

Subject: Stress testing 2.4.14-pre6

Hi Linus,

We've been doing some stress-testing on 2.4.14-pre6 and have encountered
a couple of problems. The platform is an 8xPIII with 8G RAM and 32G
swap. After running Cerberus for about 3 hours, the machine hung
completely. I was not able to switch VC's.

Unfortunately, this is as detailed a bug report as I can submit. It
looks like Magic SysRq is broken in this kernel. <alt><SysRq>T will
print the column headings but nothing else.
--
Bob Matthews
Red Hat, Inc.


2001-11-01 00:33:45

by Alan

Subject: Re: Stress testing 2.4.14-pre6

> Unfortunately, this is as detailed a bug report as I can submit. It
> looks like Magic SysRq is broken in this kernel. <alt><SysRq>T will
> print the column headings but nothing else.

That's consistent with memory corruption trashing the task list.

2001-11-01 16:27:01

by Linus Torvalds

Subject: Re: Stress testing 2.4.14-pre6

In article <[email protected]>,
Bob Matthews <[email protected]> wrote:
>Hi Linus,
>
>We've been doing some stress-testing on 2.4.14-pre6 and have encountered
>a couple of problems. The platform is an 8xPIII with 8G RAM and 32G
>swap. After running Cerberus for about 3 hours, the machine hung
>completely. I was not able to switch VC's.

There is some race somewhere - I've found one interrupt race (that
actually seems to exist in the 2.2.x VM too, but is probably _much_
harder to trigger) where an interrupt at _just_ the right time will
corrupt the per-process local page list. That looks so unlikely that I
doubt that is it, but I'm looking for others (the irq one wasn't even a
SMP race - it's on UP too, surprise surprise).

Working on it, in other words.

Linus

2001-11-01 17:01:34

by Marcelo Tosatti

Subject: Re: Stress testing 2.4.14-pre6



On Thu, 1 Nov 2001, Linus Torvalds wrote:

> In article <[email protected]>,
> Bob Matthews <[email protected]> wrote:
> >Hi Linus,
> >
> >We've been doing some stress-testing on 2.4.14-pre6 and have encountered
> >a couple of problems. The platform is an 8xPIII with 8G RAM and 32G
> >swap. After running Cerberus for about 3 hours, the machine hung
> >completely. I was not able to switch VC's.
>
> There is some race somewhere - I've found one interrupt race (that
> actually seems to exist in the 2.2.x VM too, but is probably _much_
> harder to trigger) where an interrupt at _just_ the right time will
> corrupt the per-process local page list. That looks so unlikely that I
> doubt that is it, but I'm looking for others (the irq one wasn't even a
> SMP race - it's on UP too, surprise surprise).
>
> Working on it, in other words.

Would you mind describing this race?

Thanks


2001-11-01 17:11:45

by Linus Torvalds

Subject: Re: Stress testing 2.4.14-pre6


On Thu, 1 Nov 2001, Marcelo Tosatti wrote:
> >
> > There is some race somewhere - I've found one interrupt race (that
> > actually seems to exist in the 2.2.x VM too, but is probably _much_
> > harder to trigger) where an interrupt at _just_ the right time will
> > corrupt the per-process local page list. That looks so unlikely that I
> > doubt that is it, but I'm looking for others (the irq one wasn't even a
> > SMP race - it's on UP too, surprise surprise).
> >
> > Working on it, in other words.
>
> Would you mind describing this race?

Both 2.2.x and the new VM (which, through Andrea, has a lot of the same
things) have this notion of a per-process "free pages list" that is
replenished by any freeing that the process does itself when it gets into
the "try_to_free_memory()" path.

The trigger for refilling this list is "current->flags & PF_FREE_PAGES".

The bug is that you can be in the middle of adding such a recently freed
page to the per-process list of free pages, and an interrupt comes in.

The interrupt (or bottom half), in turn, might do something like

    page = get_free_page(GFP_ATOMIC);
    ...
    free_page(page);

and now the free_page() inside the interrupt context will _also_ trigger
the PF_FREE_PAGES test, and _also_ add the page to the list. Except, of
course, the list is totally unprotected by any locks, so it may not be
valid at this point.

Fix is to only care about the PF_FREE_PAGES bit when not in an interrupt
context.
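
The race and the fix described above can be modeled in user space. What
follows is a minimal sketch under stated assumptions, not the actual kernel
patch: `struct task`, `free_page_model()`, `fake_interrupt()`,
`run_scenario()` and the `in_irq` and `fix_enabled` flags are hypothetical
stand-ins for `current`, the page-freeing path, an interrupt handler, and an
in-interrupt test. The "interrupt" is simply a callback invoked between the
two stores of the list insertion, which is the window the bug needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct page {
    struct page *next;
};

struct task {
    bool pf_free_pages;        /* models current->flags & PF_FREE_PAGES */
    struct page *local_pages;  /* per-process list head, no lock */
    int nr_local_pages;        /* how many pages we believe are on the list */
};

typedef void (*irq_hook)(struct task *);

static bool in_irq;        /* models "running in interrupt context" */
static bool fix_enabled;   /* when true, interrupt context skips the list */
static int global_freed;   /* pages handed back to the (modeled) global pool */
static struct page irq_page;

/* Models the freeing path. `hook`, if non-NULL, fires in the middle of the
 * list insertion to simulate an interrupt arriving at the worst moment. */
static void free_page_model(struct task *t, struct page *p, irq_hook hook)
{
    assert(t != NULL && p != NULL);

    if (!t->pf_free_pages || (fix_enabled && in_irq)) {
        global_freed++;          /* normal path: not captured per-process */
        return;
    }

    p->next = t->local_pages;    /* step 1: remember the old head */
    if (hook != NULL)
        hook(t);                 /* the interrupt lands here */
    t->local_pages = p;          /* step 2: publish the new head */
    t->nr_local_pages++;
}

/* Models an interrupt handler that frees a page it allocated earlier. */
static void fake_interrupt(struct task *t)
{
    in_irq = true;
    free_page_model(t, &irq_page, NULL);
    in_irq = false;
}

static int list_length(const struct task *t)
{
    int n = 0;
    for (const struct page *p = t->local_pages; p != NULL; p = p->next)
        n++;
    return n;
}

/* Returns true if the list still agrees with its counter afterwards. */
bool run_scenario(bool with_fix)
{
    struct page a = { NULL };
    struct task t = { true, NULL, 0 };

    fix_enabled = with_fix;
    in_irq = false;
    global_freed = 0;
    irq_page.next = NULL;

    free_page_model(&t, &a, fake_interrupt);
    return list_length(&t) == t.nr_local_pages;
}
```

In this model, `run_scenario(false)` ends with the counter at 2 while only
one page is reachable from the list head, because step 2 overwrites the head
that the interrupt just published. `run_scenario(true)` keeps both at 1,
since the free done in interrupt context bypasses the per-process list.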

Anyway, I seriously doubt this explains any real-world bad behaviour: the
window for the interrupt hitting a half-way updated list is something like
two instructions long out of the whole memory freeing path. AND most
interrupts don't actually do any allocation.

Linus

2001-11-01 17:20:14

by Jeff Garzik

Subject: Re: Stress testing 2.4.14-pre6

Linus Torvalds wrote:
> Anyway, I seriously doubt this explains any real-world bad behaviour: the
> window for the interrupt hitting a half-way updated list is something like
> two instructions long out of the whole memory freeing path. AND most
> interrupts don't actually do any allocation.

Network Rx interrupts do.... definitely not as frequent as IDE
interrupts, but not infrequent.

--
Jeff Garzik | Only so many songs can be sung
Building 1024 | with two lips, two lungs, and one tongue.
MandrakeSoft | - nomeansno

2001-11-01 17:29:54

by Arjan van de Ven

Subject: Re: Stress testing 2.4.14-pre6

Jeff Garzik wrote:
>
> Linus Torvalds wrote:
> > Anyway, I seriously doubt this explains any real-world bad behaviour: the
> > window for the interrupt hitting a half-way updated list is something like
> > two instructions long out of the whole memory freeing path. AND most
> > interrupts don't actually do any allocation.
>
> Network Rx interrupts do.... definitely not as frequent as IDE
> interrupts, but not infrequent.

Cerberus doesn't use networking in the tested setup iirc

2001-11-01 18:16:31

by Jens Axboe

Subject: Re: Stress testing 2.4.14-pre6

On Thu, Nov 01 2001, Jeff Garzik wrote:
> Linus Torvalds wrote:
> > Anyway, I seriously doubt this explains any real-world bad behaviour: the
> > window for the interrupt hitting a half-way updated list is something like
> > two instructions long out of the whole memory freeing path. AND most
> > interrupts don't actually do any allocation.
>
> Network Rx interrupts do.... definitely not as frequent as IDE
> interrupts, but not infrequent.

Which IDE interrupts allocate memory?!

--
Jens Axboe

2001-11-01 18:21:11

by Jeff Garzik

Subject: Re: Stress testing 2.4.14-pre6

Jens Axboe wrote:
>
> On Thu, Nov 01 2001, Jeff Garzik wrote:
> > Linus Torvalds wrote:
> > > Anyway, I seriously doubt this explains any real-world bad behaviour: the
> > > window for the interrupt hitting a half-way updated list is something like
> > > two instructions long out of the whole memory freeing path. AND most
> > > interrupts don't actually do any allocation.
> >
> > Network Rx interrupts do.... definitely not as frequent as IDE
> > interrupts, but not infrequent.
>
> Which IDE interrupts allocate memory?!

Sorry, I meant as in, IDE interrupts occur more frequently than Rx
interrupts.

English is my first language... really.

--
Jeff Garzik | Only so many songs can be sung
Building 1024 | with two lips, two lungs, and one tongue.
MandrakeSoft | - nomeansno