2001-10-30 21:20:14

by Jeff Garzik

Subject: pre5 VM livelock

2.4.14-pre5 was looking very nice on my alpha, doing RPM builds. It
seems to swap only when it needs to, and subjectively, performance
appears better.

However at this very moment, the kernel is livelocked. I can type on
console and do sysrq to your heart's content... I can even sysrq-s and
sync successfully. But no processing occurs. I can ping, but two ssh
sessions are frozen.

Key symptoms: Free swap 0Kb according to sysrq-m, and several processes
in run state according to sysrq-t.

Let me know if I should poke at this alpha further before rebooting.

further info:
free pages: 2560 kb (0kb highmem)
( active 2422 inactive 38578 free 320 )
swap cache: add 850670 delete 850666 find 323063/440091 race 1+0
free swap: 0kb
49074 pages of ram
786 free pages
1299 reserved pages
2683 pages shared
4 pages swap cached
4 pages in page table cache
buffer memory: 168kb

This behavior is reproducible, I am pretty sure.


--
Jeff Garzik | Only so many songs can be sung
Building 1024 | with two lips, two lungs, and one tongue.
MandrakeSoft | - nomeansno


2001-10-30 22:46:44

by Andrea Arcangeli

Subject: Re: pre5 VM livelock

On Tue, Oct 30, 2001 at 04:20:25PM -0500, Jeff Garzik wrote:
> 2.4.14-pre5 was looking very nice on my alpha, doing RPM builds. It
> seems to swap only when it needs to, and subjectively, performance
> appears better.
>
> However at this very moment, the kernel is livelocked. I can type on
> console and do sysrq to your heart's content... I can even sysrq-s and
> sync successfully. But no processing occurs. I can ping, but two ssh
> sessions are frozen.
>
> Key symptoms: Free swap 0Kb according to sysrq-m, and several processes
> in run state according to sysrq-t.
>
> Let me know if I should poke at this alpha further before rebooting.
>
> further info:
> free pages: 2560 kb (0kb highmem)
> ( active 2422 inactive 38578 free 320 )
> swap cache: add 850670 delete 850666 find 323063/440091 race 1+0
> free swap: 0kb
> 49074 pages of ram
> 786 free pages
> 1299 reserved pages
> 2683 pages shared
> 4 pages swap cached
> 4 pages in page table cache
> buffer memory: 168kb
>
> This behavior is reproducible, I am pretty sure.

can you reproduce with 2.4.14pre5aa1 too? The inactive list is pretty
big so maybe it's something else but maybe it's really mlocked anon
memory.

Andrea

2001-10-30 22:51:34

by Linus Torvalds

Subject: Re: pre5 VM livelock


On Tue, 30 Oct 2001, Jeff Garzik wrote:
>
> Key symptoms: Free swap 0Kb according to sysrq-m, and several processes
> in run state according to sysrq-t.

What are the stack traces for the processes (ie "Ctrl-ScrollLock")?

It actually _sounds_ like you're out-of-memory for real, you have no swap
left, and you have only four pages in the swap cache which implies that
the system has tried very very hard to get rid of the pages you _did_
write to swap.

That, in turn, sounds like a memory leak. You've got 38578 pages on the
inactive list (300MB worth of memory).

The fact that they _are_ on the inactive list means that the kernel
certainly sees them. They just aren't freeable for some reason - possibly
because they are mapped and dirty in some process (and you're out of
swap so the kernel cannot put them anywhere). And if the oom() killer
doesn't react, you're sol.

Question: did you have some big process that you tried to test the VM
with? Did you expect the oom killer to kill it?

OR there is a leak inside the kernel, where we forget to decrement the
count for certain pages in certain circumstances. Not unlikely.

Linus

2001-10-30 23:23:54

by Linus Torvalds

Subject: Re: pre5 VM livelock


On Wed, 31 Oct 2001, Andrea Arcangeli wrote:
>
> I agree it's oom, I think it's the infinite loop, it probably thinks
> this memory is freeable but maybe it's all anonymous mlocked memory, or
> maybe there's no swap at all that is equivalent for the vm.

It doesn't have to be mlocked - Jeff has zero swap left.

Linus

2001-10-30 23:18:44

by Andrea Arcangeli

Subject: Re: pre5 VM livelock

On Tue, Oct 30, 2001 at 02:49:27PM -0800, Linus Torvalds wrote:
>
> On Tue, 30 Oct 2001, Jeff Garzik wrote:
> >
> > Key symptoms: Free swap 0Kb according to sysrq-m, and several processes
> > in run state according to sysrq-t.
>
> What are the stack traces for the processes (ie "Ctrl-ScrollLock")
>
> It actually _sounds_ like you're out-of-memory for real, you have no swap
> left, and you have only four pages in the swap cache which implies that
> the system has tried very very hard to get rid of the pages you _did_
> write to swap.
>
> That, in turn, sounds like a memory leak. You've got 38578 pages on the

I agree it's oom, I think it's the infinite loop, it probably thinks
this memory is freeable but maybe it's all anonymous mlocked memory, or
maybe there's no swap at all that is equivalent for the vm.

I think it's not reproducible in -aa.

Andrea

2001-10-30 23:43:42

by Jeff Garzik

Subject: Re: pre5 VM livelock

Linus Torvalds wrote:
> Question: did you have some big process that you tried to test the VM
> with? Did you expect the oom killer to kill it?

AFAICT, yes. I am going to re-run again to make sure (both with pre5
and also pre5aa1).

--
Jeff Garzik | Only so many songs can be sung
Building 1024 | with two lips, two lungs, and one tongue.
MandrakeSoft | - nomeansno

2001-10-30 23:42:32

by Linus Torvalds

Subject: Re: pre5 VM livelock


More detailed look at your numbers..

On Tue, 30 Oct 2001, Jeff Garzik wrote:
>
> free pages: 2560 kb (0kb highmem)

Ok, the above is just the "pages_low" for your machine, it refuses to use
them except for atomic allocations and such (which is why "ping" still
works).

> ( active 2422 inactive 38578 free 320 )

As mentioned, you have about 300MB of pages in the inactive list, and
about 20M in the active list.

> swap cache: add 850670 delete 850666 find 323063/440091 race 1+0
> free swap: 0kb

You obviously _do_ have a swapfile, but it's now gone. I suspect it's
clearly smaller than your RAM (you seem to have 384MB in your machine, I
have no idea what your swap size is)

> 49074 pages of ram
> 786 free pages

The free page calculation is wrong, as it doesn't understand about
multi-page allocations. You really only have 320 free pages (see above),
but because there are multi-page allocations the low-level stuff thinks
the later pages are free.

> 1299 reserved pages
> 2683 pages shared

The "shared" count is the number of pages with page_count() > 1
(actually, the sum of the page_counts), so it's likely that these are the
mapped pages that are shared due to fork(). Notably it's a much smaller
number than your "inactive pages", which implies that the inactive pages
mostly have a count of 1. Which is consistent with them being dirty
and mapped, but nonfreeable due to being out of swap-space.

> 4 pages swap cached

That's basically zero, you probably have a few pages on the active list
that were swapped in and haven't been thrown away (or the livelock is
continually throwing them away and re-loading them).

> 4 pages in page table cache
> buffer memory: 168kb

That's 21 pages of buffer cache, most likely pinned by the filesystem
(ext2 will pin down a number of buffers just to keep track of bitmaps
etc).

In short, everything is very consistent with an out-of-memory condition.
We'll need to tweak the oom killer to just kill whatever offending process
it is that uses everything up.

I just want confirmation that you actually did something that could result
in this, ie you were testing big processes?

Linus
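
The page-count arithmetic in the message above can be re-checked with a
short script. This is only a sketch: it assumes 8 KiB pages (the usual
Alpha page size, which the thread never states outright but which the
"free pages: 2560 kb" / "free 320" pair is consistent with). The helper
names grab() and pages_to_mib() are made up for this illustration.

```python
import re

PAGE_KIB = 8  # assumption: 8 KiB pages (Alpha); not stated in the thread

# Lines copied from the sysrq-m report quoted above.
SYSRQ_M = """\
free pages: 2560 kb (0kb highmem)
( active 2422 inactive 38578 free 320 )
49074 pages of ram
buffer memory: 168kb
"""

def grab(pattern, text=SYSRQ_M):
    """Return the first captured integer for pattern, or raise if absent."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"pattern not found: {pattern!r}")
    return int(match.group(1))

def pages_to_mib(pages, page_kib=PAGE_KIB):
    """Convert a page count to MiB for the assumed page size."""
    return pages * page_kib / 1024

active = grab(r"\bactive (\d+)")
inactive = grab(r"\binactive (\d+)")
free = grab(r"inactive \d+ free (\d+)")
free_kib = grab(r"free pages: (\d+) kb")
ram_pages = grab(r"(\d+) pages of ram")
buffer_kib = grab(r"buffer memory: (\d+)kb")

print(f"inactive list: {pages_to_mib(inactive):.1f} MiB")
print(f"active list:   {pages_to_mib(active):.1f} MiB")
print(f"total RAM:     {pages_to_mib(ram_pages):.1f} MiB")
print(f"free list:     {free * PAGE_KIB} KiB (report says {free_kib} kb)")
print(f"buffer cache:  {buffer_kib // PAGE_KIB} pages")
```

Under that assumption the figures line up with the ones quoted in the
message: roughly 300MB inactive, about 20MB active, about 384MB of RAM,
and 21 pages of buffer cache.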

2001-10-30 23:45:22

by Jeff Garzik

Subject: Re: pre5 VM livelock

Linus Torvalds wrote:
> I just want confirmation that you actually did something that could result
> in this, ie you were testing big processes?

yes. here's meminfo, FWIW:


[jgarzik@brutus jgarzik]$ cat /proc/meminfo
total: used: free: shared: buffers: cached:
Mem: 391372800 217260032 174112768 0 13025280 163340288
Swap: 418996224 0 418996224
MemTotal: 382200 kB
MemFree: 170032 kB
MemShared: 0 kB
Buffers: 12720 kB
Cached: 159512 kB
SwapCached: 0 kB
Active: 39888 kB
Inactive: 137880 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 382200 kB
LowFree: 170032 kB
SwapTotal: 409176 kB
SwapFree: 409176 kB

--
Jeff Garzik | Only so many songs can be sung
Building 1024 | with two lips, two lungs, and one tongue.
MandrakeSoft | - nomeansno
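
The meminfo snapshot above can be cross-checked against the earlier
sysrq-m figures. The following is a rough sketch, not a general
/proc/meminfo parser: it handles only the 2.4-style layout pasted above
(abridged here), the function names are hypothetical, and the 8 KiB page
size is an assumption about the Alpha box.

```python
PAGE_KIB = 8           # assumption: 8 KiB pages (Alpha)
RAM_PAGES = 49074      # "49074 pages of ram" from the sysrq-m report
RESERVED_PAGES = 1299  # "1299 reserved pages" from the sysrq-m report

# Abridged copy of the /proc/meminfo output pasted above.
MEMINFO = """\
Mem: 391372800 217260032 174112768 0 13025280 163340288
Swap: 418996224 0 418996224
MemTotal: 382200 kB
MemFree: 170032 kB
Buffers: 12720 kB
Cached: 159512 kB
SwapTotal: 409176 kB
SwapFree: 409176 kB
"""

def parse_kb_fields(text):
    """Collect the 'Name: <n> kB' lines into a dict of integers (kB)."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if len(parts) == 2 and parts[1] == "kB":
            fields[key.strip()] = int(parts[0])
    return fields

def parse_byte_summary(text, label):
    """Return the byte counts from the 'Mem:' or 'Swap:' summary line."""
    for line in text.splitlines():
        if line.startswith(label + ":"):
            return [int(value) for value in line.split()[1:]]
    raise ValueError(f"no {label!r} summary line found")

kb = parse_kb_fields(MEMINFO)
mem_bytes = parse_byte_summary(MEMINFO, "Mem")
swap_bytes = parse_byte_summary(MEMINFO, "Swap")

# The byte summary and the kB fields describe the same totals.
print(mem_bytes[0] == kb["MemTotal"] * 1024)
print(swap_bytes[0] == kb["SwapTotal"] * 1024)

# RAM pages minus reserved pages, times the assumed page size, gives MemTotal.
print((RAM_PAGES - RESERVED_PAGES) * PAGE_KIB == kb["MemTotal"])

# In this snapshot no swap is in use, unlike the "free swap: 0kb" report.
print(kb["SwapTotal"] - kb["SwapFree"])
```

The two reports agree on the amount of RAM once the reserved pages are
subtracted, which supports the 8 KiB page-size assumption. The swap
figures show this snapshot was taken with swap entirely unused, in
contrast to the "free swap: 0kb" seen during the livelock.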

2001-10-31 00:12:26

by Linus Torvalds

Subject: Re: pre5 VM livelock


On Tue, 30 Oct 2001, Jeff Garzik wrote:
> Linus Torvalds wrote:
> > Question: did you have some big process that you tried to test the VM
> > with? Did you expect the oom killer to kill it?
>
> AFAICT, yes. I am going to re-run again to make sure (both with pre5
> and also pre5aa1).

Ok. The oom-killer is something I didn't even bother worrying about in the
pre-series, I'll give that another look.

Linus

2001-10-31 11:29:10

by Marcelo Tosatti

Subject: Re: pre5 VM livelock



On Tue, 30 Oct 2001, Linus Torvalds wrote:

>
> On Tue, 30 Oct 2001, Jeff Garzik wrote:
> > Linus Torvalds wrote:
> > > Question: did you have some big process that you tried to test the VM
> > > with? Did you expect the oom killer to kill it?
> >
> > AFAICT, yes. I am going to re-run again to make sure (both with pre5
> > and also pre5aa1).
>
> Ok. The oom-killer is something I didn't even bother worrying about in the
> pre-series, I'll give that another look.

Jeff,

Could you please show us the _full_ Alt+Sysrq+M output when the problem
happens? I want to see the "free min low high" part.

Thanks

2001-10-31 11:34:40

by Jeff Garzik

Subject: Re: pre5 VM livelock

Marcelo Tosatti wrote:
> Jeff,
>
> Could you please show us the _full_ Alt+Sysrq+M output when the problem
> happens? I want to see the "free min low high" part.

under pre5 or pre6?

--
Jeff Garzik | Only so many songs can be sung
Building 1024 | with two lips, two lungs, and one tongue.
MandrakeSoft | - nomeansno

2001-10-31 11:43:00

by Marcelo Tosatti

Subject: Re: pre5 VM livelock


pre6.


On Wed, 31 Oct 2001, Jeff Garzik wrote:

> Marcelo Tosatti wrote:
> > Jeff,
> >
> > Could you please show us the _full_ Alt+Sysrq+M output when the problem
> > happens? I want to see the "free min low high" part.
>
> under pre5 or pre6?
>
> --
> Jeff Garzik | Only so many songs can be sung
> Building 1024 | with two lips, two lungs, and one tongue.
> MandrakeSoft | - nomeansno
>