2000-11-19 00:34:40

by Rik van Riel

[permalink] [raw]
Subject: [PATCH] blindingly stupid 2.2 VM bug

Hi Alan,

here's a fix for a blindingly stupid bug that's been in
2.2 for ages (and which I've warned you about a few times
in the last 6 months, and which I've even sent some patches
for).

This patch should make 2.2 VM a bit more stable and should
also fix the complaints from people who's system gets
flooded by "VM: do_try_to_free_pages failed for process XXX"

It's against something suspiciously like 2.2.18-pre15, so
chances are it'll apply cleanly.

regards,

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.conectiva.com/ http://www.surriel.com/



--- ../rpm/BUILD/linux/mm/vmscan.c Sat Nov 18 21:56:50 2000
+++ linux/mm/vmscan.c Sun Nov 5 19:36:23 2000
@@ -17,7 +17,6 @@
#include <linux/smp_lock.h>
#include <linux/pagemap.h>
#include <linux/init.h>
-#include <linux/bigmem.h>

#include <asm/pgtable.h>

@@ -61,8 +60,7 @@

if (PageReserved(page_map)
|| PageLocked(page_map)
- || ((gfp_mask & __GFP_DMA) && !PageDMA(page_map))
- || (!(gfp_mask & __GFP_BIGMEM) && PageBIGMEM(page_map)))
+ || ((gfp_mask & __GFP_DMA) && !PageDMA(page_map)))
return 0;

/*
@@ -156,9 +154,6 @@
if (!entry)
return 0; /* No swap space left */

- if (!(page_map = prepare_bigmem_swapout(page_map)))
- goto out_swap_free;
-
vma->vm_mm->rss--;
tsk->nswap++;
set_pte(page_table, __pte(entry));
@@ -170,14 +165,10 @@
set_bit(PG_locked, &page_map->flags);

/* OK, do a physical asynchronous write to swap. */
- rw_swap_page(WRITE, entry, (char *) page_address(page_map), 0);
+ rw_swap_page(WRITE, entry, (char *) page, 0);

__free_page(page_map);
return 1;
-
- out_swap_free:
- swap_free(entry);
- return 0;
}

/*
@@ -400,13 +391,14 @@
{
int priority;
int count = SWAP_CLUSTER_MAX;
+ int loopcount = count;

lock_kernel();

/* Always trim SLAB caches when memory gets low. */
kmem_cache_reap(gfp_mask);

- priority = 5;
+ priority = 6;
do {
while (shrink_mmap(priority, gfp_mask)) {
if (!--count)
@@ -428,12 +420,21 @@
}

shrink_dcache_memory(priority, gfp_mask);
- } while (--priority > 0);
+
+ /* Only lower priority if we didn't make progress. */
+ if (count == loopcount) {
+ priority--;
+ }
+ loopcount = count;
+ } while (priority > 0);
done:
unlock_kernel();

- /* Return success if we freed a page. */
- return priority > 0;
+ /* Return success if we have enough free memory or we freed a page. */
+ if (nr_free_pages > freepages.low)
+ return 1;
+
+ return count < SWAP_CLUSTER_MAX;
}

/*
@@ -505,22 +506,23 @@
* the processes needing more memory will wake us
* up on a more timely basis.
*/
- interruptible_sleep_on(&kswapd_wait);
+ interruptible_sleep_on_timeout(&kswapd_wait, HZ);

- /*
- * In 2.2.x-bigmem kswapd is critical to provide GFP_ATOMIC
- * allocations (not GFP_BIGMEM ones).
- */
- while (nr_free_pages - nr_free_bigpages < freepages.high)
- {
- if (try_to_free_pages(GFP_KSWAPD))
+ if (nr_free_pages < freepages.low) {
+ while (nr_free_pages < freepages.high)
{
- if (tsk->need_resched)
- schedule();
- continue;
+ if (try_to_free_pages(GFP_KSWAPD))
+ {
+ if (tsk->need_resched)
+ schedule();
+ continue;
+ }
+ tsk->state = TASK_INTERRUPTIBLE;
+ schedule_timeout(10*HZ);
}
- tsk->state = TASK_INTERRUPTIBLE;
- schedule_timeout(10*HZ);
+ } else if (nr_free_pages < freepages.high) {
+ /* Background scanning. */
+ try_to_free_pages(GFP_KSWAPD);
}
}
}


2000-11-19 08:31:30

by Ville Herva

[permalink] [raw]
Subject: Re: [PATCH] blindingly stupid 2.2 VM bug

On Sat, Nov 18, 2000 at 10:04:02PM -0200, you [Rik van Riel] said:
> Hi Alan,
>
> here's a fix for a blindingly stupid bug that's been in
> 2.2 for ages (and which I've warned you about a few times
> in the last 6 months, and which I've even sent some patches
> for).
>
> This patch should make 2.2 VM a bit more stable and should
> also fix the complaints from people who's system gets
> flooded by "VM: do_try_to_free_pages failed for process XXX"

Okay, I see those "VM: do_try_to_free_pages failed for process XXX" errors
as well (2.2.18pre19, uptime 8 days, machine had been idle for hours.
Then, all of a sudden, I get 30 times "VM: do_try_to_free_pages failed for
kswapd...", then 15 "VM: do_try_to_free_pages failed for xmms...", then
"VM: killing process xmms" and that repeats for ~10 processes including
X.) Never had problems with earlier 2.2.x.

My questions is: I saw Andrea's VM-global patch being recommended as a
solution for this problem, and I already compiled it in (although I
haven't booted into it yet). Should I use Rik's or Andrea's patch?

Is either of them going into 2.2.18?


-- v --

[email protected]

2000-11-20 14:09:30

by Rik van Riel

[permalink] [raw]
Subject: Re: [PATCH] blindingly stupid 2.2 VM bug

On Sun, 19 Nov 2000, Ville Herva wrote:

> My questions is: I saw Andrea's VM-global patch being
> recommended as a solution for this problem, and I already
> compiled it in (although I haven't booted into it yet). Should I
> use Rik's or Andrea's patch?

This patch is incremental with VM-global, but that's
only because the Conectiva kernel RPM happens to
contain VM-global, not because I agree with all points
of VM-global (some of the changes seem a bit .. suspect).

Luckily my patch fixes some of the suspect areas in
VM-global and most of the other parts of VM-global make
a lot of sense. It's not perfect, but it should be good
enough for 2.2.

On whether any of these improvements are going into the
next 2.2, don't bother asking me since I have no intention
paying much attention to 2.2 and I was only needing it to
install a machine with ext3...

regards,

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.conectiva.com/ http://www.surriel.com/

2000-11-25 00:25:12

by Chip Salzenberg

[permalink] [raw]
Subject: Re: [PATCH] blindingly stupid 2.2 VM bug

According to Rik van Riel:
> Luckily my patch fixes some of the suspect areas in
> VM-global [...]

Would you say that the below patch is just the try_to_free_pages
bug fix, then?

Index: mm/vmscan.c
--- mm/vmscan.c.prev
+++ mm/vmscan.c Fri Nov 24 15:17:59 2000
@@ -401,4 +401,5 @@ int try_to_free_pages(unsigned int gfp_m
int priority;
int count = SWAP_CLUSTER_MAX;
+ int loopcount = count;
int killed = 0;

@@ -409,5 +410,5 @@ int try_to_free_pages(unsigned int gfp_m

again:
- priority = 5;
+ priority = 6;
do {
while (shrink_mmap(priority, gfp_mask)) {
@@ -431,5 +432,10 @@ again:

shrink_dcache_memory(priority, gfp_mask);
- } while (--priority > 0);
+
+ /* Only lower priority if we didn't make progress. */
+ if (count == loopcount)
+ --priority;
+ loopcount = count;
+ } while (priority > 0);
done:
unlock_kernel();
@@ -454,6 +460,9 @@ done:
}

- /* Return success if we freed a page. */
- return priority > 0;
+ /* Return success if we have enough free memory or we freed a page. */
+ if (nr_free_pages > freepages.low)
+ return 1;
+
+ return count < SWAP_CLUSTER_MAX;
}


--
Chip Salzenberg - a.k.a. - <[email protected]>
"Give me immortality, or give me death!" // Firesign Theatre

2000-11-25 14:27:43

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: [PATCH] blindingly stupid 2.2 VM bug

+ /* Only lower priority if we didn't make progress. */
+ if (count == loopcount)
+ --priority;
+ loopcount = count;

If the while loops around the page-recycling-methods were missing we would
have just noticed as soon as we needed to recycle some byte of cache.

+ /* Return success if we have enough free memory or we freed a page. */
+ if (nr_free_pages > freepages.low)
+ return 1;

This anti-kill-flood check is just handled by the GFP layer (you don't
need to replicate it here in the memory balancing layer). It's currently
done against freepages.high to stay conservative (it's meant to catch
a task killed while we were blocked; a task killed will certainly
raise nr_free_pages over freepages.high).

+ return count < SWAP_CLUSTER_MAX;

This will make oom handling less graceful and if memory balancing
fails wrongly when the machine isn't truly oom this will only hide the problem
(and if there's a real problem like we had with MAP_SHARED and
always-async-kpiod [fixed by VM-global] it won't even be enough to hide it).


VM-global-*-7 has no known bugs AFIK.

Andrea

2000-11-25 17:37:06

by Rik van Riel

[permalink] [raw]
Subject: Re: [PATCH] blindingly stupid 2.2 VM bug

On Sat, 25 Nov 2000, Andrea Arcangeli wrote:

> VM-global-*-7 has no known bugs AFIK.

Sure, I don't use 2.2 anyway ;)

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.conectiva.com/ http://www.surriel.com/

2000-11-28 23:33:07

by John Kennedy

[permalink] [raw]
Subject: Re: [PATCH] blindingly stupid 2.2 VM bug

On Sat, Nov 25, 2000 at 02:57:01PM +0100, Andrea Arcangeli wrote:
> ... VM-global-*-7 has no known bugs AFIK.

Is there anything more recent than VM-global-2.2.18pre18-7? It isn't
patching very cleanly against my pre-patch-2.2.18-23 tree.

(I don't see anything under your pre19 thru pre23 dirs, but I
may not be looking at a fully populated server or something.)

Reiserfs on top of 2.2.18-23 blatantly runs me out of memory, and ext3fs
may be doing it too, although the laptop fan makes it sounds like it is
busy-looping somewhere.

2000-11-28 23:50:31

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: [PATCH] blindingly stupid 2.2 VM bug

On Tue, Nov 28, 2000 at 03:02:35PM -0800, John Kennedy wrote:
> On Sat, Nov 25, 2000 at 02:57:01PM +0100, Andrea Arcangeli wrote:
> > ... VM-global-*-7 has no known bugs AFIK.
>
> Is there anything more recent than VM-global-2.2.18pre18-7? It isn't
> patching very cleanly against my pre-patch-2.2.18-23 tree.

It patches cleanly for me. (ignore the offset warnings from patch, just make
sure there are no rejects)

Andrea

2000-11-29 00:06:42

by John Kennedy

[permalink] [raw]
Subject: Re: [PATCH] blindingly stupid 2.2 VM bug

On Wed, Nov 29, 2000 at 12:20:09AM +0100, Andrea Arcangeli wrote:
> On Tue, Nov 28, 2000 at 03:02:35PM -0800, John Kennedy wrote:
> > On Sat, Nov 25, 2000 at 02:57:01PM +0100, Andrea Arcangeli wrote:
> > > ... VM-global-*-7 has no known bugs AFIK.
> >
> > Is there anything more recent than VM-global-2.2.18pre18-7? It isn't
> > patching very cleanly against my pre-patch-2.2.18-23 tree.
>
> It patches cleanly for me. (ignore the offset warnings from patch, just make
> sure there are no rejects)

No, it is all ext3fs stuff that is touching the same areas your
patch is. Sorry. I was assuming that Rik van Riel's comments implied
that they should get patch a lot more cleanly and that I might have
missed a version, but there is no way these two could patch without
rejects along as-is.

2000-11-29 00:34:28

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: [PATCH] blindingly stupid 2.2 VM bug

On Tue, Nov 28, 2000 at 03:36:15PM -0800, John Kennedy wrote:
> No, it is all ext3fs stuff that is touching the same areas your

Ok this now makes sense. I ported VM-global-7 on top of ext3 right now
but it's untested:

ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.2/2.2.18pre18/VM-global-2.2.18pre18-7-against-ext3-0.0.3b-1.bz2

Andrea

2000-11-29 01:06:49

by John Kennedy

[permalink] [raw]
Subject: Re: [PATCH] blindingly stupid 2.2 VM bug

On Wed, Nov 29, 2000 at 01:04:16AM +0100, Andrea Arcangeli wrote:
> On Tue, Nov 28, 2000 at 03:36:15PM -0800, John Kennedy wrote:
> > No, it is all ext3fs stuff that is touching the same areas your
>
> Ok this now makes sense. I ported VM-global-7 on top of ext3 right now
> but it's untested:

Not for long. (:

> ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.2/2.2.18pre18/VM-global-2.2.18pre18-7-against-ext3-0.0.3b-1.bz2

Is ext3-0.0.3b the most current/stable? On the FTP site (well,
ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs at least) it looks like
ext3-0.0.2f is current (which is what I've been using), but under the
test directory I see ext3-0.0.3b as well as ext3-0.0.5b. I haven't
heard anything new about ext3fs in a long, long time.

2000-11-30 12:20:47

by Stephen C. Tweedie

[permalink] [raw]
Subject: Re: [PATCH] blindingly stupid 2.2 VM bug

Hi,

On Tue, Nov 28, 2000 at 04:35:32PM -0800, John Kennedy wrote:
> On Wed, Nov 29, 2000 at 01:04:16AM +0100, Andrea Arcangeli wrote:
> > On Tue, Nov 28, 2000 at 03:36:15PM -0800, John Kennedy wrote:
> > > No, it is all ext3fs stuff that is touching the same areas your
> >
> > Ok this now makes sense. I ported VM-global-7 on top of ext3 right now
> > but it's untested:
>
> Not for long. (:
>
> > ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.2/2.2.18pre18/VM-global-2.2.18pre18-7-against-ext3-0.0.3b-1.bz2
>
> Is ext3-0.0.3b the most current/stable?

Yes. 0.0.5b is out with metadata-only journaling, but it hasn't had
the same amount of testing which 3b has had.

> ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs at least) it looks like
> ext3-0.0.2f is current (which is what I've been using), but under the
> test directory I see ext3-0.0.3b as well as ext3-0.0.5b. I haven't
> heard anything new about ext3fs in a long, long time.

linux-fsdevel has had a fair amount of traffic on it, and there's also
[email protected] (https://listman.redhat.com to subscribe). You
won't have seen much if you've only been watching linux-kernel.

Cheers,
Stephen