2002-06-27 20:12:40

by Andrea Arcangeli

[permalink] [raw]
Subject: vm fixes for 2.4.19rc1

some fix for 2.4.19rc1 (btw, the lru_cache_del() in the LRU path is
needed in 2.5 too and it's also more efficient than the
page_cache_release, see ptrace freeing the anon pages with put_page(),
it will not pass through page_cache_release and it will trigger the
PageLRU check that __free_pages_ok isn't capable to handle in 2.5, I
will make a full vm update for 2.5 [in small pieces based on post-Andrew
split of the monolithic patch] in the next days anyways):

diff -urNp 2.4.19rc1/mm/page_alloc.c 2.4.19rc1aa1/mm/page_alloc.c
--- 2.4.19rc1/mm/page_alloc.c Tue Jun 25 23:56:23 2002
+++ 2.4.19rc1aa1/mm/page_alloc.c Wed Jun 26 00:55:35 2002
@@ -82,8 +82,11 @@ static void __free_pages_ok (struct page
/* Yes, think what happens when other parts of the kernel take
* a reference to a page in order to pin it for io. -ben
*/
- if (PageLRU(page))
+ if (PageLRU(page)) {
+ if (unlikely(in_interrupt()))
+ BUG();
lru_cache_del(page);
+ }

this adds a debugging check just in case.

@@ -91,6 +94,8 @@ static void __free_pages_ok (struct page
BUG();
if (!VALID_PAGE(page))
BUG();
+ if (PageSwapCache(page))
+ BUG();

This resurrects a valid debugging check dropped in rc1.

@@ -280,10 +285,12 @@ static struct page * balance_classzone(z
BUG();
if (!VALID_PAGE(page))
BUG();
+ if (PageSwapCache(page))
+ BUG();

same as above. a page with page->mapping set cannot be freed, if it
happens it's a bug that we want to trap.

if (PageLocked(page))
BUG();
if (PageLRU(page))
- BUG();
+ lru_cache_del(page);


This fixes a bug I found two days ago, it could explain the nvidia
oopses in __free_pages_ok. Not too bad, no risk of mm corruption or fs
corruption, just a bug could trigger in some unlikely case (never seen
it happening myself). (btw, the in_interrupt() check isn't needed above
because it's executed by the caller anyways)

diff -urNp 2.4.19rc1/mm/swap_state.c 2.4.19rc1aa1/mm/swap_state.c
--- 2.4.19rc1/mm/swap_state.c Tue Jan 22 12:55:27 2002
+++ 2.4.19rc1aa1/mm/swap_state.c Wed Jun 26 00:48:13 2002
@@ -95,11 +95,8 @@ int add_to_swap_cache(struct page *page,
*/
void __delete_from_swap_cache(struct page *page)
{
- if (!PageLocked(page))
- BUG();
if (!PageSwapCache(page))
BUG();
- ClearPageDirty(page);
__remove_inode_page(page);
INC_CACHE_INFO(del_total);
}

Here I play risky for a rc1, but it made perfect sense to me, when we
drop a page from the swapcache it doesn't make any sense to clear the
dirty bit, conceptually the only case where an anon page can be clean is
when it's part of swapcache, if it's not part of swapcache it cannot be
clean, the clearpagedirty was only to avoid oopsing in
__remove_inode_dirty but rc1 fixes that because the dirty bit can be
added without the page lock (i.e. mark_dirty_kiobuf) so the right thing
is to accept in __remove_inode_pages if somebody is dropping from the
swapcache a dirty page without oopsing. So the above patch will be now
safe on top of rc1, thanks to the fix in rc1 that avoids a false
positive bug to trigger in __remove_inode_page.

To bias the correctness of the above, we always mark the page dirty
after deleting from the swapcache, of course we have to or we risk to
lose it if the pte isn't dirty too (for example if there was a read
fault and the swapcache was clean and mapped as clean in the pte too).

@@ -114,9 +111,6 @@ void delete_from_swap_cache(struct page
{
swp_entry_t entry;

- if (!PageLocked(page))
- BUG();
-
block_flushpage(page, 0);

entry.val = page->index;

dropped also two pagelocked checks because there execute unconditionally
paths that checks the pagelocked bit at the lower layer.

here the full patch below (only slightly tested on my laptop so far, so
please make sure not to release this without an intermediate rc2 first :)

thanks,

diff -urNp 2.4.19rc1/mm/page_alloc.c 2.4.19rc1aa1/mm/page_alloc.c
--- 2.4.19rc1/mm/page_alloc.c Tue Jun 25 23:56:23 2002
+++ 2.4.19rc1aa1/mm/page_alloc.c Wed Jun 26 00:55:35 2002
@@ -82,8 +82,11 @@ static void __free_pages_ok (struct page
/* Yes, think what happens when other parts of the kernel take
* a reference to a page in order to pin it for io. -ben
*/
- if (PageLRU(page))
+ if (PageLRU(page)) {
+ if (unlikely(in_interrupt()))
+ BUG();
lru_cache_del(page);
+ }

if (page->buffers)
BUG();
@@ -91,6 +94,8 @@ static void __free_pages_ok (struct page
BUG();
if (!VALID_PAGE(page))
BUG();
+ if (PageSwapCache(page))
+ BUG();
if (PageLocked(page))
BUG();
if (PageActive(page))
@@ -280,10 +285,12 @@ static struct page * balance_classzone(z
BUG();
if (!VALID_PAGE(page))
BUG();
+ if (PageSwapCache(page))
+ BUG();
if (PageLocked(page))
BUG();
if (PageLRU(page))
- BUG();
+ lru_cache_del(page);
if (PageActive(page))
BUG();
if (PageDirty(page))
diff -urNp 2.4.19rc1/mm/swap_state.c 2.4.19rc1aa1/mm/swap_state.c
--- 2.4.19rc1/mm/swap_state.c Tue Jan 22 12:55:27 2002
+++ 2.4.19rc1aa1/mm/swap_state.c Wed Jun 26 00:48:13 2002
@@ -95,11 +95,8 @@ int add_to_swap_cache(struct page *page,
*/
void __delete_from_swap_cache(struct page *page)
{
- if (!PageLocked(page))
- BUG();
if (!PageSwapCache(page))
BUG();
- ClearPageDirty(page);
__remove_inode_page(page);
INC_CACHE_INFO(del_total);
}
@@ -114,9 +111,6 @@ void delete_from_swap_cache(struct page
{
swp_entry_t entry;

- if (!PageLocked(page))
- BUG();
-
block_flushpage(page, 0);

entry.val = page->index;

Andrea


2002-06-27 20:41:44

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: vm fixes for 2.4.19rc1

On Thu, Jun 27, 2002 at 04:14:13PM -0400, Andrea Arcangeli wrote:
> if (PageLocked(page))
> BUG();
> if (PageLRU(page))
> - BUG();
> + lru_cache_del(page);
>
>
> This fixes a bug I found two days ago, it could explain the nvidia

please don't add the above one liner, on a second thought it wasn't a
bug here because we add to the local list only after running
lru_cache_del, so it still has to be an issue with the nvidia drivers,
but everything else in the previous patch still is valid :), however
nothing high prio here.

Andrea

2002-06-27 22:26:39

by Hugh Dickins

[permalink] [raw]
Subject: Re: vm fixes for 2.4.19rc1

On Thu, 27 Jun 2002, Andrea Arcangeli wrote:
> I will make a full vm update for 2.5 [in small pieces based on post-Andrew
> split of the monolithic patch] in the next days anyways):

Great!

> diff -urNp 2.4.19rc1/mm/page_alloc.c 2.4.19rc1aa1/mm/page_alloc.c
> --- 2.4.19rc1/mm/page_alloc.c Tue Jun 25 23:56:23 2002
> +++ 2.4.19rc1aa1/mm/page_alloc.c Wed Jun 26 00:55:35 2002
> @@ -82,8 +82,11 @@ static void __free_pages_ok (struct page
> /* Yes, think what happens when other parts of the kernel take
> * a reference to a page in order to pin it for io. -ben
> */
> - if (PageLRU(page))
> + if (PageLRU(page)) {
> + if (unlikely(in_interrupt()))
> + BUG();
> lru_cache_del(page);
> + }
>
> this adds a debugging check just in case.

Well, of course I like this patch, having proposed it months ago.
But the evidence for needing it seems to have died down, and I
don't think -rc is the right time for adding a BUG() where you
can almost always get away with such bad behaviour - I intended
it for a diagnostic aid in the early stages of a release -
but I'm happy to go whichever way Marcelo decides on it.

> @@ -91,6 +94,8 @@ static void __free_pages_ok (struct page
> BUG();
> if (!VALID_PAGE(page))
> BUG();
> + if (PageSwapCache(page))
> + BUG();
>
> This resurrects a valid debugging check dropped in rc1.

Sorry, Andrea, perhaps I should have copied you explicitly on my
patch which dropped that check. It is a valid check, yes, but it's
a waste of space and time: remember that PageSwapCache(page) just
checks whether page->mapping == &swapper_space (used to be a bitflag,
yes, but not since last Autumn), and the BUG() you can just see above
was if page->mapping was set at all. One day, yes, PageSwapCache may
become bitflag test again: then would be the time to add the test back.

> @@ -280,10 +285,12 @@ static struct page * balance_classzone(z
> BUG();
> if (!VALID_PAGE(page))
> BUG();
> + if (PageSwapCache(page))
> + BUG();
>
> same as above. a page with page->mapping set cannot be freed, if it
> happens it's a bug that we want to trap.

As you say, same as above: again the BUG() you can just see at the top
was on a test for whether page->mapping is set, so PageSwapCache test
again redundant. If compiler optimized it away, I'd leave it: but no.

> if (PageLocked(page))
> BUG();
> if (PageLRU(page))
> - BUG();
> + lru_cache_del(page);

I was going to query this, but see later mail from you withdrawing it.

> diff -urNp 2.4.19rc1/mm/swap_state.c 2.4.19rc1aa1/mm/swap_state.c
> --- 2.4.19rc1/mm/swap_state.c Tue Jan 22 12:55:27 2002
> +++ 2.4.19rc1aa1/mm/swap_state.c Wed Jun 26 00:48:13 2002
> @@ -95,11 +95,8 @@ int add_to_swap_cache(struct page *page,
> */
> void __delete_from_swap_cache(struct page *page)
> {
> - if (!PageLocked(page))
> - BUG();
> if (!PageSwapCache(page))
> BUG();
> - ClearPageDirty(page);
> __remove_inode_page(page);
> INC_CACHE_INFO(del_total);
> }
>
> Here I play risky for a rc1, but it made perfect sense to me,

Yes and yes. My comment when I weakened the __remove_inode_page BUG
was "Remove the prior ClearPageDirty? maybe but not without deeper
thought: let stay for now." I've not given it deep enough thought,
and I'm not convinced you have yet: need to consider the peculiarities
of mm/shmem.c, and the way clearing the flag lets the page be dirtied
on to a list later. My _guess_ is that it's fine to remove that
ClearPageDirty in 2.4, but 2.5 a rather different case; but it
seems to me a risky cleanup, not -rc material. (But, sorry,
maybe I'm just advertizing the shallowness of _my_ thought.)

> @@ -114,9 +111,6 @@ void delete_from_swap_cache(struct page
> {
> swp_entry_t entry;
>
> - if (!PageLocked(page))
> - BUG();
> -
> block_flushpage(page, 0);
>
> entry.val = page->index;
>
> dropped also two pagelocked checks because there execute unconditionally
> paths that checks the pagelocked bit at the lower layer.

I don't mind that removal if you leave the !PageLocked BUG test in
__delete_from_swap_cache, but you've proposed to remove that one too?

Hugh

2002-06-28 03:17:01

by Austin Gonyou

[permalink] [raw]
Subject: Re: vm fixes for 2.4.19rc1

For something like DB work, would this patch be *too* aggressive on
freeing memory/cache as to introduced increased latency there?
Just curious, I'm all for using *any* good VM changes.

On Thu, 2002-06-27 at 15:14, Andrea Arcangeli wrote:
> some fix for 2.4.19rc1 (btw, the lru_cache_del() in the LRU path is
> needed in 2.5 too and it's also more efficient than the
> page_cache_release, see ptrace freeing the anon pages with put_page(),
> it will not pass through page_cache_release and it will trigger the
> PageLRU check that __free_pages_ok isn't capable to handle in 2.5, I
> will make a full vm update for 2.5 [in small pieces based on post-Andrew
> split of the monolithic patch] in the next days anyways):



--
Austin Gonyou <[email protected]>

2002-06-28 08:49:14

by Hugh Dickins

[permalink] [raw]
Subject: Re: vm fixes for 2.4.19rc1

On 27 Jun 2002, Austin Gonyou wrote:
> For something like DB work, would this patch be *too* aggressive on
> freeing memory/cache as to introduced increased latency there?
> Just curious, I'm all for using *any* good VM changes.

No, you're taking the "_cache_" too seriously. This has nothing to
do with taking pages out of the lookup cache: it's a matter of when
anonymous pages not in any lookup cache, which were put on an LRU list
for fair aging, can safely be removed from that LRU list when they are
freed. An issue of the best place to catch them all, an issue of when
the locking is safe, but no macroscopic issue of caching versus latency.

> On Thu, 2002-06-27 at 15:14, Andrea Arcangeli wrote:
> > some fix for 2.4.19rc1 (btw, the lru_cache_del() in the LRU path is
> > needed in 2.5 too and it's also more efficient than the
> > page_cache_release, see ptrace freeing the anon pages with put_page(),
> > it will not pass through page_cache_release and it will trigger the
> > PageLRU check that __free_pages_ok isn't capable to handle in 2.5, I
> > will make a full vm update for 2.5 [in small pieces based on post-Andrew
> > split of the monolithic patch] in the next days anyways):

2002-06-28 16:00:45

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: vm fixes for 2.4.19rc1

On Thu, Jun 27, 2002 at 11:28:46PM +0100, Hugh Dickins wrote:
> On Thu, 27 Jun 2002, Andrea Arcangeli wrote:
> > I will make a full vm update for 2.5 [in small pieces based on post-Andrew
> > split of the monolithic patch] in the next days anyways):
>
> Great!
>
> > diff -urNp 2.4.19rc1/mm/page_alloc.c 2.4.19rc1aa1/mm/page_alloc.c
> > --- 2.4.19rc1/mm/page_alloc.c Tue Jun 25 23:56:23 2002
> > +++ 2.4.19rc1aa1/mm/page_alloc.c Wed Jun 26 00:55:35 2002
> > @@ -82,8 +82,11 @@ static void __free_pages_ok (struct page
> > /* Yes, think what happens when other parts of the kernel take
> > * a reference to a page in order to pin it for io. -ben
> > */
> > - if (PageLRU(page))
> > + if (PageLRU(page)) {
> > + if (unlikely(in_interrupt()))
> > + BUG();
> > lru_cache_del(page);
> > + }
> >
> > this adds a debugging check just in case.
>
> Well, of course I like this patch, having proposed it months ago.
> But the evidence for needing it seems to have died down, and I
> don't think -rc is the right time for adding a BUG() where you
> can almost always get away with such bad behaviour - I intended
> it for a diagnostic aid in the early stages of a release -
> but I'm happy to go whichever way Marcelo decides on it.

it's up to Marcelo as you say, personally I prefer it to go in for
robusteness.

> > @@ -91,6 +94,8 @@ static void __free_pages_ok (struct page
> > BUG();
> > if (!VALID_PAGE(page))
> > BUG();
> > + if (PageSwapCache(page))
> > + BUG();
> >
> > This resurrects a valid debugging check dropped in rc1.
>
> Sorry, Andrea, perhaps I should have copied you explicitly on my
> patch which dropped that check. It is a valid check, yes, but it's
> a waste of space and time: remember that PageSwapCache(page) just
> checks whether page->mapping == &swapper_space (used to be a bitflag,
> yes, but not since last Autumn), and the BUG() you can just see above
> was if page->mapping was set at all. One day, yes, PageSwapCache may
> become bitflag test again: then would be the time to add the test back.
>
> > @@ -280,10 +285,12 @@ static struct page * balance_classzone(z
> > BUG();
> > if (!VALID_PAGE(page))
> > BUG();
> > + if (PageSwapCache(page))
> > + BUG();
> >
> > same as above. a page with page->mapping set cannot be freed, if it
> > happens it's a bug that we want to trap.
>
> As you say, same as above: again the BUG() you can just see at the top
> was on a test for whether page->mapping is set, so PageSwapCache test
> again redundant. If compiler optimized it away, I'd leave it: but no.

ok.

> > diff -urNp 2.4.19rc1/mm/swap_state.c 2.4.19rc1aa1/mm/swap_state.c
> > --- 2.4.19rc1/mm/swap_state.c Tue Jan 22 12:55:27 2002
> > +++ 2.4.19rc1aa1/mm/swap_state.c Wed Jun 26 00:48:13 2002
> > @@ -95,11 +95,8 @@ int add_to_swap_cache(struct page *page,
> > */
> > void __delete_from_swap_cache(struct page *page)
> > {
> > - if (!PageLocked(page))
> > - BUG();
> > if (!PageSwapCache(page))
> > BUG();
> > - ClearPageDirty(page);
> > __remove_inode_page(page);
> > INC_CACHE_INFO(del_total);
> > }
> >
> > Here I play risky for a rc1, but it made perfect sense to me,
>
> Yes and yes. My comment when I weakened the __remove_inode_page BUG
> was "Remove the prior ClearPageDirty? maybe but not without deeper
> thought: let stay for now." I've not given it deep enough thought,
> and I'm not convinced you have yet: need to consider the peculiarities
> of mm/shmem.c, and the way clearing the flag lets the page be dirtied

mm/shmem.c looked trivial too, just grep for __delete_from_swap_cache and
delete_from_swap_cache. they all correctly set the dirty bit on the page
afterwards. Infact we shouldn't only remove the clearpagedirty, we
should replace it with a setpagedirty above, just the opposite :), then
we can drop various setpagedirty after delete_from_swap_cache.

> on to a list later. My _guess_ is that it's fine to remove that
> ClearPageDirty in 2.4, but 2.5 a rather different case; but it
> seems to me a risky cleanup, not -rc material. (But, sorry,
> maybe I'm just advertizing the shallowness of _my_ thought.)

well, the other PG_swap_cache cleanups that happened in rc1 weren't
strictly necessary too, but I'm fine to postpone the above one. I will
just take care of those bits for next -aa according to the corrections
you made. thanks,

> > @@ -114,9 +111,6 @@ void delete_from_swap_cache(struct page
> > {
> > swp_entry_t entry;
> >
> > - if (!PageLocked(page))
> > - BUG();
> > -
> > block_flushpage(page, 0);
> >
> > entry.val = page->index;
> >
> > dropped also two pagelocked checks because there execute unconditionally
> > paths that checks the pagelocked bit at the lower layer.
>
> I don't mind that removal if you leave the !PageLocked BUG test in
> __delete_from_swap_cache, but you've proposed to remove that one too?

Andrea