2001-02-13 01:11:14

by Marcelo Tosatti

[permalink] [raw]
Subject: [PATCH] swapin flush cache bug


Hi,

Niibe Yutaka noted (and added an entry on the MM bugzilla system) that
cache flushing on do_swap_page() is buggy. Here:

---
struct page *page = lookup_swap_cache(entry);
pte_t pte;

if (!page) {
lock_kernel();
swapin_readahead(entry);
page = read_swap_cache(entry);
unlock_kernel();
if (!page)
return -1;

flush_page_to_ram(page);
flush_icache_page(vma, page);
}

mm->rss++;
--

If lookup_swap_cache() finds a page in the swap cache, and that page was
in memory because of the swapin readahead, the cache is not flushed.

Here is a patch to fix the problem by always flushing the cache including
for pages in the swap cache:

--- linux.orig/mm/memory.c.orig Thu Feb 8 10:52:20 2001
+++ linux/mm/memory.c Thu Feb 8 10:54:07 2001
@@ -1033,12 +1033,12 @@
unlock_kernel();
if (!page)
return -1;
-
- flush_page_to_ram(page);
- flush_icache_page(vma, page);
}

mm->rss++;
+
+ flush_page_to_ram(page);
+ flush_icache_page(vma, page);

pte = mk_pte(page, vma->vm_page_prot);





2001-02-13 09:56:29

by Russell King

[permalink] [raw]
Subject: Re: [PATCH] swapin flush cache bug

Marcelo Tosatti writes:
> If lookup_swap_cache() finds a page in the swap cache, and that page was
> in memory because of the swapin readahead, the cache is not flushed.
>
> Here is a patch to fix the problem by always flushing the cache including
> for pages in the swap cache:

> -
> - flush_page_to_ram(page);
> - flush_icache_page(vma, page);
> }
>
> mm->rss++;
> +
> + flush_page_to_ram(page);
> + flush_icache_page(vma, page);

Surely if the page is in the swap cache, we don't need the
flush_page_to_ram() because the data is already written to the page. Yes,
there may be some reminents of it in the cache due to it being written
to disk via PIO.

Thinking about it some more - we have a process. It used to contain page
P at address V. We unmapped the page (and did the right thing with the
caches). Now, something wants to access address V, so we pull the page
from the swap cache, and place page P back at address V. We therefore
shouldn't need any cache manipulation at this point.

What was the problem? The old code seems to behave well on a virtual
address indexed virtual address tagged cache.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2001-02-13 10:54:31

by NIIBE Yutaka

[permalink] [raw]
Subject: Re: [PATCH] swapin flush cache bug

Russell King wrote:
> What was the problem? The old code seems to behave well on a virtual
> address indexed virtual address tagged cache.

My case (SH-4) is: virtual address indexed, physical address tagged cache
(which has alias issue).

Suppose there's I/O to the physical page P asynchronously, and the
page is placed in the swap cache. It remains cache entry, say,
indexed kernel virtual address K. Then, process maps P at U. U and K
(may) indexes differently. The process will get the data from memory
(not the one in the cashe), if it's not flushed.
--

2001-02-13 11:17:07

by Russell King

[permalink] [raw]
Subject: Re: [PATCH] swapin flush cache bug

NIIBE Yutaka writes:
> My case (SH-4) is: virtual address indexed, physical address tagged cache
> (which has alias issue).

vivt caches have the same alias issue.

> Suppose there's I/O to the physical page P asynchronously, and the
> page is placed in the swap cache.

Unless someone else (Rik/DaveM) says otherwise, it is my understanding
that any IO for page P will only ever be a write to disk. Therefore,
when you get a copy of the page from the swap cache, the physical memory
for that page is the same as it was when the process was using it last.

> It remains cache entry, say, indexed kernel virtual address K. Then,
> process maps P at U. U and K (may) indexes differently. The process
> will get the data from memory (not the one in the cashe), if it's not
> flushed.

The data from memory will still be up to date though. However, I agree
that you will end up with cache aliases. I will also end up with cache
aliases. The question now is, do these aliases really matter?

On my caches, the answer is no because they're not marked dirty, and
therefore will get dropped from the cache without writeback to memory.

If your cache doesn't write back clean cache data to memory, then you
should also behave well.

However, that said, someone more experienced with the Linux MM should
comment.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2001-02-13 11:27:27

by Alan

[permalink] [raw]
Subject: Re: [PATCH] swapin flush cache bug

> Suppose there's I/O to the physical page P asynchronously, and the
> page is placed in the swap cache. It remains cache entry, say,
> indexed kernel virtual address K. Then, process maps P at U. U and K
> (may) indexes differently. The process will get the data from memory
> (not the one in the cashe), if it's not flushed.

Ok we need to handle that case a bit more intelligently so those flushes dont
get into other ports code paths.

2001-02-13 23:51:14

by NIIBE Yutaka

[permalink] [raw]
Subject: Re: [PATCH] swapin flush cache bug

Russell King wrote:
> Unless someone else (Rik/DaveM) says otherwise, it is my understanding
> that any IO for page P will only ever be a write to disk. Therefore,
> when you get a copy of the page from the swap cache, the physical memory
> for that page is the same as it was when the process was using it last.
[...]
> The data from memory will still be up to date though. However, I agree
> that you will end up with cache aliases. I will also end up with cache
> aliases. The question now is, do these aliases really matter?
>
> On my caches, the answer is no because they're not marked dirty, and
> therefore will get dropped from the cache without writeback to memory.
>
> If your cache doesn't write back clean cache data to memory, then you
> should also behave well.

Yes, that's the difference. It's write back cache, in my case.
--

2001-02-14 02:09:02

by NIIBE Yutaka

[permalink] [raw]
Subject: Re: [PATCH] swapin flush cache bug

Alan Cox wrote:
> Ok we need to handle that case a bit more intelligently so those flushes dont
> get into other ports code paths.

Possibly at fs/buffer.c:end_buffer_io_async?

We need to flush the cache when I/O was READ or READA. Is there any
way for end_buffer_io_async to distinguish which I/O (READ or WRITE)
has been done?

--------------------------------------
Problem with write-back cache.

(1) Page got swapped out

Swap out
[ Disk ] <---- P [ Page ]

(2) Page got swapped in asynchronously, possibly by read-ahead

Swap in
[ Disk ] ----> P [ Page ]
K

The I/O from disk goes through kernel virtual address K.
We have cache entries indexed by K.

(3) Page fault occurs at user space U

[ Disk ] P [ Page ] <----- U
K

The control goes to do_swap_page, found the page at
lookup_swap_cache.

If K and U indexes differently, we have cache alias issues,
we need to flush the entries indexed by K.
--

2001-02-14 12:02:45

by Marcelo Tosatti

[permalink] [raw]
Subject: Re: [PATCH] swapin flush cache bug


On Wed, 14 Feb 2001, NIIBE Yutaka wrote:

> Alan Cox wrote:
> > Ok we need to handle that case a bit more intelligently so those flushes dont
> > get into other ports code paths.
>
> Possibly at fs/buffer.c:end_buffer_io_async?
>
> We need to flush the cache when I/O was READ or READA.

Yet another thing (1) on end_buffer_io_async() to handle a case which is
only true for a specific user of it. Since the other special case handling
is for swap IO too, I think a separate IO end operation for swap would be
interesting.

(1) The current one is SetPageDecrAfter handling.

> Is there any way for end_buffer_io_async to distinguish which I/O (READ or WRITE)
> has been done?

Yes. If the buffer_head is uptodated (BH_Uptodate) then its a WRITE,
otherwise its a READ (this is only true before mark_buffer_uptodate() call
inside end_buffer_io_async(), of course).