2009-10-12 08:52:18

by Nitin Gupta

[permalink] [raw]
Subject: [PATCH] [ARM] force dcache flush if dcache_dirty bit set

On ARM, update_mmu_cache() does dcache flush for a page only if
it has a kernel mapping (page_mapping(page) != NULL). The correct
behavior would be to force the flush based on dcache_dirty bit only.

One of the cases where present logic would be a problem is when
a RAM based block device[1] is used as a swap disk. In this case,
we would have in-memory data corruption as shown in steps below:

do_swap_page()
{
- Allocate a new page (if not already in swap cache)
- Issue read from swap disk
- Block driver issues flush_dcache_page()
- flush_dcache_page() simply sets PG_dcache_dirty bit and does not
actually issue a flush since this page has no user space mapping yet.
- Now, if swap disk is almost full, this newly read page is removed
from swap cache and corrsponding swap slot is freed.
- Map this page anonymously in user space.
- update_mmu_cache()
- Since this page does not have kernel mapping (its not in page/swap
cache and is mapped anonymously), it does not issue dcache flush
even if dcache_dirty bit is set by flush_dcache_page() above.

<user now gets stale data since dcache was never flushed>
}

Same problem exists on mips too.

[1] example:
- brd (RAM based block device)
- ramzswap (RAM based compressed swap device)

Signed-off-by: Nitin Gupta <[email protected]>
---
arch/arm/mm/fault-armv.c | 9 +++------
1 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
index bc0099d..d0d17b6 100644
--- a/arch/arm/mm/fault-armv.c
+++ b/arch/arm/mm/fault-armv.c
@@ -153,14 +153,11 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr, pte_t pte)

page = pfn_to_page(pfn);
mapping = page_mapping(page);
- if (mapping) {
#ifndef CONFIG_SMP
- int dirty = test_and_clear_bit(PG_dcache_dirty, &page->flags);
-
- if (dirty)
- __flush_dcache_page(mapping, page);
+ if (test_and_clear_bit(PG_dcache_dirty, &page->flags))
+ __flush_dcache_page(mapping, page);
#endif
-
+ if (mapping) {
if (cache_is_vivt())
make_coherent(mapping, vma, addr, pfn);
else if (vma->vm_flags & VM_EXEC)
--
1.6.2.5


2009-10-12 09:08:04

by Russell King - ARM Linux

[permalink] [raw]
Subject: Re: [PATCH] [ARM] force dcache flush if dcache_dirty bit set

On Mon, Oct 12, 2009 at 02:20:23PM +0530, Nitin Gupta wrote:
> Same problem exists on mips too.

Nice catch. This logic came from sparc64, so I think you'd want to talk
to davem about it as well.

In the mean time, submitting your fix to the patch system would be great,
thanks.

2009-10-12 09:37:49

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] [ARM] force dcache flush if dcache_dirty bit set

From: Russell King - ARM Linux <[email protected]>
Date: Mon, 12 Oct 2009 10:07:10 +0100

> On Mon, Oct 12, 2009 at 02:20:23PM +0530, Nitin Gupta wrote:
>> Same problem exists on mips too.
>
> Nice catch. This logic came from sparc64, so I think you'd want to talk
> to davem about it as well.
>
> In the mean time, submitting your fix to the patch system would be great,
> thanks.

Sparc64 flushes unconditionally when there is no page mapping.
So, it should be fine here.

2009-10-12 10:01:17

by Russell King - ARM Linux

[permalink] [raw]
Subject: Re: [PATCH] [ARM] force dcache flush if dcache_dirty bit set

On Mon, Oct 12, 2009 at 02:37:44AM -0700, David Miller wrote:
> From: Russell King - ARM Linux <[email protected]>
> Date: Mon, 12 Oct 2009 10:07:10 +0100
>
> > On Mon, Oct 12, 2009 at 02:20:23PM +0530, Nitin Gupta wrote:
> >> Same problem exists on mips too.
> >
> > Nice catch. This logic came from sparc64, so I think you'd want to talk
> > to davem about it as well.
> >
> > In the mean time, submitting your fix to the patch system would be great,
> > thanks.
>
> Sparc64 flushes unconditionally when there is no page mapping.
> So, it should be fine here.

Are you sure - I checked the sparc64 code before posting, and we're doing
the same thing.

Sparc64 update_mmu_cache:
page = pfn_to_page(pfn);
if (page && page_mapping(page)) {
pg_flags = page->flags;
if (pg_flags & (1UL << PG_dcache_dirty)) {
/* do lazy flush */
}
}

Sparc64 flush_dcache_page:
mapping = page_mapping(page);
if (mapping && !mapping_mapped(mapping)) {
set_dcache_dirty(page, this_cpu);
}

ARM update_mmu_cache:
page = pfn_to_page(pfn);
mapping = page_mapping(page);
if (mapping) {
int dirty = test_and_clear_bit(PG_dcache_dirty, &page->flags);
if (dirty)
/* do lazy flush */

ARM flush_dcache_page:
struct address_space *mapping = page_mapping(page);
if (!PageHighMem(page) && mapping && !mapping_mapped(mapping))
set_bit(PG_dcache_dirty, &page->flags);

It looks identical to me.

The problem which has been identified is that when flush_dcache_page() is
called, there is a mapping, and so the page is marked for lazy flushing.
However, by the time update_mmu_cache() gets called, the mapping has gone
and so update_mmu_cache() does nothing.

2009-10-12 10:16:34

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] [ARM] force dcache flush if dcache_dirty bit set

From: Russell King - ARM Linux <[email protected]>
Date: Mon, 12 Oct 2009 11:00:23 +0100

> The problem which has been identified is that when flush_dcache_page() is
> called, there is a mapping, and so the page is marked for lazy flushing.
> However, by the time update_mmu_cache() gets called, the mapping has gone
> and so update_mmu_cache() does nothing.

Now I understand, thanks.

Ok, I'll remove the mapping check from the sparc64 flush_dcache()
function, seems the right thing to do.

2009-10-12 10:30:25

by Nitin Gupta

[permalink] [raw]
Subject: Re: [PATCH] [ARM] force dcache flush if dcache_dirty bit set

On 10/12/2009 02:37 PM, Russell King - ARM Linux wrote:
> On Mon, Oct 12, 2009 at 02:20:23PM +0530, Nitin Gupta wrote:
>> Same problem exists on mips too.
>
> Nice catch. This logic came from sparc64, so I think you'd want to talk
> to davem about it as well.
>
> In the mean time, submitting your fix to the patch system would be great,
> thanks.
>

Thanks for looking into this. I think this can make it to mainline if
you could Ack it and perhaps CC Andrew to get his attention?

I will soon post similar patch for mips. For sparc64, I guess David wants to fix it.

Thanks,
Nitin

2009-10-12 11:09:04

by Russell King - ARM Linux

[permalink] [raw]
Subject: Re: [PATCH] [ARM] force dcache flush if dcache_dirty bit set

On Mon, Oct 12, 2009 at 03:58:02PM +0530, Nitin Gupta wrote:
> On 10/12/2009 02:37 PM, Russell King - ARM Linux wrote:
> > On Mon, Oct 12, 2009 at 02:20:23PM +0530, Nitin Gupta wrote:
> >> Same problem exists on mips too.
> >
> > Nice catch. This logic came from sparc64, so I think you'd want to talk
> > to davem about it as well.
> >
> > In the mean time, submitting your fix to the patch system would be great,
> > thanks.
> >
>
> Thanks for looking into this. I think this can make it to mainline if
> you could Ack it and perhaps CC Andrew to get his attention?

Or it could take a less tortuous route by going through my tree.

2009-10-12 11:18:51

by Nitin Gupta

[permalink] [raw]
Subject: Re: [PATCH] [ARM] force dcache flush if dcache_dirty bit set

On 10/12/2009 04:38 PM, Russell King - ARM Linux wrote:
> On Mon, Oct 12, 2009 at 03:58:02PM +0530, Nitin Gupta wrote:
>> On 10/12/2009 02:37 PM, Russell King - ARM Linux wrote:
>>> On Mon, Oct 12, 2009 at 02:20:23PM +0530, Nitin Gupta wrote:
>>>> Same problem exists on mips too.
>>>
>>> Nice catch. This logic came from sparc64, so I think you'd want to talk
>>> to davem about it as well.
>>>
>>> In the mean time, submitting your fix to the patch system would be great,
>>> thanks.
>>>
>>
>> Thanks for looking into this. I think this can make it to mainline if
>> you could Ack it and perhaps CC Andrew to get his attention?
>
> Or it could take a less tortuous route by going through my tree.
>

Ok, this is fine too.

Thanks,
Nitin

2009-10-12 16:10:57

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] [ARM] force dcache flush if dcache_dirty bit set

On Mon, 12 Oct 2009, Russell King - ARM Linux wrote:
> On Mon, Oct 12, 2009 at 02:37:44AM -0700, David Miller wrote:
> > From: Russell King - ARM Linux <[email protected]>
> > Date: Mon, 12 Oct 2009 10:07:10 +0100
> >
> > > On Mon, Oct 12, 2009 at 02:20:23PM +0530, Nitin Gupta wrote:
> > >> Same problem exists on mips too.
> > >
> > > Nice catch. This logic came from sparc64, so I think you'd want to talk
> > > to davem about it as well.
> > >
> > > In the mean time, submitting your fix to the patch system would be great,
> > > thanks.
> >
> > Sparc64 flushes unconditionally when there is no page mapping.
> > So, it should be fine here.
>
> Are you sure - I checked the sparc64 code before posting, and we're doing
> the same thing.
>
> Sparc64 update_mmu_cache:
> page = pfn_to_page(pfn);
> if (page && page_mapping(page)) {
> pg_flags = page->flags;
> if (pg_flags & (1UL << PG_dcache_dirty)) {
> /* do lazy flush */
> }
> }
>
> Sparc64 flush_dcache_page:
> mapping = page_mapping(page);
> if (mapping && !mapping_mapped(mapping)) {
> set_dcache_dirty(page, this_cpu);
> }
>
> ARM update_mmu_cache:
> page = pfn_to_page(pfn);
> mapping = page_mapping(page);
> if (mapping) {
> int dirty = test_and_clear_bit(PG_dcache_dirty, &page->flags);
> if (dirty)
> /* do lazy flush */
>
> ARM flush_dcache_page:
> struct address_space *mapping = page_mapping(page);
> if (!PageHighMem(page) && mapping && !mapping_mapped(mapping))
> set_bit(PG_dcache_dirty, &page->flags);
>
> It looks identical to me.
>
> The problem which has been identified is that when flush_dcache_page() is
> called, there is a mapping, and so the page is marked for lazy flushing.
> However, by the time update_mmu_cache() gets called, the mapping has gone
> and so update_mmu_cache() does nothing.

Sorry to muddy the waters on this, if you and Dave are sure that
you have the right fix, down in your architectures, and that fix
isn't going to hurt your performance significantly.

But this issue would affect other architectures too, wouldn't it?
I think not just mips, which was mentioned, but others too, parisc
for example: linux-arch Cc'ed.

And I'm not certain that down in the arch is the right place:
it may well turn out to be right, but that's not obvious.

How much is this issue peculiar to swap? Where Nitin discovered it
in do_swap_page(), due to swap cache being freed before reaching the
update_mmu_cache().

If it is peculiar to swap, then I'd suggest perhaps one of two very
different solutions. One would be to rearrange do_swap_page() so we
do not remove the page from swap cache before the update_mmu_cache()
(that would need care, and might turn out to be harder than I think).
The alternative solution might be to stop page_mapping(page) returning
the pseudo-address_space swapper_space when PageSwapCache - I always
hated that, but had to give in at the time.

(Aside: I don't think it affects this issue, but just be aware
that mapping_mapped(page) returns false on a swapcache page,
no matter whether it's mapped in anywhere or not.)

But if it is not peculiar to swap, fixes down in the architecture
may be inevitable. I think this depends on what happens when
truncating or unmapping a file. It used to be the case that pages
were unmapped first, then truncated (clearing page->mapping) after:
in which case, Nitin's issue would appear to be peculiar to swap.

But around 2.6.23 Nick found a second unmap_mapping_range() necessary
after the truncation (just for the invalidation case? or because of
inadequate serialization, now I think fixed with the page lock?).
And I think he has that area under reconstruction again at present,
so Cc'ed for his opinion on this issue - thanks.

Hugh

2009-10-12 17:04:16

by Russell King

[permalink] [raw]
Subject: Re: [PATCH] [ARM] force dcache flush if dcache_dirty bit set

On Mon, Oct 12, 2009 at 05:09:53PM +0100, Hugh Dickins wrote:
> Sorry to muddy the waters on this, if you and Dave are sure that
> you have the right fix, down in your architectures, and that fix
> isn't going to hurt your performance significantly.

If I look at the issue from this point of view:

- we are using PG_arch_1 to delay cache handling for the page

- if PG_arch_1 is set on a page, we set it explicitly because we
didn't do some flushing between the allocation of the page and
mapping it into userspace

- if a page with PG_arch_1 set ever gets to userspace, this can
only be because we did the lazy flushing thing

I don't see that there should have been any bearing on whether a page
has a mapping or not when we get to update_mmu_cache. The issue here
is that > if PG_arch_1 is set on a page, then we didn't flush it at
the time when we believed it was appropriate to do so. <

Tell me I'm wrong (having only just sent it to Linus...)

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:

2009-10-12 17:14:16

by Nitin Gupta

[permalink] [raw]
Subject: Re: [PATCH] [ARM] force dcache flush if dcache_dirty bit set

On 10/12/2009 10:33 PM, Russell King wrote:
> On Mon, Oct 12, 2009 at 05:09:53PM +0100, Hugh Dickins wrote:
>> Sorry to muddy the waters on this, if you and Dave are sure that
>> you have the right fix, down in your architectures, and that fix
>> isn't going to hurt your performance significantly.
>
> If I look at the issue from this point of view:
>
> - we are using PG_arch_1 to delay cache handling for the page
>
> - if PG_arch_1 is set on a page, we set it explicitly because we
> didn't do some flushing between the allocation of the page and
> mapping it into userspace
>
> - if a page with PG_arch_1 set ever gets to userspace, this can
> only be because we did the lazy flushing thing
>


> I don't see that there should have been any bearing on whether a page
> has a mapping or not when we get to update_mmu_cache. The issue here
> is that > if PG_arch_1 is set on a page, then we didn't flush it at
> the time when we believed it was appropriate to do so. <
>


Presented so clearly by Russell and this is exactly what I meant, though
the example just focused on some particular swap case which I happen to
stumble upon while working on something similar.

Thanks,
Nitin

2009-10-12 18:07:04

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] [ARM] force dcache flush if dcache_dirty bit set

On Mon, 12 Oct 2009, Russell King wrote:
> On Mon, Oct 12, 2009 at 05:09:53PM +0100, Hugh Dickins wrote:
> > Sorry to muddy the waters on this, if you and Dave are sure that
> > you have the right fix, down in your architectures, and that fix
> > isn't going to hurt your performance significantly.
>
> If I look at the issue from this point of view:
>
> - we are using PG_arch_1 to delay cache handling for the page
>
> - if PG_arch_1 is set on a page, we set it explicitly because we
> didn't do some flushing between the allocation of the page and
> mapping it into userspace
>
> - if a page with PG_arch_1 set ever gets to userspace, this can
> only be because we did the lazy flushing thing
>
> I don't see that there should have been any bearing on whether a page
> has a mapping or not when we get to update_mmu_cache. The issue here
> is that > if PG_arch_1 is set on a page, then we didn't flush it at
> the time when we believed it was appropriate to do so. <
>
> Tell me I'm wrong (having only just sent it to Linus...)

I wouldn't dare! Put that way, it seems very clear that it was always
at best a redundant test, which Nitin now shows can go wrong. Okay,
thanks, let's forget about whether file invalidation can or cannot
also put a page into this state, and proceed with your fix.

The architectures which appear to need fixing in the same way are
arm, mips, parisc, sh and sparc64 (xtensa looks right already, and
score just looks confused - a whole function __update_cache() which
checks for PG_arch_1, yet nothing sets it?).

Hugh

2009-10-13 02:09:33

by Paul Mundt

[permalink] [raw]
Subject: Re: [PATCH] [ARM] force dcache flush if dcache_dirty bit set

On Mon, Oct 12, 2009 at 06:03:12PM +0100, Russell King wrote:
> On Mon, Oct 12, 2009 at 05:09:53PM +0100, Hugh Dickins wrote:
> > Sorry to muddy the waters on this, if you and Dave are sure that
> > you have the right fix, down in your architectures, and that fix
> > isn't going to hurt your performance significantly.
>
> If I look at the issue from this point of view:
>
> - we are using PG_arch_1 to delay cache handling for the page
>
> - if PG_arch_1 is set on a page, we set it explicitly because we
> didn't do some flushing between the allocation of the page and
> mapping it into userspace
>
> - if a page with PG_arch_1 set ever gets to userspace, this can
> only be because we did the lazy flushing thing
>
> I don't see that there should have been any bearing on whether a page
> has a mapping or not when we get to update_mmu_cache. The issue here
> is that > if PG_arch_1 is set on a page, then we didn't flush it at
> the time when we believed it was appropriate to do so. <
>
> Tell me I'm wrong (having only just sent it to Linus...)
>
Having looked at the ARM fix, in the !mapping case do you not need the
I-cache flush on vma->vm_flags & VM_EXEC? Or is the presumption that
flush_icache_page()-type action doesn't need to be undertaken by
flush_dcache_page()/update_mmu_cache() when there is no page_mapping()?

2009-10-14 17:11:35

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH] [ARM] force dcache flush if dcache_dirty bit set

On Tue, 2009-10-13 at 11:07 +0900, Paul Mundt wrote:
> On Mon, Oct 12, 2009 at 06:03:12PM +0100, Russell King wrote:
> > On Mon, Oct 12, 2009 at 05:09:53PM +0100, Hugh Dickins wrote:
> > > Sorry to muddy the waters on this, if you and Dave are sure that
> > > you have the right fix, down in your architectures, and that fix
> > > isn't going to hurt your performance significantly.
> >
> > If I look at the issue from this point of view:
> >
> > - we are using PG_arch_1 to delay cache handling for the page
> >
> > - if PG_arch_1 is set on a page, we set it explicitly because we
> > didn't do some flushing between the allocation of the page and
> > mapping it into userspace
> >
> > - if a page with PG_arch_1 set ever gets to userspace, this can
> > only be because we did the lazy flushing thing
> >
> > I don't see that there should have been any bearing on whether a page
> > has a mapping or not when we get to update_mmu_cache. The issue here
> > is that > if PG_arch_1 is set on a page, then we didn't flush it at
> > the time when we believed it was appropriate to do so. <
> >
> > Tell me I'm wrong (having only just sent it to Linus...)
>
> Having looked at the ARM fix, in the !mapping case do you not need the
> I-cache flush on vma->vm_flags & VM_EXEC? Or is the presumption that
> flush_icache_page()-type action doesn't need to be undertaken by
> flush_dcache_page()/update_mmu_cache() when there is no page_mapping()?

If I understand Nitin's scenario correctly, I think it should also
invalidate the I-cache.

For executable anonymous pages containing, it's the user app writing the
code (JIT etc.) and it calls an ARM-specific syscall for I and D cache
maintenance. If such page is read back from swap, following Nitin's
scenario, the I-cache would need to be invalidated as well otherwise it
can have stale entries.

--
Catalin

2009-10-15 12:00:16

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH] [ARM] force dcache flush if dcache_dirty bit set

On Wed, 2009-10-14 at 18:10 +0100, Catalin Marinas wrote:
> On Tue, 2009-10-13 at 11:07 +0900, Paul Mundt wrote:
> > On Mon, Oct 12, 2009 at 06:03:12PM +0100, Russell King wrote:
> > > I don't see that there should have been any bearing on whether a page
> > > has a mapping or not when we get to update_mmu_cache. The issue here
> > > is that > if PG_arch_1 is set on a page, then we didn't flush it at
> > > the time when we believed it was appropriate to do so. <
> > >
> > > Tell me I'm wrong (having only just sent it to Linus...)
> >
> > Having looked at the ARM fix, in the !mapping case do you not need the
> > I-cache flush on vma->vm_flags & VM_EXEC? Or is the presumption that
> > flush_icache_page()-type action doesn't need to be undertaken by
> > flush_dcache_page()/update_mmu_cache() when there is no page_mapping()?
>
> If I understand Nitin's scenario correctly, I think it should also
> invalidate the I-cache.
>
> For executable anonymous pages containing, it's the user app writing the
> code (JIT etc.) and it calls an ARM-specific syscall for I and D cache
> maintenance. If such page is read back from swap, following Nitin's
> scenario, the I-cache would need to be invalidated as well otherwise it
> can have stale entries.

We can have a scenario where I-cache invalidation would not help on ARM.
Some apps may temporarily change protection from RX to RW to write some
data (not instructions) to a page containing code. If Nitin's scenario
happens when the page is RW, the VM_EXEC wouldn't be set, hence no
I-cache invalidation.

The app would later do mprotect(RX) but on some ARM processors
flush_cache_range() is a no-op and therefore no cache flushing. What
other architectures with Harvard caches do for mprotect(RX)? Is this
assumed to invalidate the I-cache?

(we have another I-D cache coherency problem on ARM with COW text pages
following a RX -> RW -> write data -> RX scenario and there are a few
solutions for this, though none of them optimal with the current cache
flushing API, especially with the read-implies-exec ELF personality)

--
Catalin

2009-10-15 21:48:54

by Kyle McMartin

[permalink] [raw]
Subject: Re: [PATCH] [ARM] force dcache flush if dcache_dirty bit set

On Mon, Oct 12, 2009 at 07:06:04PM +0100, Hugh Dickins wrote:
> The architectures which appear to need fixing in the same way are
> arm, mips, parisc, sh and sparc64 (xtensa looks right already, and
> score just looks confused - a whole function __update_cache() which
> checks for PG_arch_1, yet nothing sets it?).
>

parisc looks affected as well, though I don't recall the semantics of
dcache flushing without a mapping. James, can you send a patch?

WRT score, which seems to have been mostly cobbled together as a modern
copy of mips, it can likely be eliminated entirely, as it looks to have
been cargoculted from mips without thought.

regards, Kyle