2010-01-20 08:31:35

by anfei

[permalink] [raw]
Subject: cache alias in mmap + write

Hello,

The below test case is easy to reproduce the cache alias problem on
omap2430 with the VIPT cache. The steps as these:

$ dd if=/dev/zero of=abc bs=4k count=1
$ ./a.out # this program
$ xxd abc | head -1 # check the result

I expect it's always 0x11111111 0x77777777 at the beginning of file
"abc", but the result is not (run multiple times):

0x11111111 0x77777777
0x44444444 0x77777777
0x11111111 0x77777777
0x44444444 0x77777777
0x44444444 0x77777777

If I add munmap()/msync() before write(), I can see it's always as
expected (0x11111111 0x77777777).

Does Linux guarantee the coherence between write and the shared mappings
w/o the help of munmap/msync? If not, what kind of the coherence are
ensured? Can anyone give a clear definition?

And if I apply the below patch to write (only for the fs using this
generic function), the problem disappeared. That's another question,
why do we only do flush for read (see flush_dcache_page in
do_generic_file_read), but not write too?

Thanks,
Anfei.

---
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
int fd;
int *addr;
int tmp;
int val = 0x11111111;

fd = open("abc", O_RDWR);
addr = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
*(addr+0) = 0x44444444;
tmp = *(addr+0);
*(addr+1) = 0x77777777;
write(fd, &val, sizeof(int));
close(fd);

return 0;
}



diff --git a/mm/filemap.c b/mm/filemap.c
index 96ac6b0..07056fb 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2196,6 +2196,9 @@ again:
if (unlikely(status))
break;

+ if (mapping_writably_mapped(mapping))
+ flush_dcache_page(page);
+
pagefault_disable();
copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
pagefault_enable();


2010-01-20 09:10:20

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: cache alias in mmap + write

Hello,

> diff --git a/mm/filemap.c b/mm/filemap.c
> index 96ac6b0..07056fb 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2196,6 +2196,9 @@ again:
> if (unlikely(status))
> break;
>
> + if (mapping_writably_mapped(mapping))
> + flush_dcache_page(page);
> +
> pagefault_disable();
> copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
> pagefault_enable();

I'm not sure ARM cache coherency model. but I guess correct patch is here.

+ if (mapping_writably_mapped(mapping))
+ flush_dcache_page(page);
+
pagefault_disable();
copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
pagefault_enable();
- flush_dcache_page(page);


Why do we need to call flush_dcache_page() twice?



2010-01-20 09:52:52

by anfei

[permalink] [raw]
Subject: Re: cache alias in mmap + write

On Wed, Jan 20, 2010 at 06:10:11PM +0900, KOSAKI Motohiro wrote:
> Hello,
>
> > diff --git a/mm/filemap.c b/mm/filemap.c
> > index 96ac6b0..07056fb 100644
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -2196,6 +2196,9 @@ again:
> > if (unlikely(status))
> > break;
> >
> > + if (mapping_writably_mapped(mapping))
> > + flush_dcache_page(page);
> > +
> > pagefault_disable();
> > copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
> > pagefault_enable();
>
> I'm not sure ARM cache coherency model. but I guess correct patch is here.
>
> + if (mapping_writably_mapped(mapping))
> + flush_dcache_page(page);
> +
> pagefault_disable();
> copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
> pagefault_enable();
> - flush_dcache_page(page);
>
>
> Why do we need to call flush_dcache_page() twice?
>
The latter flush_dcache_page is used to flush the kernel changes
(iov_iter_copy_from_user_atomic), which makes the userspace to see the
write, and the one I added is used to flush the userspace changes.
And I think it's better to split this function into two:
flush_dcache_user_page(page);
kmap_atomic(page);
write to page;
kunmap_atomic(page);
flush_dcache_kern_page(page);
But currently there is no such API.
>
>
>

2010-01-21 01:10:11

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: cache alias in mmap + write

> On Wed, Jan 20, 2010 at 06:10:11PM +0900, KOSAKI Motohiro wrote:
> > Hello,
> >
> > > diff --git a/mm/filemap.c b/mm/filemap.c
> > > index 96ac6b0..07056fb 100644
> > > --- a/mm/filemap.c
> > > +++ b/mm/filemap.c
> > > @@ -2196,6 +2196,9 @@ again:
> > > if (unlikely(status))
> > > break;
> > >
> > > + if (mapping_writably_mapped(mapping))
> > > + flush_dcache_page(page);
> > > +
> > > pagefault_disable();
> > > copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
> > > pagefault_enable();
> >
> > I'm not sure ARM cache coherency model. but I guess correct patch is here.
> >
> > + if (mapping_writably_mapped(mapping))
> > + flush_dcache_page(page);
> > +
> > pagefault_disable();
> > copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
> > pagefault_enable();
> > - flush_dcache_page(page);
> >
> > Why do we need to call flush_dcache_page() twice?
> >
> The latter flush_dcache_page is used to flush the kernel changes
> (iov_iter_copy_from_user_atomic), which makes the userspace to see the
> write, and the one I added is used to flush the userspace changes.
> And I think it's better to split this function into two:
> flush_dcache_user_page(page);
> kmap_atomic(page);
> write to page;
> kunmap_atomic(page);
> flush_dcache_kern_page(page);
> But currently there is no such API.

Why can't we create new api? this your pseudo code looks very fine to me.


note: if you don't like to create new api. I can agree your current patch.
but I have three requests.
1. Move flush_dcache_page() into iov_iter_copy_from_user_atomic().
Your above explanation indicate it is real intention. plus, change
iov_iter_copy_from_user_atomic() fixes fuse too.
2. Add some commnet. almost developer only have x86 machine. so, arm
specific trick need additional explicit explanation. otherwise anybody
might break this code in the future.
3. Resend the patch. original mail isn't good patch format. please consider
to reduce akpm suffer.


2010-01-21 01:32:17

by Jamie Lokier

[permalink] [raw]
Subject: Re: cache alias in mmap + write

KOSAKI Motohiro wrote:
> 2. Add some commnet. almost developer only have x86 machine. so, arm
> specific trick need additional explicit explanation. otherwise anybody
> might break this code in the future.

That's Documentation/cachetlb.txt.

What's being discussed here is not ARM-specific, although it appears
maintainers of different architecture (ARM and MIPS for a start) may
have different ideas about what they are guaranteeing to userspace.
It sounds like MIPS expects userspace to use msync() sometimes (even
though Linux msync(MS_INVALIDATE) is quite broken), and ARM expects to
to keep mappings coherent automatically (which is sometimes slower
than necessary, but usually very helpful).

> 3. Resend the patch. original mail isn't good patch format. please
> consider to reduce akpm suffer.

This type of change in generic code would need review from a number of
architecture maintainers, I'd expect.

-- Jamie

2010-01-21 02:39:16

by anfei

[permalink] [raw]
Subject: Re: cache alias in mmap + write

On Thu, Jan 21, 2010 at 9:10 AM, KOSAKI Motohiro
<[email protected]> wrote:
>> On Wed, Jan 20, 2010 at 06:10:11PM +0900, KOSAKI Motohiro wrote:
>> > Hello,
>> >
>> > > diff --git a/mm/filemap.c b/mm/filemap.c
>> > > index 96ac6b0..07056fb 100644
>> > > --- a/mm/filemap.c
>> > > +++ b/mm/filemap.c
>> > > @@ -2196,6 +2196,9 @@ again:
>> > > ? ? ? ? ? if (unlikely(status))
>> > > ? ? ? ? ? ? ? ? ? break;
>> > >
>> > > + ? ? ? ? if (mapping_writably_mapped(mapping))
>> > > + ? ? ? ? ? ? ? ? flush_dcache_page(page);
>> > > +
>> > > ? ? ? ? ? pagefault_disable();
>> > > ? ? ? ? ? copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
>> > > ? ? ? ? ? pagefault_enable();
>> >
>> > I'm not sure ARM cache coherency model. but I guess correct patch is here.
>> >
>> > + ? ? ? ? ? if (mapping_writably_mapped(mapping))
>> > + ? ? ? ? ? ? ? ? ? flush_dcache_page(page);
>> > +
>> > ? ? ? ? ? ? pagefault_disable();
>> > ? ? ? ? ? ? copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
>> > ? ? ? ? ? ? pagefault_enable();
>> > - ? ? ? ? ? flush_dcache_page(page);
>> >
>> > Why do we need to call flush_dcache_page() twice?
>> >
>> The latter flush_dcache_page is used to flush the kernel changes
>> (iov_iter_copy_from_user_atomic), which makes the userspace to see the
>> write, ?and the one I added is used to flush the userspace changes.
>> And I think it's better to split this function into two:
>> ? ? ? flush_dcache_user_page(page);
>> ? ? ? kmap_atomic(page);
>> ? ? ? write to ?page;
>> ? ? ? kunmap_atomic(page);
>> ? ? ? flush_dcache_kern_page(page);
>> But currently there is no such API.
>
> Why can't we create new api? this your pseudo code looks very fine to me.
>
Thanks for your suggestion, I will try to add the new APIs. But
firstly, as Jamie
pointed out, can we confirm is this a real bug? Or it depends on the arch.

>
> note: if you don't like to create new api. I can agree your current patch.
> but I have three requests.
> ?1. Move flush_dcache_page() into iov_iter_copy_from_user_atomic().
> ? ?Your above explanation indicate it is real intention. plus, change
> ? ?iov_iter_copy_from_user_atomic() fixes fuse too.

OK.

> ?2. Add some commnet. almost developer only have x86 machine. so, arm
> ? ?specific trick need additional explicit explanation. otherwise anybody
> ? ?might break this code in the future.
> ?3. Resend the patch. original mail isn't good patch format. please consider
> ? ?to reduce akpm suffer.
>
OK.

Thanks,
Anfei.
>
>
>

2010-01-21 03:06:17

by anfei

[permalink] [raw]
Subject: Re: cache alias in mmap + write

On Thu, Jan 21, 2010 at 9:32 AM, Jamie Lokier <[email protected]> wrote:
> KOSAKI Motohiro wrote:
>> ?2. Add some commnet. almost developer only have x86 machine. so, arm
>> ? ? specific trick need additional explicit explanation. otherwise anybody
>> ? ? might break this code in the future.
>
> That's Documentation/cachetlb.txt.
>
> What's being discussed here is not ARM-specific, although it appears
> maintainers of different architecture (ARM and MIPS for a start) may
> have different ideas about what they are guaranteeing to userspace.
> It sounds like MIPS expects userspace to use msync() sometimes (even
> though Linux msync(MS_INVALIDATE) is quite broken), and ARM expects to
> to keep mappings coherent automatically (which is sometimes slower
> than necessary, but usually very helpful).
>
>> ?3. Resend the patch. original mail isn't good patch format. please
>> ?consider to reduce akpm suffer.
>
> This type of change in generic code would need review from a number of
> architecture maintainers, I'd expect.
>
So should I broadcast this mail in order to get their attentions,
[email protected]?

Thanks!
> -- Jamie
>

2010-01-21 05:00:07

by anfei

[permalink] [raw]
Subject: Re: cache alias in mmap + write

On Thu, Jan 21, 2010 at 9:10 AM, KOSAKI Motohiro
<[email protected]> wrote:
>> On Wed, Jan 20, 2010 at 06:10:11PM +0900, KOSAKI Motohiro wrote:
>> > Hello,
>> >
>> > > diff --git a/mm/filemap.c b/mm/filemap.c
>> > > index 96ac6b0..07056fb 100644
>> > > --- a/mm/filemap.c
>> > > +++ b/mm/filemap.c
>> > > @@ -2196,6 +2196,9 @@ again:
>> > > ? ? ? ? ? if (unlikely(status))
>> > > ? ? ? ? ? ? ? ? ? break;
>> > >
>> > > + ? ? ? ? if (mapping_writably_mapped(mapping))
>> > > + ? ? ? ? ? ? ? ? flush_dcache_page(page);
>> > > +
>> > > ? ? ? ? ? pagefault_disable();
>> > > ? ? ? ? ? copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
>> > > ? ? ? ? ? pagefault_enable();
>> >
>> > I'm not sure ARM cache coherency model. but I guess correct patch is here.
>> >
>> > + ? ? ? ? ? if (mapping_writably_mapped(mapping))
>> > + ? ? ? ? ? ? ? ? ? flush_dcache_page(page);
>> > +
>> > ? ? ? ? ? ? pagefault_disable();
>> > ? ? ? ? ? ? copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
>> > ? ? ? ? ? ? pagefault_enable();
>> > - ? ? ? ? ? flush_dcache_page(page);
>> >
>> > Why do we need to call flush_dcache_page() twice?
>> >
>> The latter flush_dcache_page is used to flush the kernel changes
>> (iov_iter_copy_from_user_atomic), which makes the userspace to see the
>> write, ?and the one I added is used to flush the userspace changes.
>> And I think it's better to split this function into two:
>> ? ? ? flush_dcache_user_page(page);
>> ? ? ? kmap_atomic(page);
>> ? ? ? write to ?page;
>> ? ? ? kunmap_atomic(page);
>> ? ? ? flush_dcache_kern_page(page);
>> But currently there is no such API.
>
> Why can't we create new api? this your pseudo code looks very fine to me.
>
I will resend the patch, if this patch is acceptable, I will create
another patch
to introduce this new API.
>
> note: if you don't like to create new api. I can agree your current patch.
> but I have three requests.
> ?1. Move flush_dcache_page() into iov_iter_copy_from_user_atomic().
> ? ?Your above explanation indicate it is real intention. plus, change
> ? ?iov_iter_copy_from_user_atomic() fixes fuse too.

There is a check on mapping, that's not passed into
iov_iter_copy_from_user_atomic, and this function is only called a few places,
So I just add flush_dcache_page directly.

> ?2. Add some commnet. almost developer only have x86 machine. so, arm
> ? ?specific trick need additional explicit explanation. otherwise anybody
> ? ?might break this code in the future.
> ?3. Resend the patch. original mail isn't good patch format. please consider
> ? ?to reduce akpm suffer.
>
I will send it soon.

Thanks,
Anfei.
>
>
>