2009-04-24 10:41:41

by Robin Holt

Subject: Re: Why doesn't zap_pte_range() call page_mkwrite()

On Fri, Apr 24, 2009 at 09:15:22AM +0200, Miklos Szeredi wrote:
> On Thu, 23 Apr 2009, Trond Myklebust wrote:
> > On Thu, 2009-04-23 at 21:52 +0200, Miklos Szeredi wrote:
> > > Now this is mostly done at page fault time, and the pte's are always
> > > being re-protected whenever the PG_dirty flag is cleared (see
> > > page_mkclean()).
> > >
> > > But in some cases (shmfs being the example I know) pages are not write
> > > protected and so zap_pte_range(), and other functions, still need to
> > > transfer the pte dirtyness to the page flag.
> >
> > My main worry is that this is all happening at munmap() time. There
> > shouldn't be any more page faults after that completes (am I right?), so
> > what other mechanism would transfer the pte dirtyness?
>
> After munmap() a page fault will result in SIGSEGV. A write access
> during munmap(), when the vma has been removed but the page table is
> still intact is more interesting. But in that case the write fault
> should also result in a SEGV, because it won't be able to find the
> matching VMA.
>
> Now let's see what happens if writeback is started against the page
> during this limbo period. page_mkclean() is called, which doesn't
> find the vma, so it doesn't re-protect the pte. But the PG_dirty will

I am not sure how you came to this conclusion. The address_space has
the vma's chained together and protected by the i_mmap_lock. That is
acquired prior to the cleaning operation. Additionally, the cleaning
operation walks the process's page tables and will remove/write-protect
the page before releasing the i_mmap_lock.
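
To make the locking argument concrete, here is a minimal user-space sketch of the cleaning walk described above. It is a toy model, not kernel code: the `toy_*` types and `toy_page_mkclean()` are hypothetical names invented for illustration, a boolean stands in for i_mmap_lock, and each vma is assumed to hold a single pte for the one page of interest. It only models the write-protect case, not pte removal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types are the kernel's. */
struct toy_pte {
    bool present;
    bool writable;
    bool dirty;
};

struct toy_vma {
    struct toy_pte pte;          /* the single pte mapping our one page */
    struct toy_vma *next_share;  /* link in the mapping's vma chain */
};

struct toy_mapping {
    struct toy_vma *i_mmap;      /* every vma that maps the file */
    bool i_mmap_locked;          /* stand-in for i_mmap_lock (assumption) */
};

/*
 * Walk every vma chained on the mapping while holding the (modelled) lock,
 * write-protect each present pte and clear its dirty bit.
 * Returns the number of dirty ptes that were found.
 */
static int toy_page_mkclean(struct toy_mapping *m)
{
    int cleaned = 0;

    assert(!m->i_mmap_locked);
    m->i_mmap_locked = true;     /* taken before the cleaning walk starts */

    for (struct toy_vma *v = m->i_mmap; v != NULL; v = v->next_share) {
        if (!v->pte.present)
            continue;
        if (v->pte.dirty)
            cleaned++;
        v->pte.dirty = false;
        v->pte.writable = false; /* the next write must fault again */
    }

    m->i_mmap_locked = false;    /* released only after every pte is done */
    return cleaned;
}
```

In this model, any vma still chained on `i_mmap` is reached by the walk, regardless of what else is happening to the owning process.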

Maybe I misunderstand. I hope I have not added confusion.

Thanks,
Robin


2009-04-24 14:52:26

by Miklos Szeredi

Subject: Re: Why doesn't zap_pte_range() call page_mkwrite()

On Fri, 24 Apr 2009, Robin Holt wrote:
> I am not sure how you came to this conclusion. The address_space has
> the vma's chained together and protected by the i_mmap_lock. That is
> acquired prior to the cleaning operation. Additionally, the cleaning
> operation walks the process's page tables and will remove/write-protect
> the page before releasing the i_mmap_lock.
>
> Maybe I misunderstand. I hope I have not added confusion.

Looking more closely, I think you're right.

I thought that detach_vmas_to_be_unmapped() also removed the vmas from
mapping->i_mmap, but that is not the case: it only removes them from
the process's mm_struct. A vma is only removed from ->i_mmap in
unmap_region(), _after_ its pte's have been zapped.

This means that while the pte zapping is going on, any page faults
will fail but page_mkclean() (and all of rmap) will continue to work.
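
The ordering can be sketched as a small user-space model. This is only an illustration under stated assumptions, not kernel code: the `sim_*` names are hypothetical, there is a single vma, and the two pointers stand in for "reachable through the mm_struct" and "reachable through ->i_mmap".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: one vma that can be reached from two places. */
struct sim_vma {
    unsigned long start, end;
    bool pte_present;
};

struct sim_state {
    struct sim_vma *in_mm;      /* what a page fault looks up (mm_struct) */
    struct sim_vma *in_i_mmap;  /* what rmap / page_mkclean() looks up */
};

/* Page fault path: find the vma through the mm; NULL models a SEGV. */
static struct sim_vma *sim_fault_lookup(const struct sim_state *s,
                                        unsigned long addr)
{
    struct sim_vma *v = s->in_mm;

    return (v && addr >= v->start && addr < v->end) ? v : NULL;
}

/* rmap path: find the vma through the file mapping. */
static struct sim_vma *sim_rmap_lookup(const struct sim_state *s)
{
    return s->in_i_mmap;
}

/* The three munmap() steps, in the order described in this message. */
static void sim_detach_from_mm(struct sim_state *s)
{
    s->in_mm = NULL;            /* models detach_vmas_to_be_unmapped() */
}

static void sim_zap_ptes(struct sim_vma *v)
{
    v->pte_present = false;     /* models the pte zapping */
}

static void sim_remove_from_i_mmap(struct sim_state *s)
{
    s->in_i_mmap = NULL;        /* happens last, after the zap */
}
```

Calling the three steps in that order leaves a window, between the detach and the final removal, in which `sim_fault_lookup()` returns NULL while `sim_rmap_lookup()` still finds the vma.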

But then I don't see how we get a dirty pte without also first getting
a page fault. Weird...

Miklos

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .