From: Chris Mason
Subject: Re: Why doesn't zap_pte_range() call page_mkwrite()
Date: Tue, 8 Sep 2009 12:31:49 -0400
Message-ID: <20090908163149.GB2975@think>
References: <1240510668.11148.40.camel@heimdal.trondhjem.org>
 <1240519320.5602.9.camel@heimdal.trondhjem.org>
 <20090424104137.GA7601@sgi.com>
 <1240592448.4946.35.camel@heimdal.trondhjem.org>
 <20090425051028.GC10088@wotan.suse.de>
 <20090908153007.GB2513@think>
 <20090908154132.GC29902@wotan.suse.de>
In-Reply-To: <20090908154132.GC29902@wotan.suse.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
To: Nick Piggin
Cc: Trond Myklebust, Miklos Szeredi, holt@sgi.com,
 linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org
Sender: owner-linux-mm@kvack.org

On Tue, Sep 08, 2009 at 05:41:32PM +0200, Nick Piggin wrote:
> On Tue, Sep 08, 2009 at 11:30:07AM -0400, Chris Mason wrote:
> > > > As I said, I think I can fix the NFS problem by simply unmapping the
> > > > page inside ->writepage() whenever we know the write request was
> > > > originally set up by a page fault.
> > >
> > > The biggest outstanding problem we have remaining is get_user_pages.
> > > Callers are only required to hold a ref on the page and then they
> > > can call set_page_dirty at any point after that.
> > >
> > > I have a half-done patch somewhere to add a put_user_pages, and then
> > > we could probably go from there to pinning the fs metadata (whether
> > > by using the page lock or something else, I don't quite know).
> >
> > Hi everyone,
> >
> > Sorry for digging up an old thread, but is there any reason we can't
> > just use page_mkwrite here? I'd love to get rid of the btrfs code to
> > detect places that use set_page_dirty without a page_mkwrite.
>
> It is because page_mkwrite must be called before the page is dirtied
> (it may fail, it theoretically may do something crazy with the previous
> clean page data). And in several places I think it gets called from a
> nasty context.
>
> It hasn't fallen completely off my radar. fsblock has the same issue
> (although I've just been ignoring gup writes into fsblock fs for the
> time being).

Ok, I'll change my detection code a bit then.

>
> I have a basic idea of what to do... It would be nice to change calling
> convention of get_user_pages and take the page lock. Database people might
> scream, in which case we could only take the page lock for filesystems that
> define ->page_mkwrite (so shared mem segments avoid the overhead). Lock
> ordering might get a bit interesting, but if we can have callers ensure they
> always submit and release partially fulfilled requests, then we can always
> trylock them.

I think everyone will have page_mkwrite eventually, at least everyone
who the databases will care about ;)

-chris
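
For anyone following the thread, the scheme Nick outlines in the quoted text
can be modelled in a few lines of userspace C. This is only a rough sketch
under stated assumptions, not kernel code and not anyone's actual patch:
struct toy_page and every toy_* function are hypothetical names invented for
illustration, and the "lock" is a plain flag rather than the real page lock.
It tries to show three points from the discussion: the page_mkwrite hook runs
(and may fail) before the page is dirtied, the lock is only taken when the
filesystem defines the hook, and a batch stops at the first page that cannot
be trylocked so the caller can submit and release what it already holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of this mirrors real kernel structures. */
struct toy_page {
    bool locked;    /* stand-in for the page lock */
    bool dirty;
    bool fs_ready;  /* fs has prepared for a write to this page */
    int (*page_mkwrite)(struct toy_page *); /* NULL: fs has no hook */
};

/* Hypothetical hooks: one succeeds, one fails (think ENOSPC). */
static int toy_mkwrite_ok(struct toy_page *p) { p->fs_ready = true; return 0; }
static int toy_mkwrite_enospc(struct toy_page *p) { (void)p; return -1; }

static bool toy_trylock(struct toy_page *p)
{
    if (p->locked)
        return false;
    p->locked = true;
    return true;
}

static void toy_unlock(struct toy_page *p) { p->locked = false; }

/*
 * Prepare one page for writing. The hook is called while the page is
 * still clean, so a failure leaves nothing to undo. Returns 0 on
 * success (page left locked if the fs has a hook), -1 otherwise.
 */
static int toy_pin_for_write(struct toy_page *p)
{
    if (!p->page_mkwrite)
        return 0;           /* e.g. shared memory: no lock taken */
    if (!toy_trylock(p))
        return -1;          /* never block, so no lock-order deadlock */
    if (p->page_mkwrite(p) != 0) {
        toy_unlock(p);
        return -1;
    }
    return 0;
}

/* Release side: only now is the page marked dirty. */
static void toy_unpin_dirty(struct toy_page *p)
{
    p->dirty = true;
    if (p->page_mkwrite)
        toy_unlock(p);
}

/*
 * Pin pages in order and stop at the first failure. The caller is
 * expected to submit and release the partial batch before retrying.
 */
static size_t toy_pin_batch(struct toy_page **pages, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (toy_pin_for_write(pages[i]) != 0)
            break;
    return i;
}

/* Small self-check: second page's hook fails, so only one page is pinned. */
static void toy_demo(void)
{
    struct toy_page a = { .page_mkwrite = toy_mkwrite_ok };
    struct toy_page b = { .page_mkwrite = toy_mkwrite_enospc };
    struct toy_page *batch[] = { &a, &b };
    size_t got = toy_pin_batch(batch, 2);

    assert(got == 1);
    assert(!b.dirty && !b.locked);
    toy_unpin_dirty(&a);
    assert(a.dirty && a.fs_ready && !a.locked);
}
```

In this model a page without a hook is never locked, which is the "shared mem
segments avoid the overhead" case, and a failed hook leaves the page clean and
unlocked. How any of this would map onto the real get_user_pages callers is
exactly the open question in the thread.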