Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752351AbaBRX5f (ORCPT ); Tue, 18 Feb 2014 18:57:35 -0500
Received: from mta-out.inet.fi ([195.156.147.13]:36279 "EHLO kirsi1.inet.fi"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750832AbaBRX5d (ORCPT ); Tue, 18 Feb 2014 18:57:33 -0500
Date: Wed, 19 Feb 2014 01:57:14 +0200
From: "Kirill A. Shutemov"
To: Linus Torvalds
Cc: "Kirill A. Shutemov" , Andrew Morton , Mel Gorman , Rik van Riel ,
	Andi Kleen , Matthew Wilcox , Dave Hansen , Alexander Viro ,
	Dave Chinner , linux-mm , linux-fsdevel , Linux Kernel Mailing List
Subject: Re: [RFC, PATCHv2 0/2] mm: map few pages around fault address if
	they are in page cache
Message-ID: <20140218235714.GA16064@node.dhcp.inet.fi>
References: <1392662333-25470-1-git-send-email-kirill.shutemov@linux.intel.com>
	<20140218175900.8CF90E0090@blue.fi.intel.com>
	<20140218180730.C2552E0090@blue.fi.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.22.1-rc1 (2013-10-16)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Feb 18, 2014 at 10:28:11AM -0800, Linus Torvalds wrote:
> On Tue, Feb 18, 2014 at 10:07 AM, Kirill A. Shutemov
> wrote:
> >
> > Patch is wrong. Correct one is below.
>
> Hmm. I don't hate this. Looking through it, it's fairly simple
> conceptually, and the code isn't that complex either. I can live with
> this.
>
> I think it's a bit odd how you pass both "max_pgoff" and "nr_pages" to
> the fault-around function, though. In fact, I'd consider that a bug.
> Passing in "FAULT_AROUND_PAGES" is just wrong, since the code cannot -
> and in fact *must* not - actually fault in that many pages, since the
> starting/ending address can be limited by other things.
>
> So I think that part of the code is bogus. You need to remove
> nr_pages, because any use of it is just incorrect.
> I don't think it
> can actually matter, since the max_pgoff checks are more restrictive,
> but if you think it can matter please explain how and why it wouldn't
> be a major bug?

I don't like this either...

Current max_pgoff is the end of the page table (or the end of the vma,
if it ends before). If we drop nr_pages but keep the current max_pgoff,
we will potentially set up PTRS_PER_PTE pages at a time: i.e. a page
fault on the first page of a page table, with all pages ready in page
cache, maps the whole table. nr_pages limits the number.

It's not necessarily a bad idea to populate the whole page table at
once. I need to measure how much latency we will add by doing that.

The only problem I see is that we take the ptl for a bit too long. But
with split ptl it will affect only the page table we populate.

The other approach is to limit ourselves to FAULT_AROUND_PAGES from
start_addr. In this case we will sometimes do a useless radix-tree
lookup even if we had a chance to populate pages further in the page
table.

> Apart from that, I'd really like to see numbers for different ranges
> of FAULT_AROUND_ORDER, because I think 5 is pretty high, but on the
> whole I don't find this horrible, and you still lock the page so it
> doesn't involve any new rules. I'm not hugely happy with another raw
> radix-tree user, but it's not horrible.
>
> Btw, is the "radix_tree_deref_retry(page) -> goto restart" really
> necessary? I'd be almost more inclined to just make it just do a
> "break;" to break out of the loop and stop doing anything clever at
> all.

The code is not ready yet. I'll rework it. It's just what I had by the
end of the day. I wanted to know whether setting up ptes directly from
->fault_nonblock() is an okayish approach or is considered a layering
violation.

-- 
 Kirill A. Shutemov
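To make the max_pgoff vs. nr_pages point concrete, here is a minimal
userspace sketch of the window arithmetic. It is not the patch's code:
the helper names window_last_vpn() and window_pages() are hypothetical,
pages are identified by virtual page number rather than by pgoff, and
PTRS_PER_PTE = 512 is assumed (the x86-64 value). FAULT_AROUND_ORDER = 5
is taken from the discussion above.

```c
/* Userspace model only; these constants are assumptions for the sketch. */
#define PTRS_PER_PTE		512UL	/* assumed: x86-64 */
#define FAULT_AROUND_ORDER	5UL	/* value discussed in the thread */
#define FAULT_AROUND_PAGES	(1UL << FAULT_AROUND_ORDER)

unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper: last virtual page number that fault-around may
 * map when it starts at start_vpn. vma_end_vpn is exclusive and must
 * be greater than start_vpn.
 */
unsigned long window_last_vpn(unsigned long start_vpn,
			      unsigned long vma_end_vpn,
			      int use_nr_pages)
{
	/* End of the page table that contains start_vpn. */
	unsigned long last = start_vpn | (PTRS_PER_PTE - 1);

	/* ... or the end of the vma, if it ends before. */
	last = min_ul(last, vma_end_vpn - 1);

	/* Optional extra cap: FAULT_AROUND_PAGES from start_vpn. */
	if (use_nr_pages)
		last = min_ul(last, start_vpn + FAULT_AROUND_PAGES - 1);

	return last;
}

/* Hypothetical helper: number of pages in the resulting window. */
unsigned long window_pages(unsigned long start_vpn,
			   unsigned long vma_end_vpn,
			   int use_nr_pages)
{
	return window_last_vpn(start_vpn, vma_end_vpn, use_nr_pages)
		- start_vpn + 1;
}
```

With a fault on the first page of a page table inside a large vma, the
uncapped window covers all PTRS_PER_PTE entries, while the capped one
stops after FAULT_AROUND_PAGES. When start_vpn is close to the end of
the page table or the vma, those bounds are the tighter ones and the cap
makes no difference.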