Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756561AbZA3VhD (ORCPT ); Fri, 30 Jan 2009 16:37:03 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1753558AbZA3Vgx (ORCPT ); Fri, 30 Jan 2009 16:36:53 -0500
Received: from g1t0029.austin.hp.com ([15.216.28.36]:10812 "EHLO
	g1t0029.austin.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753006AbZA3Vgw (ORCPT );
	Fri, 30 Jan 2009 16:36:52 -0500
Subject: Re: [PATCH] Fix OOPS in mmap_region() when merging adjacent VM_LOCKED file segments
From: Lee Schermerhorn
To: Hugh Dickins
Cc: Linus Torvalds , Greg KH , Maksim Yevmenkin , linux-kernel ,
	Nick Piggin , Andrew Morton , will@crowder-design.com,
	Rik van Riel , KOSAKI Motohiro , KAMEZAWA Hiroyuki , Mikos Szeredi
In-Reply-To:
References: <1233259410.2315.75.camel@lts-notebook>
	<20090130055639.GA30950@suse.de>
	<1233345190.908.36.camel@lts-notebook>
Content-Type: text/plain
Organization: HP/OSLO
Date: Fri, 30 Jan 2009 16:36:52 -0500
Message-Id: <1233351412.908.69.camel@lts-notebook>
Mime-Version: 1.0
X-Mailer: Evolution 2.22.3.1
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3803
Lines: 78

On Fri, 2009-01-30 at 21:12 +0000, Hugh Dickins wrote:
> On Fri, 30 Jan 2009, Linus Torvalds wrote:
> > On Fri, 30 Jan 2009, Lee Schermerhorn wrote:
> > >
> > > So happens, I'm mapping with MAP_SHARED, so the VM_ACCOUNT flag gets
> > > cleared later in mmap_region(). Comments say that this is for checking
> > > memory availability during shmem_file_setup(). Maybe we can move the
> > > temporary setting of VM_ACCOUNT until just before the call to
> > > shmem_zero_setup()?
> >
> > Yeah, that would probably fix it, and looks like the right thing to do.
>
> I do need to refresh my memory on that in a moment...
> >
> > It all looks pretty confused wrong to set the whole VM_ACCOUNT flag for a
> > file-backed file AT ALL in the first place, but the code knows that it
> > won't matter for a shared file, and will be cleared again later.
> >
> > So it plays these temporary games with vm_flags, and it didn't matter
> > because of how we used to call "vma_merge()" either early only for the
> > anonymous memory case (that had VM_ACCOUNT stable and didn't have that
> > temporary case at all) or much later (after having undone the temporary
> > flag setting) for files.
>
> I'm to blame for those games, and now they've given trouble,
> the right thing may be to put an end to them.
> >
> > Why do we pass in that "accountable" flag, btw? It's only ever set to 0 by
> > a MAP_PRIVATE mapping that hits is_file_hugepages() (see do_mmap_pgoff),
> > and we could just do that decision all inside mmap_region(). So the flag
> > doesn't really seem to have any real meaning, and is just passed around
> > for some odd historical reason?
>
> It looks like the "accountable" flag dates from before Miklos separated
> mmap_region() out from do_mmap_pgoff(): so he just passed it on down to
> mmap_region() as an additional argument, preferring to leave the more
> complex MAP_PRIVATE/is_file_hugepages test behind in do_mmap_pgoff().
>
> It seemed rather a random refactoring to me. Looking at it again,
> I wonder if we should be getting do_brk() to use mmap_region() too;
> but my appetite for cleanups is low at present, bugs we have enough.
>
> By the way, there's an argument to say that you should add
> VM_MIXEDMAP to VM_CAN_NONLINEAR in VM_MERGEABLE_FLAGS: I don't
> really care whether we merge the odd filemap_xip vma or not,
> but it used to do so and now won't.
>
> By the same (used to merge, now won't) argument, one could say
> VM_INSERTPAGE should be there too; but whereas VM_MIXEDMAP is used
> in one place only, quite a lot of drivers use vm_insert_page(), so
> I feel more comfortable with the idea that it's stopping merges -
> though in that case, shouldn't we add it to VM_SPECIAL?
>
> But I'm caring more about that VM_ACCOUNT...

I just verified that adding VM_ACCOUNT to VM_MERGEABLE_FLAGS does allow
the merge to happen with the test program. And the system didn't come
crashing down around me. But I wouldn't trust that simple test as the
last word. A short run of a stress load I use has held up and is still
running, but I can't tell whether it's merging as expected there.

I am running a slightly modified version of Maksim's test program under
the harness. I modified it to mmap the entire region first to reserve
the address space, then mmap with MAP_FIXED at each page address in the
range returned by the first mmap. Without this, I saw that it was
leaving holes between some of the pages.

I'm going to automate the check for merging [read the map and verify a
single segment at the expected range] and leave that running with the
load.

Lee
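
The reserve-then-MAP_FIXED mapping and the automated merge check described
above could be sketched roughly as follows. This is not Maksim's test program
or the harness mentioned in the mail: the helper names count_vmas() and
map_pages_fixed(), the PROT_READ protection, and the choice of parsing
/proc/self/maps are all assumptions made for illustration, and the locking
step (mlockall() or MAP_LOCKED) is deliberately left out.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: count the /proc/self/maps entries that overlap
 * [start, end). Returns -1 if the maps file cannot be opened.
 */
static int count_vmas(unsigned long start, unsigned long end)
{
	FILE *fp = fopen("/proc/self/maps", "r");
	char line[512];
	int at_line_start = 1;
	int n = 0;

	if (!fp)
		return -1;
	while (fgets(line, sizeof(line), fp)) {
		unsigned long lo, hi;
		size_t len = strlen(line);

		/* Only parse the first chunk of each (possibly long) line. */
		if (at_line_start &&
		    sscanf(line, "%lx-%lx", &lo, &hi) == 2 &&
		    lo < end && hi > start)
			n++;
		at_line_start = (len > 0 && line[len - 1] == '\n');
	}
	fclose(fp);
	return n;
}

/*
 * Hypothetical helper: reserve npages of address space with one anonymous
 * mmap, then map consecutive pages of fd over the reservation one page at
 * a time with MAP_SHARED | MAP_FIXED, so that no holes are left between
 * the pages. Returns the base address, or MAP_FAILED on error.
 */
static void *map_pages_fixed(int fd, size_t npages)
{
	long psz = sysconf(_SC_PAGESIZE);
	char *base;
	size_t i;

	assert(npages > 0 && psz > 0);
	base = mmap(NULL, npages * psz, PROT_NONE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return MAP_FAILED;

	for (i = 0; i < npages; i++) {
		void *p = mmap(base + i * psz, psz, PROT_READ,
			       MAP_SHARED | MAP_FIXED, fd, (off_t)(i * psz));

		if (p == MAP_FAILED) {
			munmap(base, npages * psz);
			return MAP_FAILED;
		}
	}
	return base;
}
```

A caller would open a file of at least npages pages, call
map_pages_fixed(fd, npages), and then pass the returned range to
count_vmas(). A count of 1 would indicate that the per-page mappings were
merged into a single segment; a larger count would mean some of them were
not.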