Subject: Re: [PATCH v7 3/6] mm: Introduce VM_LOCKONFAULT
From: Vlastimil Babka
To: Michal Hocko, Eric B Munson
Cc: Andrew Morton, Jonathan Corbet, "Kirill A. Shutemov", linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, linux-mm@kvack.org, linux-api@vger.kernel.org
Date: Tue, 25 Aug 2015 15:55:46 +0200
Message-ID: <55DC73E2.6050509@suse.cz>
In-Reply-To: <20150825134154.GB6285@dhcp22.suse.cz>

On 08/25/2015 03:41 PM, Michal Hocko wrote:
> On Fri 21-08-15 14:31:32, Eric B Munson wrote:
> [...]
>> I am in the middle of implementing lock on fault this way, but I cannot
>> see how we will handle mremap of a lock on fault region.  Say we have
>> the following:
>>
>>     addr = mmap(len, MAP_ANONYMOUS, ...);
>>     mlock(addr, len, MLOCK_ONFAULT);
>>     ...
>>     mremap(addr, len, 2 * len, ...)
>>
>> There is no way for mremap to know that the area being remapped was lock
>> on fault so it will be locked and prefaulted by remap.
>> How can we avoid this without tracking per vma if it was locked with
>> lock or lock on fault?
>
> Yes mremap is a problem and it is very much similar to mmap(MAP_LOCKED).
> It doesn't guarantee the full mlock semantic because it leaves partially
> populated ranges behind without reporting any error.

Hm, that's right.

> Considering the current behavior I do not think it would be a terrible
> thing to do what Konstantin was suggesting and populate only the full
> ranges in a best effort mode (it is done so anyway) and document the
> behavior properly.
> "
>        If the memory segment specified by old_address and old_size is
>        locked (using mlock(2) or similar), then this lock is maintained
>        when the segment is resized and/or relocated. As a consequence,
>        the amount of memory locked by the process may change.
>
>        If the range is already fully populated and the range is
>        enlarged the new range is attempted to be fully populated
>        as well to preserve the full mlock semantic but there is no
>        guarantee this will succeed. Partially populated (e.g. created by
>        mlock(MLOCK_ONFAULT)) ranges do not have the full mlock semantic
>        so they are not populated on resize.
> "
>
> So what we have as a result is that partially populated ranges are
> preserved and fully populated ones work in the best effort mode the same
> way as they are now.
>
> Does that sound at least remotely reasonable?

I'll basically repeat what I said earlier:

- mremap scanning existing pte's to figure out the population would slow
  it down for no good reason
- it would be unreliable anyway:
   - example: was the area completely populated because MLOCK_ONFAULT
     was not used or because the process faulted it already
   - example: was the area not completely populated because
     MLOCK_ONFAULT was used, or because mmap(MAP_LOCKED) failed to
     populate it fully?

I think the first point is a pointless regression for workloads that use
just plain mlock() and don't want the onfault semantics. Unless there's
some shortcut?
Does vma have a counter of how much is populated? (I don't think so?)
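
As a rough user-space illustration of the two objections above (this is not kernel code, and not what mremap does or would do): the sketch below counts resident pages of an anonymous mapping with mincore(2). The helpers `page_size`, `map_pages`, `touch_pages` and `count_resident` are hypothetical names made up for this example. The count requires a pass over every page in the range, and the result only says how much is populated, not why: a range the process happened to touch in full looks the same as one that was populated up front.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes mincore() and MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: system page size in bytes. */
static size_t page_size(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    return (size_t)page;
}

/* Hypothetical helper: map npages of private anonymous memory, NULL on failure. */
static unsigned char *map_pages(size_t npages)
{
    void *p = mmap(NULL, npages * page_size(), PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : (unsigned char *)p;
}

/* Hypothetical helper: fault in the first n pages by writing one byte to each. */
static void touch_pages(unsigned char *addr, size_t n)
{
    size_t page = page_size();
    for (size_t i = 0; i < n; i++)
        addr[i * page] = 1;
}

/*
 * Hypothetical helper: count resident pages in [addr, addr + npages pages).
 * Returns -1 on error or if npages exceeds the small fixed buffer.
 * Note the cost: one entry per page has to be examined.
 */
static long count_resident(void *addr, size_t npages)
{
    unsigned char vec[64];
    long resident = 0;

    if (npages > sizeof(vec))
        return -1;
    if (mincore(addr, npages * page_size(), vec) != 0)
        return -1;
    for (size_t i = 0; i < npages; i++)
        resident += vec[i] & 1;   /* low bit set means the page is resident */
    return resident;
}
```

A caller could map a few pages with `map_pages`, touch all of them with `touch_pages`, and see `count_resident` report the range as fully populated even though no mlock() call was ever made. That is the ambiguity described in the first example: population alone does not reveal how the range was locked, if at all.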