From: Michal Hocko
To: Michael Kerrisk
Cc: Andrew Morton, Linus Torvalds, David Rientjes, LKML, Linux API, Michal Hocko
Subject: [PATCH 1/2] mmap.2: clarify MAP_LOCKED semantic
Date: Wed, 13 May 2015 16:38:11 +0200
Message-Id: <1431527892-2996-2-git-send-email-miso@dhcp22.suse.cz>
In-Reply-To: <1431527892-2996-1-git-send-email-miso@dhcp22.suse.cz>
References: <1431527892-2996-1-git-send-email-miso@dhcp22.suse.cz>

From: Michal Hocko

MAP_LOCKED has had subtly different semantics from mmap(2)+mlock(2)
since it was introduced. mlock(2) fails if the memory range cannot be
populated, which guarantees that no major faults will happen on the
range afterwards. mmap(MAP_LOCKED), on the other hand, silently succeeds
even if the range was populated only partially.

Fixing this subtle difference in the kernel is rather awkward because
the memory population happens after the mm locks have been dropped, so
the cleanup before returning failure (munlock) could operate on
something other than the originally mapped area.

E.g. a speculative userspace page fault handler catching SEGV and doing
mmap(fault_addr, MAP_FIXED|MAP_LOCKED) might discard a portion of a
racing mmap and lead to lost data. Although it is not clear whether such
usage would be valid, the mmap page doesn't explicitly describe
requirements for threaded applications, so we cannot exclude this
possibility.

This patch makes the semantics of MAP_LOCKED explicit and suggests using
mmap + mlock as the only way to guarantee no later major page faults.

Signed-off-by: Michal Hocko
---
 man2/mmap.2 | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/man2/mmap.2 b/man2/mmap.2
index 54d68cf87e9e..1486be2e96b3 100644
--- a/man2/mmap.2
+++ b/man2/mmap.2
@@ -235,8 +235,19 @@ See the Linux kernel source file
 for further information.
 .TP
 .BR MAP_LOCKED " (since Linux 2.5.37)"
-Lock the pages of the mapped region into memory in the manner of
+Mark the mmaped region to be locked in the same way as
 .BR mlock (2).
+This implementation will try to populate (prefault) the whole range but
+the mmap call doesn't fail with
+.B ENOMEM
+if this fails. Therefore major faults might happen later on. So the semantic
+is not as strong as
+.BR mlock (2).
+.BR mmap (2)
++
+.BR mlock (2)
+should be used when major faults are not acceptable after the initialization
+of the mapping.
 This flag is ignored in older kernels.
 .\" If set, the mapped pages will not be swapped out.
 .TP
-- 
2.1.4
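
For illustration only, and not part of the patch: the mmap(2) + mlock(2)
pattern recommended above might be sketched as follows. The helper name
map_locked_strict() is hypothetical, and the sketch assumes an anonymous
private mapping; the point is only that a population failure is reported
through mlock(2) rather than silently ignored, as it can be with
MAP_LOCKED.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not from the patch): create an anonymous private
 * mapping and lock it with mlock(2) instead of passing MAP_LOCKED.
 * Returns the mapping on success, or NULL with errno set if either the
 * mapping or the locking (and hence the population) fails.
 */
static void *map_locked_strict(size_t len)
{
	void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED)
		return NULL;

	/* mlock(2) fails if the range cannot be populated and locked. */
	if (mlock(addr, len) != 0) {
		int saved_errno = errno;

		/* Drop the mapping; report the mlock(2) error to the caller. */
		munmap(addr, len);
		errno = saved_errno;
		return NULL;
	}

	return addr;
}
```

A caller would treat a NULL return as "the range could not be locked",
for example because of RLIMIT_MEMLOCK or memory pressure, and release a
successful mapping with munmap(2), which also drops the lock.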