From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Michal Hocko, Dan Williams, David Rientjes, Paul Oppenheimer,
 William Kucharski, Andrew Morton, Linus Torvalds, Sasha Levin,
 linux-fsdevel@vger.kernel.org, linux-doc@vger.kernel.org
Subject: [PATCH AUTOSEL 4.20 116/117] mm, proc: be more verbose about unstable VMA
 flags in /proc/<pid>/smaps
Date: Tue, 8 Jan 2019 14:26:24 -0500
Message-Id: <20190108192628.121270-116-sashal@kernel.org>
X-Mailer: git-send-email 2.19.1
In-Reply-To: <20190108192628.121270-1-sashal@kernel.org>
References: <20190108192628.121270-1-sashal@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Michal Hocko

[ Upstream commit 7550c6079846a24f30d15ac75a941c8515dbedfb ]

Patch series "THP eligibility reporting via proc".

This series of three patches aims at making THP eligibility reporting
much more robust and long term sustainable. The trigger for the change
is a regression report [2] and the long follow-up discussion. In short,
the specific application didn't have a good API to query whether a
particular mapping can be backed by THP, so it used VMA flags to work
around that. These flags represent a deep internal state of VMAs and as
such they should be used by userspace with a great deal of caution.

A similar thing happened in [3], when users complained that VM_MIXEDMAP
is no longer set on DAX mappings. Again, the lack of a proper API led to
an abuse.

The first patch in the series tries to emphasise that the semantics of
the flags might change and that any application consuming them should be
really careful. The remaining two patches provide a more suitable
interface to address [2] and provide a consistent API to query the THP
status both for each VMA and process-wide.

[1] http://lkml.kernel.org/r/20181120103515.25280-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/alpine.DEB.2.21.1809241054050.224429@chino.kir.corp.google.com
[3] http://lkml.kernel.org/r/20181002100531.GC4135@quack2.suse.cz

This patch (of 3):

Even though VMA flags exported via /proc/<pid>/smaps are explicitly
documented as not guaranteed for future compatibility, the warning
doesn't go far enough because it doesn't mention semantic changes to
those flags. Those matter as well, because these flags are a deep
implementation detail internal to the MM code and their semantics might
change at any time.

Let's consider two recent examples:

http://lkml.kernel.org/r/20181002100531.GC4135@quack2.suse.cz
: commit e1fb4a086495 "dax: remove VM_MIXEDMAP for fsdax and device dax" has
: removed VM_MIXEDMAP flag from DAX VMAs. Now our testing shows that in the
: mean time certain customer of ours started poking into /proc/<pid>/smaps
: and looks at VMA flags there and if VM_MIXEDMAP is missing among the VMA
: flags, the application just fails to start complaining that DAX support is
: missing in the kernel.

http://lkml.kernel.org/r/alpine.DEB.2.21.1809241054050.224429@chino.kir.corp.google.com
: Commit 1860033237d4 ("mm: make PR_SET_THP_DISABLE immediately active")
: introduced a regression in that userspace cannot always determine the set
: of vmas where thp is ineligible.
: Userspace relies on the "nh" flag being emitted as part of /proc/pid/smaps
: to determine if a vma is eligible to be backed by hugepages.
: Previous to this commit, prctl(PR_SET_THP_DISABLE, 1) would cause thp to
: be disabled and emit "nh" as a flag for the corresponding vmas as part of
: /proc/pid/smaps. After the commit, thp is disabled by means of an mm
: flag and "nh" is not emitted.
: This causes smaps parsing libraries to assume a vma is eligible for thp
: and ends up puzzling the user on why its memory is not backed by thp.

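For illustration only, here is a minimal Python sketch of the kind of
smaps consumer the two reports describe. It assumes the usual smaps
layout, where each VMA starts with an address-range line and carries a
"VmFlags:" line of two-letter mnemonics; the function name
parse_vm_flags is hypothetical and not taken from any of the
applications mentioned above.

```python
import re

# Header line of one VMA entry, e.g.
# "00400000-0048a000 r-xp 00000000 fd:03 960637 /bin/bash".
_VMA_HEADER = re.compile(r"^([0-9a-f]+)-([0-9a-f]+)\s")


def parse_vm_flags(smaps_text):
    """Return a list of (start, end, flags) tuples, one per VMA.

    `flags` is a frozenset of the mnemonics found on the VMA's
    "VmFlags:" line. Both the presence and the meaning of a mnemonic
    can differ between kernel versions, so callers should treat the
    result as a hint rather than as a stable interface.
    """
    result = []
    current = None  # (start, end) of the VMA currently being read
    for line in smaps_text.splitlines():
        header = _VMA_HEADER.match(line)
        if header:
            current = (int(header.group(1), 16), int(header.group(2), 16))
        elif line.startswith("VmFlags:") and current is not None:
            flags = frozenset(line.split(":", 1)[1].split())
            result.append((current[0], current[1], flags))
            current = None
    return result
```

On a Linux system the input could come from reading /proc/self/smaps.
A check such as `"nh" in flags` is exactly the pattern that broke in
the second report: a missing mnemonic does not prove that the mapping
is THP eligible.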
In both cases userspace was relying on the semantics of a specific VMA
flag. The primary reason this happened is the lack of a proper
interface. While that is being worked on and will be fixed properly, our
wording could see some refinement and be more vocal about the semantic
aspect of these flags as well.

Link: http://lkml.kernel.org/r/20181211143641.3503-2-mhocko@kernel.org
Signed-off-by: Michal Hocko
Acked-by: Jan Kara
Acked-by: Dan Williams
Acked-by: David Rientjes
Acked-by: Mike Rapoport
Acked-by: Vlastimil Babka
Cc: Dan Williams
Cc: David Rientjes
Cc: Paul Oppenheimer
Cc: William Kucharski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin
---
 Documentation/filesystems/proc.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 12a5e6e693b6..2a4e63f5122c 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -496,7 +496,9 @@ manner. The codes are the following:
 
 Note that there is no guarantee that every flag and associated mnemonic will
 be present in all further kernel releases. Things get changed, the flags may
-be vanished or the reverse -- new added.
+be vanished or the reverse -- new added. Interpretation of their meaning
+might change in future as well. So each consumer of these flags has to
+follow each specific kernel version for the exact semantic.
 
 This file is only present if the CONFIG_MMU kernel configuration option is
 enabled.
-- 
2.19.1