Date: Wed, 21 Nov 2018 08:05:00 +0100
From: Michal Hocko
To: Dan Williams
Cc: Linux API, Andrew Morton, adobriyan@gmail.com, Linux MM, Linux Kernel Mailing List, Jan Kara, David Rientjes
Subject: Re: [RFC PATCH 1/3] mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps
Message-ID: <20181121070500.GB12932@dhcp22.suse.cz>
References: <20181120103515.25280-1-mhocko@kernel.org> <20181120103515.25280-2-mhocko@kernel.org>

On Tue 20-11-18 10:32:07, Dan Williams wrote:
> On Tue, Nov 20, 2018 at 2:35 AM Michal Hocko wrote:
> >
> > From: Michal Hocko
> >
> > Even though vma flags exported via /proc/<pid>/smaps are explicitly
> > documented to be not guaranteed for future compatibility the warning
> > doesn't go far enough because it doesn't mention semantic changes to
> > those flags. And they are important as well because these flags are
> > a deep implementation internal to the MM code and the semantic might
> > change at any time.
> >
> > Let's consider two recent examples:
> > http://lkml.kernel.org/r/20181002100531.GC4135@quack2.suse.cz
> > : commit e1fb4a086495 "dax: remove VM_MIXEDMAP for fsdax and device dax" has
> > : removed VM_MIXEDMAP flag from DAX VMAs. Now our testing shows that in the
> > : mean time certain customer of ours started poking into /proc/<pid>/smaps
> > : and looks at VMA flags there and if VM_MIXEDMAP is missing among the VMA
> > : flags, the application just fails to start complaining that DAX support is
> > : missing in the kernel.
> >
> > http://lkml.kernel.org/r/alpine.DEB.2.21.1809241054050.224429@chino.kir.corp.google.com
> > : Commit 1860033237d4 ("mm: make PR_SET_THP_DISABLE immediately active")
> > : introduced a regression in that userspace cannot always determine the set
> > : of vmas where thp is ineligible.
> > : Userspace relies on the "nh" flag being emitted as part of /proc/pid/smaps
> > : to determine if a vma is eligible to be backed by hugepages.
> > : Previous to this commit, prctl(PR_SET_THP_DISABLE, 1) would cause thp to
> > : be disabled and emit "nh" as a flag for the corresponding vmas as part of
> > : /proc/pid/smaps. After the commit, thp is disabled by means of an mm
> > : flag and "nh" is not emitted.
> > : This causes smaps parsing libraries to assume a vma is eligible for thp
> > : and ends up puzzling the user on why its memory is not backed by thp.
> >
> > In both cases userspace was relying on a semantic of a specific VMA
> > flag. The primary reason why that happened is a lack of a proper
> > interface. While this has been worked on and it will be fixed properly,
> > it seems that our wording could see some refinement and be more vocal
> > about semantic aspect of these flags as well.
> >
> > Cc: Jan Kara
> > Cc: Dan Williams
> > Cc: David Rientjes
> > Signed-off-by: Michal Hocko
> > ---
> >  Documentation/filesystems/proc.txt | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
> > index 12a5e6e693b6..b1fda309f067 100644
> > --- a/Documentation/filesystems/proc.txt
> > +++ b/Documentation/filesystems/proc.txt
> > @@ -496,7 +496,9 @@ flags associated with the particular virtual memory area in two letter encoded
> >
> >  Note that there is no guarantee that every flag and associated mnemonic will
> >  be present in all further kernel releases. Things get changed, the flags may
> > -be vanished or the reverse -- new added.
> > +be vanished or the reverse -- new added. Interpretatation of their meaning
> > +might change in future as well. So each consumnent of these flags have to
> > +follow each specific kernel version for the exact semantic.
>
> Can we start to claw some of this back? Perhaps with a config option
> to hide the flags to put applications on notice?

I would love to. My knowledge of CRIU is very minimal, but my
understanding is that this is the primary consumer of those flags. And
checkpointing is so close to the specific kernel version that I assume
that this abuse is somehow justified. We can hide it behind
CONFIG_CHECKPOINT_RESTORE but is it going to help? I presume that many
distro kernels will have the config enabled.

> I recall that when I
> introduced CONFIG_IO_STRICT_DEVMEM it caused enough regressions that
> distros did not enable it, but now a few years out I'm finding that it
> is enabled in more places.
>
> In any event,
>
> Acked-by: Dan Williams

Thanks!
-- 
Michal Hocko
SUSE Labs
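
For illustration only, the kind of userspace parsing discussed in this
thread might look roughly like the sketch below. It pulls the "VmFlags:"
line for each mapping out of smaps-formatted text. The function name
parse_vmflags and the sample data are hypothetical and are not taken from
CRIU or any other real consumer; the sketch assumes the usual smaps layout
in which every mapping starts with a "start-end perms ..." header line and
ends with a "VmFlags:" line.

```python
import re

# Header line of a mapping, e.g. "00400000-0048a000 r-xp ..." (assumed layout).
_HEADER = re.compile(r"^([0-9a-f]+)-([0-9a-f]+)\s")


def parse_vmflags(smaps_text):
    """Return a list of (start, end, flags) tuples, one per mapping.

    Hypothetical helper: flags is a frozenset of the two-letter mnemonics
    found on the mapping's "VmFlags:" line.
    """
    result = []
    current = None  # address range of the mapping being read
    for line in smaps_text.splitlines():
        m = _HEADER.match(line)
        if m:
            current = (int(m.group(1), 16), int(m.group(2), 16))
        elif line.startswith("VmFlags:") and current is not None:
            flags = frozenset(line.split(":", 1)[1].split())
            result.append((current[0], current[1], flags))
            current = None
    return result


# Made-up sample in smaps format; real input would be read from
# /proc/<pid>/smaps.
SAMPLE = """\
00400000-0048a000 r-xp 00000000 fd:03 960637 /bin/bash
Size:                552 kB
Rss:                 460 kB
VmFlags: rd ex mr mw me dw
7ffd0000-7ffd2000 rw-p 00000000 00:00 0 [stack]
Size:                  8 kB
VmFlags: rd wr mr mw me gd ac nh
"""

if __name__ == "__main__":
    for start, end, flags in parse_vmflags(SAMPLE):
        print(f"{start:x}-{end:x} nh={'nh' in flags}")
```

As the second example quoted above shows, a check such as "nh" in flags
only means what the consumer expects on the kernel version it was written
against: after commit 1860033237d4 a missing "nh" no longer implies that
the VMA is eligible for thp.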