Date: Thu, 12 Mar 2020 09:22:48 +0100
From: Michal Hocko
To: Jann Horn
Cc: Minchan Kim, Linux-MM, kernel list, Daniel Colascione, Dave Hansen,
    "Joel Fernandes (Google)", Andrew Morton
Subject: Re: interaction of MADV_PAGEOUT with CoW anonymous mappings?
Message-ID: <20200312082248.GS23944@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

[Cc akpm]

So what about this?

From eca97990372679c097a88164ff4b3d7879b0e127 Mon Sep 17 00:00:00 2001
From: Michal Hocko
Date: Thu, 12 Mar 2020 09:04:35 +0100
Subject: [PATCH] mm: do not allow MADV_PAGEOUT for CoW pages

Jann has brought up a very interesting point [1]. While shared pages
are normally excluded from MADV_PAGEOUT, CoW pages can easily be
reclaimed that way. This can lead to all sorts of hard-to-debug
problems, e.g. the performance problems outlined by Daniel [2].

There are runtime environments where a substantial amount of memory is
shared among security domains via CoW, and an easy way to reclaim that
memory, which MADV_{COLD,PAGEOUT} offers, can lead either to
performance degradation for the parent process, which might be more
privileged, or even to side channel attacks. The feasibility of the
latter is not really clear to me TBH, but there is no real reason for
the exposure at this stage.

It seems there is no real use case that depends on reclaiming CoW
memory via madvise at this stage, so it is much easier to simply
disallow it, and this is what this patch does. Put simply,
MADV_{PAGEOUT,COLD} can operate only on exclusively owned memory,
which is a straightforward semantic.

[1] http://lkml.kernel.org/r/CAG48ez0G3JkMq61gUmyQAaCq=_TwHbi1XKzWRooxZkv08PQKuw@mail.gmail.com
[2] http://lkml.kernel.org/r/CAKOZueua_v8jHCpmEtTB6f3i9e2YnmX4mqdYVWhV4E=Z-n+zRQ@mail.gmail.com

Signed-off-by: Michal Hocko
---
 mm/madvise.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 43b47d3fae02..4bb30ed6c8d2 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -335,12 +335,14 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 		}

 		page = pmd_page(orig_pmd);
+
+		/* Do not interfere with other mappings of this page */
+		if (page_mapcount(page) != 1)
+			goto huge_unlock;
+
 		if (next - addr != HPAGE_PMD_SIZE) {
 			int err;

-			if (page_mapcount(page) != 1)
-				goto huge_unlock;
-
 			get_page(page);
 			spin_unlock(ptl);
 			lock_page(page);
@@ -426,6 +428,10 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 			continue;
 		}

+		/* Do not interfere with other mappings of this page */
+		if (page_mapcount(page) != 1)
+			continue;
+
 		VM_BUG_ON_PAGE(PageTransCompound(page), page);

 		if (pte_young(ptent)) {
-- 
2.24.1

-- 
Michal Hocko
SUSE Labs
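
For readers who want to see the intended semantic in isolation, here is
a minimal userspace model of the check the patch adds. It is only a
sketch and not kernel code: struct model_page, model_page_is_exclusive()
and model_pageout_range() are hypothetical names invented for this
illustration, and the mapcount field merely stands in for what
page_mapcount() reports in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct page (illustration only). */
struct model_page {
	int mapcount;	/* how many page table entries map this page */
	int reclaimed;	/* set to 1 once the model has "paged out" the page */
};

/* Mirrors the added check: only an exclusively mapped page qualifies. */
static int model_page_is_exclusive(const struct model_page *page)
{
	return page->mapcount == 1;
}

/*
 * Walk a range of model pages the way the PTE loop walks a range of
 * PTEs, skipping every page that is mapped more than once.
 * Returns the number of pages the model reclaimed.
 */
static size_t model_pageout_range(struct model_page *pages, size_t nr)
{
	size_t i, reclaimed = 0;

	assert(pages != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		/* Do not interfere with other mappings of this page */
		if (!model_page_is_exclusive(&pages[i]))
			continue;

		pages[i].reclaimed = 1;
		reclaimed++;
	}

	return reclaimed;
}
```

In this model a page still shared between a parent and its forked child
has a mapcount of 2 and is left alone, while a page that only the
caller maps (mapcount 1) is reclaimed. That is the "exclusively owned
memory" rule from the changelog, applied per page.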