Received: by 2002:a25:e7d8:0:0:0:0:0 with SMTP id e207csp905054ybh; Thu, 12 Mar 2020 13:16:58 -0700 (PDT) X-Google-Smtp-Source: ADFU+vsXD9qIoZPmcYb/WEeSMvNiMUUH2AZJGCOqXOitXb62xCGmqauz7+62ziI8vrx9I8/Xrlj1 X-Received: by 2002:a05:6830:1d7:: with SMTP id r23mr7775324ota.181.1584044218642; Thu, 12 Mar 2020 13:16:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1584044218; cv=none; d=google.com; s=arc-20160816; b=zIE0gVnFlkuD5YFSj2x2Mhd2KWdb71wzPZupiHDiCc4TSCD5A6mynCw3YBR8LctCgo Gqq5xmi4h0kpJb3aBCReK9SAUb4Xbs6fidYINVFWrX3qsNzUJmtM8h+2rmy16s5LL5xQ mCMes9FPUeFpp2wKZpJChE2NRlkZjP7+dEaLYzA9VQqv2Ff7cwkbAn4KzyneY0sWJn8v nyZ2bjZzzJYSYF5HnW7IJPKduwnpMB+PqiFe6ko3tqKiDfybNp+uf/TT4rlkg8beMfzQ lz3DDcDI8epb92B3q7jjjNNaj5V1mdpICQZSEZI+ly+mV+EEXRJdgJtPBfHSvuM8epMH NTLg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=5ODS1TFq9M93gWOQaGQ9WOy3EnmI/QAeYw+Gw8Sae8Y=; b=sh0Zl5p4VlSlG+jhyjj1PWRLLJCaCOouKfITmB87uPTAOpC0FuW0xhy6TnvPkiy/jl LGBhqbjUhcnLlwK6B105f/nFmHeDScwsAn5ACRJj+grHe9gK3NfZm2fpK3gSoUEEifcq 36d7952gCaml7Qo43XyeSpVdKyZ9L/0dg8oRaMIHJLqN7eeN0/h/Y1QjrTaYCA5m9HOR 73FGL04hy+qzQKl/PkqTX/D2gjn7MUrg/xq0EAnDSUUnEqs/C+cuqNhKDv3YXpIYBARh tCibUgvg3B8nggHDkXnl8OQdlauvd5E/jAKrFxtycEhqwfCZwNxirbjq3Xuk9S/yx5wZ Ivsw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b="X/r/XGA7"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id o7si3172236otk.185.2020.03.12.13.16.43; Thu, 12 Mar 2020 13:16:58 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b="X/r/XGA7"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726822AbgCLUQH (ORCPT + 99 others); Thu, 12 Mar 2020 16:16:07 -0400 Received: from mail-pj1-f67.google.com ([209.85.216.67]:39105 "EHLO mail-pj1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726558AbgCLUQG (ORCPT ); Thu, 12 Mar 2020 16:16:06 -0400 Received: by mail-pj1-f67.google.com with SMTP id d8so3107768pje.4 for ; Thu, 12 Mar 2020 13:16:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=5ODS1TFq9M93gWOQaGQ9WOy3EnmI/QAeYw+Gw8Sae8Y=; b=X/r/XGA7oK6/08aLFZb7cozSoP5Ox1n14lQ/b7pmAbgOSH2AocxqP3VTQCE0IZIiEd 7Zhj7R74pgBIrG0Pzay4Gkz6RDMMjmTiLZcvnzN2QPzxKKxO66lDtOrSjWHseef3d0pW ko9LF2Cs3ROzOveWcMSMwJHX2in4UoFIdhsLMreC837s5gRVUvllRNCA4ZrLFrBxGeIY CDMqpHqo4KrY+cEQ1gxsRqPg84TWv1Ivf1loHceNVXenOgD+P9rXKfQZsfS5270ulRIH VwlwZkGDfbH0TRQcqiZ2Nm4xTaxFFLWi70hGeLwQVHhbOmk6puQP1n0/RPatfLZ4wQ7T iNpA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition:in-reply-to:user-agent; bh=5ODS1TFq9M93gWOQaGQ9WOy3EnmI/QAeYw+Gw8Sae8Y=; b=fVzN55WYeZ3+bzQvOMtbRh9tTqePiO75Nt0JLRe3sk7JjVRih9jYK2GUbP11Q2NSY5 U+8HloI/j1MAt+r0jtSj/lHVhSMF84aBfs6bQctPIvVwA3VfGJJbOZGwNcdDOeVBVn8h BMbcZDOZtsWZ1pD9HaPyUra+l1EeUDKBbLvztRyVkdSLVAKr7XrCBunolSQuxAGayW5J h+BP4g7+629yrFywrZp7Hlohw9xqRMZ1xfR89Mx3OJIg2PJ08FItmh0nmGzQldpoKKeK MmmkNWGjGNseOWKVQ1LmKOwVdVT/cNFib+Qnl7B+gzcCQhJ0wANkOpF4jNn4Z6428YtB JVEQ== X-Gm-Message-State: ANhLgQ30k8wbKl3DSihSZXUFlCI4d1KH/UyMTTu1S1jDw3eebIKjFtF3 0oRmZRDhnuyFPkGZvsGV/iM= X-Received: by 2002:a17:90a:fa81:: with SMTP id cu1mr5864534pjb.192.1584044165162; Thu, 12 Mar 2020 13:16:05 -0700 (PDT) Received: from google.com ([2620:15c:211:1:3e01:2939:5992:52da]) by smtp.gmail.com with ESMTPSA id b10sm5078236pfo.215.2020.03.12.13.16.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 12 Mar 2020 13:16:03 -0700 (PDT) Date: Thu, 12 Mar 2020 13:16:02 -0700 From: Minchan Kim To: Michal Hocko Cc: Jann Horn , Linux-MM , kernel list , Daniel Colascione , Dave Hansen , "Joel Fernandes (Google)" , Andrew Morton Subject: Re: interaction of MADV_PAGEOUT with CoW anonymous mappings? Message-ID: <20200312201602.GA68817@google.com> References: <20200312082248.GS23944@dhcp22.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200312082248.GS23944@dhcp22.suse.cz> User-Agent: Mutt/1.12.2 (2019-09-21) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Mar 12, 2020 at 09:22:48AM +0100, Michal Hocko wrote: > [Cc akpm] > > So what about this? Thanks, Michal. I't likde to wait Jann's reply since Dave gave his opinion about the vulnerability. https://lore.kernel.org/linux-mm/cf95db88-968d-fee5-1c15-10d024c09d8a@intel.com/ Jann, could you give your insigh about that practically it's possible? A real dumb question to understand vulnerability: The attacker would be able to trigger heavy memory consumption so that he could make paging them out without MADV_PAGEOUT. I know MADV_PAGEOUT makes it easier but he still could do without MADV_PAGEOUT. What makes difference here? To clarify how MADV_PAGEWORK works: If other process has accessed the page so that his page table has access bit marked, MADV_PAGEOUT couldn't page it out. > > From eca97990372679c097a88164ff4b3d7879b0e127 Mon Sep 17 00:00:00 2001 > From: Michal Hocko > Date: Thu, 12 Mar 2020 09:04:35 +0100 > Subject: [PATCH] mm: do not allow MADV_PAGEOUT for CoW pages > > Jann has brought up a very interesting point [1]. While shared pages are > excluded from MADV_PAGEOUT normally, CoW pages can be easily reclaimed > that way. This can lead to all sorts of hard to debug problems. E.g. > performance problems outlined by Daniel [2]. There are runtime > environments where there is a substantial memory shared among security > domains via CoW memory and a easy to reclaim way of that memory, which > MADV_{COLD,PAGEOUT} offers, can lead to either performance degradation > in for the parent process which might be more privileged or even open > side channel attacks. The feasibility of the later is not really clear I am not sure it's a good idea to mention performance stuff because it's rather arguble. You and Johannes already pointed it out when I sbumit early draft which had shared page filtering out logic due to performance reason. You guys suggested the shared pages has higher chance to be touched so that if it's really hot pages, that whould keep in the memory. I agree. I think the only reason at this moment is just vulnerability. > to me TBH but there is no real reason for exposure at this stage. It > seems there is no real use case to depend on reclaiming CoW memory via > madvise at this stage so it is much easier to simply disallow it and > this is what this patch does. Put it simply MADV_{PAGEOUT,COLD} can > operate only on the exclusively owned memory which is a straightforward > semantic. > > [1] http://lkml.kernel.org/r/CAG48ez0G3JkMq61gUmyQAaCq=_TwHbi1XKzWRooxZkv08PQKuw@mail.gmail.com > [2] http://lkml.kernel.org/r/CAKOZueua_v8jHCpmEtTB6f3i9e2YnmX4mqdYVWhV4E=Z-n+zRQ@mail.gmail.com > > Signed-off-by: Michal Hocko > --- > mm/madvise.c | 12 +++++++++--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/mm/madvise.c b/mm/madvise.c > index 43b47d3fae02..4bb30ed6c8d2 100644 > --- a/mm/madvise.c > +++ b/mm/madvise.c > @@ -335,12 +335,14 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd, > } > > page = pmd_page(orig_pmd); > + > + /* Do not interfere with other mappings of this page */ How about this? /* * paging out only single mapped private pages for anonymous mapping, * otherwise, it opens a side channel. */ Otherwise, looks good to me. > + if (page_mapcount(page) != 1) > + goto huge_unlock; > + > if (next - addr != HPAGE_PMD_SIZE) { > int err; > > - if (page_mapcount(page) != 1) > - goto huge_unlock; > - > get_page(page); > spin_unlock(ptl); > lock_page(page); > @@ -426,6 +428,10 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd, > continue; > } > > + /* Do not interfere with other mappings of this page */ > + if (page_mapcount(page) != 1) > + continue; > + > VM_BUG_ON_PAGE(PageTransCompound(page), page); > > if (pte_young(ptent)) { > -- > 2.24.1 > > -- > Michal Hocko > SUSE Labs