Received: by 2002:a25:e7d8:0:0:0:0:0 with SMTP id e207csp3246340ybh; Mon, 16 Mar 2020 19:06:35 -0700 (PDT) X-Google-Smtp-Source: ADFU+vuHS8H60vyDepGFL1Zqao0vy/+RGdPIwBM+/Nbtjg93UkVZ+6M61cH4pAXZcpVIaylmcGAQ X-Received: by 2002:a9d:b8f:: with SMTP id 15mr1824116oth.256.1584410795544; Mon, 16 Mar 2020 19:06:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1584410795; cv=none; d=google.com; s=arc-20160816; b=QU6g/il8Kbbu2yBO8B+N8otuKt1Xk2H1jUsBJfkp+teFOmP2s8KF94V8nWowhkOLGQ Q7eOCUmaYTKhcGA/WBtEk/ioRuI1yUZKSWH0MPiqApBTdW+BpEf9aKoSqcN5iu5aDvG9 /LXM7rxLPLd+jc8DEOH0h9zszYI4Xa1taAQD0KKS8jC+CjBS4xvFGTYVz8KcPVtz4bwz COKnfs2jlRhOjs5NfNy/QWlr8Noe/R/C55lUpzYfSW9eijRRO04jEohsAQNBOwIr6IUO 3ljOaHoY3CpeGhtv15XtbqQlkfedaczcnosD5jWkZGj+qBr3YSEs4yMy4mDp44OazE2H wAKw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=YPrE3TC9cvR5eow6UPNLAi081AyK8g0VypG98zq/3rI=; b=vVYZk4EWfoLhuzLwTDKohdQIIGupEH0V//cepaQl/p0/lkrkWfqPm9w927x6gJbmEU ana/xbZXPRi8DMAK3U/ywpOO51nTaQ9xonsJJj+GfCqS8plmEJQlAkLwRnKt7dD4UoRn m3xh+W8Djrq+hWc2XY9I5wk25HDlaJtmkMrgHkr7oYh8FbX0MzFotOzwoHrszNerEQ7H Wqg/gX9XbkKLduOfVV//4r8JZCX2f9A0wuXS2SfWwg8pEZon3CcdQFb6t8WGDdVKA1t+ rVrUwZ8bia2GfjqI2HwEHOaRDdq/s9FoIxIhvQLsXkWTKXVgdddF3s/3ErWDMk7ctx6V HlxQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b=Qs7KHkvG; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id n20si1020861otf.63.2020.03.16.19.06.23; Mon, 16 Mar 2020 19:06:35 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b=Qs7KHkvG; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726695AbgCQBnq (ORCPT + 99 others); Mon, 16 Mar 2020 21:43:46 -0400 Received: from mail-pg1-f181.google.com ([209.85.215.181]:33132 "EHLO mail-pg1-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726559AbgCQBnq (ORCPT ); Mon, 16 Mar 2020 21:43:46 -0400 Received: by mail-pg1-f181.google.com with SMTP id m5so10814568pgg.0 for ; Mon, 16 Mar 2020 18:43:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=YPrE3TC9cvR5eow6UPNLAi081AyK8g0VypG98zq/3rI=; b=Qs7KHkvGrxolIHTiAhyQGd2igPK8/UgI2BqNWe8908q+A6DbfydjZ89uzH8KvnLNG7 bxKuwqDsn/gl0UbTRoIEIZgf6ViRSq9GQRIrCNibxeR52oKm7yVGsTqWvBckF5PHJ5Hw WjfnQU+Q+v973TV+wdXPdsOtge82uAMlVNlwqb0OqohYOi47Kht7dzaKMteRyAVcRBSp 868NkrAAF0c8BBzyHhbBv8B12BDOldVVIZCDBQwFK5mmln13hICY8/+hhp5dHjlgoRnM ezqNu3oeHGObMUlKU2uaEYQ+soCYl5bKQt2P0xnFoSM5k6hTgmmCYfVgrhlulFRMqIFD rWRQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition:in-reply-to:user-agent; bh=YPrE3TC9cvR5eow6UPNLAi081AyK8g0VypG98zq/3rI=; b=NAQNbXSOwLtCVhOrhdgl3bmx7740KXObwVncn6Z2R64Xyzpl0BXCSyb+mgpbfRTzVw 0y1MaTzFdtSxT9fzaERsttokmrU2r2HceYRsdDa3U9yOFJMAq5frLwOfuMtlCevTILhF lkDw7hDK6HDrXGGx3cnQIwm/6/d5CpXf8i3ovNJr0o0huZzjyY3OESGLFU9SMiEHAO3Z PKTxPtG+r0VpGfPenbzEpAv89bM6GtjoJZwtl1cykhzmpej6/qcQybUhQm2AkNizV+Gt Pql5/2WI2TUn6fg4d7Jrd1R5Jt9Pfd0RRo+WnL9lAQbjnCsJwnb8D9HoPbLiUVb3mVxj zusw== X-Gm-Message-State: ANhLgQ3jtCFiBiuEQNPD9GleCcxbF3C4T2wI6q1jNNhvZVKWbSdP6Ygx 3xFdxffqsGxEmmGxGjgduwM= X-Received: by 2002:a63:1f58:: with SMTP id q24mr2755192pgm.334.1584409424318; Mon, 16 Mar 2020 18:43:44 -0700 (PDT) Received: from google.com ([2620:15c:211:1:3e01:2939:5992:52da]) by smtp.gmail.com with ESMTPSA id d84sm1031512pfd.197.2020.03.16.18.43.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 16 Mar 2020 18:43:42 -0700 (PDT) Date: Mon, 16 Mar 2020 18:43:40 -0700 From: Minchan Kim To: Michal Hocko Cc: Jann Horn , Linux-MM , kernel list , Daniel Colascione , Dave Hansen , "Joel Fernandes (Google)" , Andrew Morton Subject: Re: interaction of MADV_PAGEOUT with CoW anonymous mappings? Message-ID: <20200317014340.GA73302@google.com> References: <20200312082248.GS23944@dhcp22.suse.cz> <20200312201602.GA68817@google.com> <20200312204155.GE23944@dhcp22.suse.cz> <20200313020851.GD68817@google.com> <20200313080546.GA21007@dhcp22.suse.cz> <20200313205941.GA78185@google.com> <20200316092052.GD11482@dhcp22.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200316092052.GD11482@dhcp22.suse.cz> User-Agent: Mutt/1.12.2 (2019-09-21) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Mar 16, 2020 at 10:20:52AM +0100, Michal Hocko wrote: > On Fri 13-03-20 13:59:41, Minchan Kim wrote: > > On Fri, Mar 13, 2020 at 09:05:46AM +0100, Michal Hocko wrote: > > > On Thu 12-03-20 19:08:51, Minchan Kim wrote: > > > > On Thu, Mar 12, 2020 at 09:41:55PM +0100, Michal Hocko wrote: > > > > > On Thu 12-03-20 13:16:02, Minchan Kim wrote: > > > > > > On Thu, Mar 12, 2020 at 09:22:48AM +0100, Michal Hocko wrote: > > > > > [...] > > > > > > > From eca97990372679c097a88164ff4b3d7879b0e127 Mon Sep 17 00:00:00 2001 > > > > > > > From: Michal Hocko > > > > > > > Date: Thu, 12 Mar 2020 09:04:35 +0100 > > > > > > > Subject: [PATCH] mm: do not allow MADV_PAGEOUT for CoW pages > > > > > > > > > > > > > > Jann has brought up a very interesting point [1]. While shared pages are > > > > > > > excluded from MADV_PAGEOUT normally, CoW pages can be easily reclaimed > > > > > > > that way. This can lead to all sorts of hard to debug problems. E.g. > > > > > > > performance problems outlined by Daniel [2]. There are runtime > > > > > > > environments where there is a substantial memory shared among security > > > > > > > domains via CoW memory and a easy to reclaim way of that memory, which > > > > > > > MADV_{COLD,PAGEOUT} offers, can lead to either performance degradation > > > > > > > in for the parent process which might be more privileged or even open > > > > > > > side channel attacks. The feasibility of the later is not really clear > > > > > > > > > > > > I am not sure it's a good idea to mention performance stuff because > > > > > > it's rather arguble. You and Johannes already pointed it out when I sbumit > > > > > > early draft which had shared page filtering out logic due to performance > > > > > > reason. You guys suggested the shared pages has higher chance to be touched > > > > > > so that if it's really hot pages, that whould keep in the memory. I agree. > > > > > > > > > > Yes, the hot memory is likely to be referenced but the point was an > > > > > unexpected latency because of the major fault. I have to say that I have > > > > > > > > I don't understand your point here. If it's likely to be referenced > > > > among several processes, it doesn't have the major fault latency. > > > > What's your point here? > > > > > > a) the particular CoW page might be cold enough to be reclaimed and b) > > > > If it is, that means it's *cold* so it's really worth to be reclaimed. > > > > > nothing really prevents the MADV_PAGEOUT to be called faster than the > > > reference bit being readded. > > > > Yeb, that's undesirable. I should admit it was not intended when I implemented > > PAGEOUT. The thing is page_check_references clears access bit of pte for every > > process are sharing the page so that two times MADV_PAGEOUT from a process could > > evict the page. That's the really bug. > > I do not really think this is a bug. This is a side effect of the > reclaim process and we do not really want MADV_{PAGEOUT,COLD} behave No, that's the bug since we didn't consider the side effect. > differently here because then the behavior would be even harder to No, I do want to have difference because it's per-process hint. IOW, what he know is for only his context, not others so it shouldn't clean others' pte. That makes difference between LRU aging and the hint. > understand. It's not hard to understand.. MADV_PAGEOUT should consider only his context since it's per-process hint(Even, he couldn't know others' context) so it shouldn't bother others. Actually, Dave's suggestion is correct to fix the issue if there was no isse with side channel attack. However, due to the attack issue, page_mapcount could prevent the problem effectively. That's why I am not against of the patch now since it fixes the bug as well as vulnerability.