Received: by 2002:a25:e7d8:0:0:0:0:0 with SMTP id e207csp1098631ybh; Tue, 10 Mar 2020 14:41:34 -0700 (PDT) X-Google-Smtp-Source: ADFU+vs7n1zvpNw0La7PXWBWqSzZv3tdHC7FiBcVWpDvP5PLLxJHQOeAk3WFkehJPMVCcoDO9ZK9 X-Received: by 2002:a9d:5f7:: with SMTP id 110mr12567579otd.73.1583876494440; Tue, 10 Mar 2020 14:41:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1583876494; cv=none; d=google.com; s=arc-20160816; b=P3DfOtAgWga12qL32JpCckTnc7FhVKfpUfjHcdIts5OSVa8UyUhDQI8ghqiHv1D3gk D7F30s0uuAckB0tbYf5p3029CRhdNc+46HkZgsOvahg8bITDKpGbcwg90b+z7rv6m8CB 8MKIOqr1AHr6acZBzGRTcNcq+8mefpdUht9Patf83dDbCwAAGPqZndOZKmg71a+pEolD DkrAgC2lxqY0O6wAdUhOI+jrO5ojCz/30XiBJAMZbV4eu1JpCx5KLkRPhXCAtUja5PcN LlYRcBa4EUUjGAs/A24bUfFkZ6/h6rppDQx40ew9gylXblqDxzahFDOEFRFQ45kwYy3w 1LUw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=30v28QZwvgciyeUbGcFniNwVinb7jSfmAcitW4ywOgw=; b=pRYJubfsEy6nJmcFAbdWtDcFOk5e5qPgd6/xhN5eVVEktdkWA1iErjP3lEyMBWSlaN 4O0mc2jA5HGt1B5PFnBuueRmK2b48Oz3LvmRbdF3CJVqZNrLr+vLt10Ontmfww8gN0Uo zeIEzC16Vw6Ab2ZDKeswud16AdMr75jWQqDu0Oan5cJ5ZlawBbRSOVP+2gfKKoiMQRPV wzhOGy6JYSTWtxafAtb5gPL4tL5h9hot0I6QmFplHFRYxXE/O3cwPBj02PERCO5vKKft v21FuTq2aX/0GUlv+8ft/iJfvXj94IOyQvaW22XeqS/hflXOx+gJdg2wTj7SiyKhWtXM Mnhg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=pVHbW9YK; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id q16si9051843otm.231.2020.03.10.14.41.22; Tue, 10 Mar 2020 14:41:34 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=pVHbW9YK; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727700AbgCJVk2 (ORCPT + 99 others); Tue, 10 Mar 2020 17:40:28 -0400 Received: from mail-ot1-f54.google.com ([209.85.210.54]:37038 "EHLO mail-ot1-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726268AbgCJVk2 (ORCPT ); Tue, 10 Mar 2020 17:40:28 -0400 Received: by mail-ot1-f54.google.com with SMTP id b3so14763908otp.4 for ; Tue, 10 Mar 2020 14:40:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=30v28QZwvgciyeUbGcFniNwVinb7jSfmAcitW4ywOgw=; b=pVHbW9YKRfFKZADsnAMJUGWLLN4457YVQMmhbheWBFB3MwtHdAS3MmRJYpyKaJnsOb rFf6OXsUpkBbr5iDyWHykufITmHwEqrVNRlJqU63iuzLFlg1Z9MIRUH7UWE4GgLt0miz 5WTmAnuC4FApvZNx+enghhHzYN0PInrSX/8HNHkBs5vuFg6tkrFnlhEmj47EoYVX5IjP YOccE9PudP0FfqfHBvzgCI8LKHZgbRKKToWxImEyQlMGHTFJqk+EXMcAF0uDr+n5LQKg fy1W/aiXBgKwdLBhXziDEqmkP5FSQOg05yyWFYcle69NPG2P47ndKV1i5Wvpy21K2sEQ X47w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=30v28QZwvgciyeUbGcFniNwVinb7jSfmAcitW4ywOgw=; b=HTQifvdQrH2cCPNSguMleJuDPLVSPPKmrXeVN8jE39xeVamRdOgiJMu8Svgl1Fp3m4 A9SG4HrxooJWAdCseVq2alcGOfjj/tOi30IBy4jpBAL8za6kEhQztO6imi7cMlDNTZIB H+G/EkRfJ5QfMdrmyxKReAemMjzkIQpcLE8+YuDU+u5T7YapPQuPHqD/ce020a6rnb/s JdjAbA4uqgzk8dGxDMCU2flCIy16yx2W2d+hzxNaLBc50kBKmKYXPC2GauoUUGjtVDPh rFfm0/AvA1oqf6K5yU5GnxQMtdE9VJ75K8jxmbQA+12Etfktx0wI70+kVf0JX/ttu1B1 j+bA== X-Gm-Message-State: ANhLgQ1gaq9Q5mHBHdB1zPAoiBmofJv2bEtT8iZXSA69EYCR+vFxzeQX O09uS3BP8iqrIR7arUKh6KotAlYP2bAFtesfp4EPgA== X-Received: by 2002:a9d:6654:: with SMTP id q20mr5592078otm.180.1583876427094; Tue, 10 Mar 2020 14:40:27 -0700 (PDT) MIME-Version: 1.0 References: <20200310184814.GA8447@dhcp22.suse.cz> In-Reply-To: From: Jann Horn Date: Tue, 10 Mar 2020 22:40:00 +0100 Message-ID: Subject: Re: interaction of MADV_PAGEOUT with CoW anonymous mappings? To: Daniel Colascione Cc: Michal Hocko , Minchan Kim , Linux-MM , kernel list , Dave Hansen , "Joel Fernandes (Google)" Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Mar 10, 2020 at 9:19 PM Daniel Colascione wrote: > On Tue, Mar 10, 2020 at 11:48 AM Michal Hocko wrote: > > On Tue 10-03-20 19:08:28, Jann Horn wrote: > > > >From looking at the source code, it looks to me as if using > > > MADV_PAGEOUT on a CoW anonymous mapping will page out the page if > > > possible, even if other processes still have the same page mapped. Is > > > that correct? > > > > > > If so, that's probably bad in environments where many processes (with > > > different privileges) are forked from a single zygote process (like > > > Android and Chrome), I think? If you accidentally call it on a CoW > > > anonymous mapping with shared pages, you'll degrade the performance of > > > other processes. And if an attacker does it intentionally, they could > > > use that to aid with exploiting race conditions or weird > > > microarchitectural stuff (e.g. the new https://lviattack.eu/lvi.pdf > > > talks about "the assumption that attackers can provoke page faults or > > > microcode assists for (arbitrary) load operations in the victim > > > domain"). > > > > > > Should madvise_cold_or_pageout_pte_range() maybe refuse to operate on > > > pages with mapcount>1, or something like that? Or does it already do > > > that, and I just missed the check? > > > > I have brought up side channel attacks earlier [1] but only in the > > context of shared page cache pages. I didn't really consider shared > > anonymous pages to be a real problem. I was under impression that CoW > > pages shouldn't be a real problem because any security sensible > > applications shouldn't allow untrusted code to be forked and CoW > > anything really important. I believe we have made this assumption > > in other places - IIRC on gup with FOLL_FORCE but I admit I have > > very happily forgot most details. > > I'm more worried about the performance implications. Consider > libc.so's data section: that's a COW mapping, and we COW it during > zygote initialization as we load and relocate libc.so. Child processes > shouldn't be dirtying and re-COWing those relocated pages. If I > understand Jann's message correctly, MADV_PAGEOUT would force the > pages corresponding to the libc.so data segment out to zram just > because we MADV_PAGEOUT-ed a single process that happened to use libc. > We should leave those pages in memory, IMHO. Actually, the libc.so data section is a file mapping, so I think can_do_pageout() would decide whether the caller is allowed to force pageout based on whether the caller is the owner of (or capable over) libc (in other words, root, basically). But I think the bss section, as well as heap memory, could have pageout forced by anyone.