Date: Tue, 13 Aug 2019 15:18:11 -0400
From: Joel Fernandes
To: Daniel Gruss
Cc: Jann Horn, Michal Hocko, kernel list, Alexey Dobriyan,
    Andrew Morton, Borislav Petkov, Brendan Gregg, Catalin Marinas,
    Christian Hansen, Daniel Colascione, fmayer@google.com,
    "H. Peter Anvin", Ingo Molnar, Jonathan Corbet, Kees Cook,
    kernel-team, Linux API, linux-doc@vger.kernel.org, linux-fsdevel,
    Linux-MM, Mike Rapoport, Minchan Kim, namhyung@google.com,
    "Paul E. McKenney", Robin Murphy, Roman Gushchin, Stephen Rothwell,
    Suren Baghdasaryan, Thomas Gleixner, Todd Kjos, Vladimir Davydov,
    Vlastimil Babka, Will Deacon
Subject: Re: [PATCH v5 1/6] mm/page_idle: Add per-pid idle page tracking using virtual index
Message-ID: <20190813191811.GA117503@google.com>
References: <20190807171559.182301-1-joel@joelfernandes.org> <20190813100856.GF17933@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Aug 13, 2019 at 05:34:16PM +0200, Daniel Gruss wrote:
> On 8/13/19 5:29 PM, Jann Horn wrote:
> > On Tue, Aug 13, 2019 at 12:09 PM Michal Hocko wrote:
> >> On Mon 12-08-19 20:14:38, Jann Horn wrote:
> >>> On Wed, Aug 7, 2019 at 7:16 PM Joel Fernandes (Google)
> >>> wrote:
> >>>> The page_idle tracking feature currently requires looking up the pagemap
> >>>> for a process followed by interacting with /sys/kernel/mm/page_idle.
> >>>> Looking up PFN from pagemap in Android devices is not supported by
> >>>> unprivileged process and requires SYS_ADMIN and gives 0 for the PFN.
> >>>>
> >>>> This patch adds support to directly interact with page_idle tracking at
> >>>> the PID level by introducing a /proc//page_idle file. It follows
> >>>> the exact same semantics as the global /sys/kernel/mm/page_idle, but now
> >>>> looking up PFN through pagemap is not needed since the interface uses
> >>>> virtual frame numbers, and at the same time also does not require
> >>>> SYS_ADMIN.
> >>>>
> >>>> In Android, we are using this for the heap profiler (heapprofd) which
> >>>> profiles and pin points code paths which allocates and leaves memory
> >>>> idle for long periods of time. This method solves the security issue
> >>>> with userspace learning the PFN, and while at it is also shown to yield
> >>>> better results than the pagemap lookup, the theory being that the window
> >>>> where the address space can change is reduced by eliminating the
> >>>> intermediate pagemap look up stage. In virtual address indexing, the
> >>>> process's mmap_sem is held for the duration of the access.
> >>>
> >>> What happens when you use this interface on shared pages, like memory
> >>> inherited from the zygote, library file mappings and so on? If two
> >>> profilers ran concurrently for two different processes that both map
> >>> the same libraries, would they end up messing up each other's data?
> >>
> >> Yup PageIdle state is shared. That is the page_idle semantic even now
> >> IIRC.
> >>
> >>> Can this be used to observe which library pages other processes are
> >>> accessing, even if you don't have access to those processes, as long
> >>> as you can map the same libraries? I realize that there are already a
> >>> bunch of ways to do that with side channels and such; but if you're
> >>> adding an interface that allows this by design, it seems to me like
> >>> something that should be gated behind some sort of privilege check.
> >>
> >> Hmm, you need to be priviledged to get the pfn now and without that you
> >> cannot get to any page so the new interface is weakening the rules.
> >> Maybe we should limit setting the idle state to processes with the write
> >> status. Or do you think that even observing idle status is useful for
> >> practical side channel attacks? If yes, is that a problem of the
> >> profiler which does potentially dangerous things?
> >
> > I suppose read-only access isn't a real problem as long as the
> > profiler isn't writing the idle state in a very tight loop... but I
> > don't see a usecase where you'd actually want that? As far as I can
> > tell, if you can't write the idle state, being able to read it is
> > pretty much useless.
> >
> > If the profiler only wants to profile process-private memory, then
> > that should be implementable in a safe way in principle, I think, but
> > since Joel said that they want to profile CoW memory as well, I think
> > that's inherently somewhat dangerous.
>
> I agree that allowing profiling of shared pages would leak information.

Will think more about it. If we limit it to private pages, then it
could become useless. Consider this scenario: a process allocates some
memory, then forks a bunch of worker processes that read that memory
and perform some work with it. Per-PID page idle tracking is now run on
the parent process. The pages should then appear to be actively
accessed (not idle). If we don't track shared pages, then we cannot
tell whether those pages are really leaked memory, or whether they are
there for a purpose and are actively used.

> To me the use case is not entirely clear. This is not a feature that
> would normally be run in everyday computer usage, right?

Generally, this is to be used as a debugging feature that helps
developers detect memory leaks in their programs.

thanks,

 - Joel
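
The parent/worker scenario above can be sketched with a toy model. This is not kernel code and touches no kernel interface: the `PhysPage` and `Process` classes and the `mark_idle`/`idle_vfns` helpers are hypothetical names made up for illustration. The only property taken from the thread is that the idle state belongs to the physical page, so every process mapping that page sees and affects the same flag. Clearing the flag directly on access is a simplification of what the kernel actually does.

```python
# Toy model only: illustrates shared idle state, not the real kernel mechanism.
# All class and function names here are hypothetical.

class PhysPage:
    """One physical page. The idle flag lives here, so all mappers share it."""
    def __init__(self):
        self.idle = False


class Process:
    """Maps virtual frame numbers (VFNs) to physical pages."""
    def __init__(self, name, mappings=None):
        self.name = name
        self.mappings = dict(mappings or {})

    def fork(self, name):
        # The child maps the same physical pages as the parent. The model
        # only covers reads, so copy-on-write never breaks the sharing.
        return Process(name, self.mappings)

    def read(self, vfn):
        # Simplification: an access through any mapping clears the idle flag.
        self.mappings[vfn].idle = False


def mark_idle(proc, vfns):
    """Mark the pages behind the given VFNs of `proc` as idle."""
    for vfn in vfns:
        proc.mappings[vfn].idle = True


def idle_vfns(proc, vfns):
    """Return the VFNs of `proc` whose backing pages are still idle."""
    return [vfn for vfn in vfns if proc.mappings[vfn].idle]


# The parent allocates four pages, then forks two workers that share them.
parent = Process("parent", {vfn: PhysPage() for vfn in range(4)})
workers = [parent.fork(f"worker{i}") for i in range(2)]

# A profiler marks all of the parent's pages idle; the workers then read some.
mark_idle(parent, range(4))
workers[0].read(0)
workers[1].read(2)

# Tracking the parent shows pages 0 and 2 as active even though only the
# workers touched them; pages 1 and 3 remain leak candidates.
still_idle = idle_vfns(parent, range(4))
print(still_idle)  # [1, 3]
```

The same sharing cuts both ways: it is what lets tracking on the parent see that the workers are using the memory, and it is also why a process that maps the same pages could observe another process's accesses, which is the concern raised in the quoted discussion.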