Date: Wed, 14 Aug 2019 12:32:03 -0400
From: Joel Fernandes
To: Michal Hocko
Cc: khlebnikov@yandex-team.ru, linux-kernel@vger.kernel.org, Minchan Kim,
 Alexey Dobriyan, Andrew Morton, Borislav Petkov, Brendan Gregg,
 Catalin Marinas, Christian Hansen, dancol@google.com, fmayer@google.com,
 "H. Peter Anvin", Ingo Molnar, Jonathan Corbet, Kees Cook,
 kernel-team@android.com, linux-api@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, Mike Rapoport, namhyung@google.com,
 paulmck@linux.ibm.com, Robin Murphy, Roman Gushchin, Stephen Rothwell,
 surenb@google.com, Thomas Gleixner, tkjos@google.com, Vladimir Davydov,
 Vlastimil Babka, Will Deacon
Subject: Re: [PATCH v5 2/6] mm/page_idle: Add support for handling swapped PG_Idle pages
Message-ID: <20190814163203.GB59398@google.com>
References: <20190807171559.182301-1-joel@joelfernandes.org>
 <20190807171559.182301-2-joel@joelfernandes.org>
 <20190813150450.GN17933@dhcp22.suse.cz>
 <20190813153659.GD14622@google.com>
 <20190814080531.GP17933@dhcp22.suse.cz>
In-Reply-To: <20190814080531.GP17933@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 14, 2019 at 10:05:31AM
+0200, Michal Hocko wrote:
> On Tue 13-08-19 11:36:59, Joel Fernandes wrote:
> > On Tue, Aug 13, 2019 at 05:04:50PM +0200, Michal Hocko wrote:
> > > On Wed 07-08-19 13:15:55, Joel Fernandes (Google) wrote:
> > > > Idle page tracking currently does not work well in the following
> > > > scenario:
> > > >  1. mark page-A idle which was present at that time.
> > > >  2. run workload
> > > >  3. page-A is not touched by workload
> > > >  4. *sudden* memory pressure happen so finally page A is finally swapped out
> > > >  5. now see the page A - it appears as if it was accessed (pte unmapped
> > > >     so idle bit not set in output) - but it's incorrect.
> > > >
> > > > To fix this, we store the idle information into a new idle bit of the
> > > > swap PTE during swapping of anonymous pages.
> > > >
> > > > Also in the future, madvise extensions will allow a system process
> > > > manager (like Android's ActivityManager) to swap pages out of a process
> > > > that it knows will be cold. To an external process like a heap profiler
> > > > that is doing idle tracking on another process, this procedure will
> > > > interfere with the idle page tracking similar to the above steps.
> > >
> > > This could be solved by checking the !present/swapped out pages
> > > right? Whoever decided to put the page out to the swap just made it
> > > idle effectively. So the monitor can make some educated guess for
> > > tracking. If that is fundamentally not possible then please describe
> > > why.
> >
> > But the monitoring process (profiler) does not have control over the 'whoever
> > made it effectively idle' process.
>
> Why does that matter? Whether it is a global/memcg reclaim or somebody
> calling MADV_PAGEOUT or whatever it is a decision to make the page not
> hot. Sure you could argue that a missing idle bit on swap entries might
> mean that the swap out decision was pre-mature/sub-optimal/wrong but is
> this the aim of the interface?
>
> > As you said it will be a guess, it will not be accurate.
>
> Yes and the point I am trying to make is that having some space and not
> giving a guarantee sounds like a safer option for this interface because

I do see your point of view, but just because a future (and possibly not
going to happen) usecase, which you mentioned as pte reclaim, makes you feel
that userspace may be subject to inaccuracies anyway, that doesn't mean we
should make everything inaccurate. We already know idle page tracking is not
completely accurate. But that doesn't mean we should miss out on the
opportunity to make the "non pte-reclaim" usecase accurate. IMO, we should
do our best for today, and not hypothesize. How likely is pte reclaim, and
is there a thread that describes that direction?

> > I am curious what is your concern with using a bit in the swap PTE?
>
> ... It is a promiss of the semantic I find limiting for future. The bit
> in the pte might turn out insufficient (e.g. pte reclaim) so teaching
> the userspace to consider this a hard guarantee is a ticket to problems
> later on. Maybe I am overly paranoid because I have seen so many "nice
> to have" features turning into a maintenance burden in the past.
>
> If this is really considered mostly debugging purpouse interface then a
> certain level of imprecision should be tolerateable. If there is a
> really strong real world usecase that simply has no other way to go
> then this might be added later. Adding an information is always safer
> than take it away.
>
> That being said, if I am a minority voice here then I will not really
> stand in the way and won't nack the patch. I will not ack it neither
> though.

Ok.

thanks,

 - Joel