Received: by 2002:a25:d7c1:0:0:0:0:0 with SMTP id o184csp1250029ybg; Fri, 18 Oct 2019 14:34:25 -0700 (PDT) X-Google-Smtp-Source: APXvYqx8VDr8Ur7LL8zpiwgZSjAaXyOkkMZnDP2N5X8u5L+3wVW2ZesnrZtjjYqmbcZ3T5QKp+x4 X-Received: by 2002:aa7:ca59:: with SMTP id j25mr12095854edt.78.1571434465872; Fri, 18 Oct 2019 14:34:25 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1571434465; cv=none; d=google.com; s=arc-20160816; b=k3irlwRFfITuDVzWrObwUUnopw4//3yec1pLty4S7VdcvDbjbRvU/N1VVP9C/8woWf lzmIWGrVn2iB3zgkbDKJXIlObljiY5jwpf0T1uz0d/VGgmqi+FGvvNcMC63yTIsLJAkX h8Jbgq0VTaBSMUYqRIZyWnVQBzv8N0igPC28oKl7RXk9gj2VABU7L39RA/oNFkh2iiMm 4dJc58PqJSBk2qhatzvLpLQlE1Dh4HroGOmEc2QnIVNNUnP4k3zmbSOeuNlOnvG7DGPG +D7SWyRBBbWbI8GVrc+oIs5uGOlH1nRAGUvRiVFRNridgFx08Ozn3+bTXVbaAzoaZWmJ ccAg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=qhptbpId7hBoCEy521ZldkxQoZ8KyT+B+QjPOyNo5f4=; b=KAFvMTsGHJaBuORaVuhUAYYOp2m4imgFV2Z8Qpbz+YspIzU85XEVnTEZxe/afVeJtE dKqz4Ex6JgqcEaYM7aF0Nwyjk1uqDoeLjovIloBFuP0dS7xbzSMTJBB8iiABNeMrOkjq cRnqImVB1pvEi225RUEK2F4A2IfMSOlBSkjUJLk7V1Eg2a5caoHUb0Cbf42dmGwzmZFE U9N6OEPoQ1txx5wiCTUtTauQ6kyNKw3Yy6WbL2fPPAvmxiWOy1wop9nHyBMlYHrqJWot FYArK19cLYGKLXediunsE8WmQo3icMZYe/33BmJxxtAzuJnl+wD/g3v2jxeXrYNXIhDU sAmA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=shdaf3RG; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id i40si4661672eda.225.2019.10.18.14.34.02; Fri, 18 Oct 2019 14:34:25 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=shdaf3RG; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440662AbfJQRUm (ORCPT + 99 others); Thu, 17 Oct 2019 13:20:42 -0400 Received: from mail-qt1-f196.google.com ([209.85.160.196]:37121 "EHLO mail-qt1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728639AbfJQRUm (ORCPT ); Thu, 17 Oct 2019 13:20:42 -0400 Received: by mail-qt1-f196.google.com with SMTP id n17so4721765qtr.4 for ; Thu, 17 Oct 2019 10:20:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=qhptbpId7hBoCEy521ZldkxQoZ8KyT+B+QjPOyNo5f4=; b=shdaf3RG+fIKoGa0z2l/NkUI+QSGKKpOKxxaiF90zWPhsnEc8EiAfpbnd2S9OWAPAJ bcXLZTUi9GwCYBdVmPOSWSCcd89tXJhLO+NdzZSsgLUKIlDDfCWi4Utzj7ZCjiVTRL60 +5iiurA6DXo4kxPiaGVaFeXWDicGzgQWmMmz2M5Fq0UfSVP7mmCcPRqR6d3rgrMxtWeD LP8FY/MFZcmObD4HT4SftqKmi5sey/YfVCLhObskA3iO0k5TI9IcGzGp3Hgx4pVhX0Qs peUUIw05kWHFGQ7VwaQWVpNVmgqrst7IIfpzuvN+eenxlKQPNeFghk3Ohm0mKY8VbctO UIjg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=qhptbpId7hBoCEy521ZldkxQoZ8KyT+B+QjPOyNo5f4=; b=h2IuL7174LuvvfJysRfT2xa2vpGuZ+ERZejiqTCykf34ffuXodWGrxubxT737wJGge ncKzFUQ8c4+Wg+hhJWciUfVKFP+TWs1WeFZxnhfkWrp+Ew5WHmQdxMUHyjcf8mMXT8xg J1kexkwogsneJv1MMr6L5AiEA519vvrnWfCO5vimNZu67E/bgzZzanKjap/S17uZQH1M eOWOQQErMtNLMAYv99LM26oSYkJA+idm+C4PksHMFekEcUt6GErYpXEa5ChLumT9c13h vUsbB2Oec4B9HEYWXnSoiXzZgJ0tKLBQbmh0eRHZ1KraewMzLNt7kPaYASq/QhIw0Yxr M46A== X-Gm-Message-State: APjAAAVOEg6NxPAuXFTFJcmYz0ZZZZL9nPzolV0kS4qz/64SH0FMjbYm n5bVUVtbOjBlzafeB9wd7GDOGcgAYgTVd5TqkW7orFthC7c= X-Received: by 2002:ad4:530b:: with SMTP id y11mr4937814qvr.12.1571332841511; Thu, 17 Oct 2019 10:20:41 -0700 (PDT) MIME-Version: 1.0 References: <20191016221148.F9CCD155@viggo.jf.intel.com> <496566a6-2581-17f4-a4f2-e5def7f97582@intel.com> In-Reply-To: <496566a6-2581-17f4-a4f2-e5def7f97582@intel.com> From: Yang Shi Date: Thu, 17 Oct 2019 10:20:30 -0700 Message-ID: Subject: Re: [PATCH 0/4] [RFC] Migrate Pages in lieu of discard To: Dave Hansen Cc: Shakeel Butt , Dave Hansen , LKML , Linux MM , Dan Williams , Jonathan Adams , "Chen, Tim C" Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Oct 17, 2019 at 7:26 AM Dave Hansen wrote: > > On 10/16/19 8:45 PM, Shakeel Butt wrote: > > On Wed, Oct 16, 2019 at 3:49 PM Dave Hansen wrote: > >> This set implements a solution to these problems. At the end of the > >> reclaim process in shrink_page_list() just before the last page > >> refcount is dropped, the page is migrated to persistent memory instead > >> of being dropped. > ..> The memory cgroup part of the story is missing here. Since PMEM is > > treated as slow DRAM, shouldn't its usage be accounted to the > > corresponding memcg's memory/memsw counters and the migration should > > not happen for memcg limit reclaim? Otherwise some jobs can hog the > > whole PMEM. > > My expectation (and I haven't confirmed this) is that the any memory use > is accounted to the owning cgroup, whether it is DRAM or PMEM. memcg > limit reclaim and global reclaim both end up doing migrations and > neither should have a net effect on the counters. Yes, your expectation is correct. As long as PMEM is a NUMA node, it is treated as regular memory by memcg. But, I don't think memcg limit reclaim should do migration since limit reclaim is used to reduce memory usage, but migration doesn't reduce usage, it just moves memory from one node to the other. In my implementation, I just skip migration for memcg limit reclaim, please see: https://lore.kernel.org/linux-mm/1560468577-101178-7-git-send-email-yang.shi@linux.alibaba.com/ > > There is certainly a problem here because DRAM is a more valuable > resource vs. PMEM, and memcg accounts for them as if they were equally > valuable. I really want to see memcg account for this cost discrepancy > at some point, but I'm not quite sure what form it would take. Any > feedback from you heavy memcg users out there would be much appreciated. We did have some demands to control the ratio between DRAM and PMEM as I mentioned in LSF/MM. Mel Gorman did suggest make memcg account DRAM and PMEM respectively or something similar. > > > Also what happens when PMEM is full? Can the memory migrated to PMEM > > be reclaimed (or discarded)? > > Yep. The "migration path" can be as long as you want, but once the data > hits a "terminal node" it will stop getting migrated and normal discard > at the end of reclaim happens. I recalled I had a hallway conversation with Keith about this in LSF/MM. We all agree there should be not a cycle. But, IMHO, I don't think exporting migration path to userspace (or letting user to define migration path) and having multiple migration stops are good ideas in general. >