Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp2873904ybf; Mon, 2 Mar 2020 17:32:16 -0800 (PST) X-Google-Smtp-Source: ADFU+vv+TXgdysAiUsJae1Xky458HOuBzpZ5oV2qmhx3L22LTWAqltYMcV1ItS3M8hOAN5ZjdPZI X-Received: by 2002:aca:524a:: with SMTP id g71mr891470oib.7.1583199136727; Mon, 02 Mar 2020 17:32:16 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1583199136; cv=none; d=google.com; s=arc-20160816; b=V6shT1RQGoXZlzvDFIZ/+JEagaHUctOdqJm5kyeBJiOSXaXp8/sBR0Pa7IWojOv5OY VCzFi/1EwUG/yohFS2rMtRQlcjrgLz1J9l2MwkenbjkQ9wklQ5PNQ7Ks3kZ+x5e7vSII U6/C1/D6nN2o0K8ADwNuuWqQqecRQhz4YgbWhYHD42zOpKqeiGnRhYap8aC128tNKAgj QC1AojyUDqDjm02UQ/RCI0xtlUvh0zVHb8urpncYABb85OhrXWy4vWkr9LTRNq/5Su8J jjOM4BO4ePsnX+95pgRVDQxDVOLSfYH0o7keDKr3XOyuK7a/c5bI2zFGxiklFFLXKx7a xGtA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:user-agent:message-id :in-reply-to:date:references:subject:cc:to:from; bh=vc+VoLjaV4CF8ZKBHYBcUOs0+0TkVDwNUj8zae/ZYgQ=; b=FOqA80JMj5HPL6ICE0BLHJiedgTQGmmZQG3tPmtURKj5VQsxs7Jrw0vKSXwSQNb2bE wOv09BdasAe7/VTQnHFNvK17wkSTnXGoMN0UliOmVFpzAEFtSsqb1CrrK5WEPjo7lUXr DfyxiXnDOWncWvsvcdEYi0/nolw4UaSkpLMfFY2AYLRzMOGIuSmOObLcq2Smx9pzBt4X 29H+YanvN6/EVWkg8HPjLcq10n+YEC1N1faVJDzcpwLeQuV+i64DucxQ6nwn7h//DvgS YQbBRYw9mOfvloVUeF6qf9oolsyKIgy77QHDQZ6uA5ui73/sEP7K5bc0wWzSx5fA/6Dg 6QMQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id u15si2739817oic.269.2020.03.02.17.32.03; Mon, 02 Mar 2020 17:32:16 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727069AbgCCBad (ORCPT + 99 others); Mon, 2 Mar 2020 20:30:33 -0500 Received: from mga18.intel.com ([134.134.136.126]:22374 "EHLO mga18.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726755AbgCCBad (ORCPT ); Mon, 2 Mar 2020 20:30:33 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga106.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 02 Mar 2020 17:30:32 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,509,1574150400"; d="scan'208";a="258195400" Received: from yhuang-dev.sh.intel.com (HELO yhuang-dev) ([10.239.159.23]) by orsmga002.jf.intel.com with ESMTP; 02 Mar 2020 17:30:28 -0800 From: "Huang\, Ying" To: Michal Hocko Cc: David Hildenbrand , Matthew Wilcox , Andrew Morton , , , Mel Gorman , Vlastimil Babka , Zi Yan , Peter Zijlstra , Dave Hansen , Minchan Kim , "Johannes Weiner" , Hugh Dickins , "Alexander Duyck" Subject: Re: [RFC 0/3] mm: Discard lazily freed pages when migrating References: <20200228033819.3857058-1-ying.huang@intel.com> <20200228034248.GE29971@bombadil.infradead.org> <87a7538977.fsf@yhuang-dev.intel.com> <871rqf850z.fsf@yhuang-dev.intel.com> <20200228095048.GK3771@dhcp22.suse.cz> <87d09u7sm2.fsf@yhuang-dev.intel.com> <20200302142549.GO4380@dhcp22.suse.cz> Date: Tue, 03 Mar 2020 09:30:28 +0800 In-Reply-To: <20200302142549.GO4380@dhcp22.suse.cz> (Michal Hocko's message of "Mon, 2 Mar 2020 15:25:49 +0100") Message-ID: <874kv66x8r.fsf@yhuang-dev.intel.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=ascii Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Michal Hocko writes: > On Mon 02-03-20 22:12:53, Huang, Ying wrote: >> Michal Hocko writes: > [...] >> > And MADV_FREE pages are a kind of cache as well. If the target node is >> > short on memory then those will be reclaimed as a cache so a >> > pro-active freeing sounds counter productive as you do not have any >> > idea whether that cache is going to be used in future. In other words >> > you are not going to free a clean page cache if you want to use that >> > memory as a migration target right? So you should make a clear case >> > about why MADV_FREE cache is less important than the clean page cache >> > and ideally have a good justification backed by real workloads. >> >> Clean page cache still have valid contents, while clean MADV_FREE pages >> has no valid contents. So penalty of discarding the clean page cache is >> reading from disk, while the penalty of discarding clean MADV_FREE pages >> is just page allocation and zeroing. > > And "just page allocation and zeroing" overhead is the primary > motivation to keep the page in memory. It is a decision of the workload > to use MADV_FREE because chances are that this will speed things up. All > that with a contract that the memory goes away under memory pressure so > with a good workload/memory sizing you do not really lose that > optimization. Now you want to make decision on behalf of the consumer of > the MADV_FREE memory. I understand that MADV_FREE helps in some situations. And if the application want to keep the "contract" after migration, they should have a way to do that. >> I understand that MADV_FREE is another kind of cache and has its value. >> But in the original implementation, during migration, we have already >> freed the original "cache", then reallocate the cache elsewhere and >> copy. This appears more like all pages are populated in mmap() always. >> I know there's value to populate all pages in mmap(), but does that need >> to be done always by default? > > It is not. You have to explicitly request MAP_POPULATE to initialize > mmap. Yes. mmap() can control whether to populate the underlying physical pages. But for migrating MADV_FREE pages, there's no control, all pages will be populated again always by default. Maybe we should avoid to do that in some situations too. Best Regards, Huang, Ying