From: "Huang, Ying"
To: Mel Gorman, David Hildenbrand, Michal Hocko, Johannes Weiner
Cc: Matthew Wilcox, Andrew Morton, Vlastimil Babka, Zi Yan,
 Peter Zijlstra, Dave Hansen, Minchan Kim, Hugh Dickins,
 Alexander Duyck
Subject: Re: [RFC 0/3] mm: Discard lazily freed pages when migrating
References: <20200228033819.3857058-1-ying.huang@intel.com>
 <20200228034248.GE29971@bombadil.infradead.org>
 <87a7538977.fsf@yhuang-dev.intel.com>
 <871rqf850z.fsf@yhuang-dev.intel.com>
 <20200228094954.GB3772@suse.de>
 <87h7z76lwf.fsf@yhuang-dev.intel.com>
 <20200302151607.GC3772@suse.de>
Date: Tue, 03 Mar 2020 09:51:56
 +0800
In-Reply-To: <20200302151607.GC3772@suse.de> (Mel Gorman's message of "Mon, 2 Mar 2020 15:16:07 +0000")
Message-ID: <87zhcy5hoj.fsf@yhuang-dev.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Mel Gorman writes:

> On Mon, Mar 02, 2020 at 07:23:12PM +0800, Huang, Ying wrote:
>> If some applications cannot tolerate the latency incurred by the memory
>> allocation and zeroing. Then we cannot discard instead of migrate
>> always. While in some situations, less memory pressure can help. So
>> it's better to let the administrator and the application choose the
>> right behavior in the specific situation?
>>
>
> Is there an application you have in mind that benefits from discarding
> MADV_FREE pages instead of migrating them?
>
> Allowing the administrator or application to tune this would be very
> problematic. An application would require an update to the system call
> to take advantage of it and then detect if the running kernel supports
> it. An administrator would have to detect that MADV_FREE pages are being
> prematurely discarded leading to a slowdown and that is hard to detect.
> It could be inferred from monitoring compaction stats and checking
> if compaction activity is correlated with higher minor faults in the
> target application. Proving the correlation would require using the perf
> software event PERF_COUNT_SW_PAGE_FAULTS_MIN and matching the addresses
> to MADV_FREE regions that were freed prematurely. That is not an obvious
> debugging step to take when an application detects latency spikes.
>
> Now, you could add a counter specifically for MADV_FREE pages freed for
> reasons other than memory pressure and hope the administrator knows about
> the counter and what it means.
> That type of knowledge could take a long
> time to spread so it's really very important that there is evidence of
> an application that suffers due to the current MADV_FREE and migration
> behaviour.

OK.  I understand that this patchset isn't a universal win, so we need
some way to justify it.  I will try to find an application that
benefits from it.

Another thought, as proposed by David Hildenbrand: it may be a
universal win to discard clean MADV_FREE pages when migrating if there
is already memory pressure on the target node.  For example, if the
free memory on the target node is lower than the high watermark?

Best Regards,
Huang, Ying
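
To make the watermark idea above concrete, here is a minimal sketch of
the decision only.  It is an illustration under assumptions, not code
from the patchset: struct node_state, struct page_state and
choose_migrate_action() are hypothetical stand-ins, not the kernel's
struct zone, struct page, or any existing mm function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a node's memory state. */
struct node_state {
	unsigned long free_pages;	/* pages currently free on the node */
	unsigned long high_watermark;	/* the node's high watermark, in pages */
};

/* Hypothetical, simplified stand-in for the state of one page. */
struct page_state {
	bool lazy_free;	/* page was marked with MADV_FREE */
	bool dirty;	/* page was written to after MADV_FREE */
};

enum migrate_action { MIGRATE_PAGE, DISCARD_PAGE };

/*
 * Decide what to do with a page that is about to be migrated to
 * @target.  Only clean MADV_FREE pages are candidates for discarding,
 * and only when the target node is already below its high watermark.
 */
static enum migrate_action
choose_migrate_action(const struct page_state *page,
		      const struct node_state *target)
{
	assert(page != NULL && target != NULL);

	/* A dirty or ordinary page still holds data, so it must move. */
	if (!page->lazy_free || page->dirty)
		return MIGRATE_PAGE;

	/* Target node is under memory pressure: drop the page instead. */
	if (target->free_pages < target->high_watermark)
		return DISCARD_PAGE;

	/* No pressure on the target: keep the current behaviour. */
	return MIGRATE_PAGE;
}
```

With this shape, the current behaviour is unchanged whenever the target
node has free memory at or above the high watermark, so the latency
concern raised above would only apply when the node is already short of
memory.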