Date: Wed, 26 Feb 2020 08:04:26 +0000
From: Mel Gorman
To: Andrew Morton
Cc: Michal Hocko, Vlastimil Babka, Ivan Babrou, Rik van Riel,
    Linux-MM, Linux Kernel Mailing List
Subject: Re: [PATCH 0/3] Limit runaway reclaim due to watermark boosting
Message-ID: <20200226080426.GA3818@techsingularity.net>

On Tue, Feb 25, 2020 at 06:51:30PM -0800, Andrew Morton wrote:
> On Tue, 25 Feb 2020 14:15:31 +0000 Mel Gorman wrote:
>
> > Ivan Babrou reported the following
>
> http://lkml.kernel.org/r/CABWYdi1eOUD1DHORJxTsWPMT3BcZhz++xP1pXhT=x4SgxtgQZA@mail.gmail.com
> is helpful.
>

Noted for future reference.

> > Commit 1c30844d2dfe ("mm: reclaim small amounts of memory when
> > an external fragmentation event occurs") introduced undesired
> > effects in our environment.
> >
> > * NUMA with 2 x CPU
> > * 128GB of RAM
> > * THP disabled
> > * Upgraded from 4.19 to 5.4
> >
> > Before we saw free memory hover at around 1.4GB with no
> > spikes. After the upgrade we saw some machines decide that they
> > need a lot more than that, with frequent spikes above 10GB,
> > often only on a single numa node.
> >
> > There have been a few reports recently that might be watermark boost
> > related. Unfortunately, finding someone that can reproduce the problem
> > and test a patch has been problematic. This series intends to limit
> > potential damage only.
>
> It's problematic that we don't understand what's happening. And these
> palliatives can only reduce our ability to do that.
>

Not for certain no, but we do know that there are conditions whereby
node 0 can end up reclaiming excessively for extended periods of time. The
available evidence does match a pattern whereby a lower zone on node 0
is getting stuck in a boosted state.

> Rik seems to have the means to reproduce this (or something similar)
> and it seems Ivan can test patches three weeks hence.

If Rik can reproduce it great but I have a strong feeling that Ivan may
never be able to test this if it requires a production machine which is
why I did not wait the three weeks.

> So how about a
> debug patch which will help figure out what's going on in there?

A debug patch would not help much in this case given that we have
tracepoints. An ftrace capture containing mm_page_alloc_extfrag,
mm_vmscan_kswapd_wake, mm_vmscan_wakeup_kswapd and
mm_vmscan_node_reclaim_begin, taken for 30 seconds while the problem is
occurring, would be a big help. Ideally mm_vmscan_lru_shrink_inactive
would also be included to capture the priority, but the size of the trace
is what's going to be problematic.

mm_page_alloc_extfrag would be correlated with the conditions that boost
the watermarks, and the others would track what kswapd is doing to see
whether it is persistently reclaiming. If it is,
mm_vmscan_lru_shrink_inactive would tell whether it is persistently
reclaiming at priority DEF_PRIORITY - 2, which would prove the patch would
at least mitigate the problem.

It would be preferable to have a description of a testcase that
reproduces the problem so that I can capture and analyse the trace
myself. It would also be something I could slot into a test grid to catch
the problem happening again in the future.

-- 
Mel Gorman
SUSE Labs
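
For anyone capturing such a trace, the analysis described above could be
sketched roughly as follows. This is only an illustration, not a tool from
this series: it assumes the trace was saved as text in the usual ftrace
layout, where each line carries "<event>:" followed by key=value fields, and
that mm_vmscan_lru_shrink_inactive lines include a "priority=" field. The
value DEF_PRIORITY = 12 is taken from include/linux/mmzone.h and should be
checked against the kernel in use. The names summarise() and
boosted_fraction() are hypothetical helpers made up for this sketch.

```python
import re
from collections import Counter

# Assumption: DEF_PRIORITY is 12 (include/linux/mmzone.h); verify for your kernel.
DEF_PRIORITY = 12
BOOST_PRIORITY = DEF_PRIORITY - 2

# Tracepoints named in the mail above.
EVENTS = (
    "mm_page_alloc_extfrag",
    "mm_vmscan_kswapd_wake",
    "mm_vmscan_wakeup_kswapd",
    "mm_vmscan_node_reclaim_begin",
    "mm_vmscan_lru_shrink_inactive",
)

# Assumption: a trace line looks like "... <event>: key=value key=value ...".
_EVENT_RE = re.compile(r"\b(" + "|".join(EVENTS) + r"):")
_PRIORITY_RE = re.compile(r"\bpriority=(\d+)")


def summarise(lines):
    """Count each tracepoint and histogram the reclaim priority.

    Returns (events, priorities): events maps tracepoint name to hit count,
    priorities maps the priority seen in mm_vmscan_lru_shrink_inactive
    lines to the number of times it was seen.
    """
    events = Counter()
    priorities = Counter()
    for line in lines:
        match = _EVENT_RE.search(line)
        if match is None:
            continue  # header, comment, or an unrelated event
        name = match.group(1)
        events[name] += 1
        if name == "mm_vmscan_lru_shrink_inactive":
            prio = _PRIORITY_RE.search(line)
            if prio is not None:
                priorities[int(prio.group(1))] += 1
    return events, priorities


def boosted_fraction(priorities):
    """Fraction of shrink_inactive events seen at DEF_PRIORITY - 2."""
    total = sum(priorities.values())
    return priorities[BOOST_PRIORITY] / total if total else 0.0
```

Feeding it the lines of a saved trace file, for example
summarise(open(path)), gives the per-event counts, and a
boosted_fraction() close to 1.0 alongside frequent mm_page_alloc_extfrag
events would be consistent with kswapd persistently reclaiming at
DEF_PRIORITY - 2. What counts as "close" is a judgment call that depends on
the workload.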