Date: Tue, 3 Apr 2018 12:07:59 -0700
From: Matthew Wilcox
To:
Michal Hocko
Cc: Buddy Lumpkin, linux-mm@kvack.org, linux-kernel@vger.kernel.org, hannes@cmpxchg.org, riel@surriel.com, mgorman@suse.de, akpm@linux-foundation.org
Subject: Re: [RFC PATCH 1/1] vmscan: Support multiple kswapd threads per node
Message-ID: <20180403190759.GB6779@bombadil.infradead.org>
References: <1522661062-39745-1-git-send-email-buddy.lumpkin@oracle.com> <1522661062-39745-2-git-send-email-buddy.lumpkin@oracle.com> <20180403133115.GA5501@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20180403133115.GA5501@dhcp22.suse.cz>
User-Agent: Mutt/1.9.2 (2017-12-15)
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Apr 03, 2018 at 03:31:15PM +0200, Michal Hocko wrote:
> On Mon 02-04-18 09:24:22, Buddy Lumpkin wrote:
> > The presence of direct reclaims 10 years ago was a fairly reliable
> > indicator that too much was being asked of a Linux system. Kswapd was
> > likely wasting time scanning pages that were ineligible for eviction.
> > Adding RAM or reducing the working set size would usually make the problem
> > go away. Since then hardware has evolved to bring a new struggle for
> > kswapd. Storage speeds have increased by orders of magnitude while CPU
> > clock speeds stayed the same or even slowed down in exchange for more
> > cores per package. This presents a throughput problem for a single
> > threaded kswapd that will get worse with each generation of new hardware.
>
> AFAIR we used to scale the number of kswapd workers many years ago. It
> just turned out to be not all that great. We have a kswapd reclaim
> window for quite some time and that can allow to tune how much proactive
> kswapd should be.
>
> Also please note that the direct reclaim is a way to throttle overly
> aggressive memory consumers. The more we do in the background context
> the easier for them it will be to allocate faster.
> So I am not really sure that more background threads will solve the
> underlying problem. It is just a matter of memory hogs tuning to end in
> the very same situation AFAICS. Moreover the more they are going to
> allocate the less CPU time will _other_ (non-allocating) tasks get.
>
> > Test Details
>
> I will have to study this more to comment.
>
> [...]
> > By increasing the number of kswapd threads, throughput increased by ~50%
> > while kernel mode CPU utilization decreased or stayed the same, likely due
> > to a decrease in the number of parallel tasks at any given time doing page
> > replacement.
>
> Well, isn't that just an effect of more work being done on behalf of
> other workload that might run along with your tests (and which doesn't
> really need to allocate a lot of memory)? In other words how
> does the patch behave with non-artificial mixed workloads?
>
> Please note that I am not saying that we absolutely have to stick with the
> current single-thread-per-node implementation but I would really like to
> see more background on why we should be allowing heavy memory hogs to
> allocate faster or how to prevent that. I would be also very interested
> to see how to scale the number of threads based on how CPUs are utilized
> by other workloads.

Yes, very much this. If you have a single-threaded workload which is
using the entirety of memory and would like to use even more, then it
makes sense to use as many CPUs as necessary getting memory out of its
way. If you have N CPUs and N-1 threads happily occupying themselves in
their own reasonably-sized working sets with one monster process trying
to use as much RAM as possible, then I'd be pretty unimpressed to see the
N-1 well-behaved threads preempted by kswapd.

My biggest problem with the patch-as-presented is that it's yet one more
thing for admins to get wrong. We should spawn more threads automatically
if system conditions are right to do that.
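
To make the two scenarios above concrete, here is a toy Python model of what "spawn more threads automatically" could look like. It is only a sketch and not kernel code: the function name, its parameters, and the rule "use at most the idle CPUs, and only while direct reclaim is happening" are all assumptions made for illustration. Nothing in this thread specifies such a policy.

```python
def kswapd_threads_wanted(ncpus, busy_cpus, direct_reclaim_active):
    """Toy policy: how many kswapd threads to run on one node.

    Hypothetical inputs (not real kernel interfaces):
      ncpus                 -- CPUs available on the node
      busy_cpus             -- CPUs currently occupied by runnable tasks
      direct_reclaim_active -- True if allocators are falling into direct reclaim
    """
    if ncpus < 1 or not 0 <= busy_cpus <= ncpus:
        raise ValueError("busy_cpus must be between 0 and ncpus")

    # If a single kswapd is keeping up, there is no reason to add more.
    if not direct_reclaim_active:
        return 1

    # Only borrow CPUs that nobody else is using, so well-behaved
    # tasks are not preempted by background reclaim.
    idle_cpus = ncpus - busy_cpus
    return max(1, idle_cpus)


# One memory-hungry single-threaded workload on 8 CPUs: 7 CPUs are idle.
print(kswapd_threads_wanted(8, 1, True))
# 7 well-behaved threads plus one hog on 8 CPUs: no idle CPUs to borrow.
print(kswapd_threads_wanted(8, 8, True))
```

In this model the first case scales up to the idle CPUs, while the second stays at a single thread, which matches the behaviour argued for above. A real implementation would also need to decide how quickly to react and how to measure "busy".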