From: Matthew Wilcox
To: Rik van Riel
Cc: Michal Hocko, Sebastiaan Meijer, akpm@linux-foundation.org, buddy.lumpkin@oracle.com, hannes@cmpxchg.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, mgorman@suse.de
Subject: Re: [RFC PATCH 1/1] vmscan: Support multiple kswapd threads per node
Date: Fri, 2 Oct 2020 15:00:42 +0100
Message-ID: <20201002140042.GB20115@casper.infradead.org>
In-Reply-To: <656725362af9bd757a281f0799a0bb9c9b2487bd.camel@surriel.com>

On Fri, Oct 02, 2020 at 09:53:05AM -0400, Rik van Riel wrote:
> On Fri, 2020-10-02 at 09:03 +0200, Michal Hocko wrote:
> > On Thu 01-10-20 18:18:10, Sebastiaan Meijer wrote:
> > > (Apologies for messing up the mailing list thread, Gmail had fooled
> > > me into believing that it properly picked up the thread)
> > >
> > > On Thu, 1 Oct 2020 at 14:30, Michal Hocko wrote:
> > > > On Wed 30-09-20 21:27:12, Sebastiaan Meijer wrote:
> > > > > > yes it shows the bottleneck but it is quite artificial. Read
> > > > > > data is usually processed and/or written back and that changes
> > > > > > the picture a lot.
> > > > > Apologies for reviving an ancient thread (and apologies in advance
> > > > > for my lack of knowledge on how mailing lists work), but I'd like
> > > > > to offer up another reason why merging this might be a good idea.
> > > > >
> > > > > From what I understand, zswap runs its compression on the same
> > > > > kswapd thread, limiting it to a single thread for compression.
> > > > > Given enough processing power, zswap can get great throughput
> > > > > using heavier compression algorithms like zstd, but this is
> > > > > currently greatly limited by the lack of threading.
> > > > Isn't this a problem of the zswap implementation rather than
> > > > general kswapd reclaim? Why doesn't zswap do the same as normal
> > > > swap-out, in a context outside of reclaim?
>
> On systems with lots of very fast IO devices, we have
> also seen kswapd take 100% CPU time without any zswap
> in use.
>
> This seems like a generic issue, though zswap does
> manage to bring it out on lower end systems.

Then, given Mel's observation about contention on the LRU lock, what's
the solution? Partition the LRU list? Batch removals from the LRU list
by kswapd and hand off to per-?node?cpu? worker threads?

Rik, if you have access to one of those systems, I'd be interested to
know whether using file THPs would help with your workload. Tracking
only one THP instead of, say, 16 regular-size pages is going to reduce
the amount of time taken to pull things off the LRU list.