Received: by 2002:a05:6a10:5bc5:0:0:0:0 with SMTP id os5csp881996pxb; Wed, 27 Oct 2021 14:23:51 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxdrubhzUHhCP/y0/tzoS+ea+DozgS6EUiU0Mr6IZhWIw8PQYwn3mg5m5CiIOWmy6ymHINO X-Received: by 2002:a05:6402:1751:: with SMTP id v17mr428870edx.85.1635369831696; Wed, 27 Oct 2021 14:23:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1635369831; cv=none; d=google.com; s=arc-20160816; b=qex6TkR1rIhnHrKICplfLcPzxvTC3iaGGqflBatuPuahEUOqBGa/zpaZxSh+Afr6yY UBzNI6fsNOJ3TYXWc1MJA5MwmRO+83U+iXmhyhJEPE3OocnaQ+8g7A2xrG/W2ezUzise /SaHw2qtBxO/nCJfAyH6Ry/nF7Z5q7h0kWpiUdTJMOFu1cQYnKHe2Nj97qxOx8/6zflS tD+k8Cqiiotx88T6dMjZxQSdTLKkqTgKfeWokL6/g6uU+B+Sa1tMdE0mOfuCf6S6kp5L UlHU2O0ZsmYJEpGq7EbtMgD4uUYBv7k3BdHsBOumY0jHtDgxJ2X0G2F7AivLeA3EVB7/ bT6Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date; bh=W6YMmWRqudzh/RI9+enHcJ5loHdHb63/WIF6bWM58eE=; b=ueMVvQhAwyAfj/kqeY/FsOOCa4rxC94vWqiD4wM2i24cPHqz0Nl1XqNRKvjxH4ob4Z 6k4DqjkLEX9rHB3qgxBSn1Zg72h0vldOLeyRibavQwdTEIpR1y7a1YkXJ1JT4o3ZGGuK MsZpUNa9pL9IWp5pwraUu5hom2XrG9kFwLL76ro4BP7dPaoFFywvCP3/G9L9fweqSU7S KOovUBI9/XEHAlV0vYJXSTDIvRL15cf5Pv388wY8g2FwRBlGihrDVo1ItT7hNLmXLRym rPlwtZmdAjzeTsBM/y2NBjmyWQZJ/+3QkMCJMFn3gcf+z80d4tJ+dRi7azwJJzInDVw1 vWiw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id o8si1012621edi.429.2021.10.27.14.23.28; Wed, 27 Oct 2021 14:23:51 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236906AbhJ0KQR (ORCPT + 97 others); Wed, 27 Oct 2021 06:16:17 -0400 Received: from outbound-smtp19.blacknight.com ([46.22.139.246]:33513 "EHLO outbound-smtp19.blacknight.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235484AbhJ0KQQ (ORCPT ); Wed, 27 Oct 2021 06:16:16 -0400 Received: from mail.blacknight.com (pemlinmail06.blacknight.ie [81.17.255.152]) by outbound-smtp19.blacknight.com (Postfix) with ESMTPS id 73EA51C3766 for ; Wed, 27 Oct 2021 11:13:49 +0100 (IST) Received: (qmail 13566 invoked from network); 27 Oct 2021 10:13:49 -0000 Received: from unknown (HELO techsingularity.net) (mgorman@techsingularity.net@[84.203.17.29]) by 81.17.254.9 with ESMTPSA (AES256-SHA encrypted, authenticated); 27 Oct 2021 10:13:48 -0000 Date: Wed, 27 Oct 2021 11:13:46 +0100 From: Mel Gorman To: NeilBrown Cc: Andrew Morton , Theodore Ts'o , Andreas Dilger , "Darrick J . Wong" , Matthew Wilcox , Michal Hocko , Dave Chinner , Rik van Riel , Vlastimil Babka , Johannes Weiner , Jonathan Corbet , Linux-MM , Linux-fsdevel , LKML Subject: Re: [PATCH v4 0/8] Remove dependency on congestion_wait in mm/ Message-ID: <20211027101346.GQ3959@techsingularity.net> References: <20211019090108.25501-1-mgorman@techsingularity.net> <163486531001.17149.13533181049212473096@noble.neil.brown.name> <20211022083927.GI3959@techsingularity.net> <163490199006.17149.17259708448207042563@noble.neil.brown.name> <20211022131732.GK3959@techsingularity.net> <163529540259.8576.9186192891154927096@noble.neil.brown.name> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <163529540259.8576.9186192891154927096@noble.neil.brown.name> User-Agent: Mutt/1.10.1 (2018-07-13) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Oct 27, 2021 at 11:43:22AM +1100, NeilBrown wrote: > On Sat, 23 Oct 2021, Mel Gorman wrote: > > On Fri, Oct 22, 2021 at 10:26:30PM +1100, NeilBrown wrote: > > > On Fri, 22 Oct 2021, Mel Gorman wrote: > > > > On Fri, Oct 22, 2021 at 12:15:10PM +1100, NeilBrown wrote: > > > > > > > > > In general, I still don't like the use of wake_up_all(), though it won't > > > > > cause incorrect behaviour. > > > > > > > > > > > > > Removing wake_up_all would be tricky. > > > > > > I think there is a misunderstanding. Removing wake_up_all() is as > > > simple as > > > s/wake_up_all/wake_up/ > > > > > > If you used prepare_to_wait_exclusive(), then wake_up() would only wake > > > one waiter, while wake_up_all() would wake all of them. > > > As you use prepare_to_wait(), wake_up() will wake all waiters - as will > > > wake_up_all(). > > > > > > > Ok, yes, there was a misunderstanding. I thought you were suggesting a > > move to exclusive wakeups. I felt that the wake_up_all was explicit in > > terms of intent and that I really meant for all tasks to wake instead of > > one at a time. > > Fair enough. Thanks for changing it :-) > > But this prompts me to wonder if exclusive wakeups would be a good idea > - which is a useful springboard to try to understand the code better. > > For VMSCAN_THROTTLE_ISOLATED they probably are. > One pattern for reliable exclusive wakeups is for any thread that > received a wake-up to then consider sending a wake up. > > Two places receive VMSCAN_THROTTLE_ISOLATED wakeups and both then call > too_many_isolated() which - on success - sends another wakeup - before > the caller has had a chance to isolate anything. If, instead, the > wakeup was sent sometime later, after pages were isolated by before the > caller (isoloate_migratepages_block() or shrink_inactive_list()) > returned, then we would get an orderly progression of threads running > through that code. > That should work as the throttling condition is straight-forward. It might even reduce a race condition where waking all throttled tasks all then trigger the same "too many isolated" condition. > For VMSCAN_THROTTLE_WRITEBACK is a little less straight forward. > There are two different places that wait for the wakeup, and a wake_up > is sent to all waiters after a time proportional to the number of > waiters. It might make sense to wake one thread per time unit? I'd avoid time as a wakeup condition other than the timeout which is there to guarantee forward progress. I assume you mean "one thread per SWAP_CLUSTER_MAX completions". > That might work well for do_writepages - every SWAP_CLUSTER_MAX writes > triggers one wakeup. > I'm less sure that it would work for shrink_node(). Maybe the > shrink_node() waiters could be non-exclusive so they get woken as soon a > SWAP_CLUSTER_MAX writes complete, while do_writepages are exclusive and > get woken one at a time. > It should work for either with the slight caveat that the last waiter may not see SWAP_CLUSTER_MAX completions. > For VMSCAN_THROTTLE_NOPROGRESS .... I don't understand. > If one zone isn't making "enough" progress, we throttle before moving on > to the next zone. So we delay processing of the next zone, and only > indirectly delay re-processing of the current congested zone. > Maybe it make sense, but I don't see it yet. I note that the commit > message says "it's messy". I can't argue with that! > Yes, we delay the processing of the next zone when a given zone cannot make progress. The thinking is that circumstances that cause one zone to fail to make progress could spill over to other zones in the absense of any throttling. Where it might cause problems is where the preferred zone is very small. If a bug showed up like that, a potential fix would be to avoid throttling if the preferred zone is very small relative to the total amount of memory local to the node or total memory (preferably local node). > I'll follow up with patches to clarify what I am thinking about the > first two. I'm not proposing the patches, just presenting them as part > of improving my understanding. > If I'm cc'd, I'll review and if I think they're promising, I'll run them through the same tests and machines. -- Mel Gorman SUSE Labs