Received: by 2002:a05:6a11:4021:0:0:0:0 with SMTP id ky33csp188576pxb; Tue, 21 Sep 2021 23:07:05 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzkPYuwl/I46FOpQ1xlv1M622SMXkH4BO3RD+4mLMuZL2m4RGow6vy90Tep4d2q+ceqCgOp X-Received: by 2002:a50:cf4c:: with SMTP id d12mr13710714edk.115.1632290825658; Tue, 21 Sep 2021 23:07:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1632290825; cv=none; d=google.com; s=arc-20160816; b=Ft7trNrMS8/MqtPJote/pe1TxVy1mlL1j3f2aHjTQYhtejobf6P0XyGwhW2W46J+ez 15sP8YsPK7QT36o+gzgxaJhPI86qDeZ4/6Guvli2ip3yN4obWmPadXUk0x39gCx/tJXk REPaQQvdmhqV+BKJ3phE5L0fnMJbOPN9Z01RNlNzrZALHBR4OUXWU8wEYhI1PsQWjgTO Bnpb6cjHDtFJWhCc4V+GLRxnuetQjElZwb1rpohVAy/4gmpIgt3zlBDuqFWbWLlKI9Rx Rw2BpkNGz8xbg/B/Pbvtp4mKECjfu/VX8nJ+vGjVRNZfzzEnnGN+u0TO4frGp9SW0KZi ON4w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date; bh=Hx2ytEB0P2LIoNcdHvNDuuKGd7aciJrif9iCBROC+8k=; b=PKUvNaWaLhz4jb2u0XA5BoTgoxjTy4LvWTvBzaCB2xfQa350d3Mc6Nx/iaKGI2L53H XyarDmXmPiW35as9+cKUAvIB+5Zb+kSpsh43R9mW09T/R6ZjjReOoHPDsEQptR7Utax3 ivMLY36sLEsehO2nTt8TlC6nI9nr/xv9h1nUqfQsBJqt8tag40eA2DxdT0V5QYVxXP3v GSZ4HtG81lajJ3QXOWAyVytwfEuZYhjTQkgU4OXYBK2gwmF148u+/VCKFsnyUr/HzJ+g DuVEistkU2lZvZPsiSH7r/+E+PQt/IY9RFStE9ZRYc9KElxvhMGytku2znXj1K5fDyRa z9Gg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id i14si1442011edf.22.2021.09.21.23.06.41; Tue, 21 Sep 2021 23:07:05 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232627AbhIVGGZ (ORCPT + 99 others); Wed, 22 Sep 2021 02:06:25 -0400 Received: from mail108.syd.optusnet.com.au ([211.29.132.59]:36692 "EHLO mail108.syd.optusnet.com.au" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232490AbhIVGGY (ORCPT ); Wed, 22 Sep 2021 02:06:24 -0400 Received: from dread.disaster.area (pa49-195-238-16.pa.nsw.optusnet.com.au [49.195.238.16]) by mail108.syd.optusnet.com.au (Postfix) with ESMTPS id 218341BC362; Wed, 22 Sep 2021 16:04:48 +1000 (AEST) Received: from dave by dread.disaster.area with local (Exim 4.92.3) (envelope-from ) id 1mSvN9-00FJVY-TE; Wed, 22 Sep 2021 16:04:47 +1000 Date: Wed, 22 Sep 2021 16:04:47 +1000 From: Dave Chinner To: Mel Gorman Cc: NeilBrown , Linux-MM , Theodore Ts'o , Andreas Dilger , "Darrick J . Wong" , Matthew Wilcox , Michal Hocko , Rik van Riel , Vlastimil Babka , Johannes Weiner , Jonathan Corbet , Linux-fsdevel , LKML Subject: Re: [PATCH 1/5] mm/vmscan: Throttle reclaim until some writeback completes if congested Message-ID: <20210922060447.GA2361455@dread.disaster.area> References: <20210920085436.20939-1-mgorman@techsingularity.net> <20210920085436.20939-2-mgorman@techsingularity.net> <163218319798.3992.1165186037496786892@noble.neil.brown.name> <20210921105831.GO3959@techsingularity.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20210921105831.GO3959@techsingularity.net> X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.3 cv=F8MpiZpN c=1 sm=1 tr=0 a=DzKKRZjfViQTE5W6EVc0VA==:117 a=DzKKRZjfViQTE5W6EVc0VA==:17 a=kj9zAlcOel0A:10 a=7QKq2e-ADPsA:10 a=7-415B0cAAAA:8 a=XCLsOYAi7itVMiQYH0cA:9 a=CjuIK1q_8ugA:10 a=biEYGPWJfzWAr4FL6Ov7:22 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Sep 21, 2021 at 11:58:31AM +0100, Mel Gorman wrote: > On Tue, Sep 21, 2021 at 10:13:17AM +1000, NeilBrown wrote: > > On Mon, 20 Sep 2021, Mel Gorman wrote: > > > -long wait_iff_congested(int sync, long timeout) > > > -{ > > > - long ret; > > > - unsigned long start = jiffies; > > > - DEFINE_WAIT(wait); > > > - wait_queue_head_t *wqh = &congestion_wqh[sync]; > > > - > > > - /* > > > - * If there is no congestion, yield if necessary instead > > > - * of sleeping on the congestion queue > > > - */ > > > - if (atomic_read(&nr_wb_congested[sync]) == 0) { > > > - cond_resched(); > > > - > > > - /* In case we scheduled, work out time remaining */ > > > - ret = timeout - (jiffies - start); > > > - if (ret < 0) > > > - ret = 0; > > > - > > > - goto out; > > > - } > > > - > > > - /* Sleep until uncongested or a write happens */ > > > - prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE); > > > > Uninterruptible wait. > > > > .... > > > +static void > > > +reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason, > > > + long timeout) > > > +{ > > > + wait_queue_head_t *wqh = &pgdat->reclaim_wait; > > > + unsigned long start = jiffies; > > > + long ret; > > > + DEFINE_WAIT(wait); > > > + > > > + atomic_inc(&pgdat->nr_reclaim_throttled); > > > + WRITE_ONCE(pgdat->nr_reclaim_start, > > > + node_page_state(pgdat, NR_THROTTLED_WRITTEN)); > > > + > > > + prepare_to_wait(wqh, &wait, TASK_INTERRUPTIBLE); > > > > Interruptible wait. > > > > Why the change? I think these waits really need to be TASK_UNINTERRUPTIBLE. > > > > Because from mm/ context, I saw no reason why the task *should* be > uninterruptible. It's waiting on other tasks to complete IO and it is not > protecting device state, filesystem state or anything else. If it gets > a signal, it's safe to wake up, particularly if that signal is KILL and > the context is a direct reclaimer. I disagree. whether the sleep should be interruptable or not is entirely dependent on whether the caller can handle failure or not. If this is GFP_NOFAIL, allocation must not fail no matter what the context is, so signals and the like are irrelevant. For a context that can handle allocation failure, then it makes sense to wake on events that will result in the allocation failing immediately. But if all this does is make the allocation code go around another retry loop sooner, then an interruptible sleep still doesn't make any sense at all here... > The original TASK_UNINTERRUPTIBLE is almost certainly a copy&paste from > congestion_wait which may be called because a filesystem operation must > complete before it can return to userspace so a signal waking it up is > pointless. Yup, but that AFAICT that same logic still applies. Only now it's the allocation context that determines whether signal waking is pointless or not... Cheer, Dave. -- Dave Chinner david@fromorbit.com