From: "NeilBrown"
To: "Mel Gorman"
Cc: "Linux-MM", "Theodore Ts'o", "Andreas Dilger", "Darrick J . Wong",
    "Matthew Wilcox", "Michal Hocko", "Dave Chinner", "Rik van Riel",
    "Vlastimil Babka", "Johannes Weiner", "Jonathan Corbet",
    "Linux-fsdevel", "LKML"
Subject: Re: [PATCH 1/5] mm/vmscan: Throttle reclaim until some writeback completes if congested
In-reply-to: <20210921105831.GO3959@techsingularity.net>
References: <20210920085436.20939-1-mgorman@techsingularity.net>,
    <20210920085436.20939-2-mgorman@techsingularity.net>,
    <163218319798.3992.1165186037496786892@noble.neil.brown.name>,
    <20210921105831.GO3959@techsingularity.net>
Date: Wed, 22 Sep 2021 07:40:59 +1000
Message-id: <163226045956.21861.7998898955979000139@noble.neil.brown.name>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 21 Sep 2021, Mel Gorman wrote:
> On Tue, Sep 21, 2021 at 10:13:17AM +1000, NeilBrown wrote:
> > On Mon, 20 Sep 2021, Mel Gorman wrote:
> > > -long wait_iff_congested(int sync, long timeout)
> > > -{
> > > -	long ret;
> > > -	unsigned long start = jiffies;
> > > -	DEFINE_WAIT(wait);
> > > -	wait_queue_head_t *wqh = &congestion_wqh[sync];
> > > -
> > > -	/*
> > > -	 * If there is no congestion, yield if necessary instead
> > > -	 * of sleeping on the congestion queue
> > > -	 */
> > > -	if (atomic_read(&nr_wb_congested[sync]) == 0) {
> > > -		cond_resched();
> > > -
> > > -		/* In case we scheduled, work out time remaining */
> > > -		ret = timeout - (jiffies - start);
> > > -		if (ret < 0)
> > > -			ret = 0;
> > > -
> > > -		goto out;
> > > -	}
> > > -
> > > -	/* Sleep until uncongested or a write happens */
> > > -	prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
> >
> > Uninterruptible wait.
> >
> > ....
> > > +static void
> > > +reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason,
> > > +		 long timeout)
> > > +{
> > > +	wait_queue_head_t *wqh = &pgdat->reclaim_wait;
> > > +	unsigned long start = jiffies;
> > > +	long ret;
> > > +	DEFINE_WAIT(wait);
> > > +
> > > +	atomic_inc(&pgdat->nr_reclaim_throttled);
> > > +	WRITE_ONCE(pgdat->nr_reclaim_start,
> > > +		   node_page_state(pgdat, NR_THROTTLED_WRITTEN));
> > > +
> > > +	prepare_to_wait(wqh, &wait, TASK_INTERRUPTIBLE);
> >
> > Interruptible wait.
> >
> > Why the change?  I think these waits really need to be TASK_UNINTERRUPTIBLE.
> >
>
> Because from mm/ context, I saw no reason why the task *should* be
> uninterruptible. It's waiting on other tasks to complete IO and it is not
> protecting device state, filesystem state or anything else. If it gets
> a signal, it's safe to wake up, particularly if that signal is KILL and
> the context is a direct reclaimer.

I disagree.  An Interruptible sleep only makes sense if the "was
interrupted" status can propagate up to user-space (or to some in-kernel
handler that will clear the signal).

In particular, if reclaim_throttle() is called in a loop (which it is),
and if that loop doesn't check for signal_pending (which it doesn't),
then the next time around the loop after receiving a signal, it won't
sleep at all.  That would be bad.

In general, if you don't return an error, then you probably shouldn't
sleep Interruptible.

I notice that tasks sleep on kswapd_wait as TASK_INTERRUPTIBLE, but they
don't have any signal handling.  I suspect this isn't actually a defect
because I suspect that it is not even possible to SIGKILL kswapd.  But
the code seems misleading.  I guess I should write a patch.

Unless reclaim knows to abort completely on a signal (__GFP_KILLABLE ???)
this must be an UNINTERRUPTIBLE wait.
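
To make the loop argument concrete, here is a rough userspace model. It is
not kernel code, and every name in it (task_model, model_throttle,
model_reclaim_loop) is made up for illustration. It assumes the signal
arrives just before one iteration's sleep and that nothing ever clears it,
which is the situation described above for a caller that never checks
signal_pending().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; all names are hypothetical, none exist in the kernel. */
enum sleep_mode { SLEEP_INTERRUPTIBLE, SLEEP_UNINTERRUPTIBLE };

struct task_model {
	bool signal_pending;	/* stays set until something handles the signal */
};

/*
 * Models one throttle sleep.  An interruptible sleeper that already has a
 * signal pending does not sleep at all; every other case sleeps for the
 * full timeout.  Returns the number of ticks actually slept.
 */
static long model_throttle(const struct task_model *t, enum sleep_mode mode,
			   long timeout)
{
	assert(timeout >= 0);
	if (mode == SLEEP_INTERRUPTIBLE && t->signal_pending)
		return 0;
	return timeout;
}

/*
 * Models a retry loop that neither checks for a pending signal nor returns
 * an error.  The signal is raised just before iteration 'signal_at'.
 * Returns the total ticks slept over all iterations.
 */
static long model_reclaim_loop(struct task_model *t, enum sleep_mode mode,
			       int iterations, int signal_at, long timeout)
{
	long slept = 0;

	for (int i = 0; i < iterations; i++) {
		if (i == signal_at)
			t->signal_pending = true;	/* nobody clears this */
		slept += model_throttle(t, mode, timeout);
	}
	return slept;
}
```

With 10 iterations, a timeout of 5 ticks and a signal raised before
iteration 3, the interruptible variant in this model sleeps 15 ticks in
total and then spins through the remaining iterations without throttling,
while the uninterruptible variant sleeps the full 50.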

Thanks,
NeilBrown

>
> The original TASK_UNINTERRUPTIBLE is almost certainly a copy&paste from
> congestion_wait which may be called because a filesystem operation must
> complete before it can return to userspace so a signal waking it up is
> pointless.
>
> --
> Mel Gorman
> SUSE Labs