Message-ID: <8fd1a56d-5a22-4bde-59a5-169a4696219e@suse.cz>
Date: Wed, 24 May 2023 11:21:43 +0200
Subject: Re: [PATCH] mm: compaction: avoid GFP_NOFS ABBA deadlock
To: Johannes Weiner, Andrew Morton
Cc: Mel Gorman, Michal Hocko, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
References: <20230519111359.40475-1-hannes@cmpxchg.org>
From: Vlastimil Babka
In-Reply-To: <20230519111359.40475-1-hannes@cmpxchg.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 5/19/23 13:13, Johannes Weiner wrote:
> During stress testing with higher-order allocations, a deadlock
> scenario was observed in compaction: One GFP_NOFS allocation was
> sleeping on mm/compaction.c::too_many_isolated(), while all CPUs in
> the system were busy with compactors spinning on buffer locks held by
> the sleeping GFP_NOFS allocation.
>
> Reclaim is susceptible to this same deadlock; we fixed it by granting
> GFP_NOFS allocations additional LRU isolation headroom, to ensure it
> makes forward progress while holding fs locks that other reclaimers
> might acquire. Do the same here.
>
> This code has been like this since compaction was initially merged,
> and I only managed to trigger this with out-of-tree patches that
> dramatically increase the contexts that do GFP_NOFS compaction. While
> the issue is real, it seems theoretical in nature given existing
> allocation sites. Worth fixing now, but no Fixes tag or stable CC.
>
> Signed-off-by: Johannes Weiner

So IIUC the change is done by not giving GFP_NOFS extra headroom, but instead
restricting the headroom of __GFP_FS allocations. But the original one was
probably too generous anyway so it should be fine?

Acked-by: Vlastimil Babka

> ---
>  mm/compaction.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> v2:
> - clarify too_many_isolated() comment (Mel)
> - split isolation deadlock from no-contiguous-anon lockups as that's
>   a different scenario and deserves its own patch
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index c8bcdea15f5f..c9a4b6dffcf2 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -745,8 +745,9 @@ isolate_freepages_range(struct compact_control *cc,
>  }
>
>  /* Similar to reclaim, but different enough that they don't share logic */
> -static bool too_many_isolated(pg_data_t *pgdat)
> +static bool too_many_isolated(struct compact_control *cc)
>  {
> +	pg_data_t *pgdat = cc->zone->zone_pgdat;
>  	bool too_many;
>
>  	unsigned long active, inactive, isolated;
> @@ -758,6 +759,17 @@ static bool too_many_isolated(pg_data_t *pgdat)
>  	isolated = node_page_state(pgdat, NR_ISOLATED_FILE) +
>  			node_page_state(pgdat, NR_ISOLATED_ANON);
>
> +	/*
> +	 * Allow GFP_NOFS to isolate past the limit set for regular
> +	 * compaction runs. This prevents an ABBA deadlock when other
> +	 * compactors have already isolated to the limit, but are
> +	 * blocked on filesystem locks held by the GFP_NOFS thread.
> +	 */
> +	if (cc->gfp_mask & __GFP_FS) {
> +		inactive >>= 3;
> +		active >>= 3;
> +	}
> +
>  	too_many = isolated > (inactive + active) / 2;
>  	if (!too_many)
>  		wake_throttle_isolated(pgdat);
> @@ -806,7 +818,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>  	 * list by either parallel reclaimers or compaction. If there are,
>  	 * delay for some time until fewer pages are isolated
>  	 */
> -	while (unlikely(too_many_isolated(pgdat))) {
> +	while (unlikely(too_many_isolated(cc))) {
>  		/* stop isolation if there are still pages not migrated */
>  		if (cc->nr_migratepages)
>  			return -EAGAIN;
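
For illustration only, the threshold arithmetic in the quoted hunk can be modelled in userspace roughly as below. This is a sketch, not kernel code: SKETCH_GFP_FS, sketch_too_many_isolated() and sketch_self_check() are made-up names, the flag value is an arbitrary stand-in for __GFP_FS, and the page counts are invented numbers rather than anything measured.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's __GFP_FS bit (value is arbitrary). */
#define SKETCH_GFP_FS 0x1u

/*
 * Userspace model of the patched too_many_isolated() comparison.
 * Callers that may enter the filesystem (FS bit set) see the LRU
 * sizes shifted right by 3, so their limit is (active + inactive) / 16.
 * Callers without the FS bit (GFP_NOFS-like) keep the old limit of
 * (active + inactive) / 2.
 */
static bool sketch_too_many_isolated(unsigned int gfp_mask,
				     unsigned long active,
				     unsigned long inactive,
				     unsigned long isolated)
{
	if (gfp_mask & SKETCH_GFP_FS) {
		inactive >>= 3;
		active >>= 3;
	}

	return isolated > (inactive + active) / 2;
}

/* Made-up numbers: 4000 active, 4000 inactive, 1000 isolated pages. */
static void sketch_self_check(void)
{
	/* FS-capable caller: limit is (500 + 500) / 2 = 500, so it throttles. */
	assert(sketch_too_many_isolated(SKETCH_GFP_FS, 4000, 4000, 1000));

	/* NOFS-like caller: limit is (4000 + 4000) / 2 = 4000, so it proceeds. */
	assert(!sketch_too_many_isolated(0, 4000, 4000, 1000));
}
```

With these example numbers the FS-capable caller is throttled once more than 500 pages are isolated, while the NOFS-like caller can keep isolating up to 4000. That matches the reading above: the NOFS limit is unchanged and the __GFP_FS limit is what shrinks.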