Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp1103825pxb; Tue, 9 Feb 2021 23:25:35 -0800 (PST) X-Google-Smtp-Source: ABdhPJyJBZ7yPgh/b3YAOS+l4rB5xNPC92uBpriL1EOAg63sOaY1fBlsLukHG4AoaoRjpzjxnrcw X-Received: by 2002:a17:906:3fc4:: with SMTP id k4mr1615876ejj.137.1612941935101; Tue, 09 Feb 2021 23:25:35 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1612941935; cv=none; d=google.com; s=arc-20160816; b=uCfbt1WwZGFh+ntrkGb2cNKG2sAL9BQ4r+dkn45S7Dh3zqP5ySXHt/72zXeATtUDcj O20fibQylphePAnBpIAY764iMtQO/FV5g7halynn3tarctf+4jNypWnUMiF1+Bm+RMYC l7g1zZBRWKGVDQ3r6CI4bVBB62l5OIuPNVCrQa86NJ9hwizAgxU2wFTJBDpBVaF+1ZjP TTR7IfMJA67gejB9Dl8+7ldpAhY4oOT/p+YpM+BmspRO7xzopR/6NE76O20fvLe9LtMW nUsYoTQiQZhYXO+BEZiRkwcjrjo0/Z7J4SmY3MxLMcDtA94L2xM65Xpo6qB1boMubjYQ 2/Lw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:subject:from :references:cc:to; bh=9dRYF828YvEEZgfY4BzPUhOphbADAOpOSLFfpNd75Ic=; b=YPKmqBG/eeZBkK79OA0wkMtoCvGuEtKHTZCRr9uZ+3VEAWy2XP4rC83BApgDTE42ug NeNi2DChvAO96/dtKB9wzsRlEyKl6YIVsSQyH5d6cOEmm/WS2RoOJYDwpOATP+cqaR7K U8GFosLyRVoDvXQMZQFw+P4GvcxmpWHCuWYO9BiGSE3ZoQxDXuKOB2aMuK2wsCP9dUSe KLzLYb2Z6YRe89KjV4nZJQ0djdKpjF1CkO3r+R9g6J3p6IGAI5iu3GZMjewrPfd/qEJY hehIIOig3JTnx+gBogllsaQQEZnMPMwTG2UXZllEQ1QIzKXOXPb45He6aZr8bPEB3Vjj Aa1A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id a66si939541edf.607.2021.02.09.23.25.12; Tue, 09 Feb 2021 23:25:35 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232645AbhBIQIz (ORCPT + 99 others); Tue, 9 Feb 2021 11:08:55 -0500 Received: from mx2.suse.de ([195.135.220.15]:39320 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232608AbhBIQIx (ORCPT ); Tue, 9 Feb 2021 11:08:53 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id CF459AB71; Tue, 9 Feb 2021 16:08:11 +0000 (UTC) To: David Rientjes , Charan Teja Kalla Cc: akpm@linux-foundation.org, mhocko@suse.com, vinmenon@codeaurora.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org References: <1612187338-19100-1-git-send-email-charante@codeaurora.org> <160ba3b5-2cd4-5ff0-1348-fb477cefd33d@codeaurora.org> <1213f4c6-7557-268d-253e-23f8fea55b19@google.com> From: Vlastimil Babka Subject: Re: [PATCH] mm: page_alloc: update the COMPACT[STALL|FAIL] events properly Message-ID: <77fd72eb-0bb5-af33-0727-90a69ef7733a@suse.cz> Date: Tue, 9 Feb 2021 17:08:11 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.0 MIME-Version: 1.0 In-Reply-To: <1213f4c6-7557-268d-253e-23f8fea55b19@google.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2/5/21 11:28 PM, David Rientjes wrote: > On Tue, 2 Feb 2021, Charan Teja Kalla wrote: > >> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >> >> index 519a60d..531f244 100644 >> >> --- a/mm/page_alloc.c >> >> +++ b/mm/page_alloc.c >> >> @@ -4152,6 +4152,8 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order, >> >> memalloc_noreclaim_restore(noreclaim_flag); >> >> psi_memstall_leave(&pflags); >> >> >> >> + if (*compact_result == COMPACT_SKIPPED) >> >> + return NULL; >> >> /* >> >> * At least in one zone compaction wasn't deferred or skipped, so let's >> >> * count a compaction stall >> > >> > This makes sense, I wonder if it would also be useful to check that >> > page == NULL, either in try_to_compact_pages() or here for >> > COMPACT_SKIPPED? >> >> In the code, when COMPACT_SKIPPED is being returned, the page will >> always be NULL. So, I'm not sure how much useful it is for the page == >> NULL check here. Or I failed to understand your point here? >> > > Your code is short-circuiting the rest of __alloc_pages_direct_compact() > where the return value is dictated by whether page is NULL or non-NULL. > We can't leak a captured page if we are testing for it being NULL or > non-NULL, which is what the rest of __alloc_pages_direct_compact() does > *before* your change. So the idea was to add a check the page is actually > NULL here since you are now relying on the return value of > compact_zone_order() to be COMPACT_SKIPPED to infer page == NULL. > > I agree that's currently true in the code, I was trying to catch any > errors where current->capture_control.page was non-NULL but > try_to_compact_pages() returns COMPACT_SKIPPED. There's some complexity > here. > > So my idea was the expand this out to: > > if (*compact_result == COMPACT_SKIPPED) { > VM_BUG_ON(page); > return NULL; > } Note that this may indeed actually trigger due to free page capture, when an IRQ handler frees the page. See commit b9e20f0da1f5 ("mm, compaction: make capture control handling safe wrt interrupts") describing how this was happening for Hugh. So, this VM_BUG_ON() would sooner or later trigger. It's because while compact_zone() does detect a successful capture and return COMPACT_SUCCESS, the IRQ-capture can also happen later without being detected - at any moment until compact_zone_order() resets the current->capture_control to NULL. And at that point it may be already poised to return COMPACT_SKIPPED. It might be cleanest to check *capture at the end of compact_zone_order() and return COMPACT_SUCCESS when non-NULL. Technically it might be not true that compaction was successful (we were just lucky that IRQ came and freed the page), but not much harm in that. Better than e.g. the danger of leaking the captured page which the proposed patch would do due to the shortcut. The minor downside is that you would count a stall that wasn't really a stall in case we skipped compaction, but captured a page by luck, but it would be very rare.