From: Vlastimil Babka
To: Alex Shi, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com, richard.weiyang@gmail.com, kirill@shutemov.name, alexander.duyck@gmail.com, rong.a.chen@intel.com, mhocko@suse.com, vdavydov.dev@gmail.com, shy828301@gmail.com
Subject: Re: [PATCH v21 15/19] mm/compaction: do page isolation first in compaction
Date: Wed, 11 Nov 2020 18:12:28 +0100
References: <1604566549-62481-1-git-send-email-alex.shi@linux.alibaba.com> <1604566549-62481-16-git-send-email-alex.shi@linux.alibaba.com>
In-Reply-To: <1604566549-62481-16-git-send-email-alex.shi@linux.alibaba.com>

On 11/5/20 9:55 AM, Alex Shi wrote:
> Currently, compaction would get the lru_lock and then do page isolation
> which works fine with pgdat->lru_lock, since any page isoltion would
> compete for the lru_lock. If we want to change to memcg lru_lock, we
> have to isolate the page before getting lru_lock, thus isoltion would
> block page's memcg change which relay on page isoltion too. Then we
> could safely use per memcg lru_lock later.
>
> The new page isolation use previous introduced TestClearPageLRU() +
> pgdat lru locking which will be changed to memcg lru lock later.
>
> Hugh Dickins fixed following bugs in this patch's
> early version:
>
> Fix lots of crashes under compaction load: isolate_migratepages_block()
> must clean up appropriately when rejecting a page, setting PageLRU again
> if it had been cleared; and a put_page() after get_page_unless_zero()
> cannot safely be done while holding locked_lruvec - it may turn out to
> be the final put_page(), which will take an lruvec lock when PageLRU.
> And move __isolate_lru_page_prepare back after get_page_unless_zero to
> make trylock_page() safe:
> trylock_page() is not safe to use at this time: its setting PG_locked
> can race with the page being freed or allocated ("Bad page"), and can
> also erase flags being set by one of those "sole owners" of a freshly
> allocated page who use non-atomic __SetPageFlag().
>
> Suggested-by: Johannes Weiner
> Signed-off-by: Alex Shi
> Acked-by: Hugh Dickins
> Acked-by: Johannes Weiner
> Cc: Andrew Morton
> Cc: Matthew Wilcox
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-mm@kvack.org

Acked-by: Vlastimil Babka

A question below:

> @@ -979,10 +995,6 @@ static bool too_many_isolated(pg_data_t *pgdat)
>  					goto isolate_abort;
>  			}
>
> -			/* Recheck PageLRU and PageCompound under lock */
> -			if (!PageLRU(page))
> -				goto isolate_fail;
> -
>  			/*
>  			 * Page become compound since the non-locked check,
>  			 * and it's on LRU. It can only be a THP so the order
> @@ -990,16 +1002,13 @@ static bool too_many_isolated(pg_data_t *pgdat)
>  			 */
>  			if (unlikely(PageCompound(page) && !cc->alloc_contig)) {
>  				low_pfn += compound_nr(page) - 1;
> -				goto isolate_fail;
> +				SetPageLRU(page);
> +				goto isolate_fail_put;
>  			}

IIUC the danger here is that khugepaged will collapse a THP. For that,
__collapse_huge_page_isolate() has to succeed in isolate_lru_page(). Under
the new scheme, that shouldn't be possible, right? If that's correct, can
we remove this part?
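
The reasoning behind the question can be sketched in user space. This is
only a toy model, not kernel code: it assumes that every isolation path
(compaction here, and isolate_lru_page() on the khugepaged side) claims a
page through the same atomic test-and-clear of the LRU flag, as the quoted
changelog describes for TestClearPageLRU(). The names toy_page, TOY_PG_LRU,
toy_test_clear_lru(), toy_set_lru() and toy_demo() are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only a flags word. */
struct toy_page {
	atomic_uint flags;
};

#define TOY_PG_LRU (1u << 0)	/* toy stand-in for the PG_lru bit */

/*
 * Toy model of a test-and-clear on the LRU flag: atomically clear the
 * bit and report whether it was set before. Only one caller can see
 * "true" until somebody sets the bit again.
 */
static bool toy_test_clear_lru(struct toy_page *p)
{
	unsigned int old = atomic_fetch_and(&p->flags, ~TOY_PG_LRU);

	return (old & TOY_PG_LRU) != 0;
}

/* Toy model of setting the LRU flag again when isolation is rejected. */
static void toy_set_lru(struct toy_page *p)
{
	atomic_fetch_or(&p->flags, TOY_PG_LRU);
}

/*
 * Walk through the scenario from the question: compaction claims the
 * page first, so a later isolation attempt (standing in for the one
 * khugepaged would need) must fail until the flag is set again.
 * Returns 0 if the model behaves as assumed.
 */
int toy_demo(void)
{
	struct toy_page page;

	atomic_init(&page.flags, TOY_PG_LRU);

	bool compaction_won = toy_test_clear_lru(&page);
	bool khugepaged_won = toy_test_clear_lru(&page);

	assert(compaction_won);
	assert(!khugepaged_won);

	/* The isolate_fail_put path in the hunk above restores the flag. */
	toy_set_lru(&page);
	assert(toy_test_clear_lru(&page));

	return 0;
}
```

In this model the second claimant always loses while the flag is clear,
which is the property the question relies on. Whether the real collapse
path has any other way to proceed without the flag is exactly what the
question asks, and the sketch does not settle it.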