Subject: Re: [PATCH] mm/compaction: count pages and stop correctly during page isolation.
To: Zi Yan, Andrew Morton, linux-mm@kvack.org
Cc: Rik van Riel, linux-kernel@vger.kernel.org
References: <20201029200435.3386066-1-zi.yan@sent.com>
From: Vlastimil Babka
Message-ID: <16bdfad8-05f9-6ecf-0db6-c2dcf8e60309@suse.cz>
Date: Fri, 30 Oct 2020 15:50:04 +0100
In-Reply-To: <20201029200435.3386066-1-zi.yan@sent.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 10/29/20 9:04 PM, Zi Yan wrote:
> From: Zi Yan
>
> In isolate_migratepages_block, when cc->alloc_contig is true, we are
> able to isolate compound pages, nr_migratepages and nr_isolated did not
> count compound pages correctly, causing us to isolate more pages than we
> thought.
> Use thp_nr_pages to count pages. Otherwise, we might be trapped
> in too_many_isolated while loop, since the actual isolated pages can go
> up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
> since we stop isolation after cc->nr_migratepages reaches to
> COMPACT_CLUSTER_MAX.

I wonder if a better fix would be to adjust the too_many_isolated() check so
that if we have non-zero cc->nr_migratepages, we bail out from further
isolation and migrate what we have immediately, instead of looping. Because I
can also imagine a hypothetical situation where multiple threads in parallel
cause too_many_isolated() to be true, and will all loop there forever. The
fix I'm proposing should prevent such a situation as well, AFAICT.

> In addition, after we fix the issue above, cc->nr_migratepages could
> never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated,
> thus page isolation could not stop as we intended. Change the isolation
> stop condition to >=.
>
> Signed-off-by: Zi Yan
> ---
>  mm/compaction.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index ee1f8439369e..0683a4999581 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>
>  isolate_success:
>  		list_add(&page->lru, &cc->migratepages);
> -		cc->nr_migratepages++;
> -		nr_isolated++;
> +		cc->nr_migratepages += thp_nr_pages(page);
> +		nr_isolated += thp_nr_pages(page);
>
>  		/*
>  		 * Avoid isolating too much unless this block is being
> @@ -1021,7 +1021,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>  		 * or a lock is contended. For contention, isolate quickly to
>  		 * potentially remove one source of contention.
>  		 */
> -		if (cc->nr_migratepages == COMPACT_CLUSTER_MAX &&
> +		if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX &&
>  		    !cc->rescan && !cc->contended) {
>  			++low_pfn;
>  			break;
> @@ -1132,7 +1132,7 @@ isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn,
>  		if (!pfn)
>  			break;
>
> -		if (cc->nr_migratepages == COMPACT_CLUSTER_MAX)
> +		if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX)
>  			break;
>  	}
>
>
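
To make the counting argument in the commit message concrete, here is a small
userspace model of the isolation loop. It is only a sketch, not kernel code:
simulate_isolation(), struct isolation_result and HPAGE_NR_PAGES are names
made up for illustration, and 512 is simply the base-pages-per-compound-page
figure implied by the COMPACT_CLUSTER_MAX*512=16384 arithmetic above.

```c
#include <assert.h>
#include <stdbool.h>

#define COMPACT_CLUSTER_MAX 32   /* value quoted in the commit message */
#define HPAGE_NR_PAGES      512  /* assumed base pages per compound page */

/* Hypothetical summary of one simulated isolation run. */
struct isolation_result {
	unsigned long counted;  /* models cc->nr_migratepages */
	unsigned long actual;   /* base pages really isolated */
	bool stopped_early;     /* true if the stop condition fired */
};

/*
 * Isolate up to nr_items list entries, each spanning pages_per_item base
 * pages. count_by_size selects "+= pages_per_item" (the thp_nr_pages()
 * style of counting) versus "++"; stop_on_ge selects ">=" versus "==" for
 * the stop condition.
 */
struct isolation_result
simulate_isolation(unsigned long nr_items, unsigned long pages_per_item,
		   bool count_by_size, bool stop_on_ge)
{
	struct isolation_result r = { 0, 0, false };
	unsigned long i;

	assert(pages_per_item >= 1);

	for (i = 0; i < nr_items; i++) {
		r.actual += pages_per_item;
		r.counted += count_by_size ? pages_per_item : 1;

		if (stop_on_ge ? (r.counted >= COMPACT_CLUSTER_MAX)
			       : (r.counted == COMPACT_CLUSTER_MAX)) {
			r.stopped_early = true;
			break;
		}
	}
	return r;
}
```

Calling simulate_isolation(100, HPAGE_NR_PAGES, false, false) models the old
code: the loop stops with counted == 32, but actual is 32*512 = 16384 base
pages. With count_by_size = true and stop_on_ge = false, the counter jumps
from 0 straight to 512, the "==" test never matches, and all 100 entries are
taken. With both flags true, the loop stops after the first compound page.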