From: Zi Yan
To: Andrew Morton, linux-mm@kvack.org
Cc: Yang Shi, Michal Hocko, Vlastimil Babka, Rik van Riel,
    linux-kernel@vger.kernel.org, stable@vger.kernel.org, Zi Yan
Subject: [PATCH v2 1/2] mm/compaction: count pages and stop correctly during page isolation.
Date: Fri, 30 Oct 2020 11:57:15 -0400
Message-Id: <20201030155716.3614401-1-zi.yan@sent.com>
X-Mailer: git-send-email 2.28.0
Reply-To: Zi Yan
X-Mailing-List: linux-kernel@vger.kernel.org

From: Zi Yan

In isolate_migratepages_block(), when cc->alloc_contig is true, compound
pages can be isolated. However, cc->nr_migratepages and nr_isolated were
incremented by one per isolated page, so a compound page was counted as a
single page and we isolated more pages than we thought. Use compound_nr()
to count pages. Otherwise, we might be trapped in the too_many_isolated()
while loop: isolation only stops once cc->nr_migratepages reaches
COMPACT_CLUSTER_MAX, which is 32, so the number of pages actually isolated
can go up to COMPACT_CLUSTER_MAX*512=16384.

In addition, once the counting is fixed, cc->nr_migratepages can jump past
COMPACT_CLUSTER_MAX without ever being equal to it when compound pages are
isolated, so page isolation would not stop as we intended. Change the
isolation stop condition from == to >=.

The issue can be triggered as follows:

In a system with 16GB memory and an 8GB CMA region reserved by
hugetlb_cma, if we first allocate 10GB of THPs and mlock them (so some
THPs are allocated in the CMA region and mlocked), reserving six 1GB
hugetlb pages via
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages will get stuck
(looping in the too_many_isolated() function) until we kill either task.
With the patch applied, the OOM killer will kill the application with
10GB of THPs and let the hugetlb page reservation finish.

Fixes: 1da2f328fa64 (“mm,thp,compaction,cma: allow THP migration for CMA allocations”)
Signed-off-by: Zi Yan
Reviewed-by: Yang Shi
Cc:
---
 mm/compaction.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index ee1f8439369e..3e834ac402f1 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 
 isolate_success:
 		list_add(&page->lru, &cc->migratepages);
-		cc->nr_migratepages++;
-		nr_isolated++;
+		cc->nr_migratepages += compound_nr(page);
+		nr_isolated += compound_nr(page);
 
 		/*
 		 * Avoid isolating too much unless this block is being
@@ -1021,7 +1021,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 		 * or a lock is contended. For contention, isolate quickly to
 		 * potentially remove one source of contention.
 		 */
-		if (cc->nr_migratepages == COMPACT_CLUSTER_MAX &&
+		if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX &&
 		    !cc->rescan && !cc->contended) {
 			++low_pfn;
 			break;
@@ -1132,7 +1132,7 @@ isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn,
 		if (!pfn)
 			break;
 
-		if (cc->nr_migratepages == COMPACT_CLUSTER_MAX)
+		if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX)
 			break;
 	}
 
-- 
2.28.0
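
For readers who want to check the arithmetic in the changelog, the two
changes can be modelled in userspace. This is only a rough sketch and not
kernel code: simulate_isolation() and struct isolation_result are
hypothetical names made up for the illustration, the factor 512 and the
value 32 for COMPACT_CLUSTER_MAX are taken from the changelog, and the
real loop has further exit conditions (cc->rescan, cc->contended, end of
the pageblock) that are left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Value quoted in the changelog. */
#define COMPACT_CLUSTER_MAX 32UL

/* Hypothetical result type for this sketch; not a kernel structure. */
struct isolation_result {
	unsigned long counted;	/* what the nr_migratepages counter reports */
	unsigned long actual;	/* base pages really isolated */
	bool stopped;		/* did the stop condition ever fire? */
};

/*
 * Model of the isolation loop: walk `candidates` isolatable items, each
 * made of `pages_per_item` base pages (1 for a normal page, 512 for the
 * compound pages in the changelog's example).
 *
 * count_compound: add pages_per_item per item (patched) instead of 1 (old).
 * stop_on_ge:     stop on ">=" (patched) instead of "==" (old).
 */
static struct isolation_result
simulate_isolation(unsigned long candidates, unsigned long pages_per_item,
		   bool count_compound, bool stop_on_ge)
{
	struct isolation_result r = { 0, 0, false };
	unsigned long i;

	assert(pages_per_item > 0);

	for (i = 0; i < candidates; i++) {
		r.counted += count_compound ? pages_per_item : 1;
		r.actual += pages_per_item;

		if (stop_on_ge ? r.counted >= COMPACT_CLUSTER_MAX
			       : r.counted == COMPACT_CLUSTER_MAX) {
			r.stopped = true;
			break;
		}
	}
	return r;
}
```

With 100 candidates of 512 pages each, the old behaviour
(simulate_isolation(100, 512, false, false)) stops with the counter at 32
while 16384 base pages are isolated. Fixing only the counting
(simulate_isolation(100, 512, true, false)) makes the counter go 512,
1024, ... and never hit 32, so the loop runs through every candidate.
With both changes (simulate_isolation(100, 512, true, true)) it stops
after the first compound page.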