Date: Tue, 8 Jun 2021 15:00:26 +0300
From: "Kirill A. Shutemov"
To: Xu Yu
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, hughd@google.com,
	akpm@linux-foundation.org, gavin.dg@linux.alibaba.com
Subject: Re: [PATCH v2] mm, thp: use head page in __migration_entry_wait
Message-ID: <20210608120026.ugfh72ydjeba44bo@box.shutemov.name>

On Tue, Jun 08, 2021 at 05:22:39PM +0800, Xu Yu wrote:
> We notice that a hung task happens in a corner but practical scenario
> when CONFIG_PREEMPT_NONE is enabled, as follows.
>
> Process 0                       Process 1                     Process 2..Inf
> split_huge_page_to_list
>     unmap_page
>         split_huge_pmd_address
>                                 __migration_entry_wait(head)
>                                                               __migration_entry_wait(tail)
>     remap_page (roll back)
>         remove_migration_ptes
>             rmap_walk_anon
>                 cond_resched
>
> Here __migration_entry_wait(tail) occurs in kernel space, e.g.,
> copy_to_user in fstat, which will immediately fault again without
> rescheduling, and thus occupies the CPU fully.
>
> When there are too many processes performing __migration_entry_wait on
> the tail page, remap_page will never be done after cond_resched.
>
> This makes __migration_entry_wait operate on the compound head page,
> and thus wait for remap_page to complete, whether the THP is split
> successfully or rolled back.
>
> Note that put_and_wait_on_page_locked helps to drop the page reference
> acquired with get_page_unless_zero, as soon as the page is on the wait
> queue, before actually waiting. So splitting the THP is only prevented
> for a brief interval.
>
> Fixes: ba98828088ad ("thp: add option to setup migration entries during PMD split")
> Suggested-by: Hugh Dickins
> Signed-off-by: Gang Deng
> Signed-off-by: Xu Yu

Looks good to me:

Acked-by: Kirill A. Shutemov
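For reference, the fix amounts to resolving a tail page to its compound
head before sleeping, so every waiter ends up queued on the page that
remap_page() actually unlocks. A rough sketch of the wait path (not the
exact hunk from the patch; helper names and signatures assumed as in
mm/migrate.c around v5.13):

	void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
				    spinlock_t *ptl)
	{
		pte_t pte;
		swp_entry_t entry;
		struct page *page;

		spin_lock(ptl);
		pte = *ptep;
		if (!is_swap_pte(pte))
			goto out;

		entry = pte_to_swp_entry(pte);
		if (!is_migration_entry(entry))
			goto out;

		page = migration_entry_to_page(entry);
		/* The fix: wait on the compound head that remap_page()
		 * unlocks, not on the tail page that faulted. */
		page = compound_head(page);

		/*
		 * Take a speculative reference: once page cache replacement
		 * of page migration has started, page_count is zero and the
		 * faulting task should simply fault again.
		 */
		if (!get_page_unless_zero(page))
			goto out;
		pte_unmap_unlock(ptep, ptl);
		put_and_wait_on_page_locked(page, TASK_UNINTERRUPTIBLE);
		return;
	out:
		pte_unmap_unlock(ptep, ptl);
	}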
But there's one quirk: if the split succeeds, we effectively wait on the
wrong page to be unlocked. And it may take indefinite time if
split_huge_page() was called on the head page.

Maybe we should consider waking up the head waiter on the head page,
even if it is still locked after split?

Something like this (untested):

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 63ed6b25deaa..f79a38e21e53 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2535,6 +2535,9 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 		 */
 		put_page(subpage);
 	}
+
+	if (page == head)
+		wake_up_page_bit(page, PG_locked);
 }
 
 int total_mapcount(struct page *page)

-- 
 Kirill A. Shutemov