Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp8328283rwb; Wed, 23 Nov 2022 20:09:06 -0800 (PST) X-Google-Smtp-Source: AA0mqf7vp4/G//4K+jbZ1aIW240QmIdhcEHOdRtZr1ULX2fxxpnVX3CgUklPpffgxSfz7gSucFL1 X-Received: by 2002:a17:906:86c9:b0:78d:9324:6f18 with SMTP id j9-20020a17090686c900b0078d93246f18mr26637561ejy.664.1669262945966; Wed, 23 Nov 2022 20:09:05 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1669262945; cv=none; d=google.com; s=arc-20160816; b=ZVAWSpZFT50lhecCLfH1EG+13Omuz1TYtmYV0Cfjys3m/7SmkGVyxgZR9PCXzeBUda v26AuginqjdEJBqpBmREBID9gogKk/kzhXlRfcDoJO86f6EkV4yoz5mL9s+LRlF4d/BS VNc2gsZomRMq6ZvpWRzNOPqlqbYS0BaaY+iXLI+vMGQaVEQUZakAYgNc223HoPqmPStG 09OodJHvmzuFZBgCIlvH7CqpPiEoDUM+ZaAA8AX8W3vYmES1CaWuVUBYLLGJYak5Z6g9 +SNoCBaDFCO09mN8qABLL5jTug67HVFRMTfLiZQxsCXcNLTUH6RJV3MnJ4g3BYJ8MVwZ STOA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=SiYzGWsRarEdt81Og2LTbZsHP1GWYUw0h1JMLzGK6gc=; b=oDe+1SGTvnbHAnbjuxNcqo2iR8pEYpOALur7E5f0So96alDUzs0O7UV4JzEKHsgrgT Z+pWSOTES05n45IBExpi50oVN09VVAwGTBig5c1C/lCMaF2lXVn3M3IPa14T61jsiWj8 DGcCo9Byy+tx2djnkJ+nAcc2MzKWIYw0H0nWDR9uWsVJQIposZPC5eKNg5Ds88IQ53Ku 63CVz29LpOQWOAiKwjXI9CZzb/wajXEHT3xff56JidlGTggZE/cKAxUZPrRWu2Ooh+PD ktj/4c96db4N97MgOaEbNbmr43hZV5Wut3qok7jS9aS/73ffnzAZwrNlvYeqxGSMR9V7 PeWg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@infradead.org header.s=casper.20170209 header.b=rWgR4hAa; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id n14-20020a05640206ce00b004661ece3fabsi38600edy.575.2022.11.23.20.08.44; Wed, 23 Nov 2022 20:09:05 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@infradead.org header.s=casper.20170209 header.b=rWgR4hAa; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229602AbiKXDdg (ORCPT + 89 others); Wed, 23 Nov 2022 22:33:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52394 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229448AbiKXDde (ORCPT ); Wed, 23 Nov 2022 22:33:34 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 71309C607F for ; Wed, 23 Nov 2022 19:33:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=SiYzGWsRarEdt81Og2LTbZsHP1GWYUw0h1JMLzGK6gc=; b=rWgR4hAalUnGhNvEqX826xzxNB K4oSWO0gcBRj+m4eI4DUB8xXzQmbQbVC1L4TCRMe9n+8t8EqVLyXLrMdFcnRb7yUmv4c4gPYKFlhY VoO1DFAyA5IhXAc7Yln76y40O3ok5lfnVACt4nq6k+BJ08meFBD9tZyo9vptBQFqYmDAt6suEdqmj ri8conSsbCyAf9i0HtJhTB5Fx/iC59MDaKsyxbij1y6G5BNhwFv74bjhcFyPffZvAAi5gMbN52XRy ELp3UPyvx4jLBihtwHYg/ci9KOIIoAchx7k1knEUucZS+RTd/Jj3o20u09jIW/+gudAskJVN9jBpz D4h0odvg==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1oy2zT-008LGP-Sl; Thu, 24 Nov 2022 03:33:31 +0000 Date: Thu, 24 Nov 2022 03:33:31 +0000 From: Matthew Wilcox To: Alistair Popple Cc: David Hildenbrand , Hugh Dickins , Gavin Shan , linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, william.kucharski@oracle.com, ziy@nvidia.com, kirill.shutemov@linux.intel.com, zhenyzha@redhat.com, shan.gavin@gmail.com, riel@surriel.com Subject: Re: [PATCH] mm: migrate: Fix THP's mapcount on isolation Message-ID: References: <20221123005752.161003-1-gshan@redhat.com> <871qptrvsw.fsf@nvidia.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <871qptrvsw.fsf@nvidia.com> X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Nov 24, 2022 at 12:06:56PM +1100, Alistair Popple wrote: > > David Hildenbrand writes: > > > On 23.11.22 06:14, Hugh Dickins wrote: > >> On Wed, 23 Nov 2022, Gavin Shan wrote: > >> > >>> The issue is reported when removing memory through virtio_mem device. > >>> The transparent huge page, experienced copy-on-write fault, is wrongly > >>> regarded as pinned. The transparent huge page is escaped from being > >>> isolated in isolate_migratepages_block(). The transparent huge page > >>> can't be migrated and the corresponding memory block can't be put > >>> into offline state. > >>> > >>> Fix it by replacing page_mapcount() with total_mapcount(). With this, > >>> the transparent huge page can be isolated and migrated, and the memory > >>> block can be put into offline state. > >>> > >>> Fixes: 3917c80280c9 ("thp: change CoW semantics for anon-THP") > >>> Cc: stable@vger.kernel.org # v5.8+ > >>> Reported-by: Zhenyu Zhang > >>> Suggested-by: David Hildenbrand > >>> Signed-off-by: Gavin Shan > >> Interesting, good catch, looked right to me: except for the Fixes > >> line > >> and mention of v5.8. That CoW change may have added a case which easily > >> demonstrates the problem, but it would have been the wrong test on a THP > >> for long before then - but only in v5.7 were compound pages allowed > >> through at all to reach that test, so I think it should be > >> Fixes: 1da2f328fa64 ("mm,thp,compaction,cma: allow THP migration for > >> CMA allocations") > >> Cc: stable@vger.kernel.org # v5.7+ > >> Oh, no, stop: this is not so easy, even in the latest tree. > >> Because at the time of that "admittedly racy check", we have no hold > >> at all on the page in question: and if it's PageLRU or PageCompound > >> at one instant, it may be different the next instant. Which leaves it > >> vulnerable to whatever BUG_ON()s there may be in the total_mapcount() > >> path - needs research. *Perhaps* there are no more BUG_ON()s in the > >> total_mapcount() path than in the existing page_mapcount() path. > >> I suspect that for this to be safe (before your patch and more so > >> after), > >> it will be necessary to shift the "admittedly racy check" down after the > >> get_page_unless_zero() (and check the sequence of operations when a > >> compound page is initialized). > > > > Grabbing a reference first sounds like the right approach to me. > > I think you're right. Without a page reference I don't think it is even > safe to look at struct page, at least not without synchronisation > against memory hot unplug which could remove the struct page. From a > quick glance I didn't see anything here that obviously did that though. Memory hotplug is the offending party here. It has to make sure that everything else is definitely quiescent before removing the struct pages. Otherwise you can't even try_get a refcount.