Date: Wed, 28 Apr 2021 10:23:49 +0200
From: Oscar Salvador
To: Naoya Horiguchi
Cc: Mike Kravetz, Michal Hocko, Muchun Song, "akpm@linux-foundation.org", "linux-mm@kvack.org", "linux-kernel@vger.kernel.org", Naoya Horiguchi
Subject: Re: [PATCH] mm,hwpoison: fix race with compound page allocation
Message-ID: <20210428082344.GA29213@linux>
References: <20210423080153.GA78658@hori.linux.bs1.fc.nec.co.jp> <20210428074654.GA2093897@u2004>
In-Reply-To: <20210428074654.GA2093897@u2004>

On Wed, Apr 28, 2021 at 04:46:54PM +0900, Naoya Horiguchi wrote:
> ---
> From: Naoya Horiguchi
> Date: Wed, 28 Apr 2021 15:55:47 +0900
> Subject: [PATCH] mm,hwpoison: fix race with compound page allocation
>
> When hugetlb page fault (under overcommiting situation) and memory_failure()
> race, VM_BUG_ON_PAGE() is
> triggered by the following race:
>
>     CPU0:                           CPU1:
>
>     gather_surplus_pages()
>       page = alloc_surplus_huge_page()
>                                     memory_failure_hugetlb()
>                                       get_hwpoison_page(page)
>                                         __get_hwpoison_page(page)
>                                           get_page_unless_zero(page)
>       zero = put_page_testzero(page)
>       VM_BUG_ON_PAGE(!zero, page)
>       enqueue_huge_page(h, page)
>                                       put_page(page)
>
> __get_hwpoison_page() only checks page refcount before taking additional
> one for memory error handling, which is wrong because there's time
> windows where compound pages have non-zero refcount during initialization.
>
> So makes __get_hwpoison_page() check more page status for a few types
> of compound pages. PageSlab() check is added because otherwise
> "non anonymous thp" path is wrongly chosen for slab pages.

Was it wrongly chosen even before? If so, maybe a Fixes tag is warranted.

>
> Signed-off-by: Naoya Horiguchi
> Reported-by: Muchun Song
> ---
>  mm/memory-failure.c | 48 +++++++++++++++++++++++++--------------------
>  1 file changed, 27 insertions(+), 21 deletions(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index a3659619d293..61988e332712 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1095,30 +1095,36 @@ static int __get_hwpoison_page(struct page *page)
> +       if (PageCompound(page)) {
> +               if (PageSlab(page)) {
> +                       return get_page_unless_zero(page);
> +               } else if (PageHuge(head)) {
> +                       if (HPageFreed(head) || HPageMigratable(head))
> +                               return get_page_unless_zero(head);

Concerns were raised that memory-failure should not be fiddling with a
page's refcount without holding the hugetlb lock.
So, if we really want to make this more stable, we might want to hold the
lock here.

HPageFreed is both set and cleared under the lock, and HPageMigratable is
at least cleared under it, so I think it would be more sane to do this
check under the lock to close any possible race.

Does it make sense?

-- 
Oscar Salvador
SUSE L3
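
For illustration only, here is a minimal userspace sketch of the "check the flags and take the reference under the lock" idea. It is not kernel code and not part of the patch: `struct model_page`, `model_hugetlb_lock` and the `model_*` helpers are hypothetical stand-ins, assuming a pthread mutex plays the role of the hugetlb lock and two booleans play the role of HPageFreed and HPageMigratable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a hugetlb head page. */
struct model_page {
    atomic_int refcount;
    bool freed;       /* models HPageFreed */
    bool migratable;  /* models HPageMigratable */
};

/* Hypothetical stand-in for the hugetlb lock. */
static pthread_mutex_t model_hugetlb_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take a reference only if the count is currently non-zero. */
static bool model_get_page_unless_zero(struct model_page *p)
{
    int old = atomic_load(&p->refcount);

    while (old != 0) {
        /* On failure, 'old' is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/*
 * Check the page state and take the reference while holding the lock,
 * so the flags cannot change between the check and the refcount bump.
 * A page that is still being set up (neither flag set) is left alone.
 */
static bool model_get_hwpoison_huge_page(struct model_page *head)
{
    bool got = false;

    assert(head != NULL);

    pthread_mutex_lock(&model_hugetlb_lock);
    if (head->freed || head->migratable)
        got = model_get_page_unless_zero(head);
    pthread_mutex_unlock(&model_hugetlb_lock);

    return got;
}
```

In this model, a page with a non-zero refcount but neither flag set (the window during surplus page allocation) is never pinned, which is the behaviour the patch aims for; whether the real code should take the lock at this point is the open question in this thread.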