Date: Mon, 17 May 2021 12:12:50 +0200
From: Oscar Salvador
To: Naoya Horiguchi
Cc: Muchun Song, linux-mm@kvack.org, Andrew Morton, Mike Kravetz, Michal Hocko, Tony Luck, Naoya Horiguchi, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 1/2] mm,hwpoison: fix race with compound page allocation
Message-ID: <20210517101239.GA21612@linux>
References: <20210517045401.2506032-1-nao.horiguchi@gmail.com> <20210517045401.2506032-2-nao.horiguchi@gmail.com>
In-Reply-To: <20210517045401.2506032-2-nao.horiguchi@gmail.com>

On Mon, May 17, 2021 at 01:54:00PM +0900, Naoya Horiguchi wrote:
> From: Naoya Horiguchi
>
> When hugetlb page fault (under overcommitting situation) and
> memory_failure() race, VM_BUG_ON_PAGE() is triggered by the following race:
>
>   CPU0:                           CPU1:
>
>                                   gather_surplus_pages()
>                                     page = alloc_surplus_huge_page()
>   memory_failure_hugetlb()
>     get_hwpoison_page(page)
>       __get_hwpoison_page(page)
>         get_page_unless_zero(page)
>                                     zero = put_page_testzero(page)
>                                     VM_BUG_ON_PAGE(!zero, page)
>                                     enqueue_huge_page(h, page)
>     put_page(page)
>
> __get_hwpoison_page() only checks page refcount before taking additional
                                    ^^ the?                     ^^ an
> one for memory error handling, which is wrong because there's a time
> window where compound pages have non-zero refcount during initialization.
>
> So makes __get_hwpoison_page() check page status a bit more for a few
     ^^ make
> types of compound pages. PageSlab() check is added because otherwise
> "non anonymous thp" path is wrongly chosen.

This is no longer true with this patch, is it? What happened here?

>  static int __get_hwpoison_page(struct page *page)
>  {
>  	struct page *head = compound_head(page);
> +	int ret = 0;
> +
> +#ifdef CONFIG_HUGETLB_PAGE
> +	spin_lock(&hugetlb_lock);
> +	if (PageHuge(head) && (HPageFreed(head) || HPageMigratable(head)))
> +		ret = get_page_unless_zero(head);
> +	spin_unlock(&hugetlb_lock);
> +	if (ret > 0)
> +		return ret;
> +#endif

I am kind of fine with this, but I wonder whether it makes sense to hide
these details in a helper (with an empty stub for non-hugetlb builds)?

>  	if (!PageHuge(head) && PageTransHuge(head)) {

This !PageHuge could go?

-- 
Oscar Salvador
SUSE L3
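
To make the helper suggestion above a bit more concrete, here is a rough
sketch. It is a userspace model, not kernel code, and it has not been built
against the tree: struct page, its flag fields and the atomic_flag lock are
stand-ins for the real definitions, and get_hwpoison_huge_page() and
get_hwpoison_page_model() are made-up names. It only shows the shape, that is
the #ifdef moving out of the caller and into a helper with an empty stub.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption: in the real tree this comes from Kconfig, not a #define. */
#define CONFIG_HUGETLB_PAGE

/* Stand-in for struct page; the bools model the page flag tests. */
struct page {
	int refcount;
	bool huge;        /* models PageHuge() */
	bool freed;       /* models HPageFreed() */
	bool migratable;  /* models HPageMigratable() */
};

/* Take a reference only if the count is already non-zero. */
static int get_page_unless_zero(struct page *page)
{
	assert(page->refcount >= 0);
	if (page->refcount == 0)
		return 0;
	page->refcount++;
	return 1;
}

#ifdef CONFIG_HUGETLB_PAGE
/* Stand-in for hugetlb_lock: a trivial test-and-set spinlock. */
static atomic_flag hugetlb_lock = ATOMIC_FLAG_INIT;

/*
 * Hypothetical helper: returns 1 if a reference was taken on a hugetlb
 * page that is in a stable state (free or in use), 0 otherwise.
 */
static int get_hwpoison_huge_page(struct page *head)
{
	int ret = 0;

	while (atomic_flag_test_and_set(&hugetlb_lock))
		;	/* spin */
	if (head->huge && (head->freed || head->migratable))
		ret = get_page_unless_zero(head);
	atomic_flag_clear(&hugetlb_lock);
	return ret;
}
#else
/* Empty stub: nothing to pin when hugetlb support is compiled out. */
static inline int get_hwpoison_huge_page(struct page *head)
{
	(void)head;
	return 0;
}
#endif

/* Caller no longer needs an #ifdef of its own. */
static int get_hwpoison_page_model(struct page *head)
{
	int ret = get_hwpoison_huge_page(head);

	if (ret > 0)
		return ret;
	/* The THP and generic paths of the real function are left out. */
	return 0;
}
```

In this model a hugetlb page that is still being set up (huge, but neither
freed nor migratable) is skipped by the helper even though its refcount is
non-zero, which is the window the patch description talks about.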