References: <20210930215311.240774-1-shy828301@gmail.com> <20210930215311.240774-3-shy828301@gmail.com>
From: Yang Shi
Date: Wed, 6 Oct 2021 16:57:38 -0700
Subject: Re: [v3 PATCH 2/5] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
To: Peter Xu
Cc: HORIGUCHI NAOYA(堀口 直也), Hugh Dickins, "Kirill A. Shutemov", Matthew Wilcox, Oscar Salvador, Andrew Morton, Linux MM, Linux FS-devel Mailing List, Linux Kernel Mailing List
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 6, 2021 at 1:15 PM Peter Xu wrote:
>
> On Thu, Sep 30, 2021 at 02:53:08PM -0700, Yang Shi wrote:
> > @@ -1148,8 +1148,12 @@ static int __get_hwpoison_page(struct page *page)
> >                 return -EBUSY;
> >
> >         if (get_page_unless_zero(head)) {
> > -               if (head == compound_head(page))
> > +               if (head == compound_head(page)) {
> > +                       if (PageTransHuge(head))
> > +                               SetPageHasHWPoisoned(head);
> > +
> >                         return 1;
> > +               }
> >
> >                 pr_info("Memory failure: %#lx cannot catch tail\n",
> >                         page_to_pfn(page));
>
> Sorry for the late comments.
>
> I'm wondering whether it's ideal to set this bit here, as get_hwpoison_page()
> sounds like a pure helper to get a refcount out of a sane hwpoisoned page.  I'm
> afraid there can be a side effect that we set this without being noticed, so
> I'm also wondering whether we should keep it in memory_failure().
>
> Quoting comments for get_hwpoison_page():
>
>  * get_hwpoison_page() takes a page refcount of an error page to handle memory
>  * error on it, after checking that the error page is in a well-defined state
>  * (defined as a page-type we can successfully handle the memory error on it,
>  * such as LRU page and hugetlb page).
>
> For example, I see that both unpoison_memory() and soft_offline_page() will
> call it too, does it mean that we'll also set the bits e.g. even when we want
> to inject an unpoison event too?

unpoison_memory() should not be a problem, since it just bails out once
it meets a THP, as the comment says:

        /*
         * unpoison_memory() can encounter thp only when the thp is being
         * worked by memory_failure() and the page lock is not held yet.
         * In such case, we yield to memory_failure() and make unpoison fail.
         */

And I think we should set the flag for soft offline too, right? Soft
offline does set the hwpoison flag on the corrupted subpage and doesn't
split a file THP, so it should be caught by the page fault path as well.
And yes for poison injection.

But your comment reminds me that get_hwpoison_page() is only called when
!MF_COUNT_INCREASED, which means MADV_HWPOISON could still escape. This
needs to be covered too.

BTW, I did test with MADV_HWPOISON, but I didn't test this change
(moving the flag set after get_page_unless_zero()) since I thought it
was trivial, and I overlooked this case.

>
> Thanks,
>
> --
> Peter Xu
>
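
For illustration only, the MF_COUNT_INCREASED gap described above can be
modeled in plain userspace C. This is a sketch under stated assumptions,
not kernel code: struct model_page, MODEL_MF_COUNT_INCREASED and all
model_* functions are hypothetical stand-ins, and only the control flow
mentioned in this thread is reproduced (get_hwpoison_page() is skipped
when the caller already holds a reference).

```c
#include <stdbool.h>

/* Hypothetical stand-in for MF_COUNT_INCREASED: caller already holds a ref. */
#define MODEL_MF_COUNT_INCREASED 0x1

/* Hypothetical, heavily simplified page model; not the kernel's struct page. */
struct model_page {
	int  refcount;
	bool is_thp;          /* stands in for PageTransHuge(head) */
	bool hwpoison;        /* stands in for the per-subpage hwpoison flag */
	bool has_hwpoisoned;  /* stands in for the flag set by the v3 hunk */
};

/* Models the v3 hunk: the flag is set as a side effect of taking the ref. */
static int model_get_hwpoison_page(struct model_page *head)
{
	if (head->refcount == 0)
		return 0;     /* get_page_unless_zero() would fail here */
	head->refcount++;
	if (head->is_thp)
		head->has_hwpoisoned = true;
	return 1;
}

/* Models only the branch discussed in the thread, nothing else. */
static int model_memory_failure(struct model_page *head, int flags)
{
	head->hwpoison = true;
	if (!(flags & MODEL_MF_COUNT_INCREASED)) {
		if (!model_get_hwpoison_page(head))
			return -1;
	}
	/* With MODEL_MF_COUNT_INCREASED set, the helper never ran. */
	return 0;
}

/* Models an MADV_HWPOISON-style caller that pins the page first. */
static int model_madvise_inject(struct model_page *head)
{
	head->refcount++;
	return model_memory_failure(head, MODEL_MF_COUNT_INCREASED);
}
```

In this model, calling model_memory_failure() with flags 0 on a THP
leaves has_hwpoisoned set, while model_madvise_inject() poisons the page
but leaves has_hwpoisoned clear. That is the escape noted above, and it
is one reason setting the flag in memory_failure() itself, as Peter
suggests, would cover both callers.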