Received: by 2002:a05:6a10:5bc5:0:0:0:0 with SMTP id os5csp3047039pxb; Tue, 12 Oct 2021 20:02:58 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxvWcneIru8qP8Jj1wPDdDFbqEatExjrhbuKzlbfL/ViQLocTs5YvGV/CMVwRcFSeL5yaqk X-Received: by 2002:a63:a804:: with SMTP id o4mr25480322pgf.309.1634094178220; Tue, 12 Oct 2021 20:02:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1634094178; cv=none; d=google.com; s=arc-20160816; b=dhnhezLHWegCZ8WgQRX0YK8ofXiLg3O+bEPLk5N4YwLiqhxpJDlPAIQZUenuDScn3G jNF86YNartaaRR4SGr1QuT4aEvn5hXcFNItss5hfhoffAILRe+42sxGgyVBXizzYMD8g euuiH6P32iLJdsLuGXzRpKu6HC6vLZ+5M85N/o95WR394CnVuq2G54rF4xng5k9sDGQA Z29EmHRxZoZn04MVmzt6fzEKRq/N/iSjHW5hmGUgfTimjFLuzRy0dpG3F5ZinyhbdEHw ntBSaqcx1WdbDRyyxRe6Yy5AVUzVpRlV5hc3t0x5TZYIbJyeiQytBLUGka4uHVwuRIBk vwpQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=qAK7ZtyLejfkOrKCroLEuyBtwqTtOLSVz4puxzj6usQ=; b=bRyqSggtBemVkxMkkCSCjpAN0/qxf7hhi7qnlu8LRYeKBLyNR8ICiMlQs1Adfa72SP OoSpPixF8xqiCwknPJGVRYOYX/S4ioSOTPeIMJJ6HxN5Q4QK4m6d75eqP4q1gHfip3/P z5nPdGjJlshTE4MyJLCkFcjNq68zAZiTAVANB6TiEdJ9wUKdol931UMitX1Om/rTdFb4 7Y4NfmkBrYjV499+M2InJMX+DvQdUufN0OaVcIZBxTsKxCx84WNgdr+FtmSh/rpeIDRu SzL0tliGQEigJL6U7vcTVaUmHQlhFF1uI1l9X5Wa7C4VSG6AKuunTEU/PBwPJyOAwzCx 8mdQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=T+cpt8QW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id u13si18518861pfi.122.2021.10.12.20.02.45; Tue, 12 Oct 2021 20:02:58 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=T+cpt8QW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236953AbhJMDDr (ORCPT + 99 others); Tue, 12 Oct 2021 23:03:47 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:46085 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229980AbhJMDDr (ORCPT ); Tue, 12 Oct 2021 23:03:47 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1634094104; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=qAK7ZtyLejfkOrKCroLEuyBtwqTtOLSVz4puxzj6usQ=; b=T+cpt8QWNnL4ZfodoggsyipXB6iPu2yGt5rcReBb+DVCN76uqUduOMZNvPBEHfoxgJWMfd g4xtKKVO2AgaK5gnTiypK/1D8POMSKYiL0xyexvj5q57COp21swbNAERI8rTMpRNCJdPeY HeuCLnfgHqNFljKlSgO8QrPI2+tZ2Ks= Received: from mail-pg1-f198.google.com (mail-pg1-f198.google.com [209.85.215.198]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-530-UB0ebTrgMRifbd23g8mhDA-1; Tue, 12 Oct 2021 23:01:43 -0400 X-MC-Unique: UB0ebTrgMRifbd23g8mhDA-1 Received: by mail-pg1-f198.google.com with SMTP id d5-20020a63ed05000000b002688d3aff8aso662928pgi.2 for ; Tue, 12 Oct 2021 20:01:42 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=qAK7ZtyLejfkOrKCroLEuyBtwqTtOLSVz4puxzj6usQ=; b=Vqqi3calW2TSaB2CEyCefmuGZZblumgoL+B5xW+vapPrfcAznjP32JkFbDfjeT6dAb QX+QqbH1JRSJPMnqP/ZmDxyKlsMs2ax/c8kmiG+/1EIxNSIe+4lO47RAxr8hWgLmp0kR oWWycon2QDD/d+n6mfSjxWiD8mBg3CtIBX+srxcoBw6a8cv/4BwZeA0MiGYu9yu6FSrw 8j9WgEVuB7vz0QcT2LjmedEEOag5MYmy7eQAwgVUNUIUABpQmnX5RMgZEuaksuilZkBw Qw2Pvr/Y6y5IRuVzFjqxcgfESdmNDU8IIc1nzJP8FTtOJ1MlpbMf/dsY5erm791eGCzP 9Qxg== X-Gm-Message-State: AOAM532PvuBvJLnHsFlUst3MuJx9clFz5G0POXLj+0rFfpJz4nhJlfvT +nVo6UkubdQgZwy6zxvscL9GXyTadFD9CsU7cUOH4YWkWrlV2xEf/NS0VDiMwHLhXlt3JLHohz7 ndlSMwoqeVKCHmblqd8Qtcip2 X-Received: by 2002:a17:902:b18a:b0:13e:e5f3:1b25 with SMTP id s10-20020a170902b18a00b0013ee5f31b25mr33374302plr.78.1634094101808; Tue, 12 Oct 2021 20:01:41 -0700 (PDT) X-Received: by 2002:a17:902:b18a:b0:13e:e5f3:1b25 with SMTP id s10-20020a170902b18a00b0013ee5f31b25mr33374264plr.78.1634094101392; Tue, 12 Oct 2021 20:01:41 -0700 (PDT) Received: from t490s ([209.132.188.80]) by smtp.gmail.com with ESMTPSA id p189sm12224275pfp.167.2021.10.12.20.01.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 12 Oct 2021 20:01:40 -0700 (PDT) Date: Wed, 13 Oct 2021 11:01:33 +0800 From: Peter Xu To: Yang Shi Cc: HORIGUCHI =?utf-8?B?TkFPWUEo5aCA5Y+jIOebtOS5nyk=?= , Hugh Dickins , "Kirill A. Shutemov" , Matthew Wilcox , Oscar Salvador , Andrew Morton , Linux MM , Linux FS-devel Mailing List , Linux Kernel Mailing List Subject: Re: [v3 PATCH 2/5] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault Message-ID: References: <20210930215311.240774-1-shy828301@gmail.com> <20210930215311.240774-3-shy828301@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Oct 12, 2021 at 07:48:39PM -0700, Yang Shi wrote: > On Tue, Oct 12, 2021 at 3:10 PM Peter Xu wrote: > > > > On Tue, Oct 12, 2021 at 11:02:09AM -0700, Yang Shi wrote: > > > On Mon, Oct 11, 2021 at 6:44 PM Peter Xu wrote: > > > > > > > > On Mon, Oct 11, 2021 at 08:55:26PM -0400, Peter Xu wrote: > > > > > Another thing is I noticed soft_offline_in_use_page() will still ignore file > > > > > backed split. I'm not sure whether it means we'd better also handle that case > > > > > as well, so shmem thp can be split there too? > > > > > > > > Please ignore this paragraph - I somehow read "!PageHuge(page)" as > > > > "PageAnon(page)"... So I think patch 5 handles soft offline too. > > > > > > Yes, exactly. And even though the split is failed (or file THP didn't > > > get split before patch 5/5), soft offline would just return -EBUSY > > > instead of calling __soft_offline_page->page_handle_poison(). So > > > page_handle_poison() should not see THP at all. > > > > I see, so I'm trying to summarize myself on what I see now with the new logic.. > > > > I think the offline code handles hwpoison differently as it sets PageHWPoison > > at the end of the process, IOW if anything failed during the offline process > > the hwpoison bit is not set. > > > > That's different from how the memory failure path is handling this, as in that > > case the hwpoison bit on the subpage is set firstly, e.g. before split thp. I > > believe that's also why memory failure requires the extra sub-page-hwpoison bit > > while offline code shouldn't need to: because for soft offline split happens > > before setting hwpoison so we just won't ever see a "poisoned file thp", while > > for memory failure it could happen, and the sub-page-hwpoison will be a temp > > bit anyway only exist for a very short period right after we set hwpoison on > > the small page but before we split the thp. > > > > Am I right above? > > Yeah, you are right. I noticed this too, only successfully migrated > page is marked as hwpoison. But TBH I'm not sure why it does this way. My wild guess is that unlike memory failures, soft offline is best-effort. Say, the data on the page is still consistent, so even if offline failed for some reason we shouldn't stop the program from execution. That's not true for memory failures via MCEs, afaict, as the execution could read/write wrong data and that'll be a serious mistake, so we set hwpoison 1st there first before doing anything else, making sure "this page is broken" message delivered and user app won't run with risk. But yeah it'll be great if Naoya could help confirm that. > Naoya may know. Anyway, THP doesn't get migrated if it can't be split, > so PageHasHWPoisoned doesn't apply, right? Right, that matches my current understanding of the code, so the extra bit is perhaps not useful for soft offline case. But this also reminded me that shouldn't we be with the page lock already during the process of "setting hwpoison-subpage bit, split thp, clear hwpoison-subpage bit"? If it's only the small window that needs protection, while when looking up the shmem pagecache we always need to take the page lock too, then it seems already safe even without the extra bit? Hmm? > > > > > I feel like __soft_offline_page() still has some code that assumes "thp can be > > there", e.g. iiuc after your change to allow file thp split, "hpage" will > > always be the same as "page" then in that function, and isolate_page() does not > > need to pass in a pagelist pointer too as it'll always be handling a small page > > anyway. But maybe they're fine to be there for now as they'll just work as > > before, I think, so just raise it up. > > That compound_head() call seems to be for hugetlb since isolating > hugetlb needs to pass in the head page IIUC. For the pagelist, I think > it is just because migrate_pages() requires a list as the second > parameter. Fair enough. Thanks, -- Peter Xu