Date: Wed, 31 Mar 2021 10:43:36 +0800
From: Aili Yao
To: "HORIGUCHI NAOYA(堀口 直也)"
CC: David Hildenbrand, Matthew Wilcox, "akpm@linux-foundation.org", "linux-mm@kvack.org", "linux-kernel@vger.kernel.org", "yangfeng1@kingsoft.com", "sunhao2@kingsoft.com", Oscar Salvador, Mike Kravetz
Subject: Re: [PATCH v5] mm/gup: check page hwposion status for coredump.
Message-ID: <20210331104303.145aea53@alex-virtual-machine>
In-Reply-To: <20210331015258.GB22060@hori.linux.bs1.fc.nec.co.jp>
Organization: kingsoft

On Wed, 31 Mar 2021 01:52:59 +0000
HORIGUCHI NAOYA(堀口 直也) wrote:

> On Fri, Mar 26, 2021 at 03:22:49PM +0100, David Hildenbrand wrote:
> > On 26.03.21 15:09, David Hildenbrand wrote:
> > > On 22.03.21 12:33, Aili Yao wrote:
> > > > When we do coredump for user process signal, this may be one SIGBUS signal
> > > > with BUS_MCEERR_AR or BUS_MCEERR_AO code, which means this signal is
> > > > resulted from ECC memory fail like SRAR or SRAO, we expect the memory
> > > > recovery work is finished correctly, then the get_dump_page() will not
> > > > return the error page as its process pte is set invalid by
> > > > memory_failure().
> > > >
> > > > But memory_failure() may fail, and the process's related pte may not be
> > > > correctly set invalid, for current code, we will return the poison page,
> > > > get it dumped, and then lead to system panic as its in kernel code.
> > > >
> > > > So check the hwpoison status in get_dump_page(), and if TRUE, return NULL.
> > > >
> > > > There maybe other scenario that is also better to check hwposion status
> > > > and not to panic, so make a wrapper for this check, Thanks to David's
> > > > suggestion().
> > > >
> > > > Link: https://lkml.kernel.org/r/20210319104437.6f30e80d@alex-virtual-machine
> > > > Signed-off-by: Aili Yao
> > > > Cc: David Hildenbrand
> > > > Cc: Matthew Wilcox
> > > > Cc: Naoya Horiguchi
> > > > Cc: Oscar Salvador
> > > > Cc: Mike Kravetz
> > > > Cc: Aili Yao
> > > > Cc: stable@vger.kernel.org
> > > > Signed-off-by: Andrew Morton
> > > > ---
> > > >  mm/gup.c      |  4 ++++
> > > >  mm/internal.h | 20 ++++++++++++++++++++
> > > >  2 files changed, 24 insertions(+)
> > > >
> > > > diff --git a/mm/gup.c b/mm/gup.c
> > > > index e4c224c..6f7e1aa 100644
> > > > --- a/mm/gup.c
> > > > +++ b/mm/gup.c
> > > > @@ -1536,6 +1536,10 @@ struct page *get_dump_page(unsigned long addr)
> > > >                                        FOLL_FORCE | FOLL_DUMP | FOLL_GET);
> > > >          if (locked)
> > > >                  mmap_read_unlock(mm);
> > >
> > > Thinking again, wouldn't we get -EFAULT from __get_user_pages_locked()
> > > when stumbling over a hwpoisoned page?
> > >
> > > See __get_user_pages_locked()->__get_user_pages()->faultin_page():
> > >
> > > handle_mm_fault()->vm_fault_to_errno(), which translates
> > > VM_FAULT_HWPOISON to -EFAULT, unless FOLL_HWPOISON is set (-> -EHWPOISON)
> > >
> > > ?
>
> We could get -EFAULT, but sometimes not (depends on how memory_failure() fails).
>
> If we failed to unmap, the page table is not converted to hwpoison entry,
> so __get_user_pages_locked() get the hwpoisoned page.
>
> If we successfully unmapped but failed in truncate_error_page() for example,
> the processes mapping the page would get -EFAULT as expected. But even in
> this case, other processes could reach the error page via page cache and
> __get_user_pages_locked() for them could return the hwpoisoned page.
>
> >
> > Or doesn't that happen as you describe "But memory_failure() may fail, and
> > the process's related pte may not be correctly set invalid" -- but why does
> > that happen?
>
> Simply because memory_failure() doesn't handle some page types like ksm page
> and zero page. Or maybe shmem thp also belongs to this class.
>
> >
> > On a similar thought, should get_user_pages() never return a page that has
> > HWPoison set? E.g., check also for existing PTEs if the page is hwpoisoned?
>
> Make sense to me. Maybe inserting hwpoison check into follow_page_pte() and
> follow_huge_pmd() would work well.

I think we should be careful about extending the hwpoison check to other
cases. The SIGBUS coredump is a case where the poisoned page is not
supposed to be touched at all, and if we return NULL for it, the coredump
can still finish successfully.

Other callers may meet the same requirements as coredump, but each of
them needs to be identified first; that is the purpose of the poison
check wrapper. If we add the check without doing that, we may break the
integrity of the related operation, which may be no better than a panic.

--
Thanks!
Aili Yao
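
To make the cases discussed above concrete, here is a rough userspace
model in C. It is only a sketch and not kernel code: every model_* name,
the MODEL_FOLL_HWPOISON value and both structs are hypothetical stand-ins
invented for illustration. It models two things from the thread: the
errno translation for a poisoned fault (-EFAULT, or -EHWPOISON when
FOLL_HWPOISON is set), and the case where the unmap failed, so the lookup
still hands back the poisoned page unless the dump path checks the flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* Linux value, defined here for hosts that lack it */
#endif

/* Hypothetical stand-in; the real kernel flag value differs. */
#define MODEL_FOLL_HWPOISON 0x1u

struct model_page {
	bool hwpoison;			/* stands in for PageHWPoison(page) */
};

struct model_pte {
	bool unmapped_as_hwpoison;	/* true if memory_failure() replaced the PTE */
	struct model_page *page;	/* page the PTE pointed (or still points) to */
};

/* Errno translation for a poisoned fault, as described in the quoted text. */
int model_hwpoison_fault_errno(unsigned int foll_flags)
{
	return (foll_flags & MODEL_FOLL_HWPOISON) ? -EHWPOISON : -EFAULT;
}

/*
 * Page lookup without any poison check on mapped pages.
 * Returns 0 and stores the page in *out, or a negative errno.
 */
int model_gup(const struct model_pte *pte, unsigned int foll_flags,
	      struct model_page **out)
{
	assert(pte != NULL && out != NULL);
	*out = NULL;
	if (pte->unmapped_as_hwpoison)
		return model_hwpoison_fault_errno(foll_flags);
	/* Unmap failed or never happened: the page is returned as-is. */
	*out = pte->page;
	return 0;
}

/* Dump-path lookup with the extra check: never return a poisoned page. */
struct model_page *model_get_dump_page(const struct model_pte *pte)
{
	struct model_page *page;

	if (model_gup(pte, 0, &page) < 0)
		return NULL;
	if (page != NULL && page->hwpoison)
		return NULL;	/* caller skips this page instead of reading it */
	return page;
}
```

In this model, a PTE with unmapped_as_hwpoison set yields -EFAULT from
model_gup(), while a PTE that is still mapped to a poisoned page yields
the page itself; only model_get_dump_page() turns that second case into
NULL. Whether other callers can tolerate a NULL in the same way is the
open question raised above.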