Date: Thu, 6 May 2021 15:20:48 +0800
From: Aili Yao
To: Michal Hocko
CC: Andrew Morton, David Hildenbrand, LKML
Subject: Re: [PATCH] Revert "mm/gup: check page posion status for coredump."
Message-ID: <20210506152048.2baefb05@alex-virtual-machine>
References: <20210505135407.31590-1-mhocko@kernel.org> <20210506134750.17d2f6eb@alex-virtual-machine>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 6 May 2021 09:02:50 +0200
Michal Hocko wrote:

> On Thu 06-05-21 13:47:50, Aili Yao wrote:
> > On Wed, 5 May 2021 15:54:07 +0200
> > Michal Hocko wrote:
> >
> > > From: Michal Hocko
> > >
> > > While reviewing http://lkml.kernel.org/r/20210429122519.15183-4-david@redhat.com
> > > I have crossed d3378e86d182 ("mm/gup: check page posion status for
> > > coredump.") and noticed that this patch is broken in two ways. First it
> > > doesn't really prevent hwpoison pages from being dumped because hwpoison
> > > pages can be marked asynchronously at any time after the check.
> >
> > I rethought this:
> > There are two cases for this coredump panic issue.
> > One is the scenario where the hwpoison flag is set correctly; the previous patch
> > makes it recoverable and avoids the panic.
> >
> > The other is where the hwpoison flag is not valid at the check, maybe due to a race condition.
> > I don't think this case is worth covering, or can realistically be covered, as the SRAR can
> > happen freshly during the dump process and thus can't be detected.
> >
> > And the previous patch doesn't make the other case worse or unacceptable; it just can't be
> > covered.
> >
> > So here is the patch:
> > In most cases in this topic, the patch will work. For the case where the hwpoison flag is not
> > valid, it will fall back to the original behavior before this patch and just panic.
>
> Please propose a new fix which a) doesn't leak a page reference b)
> evaluates how realistic is the scenario

Got it, thanks. I will dig into it and try to fix the leak.
There will also be more comments on the scenario in which the issue is triggered.

> c) explain why any other gup
> user doesn't really need to care - or in other words is the gup layer
> really suitable for this issue?

For the SIGBUS coredump case, we call into the gup module to dump pages. For the normal hwpoison
case, the gup module checks the pte entry for hwpoison, and this issue is another hwpoison case.
Maybe it's easy to fix this issue in the gup module.

Thanks!
Aili Yao
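
To make point a) concrete, here is a minimal user-space sketch of how an early return on a poison check can leak a reference. It is only a model, not the kernel code: struct toy_page and the toy_* helpers are hypothetical names made up for illustration, and the shape of the leak (take a reference, test the flag, return early) is an assumption rather than something taken from the reverted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a page; not the kernel's struct page. */
struct toy_page {
	int refcount;
	bool hwpoison;
};

static void toy_get_page(struct toy_page *p)
{
	p->refcount++;
}

static void toy_put_page(struct toy_page *p)
{
	assert(p->refcount > 0);	/* dropping a reference we never held is a bug */
	p->refcount--;
}

/*
 * Leaky variant (assumed shape of the problem): the reference is taken
 * first, and the poison check returns without dropping it.
 */
static struct toy_page *toy_lookup_leaky(struct toy_page *p)
{
	toy_get_page(p);
	if (p->hwpoison)
		return NULL;		/* reference is never released */
	return p;
}

/* Balanced variant: every early exit drops the reference it took. */
static struct toy_page *toy_lookup_balanced(struct toy_page *p)
{
	toy_get_page(p);
	if (p->hwpoison) {
		toy_put_page(p);
		return NULL;
	}
	return p;
}
```

For a poisoned page that starts with refcount 1, toy_lookup_leaky() returns NULL and leaves the count at 2, while toy_lookup_balanced() returns NULL and leaves it at 1. Neither variant helps with the race: the flag can still be set right after the check.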