Message-ID:
Subject: Re: [PATCH v2] mm: clean up hwpoison page cache page in fault path
From: Rik van Riel
To: Oscar Salvador
Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, linux-mm@kvack.org,
	Miaohe Lin, Andrew Morton, Mel Gorman, Johannes Weiner, Matthew Wilcox
Date: Tue, 15 Feb 2022 10:04:57 -0500
In-Reply-To:
References: <20220212213740.423efcea@imladris.surriel.com>
Content-Type: multipart/signed; micalg="pgp-sha256"; protocol="application/pgp-signature";
	boundary="=-cEEbhHLZGjBrDgd1rkUJ"
User-Agent: Evolution 3.42.3 (3.42.3-1.fc35)
MIME-Version: 1.0
Sender: riel@shelob.surriel.com
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

--=-cEEbhHLZGjBrDgd1rkUJ
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Tue, 2022-02-15 at 13:51 +0100, Oscar Salvador wrote:
> On Sat, Feb 12, 2022 at 09:37:40PM -0500, Rik van Riel wrote:
> > Sometimes the page offlining code can leave behind a hwpoisoned clean
> > page cache page. This can lead to programs being killed over and over
> > and over again as they fault in the hwpoisoned page, get killed, and
> > then get re-spawned by whatever wanted to run them.
>
> Hi Rik,
>
> Do you know how that exactly happens? We should not be really leaving
> anything behind, and soft-offline (not hard) code works with the premise
> of only poisoning a page in case it was contained, so I am wondering
> what is going on here.
>
> In-use pagecache pages are migrated away, and the actual page is
> contained, and for clean ones, we already do the invalidate_inode_page()
> and then contain it in case we succeed.

I do not know the exact failure case, since I have never
caught a system in the act of leaking one of these pages.

I just know I have seen this issue on systems where the
"soft_offline: %#lx: invalidated\n" printk was the only
offline method leaving any message in the kernel log.

However, there are a few paths through the soft offlining
code that don't seem to have any printks, so I am not sure
exactly where things went wrong.
I only really found the aftermath, and tested this patch by
loading it as a kernel live patch module on some of those
systems.

--=20
All Rights Reversed.

--=-cEEbhHLZGjBrDgd1rkUJ
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: This is a digitally signed message part
Content-Transfer-Encoding: 7bit

-----BEGIN PGP SIGNATURE-----

iQEzBAABCAAdFiEEKR73pCCtJ5Xj3yADznnekoTE3oMFAmILwRkACgkQznnekoTE
3oOatgf/VctfpxQ82Wr2xD3ogIG/T6vKdLWw/cOzRgJoZDyal2JxdXppe3Cu1IPt
C8UGfdwh/LKsmFf2fUdux3aBc9abX4KAzntPkhnfN2ST3Bd4Eph8ejFoLQPsmFV8
UMP966KO25wDVf8eovgXHQLB0gcIMVxivr72wOVXzZz2Iz0DzUovcYwjgPmt1NMG
nGJ4Xre00BEPi0Pb1ktzGoAWOfC8iv27C+mMPR9cQY1RFDvkbAYhS33ch7ntKKHq
9mbNXxIPlIFVR3Zh61qssRrZzGrX3L/PotkiTtZW9qPs+roaWHZQwSCyAhp6tNT5
wCXIC1iwsAqEMS7Lxk74heUHTIscWw==
=1vXq
-----END PGP SIGNATURE-----

--=-cEEbhHLZGjBrDgd1rkUJ--