From: Naoya Horiguchi
To: Yang Shi
Cc: naoya.horiguchi@nec.com, hughd@google.com, kirill.shutemov@linux.intel.com,
	willy@infradead.org, peterx@redhat.com, osalvador@suse.de,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [v4 PATCH 5/6] mm: shmem: don't truncate page if memory failure happens
Date: Tue, 19 Oct 2021 14:52:21 +0900
Message-ID: <20211019055221.GC2268449@u2004>
References: <20211014191615.6674-1-shy828301@gmail.com>
 <20211014191615.6674-6-shy828301@gmail.com>
In-Reply-To: <20211014191615.6674-6-shy828301@gmail.com>

On Thu, Oct 14, 2021 at 12:16:14PM -0700, Yang Shi wrote:
> The current behavior of memory failure is to truncate the page cache
> regardless of dirty or clean. If the page is dirty the later access
> will get the obsolete data from disk without any notification to the
> users. This may cause silent data loss. It is even worse for shmem
> since shmem is in-memory filesystem, truncating page cache means
> discarding data blocks. The later read would return all zero.
>
> The right approach is to keep the corrupted page in page cache, any
> later access would return error for syscalls or SIGBUS for page fault,
> until the file is truncated, hole punched or removed. The regular
> storage backed filesystems would be more complicated so this patch
> is focused on shmem. This also unblock the support for soft
> offlining shmem THP.
>
> Signed-off-by: Yang Shi
> ---
>  mm/memory-failure.c | 10 +++++++++-
>  mm/shmem.c          | 37 ++++++++++++++++++++++++++++++++++---
>  mm/userfaultfd.c    |  5 +++++
>  3 files changed, 48 insertions(+), 4 deletions(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index cdf8ccd0865f..f5eab593b2a7 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -57,6 +57,7 @@
>  #include
>  #include
>  #include
> +#include <linux/shmem_fs.h>
>  #include "internal.h"
>  #include "ras/ras_event.h"
>
> @@ -866,6 +867,7 @@ static int me_pagecache_clean(struct page_state *ps, struct page *p)
>  {
>  	int ret;
>  	struct address_space *mapping;
> +	bool extra_pins;
>
>  	delete_from_lru_cache(p);
>
> @@ -894,6 +896,12 @@ static int me_pagecache_clean(struct page_state *ps, struct page *p)
>  		goto out;
>  	}
>
> +	/*
> +	 * The shmem page is kept in page cache instead of truncating
> +	 * so is expected to have an extra refcount after error-handling.
> +	 */
> +	extra_pins = shmem_mapping(mapping);
> +
>  	/*
>  	 * Truncation is a bit tricky. Enable it per file system for now.
>  	 *
> @@ -903,7 +911,7 @@ static int me_pagecache_clean(struct page_state *ps, struct page *p)
>  out:
>  	unlock_page(p);
>
> -	if (has_extra_refcount(ps, p, false))
> +	if (has_extra_refcount(ps, p, extra_pins))
>  		ret = MF_FAILED;
>
>  	return ret;
> diff --git a/mm/shmem.c b/mm/shmem.c
> index b5860f4a2738..69eaf65409e6 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2456,6 +2456,7 @@ shmem_write_begin(struct file *file, struct address_space *mapping,
>  	struct inode *inode = mapping->host;
>  	struct shmem_inode_info *info = SHMEM_I(inode);
>  	pgoff_t index = pos >> PAGE_SHIFT;
> +	int ret = 0;
>
>  	/* i_rwsem is held by caller */
>  	if (unlikely(info->seals & (F_SEAL_GROW |
> @@ -2466,7 +2467,15 @@ shmem_write_begin(struct file *file, struct address_space *mapping,
>  		return -EPERM;
>  	}
>
> -	return shmem_getpage(inode, index, pagep, SGP_WRITE);
> +	ret = shmem_getpage(inode, index, pagep, SGP_WRITE);
> +
> +	if (*pagep && PageHWPoison(*pagep)) {

shmem_getpage() could return with pagep == NULL, so you need to check
ret first to avoid a NULL pointer dereference.

> +		unlock_page(*pagep);
> +		put_page(*pagep);
> +		ret = -EIO;
> +	}
> +
> +	return ret;
>  }
>
>  static int
> @@ -2555,6 +2564,11 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>  			unlock_page(page);
>  		}
>
> +		if (page && PageHWPoison(page)) {
> +			error = -EIO;

Is it cleaner to add the PageHWPoison() check in the existing "if (page)"
block just above?  Then you don't have to check "page != NULL" twice.

@@ -2562,7 +2562,11 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 			if (sgp == SGP_CACHE)
 				set_page_dirty(page);
 			unlock_page(page);
+			if (PageHWPoison(page)) {
+				error = -EIO;
+				break;
+			}
 		}

 		/*

Thanks,
Naoya Horiguchi
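
A minimal sketch of how the tail of shmem_write_begin() might look with the
suggested ordering (check ret before dereferencing *pagep); this only
rearranges the code quoted above and is not taken from the actual patch:

	ret = shmem_getpage(inode, index, pagep, SGP_WRITE);
	if (ret)
		return ret;	/* don't touch *pagep when shmem_getpage() failed */

	if (*pagep && PageHWPoison(*pagep)) {
		unlock_page(*pagep);
		put_page(*pagep);
		ret = -EIO;
	}

	return ret;

With ret checked first, the PageHWPoison() test only runs when
shmem_getpage() reported success, so an error return can no longer lead to a
dereference of an unset *pagep.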