From: Yang Shi
Date: Thu, 11 Nov 2021 10:42:12 -0800
Subject: Re: [PATCH] mm: shmem: do not call PageHWPoison on a ERR-page
To: Muchun Song
Cc: Ajay Garg, Hugh Dickins, Andrew Morton, Linux MM, Linux Kernel Mailing List
References: <20211111084617.6746-1-ajaygargnsit@gmail.com> <6864f744-ca3e-f328-8793-2adb0146db03@bytedance.com>
In-Reply-To: <6864f744-ca3e-f328-8793-2adb0146db03@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Nov 11, 2021 at 3:20 AM Muchun Song wrote:
>
> On 2021/11/11 16:46, Ajay Garg wrote:
> > commit b9d02f1bdd98
> > ("mm: shmem: don't truncate page if memory failure happens")
> >
> > introduced a PageHWPoison(page) call in "shmem_read_mapping_page_gfp"
> > in shmem.c.
> >
> > Now, if "shmem_getpage_gfp" returns an error, page is set to ERR-page.
> > Thereafter, calling PageHWPoison() on this ERR-page causes KASAN to oops
> > the kernel:
> >
> > #############################
> > BUG: unable to handle page fault for address: fffffffffffffff4
> > PF: supervisor read access in kernel mode
> > PF: error_code(0x0000) - not-present page
> > PGD 18e019067 P4D 18e019067 PUD 18e01b067 PMD 0
> > Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN PTI
> > CPU: 0 PID: 4836 Comm: MATLAB Not tainted 5.15.0+ #18
> > Hardware name: Dell Inc. Latitude E6320/0GJF11, BIOS A19 11/14/2013
> > RIP: 0010:shmem_read_mapping_page_gfp+0xd3/0x140
> > Code: 4c 89 ff e8 6f eb ff ff 5a 59 85 c0 74 64 48 63 d8 48 89 5d 98 be 08 00 00 00 48 89 df e8 e5 67 0c 00 48 89 df e8 6d 5c 0c 00 <48> 8b 13 48 c7 c0 fb ff ff ff f7 c2 00 00 80 00 74 30 48 ba 00 00
> > RSP: 0018:ffff88806b33f998 EFLAGS: 00010246
> > RAX: 0000000000000000 RBX: fffffffffffffff4 RCX: ffffffffb7a37ba3
> > RDX: 0000000000000003 RSI: dffffc0000000000 RDI: fffffffffffffff4
> > RBP: ffff88806b33fa20 R08: 1ffffffffffffffe R09: fffffffffffffffb
> > R10: fffffbffffffffff R11: 0000000000000001 R12: 1ffff1100d667f33
> > R13: 00000000001120d2 R14: 00000000000005db R15: ffff88814e64e2d8
> > FS:  00007f379a384640(0000) GS:ffff888161a00000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: fffffffffffffff4 CR3: 00000000269dc004 CR4: 00000000000606f0
> > Call Trace:
> >
> >  ? shmem_fault+0x480/0x480
> >  ? __cond_resched+0x1c/0x30
> >  ? __kasan_check_read+0x11/0x20
> >  shmem_get_pages+0x3a4/0xa70 [i915]
> >  ? shmem_writeback+0x3b0/0x3b0 [i915]
> >  ? i915_gem_object_wait_reservation+0x330/0x560 [i915]
> >  ...
> >  ...
> > ################################
> >
> > So, we proceed with the PageHWPoison() call only if the page is not an
> > ERR-page.
> >
> >
> > P.S. : Alternative (optimised) solution :
> > ===========================================
> >
> > We could save some CPU cycles if we directly replace
> >
> >         if (error)
> >                 page = ERR_PTR(error);
> >         else
> >                 unlock_page(page);
> >
> > with
> >
> >         if (error)
> >                 return ERR_PTR(error);
> >
> >
> > Fixes: b9d02f1bdd98 ("mm: shmem: don't truncate page if memory failure happens")
> > Signed-off-by: Ajay Garg
> > ---
> >  mm/shmem.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 23c91a8beb78..427863cbf0dc 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -4222,7 +4222,7 @@ struct page *shmem_read_mapping_page_gfp(struct address_space *mapping,
> >         else
> >                 unlock_page(page);
> >
> > -       if (PageHWPoison(page))
> > +       if (!IS_ERR(page) && PageHWPoison(page))
> >                 page = ERR_PTR(-EIO);
>
> How about the following changes, since the above if block
> already does the judgment?
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index f0eee4e221a7..0c84b6624026 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -4195,13 +4195,13 @@ struct page *shmem_read_mapping_page_gfp(struct address_space *mapping,
>         BUG_ON(!shmem_mapping(mapping));
>         error = shmem_getpage_gfp(inode, index, &page, SGP_CACHE,
>                                   gfp, NULL, NULL, NULL);
> -       if (error)
> +       if (error) {
>                 page = ERR_PTR(error);
> -       else
> +       } else {
>                 unlock_page(page);
> -
> -       if (PageHWPoison(page))
> -               page = ERR_PTR(-EIO);
> +               if (PageHWPoison(page))
> +                       page = ERR_PTR(-EIO);
> +       }

Thanks, guys. My bad, it is an obvious bug. IMHO, Muchun's version
seems better to me. Either way, whichever version Andrew picks up can
have

Reviewed-by: Yang Shi

>
>         return page;
>  #else
>
> >
> >         return page;
> >
>
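
For readers following the thread, here is a minimal userspace sketch of
the pattern under discussion. It is not the kernel code: ERR_PTR(),
PTR_ERR() and IS_ERR() are simplified stand-ins modelled on the kernel
helpers of the same name, while struct page, fake_getpage() and
read_page_sketch() are hypothetical names made up for illustration. The
control flow follows Muchun's proposal, where the poison flag is read
only on the success path. As a side note, the fault address in the
quoted oops, fffffffffffffff4, is -12 as a 64-bit value, i.e. -ENOMEM on
Linux, which is consistent with the flag check reading through the error
pointer itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical miniature page: only the one flag this sketch needs. */
struct page {
	bool hwpoison;
};

/* Hypothetical stand-in for shmem_getpage_gfp(): 0 on success, -errno on failure. */
static int fake_getpage(struct page *backing, struct page **pagep)
{
	if (!backing)
		return -ENOMEM;
	*pagep = backing;
	return 0;
}

/* Same shape as the proposed fix: never touch the page on the error path. */
static struct page *read_page_sketch(struct page *backing)
{
	struct page *page = NULL;
	int error = fake_getpage(backing, &page);

	if (error) {
		page = ERR_PTR(error);
	} else {
		assert(page != NULL);	/* success must have filled in the page */
		/* the kernel would call unlock_page(page) here */
		if (page->hwpoison)
			page = ERR_PTR(-EIO);
	}
	return page;
}
```

A caller would test the result with IS_ERR() before dereferencing it and
recover the errno with PTR_ERR(). Ajay's one-line variant
(!IS_ERR(page) && PageHWPoison(page)) guards the same dereference; the
nested form simply reuses the branch that has already tested error.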