Message-ID: <584eedd3-9369-9df1-39e2-62e331abdcc0@bytedance.com>
Date: Sun, 5 Jun 2022 12:24:24 +0800
Subject: Re: Re: [PATCH] mm/memory-failure: don't allow to unpoison hw corrupted page
To: Andrew Morton
Cc: naoya.horiguchi@nec.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Tony Luck, Wu Fengguang
References: <20220604103229.3378591-1-pizhenwei@bytedance.com> <20220604115616.b7d5912ac5a37db608f67b78@linux-foundation.org>
From: zhenwei pi
In-Reply-To: <20220604115616.b7d5912ac5a37db608f67b78@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 6/5/22 02:56, Andrew Morton wrote:
> On Sat, 4 Jun 2022 18:32:29 +0800 zhenwei pi wrote:
>
>> Currently unpoison_memory(unsigned long pfn) is designed for soft
>> poison(hwpoison-inject) only. Unpoisoning a hardware corrupted page
>> puts page back buddy only, this leads BUG during accessing on the
>> corrupted KPTE.
>>
>> Do not allow to unpoison hardware corrupted page in unpoison_memory()
>> to avoid BUG like this:
>>
>>  Unpoison: Software-unpoisoned page 0x61234
>>  BUG: unable to handle page fault for address: ffff888061234000
>
> Thanks.
>
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -2090,6 +2090,7 @@ int unpoison_memory(unsigned long pfn)
>>  {
>>  	struct page *page;
>>  	struct page *p;
>> +	pte_t *kpte;
>>  	int ret = -EBUSY;
>>  	int freeit = 0;
>>  	static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL,
>> @@ -2101,6 +2102,13 @@ int unpoison_memory(unsigned long pfn)
>>  	p = pfn_to_page(pfn);
>>  	page = compound_head(p);
>>
>> +	kpte = virt_to_kpte((unsigned long)page_to_virt(p));
>> +	if (kpte && !pte_present(*kpte)) {
>> +		unpoison_pr_info("Unpoison: Page was hardware poisoned %#lx\n",
>> +				 pfn, &unpoison_rs);
>> +		return -EPERM;
>> +	}
>> +
>>  	mutex_lock(&mf_mutex);
>>
>>  	if (!PageHWPoison(p)) {
>
> I guess we don't want to let fault injection crash the kernel, so a
> cc:stable seems appropriate here.
>
> Can we think up a suitable Fixes: commit?  I'm suspecting this bug has
> been there for a long time?
>

Sure!

2009-Dec-16: hwpoison_unpoison() was introduced into Linux in commit
847ce401df392 ("HWPOISON: Add unpoisoning support"):
...
There is no hardware level unpoisioning, so this cannot be used for real
memory errors, only for software injected errors.
...
Both the commit log and the comment in the source code say that this
function should be used for software-level unpoisoning only.
Unfortunately, there is no such check in hwpoison_unpoison().

2020-May-20: commit 17fae1294ad9d ("x86/{mce,mm}: Unmap the entire page
if the whole page is affected and poisoned")
This clears the KPTE, which leads to the BUG described in this patch
when unpoisoning a hardware corrupted page.

Fixes: 847ce401df392 ("HWPOISON: Add unpoisoning support")
Fixes: 17fae1294ad9d ("x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned")
Cc: Wu Fengguang
Cc: Tony Luck

-- 
zhenwei pi
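
For anyone following the thread, the distinction the guard relies on can be modeled in userspace roughly as below. This is only a sketch under assumptions: struct model_page and the model_*() helpers are hypothetical names made up for illustration, not kernel APIs, and the model reduces "kernel mapping present" to a single flag instead of walking page tables with virt_to_kpte()/pte_present(). It also ignores locking, refcounts and the buddy allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a page; NOT the kernel's struct page. */
struct model_page {
	bool hwpoison;      /* models PageHWPoison(p) */
	bool kpte_present;  /* models pte_present(*kpte) for the kernel mapping */
};

/* Software injection (hwpoison-inject): only the flag is set. */
void model_soft_poison(struct model_page *p)
{
	assert(p != NULL);
	p->hwpoison = true;
}

/*
 * Real memory error on x86 after 17fae1294ad9d: the flag is set and
 * the kernel mapping of the page is cleared as well.
 */
void model_hw_poison(struct model_page *p)
{
	assert(p != NULL);
	p->hwpoison = true;
	p->kpte_present = false;
}

/*
 * Guarded unpoison: refuse when the kernel mapping is gone, because
 * handing such a page back would fault on the next access.
 * Returns 0 on success, -EPERM when the page was hardware poisoned.
 */
int model_unpoison(struct model_page *p)
{
	assert(p != NULL);
	if (!p->kpte_present)
		return -EPERM;
	p->hwpoison = false;
	return 0;
}
```

With this model, a page that went through model_soft_poison() can be unpoisoned and its flag is cleared, while a page that went through model_hw_poison() is rejected with -EPERM and stays poisoned, which is the behaviour the patch aims for in unpoison_memory().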