From: Naoya Horiguchi
To: linux-mm@kvack.org, Tony Luck, Aili Yao
Cc: Andrew Morton, Oscar Salvador, David Hildenbrand, Borislav Petkov, Andy Lutomirski, Naoya Horiguchi, linux-kernel@vger.kernel.org
Subject: [PATCH v1 2/3] mm,hwpoison: return -EHWPOISON when page already
Date: Tue, 13 Apr 2021 07:43:19 +0900
Message-Id: <20210412224320.1747638-3-nao.horiguchi@gmail.com>
In-Reply-To: <20210412224320.1747638-1-nao.horiguchi@gmail.com>
References: <20210412224320.1747638-1-nao.horiguchi@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Aili Yao

When the page is already poisoned, another memory_failure() call on the
same page currently returns 0, meaning OK. For nested MCE handling, this
behavior can cause a single MCE to loop. For example:

1. LMCE is enabled, and two processes A and B run on different cores X
   and Y and access the same page. The page is found corrupted when
   process A accesses it, an MCE is raised on core X, and error handling
   is underway.

2. B then accesses the page and triggers another MCE on core Y. It also
   runs error handling, sees TestSetPageHWPoison() return true, and 0 is
   returned.

3. kill_me_maybe() checks the return value:

   static void kill_me_maybe(struct callback_head *cb)
   {
   	...
   	if (!memory_failure(p->mce_addr >> PAGE_SHIFT, flags) &&
   	    !(p->mce_kflags & MCE_IN_KERNEL_COPYIN)) {
   		set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
   		sync_core();
   		return;
   	}
   	...
   }

4. Error handling for B ends, and nothing further may happen if
   kill-early is not set. Process B re-executes the instruction, takes
   the MCE again, and the loop repeats.

The set_mce_nospec() call here is also not appropriate; see commit
fd0e786d9d09 ("x86/mm, mm/hwpoison: Don't unconditionally unmap kernel
1:1 pages").

Other callers that care about the return value of memory_failure()
should check why they want to process a memory error that has already
been processed. Returning -EHWPOISON in that case seems reasonable.
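
To make the loop in steps 1 to 4 concrete, here is a minimal user-space
sketch, not kernel code. Every model_* name and the retry cap are
hypothetical; only the 0 versus -EHWPOISON return values come from the
description above. It assumes process A has already poisoned the page
and models only what process B then sees.

```c
#include <assert.h>
#include <stdbool.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* value used by asm-generic/errno.h */
#endif

#define MODEL_NR_PAGES		16	/* hypothetical size of the toy page array */
#define MODEL_MAX_RETRIES	8	/* cap so the "endless" loop terminates here */

static bool model_poisoned[MODEL_NR_PAGES];

/*
 * Toy stand-in for memory_failure(). It models only the "already
 * poisoned" branch; already_poisoned_ret is 0 for the old behavior and
 * -EHWPOISON for the patched behavior.
 */
static int model_memory_failure(unsigned long pfn, int already_poisoned_ret)
{
	assert(pfn < MODEL_NR_PAGES);

	if (model_poisoned[pfn])
		return already_poisoned_ret;

	model_poisoned[pfn] = true;
	return 0;
}

/*
 * Count how many times process B faults on the page before the handler
 * stops treating the error as recovered (or the cap is reached).
 */
static int model_faults_until_stopped(unsigned long pfn, int already_poisoned_ret)
{
	int faults = 0;

	while (faults < MODEL_MAX_RETRIES) {
		faults++;	/* B touches the page and takes an MCE */

		if (model_memory_failure(pfn, already_poisoned_ret) != 0)
			break;	/* not reported as recovered: stop retrying */

		/* 0 was returned: B goes back and re-executes the access */
	}

	return faults;
}
```

With the page poisoned beforehand, passing 0 as already_poisoned_ret
runs into the retry cap, while passing -EHWPOISON stops after the first
fault. What the real caller does on a non-zero return is elided in the
excerpt above and is not modeled here.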

Signed-off-by: Aili Yao
Signed-off-by: Naoya Horiguchi
---
 mm/memory-failure.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git v5.12-rc5/mm/memory-failure.c v5.12-rc5_patched/mm/memory-failure.c
index c1509f4b565e..368ef77e01f9 100644
--- v5.12-rc5/mm/memory-failure.c
+++ v5.12-rc5_patched/mm/memory-failure.c
@@ -1228,7 +1228,7 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
 	if (TestSetPageHWPoison(head)) {
 		pr_err("Memory failure: %#lx: already hardware poisoned\n",
 		       pfn);
-		return 0;
+		return -EHWPOISON;
 	}
 
 	num_poisoned_pages_inc();
@@ -1438,7 +1438,7 @@ int memory_failure(unsigned long pfn, int flags)
 		pr_err("Memory failure: %#lx: already hardware poisoned\n",
 			pfn);
 		mutex_unlock(&mf_mutex);
-		return 0;
+		return -EHWPOISON;
 	}
 
 	orig_head = hpage = compound_head(p);
-- 
2.25.1
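
As a rough illustration of the branch both hunks touch, the sketch
below imitates the test-and-set pattern in user space with a C11
atomic_flag. It is an assumption-laden analogy, not the kernel
implementation: sketch_hwpoison_flag and sketch_memory_failure() are
hypothetical names, and one flag stands in for one page's HWPoison bit.

```c
#include <stdatomic.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* value used by asm-generic/errno.h */
#endif

/* Hypothetical stand-in for a single page's HWPoison bit. */
static atomic_flag sketch_hwpoison_flag = ATOMIC_FLAG_INIT;

/*
 * The first caller sets the flag and proceeds (modeled as returning 0).
 * Any later caller finds the flag already set and gets -EHWPOISON, as
 * in the patched "already hardware poisoned" branch.
 */
static int sketch_memory_failure(void)
{
	if (atomic_flag_test_and_set(&sketch_hwpoison_flag))
		return -EHWPOISON;

	return 0;
}
```

Calling sketch_memory_failure() twice returns 0 and then -EHWPOISON, so
a caller can tell "handled now" apart from "someone else already
handled this page".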