Message-ID: <13658301-6af4-9dcf-0158-d24745d49f4f@linux.alibaba.com>
Date: Sun, 23 Oct 2022 23:04:22 +0800
From: Shuai Xue
Subject: Re: [PATCH v2] mm, hwpoison: Try to recover from copy-on write faults
To: "Luck, Tony", David Laight
Cc: Naoya Horiguchi, Andrew Morton, Miaohe Lin, Matthew Wilcox,
 "Williams, Dan J", Michael Ellerman, Nicholas Piggin, Christophe Leroy,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org
References: <20221019170835.155381-1-tony.luck@intel.com>
 <893b681b-726e-94e3-441e-4d68c767778a@linux.alibaba.com>
 <359bae4e-6ce3-cc7e-33d0-252064157bc6@linux.alibaba.com>
 <1643d19d795b4a8084228eab66a7db9f@AcuMS.aculab.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2022/10/22 12:30 AM, Luck, Tony wrote:
>>> But maybe it is some RMW instruction ... then, if all the above options didn't happen ... we
>>> could get another machine check from the same address. But then we just follow the usual
>>> recovery path.
>
>
>> Let's assume the instruction that causes the COW is in the 63/64 case, i.e.,
>> it is writing a different cache line from the poisoned one. But the new_page
>> allocated in COW is dropped, right? So it might page fault again?
>
> It can, but this should be no surprise to a user that has a signal handler for
> a h/w event (SIGBUS, SIGSEGV, SIGILL) that does nothing to address the
> problem, but simply returns to re-execute the same instruction that caused
> the original trap.
>
> There may be badly written signal handlers that do this. But they just cause
> pain for themselves. Linux can keep taking the traps and fixing things up and
> sending a new signal over and over.
>
> In this case that loop may involve taking the machine check again, so some
> extra pain for the kernel, but recoverable machine checks on Intel/x86 switched
> from broadcast to delivery to just the logical CPU that tried to consume the poison
> a few generations back. So only a bit more painful than a repeated page fault.
>
> -Tony
>

I see, thanks for your patient explanation :)

Best Regards,
Shuai
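
For reference, a minimal userspace sketch of the alternative to the "badly written" handler described above: instead of returning (which would re-execute the faulting instruction and take the trap again), the handler jumps to a recovery point. All names here (`run_guarded`, `on_sigbus`, `demo_fault`) are hypothetical and not part of the patch, and `demo_fault` uses raise(SIGBUS) only as a stand-in for actually consuming poison. On Linux a real consumed-poison SIGBUS would carry si_code BUS_MCEERR_AR; the sketch just records si_code and does not check it.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Hypothetical names throughout; this is an illustration, not the patch. */
static sigjmp_buf recover_point;
static volatile sig_atomic_t last_si_code;

/*
 * SIGBUS handler: remember why we were signalled, then jump away instead
 * of returning, so the faulting instruction is not re-executed.
 */
static void on_sigbus(int sig, siginfo_t *info, void *uctx)
{
    (void)sig;
    (void)uctx;
    last_si_code = info->si_code;
    siglongjmp(recover_point, 1);
}

/*
 * Run fn() with the handler installed.
 * Returns 0 if fn() completed, 1 if a SIGBUS interrupted it and control
 * unwound to the recovery point, -1 if the handler could not be installed.
 */
int run_guarded(void (*fn)(void))
{
    struct sigaction sa, old;
    int faulted;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0)
        return -1;

    /* savemask=1 so SIGBUS is unblocked again after siglongjmp(). */
    if (sigsetjmp(recover_point, 1) == 0) {
        fn();
        faulted = 0;
    } else {
        faulted = 1;
    }

    sigaction(SIGBUS, &old, NULL);   /* restore the previous disposition */
    return faulted;
}

/* Stand-ins for real work: one that completes, one that "faults". */
void demo_ok(void) { }
void demo_fault(void) { raise(SIGBUS); }
```

A caller would wrap the access that may touch poisoned memory in run_guarded() and, on a return value of 1, discard or rebuild the affected data rather than retrying the same access.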