Subject: Re: [RFC PATCH v2 3/4] acpi: apei: Do not panic() when correctable errors are marked as fatal.
To: Borislav Petkov
Cc: linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org, rjw@rjwysocki.net, lenb@kernel.org, tony.luck@intel.com, tbaicar@codeaurora.org, will.deacon@arm.com, james.morse@arm.com, shiju.jose@huawei.com, zjzhang@codeaurora.org, gengdongjiu@huawei.com, linux-kernel@vger.kernel.org, alex_gagniuc@dellteam.com, austin_bolen@dell.com, shyam_iyer@dell.com, devel@acpica.org, mchehab@kernel.org, robert.moore@intel.com, erik.schmauss@intel.com
References: <20180416215903.7318-1-mr.nuke.me@gmail.com> <20180416215903.7318-4-mr.nuke.me@gmail.com> <20180418175415.GJ4795@pd.tnic>
From: "Alex G."
Date: Thu, 19 Apr 2018 09:57:07 -0500
In-Reply-To: <20180418175415.GJ4795@pd.tnic>
X-Mailing-List: linux-kernel@vger.kernel.org

On 04/18/2018 12:54 PM, Borislav Petkov wrote:
> On Mon, Apr 16, 2018 at 04:59:02PM -0500, Alexandru Gagniuc wrote:
>> Firmware is evil:
>> - ACPI was created to "try and make the 'ACPI' extensions somehow
>> Windows specific" in order to "work well with NT and not the others
>> even if they are open"
>> - EFI was created to hide "secret" registers from the OS.
>> - UEFI was created to allow compromising an otherwise secure OS.
>>
>> Never has firmware been created to solve a problem or simplify an
>> otherwise cumbersome process. It is of no surprise then, that
>> firmware nowadays intentionally crashes an OS.
>
> I don't believe I'm saying this but, get rid of that rant. Even though I
> agree, it doesn't belong in a commit message.

Of course.

(snip)

> Well, Tyler touched that AER error severity handling recently and we had
> it all nicely documented in the comment above ghes_handle_aer().
>
> Your ghes_handle_aer_irqsafe() graft basically bypasses
> ghes_handle_aer() instead of incorporating in it.
>
> If all you wanna say is, the severity computation should go through all
> the sections and look at each error's severity before making a decision,
> then add that to ghes_severity() instead of doing that "deferrable"
> severity dance.

ghes_severity() is a one-to-one mapping from a set of unsorted
severities to monotonically increasing numbers. The "one-to-one mapping"
part is obvious from the function name. Changing it to parse the entire
GHES would completely destroy this, and I think it would apply policy in
the wrong place. Should I do that, I might have to call it something
like ghes_parse_and_apply_policy_to_severity(). But that misses the
whole point of these changes.

I would like to get to the handlers first, and then decide if things are
okay or not, but the ARM guys didn't exactly like this approach. It
seems there are quite a few per-error-type considerations. The logical
step is to associate these considerations with the specific error type
they apply to, rather than hide them as a decision under an innocent
ghes_severity().

> And add the changes to the policy to the comment above
> ghes_handle_aer(). I don't want any changes from people coming and going
> and leaving us scratching heads why we did it this way.
>
> And no need for those handlers and so on - make it simple first - then we
> can talk more complex handling.

I don't want to leave people scratching their heads, but I also don't
want to make AER a special case without having a generic way to handle
these cases. People are just as likely to scratch their heads wondering
why AER is a special case and everything else crashes.

Maybe it's better to move the AER handling to NMI/IRQ context, since
ghes_handle_aer() is only scheduling the real AER handler, and is IRQ
safe. I'm scratching my head about why we're messing with IRQ work from
NMI context, instead of just scheduling a regular handler to take care
of things.

Alex
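
For illustration only, the distinction drawn above between a one-to-one
severity mapping and a policy that walks every section can be sketched
in userspace C. This is not the kernel code: the enum names and values,
struct section, map_severity() and worst_section_severity() are all
assumptions made for the example, loosely modeled on the idea of
unsorted firmware severities being mapped onto an ordered scale.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the firmware-reported severities.
 * The raw values are deliberately not ordered by seriousness. */
enum fw_sev {
	SEV_RECOVERABLE   = 0,
	SEV_FATAL         = 1,
	SEV_CORRECTED     = 2,
	SEV_INFORMATIONAL = 3,
};

/* Hypothetical ordered scale: a larger number is a more serious error. */
enum os_lvl {
	LVL_NO          = 0,
	LVL_CORRECTED   = 1,
	LVL_RECOVERABLE = 2,
	LVL_PANIC       = 3,
};

/* One-to-one mapping: looks at a single value and applies no policy. */
static int map_severity(int severity)
{
	switch (severity) {
	case SEV_INFORMATIONAL: return LVL_NO;
	case SEV_CORRECTED:     return LVL_CORRECTED;
	case SEV_RECOVERABLE:   return LVL_RECOVERABLE;
	case SEV_FATAL:         return LVL_PANIC;
	default:                return LVL_PANIC; /* unknown: assume the worst */
	}
}

/* Hypothetical section of an error record, reduced to its severity. */
struct section {
	int severity;
};

/* Policy step kept separate from the mapping: walk every section and
 * return the most serious mapped level. An empty record yields LVL_NO. */
static int worst_section_severity(const struct section *sec, size_t n)
{
	int worst = LVL_NO;

	for (size_t i = 0; i < n; i++) {
		int lvl = map_severity(sec[i].severity);

		if (lvl > worst)
			worst = lvl;
	}
	return worst;
}
```

Keeping the walk over the sections in its own function leaves the
mapping a pure lookup, so any per-error-type policy has an obvious place
to live that is not the mapping itself.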