From: Alexandru Gagniuc
To: linux-acpi@vger.kernel.org
Cc: alex_gagniuc@dellteam.com, austin_bolen@dell.com, shyam_iyer@dell.com,
    Alexandru Gagniuc, Tony Luck, Borislav Petkov, Thomas Gleixner,
    Ingo Molnar, "H. Peter Anvin", x86@kernel.org, "Rafael J. Wysocki",
    Len Brown, Mauro Carvalho Chehab, Robert Moore, Erik Schmauss,
    Tyler Baicar, Will Deacon, James Morse, "Jonathan (Zhixiong) Zhang",
    Dongjiu Geng, linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
    devel@acpica.org
Subject: [PATCH v7 0/3] acpi: apei: Drop panic() on fatal errors policy
Date: Fri, 25 May 2018 10:53:45 -0500
Message-Id: <20180525155352.22350-1-mr.nuke.me@gmail.com>
X-Mailer: git-send-email 2.14.3

FFS (firmware-first) handling through APEI seems to have developed a
policy to panic() on any fatal errors. This policy is completely
independent of the non-FFS case. It is also inconsistent with how the
native error handlers behave, a number of which will recover the system
from fatal errors.

The purpose of this series is to obsolete this idiotic policy, with the
motivation to enable identical handling of PCIe errors to native
reporting.
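To make the policy change concrete, here is a minimal userspace sketch of
the decision being changed. It is not the code from the patches: every
name below (the sketch_* enums and both functions) is hypothetical and
invented for illustration. It also assumes that only PCIe errors change
behaviour, so other fatal errors still panic.

```c
#include <stdbool.h>

/* Hypothetical severities, loosely modeled on the idea behind GHES_SEV_*. */
enum sketch_severity {
	SKETCH_SEV_NO,
	SKETCH_SEV_CORRECTED,
	SKETCH_SEV_RECOVERABLE,
	SKETCH_SEV_FATAL,
};

/* Hypothetical error sources; the names are assumptions for this sketch. */
enum sketch_source {
	SKETCH_SRC_MEMORY,
	SKETCH_SRC_PCIE,
	SKETCH_SRC_OTHER,
};

/* Policy described above: any fatal error reported firmware-first panics. */
static bool old_policy_should_panic(enum sketch_severity sev,
				    enum sketch_source src)
{
	(void)src;	/* the source of the error is not considered */
	return sev == SKETCH_SEV_FATAL;
}

/*
 * Proposed policy, as an assumption drawn from the series description:
 * fatal PCIe errors are handed to the error handler instead of panicking,
 * while every other fatal error keeps the old behaviour.
 */
static bool new_policy_should_panic(enum sketch_severity sev,
				    enum sketch_source src)
{
	if (sev != SKETCH_SEV_FATAL)
		return false;
	return src != SKETCH_SRC_PCIE;
}
```

The only input for which the two functions disagree is a fatal PCIe
error, which is the narrow scope this series aims for.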
Rafael, this is copy-pasted from the previous patch series. I suspect you
might have missed it last time, because you asked questions that were
answered here. I've included it so you don't have to go digging through
old emails:

"
The purpose of these changes is to see if we can safely de-escalate the
situation and notify the appropriate error handler. Since FFS reports
errors through NMIs or other non-standard mechanisms, we have to be just
a little more careful with reporting the error.

We're concerned with things such as being able to cross the NMI/IRQ
boundary, or being able to safely schedule work and notify the
appropriate subsystem. Once the notification is sent, our job is done.
I'm explicitly _NOT_ concerned with whether the error is handled or not,
especially since such concern reduces to a call to __ghes_panic().

There are rare cases that prevent us from de-escalating to lesser
contexts, such as uncorrectable memory errors in the kernel. In these
sorts of cases, trying to leave the NMI might cause a triple fault. James
Morse explained this very well when discussing v1 of this series. In and
only in such cases, we are justified to panic().

Once the error is safely sent on its merry way, it's really up to the
error handler to panic() or continue. For example, aer_recover_queue()
might for ungodly reasons fail. However, it's up to the AER code to
decide whether failing to queue an error for handling is panic-worthy.
"

Changes since v6:
 - Fixed silly compilation warning
 - Dropped concept of

Changes since v5:
 - Removed zoological references from commit message

Changes since v4:
 - Fix Freudian slip and use GHES_ instead of CPER_ enum
 - Rephrased comments to clarify what we don't care about

Changes since v3:
 - Renamed ghes_severity to something more concrete
 - Reorganized code to make it look like more than just a rename
 - Remembered to remove last patch in the series

Changes since v2:
 - Due to popular request, simple is chosen over flexible
 - Removed splitting of handlers into irq safe portion.
 - Change behavior only for PCIe errors

Changes since v1:
 - Due to popular request, the panic() is left in the NMI handler
 - GHES AER handler is split into NMI and non-NMI portions
 - ghes_notify_nmi() does not panic on deferrable errors
 - The handlers are put in a mapping and given a common call signature

Alexandru Gagniuc (3):
  acpi: apei: Rename GHES_SEV_PANIC to GHES_SEV_FATAL
  acpi: apei: Rename ghes_severity() to ghes_cper_severity()
  acpi: apei: Do not panic() on PCIe errors reported through GHES

 arch/x86/kernel/cpu/mcheck/mce-apei.c |  2 +-
 drivers/acpi/apei/ghes.c              | 65 +++++++++++++++++++++++++++--------
 drivers/edac/ghes_edac.c              |  2 +-
 include/acpi/ghes.h                   |  2 +-
 4 files changed, 54 insertions(+), 17 deletions(-)

--
2.14.3