Date: Wed, 25 Apr 2018 19:15:57 +0200
From: Borislav Petkov
To: "Alex G."
Cc: linux-acpi@vger.kernel.org, linux-edac@vger.kernel.org,
    rjw@rjwysocki.net, lenb@kernel.org, tony.luck@intel.com,
    tbaicar@codeaurora.org, will.deacon@arm.com, james.morse@arm.com,
    shiju.jose@huawei.com, zjzhang@codeaurora.org,
    gengdongjiu@huawei.com, linux-kernel@vger.kernel.org,
    alex_gagniuc@dellteam.com, austin_bolen@dell.com,
    shyam_iyer@dell.com, devel@acpica.org, mchehab@kernel.org,
    robert.moore@intel.com, erik.schmauss@intel.com, Yazen Ghannam,
    Ard Biesheuvel
Subject: Re: [RFC PATCH v2 3/4] acpi: apei: Do not panic() when
    correctable errors are marked as fatal.
Message-ID: <20180425171557.GC2597@pd.tnic>
In-Reply-To: <48944beb-4e29-05cc-857b-7698e3dbe89b@gmail.com>

On Wed, Apr 25, 2018 at 10:00:53AM -0500, Alex G. wrote:
> Firmware-first.

Ok, my guess was right.

> We could probably use more of the native AER print functions, but that's
> beyond the scope of this patch.

No no, this does not belong in this patchset.

> Like the exact thing that this patch series implements? :)

Exact thing? I don't think so.

No, your patchset is grafting some funky and questionable side-handler
which gets to see the PCIe errors first, out-of-line and then it
practically downgrades their severity outside of the error processing
flow.

What I've been telling you to do is to extend ghes_severity() to give
the lower than PANIC severity for CPER_SEC_PCIE errors first so that the
machine doesn't panic from them anymore and those PCIe errors get
processed in the normal error processing path down through
ghes_do_proc() and then land in ghes_handle_aer(). No ad hoc
->handle_irqsafe thing - just the normal straightforward error
processing path.

There, in ghes_handle_aer(), you do the check whether the device is
still there - i.e., you try to apply some heuristics to detect the error
type and why the system is complaining - you maybe even check whether
the NVMe device is still there - and *then* you do the proper recovery
action.

And you document for the future people looking at this code *why* you're
doing this.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
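
The flow proposed in the reply above can be modelled in a few lines of
userspace C. This is only a rough sketch, not kernel code: every model_*
and MODEL_* name is hypothetical, the device_present flag stands in for
whatever heuristic would detect that the device is gone, and the two
recovery actions are assumptions made for illustration. It shows where
the severity would be lowered (the ghes_severity() step) and where the
presence check would live (the ghes_handle_aer() step), with no side
handler involved.

```c
#include <stdbool.h>

/* Hypothetical severity levels, ordered from least to most severe. */
enum model_sev {
    MODEL_SEV_NO,
    MODEL_SEV_CORRECTED,
    MODEL_SEV_RECOVERABLE,
    MODEL_SEV_PANIC
};

/* Hypothetical section types; only the PCIe one is treated specially. */
enum model_sec_type { MODEL_SEC_PCIE, MODEL_SEC_OTHER };

/* Hypothetical outcomes of processing one error section. */
enum model_action {
    MODEL_ACT_LOG_ONLY,
    MODEL_ACT_RECOVER_LINK,    /* assumed action: device still present */
    MODEL_ACT_HANDLE_REMOVAL,  /* assumed action: device is gone */
    MODEL_ACT_PANIC
};

struct model_section {
    enum model_sec_type type;
    enum model_sev fw_sev;  /* severity as reported by firmware */
    bool device_present;    /* stand-in for a real presence check */
};

/*
 * Models the suggested change to the ghes_severity() step: a PCIe
 * section never yields PANIC, so it stays in the normal path.
 */
static enum model_sev model_severity(const struct model_section *sec)
{
    if (sec->type == MODEL_SEC_PCIE && sec->fw_sev == MODEL_SEV_PANIC)
        return MODEL_SEV_RECOVERABLE;
    return sec->fw_sev;
}

/*
 * Models the ghes_handle_aer() step: check whether the device is
 * still there first, and only then choose the recovery action.
 */
static enum model_action model_handle_aer(const struct model_section *sec,
                                          enum model_sev sev)
{
    if (sev < MODEL_SEV_RECOVERABLE)
        return MODEL_ACT_LOG_ONLY;
    if (!sec->device_present)
        return MODEL_ACT_HANDLE_REMOVAL;
    return MODEL_ACT_RECOVER_LINK;
}

/* Models the ghes_do_proc() step: one straightforward processing path. */
static enum model_action model_do_proc(const struct model_section *sec)
{
    enum model_sev sev = model_severity(sec);

    if (sev == MODEL_SEV_PANIC)
        return MODEL_ACT_PANIC;
    if (sec->type == MODEL_SEC_PCIE)
        return model_handle_aer(sec, sev);
    return MODEL_ACT_LOG_ONLY;
}
```

In this model a PCIe section that firmware marks as fatal ends up in
model_handle_aer() instead of panicking, while a fatal section of any
other type still panics.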