Subject: Re: [PATCH] acpi: apei: clear error status before acknowledging the error
To: "Luck, Tony"
Cc: Borislav Petkov, rjw@rjwysocki.net, lenb@kernel.org, will.deacon@arm.com,
    james.morse@arm.com, shiju.jose@huawei.com, geliangtang@gmail.com,
    andriy.shevchenko@linux.intel.com, linux-acpi@vger.kernel.org,
    linux-kernel@vger.kernel.org
References: <1501280703-21471-1-git-send-email-tbaicar@codeaurora.org>
    <20170729065345.GA30608@nazgul.tnic>
    <20170731170017.2vwxhewivgpyvpea@intel.com>
From: "Baicar, Tyler"
Message-ID: <01b3550d-1fca-c051-3581-41dda3b62779@codeaurora.org>
Date: Mon, 31 Jul 2017 11:44:12 -0600
In-Reply-To: <20170731170017.2vwxhewivgpyvpea@intel.com>

On 7/31/2017 11:00 AM, Luck, Tony wrote:
> On Mon, Jul 31, 2017 at 10:15:27AM -0600, Baicar, Tyler wrote:
>> I think the better thing to do in this case is still send the ack. If
>> ghes_read_estatus() fails, then either we are unable to read the
>> estatus or the estatus is empty/invalid.
> Right now we silently handle that failure of ghes_read_estatus(). That
> might be hiding some Linux bugs if we are calling ghes_proc() in cases
> where we shouldn't.
>
> Perhaps we should have something like this, so if systems do start acting
> weirdly there will be a note that we took this path:
>
> 	rc = ghes_read_estatus(ghes, 0);
> 	if (rc) {
> 		pr_notice("surprise failure reading ghes estatus\n");
> 		goto out;
> 	}

Thank you for the feedback, Tony. I can add a print like this in the next
version. I'll check that rc is not -ENOENT, though, so we don't print it
when the estatus is simply empty, since the polled source will be hitting
this path frequently.

-Tyler

>
>> If we do not send the ack, then we will be in a scenario where FW will not
>> send any more errors.
> We might ACK something that the firmware didn't send, which may
> lead to other problems.
>
>> I think it would be better to still have the FW send the errors and kernel
>> complain about issues with
> But I agree with this. We should send the ACK. Luckily this doesn't have
> a long legacy problem because the whole ACK mechanism is a new thing. So
> we only have to worry about GHESv2 supporting BIOS.
>
> -Tony

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a
Linux Foundation Collaborative Project.
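
For illustration only, here is a rough userspace sketch of the check discussed in this thread: report a failed estatus read, but stay quiet when the result is -ENOENT (empty estatus). This is not the patch itself. mock_read_estatus() and read_and_report() are hypothetical stand-ins, fprintf() stands in for the kernel's pr_notice(), and the assumption that an empty estatus shows up as -ENOENT comes from the discussion above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for ghes_read_estatus(): it just hands back the
 * return code the caller wants to simulate. Assumed convention: 0 on
 * success, -ENOENT for an empty estatus, another negative errno (for
 * example -EIO) for a genuine read failure.
 */
static int mock_read_estatus(int simulated_rc)
{
	return simulated_rc;
}

/*
 * Read the (simulated) estatus and emit a notice only for unexpected
 * failures. Returns 1 if a notice was written to 'log', 0 otherwise.
 */
static int read_and_report(int simulated_rc, FILE *log)
{
	int rc;

	assert(log != NULL);

	rc = mock_read_estatus(simulated_rc);
	if (rc && rc != -ENOENT) {
		/* fprintf() stands in for pr_notice() in this sketch. */
		fprintf(log, "surprise failure reading ghes estatus\n");
		return 1;
	}

	/* Success or empty estatus: nothing worth logging. */
	return 0;
}
```

With this shape, a polled source that finds nothing to report (rc == -ENOENT) produces no log output, while a failure such as -EIO still leaves a note. Whether the ack is sent afterwards is a separate decision and is not modelled here.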