Subject: Re: [PATCH V17 01/11] acpi: apei: read ack upon ghes record consumption
To: Robert Richter
Cc: christoffer.dall@linaro.org, marc.zyngier@arm.com, pbonzini@redhat.com,
 rkrcmar@redhat.com, linux@armlinux.org.uk, catalin.marinas@arm.com,
 will.deacon@arm.com, rjw@rjwysocki.net, lenb@kernel.org,
 matt@codeblueprint.co.uk, robert.moore@intel.com, lv.zheng@intel.com,
 nkaje@codeaurora.org, zjzhang@codeaurora.org, mark.rutland@arm.com,
 james.morse@arm.com, akpm@linux-foundation.org, eun.taik.lee@samsung.com,
 sandeepa.s.prabhu@gmail.com, labbott@redhat.com, shijie.huang@arm.com,
 rruigrok@codeaurora.org, paul.gortmaker@windriver.com, tn@semihalf.com,
 fu.wei@linaro.org, rostedt@goodmis.org, bristot@redhat.com,
 linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu,
 kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-acpi@vger.kernel.org, linux-efi@vger.kernel.org,
 Suzuki.Poulose@arm.com, punit.agrawal@arm.com, astone@redhat.com,
 harba@codeaurora.org, hanjun.guo@linaro.org, john.garry@huawei.com,
 shiju.jose@huawei.com, joe@perches.com, bp@alien8.de, rafael@kernel.org,
 tony.luck@intel.com, gengdongjiu@huawei.com, xiexiuqi@huawei.com
References: <1495225933-4410-1-git-send-email-tbaicar@codeaurora.org>
 <1495225933-4410-2-git-send-email-tbaicar@codeaurora.org>
 <20170630101043.GZ658@rric.localdomain>
From: "Baicar, Tyler"
Date: Fri, 30 Jun 2017 10:47:17 -0600
In-Reply-To: <20170630101043.GZ658@rric.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On 6/30/2017 4:10 AM, Robert Richter wrote:
> Tyler,
>
> On 19.05.17 14:32:03, Tyler Baicar wrote:
>> A RAS (Reliability, Availability, Serviceability) controller
>> may be a separate processor running in parallel with OS
>> execution, and may generate error records for consumption by
>> the OS. If the RAS controller produces multiple error records,
>> then they may be overwritten before the OS has consumed them.
>>
>> The Generic Hardware Error Source (GHES) v2 structure
>> introduces the capability for the OS to acknowledge the
>> consumption of the error record generated by the RAS
>> controller. A RAS controller supporting GHESv2 shall wait for
>> the acknowledgment before writing a new error record, thus
>> eliminating the race condition.
>>
>> Add support for parsing of GHESv2 sub-tables as well.
>>
>> Signed-off-by: Tyler Baicar
>> CC: Jonathan (Zhixiong) Zhang
>> Reviewed-by: James Morse
>> ---
>>  drivers/acpi/apei/ghes.c | 59 +++++++++++++++++++++++++++++++++++++++++++++---
>>  drivers/acpi/apei/hest.c |  7 ++++--
>>  include/acpi/ghes.h      |  5 +++-
>>  3 files changed, 65 insertions(+), 6 deletions(-)
>>  static int ghes_proc(struct ghes *ghes)
>>  {
>>  	int rc;
>> @@ -661,6 +704,16 @@ static int ghes_proc(struct ghes *ghes)
>>  		ghes_estatus_cache_add(ghes->generic, ghes->estatus);
>>  	}
>>  	ghes_do_proc(ghes, ghes->estatus);
>> +
>> +	/*
>> +	 * GHESv2 type HEST entries introduce support for error acknowledgment,
>> +	 * so only acknowledge the error if this support is present.
>> +	 */
>> +	if (is_hest_type_generic_v2(ghes)) {
>> +		rc = ghes_ack_error(ghes->generic_v2);
>> +		if (rc)
>> +			return rc;
>> +	}
>>  out:
>>  	ghes_clear_estatus(ghes);
>>  	return rc;
> was there any specific reason why the ack is sent before clearing the
> block status? Spec says the ack should be sent at last.
>
> Also, the block is never cleared if ghes_ack_error() returns an error.
> IMO we should fall through and clear the block status (this will
> change anyway if the bloc status is cleared first).

Hello Robert,

Thank you for pointing this out. I will send a patch to move the ack
after the ghes_clear_estatus. This is probably the right thing to do
since right now if the FW populates an invalid estatus, we will fail to
read the estatus, jump to 'out:', and never send the ack.

Thanks,
Tyler

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code
Aurora Forum, a Linux Foundation Collaborative Project.
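
To illustrate the ordering discussed above, here is a rough userspace sketch
of the control flow with the ack moved after the status clear. It is not the
follow-up patch and not kernel code: struct model_ghes and every model_*
function are hypothetical stand-ins invented for this example, and keeping
the first error code when both the read and the ack fail is an assumption of
the sketch rather than something settled in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct ghes; NOT the kernel structure. */
struct model_ghes {
	bool is_v2;         /* source advertises GHESv2 acknowledgment */
	bool estatus_valid; /* false models firmware writing a bad record */
	int step;           /* counter used to record the order of events */
	int cleared_at;     /* step at which block status was cleared, 0 = never */
	int acked_at;       /* step at which the ack was sent, 0 = never */
};

/* Models reading the error status block; fails on an invalid record. */
static int model_read_estatus(const struct model_ghes *g)
{
	return g->estatus_valid ? 0 : -EIO;
}

/* Models clearing the block status and records when it happened. */
static void model_clear_estatus(struct model_ghes *g)
{
	g->cleared_at = ++g->step;
}

/* Models writing the read-ack register and records when it happened. */
static int model_ack_error(struct model_ghes *g)
{
	g->acked_at = ++g->step;
	return 0;
}

/*
 * Sketch of the reordered flow: every path reaches 'out', the block
 * status is cleared first, and the ack is sent last for v2 sources,
 * including the path where the read failed.
 */
static int model_ghes_proc(struct model_ghes *g)
{
	int rc;

	rc = model_read_estatus(g);
	if (rc)
		goto out;
	/* Record processing would happen here. */
out:
	model_clear_estatus(g);
	if (g->is_v2) {
		int ack_rc = model_ack_error(g);

		/* Assumption of this sketch: report the first error seen. */
		if (!rc)
			rc = ack_rc;
	}
	return rc;
}
```

With a v2 source and an invalid estatus, model_ghes_proc() still returns the
read error, but the status is cleared at step 1 and the ack goes out at
step 2, so the firmware side is not left waiting for an acknowledgment.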