Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S938426AbcJXIv7 (ORCPT ); Mon, 24 Oct 2016 04:51:59 -0400 Received: from foss.arm.com ([217.140.101.70]:46052 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934212AbcJXIv4 (ORCPT ); Mon, 24 Oct 2016 04:51:56 -0400 Subject: Re: [PATCH V4 01/10] acpi: apei: read ack upon ghes record consumption To: Tyler Baicar , marc.zyngier@arm.com, pbonzini@redhat.com, rkrcmar@redhat.com, linux@armlinux.org.uk, catalin.marinas@arm.com, will.deacon@arm.com, rjw@rjwysocki.net, lenb@kernel.org, matt@codeblueprint.co.uk, robert.moore@intel.com, lv.zheng@intel.com, nkaje@codeaurora.org, zjzhang@codeaurora.org, mark.rutland@arm.com, james.morse@arm.com, akpm@linux-foundation.org, eun.taik.lee@samsung.com, sandeepa.s.prabhu@gmail.com, shijie.huang@arm.com, rruigrok@codeaurora.org, paul.gortmaker@windriver.com, tomasz.nowicki@linaro.org, fu.wei@linaro.org, rostedt@goodmis.org, bristot@redhat.com, linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org, linux-efi@vger.kernel.org, punit.agrawal@arm.com, astone@redhat.com, harba@codeaurora.org, hanjun.guo@linaro.org References: <1477071013-29563-1-git-send-email-tbaicar@codeaurora.org> <1477071013-29563-2-git-send-email-tbaicar@codeaurora.org> From: Suzuki K Poulose Message-ID: Date: Mon, 24 Oct 2016 09:51:47 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.3.0 MIME-Version: 1.0 In-Reply-To: <1477071013-29563-2-git-send-email-tbaicar@codeaurora.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3681 Lines: 106 On 21/10/16 18:30, Tyler Baicar wrote: > A RAS (Reliability, Availability, Serviceability) controller > may be a separate processor running in parallel with OS > execution, and may generate error records for consumption by > the OS. If the RAS controller produces multiple error records, > then they may be overwritten before the OS has consumed them. > > The Generic Hardware Error Source (GHES) v2 structure > introduces the capability for the OS to acknowledge the > consumption of the error record generated by the RAS > controller. A RAS controller supporting GHESv2 shall wait for > the acknowledgment before writing a new error record, thus > eliminating the race condition. > > Signed-off-by: Jonathan (Zhixiong) Zhang > Signed-off-by: Richard Ruigrok > Signed-off-by: Tyler Baicar > Signed-off-by: Naveen Kaje > --- > drivers/acpi/apei/ghes.c | 42 ++++++++++++++++++++++++++++++++++++++++++ > drivers/acpi/apei/hest.c | 7 +++++-- > include/acpi/ghes.h | 5 ++++- > 3 files changed, 51 insertions(+), 3 deletions(-) > > diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c > index 60746ef..7d020b0 100644 > --- a/drivers/acpi/apei/ghes.c > +++ b/drivers/acpi/apei/ghes.c > @@ -45,6 +45,7 @@ > #include > #include > > +#include > #include > #include > #include > @@ -79,6 +80,10 @@ > ((struct acpi_hest_generic_status *) \ > ((struct ghes_estatus_node *)(estatus_node) + 1)) > > +#define HEST_TYPE_GENERIC_V2(ghes) \ > + ((struct acpi_hest_header *)ghes->generic)->type == \ > + ACPI_HEST_TYPE_GENERIC_ERROR_V2 > + > /* > * This driver isn't really modular, however for the time being, > * continuing to use module_param is the easiest way to remain > @@ -248,7 +253,15 @@ static struct ghes *ghes_new(struct acpi_hest_generic *generic) > ghes = kzalloc(sizeof(*ghes), GFP_KERNEL); > if (!ghes) > return ERR_PTR(-ENOMEM); > + > ghes->generic = generic; > + if (HEST_TYPE_GENERIC_V2(ghes)) { > + rc = apei_map_generic_address( > + &ghes->generic_v2->read_ack_register); > + if (rc) > + goto err_unmap; I think should be goto err_free, see more below. > + } > + > rc = apei_map_generic_address(&generic->error_status_address); > if (rc) > goto err_free; > @@ -270,6 +283,9 @@ static struct ghes *ghes_new(struct acpi_hest_generic *generic) > > err_unmap: > apei_unmap_generic_address(&generic->error_status_address); > + if (HEST_TYPE_GENERIC_V2(ghes)) > + apei_unmap_generic_address( > + &ghes->generic_v2->read_ack_register); We might end up trying to unmap (error_status_address) which is not mapped if we hit the error in mapping read_ack_register. The read_ack_register unmap hunk should be moved below to err_free. > err_free: > kfree(ghes); > return ERR_PTR(rc); > @@ -279,6 +295,9 @@ static void ghes_fini(struct ghes *ghes) > { > kfree(ghes->estatus); > apei_unmap_generic_address(&ghes->generic->error_status_address); > + if (HEST_TYPE_GENERIC_V2(ghes)) > + apei_unmap_generic_address( > + &ghes->generic_v2->read_ack_register); > } > > static inline int ghes_severity(int severity) > @@ -648,6 +667,23 @@ static void ghes_estatus_cache_add( > rcu_read_unlock(); > } > > +static int ghes_do_read_ack(struct acpi_hest_generic_v2 *generic_v2) nit: We are actually writing something to the read_ack_register. The names read_ack_register (which may be as per standard) and more importantly the function name (ghes_do_read_ack) sounds a bit misleading. Rest looks fine to me. Suzuki