Currently, iterating through the GHES estatus blocks does not
take into account the size of the new generic data v3 structure.
As a result, the loop neither steps through the estatus blocks
correctly nor terminates properly, which can trigger garbage
non-standard trace events.

Update the GHES estatus iteration to step through the estatus
blocks the same way the CPER estatus printing code does.
Fixes: bbcc2e7b642e ("ras: acpi/apei: cper: add support for generic data v3 structure")
Signed-off-by: Tyler Baicar <[email protected]>
Tested-by: Austin Christ <[email protected]>
---
drivers/acpi/apei/apei-internal.h | 5 -----
drivers/acpi/apei/ghes.c | 8 +++++++-
2 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/acpi/apei/apei-internal.h b/drivers/acpi/apei/apei-internal.h
index 6e9f14c..cb41260 100644
--- a/drivers/acpi/apei/apei-internal.h
+++ b/drivers/acpi/apei/apei-internal.h
@@ -120,11 +120,6 @@ int apei_exec_collect_resources(struct apei_exec_context *ctx,
 struct dentry;
 struct dentry *apei_get_debugfs_dir(void);
 
-#define apei_estatus_for_each_section(estatus, section) \
-	for (section = (struct acpi_hest_generic_data *)(estatus + 1); \
-	     (void *)section - (void *)estatus < estatus->data_length; \
-	     section = (void *)(section+1) + section->error_data_length)
-
 static inline u32 cper_estatus_len(struct acpi_hest_generic_status *estatus)
 {
 	if (estatus->raw_data_length)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 6a6895a..dba1771 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -466,9 +466,13 @@ static void ghes_do_proc(struct ghes *ghes,
 	guid_t *sec_type;
 	guid_t *fru_id = &NULL_UUID_LE;
 	char *fru_text = "";
+	unsigned int data_len;
 
 	sev = ghes_severity(estatus->error_severity);
-	apei_estatus_for_each_section(estatus, gdata) {
+	data_len = estatus->data_length;
+	gdata = (struct acpi_hest_generic_data *)(estatus + 1);
+
+	while (data_len >= acpi_hest_get_size(gdata)) {
 		sec_type = (guid_t *)gdata->section_type;
 		sec_sev = ghes_severity(gdata->error_severity);
 		if (gdata->validation_bits & CPER_SEC_VALID_FRU_ID)
@@ -528,6 +532,8 @@ static void ghes_do_proc(struct ghes *ghes,
 					       sec_sev, err,
 					       gdata->error_data_length);
 		}
+		data_len -= acpi_hest_get_record_size(gdata);
+		gdata = acpi_hest_get_next(gdata);
 	}
 }
--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
On Thu, Aug 03, 2017 at 03:32:25PM -0600, Tyler Baicar wrote:
> Currently iterating through the GHES estatus blocks does not
> take into account the new generic data v3 structure size. This
> can result in garbage non-standard trace events to be triggered
> since the loop will not properly iterate through the estatus
> blocks and not properly terminate.
>
> Update the GHES estatus iteration to properly increment through
> the estatus blocks similar to how the CPER estatus printing
> iterates through them.
>
> Fixes: bbcc2e7b642e ("ras: acpi/apei: cper: add support for generic data v3 structure")
> Signed-off-by: Tyler Baicar <[email protected]>
> Tested-by: Austin Christ <[email protected]>
> ---
> drivers/acpi/apei/apei-internal.h | 5 -----
> drivers/acpi/apei/ghes.c | 8 +++++++-
> 2 files changed, 7 insertions(+), 6 deletions(-)
Whilst much of the initial code here went through the arm64 tree in the
previous merge window, I'm assuming that Boris will take this fix via his
tree (likewise for "[PATCH V2] acpi: apei: clear error status before
acknowledging the error").
Will
On Tuesday, August 8, 2017 6:32:20 PM CEST Will Deacon wrote:
> On Thu, Aug 03, 2017 at 03:32:25PM -0600, Tyler Baicar wrote:
> > Currently iterating through the GHES estatus blocks does not
> > take into account the new generic data v3 structure size. This
> > can result in garbage non-standard trace events to be triggered
> > since the loop will not properly iterate through the estatus
> > blocks and not properly terminate.
> >
> > Update the GHES estatus iteration to properly increment through
> > the estatus blocks similar to how the CPER estatus printing
> > iterates through them.
> >
> > Fixes: bbcc2e7b642e ("ras: acpi/apei: cper: add support for generic data v3 structure")
> > Signed-off-by: Tyler Baicar <[email protected]>
> > Tested-by: Austin Christ <[email protected]>
> > ---
> > drivers/acpi/apei/apei-internal.h | 5 -----
> > drivers/acpi/apei/ghes.c | 8 +++++++-
> > 2 files changed, 7 insertions(+), 6 deletions(-)
>
> Whilst much of the initial code here went through the arm64 tree in the
> previous merge window, I'm assuming that Boris will take this fix via his
> tree (likewise for "[PATCH V2] acpi: apei: clear error status before
> acknowledging the error").
Actually I will if Boris ACKs these.
Thanks,
Rafael
On Wed, Aug 09, 2017 at 02:52:06AM +0200, Rafael J. Wysocki wrote:
> Actually I will if Boris ACKs these.
It's all on the TODO list for when I get back from vacation next week.
Unless Tony gets to them before me...
Thanks.
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
On 2017/8/9 8:52, Rafael J. Wysocki wrote:
> On Tuesday, August 8, 2017 6:32:20 PM CEST Will Deacon wrote:
>> On Thu, Aug 03, 2017 at 03:32:25PM -0600, Tyler Baicar wrote:
>>> Currently iterating through the GHES estatus blocks does not
>>> take into account the new generic data v3 structure size. This
>>> can result in garbage non-standard trace events to be triggered
>>> since the loop will not properly iterate through the estatus
>>> blocks and not properly terminate.
>>>
>>> Update the GHES estatus iteration to properly increment through
>>> the estatus blocks similar to how the CPER estatus printing
>>> iterates through them.
>>>
>>> Fixes: bbcc2e7b642e ("ras: acpi/apei: cper: add support for generic data v3 structure")
I do not think this is related to "bbcc2e7b642e". The root cause is that the
iteration does not consider acpi_hest_generic_data_v300, and this code has
existed for a long time. So please also review this patch:
https://lkml.org/lkml/2017/8/10/747
I had not seen Tyler's change before; I have only just noticed it.
>>> Signed-off-by: Tyler Baicar <[email protected]>
>>> Tested-by: Austin Christ <[email protected]>
>>> ---
>>> drivers/acpi/apei/apei-internal.h | 5 -----
>>> drivers/acpi/apei/ghes.c | 8 +++++++-
>>> 2 files changed, 7 insertions(+), 6 deletions(-)
>>
>> Whilst much of the initial code here went through the arm64 tree in the
>> previous merge window, I'm assuming that Boris will take this fix via his
>> tree (likewise for "[PATCH V2] acpi: apei: clear error status before
>> acknowledging the error").
>
> Actually I will if Boris ACKs these.
>
> Thanks,
> Rafael
On 8/10/2017 4:04 PM, gengdongjiu wrote:
>
> On 2017/8/9 8:52, Rafael J. Wysocki wrote:
>> On Tuesday, August 8, 2017 6:32:20 PM CEST Will Deacon wrote:
>>> On Thu, Aug 03, 2017 at 03:32:25PM -0600, Tyler Baicar wrote:
>>>> Currently iterating through the GHES estatus blocks does not
>>>> take into account the new generic data v3 structure size. This
>>>> can result in garbage non-standard trace events to be triggered
>>>> since the loop will not properly iterate through the estatus
>>>> blocks and not properly terminate.
>>>>
>>>> Update the GHES estatus iteration to properly increment through
>>>> the estatus blocks similar to how the CPER estatus printing
>>>> iterates through them.
>>>>
>>>> Fixes: bbcc2e7b642e ("ras: acpi/apei: cper: add support for generic data v3 structure")
> I do not think it has relationship with this "bbcc2e7b642e", the root cause is that it
> does not consider the acpi_hest_generic_data_v300, this code exists long long time ago.
bbcc2e7b642e ("ras: acpi/apei: cper: add support for generic data v3 structure") is the change
that introduced the generic data v300 structure. This patch fixes the iteration, which did
not take the larger size of the v300 structure into account.
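To make the size difference concrete, here is a minimal user-space sketch
in C. It is not the kernel code: the model_* names, the 64- and 72-byte
header constants, and the buffer layout are assumptions for illustration,
based on the v300 header carrying an extra 8-byte timestamp and on the
major version living in the high byte of the revision field.
model_legacy_next() steps the way the removed macro did, and model_walk()
steps by the version-dependent record size, as the patch does. The
truncation check inside model_walk() is an extra guard for the sketch, not
something this patch adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the ACPI generic data headers (not kernel types). */
#define MODEL_V2_HDR_LEN 64u	/* assumed header size before revision 3 */
#define MODEL_V3_HDR_LEN 72u	/* assumed revision 3 size: 8-byte timestamp added */

struct model_gdata {
	uint8_t  section_type[16];
	uint32_t error_severity;
	uint16_t revision;		/* major version in the high byte */
	uint8_t  validation_bits;
	uint8_t  flags;
	uint32_t error_data_length;	/* payload bytes following the header */
	uint8_t  fru_id[16];
	uint8_t  fru_text[20];
};

_Static_assert(sizeof(struct model_gdata) == MODEL_V2_HDR_LEN,
	       "model header must match the assumed v2 size");

/* Header length depends on the record's major version. */
static uint32_t model_hdr_len(const struct model_gdata *g)
{
	return (g->revision >> 8) >= 3 ? MODEL_V3_HDR_LEN : MODEL_V2_HDR_LEN;
}

/* Stride of the removed macro: always the v2 header size plus the payload. */
static uint32_t model_legacy_next(uint32_t off, const struct model_gdata *g)
{
	return off + (uint32_t)sizeof(*g) + g->error_data_length;
}

/* Walk records by version-dependent size; store each record's offset. */
static size_t model_walk(const uint8_t *buf, uint32_t data_len,
			 uint32_t *offsets, size_t max)
{
	size_t n = 0;
	uint32_t off = 0;

	while (n < max && data_len - off >= MODEL_V2_HDR_LEN) {
		struct model_gdata g;
		uint32_t hdr;

		memcpy(&g, buf + off, sizeof(g));
		hdr = model_hdr_len(&g);
		/* Sketch-only guard: stop on a truncated record. */
		if (data_len - off < hdr ||
		    g.error_data_length > data_len - off - hdr)
			break;
		offsets[n++] = off;
		off += hdr + g.error_data_length;
	}
	return n;
}

/* Test helper: write a zeroed header with the given revision and length. */
static void model_put(uint8_t *buf, size_t cap, uint32_t off,
		      uint16_t revision, uint32_t payload_len)
{
	struct model_gdata g;

	memset(&g, 0, sizeof(g));
	g.revision = revision;
	g.error_data_length = payload_len;
	assert(off + sizeof(g) <= cap);
	memcpy(buf + off, &g, sizeof(g));
}
```

With two revision-3 records carrying 8 and 16 payload bytes, model_walk()
visits offsets 0 and 80, while model_legacy_next() from offset 0 lands on
72, which is still inside the first record.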
>
> so please also review this patch:
> https://lkml.org/lkml/2017/8/10/747
>
> previously I do not see Tyler's this modification, just know see it.
>
>
>>>> Signed-off-by: Tyler Baicar <[email protected]>
>>>> Tested-by: Austin Christ <[email protected]>
>>>> ---
>>>> drivers/acpi/apei/apei-internal.h | 5 -----
>>>> drivers/acpi/apei/ghes.c | 8 +++++++-
>>>> 2 files changed, 7 insertions(+), 6 deletions(-)
>>> Whilst much of the initial code here went through the arm64 tree in the
>>> previous merge window, I'm assuming that Boris will take this fix via his
>>> tree (likewise for "[PATCH V2] acpi: apei: clear error status before
>>> acknowledging the error").
>> Actually I will if Boris ACKs these.
>>
>> Thanks,
>> Rafael