The revision 0x300 generic error data entry has a different layout from
the older revision, but the current iteration over the GHES estatus
block does not take this difference into account. As a result, the
wrong data entry is read when the GHES contains a revision 0x300 error
data entry.

Update the GHES estatus iteration to advance with acpi_hest_get_next(),
and correct the loop termination condition, because the status block's
data_length covers only the error data and not the status header.
Clean up the CPER estatus printing loop to use the same macro.

Signed-off-by: Dongjiu Geng <[email protected]>
CC: Tyler Baicar <[email protected]>
---
 drivers/acpi/apei/apei-internal.h | 5 -----
 drivers/firmware/efi/cper.c       | 7 +------
 include/acpi/ghes.h               | 5 +++++
 3 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/drivers/acpi/apei/apei-internal.h b/drivers/acpi/apei/apei-internal.h
index 6e9f14c0a71b..cb4126051f62 100644
--- a/drivers/acpi/apei/apei-internal.h
+++ b/drivers/acpi/apei/apei-internal.h
@@ -120,11 +120,6 @@ int apei_exec_collect_resources(struct apei_exec_context *ctx,
 struct dentry;
 struct dentry *apei_get_debugfs_dir(void);
 
-#define apei_estatus_for_each_section(estatus, section) \
-	for (section = (struct acpi_hest_generic_data *)(estatus + 1); \
-	     (void *)section - (void *)estatus < estatus->data_length; \
-	     section = (void *)(section+1) + section->error_data_length)
-
 static inline u32 cper_estatus_len(struct acpi_hest_generic_status *estatus)
 {
 	if (estatus->raw_data_length)
diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index 48a8f69da42a..dff454321160 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -606,7 +606,6 @@ void cper_estatus_print(const char *pfx,
 			const struct acpi_hest_generic_status *estatus)
 {
 	struct acpi_hest_generic_data *gdata;
-	unsigned int data_len;
 	int sec_no = 0;
 	char newpfx[64];
 	__u16 severity;
@@ -617,14 +616,10 @@ void cper_estatus_print(const char *pfx,
 		       "It has been corrected by h/w "
 		       "and requires no further action");
 	printk("%s""event severity: %s\n", pfx, cper_severity_str(severity));
-	data_len = estatus->data_length;
-	gdata = (struct acpi_hest_generic_data *)(estatus + 1);
 	snprintf(newpfx, sizeof(newpfx), "%s%s", pfx, INDENT_SP);
 
-	while (data_len >= acpi_hest_get_size(gdata)) {
+	apei_estatus_for_each_section(estatus, gdata) {
 		cper_estatus_print_section(newpfx, gdata, sec_no);
-		data_len -= acpi_hest_get_record_size(gdata);
-		gdata = acpi_hest_get_next(gdata);
 		sec_no++;
 	}
 }
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index 9f26e01186ae..9061c5c743b3 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -113,6 +113,11 @@ static inline void *acpi_hest_get_next(struct acpi_hest_generic_data *gdata)
 	return (void *)(gdata) + acpi_hest_get_record_size(gdata);
 }
 
+#define apei_estatus_for_each_section(estatus, section) \
+	for (section = (struct acpi_hest_generic_data *)(estatus + 1); \
+	     (void *)section - (void *)(estatus + 1) < estatus->data_length; \
+	     section = acpi_hest_get_next(section))
+
 int ghes_notify_sea(void);
 
 #endif /* GHES_H */
--
2.14.0
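
For readers who want to see why the increment matters, here is a minimal
user-space sketch, not kernel code. It assumes a trimmed-down entry header
(struct demo_gdata) in which revision 0x300 entries carry 8 extra header
bytes; all demo_* names and the sizes are hypothetical stand-ins for
struct acpi_hest_generic_data and the acpi_hest_get_*() helpers. It counts
entries once with a fixed header size (the old increment) and once with a
revision-aware size (the acpi_hest_get_next() style increment).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct acpi_hest_generic_data. */
struct demo_gdata {
	uint16_t revision;          /* major revision lives in the high byte */
	uint16_t reserved;
	uint32_t error_data_length; /* payload bytes that follow the header */
};

#define DEMO_V2_HDR ((uint32_t)sizeof(struct demo_gdata))
#define DEMO_V3_HDR (DEMO_V2_HDR + 8u) /* assumed: v3 appends 8 header bytes */

/* Header size for a given revision value. */
static uint32_t demo_hdr_for_rev(uint16_t rev)
{
	return (rev >> 8) >= 3 ? DEMO_V3_HDR : DEMO_V2_HDR;
}

/* Read the common header fields of the entry that starts at 'entry'. */
static struct demo_gdata demo_read(const uint8_t *entry)
{
	struct demo_gdata h;

	memcpy(&h, entry, sizeof(h));
	return h;
}

/* Write one entry header at 'off'; returns the offset just past the entry. */
static uint32_t demo_put(uint8_t *buf, uint32_t cap, uint32_t off,
			 uint16_t rev, uint32_t len)
{
	struct demo_gdata h = { .revision = rev, .error_data_length = len };
	uint32_t hdr = demo_hdr_for_rev(rev);

	assert(off + hdr + len <= cap);
	memcpy(buf + off, &h, sizeof(h));
	return off + hdr + len;
}

/*
 * Count the entries found in data[0..data_length).
 * rev_aware == 0: always step by the v2 header size (old behaviour).
 * rev_aware != 0: step by the size that matches the entry's revision.
 */
static int demo_count(const uint8_t *data, uint32_t data_length, int rev_aware)
{
	uint32_t off = 0;
	int n = 0;

	/* Stop once there is no room left for even a minimal header. */
	while (off + DEMO_V2_HDR <= data_length) {
		struct demo_gdata h = demo_read(data + off);
		uint32_t hdr = rev_aware ? demo_hdr_for_rev(h.revision)
					 : DEMO_V2_HDR;

		off += hdr + h.error_data_length;
		n++;
	}
	return n;
}
```

With two revision 0x300 entries of 4 payload bytes each in a zeroed buffer,
the revision-aware walk finds 2 entries, while the fixed-size walk lands
inside each entry and reports 4, two of them bogus.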
Looping in more people to review the patch.
On Tue, Aug 15, 2017 at 10:16:20PM +0800, gengdongjiu wrote:
> 2017-08-15 19:15 GMT+08:00, Dongjiu Geng <[email protected]>:
> > The revision 0x300 generic error data entry is different
> > from the old version, but currently iterating through the
> > GHES estatus blocks does not take into account this difference.
> > This will lead to failure to get the right data entry if GHES
> > has revision 0x300 error data entry.
> >
> > Update the GHES estatus iteration to properly increment using
> > acpi_hest_get_next, and correct the iteration termination condition
> > because the status block data length only includes error data length.
> > Clear the CPER estatus printing iteration logic to use same macro.
> >
> > Signed-off-by: Dongjiu Geng <[email protected]>
> > CC: Tyler Baicar <[email protected]>
...
> > diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
> > index 48a8f69da42a..dff454321160 100644
> > --- a/drivers/firmware/efi/cper.c
> > +++ b/drivers/firmware/efi/cper.c
> > @@ -606,7 +606,6 @@ void cper_estatus_print(const char *pfx,
> > const struct acpi_hest_generic_status *estatus)
> > {
> > struct acpi_hest_generic_data *gdata;
> > - unsigned int data_len;
> > int sec_no = 0;
> > char newpfx[64];
> > __u16 severity;
> > @@ -617,14 +616,10 @@ void cper_estatus_print(const char *pfx,
> > "It has been corrected by h/w "
> > "and requires no further action");
> > printk("%s""event severity: %s\n", pfx, cper_severity_str(severity));
> > - data_len = estatus->data_length;
> > - gdata = (struct acpi_hest_generic_data *)(estatus + 1);
> > snprintf(newpfx, sizeof(newpfx), "%s%s", pfx, INDENT_SP);
> >
> > - while (data_len >= acpi_hest_get_size(gdata)) {
> > + apei_estatus_for_each_section(estatus, gdata) {
> > cper_estatus_print_section(newpfx, gdata, sec_no);
> > - data_len -= acpi_hest_get_record_size(gdata);
> > - gdata = acpi_hest_get_next(gdata);
> > sec_no++;
This one looks cleaner to me because it gets rid of all those
variables...
> > }
> > }
> > diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
> > index 9f26e01186ae..9061c5c743b3 100644
> > --- a/include/acpi/ghes.h
> > +++ b/include/acpi/ghes.h
> > @@ -113,6 +113,11 @@ static inline void *acpi_hest_get_next(struct
> > acpi_hest_generic_data *gdata)
> > return (void *)(gdata) + acpi_hest_get_record_size(gdata);
> > }
> >
> > +#define apei_estatus_for_each_section(estatus, section) \
> > + for (section = (struct acpi_hest_generic_data *)(estatus + 1); \
> > + (void *)section - (void *)(estatus + 1) < estatus->data_length; \
> > + section = acpi_hest_get_next(section))
... and uses that accessor.
Tyler?
I'd prefer if you guys merge your two patches, Tyler's from
https://marc.info/?l=linux-acpi&m=150179595323038&w=2 and this one into
a single one.
How does that sound?
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--
Hi Borislav,
>
> ... and uses that accessor.
>
> Tyler?
>
> I'd prefer if you guys merge your two patches, Tyler's from
> https://marc.info/?l=linux-acpi&m=150179595323038&w=2 and this one into
> a single one.
I think this patch already merges them into one.
>
> How does that sound?
On Wed, Aug 16, 2017 at 12:30:55AM +0800, gengdongjiu wrote:
> I think this patch has merged them to one.
Look at both patches again.
--
Regards/Gruss,
Boris.
--
Borislav,
2017-08-16 0:32 GMT+08:00, Borislav Petkov <[email protected]>:
> On Wed, Aug 16, 2017 at 12:30:55AM +0800, gengdongjiu wrote:
>> I think this patch has merged them to one.
>
> Look at both patches again.
I have already discussed this with Tyler; see the link below. Thanks.
https://lkml.org/lkml/2017/8/14/355
On Wed, Aug 16, 2017 at 12:42:43AM +0800, gengdongjiu wrote:
> I ever discuss it with Tyler about it, as shown below link, thanks
>
> https://lkml.org/lkml/2017/8/14/355
Tyler's patch fixes ghes_do_proc() - yours cper_estatus_print(). How is
that merged?
--
Regards/Gruss,
Boris.
--
On 8/15/2017 10:55 AM, Borislav Petkov wrote:
> On Wed, Aug 16, 2017 at 12:42:43AM +0800, gengdongjiu wrote:
>> I ever discuss it with Tyler about it, as shown below link, thanks
>>
>> https://lkml.org/lkml/2017/8/14/355
> Tyler's patch fixes ghes_do_proc() - yours cper_estatus_print(). How is
> that merged?
>
Hello Boris,
His patch fixes the define for apei_estatus_for_each_section which in
turn should fix ghes_do_proc(). So my patch should no longer be needed.
I'm going to test this out just to verify that it fixes the issue I found.
Dongjiu,
This patch changes cper_estatus_print() to use
apei_estatus_for_each_section. Can you also make that same change to
cper_estatus_check() since that function is doing the same iteration?
Thanks,
Tyler
--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
On Tue, Aug 15, 2017 at 12:31:14PM -0600, Baicar, Tyler wrote:
> His patch fixes the define for apei_estatus_for_each_section which in turn
> should fix ghes_do_proc(). So my patch should no longer be needed.
I see. We're keeping the macro, of course.
> I'm going to test this out just to verify if fixes the issue I found.
Please.
Thanks.
--
Regards/Gruss,
Boris.
--
Hi Tyler,
> Hello Boris,
>
> His patch fixes the define for apei_estatus_for_each_section which in turn
> should fix ghes_do_proc(). So my patch should no longer be needed. I'm going
> to test this out just to verify if fixes the issue I found.
I have verified the iteration fix for revision 0x300 generic error data,
and it works well. It would be good if you could also verify it on your
platform.
>
> Dongjiu,
>
> This patch changes cper_estatus_print() to use
> apei_estatus_for_each_section. Can you also make that same change to
> cper_estatus_check() since that function is doing the same iteration?
I will do it, Tyler.
On 8/15/2017 3:34 PM, gengdongjiu wrote:
> Hi Tyler ,
>
>> Hello Boris,
>>
>> His patch fixes the define for apei_estatus_for_each_section which in turn
>> should fix ghes_do_proc(). So my patch should no longer be needed. I'm going
>> to test this out just to verify if fixes the issue I found.
> I have verified the issue about the iteration for the revision 0x300
> generic error data,
> it works well. it is good that you will verify that in your platform.
I've verified that this resolves the issue as well! I'll re-test with
the next version that adds this usage to cper_estatus_check() and add my
tested-by after that.
Thanks,
Tyler
Hello Tyler,
I have already posted a new version of the patch that adds this usage to
cper_estatus_check(); please re-test.
Hello Boris,
The original apei_estatus_for_each_section macro has two issues: one is
the loop termination condition, the other is the iteration step. Please
review it.
Thanks.
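
To make the termination-condition issue concrete, here is a small sketch
of the two loop tests as plain arithmetic. It assumes the generic status
header is 20 bytes (five 32-bit fields); the demo_* names are hypothetical
and data_off is an entry's offset from the first byte after that header.
The old test measures from the start of the header, the new one from the
first data entry, which is the base that data_length is relative to.

```c
#include <stdint.h>

/* Assumed size of the generic status header: five 32-bit fields. */
#define DEMO_STATUS_HDR 20u

/* Old test: distance measured from the start of the status header. */
static int demo_old_in_range(uint32_t data_off, uint32_t data_length)
{
	return DEMO_STATUS_HDR + data_off < data_length;
}

/* New test: distance measured from the first data entry. */
static int demo_new_in_range(uint32_t data_off, uint32_t data_length)
{
	return data_off < data_length;
}
```

The two tests disagree only for entries that start within the last
DEMO_STATUS_HDR bytes of the data area. Real entries are at least one full
entry header long, so whether anything is actually skipped depends on the
sizes involved; the sketch only shows that the old condition compares
against the wrong base.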