changes since v6:
- add a more explicit error message, as suggested by Xiaofei
- pick up reviewed-by tag from Xiaofei
- pick up internal reviewed-by tag from Baolin

changes since v5 by addressing comments from Kefeng:
- document return value of memory_failure()
- drop redundant comments in call site of memory_failure()
- make ghes_do_proc() void and handle abnormal case within it
- pick up reviewed-by tag from Kefeng Wang

changes since v4 by addressing comments from Xiaofei:
- do a force kill only for abnormal sync errors

changes since v3 by addressing comments from Xiaofei:
- do a force kill for abnormal memory failure errors such as invalid PA,
  unexpected severity, OOM, etc.
- pick up tested-by tag from Ma Wupeng

changes since v2 by addressing comments from Naoya:
- rename mce_task_work to sync_task_work
- drop ACPI_HEST_NOTIFY_MCE case in is_hest_sync_notify()
- add steps to reproduce this problem in cover letter

changes since v1:
- identify synchronous events by notify type
- Link: https://lore.kernel.org/lkml/[email protected]/

Shuai Xue (2):
  ACPI: APEI: set memory failure flags as MF_ACTION_REQUIRED on
    synchronous events
  ACPI: APEI: handle synchronous exceptions in task work

 arch/x86/kernel/cpu/mce/core.c |   9 +--
 drivers/acpi/apei/ghes.c       | 113 ++++++++++++++++++++++-----------
 include/acpi/ghes.h            |   3 -
 mm/memory-failure.c            |  17 +----
 4 files changed, 79 insertions(+), 63 deletions(-)
--
2.20.1.12.g72788fdb
On 2023/6/6 15:42, Shuai Xue wrote:
> [...]
Hi, Rafael,
Gentle ping.
Are you happy to queue this patch set, or is there anything I can do to improve it?
As @Kefeng said, this issue has been hit in both Alibaba and Huawei products, so we
hope it can be fixed ASAP.
Thank you.
Best Regards,
Shuai
On 2023/6/16 15:15, Shuai Xue wrote:
> [...]
Hi Rafael, Tony, and Naoya,

Gentle ping. I am sorry to see that we have missed the v6.3 and v6.4 merge windows,
even though the series has collected three Reviewed-by tags and one Tested-by tag.

Do we still need a Reviewed-by from one of the designated APEI reviewers? @Tony and
@Naoya, could you give me your Reviewed-by if you are happy with the change?

@Rafael, or could you please Ack this change if you are happy with the proposal and
the change?
On 2023/7/10 11:15, Shuai Xue wrote:
> [...]
Hi all,
Gentle ping.