2024-05-01 23:27:31

by Jane Chu

Subject: [PATCH 2/3] mm/madvise: Add MF_ACTION_REQUIRED to madvise(MADV_HWPOISON)

The soft hwpoison injector via madvise(MADV_HWPOISON) operates
synchronously in a sense: the injector is itself a process under
test, and if it has the poisoned page mapped in its address space,
it should legitimately get killed, just as in a real UE situation.

Signed-off-by: Jane Chu <[email protected]>
---
mm/madvise.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 1a073fcc4c0c..eaeae5252c02 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1127,7 +1127,7 @@ static int madvise_inject_error(int behavior,
 		} else {
 			pr_info("Injecting memory failure for pfn %#lx at process virtual address %#lx\n",
 				 pfn, start);
-			ret = memory_failure(pfn, MF_COUNT_INCREASED | MF_SW_SIMULATED);
+			ret = memory_failure(pfn, MF_ACTION_REQUIRED | MF_COUNT_INCREASED | MF_SW_SIMULATED);
 			if (ret == -EOPNOTSUPP)
 				ret = 0;
 		}
--
2.39.3



2024-05-06 19:54:44

by Jane Chu

Subject: Re: [PATCH 2/3] mm/madvise: Add MF_ACTION_REQUIRED to madvise(MADV_HWPOISON)

On 5/5/2024 12:02 AM, Miaohe Lin wrote:

> On 2024/5/2 7:24, Jane Chu wrote:
>> The soft hwpoison injector via madvise(MADV_HWPOISON) operates
>> synchronously in a sense: the injector is itself a process under
>> test, and if it has the poisoned page mapped in its address space,
>> it should legitimately get killed, just as in a real UE situation.
> Would it be better to add a way to set MF_ACTION_REQUIRED explicitly when injecting a soft hwpoison?
> Thanks.

So the first question is: is there a need to preserve the existing
behavior of madvise(MADV_HWPOISON)?

The madvise(2) man page says -

*MADV_HWPOISON* (since Linux 2.6.32)
Poison the pages in the range specified by /addr/ and /length/
and handle subsequent references to those pages like a
hardware memory corruption. This operation is available
only for privileged (*CAP_SYS_ADMIN*) processes. This
operation may result in the calling process receiving a
*SIGBUS* and the page being unmapped.

This feature is intended for testing of memory error-handling
code; it is available only if the kernel was
configured with *CONFIG_MEMORY_FAILURE*.

My impression from reading this is that there doesn't seem to be a need.

A couple of observations -
- The man page states that the calling process may receive a SIGBUS and that the page may be unmapped.
But the existing behavior is that no SIGBUS is sent unless MCE early kill is elected, so it doesn't
quite match the man page.
- There is also 'hwpoison-inject', which behaves similarly to the existing madvise(MADV_HWPOISON), that is,
soft injection without the MF_ACTION_REQUIRED flag.
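
To make the first observation concrete, here is a rough user-space sketch of how one might check what the caller of madvise(MADV_HWPOISON) actually sees. It is an illustration only, not part of the patch: the helper names sigbus_code_name() and inject_and_observe() are made up for this example, and the fallback value for MADV_HWPOISON is taken from the Linux uapi headers. My expectation, which I have not verified on every kernel, is that a SIGBUS raised on the action-required path carries si_code BUS_MCEERR_AR, while an early-kill notification carries BUS_MCEERR_AO.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_HWPOISON
#define MADV_HWPOISON 100	/* assumed fallback; value from the Linux uapi headers */
#endif

static sigjmp_buf sigbus_env;
static volatile sig_atomic_t sigbus_code;

/* Hypothetical helper: label a SIGBUS si_code (AR = action required, AO = action optional). */
const char *sigbus_code_name(int code)
{
	switch (code) {
	case BUS_MCEERR_AR:
		return "BUS_MCEERR_AR";
	case BUS_MCEERR_AO:
		return "BUS_MCEERR_AO";
	default:
		return "other";
	}
}

/* Record the si_code and jump back out of the faulting context. */
static void on_sigbus(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)ctx;
	sigbus_code = info->si_code;
	siglongjmp(sigbus_env, 1);
}

/*
 * Hypothetical helper: poison one freshly mapped page and report what the
 * caller observes. Returns 1 if a SIGBUS arrived, 0 if madvise() returned
 * without one, -1 on setup or madvise() failure.
 * WARNING: on success the backing page frame is really marked poisoned,
 * so only run this on a test machine.
 */
int inject_and_observe(void)
{
	struct sigaction sa;
	long page = sysconf(_SC_PAGESIZE);
	char *p;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_sigbus;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGBUS, &sa, NULL) < 0)
		return -1;

	p = mmap(NULL, page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	p[0] = 1;	/* fault the page in so there is a pfn to poison */

	if (sigsetjmp(sigbus_env, 1)) {
		printf("SIGBUS, si_code=%s\n", sigbus_code_name(sigbus_code));
		return 1;
	}
	if (madvise(p, page, MADV_HWPOISON) < 0) {
		/* e.g. EPERM without CAP_SYS_ADMIN, EINVAL without CONFIG_MEMORY_FAILURE */
		printf("madvise(MADV_HWPOISON): %s\n", strerror(errno));
		munmap(p, page);
		return -1;
	}
	printf("madvise() returned without SIGBUS\n");
	return 0;
}
```

Calling inject_and_observe() from main() as root on a kernel built with CONFIG_MEMORY_FAILURE should show which of the two behaviors is in effect: a return value of 0 would correspond to the existing behavior described above, and a return value of 1 to what the man page suggests may happen.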

thanks,
-jane

> .
>
>> Signed-off-by: Jane Chu <[email protected]>
>> ---
>> mm/madvise.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/madvise.c b/mm/madvise.c
>> index 1a073fcc4c0c..eaeae5252c02 100644
>> --- a/mm/madvise.c
>> +++ b/mm/madvise.c
>> @@ -1127,7 +1127,7 @@ static int madvise_inject_error(int behavior,
>>  		} else {
>>  			pr_info("Injecting memory failure for pfn %#lx at process virtual address %#lx\n",
>>  				 pfn, start);
>> -			ret = memory_failure(pfn, MF_COUNT_INCREASED | MF_SW_SIMULATED);
>> +			ret = memory_failure(pfn, MF_ACTION_REQUIRED | MF_COUNT_INCREASED | MF_SW_SIMULATED);
>>  			if (ret == -EOPNOTSUPP)
>>  				ret = 0;
>>  		}
>>