2021-06-30 14:59:08

by Paolo Pisati

Subject: [PATCH] selftests: memory-hotplug: avoid spamming logs with dump_page(), ratio limit hot-remove error test

While the offline memory test obeys the ratio limit, the same test with error
injection does not and tries to offline all the hotpluggable memory, spamming
system logs with hundreds of thousands of dump_page() entries, slowing the
system down (to the point that the test itself times out and gets terminated)
and causing excessive fs occupation:

...
[ 9784.393354] page:c00c0000007d1b40 refcount:3 mapcount:0 mapping:c0000001fc03e950 index:0xe7b
[ 9784.393355] def_blk_aops
[ 9784.393356] flags: 0x3ffff800002062(referenced|active|workingset|private)
[ 9784.393358] raw: 003ffff800002062 c0000001b9343a68 c0000001b9343a68 c0000001fc03e950
[ 9784.393359] raw: 0000000000000e7b c000000006607b18 00000003ffffffff c00000000490d000
[ 9784.393359] page dumped because: migration failure
[ 9784.393360] page->mem_cgroup:c00000000490d000
[ 9784.393416] migrating pfn 1f46d failed ret:1
...

$ grep "page dumped because: migration failure" /var/log/kern.log | wc -l
2405558

$ ls -la /var/log/kern.log
-rw-r----- 1 syslog adm 2256109539 Jun 30 14:19 /var/log/kern.log

Signed-off-by: Paolo Pisati <[email protected]>
---
tools/testing/selftests/memory-hotplug/mem-on-off-test.sh | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/memory-hotplug/mem-on-off-test.sh b/tools/testing/selftests/memory-hotplug/mem-on-off-test.sh
index b37585e6aa38..46a97f318f58 100755
--- a/tools/testing/selftests/memory-hotplug/mem-on-off-test.sh
+++ b/tools/testing/selftests/memory-hotplug/mem-on-off-test.sh
@@ -282,7 +282,9 @@ done
 #
 echo $error > $NOTIFIER_ERR_INJECT_DIR/actions/MEM_GOING_OFFLINE/error
 for memory in `hotpluggable_online_memory`; do
-	offline_memory_expect_fail $memory
+	if [ $((RANDOM % 100)) -lt $ratio ]; then
+		offline_memory_expect_fail $memory
+	fi
 done

 echo 0 > $NOTIFIER_ERR_INJECT_DIR/actions/MEM_GOING_OFFLINE/error
--
2.30.2
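
For illustration, the sampling that the patch adds can be sketched in
isolation. This is a minimal sketch under stated assumptions, not the selftest
itself: offline_memory_expect_fail is replaced by a hypothetical counting stub,
the block list is a made-up 1..1000 range rather than the output of
hotpluggable_online_memory, and run_sampled and attempted are names invented
for this example.

```shell
#!/bin/bash

# Hypothetical stub: the real helper tries to offline a memory block and
# expects the attempt to fail. Here it only counts how often it is called.
attempted=0
offline_memory_expect_fail() {
	attempted=$((attempted + 1))
}

# Walk a made-up list of 1000 block numbers and attempt each one with a
# probability of roughly $1 percent, using the same test as the patch.
run_sampled() {
	local ratio=$1 memory
	attempted=0
	for memory in $(seq 1 1000); do
		# RANDOM is a bash built-in in the range 0..32767, so
		# RANDOM % 100 is roughly uniform over 0..99.
		if [ $((RANDOM % 100)) -lt "$ratio" ]; then
			offline_memory_expect_fail "$memory"
		fi
	done
}

run_sampled 10	# 10 is an assumed ratio for this sketch
echo "attempted $attempted of 1000 blocks"
```

With a ratio of 0 no block is attempted and with a ratio of 100 every block is,
so the number of dump_page() bursts scales with the ratio instead of with the
total amount of hotpluggable memory.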


2021-07-02 15:15:45

by Krzysztof Kozlowski

Subject: Re: [PATCH] selftests: memory-hotplug: avoid spamming logs with dump_page(), ratio limit hot-remove error test

On 30/06/2021 16:57, Paolo Pisati wrote:
> While the offline memory test obeys the ratio limit, the same test with error
> injection does not and tries to offline all the hotpluggable memory, spamming
> system logs with hundreds of thousands of dump_page() entries, slowing the
> system down (to the point that the test itself times out and gets terminated)
> and causing excessive fs occupation:
>
> ...
> [ 9784.393354] page:c00c0000007d1b40 refcount:3 mapcount:0 mapping:c0000001fc03e950 index:0xe7b
> [ 9784.393355] def_blk_aops
> [ 9784.393356] flags: 0x3ffff800002062(referenced|active|workingset|private)
> [ 9784.393358] raw: 003ffff800002062 c0000001b9343a68 c0000001b9343a68 c0000001fc03e950
> [ 9784.393359] raw: 0000000000000e7b c000000006607b18 00000003ffffffff c00000000490d000
> [ 9784.393359] page dumped because: migration failure
> [ 9784.393360] page->mem_cgroup:c00000000490d000
> [ 9784.393416] migrating pfn 1f46d failed ret:1
> ...
>
> $ grep "page dumped because: migration failure" /var/log/kern.log | wc -l
> 2405558
>
> $ ls -la /var/log/kern.log
> -rw-r----- 1 syslog adm 2256109539 Jun 30 14:19 /var/log/kern.log

Makes sense to me and looks like a better choice than disabling the test
completely (as the other choice...).

Acked-by: Krzysztof Kozlowski <[email protected]>

>
> Signed-off-by: Paolo Pisati <[email protected]>
> ---
> tools/testing/selftests/memory-hotplug/mem-on-off-test.sh | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
Best regards,
Krzysztof

2021-07-09 13:13:07

by Paolo Pisati

Subject: Re: [PATCH] selftests: memory-hotplug: avoid spamming logs with dump_page(), ratio limit hot-remove error test

On Wed, Jun 30, 2021 at 04:57:40PM +0200, Paolo Pisati wrote:
> While the offline memory test obeys the ratio limit, the same test with error
> injection does not and tries to offline all the hotpluggable memory, spamming
> system logs with hundreds of thousands of dump_page() entries, slowing the
> system down (to the point that the test itself times out and gets terminated)
> and causing excessive fs occupation:
>
> ...

Could anyone with spare cycles review this? It got one ack already.
--
bye,
p.

2021-07-09 17:04:34

by Shuah Khan

Subject: Re: [PATCH] selftests: memory-hotplug: avoid spamming logs with dump_page(), ratio limit hot-remove error test

On 7/9/21 7:00 AM, Paolo Pisati wrote:
> On Wed, Jun 30, 2021 at 04:57:40PM +0200, Paolo Pisati wrote:
>> While the offline memory test obeys the ratio limit, the same test with error
>> injection does not and tries to offline all the hotpluggable memory, spamming
>> system logs with hundreds of thousands of dump_page() entries, slowing the
>> system down (to the point that the test itself times out and gets terminated)
>> and causing excessive fs occupation:
>>
>> ...
>
> Could anyone with spare cycles review this? It got one ack already.
>

Thanks for finding and fixing this.

Looks good to me. I will pull this in as soon as the merge window
ends and 5.14-rc1 comes out.

thanks,
-- Shuah