2016-10-11 20:48:02

by kernel test robot

Subject: [mm] c4344e8035: WARNING: CPU: 0 PID: 101 at mm/memory.c:303 __tlb_remove_page_size+0x25/0x99

FYI, we noticed the following commit:

https://github.com/0day-ci/linux Aneesh-Kumar-K-V/mm-Use-the-correct-page-size-when-removing-the-page/20161012-013446
commit c4344e80359420d7574b3b90fddf53311f1d24e6 ("mm: Remove the page size change check in tlb_remove_page")

in testcase: boot

on test machine: qemu-system-i386 -enable-kvm -cpu Haswell,+smep,+smap -m 360M

caused below changes:


+------------------------------------------------+------------+------------+
|                                                | eff764128d | c4344e8035 |
+------------------------------------------------+------------+------------+
| boot_successes                                 | 59         | 0          |
| boot_failures                                  | 0          | 43         |
| WARNING:at_mm/memory.c:#__tlb_remove_page_size | 0          | 43         |
| calltrace:SyS_execve                           | 0          | 43         |
| calltrace:run_init_process                     | 0          | 21         |
+------------------------------------------------+------------+------------+



[ 4.096204] Write protecting the kernel text: 3148k
[ 4.096911] Write protecting the kernel read-only data: 1444k
[ 4.120357] ------------[ cut here ]------------
[ 4.121078] WARNING: CPU: 0 PID: 101 at mm/memory.c:303 __tlb_remove_page_size+0x25/0x99
[ 4.122380] Modules linked in:
[ 4.122788] CPU: 0 PID: 101 Comm: run-parts Not tainted 4.8.0-mm1-00315-gc4344e8 #5
[ 4.123956] bd145dc4 b111e5e6 bd145de0 b10320dc 0000012f b10974d1 bd145e70 c4954170
[ 4.125277] c4954170 bd145df4 b103215f 00000009 00000000 00000000 bd145e04 b10974d1
[ 4.126424] c4954170 bd145e70 bd145e14 b10263ca bd145e70 bd47bafc bd145e40 b109767a
[ 4.127622] Call Trace:
[ 4.128255] ------------[ cut here ]------------
[ 4.128261] WARNING: CPU: 0 PID: 103 at mm/memory.c:303 __tlb_remove_page_size+0x25/0x99
[ 4.128261] Modules linked in:
[ 4.128264] CPU: 0 PID: 103 Comm: sh Not tainted 4.8.0-mm1-00315-gc4344e8 #5
[ 4.128268] bd143dc4 b111e5e6 bd143de0 b10320dc 0000012f b10974d1 bd143e70 c494cd00
[ 4.128271] c494cd00 bd143df4 b103215f 00000009 00000000 00000000 bd143e04 b10974d1
[ 4.128274] c494cd00 bd143e70 bd143e14 b10263ca bd143e70 bd47dafc bd143e40 b109767a
[ 4.128275] Call Trace:
[ 4.128281] [<b111e5e6>] dump_stack+0x16/0x18
[ 4.128284] [<b10320dc>] __warn+0xa5/0xbc
[ 4.128286] [<b10974d1>] ? __tlb_remove_page_size+0x25/0x99
[ 4.128288] [<b103215f>] warn_slowpath_null+0x11/0x16
[ 4.128290] [<b10974d1>] __tlb_remove_page_size+0x25/0x99
[ 4.128293] [<b10263ca>] ___pte_free_tlb+0x57/0x66
[ 4.128295] [<b109767a>] free_pgd_range+0x135/0x1d0
[ 4.128298] [<b10b6ad7>] setup_arg_pages+0x219/0x29a
[ 4.128302] [<b10db3a6>] load_elf_binary+0x2ad/0x94a
[ 4.128305] [<b11266fe>] ? _copy_from_user+0x49/0x5c
[ 4.128307] [<b10b6015>] search_binary_handler+0x106/0x159
[ 4.128309] [<b10b7558>] do_execveat_common+0x3bf/0x4dc
[ 4.128311] [<b10b7689>] do_execve+0x14/0x16
[ 4.128313] [<b10b780a>] SyS_execve+0x16/0x18
[ 4.128316] [<b1000d90>] do_fast_syscall_32+0x8f/0xce
[ 4.128320] [<b131145e>] sysenter_past_esp+0x47/0x75
[ 4.128322] ---[ end trace 816334aebb0eaffe ]---
[ 4.132981] ------------[ cut here ]------------





Thanks,
Kernel Test Robot


Attachments:
(No filename) (3.27 kB)
config-4.8.0-mm1-00315-gc4344e8 (93.61 kB)
job-script (3.87 kB)
dmesg.xz (28.70 kB)

2016-10-12 04:34:35

by Aneesh Kumar K.V

Subject: Re: [mm] c4344e8035: WARNING: CPU: 0 PID: 101 at mm/memory.c:303 __tlb_remove_page_size+0x25/0x99

kernel test robot <[email protected]> writes:

> FYI, we noticed the following commit:
>
> https://github.com/0day-ci/linux Aneesh-Kumar-K-V/mm-Use-the-correct-page-size-when-removing-the-page/20161012-013446
> commit c4344e80359420d7574b3b90fddf53311f1d24e6 ("mm: Remove the page size change check in tlb_remove_page")
>
> in testcase: boot
>
> on test machine: qemu-system-i386 -enable-kvm -cpu Haswell,+smep,+smap -m 360M
>
> caused below changes:
>
>
> +------------------------------------------------+------------+------------+
> |                                                | eff764128d | c4344e8035 |
> +------------------------------------------------+------------+------------+
> | boot_successes                                 | 59         | 0          |
> | boot_failures                                  | 0          | 43         |
> | WARNING:at_mm/memory.c:#__tlb_remove_page_size | 0          | 43         |
> | calltrace:SyS_execve                           | 0          | 43         |
> | calltrace:run_init_process                     | 0          | 21         |
> +------------------------------------------------+------------+------------+
>
>
>
> [ 4.096204] Write protecting the kernel text: 3148k
> [ 4.096911] Write protecting the kernel read-only data: 1444k
> [ 4.120357] ------------[ cut here ]------------
> [ 4.121078] WARNING: CPU: 0 PID: 101 at mm/memory.c:303 __tlb_remove_page_size+0x25/0x99
> [ 4.122380] Modules linked in:
> [ 4.122788] CPU: 0 PID: 101 Comm: run-parts Not tainted 4.8.0-mm1-00315-gc4344e8 #5
> [ 4.123956] bd145dc4 b111e5e6 bd145de0 b10320dc 0000012f b10974d1 bd145e70 c4954170
> [ 4.125277] c4954170 bd145df4 b103215f 00000009 00000000 00000000 bd145e04 b10974d1
> [ 4.126424] c4954170 bd145e70 bd145e14 b10263ca bd145e70 bd47bafc bd145e40 b109767a
> [ 4.127622] Call Trace:

Thanks for the report. The below change should fix this.

commit 18c929e7cf672da617dc218c6265366bf78b1644
Author: Aneesh Kumar K.V <[email protected]>
Date: Wed Oct 12 08:40:41 2016 +0530

update mmu gather page size before flushing page table cache

diff --git a/mm/memory.c b/mm/memory.c
index 26d1ba8c87e6..7e7eccb82a2b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -526,7 +526,11 @@ void free_pgd_range(struct mmu_gather *tlb,
 		end -= PMD_SIZE;
 	if (addr > end - 1)
 		return;
-
+	/*
+	 * We add page table cache pages with PAGE_SIZE,
+	 * (see pte_free_tlb()), flush the tlb if we need
+	 */
+	tlb_remove_check_page_size_change(tlb, PAGE_SIZE);
 	pgd = pgd_offset(tlb->mm, addr);
 	do {
 		next = pgd_addr_end(addr, end);

2016-10-12 07:15:32

by kernel test robot

Subject: Re: [mm] c4344e8035: WARNING: CPU: 0 PID: 101 at mm/memory.c:303 __tlb_remove_page_size+0x25/0x99

On 10/12, Aneesh Kumar K.V wrote:
>kernel test robot <[email protected]> writes:
>
>> FYI, we noticed the following commit:
>>
>> https://github.com/0day-ci/linux Aneesh-Kumar-K-V/mm-Use-the-correct-page-size-when-removing-the-page/20161012-013446
>> commit c4344e80359420d7574b3b90fddf53311f1d24e6 ("mm: Remove the page size change check in tlb_remove_page")
>>
>> in testcase: boot
>>
>> on test machine: qemu-system-i386 -enable-kvm -cpu Haswell,+smep,+smap -m 360M
>>
>> caused below changes:
>>
>>
>> +------------------------------------------------+------------+------------+
>> |                                                | eff764128d | c4344e8035 |
>> +------------------------------------------------+------------+------------+
>> | boot_successes                                 | 59         | 0          |
>> | boot_failures                                  | 0          | 43         |
>> | WARNING:at_mm/memory.c:#__tlb_remove_page_size | 0          | 43         |
>> | calltrace:SyS_execve                           | 0          | 43         |
>> | calltrace:run_init_process                     | 0          | 21         |
>> +------------------------------------------------+------------+------------+
>>
>>
>>
>> [ 4.096204] Write protecting the kernel text: 3148k
>> [ 4.096911] Write protecting the kernel read-only data: 1444k
>> [ 4.120357] ------------[ cut here ]------------
>> [ 4.121078] WARNING: CPU: 0 PID: 101 at mm/memory.c:303 __tlb_remove_page_size+0x25/0x99
>> [ 4.122380] Modules linked in:
>> [ 4.122788] CPU: 0 PID: 101 Comm: run-parts Not tainted 4.8.0-mm1-00315-gc4344e8 #5
>> [ 4.123956] bd145dc4 b111e5e6 bd145de0 b10320dc 0000012f b10974d1 bd145e70 c4954170
>> [ 4.125277] c4954170 bd145df4 b103215f 00000009 00000000 00000000 bd145e04 b10974d1
>> [ 4.126424] c4954170 bd145e70 bd145e14 b10263ca bd145e70 bd47bafc bd145e40 b109767a
>> [ 4.127622] Call Trace:
>
>Thanks for the report. The below change should fix this.
>
>commit 18c929e7cf672da617dc218c6265366bf78b1644
>Author: Aneesh Kumar K.V <[email protected]>
>Date: Wed Oct 12 08:40:41 2016 +0530
>
> update mmu gather page size before flushing page table cache
>
>diff --git a/mm/memory.c b/mm/memory.c
>index 26d1ba8c87e6..7e7eccb82a2b 100644
>--- a/mm/memory.c
>+++ b/mm/memory.c
>@@ -526,7 +526,11 @@ void free_pgd_range(struct mmu_gather *tlb,
> 		end -= PMD_SIZE;
> 	if (addr > end - 1)
> 		return;
>-
>+	/*
>+	 * We add page table cache pages with PAGE_SIZE,
>+	 * (see pte_free_tlb()), flush the tlb if we need
>+	 */
>+	tlb_remove_check_page_size_change(tlb, PAGE_SIZE);
> 	pgd = pgd_offset(tlb->mm, addr);
> 	do {
> 		next = pgd_addr_end(addr, end);
>

I just applied this fix on top of commit c4344e8035 and confirmed that
the reported warning is gone.

Tested-by: Xiaolong Ye <[email protected]>

=========================================================================================
compiler/kconfig/rootfs/sleep/tbox_group/testcase:
gcc-6/i386-randconfig-s1-201641/quantal-core-i386.cgz/1/vm-vp-quantal-i386/boot

commit:
c4344e80359420d7574b3b90fddf53311f1d24e6
384db818365c90b91d8bad80be188765e801cf58 ("update mmu gather page size before flushing page table cache")

c4344e80359420d7 384db818365c90b91d8bad80be
---------------- --------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
         24:24        -100%            :5     dmesg.WARNING:at_mm/memory.c:#__tlb_remove_page_size

Thanks,
Xiaolong