2019-04-10 14:57:55

by kernel test robot

[permalink] [raw]
Subject: 1808d65b55 ("asm-generic/tlb: Remove arch_tlb*_mmu()"): BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr

Greetings,

0day kernel testing robot got the below dmesg and the first bad commit is

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git core/mm

commit 1808d65b55e4489770dd4f76fb0dff5b81eb9b11
Author: Peter Zijlstra <[email protected]>
AuthorDate: Thu Sep 20 10:50:11 2018 +0200
Commit: Ingo Molnar <[email protected]>
CommitDate: Wed Apr 3 10:32:58 2019 +0200

asm-generic/tlb: Remove arch_tlb*_mmu()

Now that all architectures are converted to the generic code, remove
the arch hooks.

No change in behavior intended.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Will Deacon <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>

9de7d833e3 s390/tlb: Convert to generic mmu_gather
1808d65b55 asm-generic/tlb: Remove arch_tlb*_mmu()
6455959819 ia64/tlb: Eradicate tlb_migrate_finish() callback
31437a258f Merge branch 'perf/urgent'
+------------------------------------------------------------+------------+------------+------------+------------+
| | 9de7d833e3 | 1808d65b55 | 6455959819 | 31437a258f |
+------------------------------------------------------------+------------+------------+------------+------------+
| boot_successes | 0 | 0 | 0 | 0 |
| boot_failures | 44 | 11 | 11 | 11 |
| BUG:KASAN:stack-out-of-bounds_in__unwind_start | 44 | | | |
| BUG:KASAN:stack-out-of-bounds_in__change_page_attr_set_clr | 0 | 11 | 11 | 11 |
+------------------------------------------------------------+------------+------------+------------+------------+

[ 13.977997] rodata_test: all tests were successful
[ 13.979792] x86/mm: Checking user space page tables
[ 14.011779] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[ 14.013022] Run /init as init process
[ 14.015154] ==================================================================
[ 14.016489] BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr+0xa8/0x4df
[ 14.017853] Read of size 8 at addr ffff8880191ef8b0 by task init/1
[ 14.018976]
[ 14.019259] CPU: 0 PID: 1 Comm: init Not tainted 5.1.0-rc3-00029-g1808d65 #3
[ 14.020509] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 14.022028] Call Trace:
[ 14.022471] print_address_description+0x9d/0x26b
[ 14.023295] ? __change_page_attr_set_clr+0xa8/0x4df
[ 14.024161] ? __change_page_attr_set_clr+0xa8/0x4df
[ 14.025031] kasan_report+0x145/0x18a
[ 14.025667] ? __change_page_attr_set_clr+0xa8/0x4df
[ 14.026542] __change_page_attr_set_clr+0xa8/0x4df
[ 14.027433] ? __change_page_attr+0xad0/0xad0
[ 14.028260] ? kasan_unpoison_shadow+0xf/0x2e
[ 14.029062] ? preempt_latency_start+0x22/0x68
[ 14.029962] ? get_page_from_freelist+0xf37/0x1281
[ 14.030796] ? native_flush_tlb_one_user+0x54/0x95
[ 14.031602] ? trace_tlb_flush+0x1f/0x106
[ 14.032352] ? flush_tlb_func_common+0x26a/0x289
[ 14.033322] ? trace_irq_enable_rcuidle+0x21/0xf5
[ 14.034109] __kernel_map_pages+0x148/0x1b1
[ 14.034777] ? set_pages_rw+0x94/0x94
[ 14.035408] ? flush_tlb_mm_range+0x161/0x1ae
[ 14.036134] ? atomic_read+0xe/0x3f
[ 14.036715] ? page_expected_state+0x46/0x81
[ 14.037442] free_unref_page_prepare+0xe1/0x192
[ 14.038201] free_unref_page_list+0xd3/0x319
[ 14.038960] release_pages+0x5d1/0x612
[ 14.039581] ? __put_compound_page+0x91/0x91
[ 14.040346] ? tlb_flush_mmu_tlbonly+0x107/0x1c5
[ 14.041193] ? preempt_latency_start+0x22/0x68
[ 14.041922] ? free_swap_cache+0x51/0xd5
[ 14.042566] tlb_flush_mmu_free+0x31/0xca
[ 14.043254] tlb_finish_mmu+0xf6/0x1b5
[ 14.043883] shift_arg_pages+0x280/0x30b
[ 14.044535] ? __register_binfmt+0x18d/0x18d
[ 14.045259] ? trace_irq_enable_rcuidle+0x21/0xf5
[ 14.046029] ? ___might_sleep+0xac/0x33e
[ 14.046666] setup_arg_pages+0x46a/0x56e
[ 14.047347] ? shift_arg_pages+0x30b/0x30b
[ 14.048208] load_elf_binary+0x888/0x20dd
[ 14.048872] ? _raw_read_unlock+0x14/0x24
[ 14.049532] ? ima_bprm_check+0x18c/0x1c2
[ 14.050199] ? elf_map+0x1e8/0x1e8
[ 14.050756] ? ima_file_mmap+0xf3/0xf3
[ 14.051583] search_binary_handler+0x154/0x511
[ 14.052323] __do_execve_file+0x10b5/0x15e9
[ 14.053004] ? open_exec+0x3a/0x3a
[ 14.053564] ? memcpy+0x34/0x46
[ 14.054095] ? rest_init+0xdd/0xdd
[ 14.054669] kernel_init+0x66/0x10d
[ 14.055262] ? rest_init+0xdd/0xdd
[ 14.055833] ret_from_fork+0x3a/0x50
[ 14.056516]
[ 14.056769] The buggy address belongs to the page:
[ 14.057552] page:ffff88801de82c48 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[ 14.058923] flags: 0x680000000000()
[ 14.059495] raw: 0000680000000000 ffff88801de82c50 ffff88801de82c50 0000000000000000

# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start 73f7e0e993d885606124134bd88c4c0e6b8b45bd 15ade5d2e7775667cf191cf2f94327a4889f8b9d --
git bisect bad 9b748775c7b377ae207813cb9ecdb0153b74ca55 # 17:54 B 0 9 24 0 Merge 'hwmon/hwmon' into devel-hourly-2019040920
git bisect bad a891bf73affea7bf5e4a7ef78b23b1c3f5b29d58 # 18:05 B 0 11 26 0 Merge 'linux-review/Heiner-Kallweit/net-phy-switch-drivers-to-use-dynamic-feature-detection/20190408-065213' into devel-hourly-2019040920
git bisect good 7ccb8fbbe4f0de58aeaac0f783d926a091de2942 # 18:18 G 11 0 11 11 Merge 'brgl-linux/gpio/for-next' into devel-hourly-2019040920
git bisect bad 081419eb685d308d93e12e1ddea3d02bfa52c0a4 # 18:33 B 0 4 19 0 Merge 'csky-linux/linux-next' into devel-hourly-2019040920
git bisect good 95041a63b3167fdc27aa36ef3b54daeeff12bdae # 18:52 G 11 0 11 11 Merge 'linux-review/Ido-Schimmel/mlxsw-Add-support-for-devlink-info-command/20190408-210315' into devel-hourly-2019040920
git bisect good 404993d745381524b38243176df8f1a11dd99d3b # 19:14 G 11 0 11 11 Merge 'linux-review/Simon-Horman/ravb-Avoid-unsupported-internal-delay-mode-for-R-Car-E3-D3/20190408-204324' into devel-hourly-2019040920
git bisect good 8cad949760cbcba41a6981993d0c65b4604a9e18 # 19:31 G 11 0 11 11 Merge 'gfs2/for-next.glock-refcount' into devel-hourly-2019040920
git bisect good 2c0d83617d30e6e747067a08e48a0e2de7404aa2 # 19:43 G 11 0 11 11 Merge 'pinctrl/devel' into devel-hourly-2019040920
git bisect good 14fb0415d4eba1e4f63efc98d4b3f1b8ceea047f # 19:57 G 11 0 11 11 Merge 'linux-review/Kristian-Evensen/qmi_wwan-Add-quirk-for-Quectel-dynamic-config/20190408-073833' into devel-hourly-2019040920
git bisect bad 8e115919d3366790a875d7dfad33bcb7009a957d # 20:10 B 0 1 16 0 Merge 'tip/master' into devel-hourly-2019040920
git bisect bad 9402fa854486829a7792fbb4038b5585473f3b1a # 20:31 B 0 8 23 0 Merge branch 'perf/urgent'
git bisect good 2e8623e9bc0ba4907e94c4d94a1caeac23d1fadb # 20:44 G 11 0 11 11 Merge branch 'linus'
git bisect good 64604d54d3115fee89598bfb6d8d2252f8a2d114 # 20:54 G 11 0 11 11 sched/x86_64: Don't save flags on context switch
git bisect bad b3fa8ed4e48802e6ba0aa5f3283313a27dcbf46f # 21:04 B 0 11 26 0 asm-generic/tlb: Remove CONFIG_HAVE_GENERIC_MMU_GATHER
git bisect good b78180b97dcf667350aac716cd3f32356eaf4984 # 21:20 G 11 0 11 11 arm/tlb: Convert to generic mmu_gather
git bisect good 6137fed0823247e32306bde2b48cac627c24f894 # 21:30 G 11 0 11 11 arch/tlb: Clean up simple architectures
git bisect good 9de7d833e3708213bf99d75c37483e0f773f5e16 # 21:43 G 11 0 11 11 s390/tlb: Convert to generic mmu_gather
git bisect bad 1808d65b55e4489770dd4f76fb0dff5b81eb9b11 # 21:52 B 0 11 26 0 asm-generic/tlb: Remove arch_tlb*_mmu()
# first bad commit: [1808d65b55e4489770dd4f76fb0dff5b81eb9b11] asm-generic/tlb: Remove arch_tlb*_mmu()
git bisect good 9de7d833e3708213bf99d75c37483e0f773f5e16 # 21:53 G 33 0 33 44 s390/tlb: Convert to generic mmu_gather
# extra tests with debug options
git bisect bad 1808d65b55e4489770dd4f76fb0dff5b81eb9b11 # 22:07 B 0 1 16 0 asm-generic/tlb: Remove arch_tlb*_mmu()
# extra tests on HEAD of linux-devel/devel-hourly-2019040920
git bisect bad 73f7e0e993d885606124134bd88c4c0e6b8b45bd # 22:13 B 0 13 31 0 0day head guard for 'devel-hourly-2019040920'
# extra tests on tree/branch tip/core/mm
git bisect bad 6455959819bf2469190ae9f6b4ccebaa9827e884 # 22:37 B 0 4 19 0 ia64/tlb: Eradicate tlb_migrate_finish() callback
# extra tests on tree/branch tip/master
git bisect bad 31437a258fa637d7449385ef2e1b33efc6786397 # 22:54 B 0 11 26 0 Merge branch 'perf/urgent'

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation


Attachments:
(No filename) (9.48 kB)
dmesg-quantal-vm-quantal-217:20190410215134:x86_64-randconfig-s3-04092154:5.1.0-rc3-00029-g1808d65:3.gz (14.09 kB)
dmesg-quantal-vm-quantal-101:20190410215248:x86_64-randconfig-s3-04092154:5.1.0-rc3-00028-g9de7d83:1.gz (14.29 kB)
reproduce-quantal-vm-quantal-217:20190410215134:x86_64-randconfig-s3-04092154:5.1.0-rc3-00029-g1808d65:3 (955.00 B)
config-5.1.0-rc3-00029-g1808d65 (128.42 kB)
Download all attachments

2019-04-11 19:40:36

by Peter Zijlstra

[permalink] [raw]
Subject: Re: 1808d65b55 ("asm-generic/tlb: Remove arch_tlb*_mmu()"): BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr

On Wed, Apr 10, 2019 at 10:55:00PM +0800, kernel test robot wrote:
> Greetings,
>
> 0day kernel testing robot got the below dmesg and the first bad commit is
>
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git core/mm
>
> commit 1808d65b55e4489770dd4f76fb0dff5b81eb9b11
> Author: Peter Zijlstra <[email protected]>
> AuthorDate: Thu Sep 20 10:50:11 2018 +0200
> Commit: Ingo Molnar <[email protected]>
> CommitDate: Wed Apr 3 10:32:58 2019 +0200
>
> asm-generic/tlb: Remove arch_tlb*_mmu()
>
> Now that all architectures are converted to the generic code, remove
> the arch hooks.
>
> No change in behavior intended.
>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> Acked-by: Will Deacon <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Andy Lutomirski <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Rik van Riel <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Ingo Molnar <[email protected]>
>
> 9de7d833e3 s390/tlb: Convert to generic mmu_gather
> 1808d65b55 asm-generic/tlb: Remove arch_tlb*_mmu()
> 6455959819 ia64/tlb: Eradicate tlb_migrate_finish() callback
> 31437a258f Merge branch 'perf/urgent'
> +------------------------------------------------------------+------------+------------+------------+------------+
> | | 9de7d833e3 | 1808d65b55 | 6455959819 | 31437a258f |
> +------------------------------------------------------------+------------+------------+------------+------------+
> | boot_successes | 0 | 0 | 0 | 0 |
> | boot_failures | 44 | 11 | 11 | 11 |
> | BUG:KASAN:stack-out-of-bounds_in__unwind_start | 44 | | | |
> | BUG:KASAN:stack-out-of-bounds_in__change_page_attr_set_clr | 0 | 11 | 11 | 11 |
> +------------------------------------------------------------+------------+------------+------------+------------+
>
> [ 13.977997] rodata_test: all tests were successful
> [ 13.979792] x86/mm: Checking user space page tables
> [ 14.011779] x86/mm: Checked W+X mappings: passed, no W+X pages found.
> [ 14.013022] Run /init as init process
> [ 14.015154] ==================================================================
> [ 14.016489] BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr+0xa8/0x4df
> [ 14.017853] Read of size 8 at addr ffff8880191ef8b0 by task init/1
> [ 14.018976]
> [ 14.019259] CPU: 0 PID: 1 Comm: init Not tainted 5.1.0-rc3-00029-g1808d65 #3
> [ 14.020509] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [ 14.022028] Call Trace:
> [ 14.022471] print_address_description+0x9d/0x26b
> [ 14.023295] ? __change_page_attr_set_clr+0xa8/0x4df
> [ 14.024161] ? __change_page_attr_set_clr+0xa8/0x4df
> [ 14.025031] kasan_report+0x145/0x18a
> [ 14.025667] ? __change_page_attr_set_clr+0xa8/0x4df
> [ 14.026542] __change_page_attr_set_clr+0xa8/0x4df
> [ 14.027433] ? __change_page_attr+0xad0/0xad0
> [ 14.028260] ? kasan_unpoison_shadow+0xf/0x2e
> [ 14.029062] ? preempt_latency_start+0x22/0x68
> [ 14.029962] ? get_page_from_freelist+0xf37/0x1281
> [ 14.030796] ? native_flush_tlb_one_user+0x54/0x95
> [ 14.031602] ? trace_tlb_flush+0x1f/0x106
> [ 14.032352] ? flush_tlb_func_common+0x26a/0x289
> [ 14.033322] ? trace_irq_enable_rcuidle+0x21/0xf5
> [ 14.034109] __kernel_map_pages+0x148/0x1b1
> [ 14.034777] ? set_pages_rw+0x94/0x94
> [ 14.035408] ? flush_tlb_mm_range+0x161/0x1ae
> [ 14.036134] ? atomic_read+0xe/0x3f
> [ 14.036715] ? page_expected_state+0x46/0x81
> [ 14.037442] free_unref_page_prepare+0xe1/0x192
> [ 14.038201] free_unref_page_list+0xd3/0x319
> [ 14.038960] release_pages+0x5d1/0x612
> [ 14.039581] ? __put_compound_page+0x91/0x91
> [ 14.040346] ? tlb_flush_mmu_tlbonly+0x107/0x1c5
> [ 14.041193] ? preempt_latency_start+0x22/0x68
> [ 14.041922] ? free_swap_cache+0x51/0xd5
> [ 14.042566] tlb_flush_mmu_free+0x31/0xca
> [ 14.043254] tlb_finish_mmu+0xf6/0x1b5
> [ 14.043883] shift_arg_pages+0x280/0x30b
> [ 14.044535] ? __register_binfmt+0x18d/0x18d
> [ 14.045259] ? trace_irq_enable_rcuidle+0x21/0xf5
> [ 14.046029] ? ___might_sleep+0xac/0x33e
> [ 14.046666] setup_arg_pages+0x46a/0x56e
> [ 14.047347] ? shift_arg_pages+0x30b/0x30b
> [ 14.048208] load_elf_binary+0x888/0x20dd
> [ 14.048872] ? _raw_read_unlock+0x14/0x24
> [ 14.049532] ? ima_bprm_check+0x18c/0x1c2
> [ 14.050199] ? elf_map+0x1e8/0x1e8
> [ 14.050756] ? ima_file_mmap+0xf3/0xf3
> [ 14.051583] search_binary_handler+0x154/0x511
> [ 14.052323] __do_execve_file+0x10b5/0x15e9
> [ 14.053004] ? open_exec+0x3a/0x3a
> [ 14.053564] ? memcpy+0x34/0x46
> [ 14.054095] ? rest_init+0xdd/0xdd
> [ 14.054669] kernel_init+0x66/0x10d
> [ 14.055262] ? rest_init+0xdd/0xdd
> [ 14.055833] ret_from_fork+0x3a/0x50
> [ 14.056516]
> [ 14.056769] The buggy address belongs to the page:
> [ 14.057552] page:ffff88801de82c48 count:0 mapcount:0 mapping:0000000000000000 index:0x0
> [ 14.058923] flags: 0x680000000000()
> [ 14.059495] raw: 0000680000000000 ffff88801de82c50 ffff88801de82c50 0000000000000000

I think this bisect is bad. If you look at your own logs this patch
merely changes the failure, but doesn't make it go away.

Before this patch (in fact, before tip/core/mm entirely) the errror
reads like the below, which suggests there is memory corruption
somewhere, and the fingered patch just makes it trigger differently.

It would be very good to find the source of this corruption, but I'm
fairly certain it is not here.

[ 10.273617] rodata_test: all tests were successful
[ 10.275015] x86/mm: Checking user space page tables
[ 10.295444] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[ 10.296334] Run /init as init process
[ 10.301465] ==================================================================
[ 10.302460] BUG: KASAN: stack-out-of-bounds in __unwind_start+0x7e/0x4fe
[ 10.303355] Write of size 88 at addr ffff8880191efa28 by task init/1
[ 10.304241]
[ 10.304455] CPU: 0 PID: 1 Comm: init Not tainted 5.1.0-rc4-00288-ga131d61b43e0-dirty #10
[ 10.305542] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 10.306641] Call Trace:
[ 10.306990] print_address_description+0x9d/0x26b
[ 10.307654] ? __unwind_start+0x7e/0x4fe
[ 10.308222] ? __unwind_start+0x7e/0x4fe
[ 10.308755] __kasan_report+0x145/0x18a
[ 10.309266] ? __unwind_start+0x7e/0x4fe
[ 10.309823] kasan_report+0xe/0x12
[ 10.310273] memset+0x1f/0x31
[ 10.310703] __unwind_start+0x7e/0x4fe
[ 10.311223] ? unwind_next_frame+0x10a9/0x10a9
[ 10.311839] ? native_flush_tlb_one_user+0x54/0x95
[ 10.312504] ? kasan_unpoison_shadow+0xf/0x2e
[ 10.313090] __save_stack_trace+0x65/0xe7
[ 10.313667] ? trace_irq_enable_rcuidle+0x21/0xf5
[ 10.314284] ? tracer_hardirqs_on+0xb/0x1b
[ 10.314830] ? trace_hardirqs_on+0x2c/0x37
[ 10.315369] save_stack+0x32/0xa3
[ 10.315842] ? __put_compound_page+0x91/0x91
[ 10.316458] ? preempt_latency_start+0x22/0x68
[ 10.317052] ? free_swap_cache+0x51/0xd5
[ 10.317586] ? tlb_flush_mmu_free+0x31/0xca
[ 10.318140] ? arch_tlb_finish_mmu+0x8c/0x112
[ 10.318759] ? tlb_finish_mmu+0xc7/0xd6
[ 10.319298] ? unmap_region+0x275/0x2b9
[ 10.319835] ? special_mapping_fault+0x26d/0x26d
[ 10.320448] ? trace_irq_disable_rcuidle+0x21/0xf5
[ 10.321085] __kasan_slab_free+0xd3/0xf4
[ 10.321623] ? remove_vma+0xdf/0xe7
[ 10.322105] kmem_cache_free+0x4e/0xca
[ 10.322600] remove_vma+0xdf/0xe7
[ 10.323038] __do_munmap+0x72c/0x75e
[ 10.323514] __vm_munmap+0xd0/0x135
[ 10.323980] ? __x64_sys_brk+0x40e/0x40e
[ 10.324496] ? trace_irq_disable_rcuidle+0x21/0xf5
[ 10.325160] __x64_sys_munmap+0x6a/0x6f
[ 10.325670] do_syscall_64+0x3f0/0x462
[ 10.326162] ? syscall_return_slowpath+0x154/0x154
[ 10.326810] ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
[ 10.327485] ? trace_irq_disable_rcuidle+0x21/0xf5
[ 10.328153] ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
[ 10.328873] ? trace_hardirqs_off_caller+0x3e/0x40
[ 10.329505] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 10.330162] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 10.330830] RIP: 0033:0x7efc4d707457
[ 10.331306] Code: f0 ff ff 73 01 c3 48 8d 0d 5a be 20 00 31 d2 48 29 c2 89 11 48 83 c8 ff eb eb 90 90 90 90 90 90 90 90 90 b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8d 0d 2d be 20 00 31 d2 48 29 c2 89
[ 10.333711] RSP: 002b:00007fff973da398 EFLAGS: 00000203 ORIG_RAX: 000000000000000b
[ 10.334728] RAX: ffffffffffffffda RBX: 00007efc4d9132c8 RCX: 00007efc4d707457
[ 10.335670] RDX: 0000000000000000 RSI: 0000000000001d67 RDI: 00007efc4d90d000
[ 10.336596] RBP: 00007fff973da4f0 R08: 0000000000000007 R09: 00000000ffffffff
[ 10.337512] R10: 0000000000000000 R11: 0000000000000203 R12: 000000073dd74283
[ 10.338457] R13: 000000073db1ab4f R14: 00007efc4d909700 R15: 00007efc4d9132c8
[ 10.339373]
[ 10.339585] The buggy address belongs to the page:
[ 10.340224] page:ffff88801de82c48 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[ 10.341338] flags: 0x680000000000()
[ 10.341832] raw: 0000680000000000 ffff88801de82c50 ffff88801de82c50 0000000000000000
[ 10.342846] raw: 0000000000000000 0000000000000000 00000000ffffffff
[ 10.343679] page dumped because: kasan: bad access detected
[ 10.344415]
[ 10.344629] Memory state around the buggy address:
[ 10.345254] ffff8880191ef900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 10.346245] ffff8880191ef980: 00 00 f1 f1 f1 f1 00 f2 f2 f2 00 00 00 00 00 00
[ 10.347217] >ffff8880191efa00: 00 00 00 00 00 f2 f2 f2 00 00 00 00 00 00 00 00
[ 10.348152] ^
[ 10.348755] ffff8880191efa80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 10.349698] ffff8880191efb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 10.350650] ==================================================================

2019-04-11 19:55:24

by Peter Zijlstra

[permalink] [raw]
Subject: Re: 1808d65b55 ("asm-generic/tlb: Remove arch_tlb*_mmu()"): BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr

On Thu, Apr 11, 2019 at 09:39:06PM +0200, Peter Zijlstra wrote:
> I think this bisect is bad. If you look at your own logs this patch
> merely changes the failure, but doesn't make it go away.
>
> Before this patch (in fact, before tip/core/mm entirely) the errror
> reads like the below, which suggests there is memory corruption
> somewhere, and the fingered patch just makes it trigger differently.
>
> It would be very good to find the source of this corruption, but I'm
> fairly certain it is not here.

I went back to v4.20 to try and find a time when the below error did not
occur, but even that reliably triggers the warning.

> [ 10.273617] rodata_test: all tests were successful
> [ 10.275015] x86/mm: Checking user space page tables
> [ 10.295444] x86/mm: Checked W+X mappings: passed, no W+X pages found.
> [ 10.296334] Run /init as init process
> [ 10.301465] ==================================================================
> [ 10.302460] BUG: KASAN: stack-out-of-bounds in __unwind_start+0x7e/0x4fe
> [ 10.303355] Write of size 88 at addr ffff8880191efa28 by task init/1
> [ 10.304241]
> [ 10.304455] CPU: 0 PID: 1 Comm: init Not tainted 5.1.0-rc4-00288-ga131d61b43e0-dirty #10
> [ 10.305542] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [ 10.306641] Call Trace:
> [ 10.306990] print_address_description+0x9d/0x26b
> [ 10.307654] ? __unwind_start+0x7e/0x4fe
> [ 10.308222] ? __unwind_start+0x7e/0x4fe
> [ 10.308755] __kasan_report+0x145/0x18a
> [ 10.309266] ? __unwind_start+0x7e/0x4fe
> [ 10.309823] kasan_report+0xe/0x12
> [ 10.310273] memset+0x1f/0x31
> [ 10.310703] __unwind_start+0x7e/0x4fe
> [ 10.311223] ? unwind_next_frame+0x10a9/0x10a9
> [ 10.311839] ? native_flush_tlb_one_user+0x54/0x95
> [ 10.312504] ? kasan_unpoison_shadow+0xf/0x2e
> [ 10.313090] __save_stack_trace+0x65/0xe7
> [ 10.313667] ? trace_irq_enable_rcuidle+0x21/0xf5
> [ 10.314284] ? tracer_hardirqs_on+0xb/0x1b
> [ 10.314830] ? trace_hardirqs_on+0x2c/0x37
> [ 10.315369] save_stack+0x32/0xa3
> [ 10.315842] ? __put_compound_page+0x91/0x91
> [ 10.316458] ? preempt_latency_start+0x22/0x68
> [ 10.317052] ? free_swap_cache+0x51/0xd5
> [ 10.317586] ? tlb_flush_mmu_free+0x31/0xca
> [ 10.318140] ? arch_tlb_finish_mmu+0x8c/0x112
> [ 10.318759] ? tlb_finish_mmu+0xc7/0xd6
> [ 10.319298] ? unmap_region+0x275/0x2b9
> [ 10.319835] ? special_mapping_fault+0x26d/0x26d
> [ 10.320448] ? trace_irq_disable_rcuidle+0x21/0xf5
> [ 10.321085] __kasan_slab_free+0xd3/0xf4
> [ 10.321623] ? remove_vma+0xdf/0xe7
> [ 10.322105] kmem_cache_free+0x4e/0xca
> [ 10.322600] remove_vma+0xdf/0xe7
> [ 10.323038] __do_munmap+0x72c/0x75e
> [ 10.323514] __vm_munmap+0xd0/0x135
> [ 10.323980] ? __x64_sys_brk+0x40e/0x40e
> [ 10.324496] ? trace_irq_disable_rcuidle+0x21/0xf5
> [ 10.325160] __x64_sys_munmap+0x6a/0x6f
> [ 10.325670] do_syscall_64+0x3f0/0x462
> [ 10.326162] ? syscall_return_slowpath+0x154/0x154
> [ 10.326810] ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
> [ 10.327485] ? trace_irq_disable_rcuidle+0x21/0xf5
> [ 10.328153] ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
> [ 10.328873] ? trace_hardirqs_off_caller+0x3e/0x40
> [ 10.329505] ? trace_hardirqs_off_thunk+0x1a/0x1c
> [ 10.330162] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 10.330830] RIP: 0033:0x7efc4d707457
> [ 10.331306] Code: f0 ff ff 73 01 c3 48 8d 0d 5a be 20 00 31 d2 48 29 c2 89 11 48 83 c8 ff eb eb 90 90 90 90 90 90 90 90 90 b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8d 0d 2d be 20 00 31 d2 48 29 c2 89
> [ 10.333711] RSP: 002b:00007fff973da398 EFLAGS: 00000203 ORIG_RAX: 000000000000000b
> [ 10.334728] RAX: ffffffffffffffda RBX: 00007efc4d9132c8 RCX: 00007efc4d707457
> [ 10.335670] RDX: 0000000000000000 RSI: 0000000000001d67 RDI: 00007efc4d90d000
> [ 10.336596] RBP: 00007fff973da4f0 R08: 0000000000000007 R09: 00000000ffffffff
> [ 10.337512] R10: 0000000000000000 R11: 0000000000000203 R12: 000000073dd74283
> [ 10.338457] R13: 000000073db1ab4f R14: 00007efc4d909700 R15: 00007efc4d9132c8
> [ 10.339373]
> [ 10.339585] The buggy address belongs to the page:
> [ 10.340224] page:ffff88801de82c48 count:0 mapcount:0 mapping:0000000000000000 index:0x0
> [ 10.341338] flags: 0x680000000000()
> [ 10.341832] raw: 0000680000000000 ffff88801de82c50 ffff88801de82c50 0000000000000000
> [ 10.342846] raw: 0000000000000000 0000000000000000 00000000ffffffff
> [ 10.343679] page dumped because: kasan: bad access detected
> [ 10.344415]
> [ 10.344629] Memory state around the buggy address:
> [ 10.345254] ffff8880191ef900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 10.346245] ffff8880191ef980: 00 00 f1 f1 f1 f1 00 f2 f2 f2 00 00 00 00 00 00
> [ 10.347217] >ffff8880191efa00: 00 00 00 00 00 f2 f2 f2 00 00 00 00 00 00 00 00
> [ 10.348152] ^
> [ 10.348755] ffff8880191efa80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 10.349698] ffff8880191efb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 10.350650] ==================================================================

2019-04-11 21:14:56

by Peter Zijlstra

[permalink] [raw]
Subject: Re: 1808d65b55 ("asm-generic/tlb: Remove arch_tlb*_mmu()"): BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr

On Thu, Apr 11, 2019 at 09:54:24PM +0200, Peter Zijlstra wrote:
> On Thu, Apr 11, 2019 at 09:39:06PM +0200, Peter Zijlstra wrote:
> > I think this bisect is bad. If you look at your own logs this patch
> > merely changes the failure, but doesn't make it go away.
> >
> > Before this patch (in fact, before tip/core/mm entirely) the errror
> > reads like the below, which suggests there is memory corruption
> > somewhere, and the fingered patch just makes it trigger differently.
> >
> > It would be very good to find the source of this corruption, but I'm
> > fairly certain it is not here.
>
> I went back to v4.20 to try and find a time when the below error did not
> occur, but even that reliably triggers the warning.

So I also tested v4.19 and found that that was good, which made me
bisect v4.19..v4.20

# bad: [8fe28cb58bcb235034b64cbbb7550a8a43fd88be] Linux 4.20
# good: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
git bisect start 'v4.20' 'v4.19'
# bad: [ec9c166434595382be3babf266febf876327774d] Merge tag 'mips_fixes_4.20_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
git bisect bad ec9c166434595382be3babf266febf876327774d
# bad: [50b825d7e87f4cff7070df6eb26390152bb29537] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect bad 50b825d7e87f4cff7070df6eb26390152bb29537
# good: [99e9acd85ccbdc8f5785f9e961d4956e96bd6aa5] Merge tag 'mlx5-updates-2018-10-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
git bisect good 99e9acd85ccbdc8f5785f9e961d4956e96bd6aa5
# good: [c403993a41d50db1e7d9bc2d43c3c8498162312f] Merge tag 'for-linus-4.20' of https://github.com/cminyard/linux-ipmi
git bisect good c403993a41d50db1e7d9bc2d43c3c8498162312f
# good: [c05f3642f4304dd081876e77a68555b6aba4483f] Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good c05f3642f4304dd081876e77a68555b6aba4483f
# bad: [44786880df196a4200c178945c4d41675faf9fb7] Merge branch 'parisc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
git bisect bad 44786880df196a4200c178945c4d41675faf9fb7
# bad: [99792e0cea1ed733cdc8d0758677981e0cbebfed] Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 99792e0cea1ed733cdc8d0758677981e0cbebfed
# good: [fec98069fb72fb656304a3e52265e0c2fc9adf87] Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good fec98069fb72fb656304a3e52265e0c2fc9adf87
# bad: [a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542] x86/mm: Page size aware flush_tlb_mm_range()
git bisect bad a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542
# good: [a7295fd53c39ce781a9792c9dd2c8747bf274160] x86/mm/cpa: Use flush_tlb_kernel_range()
git bisect good a7295fd53c39ce781a9792c9dd2c8747bf274160
# good: [9cf38d5559e813cccdba8b44c82cc46ba48d0896] kexec: Allocate decrypted control pages for kdump if SME is enabled
git bisect good 9cf38d5559e813cccdba8b44c82cc46ba48d0896
# good: [5b12904065798fee8b153a506ac7b72d5ebbe26c] x86/mm/doc: Clean up the x86-64 virtual memory layout descriptions
git bisect good 5b12904065798fee8b153a506ac7b72d5ebbe26c
# good: [cf089611f4c446285046fcd426d90c18f37d2905] proc/vmcore: Fix i386 build error of missing copy_oldmem_page_encrypted()
git bisect good cf089611f4c446285046fcd426d90c18f37d2905
# good: [a5b966ae42a70b194b03eaa5eaea70d8b3790c40] Merge branch 'tlb/asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into x86/mm
git bisect good a5b966ae42a70b194b03eaa5eaea70d8b3790c40
# first bad commit: [a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542] x86/mm: Page size aware flush_tlb_mm_range()

And 'funnily' the bad patch is one of mine too :/

I'll go have a look at that tomorrow, because currrently I'm way past
tired.

> > [ 10.273617] rodata_test: all tests were successful
> > [ 10.275015] x86/mm: Checking user space page tables
> > [ 10.295444] x86/mm: Checked W+X mappings: passed, no W+X pages found.
> > [ 10.296334] Run /init as init process
> > [ 10.301465] ==================================================================
> > [ 10.302460] BUG: KASAN: stack-out-of-bounds in __unwind_start+0x7e/0x4fe
> > [ 10.303355] Write of size 88 at addr ffff8880191efa28 by task init/1
> > [ 10.304241]
> > [ 10.304455] CPU: 0 PID: 1 Comm: init Not tainted 5.1.0-rc4-00288-ga131d61b43e0-dirty #10
> > [ 10.305542] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> > [ 10.306641] Call Trace:
> > [ 10.306990] print_address_description+0x9d/0x26b
> > [ 10.307654] ? __unwind_start+0x7e/0x4fe
> > [ 10.308222] ? __unwind_start+0x7e/0x4fe
> > [ 10.308755] __kasan_report+0x145/0x18a
> > [ 10.309266] ? __unwind_start+0x7e/0x4fe
> > [ 10.309823] kasan_report+0xe/0x12
> > [ 10.310273] memset+0x1f/0x31
> > [ 10.310703] __unwind_start+0x7e/0x4fe
> > [ 10.311223] ? unwind_next_frame+0x10a9/0x10a9
> > [ 10.311839] ? native_flush_tlb_one_user+0x54/0x95
> > [ 10.312504] ? kasan_unpoison_shadow+0xf/0x2e
> > [ 10.313090] __save_stack_trace+0x65/0xe7
> > [ 10.313667] ? trace_irq_enable_rcuidle+0x21/0xf5
> > [ 10.314284] ? tracer_hardirqs_on+0xb/0x1b
> > [ 10.314830] ? trace_hardirqs_on+0x2c/0x37
> > [ 10.315369] save_stack+0x32/0xa3
> > [ 10.315842] ? __put_compound_page+0x91/0x91
> > [ 10.316458] ? preempt_latency_start+0x22/0x68
> > [ 10.317052] ? free_swap_cache+0x51/0xd5
> > [ 10.317586] ? tlb_flush_mmu_free+0x31/0xca
> > [ 10.318140] ? arch_tlb_finish_mmu+0x8c/0x112
> > [ 10.318759] ? tlb_finish_mmu+0xc7/0xd6
> > [ 10.319298] ? unmap_region+0x275/0x2b9
> > [ 10.319835] ? special_mapping_fault+0x26d/0x26d
> > [ 10.320448] ? trace_irq_disable_rcuidle+0x21/0xf5
> > [ 10.321085] __kasan_slab_free+0xd3/0xf4
> > [ 10.321623] ? remove_vma+0xdf/0xe7
> > [ 10.322105] kmem_cache_free+0x4e/0xca
> > [ 10.322600] remove_vma+0xdf/0xe7
> > [ 10.323038] __do_munmap+0x72c/0x75e
> > [ 10.323514] __vm_munmap+0xd0/0x135
> > [ 10.323980] ? __x64_sys_brk+0x40e/0x40e
> > [ 10.324496] ? trace_irq_disable_rcuidle+0x21/0xf5
> > [ 10.325160] __x64_sys_munmap+0x6a/0x6f
> > [ 10.325670] do_syscall_64+0x3f0/0x462
> > [ 10.326162] ? syscall_return_slowpath+0x154/0x154
> > [ 10.326810] ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
> > [ 10.327485] ? trace_irq_disable_rcuidle+0x21/0xf5
> > [ 10.328153] ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
> > [ 10.328873] ? trace_hardirqs_off_caller+0x3e/0x40
> > [ 10.329505] ? trace_hardirqs_off_thunk+0x1a/0x1c
> > [ 10.330162] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > [ 10.330830] RIP: 0033:0x7efc4d707457
> > [ 10.331306] Code: f0 ff ff 73 01 c3 48 8d 0d 5a be 20 00 31 d2 48 29 c2 89 11 48 83 c8 ff eb eb 90 90 90 90 90 90 90 90 90 b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8d 0d 2d be 20 00 31 d2 48 29 c2 89
> > [ 10.333711] RSP: 002b:00007fff973da398 EFLAGS: 00000203 ORIG_RAX: 000000000000000b
> > [ 10.334728] RAX: ffffffffffffffda RBX: 00007efc4d9132c8 RCX: 00007efc4d707457
> > [ 10.335670] RDX: 0000000000000000 RSI: 0000000000001d67 RDI: 00007efc4d90d000
> > [ 10.336596] RBP: 00007fff973da4f0 R08: 0000000000000007 R09: 00000000ffffffff
> > [ 10.337512] R10: 0000000000000000 R11: 0000000000000203 R12: 000000073dd74283
> > [ 10.338457] R13: 000000073db1ab4f R14: 00007efc4d909700 R15: 00007efc4d9132c8
> > [ 10.339373]
> > [ 10.339585] The buggy address belongs to the page:
> > [ 10.340224] page:ffff88801de82c48 count:0 mapcount:0 mapping:0000000000000000 index:0x0
> > [ 10.341338] flags: 0x680000000000()
> > [ 10.341832] raw: 0000680000000000 ffff88801de82c50 ffff88801de82c50 0000000000000000
> > [ 10.342846] raw: 0000000000000000 0000000000000000 00000000ffffffff
> > [ 10.343679] page dumped because: kasan: bad access detected
> > [ 10.344415]
> > [ 10.344629] Memory state around the buggy address:
> > [ 10.345254] ffff8880191ef900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [ 10.346245] ffff8880191ef980: 00 00 f1 f1 f1 f1 00 f2 f2 f2 00 00 00 00 00 00
> > [ 10.347217] >ffff8880191efa00: 00 00 00 00 00 f2 f2 f2 00 00 00 00 00 00 00 00
> > [ 10.348152] ^
> > [ 10.348755] ffff8880191efa80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [ 10.349698] ffff8880191efb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [ 10.350650] ==================================================================

2019-04-12 10:57:44

by Peter Zijlstra

[permalink] [raw]
Subject: Re: 1808d65b55 ("asm-generic/tlb: Remove arch_tlb*_mmu()"): BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr

On Thu, Apr 11, 2019 at 11:13:48PM +0200, Peter Zijlstra wrote:
> On Thu, Apr 11, 2019 at 09:54:24PM +0200, Peter Zijlstra wrote:
> > On Thu, Apr 11, 2019 at 09:39:06PM +0200, Peter Zijlstra wrote:
> > > I think this bisect is bad. If you look at your own logs this patch
> > > merely changes the failure, but doesn't make it go away.
> > >
> > > Before this patch (in fact, before tip/core/mm entirely) the errror
> > > reads like the below, which suggests there is memory corruption
> > > somewhere, and the fingered patch just makes it trigger differently.
> > >
> > > It would be very good to find the source of this corruption, but I'm
> > > fairly certain it is not here.
> >
> > I went back to v4.20 to try and find a time when the below error did not
> > occur, but even that reliably triggers the warning.
>
> So I also tested v4.19 and found that that was good, which made me
> bisect v4.19..v4.20
>
> # bad: [8fe28cb58bcb235034b64cbbb7550a8a43fd88be] Linux 4.20
> # good: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
> git bisect start 'v4.20' 'v4.19'
> # bad: [ec9c166434595382be3babf266febf876327774d] Merge tag 'mips_fixes_4.20_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
> git bisect bad ec9c166434595382be3babf266febf876327774d
> # bad: [50b825d7e87f4cff7070df6eb26390152bb29537] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
> git bisect bad 50b825d7e87f4cff7070df6eb26390152bb29537
> # good: [99e9acd85ccbdc8f5785f9e961d4956e96bd6aa5] Merge tag 'mlx5-updates-2018-10-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
> git bisect good 99e9acd85ccbdc8f5785f9e961d4956e96bd6aa5
> # good: [c403993a41d50db1e7d9bc2d43c3c8498162312f] Merge tag 'for-linus-4.20' of https://github.com/cminyard/linux-ipmi
> git bisect good c403993a41d50db1e7d9bc2d43c3c8498162312f
> # good: [c05f3642f4304dd081876e77a68555b6aba4483f] Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect good c05f3642f4304dd081876e77a68555b6aba4483f
> # bad: [44786880df196a4200c178945c4d41675faf9fb7] Merge branch 'parisc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
> git bisect bad 44786880df196a4200c178945c4d41675faf9fb7
> # bad: [99792e0cea1ed733cdc8d0758677981e0cbebfed] Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect bad 99792e0cea1ed733cdc8d0758677981e0cbebfed
> # good: [fec98069fb72fb656304a3e52265e0c2fc9adf87] Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect good fec98069fb72fb656304a3e52265e0c2fc9adf87
> # bad: [a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542] x86/mm: Page size aware flush_tlb_mm_range()
> git bisect bad a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542
> # good: [a7295fd53c39ce781a9792c9dd2c8747bf274160] x86/mm/cpa: Use flush_tlb_kernel_range()
> git bisect good a7295fd53c39ce781a9792c9dd2c8747bf274160
> # good: [9cf38d5559e813cccdba8b44c82cc46ba48d0896] kexec: Allocate decrypted control pages for kdump if SME is enabled
> git bisect good 9cf38d5559e813cccdba8b44c82cc46ba48d0896
> # good: [5b12904065798fee8b153a506ac7b72d5ebbe26c] x86/mm/doc: Clean up the x86-64 virtual memory layout descriptions
> git bisect good 5b12904065798fee8b153a506ac7b72d5ebbe26c
> # good: [cf089611f4c446285046fcd426d90c18f37d2905] proc/vmcore: Fix i386 build error of missing copy_oldmem_page_encrypted()
> git bisect good cf089611f4c446285046fcd426d90c18f37d2905
> # good: [a5b966ae42a70b194b03eaa5eaea70d8b3790c40] Merge branch 'tlb/asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into x86/mm
> git bisect good a5b966ae42a70b194b03eaa5eaea70d8b3790c40
> # first bad commit: [a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542] x86/mm: Page size aware flush_tlb_mm_range()
>
> And 'funnily' the bad patch is one of mine too :/
>
> I'll go have a look at that tomorrow, because currrently I'm way past
> tired.

OK, so the below patchlet makes it all good. It turns out that the
provided config has:

CONFIG_X86_L1_CACHE_SHIFT=7

which then, for some obscure raisin, results in flush_tlb_mm_range()
compiling to use 320 bytes of stack:

sub $0x140,%rsp

Where a 'defconfig' build results in:

sub $0x58,%rsp

The thing that pushes it over the edge in the above fingered patch is
the addition of a field to struct flush_tlb_info, which grows if from 32
to 36 bytes.

So my proposal is to basically revert that, unless we can come up with
something that GCC can't screw up.

---
arch/x86/mm/tlb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index bc4bc7b2f075..487b8474c01c 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -728,7 +728,7 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
{
int cpu;

- struct flush_tlb_info info __aligned(SMP_CACHE_BYTES) = {
+ struct flush_tlb_info info = {
.mm = mm,
.stride_shift = stride_shift,
.freed_tables = freed_tables,


> > > [ 10.273617] rodata_test: all tests were successful
> > > [ 10.275015] x86/mm: Checking user space page tables
> > > [ 10.295444] x86/mm: Checked W+X mappings: passed, no W+X pages found.
> > > [ 10.296334] Run /init as init process
> > > [ 10.301465] ==================================================================
> > > [ 10.302460] BUG: KASAN: stack-out-of-bounds in __unwind_start+0x7e/0x4fe
> > > [ 10.303355] Write of size 88 at addr ffff8880191efa28 by task init/1
> > > [ 10.304241]
> > > [ 10.304455] CPU: 0 PID: 1 Comm: init Not tainted 5.1.0-rc4-00288-ga131d61b43e0-dirty #10
> > > [ 10.305542] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> > > [ 10.306641] Call Trace:
> > > [ 10.306990] print_address_description+0x9d/0x26b
> > > [ 10.307654] ? __unwind_start+0x7e/0x4fe
> > > [ 10.308222] ? __unwind_start+0x7e/0x4fe
> > > [ 10.308755] __kasan_report+0x145/0x18a
> > > [ 10.309266] ? __unwind_start+0x7e/0x4fe
> > > [ 10.309823] kasan_report+0xe/0x12
> > > [ 10.310273] memset+0x1f/0x31
> > > [ 10.310703] __unwind_start+0x7e/0x4fe
> > > [ 10.311223] ? unwind_next_frame+0x10a9/0x10a9
> > > [ 10.311839] ? native_flush_tlb_one_user+0x54/0x95
> > > [ 10.312504] ? kasan_unpoison_shadow+0xf/0x2e
> > > [ 10.313090] __save_stack_trace+0x65/0xe7
> > > [ 10.313667] ? trace_irq_enable_rcuidle+0x21/0xf5
> > > [ 10.314284] ? tracer_hardirqs_on+0xb/0x1b
> > > [ 10.314830] ? trace_hardirqs_on+0x2c/0x37
> > > [ 10.315369] save_stack+0x32/0xa3
> > > [ 10.315842] ? __put_compound_page+0x91/0x91
> > > [ 10.316458] ? preempt_latency_start+0x22/0x68
> > > [ 10.317052] ? free_swap_cache+0x51/0xd5
> > > [ 10.317586] ? tlb_flush_mmu_free+0x31/0xca
> > > [ 10.318140] ? arch_tlb_finish_mmu+0x8c/0x112
> > > [ 10.318759] ? tlb_finish_mmu+0xc7/0xd6
> > > [ 10.319298] ? unmap_region+0x275/0x2b9
> > > [ 10.319835] ? special_mapping_fault+0x26d/0x26d
> > > [ 10.320448] ? trace_irq_disable_rcuidle+0x21/0xf5
> > > [ 10.321085] __kasan_slab_free+0xd3/0xf4
> > > [ 10.321623] ? remove_vma+0xdf/0xe7
> > > [ 10.322105] kmem_cache_free+0x4e/0xca
> > > [ 10.322600] remove_vma+0xdf/0xe7
> > > [ 10.323038] __do_munmap+0x72c/0x75e
> > > [ 10.323514] __vm_munmap+0xd0/0x135
> > > [ 10.323980] ? __x64_sys_brk+0x40e/0x40e
> > > [ 10.324496] ? trace_irq_disable_rcuidle+0x21/0xf5
> > > [ 10.325160] __x64_sys_munmap+0x6a/0x6f
> > > [ 10.325670] do_syscall_64+0x3f0/0x462
> > > [ 10.326162] ? syscall_return_slowpath+0x154/0x154
> > > [ 10.326810] ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
> > > [ 10.327485] ? trace_irq_disable_rcuidle+0x21/0xf5
> > > [ 10.328153] ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
> > > [ 10.328873] ? trace_hardirqs_off_caller+0x3e/0x40
> > > [ 10.329505] ? trace_hardirqs_off_thunk+0x1a/0x1c
> > > [ 10.330162] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > > [ 10.330830] RIP: 0033:0x7efc4d707457
> > > [ 10.331306] Code: f0 ff ff 73 01 c3 48 8d 0d 5a be 20 00 31 d2 48 29 c2 89 11 48 83 c8 ff eb eb 90 90 90 90 90 90 90 90 90 b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8d 0d 2d be 20 00 31 d2 48 29 c2 89
> > > [ 10.333711] RSP: 002b:00007fff973da398 EFLAGS: 00000203 ORIG_RAX: 000000000000000b
> > > [ 10.334728] RAX: ffffffffffffffda RBX: 00007efc4d9132c8 RCX: 00007efc4d707457
> > > [ 10.335670] RDX: 0000000000000000 RSI: 0000000000001d67 RDI: 00007efc4d90d000
> > > [ 10.336596] RBP: 00007fff973da4f0 R08: 0000000000000007 R09: 00000000ffffffff
> > > [ 10.337512] R10: 0000000000000000 R11: 0000000000000203 R12: 000000073dd74283
> > > [ 10.338457] R13: 000000073db1ab4f R14: 00007efc4d909700 R15: 00007efc4d9132c8
> > > [ 10.339373]
> > > [ 10.339585] The buggy address belongs to the page:
> > > [ 10.340224] page:ffff88801de82c48 count:0 mapcount:0 mapping:0000000000000000 index:0x0
> > > [ 10.341338] flags: 0x680000000000()
> > > [ 10.341832] raw: 0000680000000000 ffff88801de82c50 ffff88801de82c50 0000000000000000
> > > [ 10.342846] raw: 0000000000000000 0000000000000000 00000000ffffffff
> > > [ 10.343679] page dumped because: kasan: bad access detected
> > > [ 10.344415]
> > > [ 10.344629] Memory state around the buggy address:
> > > [ 10.345254] ffff8880191ef900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > [ 10.346245] ffff8880191ef980: 00 00 f1 f1 f1 f1 00 f2 f2 f2 00 00 00 00 00 00
> > > [ 10.347217] >ffff8880191efa00: 00 00 00 00 00 f2 f2 f2 00 00 00 00 00 00 00 00
> > > [ 10.348152] ^
> > > [ 10.348755] ffff8880191efa80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > [ 10.349698] ffff8880191efb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > [ 10.350650] ==================================================================

2019-04-12 11:19:07

by Peter Zijlstra

[permalink] [raw]
Subject: Re: 1808d65b55 ("asm-generic/tlb: Remove arch_tlb*_mmu()"): BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr

On Fri, Apr 12, 2019 at 12:56:33PM +0200, Peter Zijlstra wrote:
> On Thu, Apr 11, 2019 at 11:13:48PM +0200, Peter Zijlstra wrote:
> > On Thu, Apr 11, 2019 at 09:54:24PM +0200, Peter Zijlstra wrote:
> > > On Thu, Apr 11, 2019 at 09:39:06PM +0200, Peter Zijlstra wrote:
> > > > I think this bisect is bad. If you look at your own logs this patch
> > > > merely changes the failure, but doesn't make it go away.
> > > >
> > > > Before this patch (in fact, before tip/core/mm entirely) the errror
> > > > reads like the below, which suggests there is memory corruption
> > > > somewhere, and the fingered patch just makes it trigger differently.
> > > >
> > > > It would be very good to find the source of this corruption, but I'm
> > > > fairly certain it is not here.
> > >
> > > I went back to v4.20 to try and find a time when the below error did not
> > > occur, but even that reliably triggers the warning.
> >
> > So I also tested v4.19 and found that that was good, which made me
> > bisect v4.19..v4.20
> >
> > # bad: [8fe28cb58bcb235034b64cbbb7550a8a43fd88be] Linux 4.20
> > # good: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
> > git bisect start 'v4.20' 'v4.19'
> > # bad: [ec9c166434595382be3babf266febf876327774d] Merge tag 'mips_fixes_4.20_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
> > git bisect bad ec9c166434595382be3babf266febf876327774d
> > # bad: [50b825d7e87f4cff7070df6eb26390152bb29537] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
> > git bisect bad 50b825d7e87f4cff7070df6eb26390152bb29537
> > # good: [99e9acd85ccbdc8f5785f9e961d4956e96bd6aa5] Merge tag 'mlx5-updates-2018-10-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
> > git bisect good 99e9acd85ccbdc8f5785f9e961d4956e96bd6aa5
> > # good: [c403993a41d50db1e7d9bc2d43c3c8498162312f] Merge tag 'for-linus-4.20' of https://github.com/cminyard/linux-ipmi
> > git bisect good c403993a41d50db1e7d9bc2d43c3c8498162312f
> > # good: [c05f3642f4304dd081876e77a68555b6aba4483f] Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> > git bisect good c05f3642f4304dd081876e77a68555b6aba4483f
> > # bad: [44786880df196a4200c178945c4d41675faf9fb7] Merge branch 'parisc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
> > git bisect bad 44786880df196a4200c178945c4d41675faf9fb7
> > # bad: [99792e0cea1ed733cdc8d0758677981e0cbebfed] Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> > git bisect bad 99792e0cea1ed733cdc8d0758677981e0cbebfed
> > # good: [fec98069fb72fb656304a3e52265e0c2fc9adf87] Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> > git bisect good fec98069fb72fb656304a3e52265e0c2fc9adf87
> > # bad: [a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542] x86/mm: Page size aware flush_tlb_mm_range()
> > git bisect bad a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542
> > # good: [a7295fd53c39ce781a9792c9dd2c8747bf274160] x86/mm/cpa: Use flush_tlb_kernel_range()
> > git bisect good a7295fd53c39ce781a9792c9dd2c8747bf274160
> > # good: [9cf38d5559e813cccdba8b44c82cc46ba48d0896] kexec: Allocate decrypted control pages for kdump if SME is enabled
> > git bisect good 9cf38d5559e813cccdba8b44c82cc46ba48d0896
> > # good: [5b12904065798fee8b153a506ac7b72d5ebbe26c] x86/mm/doc: Clean up the x86-64 virtual memory layout descriptions
> > git bisect good 5b12904065798fee8b153a506ac7b72d5ebbe26c
> > # good: [cf089611f4c446285046fcd426d90c18f37d2905] proc/vmcore: Fix i386 build error of missing copy_oldmem_page_encrypted()
> > git bisect good cf089611f4c446285046fcd426d90c18f37d2905
> > # good: [a5b966ae42a70b194b03eaa5eaea70d8b3790c40] Merge branch 'tlb/asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into x86/mm
> > git bisect good a5b966ae42a70b194b03eaa5eaea70d8b3790c40
> > # first bad commit: [a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542] x86/mm: Page size aware flush_tlb_mm_range()
> >
> > And 'funnily' the bad patch is one of mine too :/
> >
> > I'll go have a look at that tomorrow, because currrently I'm way past
> > tired.
>
> OK, so the below patchlet makes it all good. It turns out that the
> provided config has:
>
> CONFIG_X86_L1_CACHE_SHIFT=7
>
> which then, for some obscure raisin, results in flush_tlb_mm_range()
> compiling to use 320 bytes of stack:
>
> sub $0x140,%rsp
>
> Where a 'defconfig' build results in:
>
> sub $0x58,%rsp
>
> The thing that pushes it over the edge in the above fingered patch is
> the addition of a field to struct flush_tlb_info, which grows if from 32
> to 36 bytes.
>
> So my proposal is to basically revert that, unless we can come up with
> something that GCC can't screw up.

To clarify, 'that' is Nadav's patch:

515ab7c41306 ("x86/mm: Align TLB invalidation info")

which turns out to be the real problem.

> ---
> arch/x86/mm/tlb.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index bc4bc7b2f075..487b8474c01c 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -728,7 +728,7 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
> {
> int cpu;
>
> - struct flush_tlb_info info __aligned(SMP_CACHE_BYTES) = {
> + struct flush_tlb_info info = {
> .mm = mm,
> .stride_shift = stride_shift,
> .freed_tables = freed_tables,
>
>
> > > > [ 10.273617] rodata_test: all tests were successful
> > > > [ 10.275015] x86/mm: Checking user space page tables
> > > > [ 10.295444] x86/mm: Checked W+X mappings: passed, no W+X pages found.
> > > > [ 10.296334] Run /init as init process
> > > > [ 10.301465] ==================================================================
> > > > [ 10.302460] BUG: KASAN: stack-out-of-bounds in __unwind_start+0x7e/0x4fe
> > > > [ 10.303355] Write of size 88 at addr ffff8880191efa28 by task init/1
> > > > [ 10.304241]
> > > > [ 10.304455] CPU: 0 PID: 1 Comm: init Not tainted 5.1.0-rc4-00288-ga131d61b43e0-dirty #10
> > > > [ 10.305542] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> > > > [ 10.306641] Call Trace:
> > > > [ 10.306990] print_address_description+0x9d/0x26b
> > > > [ 10.307654] ? __unwind_start+0x7e/0x4fe
> > > > [ 10.308222] ? __unwind_start+0x7e/0x4fe
> > > > [ 10.308755] __kasan_report+0x145/0x18a
> > > > [ 10.309266] ? __unwind_start+0x7e/0x4fe
> > > > [ 10.309823] kasan_report+0xe/0x12
> > > > [ 10.310273] memset+0x1f/0x31
> > > > [ 10.310703] __unwind_start+0x7e/0x4fe
> > > > [ 10.311223] ? unwind_next_frame+0x10a9/0x10a9
> > > > [ 10.311839] ? native_flush_tlb_one_user+0x54/0x95
> > > > [ 10.312504] ? kasan_unpoison_shadow+0xf/0x2e
> > > > [ 10.313090] __save_stack_trace+0x65/0xe7
> > > > [ 10.313667] ? trace_irq_enable_rcuidle+0x21/0xf5
> > > > [ 10.314284] ? tracer_hardirqs_on+0xb/0x1b
> > > > [ 10.314830] ? trace_hardirqs_on+0x2c/0x37
> > > > [ 10.315369] save_stack+0x32/0xa3
> > > > [ 10.315842] ? __put_compound_page+0x91/0x91
> > > > [ 10.316458] ? preempt_latency_start+0x22/0x68
> > > > [ 10.317052] ? free_swap_cache+0x51/0xd5
> > > > [ 10.317586] ? tlb_flush_mmu_free+0x31/0xca
> > > > [ 10.318140] ? arch_tlb_finish_mmu+0x8c/0x112
> > > > [ 10.318759] ? tlb_finish_mmu+0xc7/0xd6
> > > > [ 10.319298] ? unmap_region+0x275/0x2b9
> > > > [ 10.319835] ? special_mapping_fault+0x26d/0x26d
> > > > [ 10.320448] ? trace_irq_disable_rcuidle+0x21/0xf5
> > > > [ 10.321085] __kasan_slab_free+0xd3/0xf4
> > > > [ 10.321623] ? remove_vma+0xdf/0xe7
> > > > [ 10.322105] kmem_cache_free+0x4e/0xca
> > > > [ 10.322600] remove_vma+0xdf/0xe7
> > > > [ 10.323038] __do_munmap+0x72c/0x75e
> > > > [ 10.323514] __vm_munmap+0xd0/0x135
> > > > [ 10.323980] ? __x64_sys_brk+0x40e/0x40e
> > > > [ 10.324496] ? trace_irq_disable_rcuidle+0x21/0xf5
> > > > [ 10.325160] __x64_sys_munmap+0x6a/0x6f
> > > > [ 10.325670] do_syscall_64+0x3f0/0x462
> > > > [ 10.326162] ? syscall_return_slowpath+0x154/0x154
> > > > [ 10.326810] ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
> > > > [ 10.327485] ? trace_irq_disable_rcuidle+0x21/0xf5
> > > > [ 10.328153] ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
> > > > [ 10.328873] ? trace_hardirqs_off_caller+0x3e/0x40
> > > > [ 10.329505] ? trace_hardirqs_off_thunk+0x1a/0x1c
> > > > [ 10.330162] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > > > [ 10.330830] RIP: 0033:0x7efc4d707457
> > > > [ 10.331306] Code: f0 ff ff 73 01 c3 48 8d 0d 5a be 20 00 31 d2 48 29 c2 89 11 48 83 c8 ff eb eb 90 90 90 90 90 90 90 90 90 b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8d 0d 2d be 20 00 31 d2 48 29 c2 89
> > > > [ 10.333711] RSP: 002b:00007fff973da398 EFLAGS: 00000203 ORIG_RAX: 000000000000000b
> > > > [ 10.334728] RAX: ffffffffffffffda RBX: 00007efc4d9132c8 RCX: 00007efc4d707457
> > > > [ 10.335670] RDX: 0000000000000000 RSI: 0000000000001d67 RDI: 00007efc4d90d000
> > > > [ 10.336596] RBP: 00007fff973da4f0 R08: 0000000000000007 R09: 00000000ffffffff
> > > > [ 10.337512] R10: 0000000000000000 R11: 0000000000000203 R12: 000000073dd74283
> > > > [ 10.338457] R13: 000000073db1ab4f R14: 00007efc4d909700 R15: 00007efc4d9132c8
> > > > [ 10.339373]
> > > > [ 10.339585] The buggy address belongs to the page:
> > > > [ 10.340224] page:ffff88801de82c48 count:0 mapcount:0 mapping:0000000000000000 index:0x0
> > > > [ 10.341338] flags: 0x680000000000()
> > > > [ 10.341832] raw: 0000680000000000 ffff88801de82c50 ffff88801de82c50 0000000000000000
> > > > [ 10.342846] raw: 0000000000000000 0000000000000000 00000000ffffffff
> > > > [ 10.343679] page dumped because: kasan: bad access detected
> > > > [ 10.344415]
> > > > [ 10.344629] Memory state around the buggy address:
> > > > [ 10.345254] ffff8880191ef900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > > [ 10.346245] ffff8880191ef980: 00 00 f1 f1 f1 f1 00 f2 f2 f2 00 00 00 00 00 00
> > > > [ 10.347217] >ffff8880191efa00: 00 00 00 00 00 f2 f2 f2 00 00 00 00 00 00 00 00
> > > > [ 10.348152] ^
> > > > [ 10.348755] ffff8880191efa80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > > [ 10.349698] ffff8880191efb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > > [ 10.350650] ==================================================================

2019-04-12 15:12:26

by Nadav Amit

[permalink] [raw]
Subject: Re: 1808d65b55 ("asm-generic/tlb: Remove arch_tlb*_mmu()"): BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr

> On Apr 12, 2019, at 4:17 AM, Peter Zijlstra <[email protected]> wrote:
>
> On Fri, Apr 12, 2019 at 12:56:33PM +0200, Peter Zijlstra wrote:
>> On Thu, Apr 11, 2019 at 11:13:48PM +0200, Peter Zijlstra wrote:
>>> On Thu, Apr 11, 2019 at 09:54:24PM +0200, Peter Zijlstra wrote:
>>>> On Thu, Apr 11, 2019 at 09:39:06PM +0200, Peter Zijlstra wrote:
>>>>> I think this bisect is bad. If you look at your own logs this patch
>>>>> merely changes the failure, but doesn't make it go away.
>>>>>
>>>>> Before this patch (in fact, before tip/core/mm entirely) the errror
>>>>> reads like the below, which suggests there is memory corruption
>>>>> somewhere, and the fingered patch just makes it trigger differently.
>>>>>
>>>>> It would be very good to find the source of this corruption, but I'm
>>>>> fairly certain it is not here.
>>>>
>>>> I went back to v4.20 to try and find a time when the below error did not
>>>> occur, but even that reliably triggers the warning.
>>>
>>> So I also tested v4.19 and found that that was good, which made me
>>> bisect v4.19..v4.20
>>>
>>> # bad: [8fe28cb58bcb235034b64cbbb7550a8a43fd88be] Linux 4.20
>>> # good: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
>>> git bisect start 'v4.20' 'v4.19'
>>> # bad: [ec9c166434595382be3babf266febf876327774d] Merge tag 'mips_fixes_4.20_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
>>> git bisect bad ec9c166434595382be3babf266febf876327774d
>>> # bad: [50b825d7e87f4cff7070df6eb26390152bb29537] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
>>> git bisect bad 50b825d7e87f4cff7070df6eb26390152bb29537
>>> # good: [99e9acd85ccbdc8f5785f9e961d4956e96bd6aa5] Merge tag 'mlx5-updates-2018-10-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
>>> git bisect good 99e9acd85ccbdc8f5785f9e961d4956e96bd6aa5
>>> # good: [c403993a41d50db1e7d9bc2d43c3c8498162312f] Merge tag 'for-linus-4.20' of https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fcminyard%2Flinux-ipmi&amp;data=02%7C01%7Cnamit%40vmware.com%7Ca1c3ea5d4bc34cfc785508d6bf388ff3%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C636906647013777573&amp;sdata=3VSR3VdE5rxOitAdkqFNPpAnAtLgDmYLzJtoUrs5v9Y%3D&amp;reserved=0
>>> git bisect good c403993a41d50db1e7d9bc2d43c3c8498162312f
>>> # good: [c05f3642f4304dd081876e77a68555b6aba4483f] Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>> git bisect good c05f3642f4304dd081876e77a68555b6aba4483f
>>> # bad: [44786880df196a4200c178945c4d41675faf9fb7] Merge branch 'parisc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
>>> git bisect bad 44786880df196a4200c178945c4d41675faf9fb7
>>> # bad: [99792e0cea1ed733cdc8d0758677981e0cbebfed] Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>> git bisect bad 99792e0cea1ed733cdc8d0758677981e0cbebfed
>>> # good: [fec98069fb72fb656304a3e52265e0c2fc9adf87] Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>> git bisect good fec98069fb72fb656304a3e52265e0c2fc9adf87
>>> # bad: [a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542] x86/mm: Page size aware flush_tlb_mm_range()
>>> git bisect bad a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542
>>> # good: [a7295fd53c39ce781a9792c9dd2c8747bf274160] x86/mm/cpa: Use flush_tlb_kernel_range()
>>> git bisect good a7295fd53c39ce781a9792c9dd2c8747bf274160
>>> # good: [9cf38d5559e813cccdba8b44c82cc46ba48d0896] kexec: Allocate decrypted control pages for kdump if SME is enabled
>>> git bisect good 9cf38d5559e813cccdba8b44c82cc46ba48d0896
>>> # good: [5b12904065798fee8b153a506ac7b72d5ebbe26c] x86/mm/doc: Clean up the x86-64 virtual memory layout descriptions
>>> git bisect good 5b12904065798fee8b153a506ac7b72d5ebbe26c
>>> # good: [cf089611f4c446285046fcd426d90c18f37d2905] proc/vmcore: Fix i386 build error of missing copy_oldmem_page_encrypted()
>>> git bisect good cf089611f4c446285046fcd426d90c18f37d2905
>>> # good: [a5b966ae42a70b194b03eaa5eaea70d8b3790c40] Merge branch 'tlb/asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into x86/mm
>>> git bisect good a5b966ae42a70b194b03eaa5eaea70d8b3790c40
>>> # first bad commit: [a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542] x86/mm: Page size aware flush_tlb_mm_range()
>>>
>>> And 'funnily' the bad patch is one of mine too :/
>>>
>>> I'll go have a look at that tomorrow, because currrently I'm way past
>>> tired.
>>
>> OK, so the below patchlet makes it all good. It turns out that the
>> provided config has:
>>
>> CONFIG_X86_L1_CACHE_SHIFT=7
>>
>> which then, for some obscure raisin, results in flush_tlb_mm_range()
>> compiling to use 320 bytes of stack:
>>
>> sub $0x140,%rsp
>>
>> Where a 'defconfig' build results in:
>>
>> sub $0x58,%rsp
>>
>> The thing that pushes it over the edge in the above fingered patch is
>> the addition of a field to struct flush_tlb_info, which grows if from 32
>> to 36 bytes.
>>
>> So my proposal is to basically revert that, unless we can come up with
>> something that GCC can't screw up.
>
> To clarify, 'that' is Nadav's patch:
>
> 515ab7c41306 ("x86/mm: Align TLB invalidation info")
>
> which turns out to be the real problem.

Sorry for that. I still think it should be aligned, especially with all the
effort the Intel puts around to avoid bus-locking on unaligned atomic
operations.

So the right solution seems to me as putting this data structure off stack.
It would prevent flush_tlb_mm_range() from being reentrant, so we can keep a
few entries for this matter and atomically increase the entry number every
time we enter flush_tlb_mm_range().

But my question is - should flush_tlb_mm_range() be reentrant, or can we
assume no TLB shootdowns are initiated in interrupt handlers and #MC
handlers?

2019-04-12 15:19:56

by Nadav Amit

[permalink] [raw]
Subject: Re: 1808d65b55 ("asm-generic/tlb: Remove arch_tlb*_mmu()"): BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr

> On Apr 12, 2019, at 8:11 AM, Nadav Amit <[email protected]> wrote:
>
>> On Apr 12, 2019, at 4:17 AM, Peter Zijlstra <[email protected]> wrote:
>>
>> On Fri, Apr 12, 2019 at 12:56:33PM +0200, Peter Zijlstra wrote:
>>> On Thu, Apr 11, 2019 at 11:13:48PM +0200, Peter Zijlstra wrote:
>>>> On Thu, Apr 11, 2019 at 09:54:24PM +0200, Peter Zijlstra wrote:
>>>>> On Thu, Apr 11, 2019 at 09:39:06PM +0200, Peter Zijlstra wrote:
>>>>>> I think this bisect is bad. If you look at your own logs this patch
>>>>>> merely changes the failure, but doesn't make it go away.
>>>>>>
>>>>>> Before this patch (in fact, before tip/core/mm entirely) the errror
>>>>>> reads like the below, which suggests there is memory corruption
>>>>>> somewhere, and the fingered patch just makes it trigger differently.
>>>>>>
>>>>>> It would be very good to find the source of this corruption, but I'm
>>>>>> fairly certain it is not here.
>>>>>
>>>>> I went back to v4.20 to try and find a time when the below error did not
>>>>> occur, but even that reliably triggers the warning.
>>>>
>>>> So I also tested v4.19 and found that that was good, which made me
>>>> bisect v4.19..v4.20
>>>>
>>>> # bad: [8fe28cb58bcb235034b64cbbb7550a8a43fd88be] Linux 4.20
>>>> # good: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
>>>> git bisect start 'v4.20' 'v4.19'
>>>> # bad: [ec9c166434595382be3babf266febf876327774d] Merge tag 'mips_fixes_4.20_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
>>>> git bisect bad ec9c166434595382be3babf266febf876327774d
>>>> # bad: [50b825d7e87f4cff7070df6eb26390152bb29537] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
>>>> git bisect bad 50b825d7e87f4cff7070df6eb26390152bb29537
>>>> # good: [99e9acd85ccbdc8f5785f9e961d4956e96bd6aa5] Merge tag 'mlx5-updates-2018-10-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
>>>> git bisect good 99e9acd85ccbdc8f5785f9e961d4956e96bd6aa5
>>>> # good: [c403993a41d50db1e7d9bc2d43c3c8498162312f] Merge tag 'for-linus-4.20' of https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fcminyard%2Flinux-ipmi&amp;data=02%7C01%7Cnamit%40vmware.com%7Ca1c3ea5d4bc34cfc785508d6bf388ff3%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C636906647013777573&amp;sdata=3VSR3VdE5rxOitAdkqFNPpAnAtLgDmYLzJtoUrs5v9Y%3D&amp;reserved=0
>>>> git bisect good c403993a41d50db1e7d9bc2d43c3c8498162312f
>>>> # good: [c05f3642f4304dd081876e77a68555b6aba4483f] Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>> git bisect good c05f3642f4304dd081876e77a68555b6aba4483f
>>>> # bad: [44786880df196a4200c178945c4d41675faf9fb7] Merge branch 'parisc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
>>>> git bisect bad 44786880df196a4200c178945c4d41675faf9fb7
>>>> # bad: [99792e0cea1ed733cdc8d0758677981e0cbebfed] Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>> git bisect bad 99792e0cea1ed733cdc8d0758677981e0cbebfed
>>>> # good: [fec98069fb72fb656304a3e52265e0c2fc9adf87] Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>> git bisect good fec98069fb72fb656304a3e52265e0c2fc9adf87
>>>> # bad: [a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542] x86/mm: Page size aware flush_tlb_mm_range()
>>>> git bisect bad a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542
>>>> # good: [a7295fd53c39ce781a9792c9dd2c8747bf274160] x86/mm/cpa: Use flush_tlb_kernel_range()
>>>> git bisect good a7295fd53c39ce781a9792c9dd2c8747bf274160
>>>> # good: [9cf38d5559e813cccdba8b44c82cc46ba48d0896] kexec: Allocate decrypted control pages for kdump if SME is enabled
>>>> git bisect good 9cf38d5559e813cccdba8b44c82cc46ba48d0896
>>>> # good: [5b12904065798fee8b153a506ac7b72d5ebbe26c] x86/mm/doc: Clean up the x86-64 virtual memory layout descriptions
>>>> git bisect good 5b12904065798fee8b153a506ac7b72d5ebbe26c
>>>> # good: [cf089611f4c446285046fcd426d90c18f37d2905] proc/vmcore: Fix i386 build error of missing copy_oldmem_page_encrypted()
>>>> git bisect good cf089611f4c446285046fcd426d90c18f37d2905
>>>> # good: [a5b966ae42a70b194b03eaa5eaea70d8b3790c40] Merge branch 'tlb/asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into x86/mm
>>>> git bisect good a5b966ae42a70b194b03eaa5eaea70d8b3790c40
>>>> # first bad commit: [a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542] x86/mm: Page size aware flush_tlb_mm_range()
>>>>
>>>> And 'funnily' the bad patch is one of mine too :/
>>>>
>>>> I'll go have a look at that tomorrow, because currrently I'm way past
>>>> tired.
>>>
>>> OK, so the below patchlet makes it all good. It turns out that the
>>> provided config has:
>>>
>>> CONFIG_X86_L1_CACHE_SHIFT=7
>>>
>>> which then, for some obscure raisin, results in flush_tlb_mm_range()
>>> compiling to use 320 bytes of stack:
>>>
>>> sub $0x140,%rsp
>>>
>>> Where a 'defconfig' build results in:
>>>
>>> sub $0x58,%rsp
>>>
>>> The thing that pushes it over the edge in the above fingered patch is
>>> the addition of a field to struct flush_tlb_info, which grows if from 32
>>> to 36 bytes.
>>>
>>> So my proposal is to basically revert that, unless we can come up with
>>> something that GCC can't screw up.
>>
>> To clarify, 'that' is Nadav's patch:
>>
>> 515ab7c41306 ("x86/mm: Align TLB invalidation info")
>>
>> which turns out to be the real problem.
>
> Sorry for that. I still think it should be aligned, especially with all the
> effort the Intel puts around to avoid bus-locking on unaligned atomic
> operations.
>
> So the right solution seems to me as putting this data structure off stack.
> It would prevent flush_tlb_mm_range() from being reentrant, so we can keep a
> few entries for this matter and atomically increase the entry number every
> time we enter flush_tlb_mm_range().
>
> But my question is - should flush_tlb_mm_range() be reentrant, or can we
> assume no TLB shootdowns are initiated in interrupt handlers and #MC
> handlers?

Just to clarify - I obviously suggest a per-cpu structure. The question
is about one core reentering flush_tlb_mm_range().

2019-04-12 15:35:05

by Linus Torvalds

[permalink] [raw]
Subject: Re: 1808d65b55 ("asm-generic/tlb: Remove arch_tlb*_mmu()"): BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr

On Fri, Apr 12, 2019 at 3:56 AM Peter Zijlstra <[email protected]> wrote:
>
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -728,7 +728,7 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
> {
> int cpu;
>
> - struct flush_tlb_info info __aligned(SMP_CACHE_BYTES) = {
> + struct flush_tlb_info info = {
> .mm = mm,
> .stride_shift = stride_shift,
> .freed_tables = freed_tables,
>

Ack.

We should never have stack alignment bigger than 16 bytes. And
preferably not even that. Trying to align stack at a cacheline
boundary is wrong - if you *really* need things to be that aligned, do
something else (regular kmalloc, percpu temp area, static allocation -
whatever).

Linus

2019-04-12 16:51:37

by David Howells

[permalink] [raw]
Subject: Re: 1808d65b55 ("asm-generic/tlb: Remove arch_tlb*_mmu()"): BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr

Linus Torvalds <[email protected]> wrote:

> We should never have stack alignment bigger than 16 bytes. And
> preferably not even that.

At least one arch I know of (FRV) had instructions that could atomically
load/store register pairs or register quads, but they had to be pair- or
quad-aligned (ie. 8- or 16-byte), which made for more efficient code if you
could use them.

I don't know whether any arch we currently support has features like this (I
know some have multi-reg load/stores, but they seem to require only
word-alignment).

David

2019-04-12 17:08:00

by Nadav Amit

[permalink] [raw]
Subject: Re: 1808d65b55 ("asm-generic/tlb: Remove arch_tlb*_mmu()"): BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr

> On Apr 12, 2019, at 8:11 AM, Nadav Amit <[email protected]> wrote:
>
>> On Apr 12, 2019, at 4:17 AM, Peter Zijlstra <[email protected]> wrote:
>>
>> On Fri, Apr 12, 2019 at 12:56:33PM +0200, Peter Zijlstra wrote:
>>> On Thu, Apr 11, 2019 at 11:13:48PM +0200, Peter Zijlstra wrote:
>>>> On Thu, Apr 11, 2019 at 09:54:24PM +0200, Peter Zijlstra wrote:
>>>>> On Thu, Apr 11, 2019 at 09:39:06PM +0200, Peter Zijlstra wrote:
>>>>>> I think this bisect is bad. If you look at your own logs this patch
>>>>>> merely changes the failure, but doesn't make it go away.
>>>>>>
>>>>>> Before this patch (in fact, before tip/core/mm entirely) the errror
>>>>>> reads like the below, which suggests there is memory corruption
>>>>>> somewhere, and the fingered patch just makes it trigger differently.
>>>>>>
>>>>>> It would be very good to find the source of this corruption, but I'm
>>>>>> fairly certain it is not here.
>>>>>
>>>>> I went back to v4.20 to try and find a time when the below error did not
>>>>> occur, but even that reliably triggers the warning.
>>>>
>>>> So I also tested v4.19 and found that that was good, which made me
>>>> bisect v4.19..v4.20
>>>>
>>>> # bad: [8fe28cb58bcb235034b64cbbb7550a8a43fd88be] Linux 4.20
>>>> # good: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
>>>> git bisect start 'v4.20' 'v4.19'
>>>> # bad: [ec9c166434595382be3babf266febf876327774d] Merge tag 'mips_fixes_4.20_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
>>>> git bisect bad ec9c166434595382be3babf266febf876327774d
>>>> # bad: [50b825d7e87f4cff7070df6eb26390152bb29537] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
>>>> git bisect bad 50b825d7e87f4cff7070df6eb26390152bb29537
>>>> # good: [99e9acd85ccbdc8f5785f9e961d4956e96bd6aa5] Merge tag 'mlx5-updates-2018-10-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
>>>> git bisect good 99e9acd85ccbdc8f5785f9e961d4956e96bd6aa5
>>>> # good: [c403993a41d50db1e7d9bc2d43c3c8498162312f] Merge tag 'for-linus-4.20' of https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fcminyard%2Flinux-ipmi&amp;data=02%7C01%7Cnamit%40vmware.com%7Ca1c3ea5d4bc34cfc785508d6bf388ff3%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C636906647013777573&amp;sdata=3VSR3VdE5rxOitAdkqFNPpAnAtLgDmYLzJtoUrs5v9Y%3D&amp;reserved=0
>>>> git bisect good c403993a41d50db1e7d9bc2d43c3c8498162312f
>>>> # good: [c05f3642f4304dd081876e77a68555b6aba4483f] Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>> git bisect good c05f3642f4304dd081876e77a68555b6aba4483f
>>>> # bad: [44786880df196a4200c178945c4d41675faf9fb7] Merge branch 'parisc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
>>>> git bisect bad 44786880df196a4200c178945c4d41675faf9fb7
>>>> # bad: [99792e0cea1ed733cdc8d0758677981e0cbebfed] Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>> git bisect bad 99792e0cea1ed733cdc8d0758677981e0cbebfed
>>>> # good: [fec98069fb72fb656304a3e52265e0c2fc9adf87] Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>> git bisect good fec98069fb72fb656304a3e52265e0c2fc9adf87
>>>> # bad: [a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542] x86/mm: Page size aware flush_tlb_mm_range()
>>>> git bisect bad a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542
>>>> # good: [a7295fd53c39ce781a9792c9dd2c8747bf274160] x86/mm/cpa: Use flush_tlb_kernel_range()
>>>> git bisect good a7295fd53c39ce781a9792c9dd2c8747bf274160
>>>> # good: [9cf38d5559e813cccdba8b44c82cc46ba48d0896] kexec: Allocate decrypted control pages for kdump if SME is enabled
>>>> git bisect good 9cf38d5559e813cccdba8b44c82cc46ba48d0896
>>>> # good: [5b12904065798fee8b153a506ac7b72d5ebbe26c] x86/mm/doc: Clean up the x86-64 virtual memory layout descriptions
>>>> git bisect good 5b12904065798fee8b153a506ac7b72d5ebbe26c
>>>> # good: [cf089611f4c446285046fcd426d90c18f37d2905] proc/vmcore: Fix i386 build error of missing copy_oldmem_page_encrypted()
>>>> git bisect good cf089611f4c446285046fcd426d90c18f37d2905
>>>> # good: [a5b966ae42a70b194b03eaa5eaea70d8b3790c40] Merge branch 'tlb/asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into x86/mm
>>>> git bisect good a5b966ae42a70b194b03eaa5eaea70d8b3790c40
>>>> # first bad commit: [a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542] x86/mm: Page size aware flush_tlb_mm_range()
>>>>
>>>> And 'funnily' the bad patch is one of mine too :/
>>>>
>>>> I'll go have a look at that tomorrow, because currrently I'm way past
>>>> tired.
>>>
>>> OK, so the below patchlet makes it all good. It turns out that the
>>> provided config has:
>>>
>>> CONFIG_X86_L1_CACHE_SHIFT=7
>>>
>>> which then, for some obscure raisin, results in flush_tlb_mm_range()
>>> compiling to use 320 bytes of stack:
>>>
>>> sub $0x140,%rsp
>>>
>>> Where a 'defconfig' build results in:
>>>
>>> sub $0x58,%rsp
>>>
>>> The thing that pushes it over the edge in the above fingered patch is
>>> the addition of a field to struct flush_tlb_info, which grows if from 32
>>> to 36 bytes.
>>>
>>> So my proposal is to basically revert that, unless we can come up with
>>> something that GCC can't screw up.
>>
>> To clarify, 'that' is Nadav's patch:
>>
>> 515ab7c41306 ("x86/mm: Align TLB invalidation info")
>>
>> which turns out to be the real problem.
>
> Sorry for that. I still think it should be aligned, especially with all the
> effort the Intel puts around to avoid bus-locking on unaligned atomic
> operations.
>
> So the right solution seems to me as putting this data structure off stack.
> It would prevent flush_tlb_mm_range() from being reentrant, so we can keep a
> few entries for this matter and atomically increase the entry number every
> time we enter flush_tlb_mm_range().
>
> But my question is - should flush_tlb_mm_range() be reentrant, or can we
> assume no TLB shootdowns are initiated in interrupt handlers and #MC
> handlers?

Peter, what do you say about this one? I assume there are no nested TLB
flushes, but the code can easily be adapted (assuming there is a limit on
the nesting level).

-- >8 --

Subject: [PATCH] x86: Move flush_tlb_info off the stack
---
arch/x86/mm/tlb.c | 49 +++++++++++++++++++++++++++++++++--------------
1 file changed, 35 insertions(+), 14 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index bc4bc7b2f075..15fe90d4e3e1 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -14,6 +14,7 @@
#include <asm/cache.h>
#include <asm/apic.h>
#include <asm/uv/uv.h>
+#include <asm/local.h>

#include "mm_internal.h"

@@ -722,43 +723,63 @@ void native_flush_tlb_others(const struct cpumask *cpumask,
*/
unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;

+static DEFINE_PER_CPU_SHARED_ALIGNED(struct flush_tlb_info, flush_tlb_info);
+#ifdef CONFIG_DEBUG_VM
+static DEFINE_PER_CPU(local_t, flush_tlb_info_idx);
+#endif
+
void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
unsigned long end, unsigned int stride_shift,
bool freed_tables)
{
+ struct flush_tlb_info *info;
int cpu;

- struct flush_tlb_info info __aligned(SMP_CACHE_BYTES) = {
- .mm = mm,
- .stride_shift = stride_shift,
- .freed_tables = freed_tables,
- };
-
cpu = get_cpu();

+ info = this_cpu_ptr(&flush_tlb_info);
+
+#ifdef CONFIG_DEBUG_VM
+ /*
+ * Ensure that the following code is non-reentrant and flush_tlb_info
+ * is not overwritten. This means no TLB flushing is initiated by
+ * interrupt handlers and machine-check exception handlers. If needed,
+ * we can add additional flush_tlb_info entries.
+ */
+ BUG_ON(local_inc_return(this_cpu_ptr(&flush_tlb_info_idx)) != 1);
+#endif
+
+ info->mm = mm;
+ info->stride_shift = stride_shift;
+ info->freed_tables = freed_tables;
+
/* This is also a barrier that synchronizes with switch_mm(). */
- info.new_tlb_gen = inc_mm_tlb_gen(mm);
+ info->new_tlb_gen = inc_mm_tlb_gen(mm);

/* Should we flush just the requested range? */
if ((end != TLB_FLUSH_ALL) &&
((end - start) >> stride_shift) <= tlb_single_page_flush_ceiling) {
- info.start = start;
- info.end = end;
+ info->start = start;
+ info->end = end;
} else {
- info.start = 0UL;
- info.end = TLB_FLUSH_ALL;
+ info->start = 0UL;
+ info->end = TLB_FLUSH_ALL;
}

if (mm == this_cpu_read(cpu_tlbstate.loaded_mm)) {
- VM_WARN_ON(irqs_disabled());
+ lockdep_assert_irqs_enabled();
local_irq_disable();
- flush_tlb_func_local(&info, TLB_LOCAL_MM_SHOOTDOWN);
+ flush_tlb_func_local(info, TLB_LOCAL_MM_SHOOTDOWN);
local_irq_enable();
}

if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids)
- flush_tlb_others(mm_cpumask(mm), &info);
+ flush_tlb_others(mm_cpumask(mm), info);

+#ifdef CONFIG_DEBUG_VM
+ barrier();
+ local_dec(this_cpu_ptr(&flush_tlb_info_idx));
+#endif
put_cpu();
}

--
2.17.1


2019-04-12 17:16:40

by Andy Lutomirski

[permalink] [raw]
Subject: Re: 1808d65b55 ("asm-generic/tlb: Remove arch_tlb*_mmu()"): BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr



On Apr 12, 2019, at 10:05 AM, Nadav Amit <[email protected]> wrote:

>> On Apr 12, 2019, at 8:11 AM, Nadav Amit <[email protected]> wrote:
>>
>>> On Apr 12, 2019, at 4:17 AM, Peter Zijlstra <[email protected]> wrote:
>>>
>>> On Fri, Apr 12, 2019 at 12:56:33PM +0200, Peter Zijlstra wrote:
>>>>> On Thu, Apr 11, 2019 at 11:13:48PM +0200, Peter Zijlstra wrote:
>>>>>> On Thu, Apr 11, 2019 at 09:54:24PM +0200, Peter Zijlstra wrote:
>>>>>>> On Thu, Apr 11, 2019 at 09:39:06PM +0200, Peter Zijlstra wrote:
>>>>>>> I think this bisect is bad. If you look at your own logs this patch
>>>>>>> merely changes the failure, but doesn't make it go away.
>>>>>>>
>>>>>>> Before this patch (in fact, before tip/core/mm entirely) the errror
>>>>>>> reads like the below, which suggests there is memory corruption
>>>>>>> somewhere, and the fingered patch just makes it trigger differently.
>>>>>>>
>>>>>>> It would be very good to find the source of this corruption, but I'm
>>>>>>> fairly certain it is not here.
>>>>>>
>>>>>> I went back to v4.20 to try and find a time when the below error did not
>>>>>> occur, but even that reliably triggers the warning.
>>>>>
>>>>> So I also tested v4.19 and found that that was good, which made me
>>>>> bisect v4.19..v4.20
>>>>>
>>>>> # bad: [8fe28cb58bcb235034b64cbbb7550a8a43fd88be] Linux 4.20
>>>>> # good: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
>>>>> git bisect start 'v4.20' 'v4.19'
>>>>> # bad: [ec9c166434595382be3babf266febf876327774d] Merge tag 'mips_fixes_4.20_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
>>>>> git bisect bad ec9c166434595382be3babf266febf876327774d
>>>>> # bad: [50b825d7e87f4cff7070df6eb26390152bb29537] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
>>>>> git bisect bad 50b825d7e87f4cff7070df6eb26390152bb29537
>>>>> # good: [99e9acd85ccbdc8f5785f9e961d4956e96bd6aa5] Merge tag 'mlx5-updates-2018-10-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
>>>>> git bisect good 99e9acd85ccbdc8f5785f9e961d4956e96bd6aa5
>>>>> # good: [c403993a41d50db1e7d9bc2d43c3c8498162312f] Merge tag 'for-linus-4.20' of https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fcminyard%2Flinux-ipmi&amp;data=02%7C01%7Cnamit%40vmware.com%7Ca1c3ea5d4bc34cfc785508d6bf388ff3%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C636906647013777573&amp;sdata=3VSR3VdE5rxOitAdkqFNPpAnAtLgDmYLzJtoUrs5v9Y%3D&amp;reserved=0
>>>>> git bisect good c403993a41d50db1e7d9bc2d43c3c8498162312f
>>>>> # good: [c05f3642f4304dd081876e77a68555b6aba4483f] Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>>> git bisect good c05f3642f4304dd081876e77a68555b6aba4483f
>>>>> # bad: [44786880df196a4200c178945c4d41675faf9fb7] Merge branch 'parisc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
>>>>> git bisect bad 44786880df196a4200c178945c4d41675faf9fb7
>>>>> # bad: [99792e0cea1ed733cdc8d0758677981e0cbebfed] Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>>> git bisect bad 99792e0cea1ed733cdc8d0758677981e0cbebfed
>>>>> # good: [fec98069fb72fb656304a3e52265e0c2fc9adf87] Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>>> git bisect good fec98069fb72fb656304a3e52265e0c2fc9adf87
>>>>> # bad: [a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542] x86/mm: Page size aware flush_tlb_mm_range()
>>>>> git bisect bad a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542
>>>>> # good: [a7295fd53c39ce781a9792c9dd2c8747bf274160] x86/mm/cpa: Use flush_tlb_kernel_range()
>>>>> git bisect good a7295fd53c39ce781a9792c9dd2c8747bf274160
>>>>> # good: [9cf38d5559e813cccdba8b44c82cc46ba48d0896] kexec: Allocate decrypted control pages for kdump if SME is enabled
>>>>> git bisect good 9cf38d5559e813cccdba8b44c82cc46ba48d0896
>>>>> # good: [5b12904065798fee8b153a506ac7b72d5ebbe26c] x86/mm/doc: Clean up the x86-64 virtual memory layout descriptions
>>>>> git bisect good 5b12904065798fee8b153a506ac7b72d5ebbe26c
>>>>> # good: [cf089611f4c446285046fcd426d90c18f37d2905] proc/vmcore: Fix i386 build error of missing copy_oldmem_page_encrypted()
>>>>> git bisect good cf089611f4c446285046fcd426d90c18f37d2905
>>>>> # good: [a5b966ae42a70b194b03eaa5eaea70d8b3790c40] Merge branch 'tlb/asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into x86/mm
>>>>> git bisect good a5b966ae42a70b194b03eaa5eaea70d8b3790c40
>>>>> # first bad commit: [a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542] x86/mm: Page size aware flush_tlb_mm_range()
>>>>>
>>>>> And 'funnily' the bad patch is one of mine too :/
>>>>>
>>>>> I'll go have a look at that tomorrow, because currrently I'm way past
>>>>> tired.
>>>>
>>>> OK, so the below patchlet makes it all good. It turns out that the
>>>> provided config has:
>>>>
>>>> CONFIG_X86_L1_CACHE_SHIFT=7
>>>>
>>>> which then, for some obscure raisin, results in flush_tlb_mm_range()
>>>> compiling to use 320 bytes of stack:
>>>>
>>>> sub $0x140,%rsp
>>>>
>>>> Where a 'defconfig' build results in:
>>>>
>>>> sub $0x58,%rsp
>>>>
>>>> The thing that pushes it over the edge in the above fingered patch is
>>>> the addition of a field to struct flush_tlb_info, which grows if from 32
>>>> to 36 bytes.
>>>>
>>>> So my proposal is to basically revert that, unless we can come up with
>>>> something that GCC can't screw up.
>>>
>>> To clarify, 'that' is Nadav's patch:
>>>
>>> 515ab7c41306 ("x86/mm: Align TLB invalidation info")
>>>
>>> which turns out to be the real problem.
>>
>> Sorry for that. I still think it should be aligned, especially with all the
>> effort the Intel puts around to avoid bus-locking on unaligned atomic
>> operations.
>>
>> So the right solution seems to me as putting this data structure off stack.
>> It would prevent flush_tlb_mm_range() from being reentrant, so we can keep a
>> few entries for this matter and atomically increase the entry number every
>> time we enter flush_tlb_mm_range().
>>
>> But my question is - should flush_tlb_mm_range() be reentrant, or can we
>> assume no TLB shootdowns are initiated in interrupt handlers and #MC
>> handlers?
>
> Peter, what do you say about this one? I assume there are no nested TLB
> flushes, but the code can easily be adapted (assuming there is a limit on
> the nesting level).

You need IRQs on to flush, right? So as long as preemption is off, it won’t nest.

But is there really any measurable performance benefit to aligning it like this? There shouldn’t actually be any atomically — it’s just a little data structure telling everyone what to do.

>
> -- >8 --
>
> Subject: [PATCH] x86: Move flush_tlb_info off the stack
> ---
> arch/x86/mm/tlb.c | 49 +++++++++++++++++++++++++++++++++--------------
> 1 file changed, 35 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index bc4bc7b2f075..15fe90d4e3e1 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -14,6 +14,7 @@
> #include <asm/cache.h>
> #include <asm/apic.h>
> #include <asm/uv/uv.h>
> +#include <asm/local.h>
>
> #include "mm_internal.h"
>
> @@ -722,43 +723,63 @@ void native_flush_tlb_others(const struct cpumask *cpumask,
> */
> unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
>
> +static DEFINE_PER_CPU_SHARED_ALIGNED(struct flush_tlb_info, flush_tlb_info);
> +#ifdef CONFIG_DEBUG_VM
> +static DEFINE_PER_CPU(local_t, flush_tlb_info_idx);
> +#endif
> +
> void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
> unsigned long end, unsigned int stride_shift,
> bool freed_tables)
> {
> + struct flush_tlb_info *info;
> int cpu;
>
> - struct flush_tlb_info info __aligned(SMP_CACHE_BYTES) = {
> - .mm = mm,
> - .stride_shift = stride_shift,
> - .freed_tables = freed_tables,
> - };
> -
> cpu = get_cpu();
>
> + info = this_cpu_ptr(&flush_tlb_info);
> +
> +#ifdef CONFIG_DEBUG_VM
> + /*
> + * Ensure that the following code is non-reentrant and flush_tlb_info
> + * is not overwritten. This means no TLB flushing is initiated by
> + * interrupt handlers and machine-check exception handlers. If needed,
> + * we can add additional flush_tlb_info entries.
> + */
> + BUG_ON(local_inc_return(this_cpu_ptr(&flush_tlb_info_idx)) != 1);
> +#endif
> +
> + info->mm = mm;
> + info->stride_shift = stride_shift;
> + info->freed_tables = freed_tables;
> +
> /* This is also a barrier that synchronizes with switch_mm(). */
> - info.new_tlb_gen = inc_mm_tlb_gen(mm);
> + info->new_tlb_gen = inc_mm_tlb_gen(mm);
>
> /* Should we flush just the requested range? */
> if ((end != TLB_FLUSH_ALL) &&
> ((end - start) >> stride_shift) <= tlb_single_page_flush_ceiling) {
> - info.start = start;
> - info.end = end;
> + info->start = start;
> + info->end = end;
> } else {
> - info.start = 0UL;
> - info.end = TLB_FLUSH_ALL;
> + info->start = 0UL;
> + info->end = TLB_FLUSH_ALL;
> }
>
> if (mm == this_cpu_read(cpu_tlbstate.loaded_mm)) {
> - VM_WARN_ON(irqs_disabled());
> + lockdep_assert_irqs_enabled();
> local_irq_disable();
> - flush_tlb_func_local(&info, TLB_LOCAL_MM_SHOOTDOWN);
> + flush_tlb_func_local(info, TLB_LOCAL_MM_SHOOTDOWN);
> local_irq_enable();
> }
>
> if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids)
> - flush_tlb_others(mm_cpumask(mm), &info);
> + flush_tlb_others(mm_cpumask(mm), info);
>
> +#ifdef CONFIG_DEBUG_VM
> + barrier();
> + local_dec(this_cpu_ptr(&flush_tlb_info_idx));
> +#endif
> put_cpu();
> }
>
> --
> 2.17.1
>
>

2019-04-12 17:50:33

by Nadav Amit

[permalink] [raw]
Subject: Re: 1808d65b55 ("asm-generic/tlb: Remove arch_tlb*_mmu()"): BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr

> On Apr 12, 2019, at 10:14 AM, Andy Lutomirski <[email protected]> wrote:
>
>
>
> On Apr 12, 2019, at 10:05 AM, Nadav Amit <[email protected]> wrote:
>
>>> On Apr 12, 2019, at 8:11 AM, Nadav Amit <[email protected]> wrote:
>>>
>>>> On Apr 12, 2019, at 4:17 AM, Peter Zijlstra <[email protected]> wrote:
>>>>
>>>> On Fri, Apr 12, 2019 at 12:56:33PM +0200, Peter Zijlstra wrote:
>>>>>> On Thu, Apr 11, 2019 at 11:13:48PM +0200, Peter Zijlstra wrote:
>>>>>>> On Thu, Apr 11, 2019 at 09:54:24PM +0200, Peter Zijlstra wrote:
>>>>>>>> On Thu, Apr 11, 2019 at 09:39:06PM +0200, Peter Zijlstra wrote:
>>>>>>>> I think this bisect is bad. If you look at your own logs this patch
>>>>>>>> merely changes the failure, but doesn't make it go away.
>>>>>>>>
>>>>>>>> Before this patch (in fact, before tip/core/mm entirely) the errror
>>>>>>>> reads like the below, which suggests there is memory corruption
>>>>>>>> somewhere, and the fingered patch just makes it trigger differently.
>>>>>>>>
>>>>>>>> It would be very good to find the source of this corruption, but I'm
>>>>>>>> fairly certain it is not here.
>>>>>>>
>>>>>>> I went back to v4.20 to try and find a time when the below error did not
>>>>>>> occur, but even that reliably triggers the warning.
>>>>>>
>>>>>> So I also tested v4.19 and found that that was good, which made me
>>>>>> bisect v4.19..v4.20
>>>>>>
>>>>>> # bad: [8fe28cb58bcb235034b64cbbb7550a8a43fd88be] Linux 4.20
>>>>>> # good: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
>>>>>> git bisect start 'v4.20' 'v4.19'
>>>>>> # bad: [ec9c166434595382be3babf266febf876327774d] Merge tag 'mips_fixes_4.20_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
>>>>>> git bisect bad ec9c166434595382be3babf266febf876327774d
>>>>>> # bad: [50b825d7e87f4cff7070df6eb26390152bb29537] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
>>>>>> git bisect bad 50b825d7e87f4cff7070df6eb26390152bb29537
>>>>>> # good: [99e9acd85ccbdc8f5785f9e961d4956e96bd6aa5] Merge tag 'mlx5-updates-2018-10-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
>>>>>> git bisect good 99e9acd85ccbdc8f5785f9e961d4956e96bd6aa5
>>>>>> # good: [c403993a41d50db1e7d9bc2d43c3c8498162312f] Merge tag 'for-linus-4.20' of https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fcminyard%2Flinux-ipmi&amp;data=02%7C01%7Cnamit%40vmware.com%7Ca1c3ea5d4bc34cfc785508d6bf388ff3%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C636906647013777573&amp;sdata=3VSR3VdE5rxOitAdkqFNPpAnAtLgDmYLzJtoUrs5v9Y%3D&amp;reserved=0
>>>>>> git bisect good c403993a41d50db1e7d9bc2d43c3c8498162312f
>>>>>> # good: [c05f3642f4304dd081876e77a68555b6aba4483f] Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>>>> git bisect good c05f3642f4304dd081876e77a68555b6aba4483f
>>>>>> # bad: [44786880df196a4200c178945c4d41675faf9fb7] Merge branch 'parisc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
>>>>>> git bisect bad 44786880df196a4200c178945c4d41675faf9fb7
>>>>>> # bad: [99792e0cea1ed733cdc8d0758677981e0cbebfed] Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>>>> git bisect bad 99792e0cea1ed733cdc8d0758677981e0cbebfed
>>>>>> # good: [fec98069fb72fb656304a3e52265e0c2fc9adf87] Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>>>> git bisect good fec98069fb72fb656304a3e52265e0c2fc9adf87
>>>>>> # bad: [a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542] x86/mm: Page size aware flush_tlb_mm_range()
>>>>>> git bisect bad a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542
>>>>>> # good: [a7295fd53c39ce781a9792c9dd2c8747bf274160] x86/mm/cpa: Use flush_tlb_kernel_range()
>>>>>> git bisect good a7295fd53c39ce781a9792c9dd2c8747bf274160
>>>>>> # good: [9cf38d5559e813cccdba8b44c82cc46ba48d0896] kexec: Allocate decrypted control pages for kdump if SME is enabled
>>>>>> git bisect good 9cf38d5559e813cccdba8b44c82cc46ba48d0896
>>>>>> # good: [5b12904065798fee8b153a506ac7b72d5ebbe26c] x86/mm/doc: Clean up the x86-64 virtual memory layout descriptions
>>>>>> git bisect good 5b12904065798fee8b153a506ac7b72d5ebbe26c
>>>>>> # good: [cf089611f4c446285046fcd426d90c18f37d2905] proc/vmcore: Fix i386 build error of missing copy_oldmem_page_encrypted()
>>>>>> git bisect good cf089611f4c446285046fcd426d90c18f37d2905
>>>>>> # good: [a5b966ae42a70b194b03eaa5eaea70d8b3790c40] Merge branch 'tlb/asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into x86/mm
>>>>>> git bisect good a5b966ae42a70b194b03eaa5eaea70d8b3790c40
>>>>>> # first bad commit: [a31acd3ee8f7dbc0370bdf4a4bfef7a8c13c7542] x86/mm: Page size aware flush_tlb_mm_range()
>>>>>>
>>>>>> And 'funnily' the bad patch is one of mine too :/
>>>>>>
>>>>>> I'll go have a look at that tomorrow, because currrently I'm way past
>>>>>> tired.
>>>>>
>>>>> OK, so the below patchlet makes it all good. It turns out that the
>>>>> provided config has:
>>>>>
>>>>> CONFIG_X86_L1_CACHE_SHIFT=7
>>>>>
>>>>> which then, for some obscure raisin, results in flush_tlb_mm_range()
>>>>> compiling to use 320 bytes of stack:
>>>>>
>>>>> sub $0x140,%rsp
>>>>>
>>>>> Where a 'defconfig' build results in:
>>>>>
>>>>> sub $0x58,%rsp
>>>>>
>>>>> The thing that pushes it over the edge in the above fingered patch is
>>>>> the addition of a field to struct flush_tlb_info, which grows if from 32
>>>>> to 36 bytes.
>>>>>
>>>>> So my proposal is to basically revert that, unless we can come up with
>>>>> something that GCC can't screw up.
>>>>
>>>> To clarify, 'that' is Nadav's patch:
>>>>
>>>> 515ab7c41306 ("x86/mm: Align TLB invalidation info")
>>>>
>>>> which turns out to be the real problem.
>>>
>>> Sorry for that. I still think it should be aligned, especially with all the
>>> effort the Intel puts around to avoid bus-locking on unaligned atomic
>>> operations.
>>>
>>> So the right solution seems to me as putting this data structure off stack.
>>> It would prevent flush_tlb_mm_range() from being reentrant, so we can keep a
>>> few entries for this matter and atomically increase the entry number every
>>> time we enter flush_tlb_mm_range().
>>>
>>> But my question is - should flush_tlb_mm_range() be reentrant, or can we
>>> assume no TLB shootdowns are initiated in interrupt handlers and #MC
>>> handlers?
>>
>> Peter, what do you say about this one? I assume there are no nested TLB
>> flushes, but the code can easily be adapted (assuming there is a limit on
>> the nesting level).
>
> You need IRQs on to flush, right? So as long as preemption is off, it won’t nest.

Yes. I figured, but it still had all kind of theoretical scenarios in my mind
(IRQs are conditionally enabled in #MC handler, etc.)

> But is there really any measurable performance benefit to aligning it like
> this? There shouldn’t actually be any atomically — it’s just a little data
> structure telling everyone what to do.

At the time I measured (I hacked the code to force misalignment), and it was
marginal (i.e., hard to say).

Having said that, it seems to me as a more scalable design choice. From a
brief look, the vast majority of on_each_cpu() arguments are off the stack.

Correct me if I am wrong, but keeping them off the stack should help, not
only by preventing unalignment, but also by preventing some TLB misses:
right now tlb_flush_info instances of different threads are in different
virtual addresses (and because the kernel stack is vmalloc’d, kernel stack
virtual-address mappings do not share the same TLB-entry).

2019-04-12 18:14:44

by Peter Zijlstra

[permalink] [raw]
Subject: Re: 1808d65b55 ("asm-generic/tlb: Remove arch_tlb*_mmu()"): BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr

On Fri, Apr 12, 2019 at 05:05:53PM +0000, Nadav Amit wrote:
> Peter, what do you say about this one? I assume there are no nested TLB
> flushes, but the code can easily be adapted (assuming there is a limit on
> the nesting level).

Possible. Althoug at this point I think we should just remove the
alignment, and them maybe do this on top later.

> -- >8 --
>
> Subject: [PATCH] x86: Move flush_tlb_info off the stack
> ---
> arch/x86/mm/tlb.c | 49 +++++++++++++++++++++++++++++++++--------------
> 1 file changed, 35 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index bc4bc7b2f075..15fe90d4e3e1 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -14,6 +14,7 @@
> #include <asm/cache.h>
> #include <asm/apic.h>
> #include <asm/uv/uv.h>
> +#include <asm/local.h>
>
> #include "mm_internal.h"
>
> @@ -722,43 +723,63 @@ void native_flush_tlb_others(const struct cpumask *cpumask,
> */
> unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
>
> +static DEFINE_PER_CPU_SHARED_ALIGNED(struct flush_tlb_info, flush_tlb_info);
> +#ifdef CONFIG_DEBUG_VM
> +static DEFINE_PER_CPU(local_t, flush_tlb_info_idx);
> +#endif
> +
> void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
> unsigned long end, unsigned int stride_shift,
> bool freed_tables)
> {
> + struct flush_tlb_info *info;
> int cpu;
>
> - struct flush_tlb_info info __aligned(SMP_CACHE_BYTES) = {
> - .mm = mm,
> - .stride_shift = stride_shift,
> - .freed_tables = freed_tables,
> - };
> -
> cpu = get_cpu();
>
> + info = this_cpu_ptr(&flush_tlb_info);
> +
> +#ifdef CONFIG_DEBUG_VM
> + /*
> + * Ensure that the following code is non-reentrant and flush_tlb_info
> + * is not overwritten. This means no TLB flushing is initiated by
> + * interrupt handlers and machine-check exception handlers. If needed,
> + * we can add additional flush_tlb_info entries.
> + */
> + BUG_ON(local_inc_return(this_cpu_ptr(&flush_tlb_info_idx)) != 1);

That's what we have this_cpu_inc_return() for.

> +#endif
> +
> + info->mm = mm;
> + info->stride_shift = stride_shift;
> + info->freed_tables = freed_tables;
> +
> /* This is also a barrier that synchronizes with switch_mm(). */
> - info.new_tlb_gen = inc_mm_tlb_gen(mm);
> + info->new_tlb_gen = inc_mm_tlb_gen(mm);
>
> /* Should we flush just the requested range? */
> if ((end != TLB_FLUSH_ALL) &&
> ((end - start) >> stride_shift) <= tlb_single_page_flush_ceiling) {
> - info.start = start;
> - info.end = end;
> + info->start = start;
> + info->end = end;
> } else {
> - info.start = 0UL;
> - info.end = TLB_FLUSH_ALL;
> + info->start = 0UL;
> + info->end = TLB_FLUSH_ALL;
> }
>
> if (mm == this_cpu_read(cpu_tlbstate.loaded_mm)) {
> - VM_WARN_ON(irqs_disabled());
> + lockdep_assert_irqs_enabled();
> local_irq_disable();
> - flush_tlb_func_local(&info, TLB_LOCAL_MM_SHOOTDOWN);
> + flush_tlb_func_local(info, TLB_LOCAL_MM_SHOOTDOWN);
> local_irq_enable();
> }
>
> if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids)
> - flush_tlb_others(mm_cpumask(mm), &info);
> + flush_tlb_others(mm_cpumask(mm), info);
>
> +#ifdef CONFIG_DEBUG_VM
> + barrier();
> + local_dec(this_cpu_ptr(&flush_tlb_info_idx));

this_cpu_dec();

> +#endif
> put_cpu();
> }
>
> --
> 2.17.1
>
>

2019-04-12 18:18:11

by Peter Zijlstra

[permalink] [raw]
Subject: Re: 1808d65b55 ("asm-generic/tlb: Remove arch_tlb*_mmu()"): BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr

On Fri, Apr 12, 2019 at 05:50:30PM +0100, David Howells wrote:
> Linus Torvalds <[email protected]> wrote:
>
> > We should never have stack alignment bigger than 16 bytes. And
> > preferably not even that.
>
> At least one arch I know of (FRV) had instructions that could atomically
> load/store register pairs or register quads, but they had to be pair- or
> quad-aligned (ie. 8- or 16-byte), which made for more efficient code if you
> could use them.
>
> I don't know whether any arch we currently support has features like this (I
> know some have multi-reg load/stores, but they seem to require only
> word-alignment).

ARC (iirc) has u64 atomics with natural alignment requirements but
alignof(u64)=4 due it being a 32bit arch. Which is awkward.

ARMv7 can also do u64 ops when aligned right, but I forgot if they have
proper alignment or not.

2019-04-12 18:20:32

by Peter Zijlstra

[permalink] [raw]
Subject: Re: 1808d65b55 ("asm-generic/tlb: Remove arch_tlb*_mmu()"): BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr

On Fri, Apr 12, 2019 at 03:11:22PM +0000, Nadav Amit wrote:
> > On Apr 12, 2019, at 4:17 AM, Peter Zijlstra <[email protected]> wrote:

> > To clarify, 'that' is Nadav's patch:
> >
> > 515ab7c41306 ("x86/mm: Align TLB invalidation info")
> >
> > which turns out to be the real problem.
>
> Sorry for that. I still think it should be aligned, especially with all the
> effort the Intel puts around to avoid bus-locking on unaligned atomic
> operations.

No atomics anywhere in sight, so that's not a concern.

> So the right solution seems to me as putting this data structure off stack.
> It would prevent flush_tlb_mm_range() from being reentrant, so we can keep a
> few entries for this matter and atomically increase the entry number every
> time we enter flush_tlb_mm_range().
>
> But my question is - should flush_tlb_mm_range() be reentrant, or can we
> assume no TLB shootdowns are initiated in interrupt handlers and #MC
> handlers?

There _should_ not be, but then don't look at those XPFO patches that
were posted (they're broken anyway).

2019-04-12 19:43:50

by Nadav Amit

[permalink] [raw]
Subject: Re: 1808d65b55 ("asm-generic/tlb: Remove arch_tlb*_mmu()"): BUG: KASAN: stack-out-of-bounds in __change_page_attr_set_clr

> On Apr 12, 2019, at 11:19 AM, Peter Zijlstra <[email protected]> wrote:
>
> On Fri, Apr 12, 2019 at 03:11:22PM +0000, Nadav Amit wrote:
>>> On Apr 12, 2019, at 4:17 AM, Peter Zijlstra <[email protected]> wrote:
>
>>> To clarify, 'that' is Nadav's patch:
>>>
>>> 515ab7c41306 ("x86/mm: Align TLB invalidation info")
>>>
>>> which turns out to be the real problem.
>>
>> Sorry for that. I still think it should be aligned, especially with all the
>> effort the Intel puts around to avoid bus-locking on unaligned atomic
>> operations.
>
> No atomics anywhere in sight, so that's not a concern.

You are right. I still think that at least TLB-wise it should be better to
have the argument off-stack. I’ll try to run some experiments, based on
your feedback, and send a patch on top of your revert.

Sorry for the mess, again.