2020-04-22 15:57:28

by Qian Cai

Subject: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()"

Reverting the linux-next commit and its dependency,

a85573f7e741 ("x86/mm: Unexport __cachemode2pte_tbl")
9e294786c89a ("x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()")

fixed crashes or hard resets on AMD machines during boot. KASAN flagged them in
different forms, indicating some sort of memory corruption with this config,

https://raw.githubusercontent.com/cailca/linux-mm/master/x86.config

[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x71] high level lint[0x1])
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x78] high level lint[0x1])
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x79] high level lint[0x1])
[ 0.000000] BUG: unable to handle page fault for address: ffffed107c782fff
[ 0.000000] #PF: supervisor read access in kernel mode
[ 0.000000] #PF: error_code(0x0000) - not-present page
[ 0.000000] ==================================================================
[ 0.000000] BUG: KASAN: stack-out-of-bounds in cmp_ex_search+0x1e/0x40
ex_to_insn at lib/extable.c:20
(inlined by) cmp_ex_search at lib/extable.c:101
[ 0.000000] Read of size 4 at addr ffffffffae27cae4 by task swapper/0
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.7.0-rc2-next-20200422+ #4
[ 0.000000]
[ 0.000000] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 03/09/2018
[ 0.000000] Call Trace:
[ 0.000000]
[ 0.000000] The buggy address belongs to the variable:
[ 0.000000] __start___ex_table+0x1cd4/0x2670
[ 0.000000]
[ 0.000000] Memory state around the buggy address:
[ 0.000000] ffffffffae27c980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 0.000000] ffffffffae27ca00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 0.000000] >ffffffffae27ca80: 00 00 00 00 00 f1 f1 f1 f1 00 f2 f2 f2 00 00 00
[ 0.000000] ^
[ 0.000000] ffffffffae27cb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 0.000000] ffffffffae27cb80: 00 00 00 00 00 f1 f1 f1 f1 02 f2 f2 f2 f2 f2 f2
[ 0.000000] ==================================================================
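
The faulting address in the report above, ffffed107c782fff, looks like a KASAN shadow address and not an ordinary data address. Assuming the default x86-64 generic KASAN layout (shadow offset 0xdffffc0000000000, one shadow byte per 8 bytes of memory), the mapping can be sketched in user-space C as below. The function names echo the kernel helper but are illustrative here, and the constants are assumptions about this particular config.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed defaults for x86-64 generic KASAN; not read from the posted config. */
#define KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL
#define KASAN_SHADOW_SCALE_SHIFT 3

/* Shadow byte that describes the 8-byte granule containing addr. */
static inline uint64_t kasan_mem_to_shadow(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* Inverse mapping: first byte of the granule a shadow byte describes. */
static inline uint64_t kasan_shadow_to_mem(uint64_t shadow)
{
	return (shadow - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
}

/* Illustrative self-check using the address from the first report. */
static inline void kasan_shadow_demo(void)
{
	assert(kasan_shadow_to_mem(0xffffed107c782fffULL) == 0xffff8883e3c17ff8ULL);
	assert(kasan_mem_to_shadow(0xffff8883e3c17ff8ULL) == 0xffffed107c782fffULL);
}
```

Under those assumptions, the not-present fault would have been raised while KASAN was checking an access near ffff8883e3c17ff8, which suggests the shadow for that range was not mapped at that point in boot.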


[ 5.125583][ T0] BUG: KASAN: null-ptr-deref in __check_object_size+0x12c/0x503
__read_once_size at include/linux/compiler.h:199
(inlined by) compound_head at include/linux/page-flags.h:182
(inlined by) PageSlab at include/linux/page-flags.h:333
(inlined by) check_heap_object at mm/usercopy.c:238
(inlined by) __check_object_size at mm/usercopy.c:286
(inlined by) __check_object_size at mm/usercopy.c:256
[ 5.133083][ T0] Read of size 8 at addr 0000000000000006 by task swapper/0
[ 5.140244][ T0]
[ 5.142434][ T0] CPU: 0 PID: 0 Comm: swapper Not tainted 5.7.0-rc2+ #8
[ 5.149241][ T0] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
[ 5.158502][ T0] Call Trace:
[ 5.161654][ T0] ? do_raw_spin_lock+0x11e/0x1e0
[ 5.166542][ T0] ? rwlock_bug.part.0+0x60/0x60
[ 5.171348][ T0] ? bad_area_nosemaphore+0x16/0x20
[ 5.176409][ T0] ? do_page_fault+0x44b/0x9d7
[ 5.181043][ T0] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 5.186459][ T0] ? page_fault+0x34/0x40
[ 5.190649][ T0] ? dump_pagetable+0xf3/0x3b0
[ 5.195280][ T0] ? __asan_load8+0x40/0xb0
[ 5.199645][ T0] ? dump_pagetable+0xf3/0x3b0
[ 5.204274][ T0] ? vmalloc_fault+0x450/0x450
[ 5.208906][ T0] ? search_exception_tables+0x4c/0x50
[ 5.214231][ T0] ? fixup_exception+0x38/0x92
[ 5.218861][ T0] ? no_context.cold.21+0x160/0x2e0
[ 5.223928][ T0] ? pgtable_bad+0x80/0x80
[ 5.228209][ T0] ? register_lock_class+0xb40/0xb40
[ 5.233362][ T0] ? register_lock_class+0xb40/0xb40
[ 5.238519][ T0] ? snprintf+0xc0/0xc0
[ 5.242538][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 5.248039][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 5.253538][ T0] ? console_unlock+0x3e5/0x740
[ 5.258257][ T0] ? console_unlock+0x3ff/0x740
[ 5.262975][ T0] ? __bad_area_nosemaphore+0x66/0x230
[ 5.268303][ T0] ? do_raw_spin_lock+0x11e/0x1e0
[ 5.273193][ T0] ? rwlock_bug.part.0+0x60/0x60
[ 5.277998][ T0] ? bad_area_nosemaphore+0x16/0x20
[ 5.283064][ T0] ? do_page_fault+0x44b/0x9d7
[ 5.287694][ T0] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 5.293111][ T0] ? page_fault+0x34/0x40
[ 5.297303][ T0] ? dump_pagetable+0xf3/0x3b0
[ 5.301933][ T0] ? __asan_load8+0x40/0xb0
[ 5.306300][ T0] ? dump_pagetable+0xf3/0x3b0
[ 5.310929][ T0] ? vmalloc_fault+0x450/0x450
[ 5.315560][ T0] ? search_exception_tables+0x4c/0x50
[ 5.320886][ T0] ? fixup_exception+0x38/0x92
[ 5.325518][ T0] ? no_context.cold.21+0x160/0x2e0
[ 5.330582][ T0] ? pgtable_bad+0x80/0x80
[ 5.334863][ T0] ? register_lock_class+0xb40/0xb40
[ 5.340018][ T0] ? register_lock_class+0xb40/0xb40
[ 5.345171][ T0] ? snprintf+0xc0/0xc0
[ 5.349190][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 5.354694][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 5.360521][ T0] ? console_unlock+0x3e5/0x740
[ 5.365235][ T0] ? console_unlock+0x3ff/0x740
[ 5.369954][ T0] ? __bad_area_nosemaphore+0x66/0x230
[ 5.375282][ T0] ? do_raw_spin_lock+0x11e/0x1e0
[ 5.380174][ T0] ? rwlock_bug.part.0+0x60/0x60
[ 5.384977][ T0] ? bad_area_nosemaphore+0x16/0x20
[ 5.390044][ T0] ? do_page_fault+0x44b/0x9d7
[ 5.394674][ T0] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 5.400090][ T0] ? page_fault+0x34/0x40
[ 5.404282][ T0] ? dump_pagetable+0xf3/0x3b0
[ 5.408912][ T0] ? __asan_load8+0x40/0xb0
[ 5.413279][ T0] ? dump_pagetable+0xf3/0x3b0
[ 5.417908][ T0] ? vmalloc_fault+0x450/0x450
[ 5.422538][ T0] ? search_exception_tables+0x4c/0x50
[ 5.427868][ T0] ? fixup_exception+0x38/0x92
[ 5.432496][ T0] ? no_context.cold.21+0x160/0x2e0
[ 5.437562][ T0] ? pgtable_bad+0x80/0x80
[ 5.441843][ T0] ? register_lock_class+0xb40/0xb40
[ 5.446996][ T0] ? register_lock_class+0xb40/0xb40
[ 5.452152][ T0] ? snprintf+0xc0/0xc0
[ 5.456170][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 5.461674][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 5.467173][ T0] ? console_unlock+0x3e5/0x740
[ 5.471891][ T0] ? console_unlock+0x3ff/0x740
[ 5.476609][ T0] ? __bad_area_nosemaphore+0x66/0x230
[ 5.481937][ T0] ? do_raw_spin_lock+0x11e/0x1e0
[ 5.486829][ T0] ? rwlock_bug.part.0+0x60/0x60
[ 5.491632][ T0] ? bad_area_nosemaphore+0x16/0x20
[ 5.496699][ T0] ? do_page_fault+0x44b/0x9d7
[ 5.501329][ T0] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 5.506746][ T0] ? page_fault+0x34/0x40
[ 5.510938][ T0] ? dump_pagetable+0xf3/0x3b0
[ 5.515568][ T0] ? __asan_load8+0x40/0xb0
[ 5.519935][ T0] ? dump_pagetable+0xf3/0x3b0
[ 5.524564][ T0] ? vmalloc_fault+0x450/0x450
[ 5.529194][ T0] ? search_exception_tables+0x4c/0x50
[ 5.534521][ T0] ? fixup_exception+0x38/0x92
[ 5.539153][ T0] ? no_context.cold.21+0x160/0x2e0
[ 5.544218][ T0] ? pgtable_bad+0x80/0x80
[ 5.548499][ T0] ? register_lock_class+0xb40/0xb40
[ 5.553652][ T0] ? register_lock_class+0xb40/0xb40
[ 5.558805][ T0] ? snprintf+0xc0/0xc0
[ 5.562826][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 5.568327][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 5.573829][ T0] ? console_unlock+0x3e5/0x740
[ 5.578546][ T0] ? console_unlock+0x3ff/0x740
[ 5.583265][ T0] ? __bad_area_nosemaphore+0x66/0x230
[ 5.588592][ T0] ? do_raw_spin_lock+0x11e/0x1e0
[ 5.593485][ T0] ? rwlock_bug.part.0+0x60/0x60
[ 5.598290][ T0] ? bad_area_nosemaphore+0x16/0x20
[ 5.603354][ T0] ? do_page_fault+0x44b/0x9d7
[ 5.607985][ T0] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 5.613401][ T0] ? page_fault+0x34/0x40
[ 5.617594][ T0] ? dump_pagetable+0xf3/0x3b0
[ 5.622224][ T0] ? __asan_load8+0x40/0xb0
[ 5.626590][ T0] ? dump_pagetable+0xf3/0x3b0
[ 5.631219][ T0] ? vmalloc_fault+0x450/0x450
[ 5.635849][ T0] ? search_exception_tables+0x4c/0x50
[ 5.641176][ T0] ? fixup_exception+0x38/0x92
[ 5.645809][ T0] ? no_context.cold.21+0x160/0x2e0
[ 5.650874][ T0] ? pgtable_bad+0x80/0x80
[ 5.655154][ T0] ? register_lock_class+0xb40/0xb40
[ 5.660307][ T0] ? register_lock_class+0xb40/0xb40
[ 5.665461][ T0] ? snprintf+0xc0/0xc0
[ 5.669480][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 5.674983][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 5.680485][ T0] ? console_unlock+0x3e5/0x740
[ 5.685202][ T0] ? console_unlock+0x3ff/0x740
[ 5.689921][ T0] ? __bad_area_nosemaphore+0x66/0x230
[ 5.695248][ T0] ? do_raw_spin_lock+0x11e/0x1e0
[ 5.700141][ T0] ? rwlock_bug.part.0+0x60/0x60
[ 5.704945][ T0] ? bad_area_nosemaphore+0x16/0x20
[ 5.710010][ T0] ? do_page_fault+0x44b/0x9d7
[ 5.714639][ T0] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 5.720056][ T0] ? page_fault+0x34/0x40
[ 5.724249][ T0] ? dump_pagetable+0xf3/0x3b0
[ 5.728878][ T0] ? __asan_load8+0x40/0xb0
[ 5.733246][ T0] ? dump_pagetable+0xf3/0x3b0
[ 5.737875][ T0] ? vmalloc_fault+0x450/0x450
[ 5.742505][ T0] ? search_exception_tables+0x4c/0x50
[ 5.747832][ T0] ? fixup_exception+0x38/0x92
[ 5.752462][ T0] ? no_context.cold.21+0x160/0x2e0
[ 5.757529][ T0] ? pgtable_bad+0x80/0x80
[ 5.761809][ T0] ? register_lock_class+0xb40/0xb40
[ 5.766962][ T0] ? register_lock_class+0xb40/0xb40
[ 5.772118][ T0] ? snprintf+0xc0/0xc0
[ 5.776137][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 5.781639][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 5.787144][ T0] ? console_unlock+0x3e5/0x740
[ 5.791857][ T0] ? console_unlock+0x3ff/0x740
[ 5.796576][ T0] ? __bad_area_nosemaphore+0x66/0x230
[ 5.801903][ T0] ? do_raw_spin_lock+0x11e/0x1e0
[ 5.806796][ T0] ? rwlock_bug.part.0+0x60/0x60
[ 5.811601][ T0] ? bad_area_nosemaphore+0x16/0x20
[ 5.816665][ T0] ? do_page_fault+0x44b/0x9d7
[ 5.821295][ T0] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 5.826711][ T0] ? page_fault+0x34/0x40
[ 5.830904][ T0] ? dump_pagetable+0xf3/0x3b0
[ 5.835534][ T0] ? __asan_load8+0x40/0xb0
[ 5.839901][ T0] ? dump_pagetable+0xf3/0x3b0
[ 5.844530][ T0] ? vmalloc_fault+0x450/0x450
[ 5.849160][ T0] ? search_exception_tables+0x4c/0x50
[ 5.854487][ T0] ? fixup_exception+0x38/0x92
[ 5.859120][ T0] ? no_context.cold.21+0x160/0x2e0
[ 5.864185][ T0] ? pgtable_bad+0x80/0x80
[ 5.868465][ T0] ? register_lock_class+0xb40/0xb40
[ 5.873618][ T0] ? register_lock_class+0xb40/0xb40
[ 5.878773][ 6.262157][ T0] ? __asan_load8+0x40/0xb0
[ 6.266523][ T0] ? dump_pagetable+0xf3/0x3b0
[ 6.271152][ T0] ? vmalloc_fault+0x450/0x450
[ 6.275782][ T0] ? search_exception_tables+0x4c/0x50
[ 6.281109][ T0] ? fixup_exception+0x38/0x92
[ 6.285739][ T0] ? no_context.cold.21+0x160/0x2e0
[ 6.290807][ T0] ? pgtable_bad+0x80/0x80
[ 6.295087][ T0] ? register_lock_class+0xb40/0xb40
[ 6.300240][ T0] ? register_lock_class+0xb40/0xb40
[ 6.305394][ T0] ? snprintf+0xc0/0xc0
[ 6.309413][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 6.314917][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 6.320419][ T0] ? console_unlock+0x3e5/0x740
[ 6.325134][ T0] ? console_unlock+0x3ff/0x740
[ 6.329852][ T0] ? __bad_area_nosemaphore+0x66/0x230
[ 6.335180][ T0] ? do_raw_spin_lock+0x11e/0x1e0
[ 6.340072][ T0] ? rwlock_bug.part.0+0x60/0x60
[ 6.344877][ T0] ? bad_area_nosemaphore+0x16/0x20
[ 6.349944][ T0] ? do_page_fault+0x44b/0x9d7x11/0x60
[ 6.741538][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 6.747040][ T0] ? console_unlock+0x3e5/0x740
[ 6.751757][ T0] ? console_unlock+0x3ff/0x740
[ 6.756476][ T0] ? __bad_area_nosemaphore+0x66/0x230
[ 6.761803][ T0] ? do_raw_spin_lock+0x11e/0x1e0
[ 6.766694][ T0] ? rwlock_bug.part.0+0x60/0x60
[ 6.771500][ T0] ? bad_area_nosemaphore+0x16/0x20
[ 6.776565][ T0] ? do_page_fault+0x44b/0x9d7
[ 6.781196][ T0] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 6.786610][ T0] ? page_fault+0x34/0x40
[ 6.790805][ T0] ? dump_pagetable+0xf3/0x3b0
[ 6.795432][ T0] ? __asan_load8+0x40/0xb0
[ 6.799801][ T0] ? dump_pagetable+0xf3/0x3b0
[ 6.804430][ T0] ? vmalloc_fault+0x450/0x450
[ 6.809060][ T0] ? search_exception_tables+0x4c/0x50
[ 6.814387][ T0] ? fixup_exception+0x38/0x92
[ 6.819017][ T0] ? no_context.cold.21+0x160/0x2e0
[ 6.824084][ T0] ? pgtable_bad+0x80/0x80
[ 6.828363][ T0] ? register_lock_class+0xb40/0xb40
[ 6.833 T0] ? dump_pagetable+0xf3/0x3b0
[ 7.222054][ T0] ? __asan_load8+0x40/0xb0
[ 7.226423][ T0] ? dump_pagetable+0xf3/0x3b0
[ 7.231052][ T0] ? vmalloc_fault+0x450/0x450
[ 7.235682][ T0] ? search_exception_tables+0x4c/0x50
[ 7.241009][ T0] ? fixup_exception+0x38/0x92
[ 7.245640][ T0] ? no_context.cold.21+0x160/0x2e0
[ 7.250704][ T0] ? pgtable_bad+0x80/0x80
[ 7.254985][ T0] ? register_lock_class+0xb40/0xb40
[ 7.260140][ T0] ? register_lock_class+0xb40/0xb40
[ 7.265293][ T0] ? snprintf+0xc0/0xc0
[ 7.269312][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 7.274816][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 7.280317][ T0] ? console_unlock+0x3e5/0x740
[ 7.285035][ T0] ? console_unlock+0x3ff/0x740
[ 7.289751][ T0] ? __bad_area_nosemaphore+0x66/0x230
[ 7.295080][ T0] ? do_raw_spin_lock+0x11e/0x1e0
[ 7.299970][ T0] ? rwlock_bug.part.0+0x60pgtable_bad+0x80/0x80
[ 7.788263][ T0] ? register_lock_class+0xb40/0xb40
[ 7.793415][ T0] ? register_lock_class+0xb40/0xb40
[ 7.798571][ T0] ? snprintf+0xc0/0xc0
[ 7.802590][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 7.808094][ T0] ? debug_lockdep_rcu_enabled+0x11/0x60
[ 7.813595][ T0] ? console_unlock+0x3e5/0x740
[ 7.818312][ T0] ? console_unlock+0x3ff/0x740
[ 7.823030][ T0] ? __bad_area_nosemaphore+0x66/0x230
[ 7.828357][ T0] ? do_raw_spin_lock+0x11e/0x1e0
[ 7.833250][ T0] ? rwlock_bug.part.0+0x60/0x60
[ 7.838054][ T0] ? bad_area_nosemaphore+0x16/0x20
[ 7.843118][ T0] ? do_page_fault+0x44b/0x9d7
[ 7.847751][ T0] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 7.853165][ T0] ? page_fault+0x34/0x40
[ 7.857357][ T0] ? dump_pagetable+0xf3/0x3b0
[ 7.861987][ T0] ? __asan_load8+0x40/0xb0
[ 7.866356][ T0] ? dump_pagetable+0xf3/0x3b0
[ 7.870985][ T0] ? vmalloc_fault+0x450/0x450
[ 7.875613][ T0] ? search_exception_tables 8.264676][ T0] ? bad_area_nosemaphore+0x16/0x20
[ 8.269740][ T0] ? do_page_fault+0x44b/0x9d7
[ 8.274372][ T0] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 8.279787][ T0] ? page_fault+0x34/0x40
[ 8.283980][ T0] ? dump_pagetable+0xf3/0x3b0
[ 8.288611][ T0] ? __asan_load8+0x40/0xb0
[ 8.292975][ T0] ? dump_pagetable+0xf3/0x3b0
[ 8.297607][ T0] ? vmalloc_fault+0x450/0x450
[ 8.302237][ T0] ? __kasan_check_write+0x14/0x20
[ 8.307220][ T0] ? debug_locks_off+0x44/0x70
[ 8.311845][ T0] ? no_context.cold.21+0x160/0x2e0
[ 8.316910][ T0] ? __kasan_check_read+0x11/0x20
[ 8.321802][ T0] ? pgtable_bad+0x80/0x80


2020-04-22 16:22:23

by Borislav Petkov

Subject: Re: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()"

On Wed, Apr 22, 2020 at 11:55:54AM -0400, Qian Cai wrote:
> Reverted the linux-next commit and its dependency,
>
> a85573f7e741 ("x86/mm: Unexport __cachemode2pte_tbl”)
> 9e294786c89a (“x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()”)
>
> fixed crashes or hard reset on AMD machines during boot that have been flagged by
> KASAN in different forms indicating some sort of memory corruption with this config,
>
> https://raw.githubusercontent.com/cailca/linux-mm/master/x86.config

What is the special thing about this config? You have KASAN enabled and?
Anything else?

I need to know what are the relevant switches you've enabled so that I
can enable them on my box too and try to reproduce.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-04-22 16:38:51

by Qian Cai

Subject: Re: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()"



> On Apr 22, 2020, at 12:18 PM, Borislav Petkov <[email protected]> wrote:
>
> What is the special thing about this config? You have KASAN enabled and?
> Anything else?
>
> I need to know what are the relevant switches you've enabled so that I
> can enable them on my box too and try to reproduce.

The config has a few extra memory debugging options enabled like KASAN, debug_pagealloc, debug_vm etc. The affected machines are NUMA AMD servers.

2020-04-22 16:51:36

by Borislav Petkov

Subject: Re: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()"

On Wed, Apr 22, 2020 at 12:35:08PM -0400, Qian Cai wrote:
> The config has a few extra memory debugging options enabled like
> KASAN, debug_pagealloc, debug_vm etc.

How about you specify exactly which CONFIG_ switches and cmdline options
you have enabled deliberately? I can rhyme up the rest from the .config
file.

Full dmesg would be good too, sent privately's fine too.

"etc." is not good enough.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-04-22 17:03:57

by Christoph Hellwig

Subject: Re: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()"

On Wed, Apr 22, 2020 at 11:55:54AM -0400, Qian Cai wrote:
> Reverted the linux-next commit and its dependency,
>
> a85573f7e741 ("x86/mm: Unexport __cachemode2pte_tbl”)
> 9e294786c89a (“x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()”)
>
> fixed crashes or hard reset on AMD machines during boot that have been flagged by
> KASAN in different forms indicating some sort of memory corruption with this config,

Interesting. Your config seems to boot fine in my VM until the point
where the lack of virtio-blk support stops it from mounting the root
file system.

Looking at the patch I found one bug, although that should not affect
your config (it should use the pgprotval_t type), and one difference
that could affect code generation, although I prefer the new version
(use of __pgprot vs a local variable + pgprot_val()).

Two patches attached, can you try them?


Attachments:
(No filename) (905.00 B)
0001-x86-Use-pgprotval_t-in-protval_4k_2_large-and-pgprot.patch (1.38 kB)
0002-foo.patch (1.22 kB)

2020-04-22 18:37:13

by Qian Cai

Subject: Re: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()"



> On Apr 22, 2020, at 1:01 PM, Christoph Hellwig <[email protected]> wrote:
>
> On Wed, Apr 22, 2020 at 11:55:54AM -0400, Qian Cai wrote:
>> Reverted the linux-next commit and its dependency,
>>
>> a85573f7e741 ("x86/mm: Unexport __cachemode2pte_tbl”)
>> 9e294786c89a (“x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()”)
>>
>> fixed crashes or hard reset on AMD machines during boot that have been flagged by
>> KASAN in different forms indicating some sort of memory corruption with this config,
>
> Interesting. Your config seems to boot fine in my VM until the point
> where the lack of virtio-blk support stops it from mounting the root
> file system.
>
> Looking at the patch I found one bug, although that should not affect
> your config (it should use the pgprotval_t type), and one difference
> that could affect code generation, although I prefer the new version
> (use of __pgprot vs a local variable + pgprot_val()).
>
> Two patches attached, can you try them?
> <0001-x86-Use-pgprotval_t-in-protval_4k_2_large-and-pgprot.patch><0002-foo.patch>

Yes, but both patches do not help here. This time flagged by UBSAN,

static void dump_pagetable(unsigned long address)
{
	pgd_t *base = __va(read_cr3_pa());
	pgd_t *pgd = base + pgd_index(address);	<-- shift-out-of-bounds here

[ 4.452663][ T0] ACPI: LAPIC_NMI (acpi_id[0x73] high level lint[0x1])
[ 4.459391][ T0] ACPI: LAPIC_NMI (acpi_id[0x74] high level lint[0x1])
[ 4.466115][ T0] ACPI: LAPIC_NMI (acpi_id[0x75] high level lint[0x1])
[ 4.472842][ T0] ACPI: LAPIC_NMI (acpi_id[0x76] high level lint[0x1])
[ 4.479567][ T0] ACPI: LAPIC_NMI (acpi_id[0x77] high level lint[0x1])
[ 4.486294][ T0] ACPI: LAPIC_NMI (acpi_id[0x78] high level lint[0x1])
[ 4.493021][ T0] ACPI: LAPIC_NMI (acpi_id[0x79] high level lint[0x1])
[ 4.499745][ T0] ACPI: LAPIC_NMI (acpi_id[0x7a] high level lint[0x1])
[ 4.506471][ T0] ACPI: LAPIC_NMI (acpi_id[0x7b] high level liad access in kernel mode
[ 4.901030][ T0] #PF: error_code(0x0000) - not-present page
[ 4.906884][ T0] BUG: unable to handle page fault for address: ffffed11509c29da
[ 4.914483][ T0] #PF: supervisor read access in kernel mode
[ 4.920334][ T0] #PF: error_code(0x0000) - not-present page
[ 4.926189][ T0] BUG: unable to handle page fault for address: ffffed11509c29da
[ 4.933786][ T0] #PF: supervisor read access in kernel mode
[ 4.939640][ T0] #PF: error_code(0x0000) - not-present page
[ 4.945492][ T0] BUG: unable to handle page fault for address: ffffed11509c29da
[ 4.953091][ T0] #PF: supervisor read access in kernel mode
[ 4.958943][ T0] #PF: error_code(0x0000) - not-present page
[ 4.964797][ T0] BUG: unable to handle page fault for address: ffffed11509c29da
[ 4.972395][ T0] #PF: supervisor read access in kernel mode
[ 4.978247][ T0] #PF: error_code(0x0000) - not-present page
[ 4.984102][ T0] BUG: unable to handle page fault for address: ffffed11509c29da
[ 4.9917age fault for address: ffffed11509c29da
[ 5.481007][ T0] #PF: supervisor read access in kernel mode
[ 5.486862][ T0] #PF: error_code(0x0000) - not-present page
[ 5.492713][ T0] BUG: unable to handle page fault for address: ffffed11509c29da
[ 5.500314][ T0] #PF: supervisor read access in kernel mode
[ 5.506165][ T0] #PF: error_code(0x0000) - not-present page
[ 5.512020][ T0] ================================================================================
[ 5.521193][ T0] UBSAN: shift-out-of-bounds in arch/x86/mm/fault.c:450:22
[ 5.528268][ T0] shift exponent 4294967295 is too large for 64-bit type 'long unsigned int'
[ 5.536916][ T0] CPU: 0 PID: 0 Comm: swapper Tainted: G B 5.7.0-rc2-next-20200422+ #10
[ 5.546434][ T0] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
[ 5.555692][ T0] Call Trace:
[ 5.558837][ T0] ================================================================================
[ 5.568012][T0] BUG: unable to handle page fault for address: 0000000a2b84dda8
[ 5.961699][ T0] #PF: supervisor read access in kernel mode
[ 5.967550][ T0] #PF: error_code(0x0000) - not-present page
[ 5.973405][ T0] BUG: unable to handle page fault for address: 0000000a2b84dda8
[ 5.981005][ T0] #PF: supervisor read access in kernel mode
[ 5.986856][ T0] #PF: error_code(0x0000) - not-present page
[ 5.992708][ T0] BUG: unable to handle page fault for address: 0000000a2b84dda8
[ 6.000308][ T0] #PF: supervisor read access in kernel mode
[ 6.006159][ T0] #PF: error_code(0x0000) - not-present page
[ 6.012013][ T0] BUG: unable to handle page fault for address: 0000000a2b84dda8
[ 6.019612][ T0] #PF: supervisor read access in kernel mode

2020-04-22 18:56:04

by Qian Cai

Subject: Re: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()"



> On Apr 22, 2020, at 12:47 PM, Borislav Petkov <[email protected]> wrote:
>
> On Wed, Apr 22, 2020 at 12:35:08PM -0400, Qian Cai wrote:
>> The config has a few extra memory debugging options enabled like
>> KASAN, debug_pagealloc, debug_vm etc.
>
> How about you specify exactly which CONFIG_ switches and cmdline options
> you have enabled deliberately? I can rhyme up the rest from the .config
> file.

The thing is that pretty much the same debug kernel config has been in use
for a few years, so I did not deliberately enable anything new today.

The best bet is probably to skim through the “Kernel hacking” section of
the config and enable whatever you feel relevant if you have not enabled
already.

The cmdline is also in the .config via CONFIG_CMDLINE.

>
> Full dmesg would be good too, sent privately's fine too.

https://cailca.github.io/files/dmesg.txt

The file starts with the dmesg from the crashing boot, followed by the good
dmesg after reverting the commits (starting from line 644).

2020-04-22 21:35:22

by Qian Cai

Subject: Re: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()"



> On Apr 22, 2020, at 1:01 PM, Christoph Hellwig <[email protected]> wrote:
>
> On Wed, Apr 22, 2020 at 11:55:54AM -0400, Qian Cai wrote:
>> Reverted the linux-next commit and its dependency,
>>
>> a85573f7e741 ("x86/mm: Unexport __cachemode2pte_tbl”)
>> 9e294786c89a (“x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()”)
>>
>> fixed crashes or hard reset on AMD machines during boot that have been flagged by
>> KASAN in different forms indicating some sort of memory corruption with this config,
>
> Interesting. Your config seems to boot fine in my VM until the point
> where the lack of virtio-blk support stops it from mounting the root
> file system.
>
> Looking at the patch I found one bug, although that should not affect
> your config (it should use the pgprotval_t type), and one difference
> that could affect code generation, although I prefer the new version
> (use of __pgprot vs a local variable + pgprot_val()).
>
> Two patches attached, can you try them?
> <0001-x86-Use-pgprotval_t-in-protval_4k_2_large-and-pgprot.patch><0002-foo.patch>

This fixed the sucker,

diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index edf9cea4871f..c54d1d0a8e3b 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -708,7 +708,7 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)

 	set_pte((pte_t *)pud, pfn_pte(
 		(u64)addr >> PAGE_SHIFT,
-		__pgprot(protval_4k_2_large(pgprot_val(prot) | _PAGE_PSE))));
+		__pgprot(protval_4k_2_large(pgprot_val(prot)) | _PAGE_PSE)));

 	return 1;
 }
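
Why the position of one parenthesis matters can be sketched in a few lines of user-space C. On x86, bit 7 is PSE in a huge-page entry but PAT in a 4k PTE, and protval_4k_2_large() moves bit 7 to bit 12. The bit positions and the conversion helper below mirror arch/x86/include/asm/pgtable_types.h; the pud_prot_buggy()/pud_prot_fixed() wrappers, the sample protection value and the self-check are hypothetical, added only for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pgprotval_t;

#define _PAGE_BIT_PRESENT   0
#define _PAGE_BIT_RW        1
#define _PAGE_BIT_PSE       7   /* huge-page bit in PMD/PUD entries */
#define _PAGE_BIT_PAT       7   /* PAT bit in 4k PTEs: same position as PSE */
#define _PAGE_BIT_PAT_LARGE 12  /* PAT bit in huge-page entries */

#define _PAGE_PRESENT   ((pgprotval_t)1 << _PAGE_BIT_PRESENT)
#define _PAGE_RW        ((pgprotval_t)1 << _PAGE_BIT_RW)
#define _PAGE_PSE       ((pgprotval_t)1 << _PAGE_BIT_PSE)
#define _PAGE_PAT       ((pgprotval_t)1 << _PAGE_BIT_PAT)
#define _PAGE_PAT_LARGE ((pgprotval_t)1 << _PAGE_BIT_PAT_LARGE)

/* Same logic as the kernel helper: relocate the PAT bit from bit 7 to bit 12. */
static inline pgprotval_t protval_4k_2_large(pgprotval_t val)
{
	return (val & ~(_PAGE_PAT | _PAGE_PAT_LARGE)) |
		((val & _PAGE_PAT) << (_PAGE_BIT_PAT_LARGE - _PAGE_BIT_PAT));
}

/* Hypothetical wrapper: PSE is ORed in before the conversion (the bug). */
static inline pgprotval_t pud_prot_buggy(pgprotval_t prot)
{
	return protval_4k_2_large(prot | _PAGE_PSE);
}

/* Hypothetical wrapper: PSE is ORed in after the conversion (the fix). */
static inline pgprotval_t pud_prot_fixed(pgprotval_t prot)
{
	return protval_4k_2_large(prot) | _PAGE_PSE;
}

/* Illustrative self-check with a sample present + writable protection value. */
static inline void pud_prot_demo(void)
{
	pgprotval_t prot = _PAGE_PRESENT | _PAGE_RW;

	assert(pud_prot_buggy(prot) == 0x1003); /* PSE lost, PAT_LARGE set */
	assert(pud_prot_fixed(prot) == 0x83);   /* PSE kept, PAT_LARGE clear */
}
```

In the buggy form the PSE bit is mistaken for the 4k PAT bit and shifted away, so the resulting PUD entry is no longer marked as a huge page and carries a stray PAT_LARGE bit. That would be consistent with the page-table walks going wrong in the logs above.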

2020-04-22 21:51:58

by Borislav Petkov

Subject: Re: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()"

On Wed, Apr 22, 2020 at 05:32:00PM -0400, Qian Cai wrote:
> This fixed the sucker,
>
> diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> index edf9cea4871f..c54d1d0a8e3b 100644
> --- a/arch/x86/mm/pgtable.c
> +++ b/arch/x86/mm/pgtable.c
> @@ -708,7 +708,7 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
>
> set_pte((pte_t *)pud, pfn_pte(
> (u64)addr >> PAGE_SHIFT,
> - __pgprot(protval_4k_2_large(pgprot_val(prot) | _PAGE_PSE))));
> + __pgprot(protval_4k_2_large(pgprot_val(prot)) | _PAGE_PSE)));
>

Very good catch - that's one nasty wrongly placed closing bracket!
pmd_set_huge() has it correct.

Mind sending a proper patch?

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-04-22 21:58:47

by Qian Cai

Subject: Re: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()"



> On Apr 22, 2020, at 5:47 PM, Borislav Petkov <[email protected]> wrote:
>
> Very good catch - that's one nasty wrongly placed closing bracket!
> pmd_set_huge() has it correct.
>
> Mind sending a proper patch?

I thought Christ is going to send some minor updates anyway, so it may be better for him to include this one together? Otherwise, I am fine to send this one standalone.

2020-04-22 22:07:55

by Borislav Petkov

Subject: Re: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()"

On Wed, Apr 22, 2020 at 05:57:09PM -0400, Qian Cai wrote:
> I thought Christ is going to send some minor updates anyway, so it may
> be better for him to include this one together? Otherwise, I am fine to
> send this one standalone.

You mean Christoph.

Ok, I'll let you guys hash it out.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-04-23 06:12:30

by Christoph Hellwig

Subject: Re: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()"

On Thu, Apr 23, 2020 at 12:05:12AM +0200, Borislav Petkov wrote:
> On Wed, Apr 22, 2020 at 05:57:09PM -0400, Qian Cai wrote:
> > I thought Christ is going to send some minor updates anyway, so it may
> > be better for him to include this one together? Otherwise, I am fine to
> > send this one standalone.
>
> You mean Christoph.
>
> Ok, I'll let you guys hash it out.

I can send one, but given that Qian found it and fixed it I'd have
to attribute it to him anyway :)

This assumes you don't want a complete resend of the series, of course.

2020-04-23 10:50:04

by Qian Cai

Subject: Re: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()"



> On Apr 23, 2020, at 2:08 AM, Christoph Hellwig <[email protected]> wrote:
>
> I can send one, but given that Qian found it and fixed it I'd have
> to attribute it to him anyway :)
>
> This assumes you don't want a complete resend of the series, of course.

How about you send a single patch that includes this and the other pgprotval_t fix you mentioned earlier as well? Feel free to add my Reported-by; all I care about is closing out those bugs.

2020-04-23 11:08:20

by Borislav Petkov

Subject: Re: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()"

On April 23, 2020 12:47:15 PM GMT+02:00, Qian Cai <[email protected]> wrote:
>
>
>> On Apr 23, 2020, at 2:08 AM, Christoph Hellwig <[email protected]> wrote:
>>
>> I can send one, but given that Qian found it and fixed it I'd have
>> to attribute it to him anyway :)
>>
>> This assumes you don't want a complete resend of the series, of
>course.
>
>How about you send a single patch to include this and the the other
>pgprotval_t fix you mentioned early as well? Feel free to add my
>reported-by while all I care is to close out those bugs.

No need, I've rebased and testing. Stay tuned.


--
Sent from a small device: formatting sux and brevity is inevitable.

2020-04-23 11:24:27

by Qian Cai

Subject: Re: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()"



> On Apr 23, 2020, at 7:06 AM, Boris Petkov <[email protected]> wrote:
>
> No need, I've rebased and testing. Stay tuned.

Cool. I can only advocate taking another closer look at this patchset (it looks like it is going to break PAE without the pgprotval_t fix), because bugs do cluster.
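
The PAE concern can be sketched as follows. On 32-bit x86, unsigned long is 32 bits wide, while under PAE pgprotval_t is 64 bits, so routing a protection value through an unsigned long parameter drops the upper half. The snippet models that with fixed-width types; the ulong32_t typedef, the function names and the choice of NX (bit 63) as the bit that gets lost are illustrative assumptions, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t ulong32_t;    /* stands in for 'unsigned long' on 32-bit x86 */
typedef uint64_t pgprotval_t;  /* protection value width under PAE */

#define _PAGE_BIT_PAT       7
#define _PAGE_BIT_PAT_LARGE 12
#define _PAGE_BIT_NX        63

#define _PAGE_PAT       ((pgprotval_t)1 << _PAGE_BIT_PAT)
#define _PAGE_PAT_LARGE ((pgprotval_t)1 << _PAGE_BIT_PAT_LARGE)
#define _PAGE_NX        ((pgprotval_t)1 << _PAGE_BIT_NX)

/* Models the pre-fix signature: 32-bit parameter and return type. */
static inline ulong32_t protval_4k_2_large_ulong(ulong32_t val)
{
	return (val & ~(ulong32_t)(_PAGE_PAT | _PAGE_PAT_LARGE)) |
		((val & (ulong32_t)_PAGE_PAT) << (_PAGE_BIT_PAT_LARGE - _PAGE_BIT_PAT));
}

/* Models the post-fix signature: full-width pgprotval_t throughout. */
static inline pgprotval_t protval_4k_2_large(pgprotval_t val)
{
	return (val & ~(_PAGE_PAT | _PAGE_PAT_LARGE)) |
		((val & _PAGE_PAT) << (_PAGE_BIT_PAT_LARGE - _PAGE_BIT_PAT));
}

/* Illustrative self-check: NX survives only with the full-width type. */
static inline void pae_truncation_demo(void)
{
	pgprotval_t prot = _PAGE_NX | 0x3; /* NX + present + writable */

	/* The explicit cast mirrors the implicit narrowing at the call site. */
	assert(protval_4k_2_large_ulong((ulong32_t)prot) == 0x3);
	assert(protval_4k_2_large(prot) == (_PAGE_NX | 0x3));
}
```

On 64-bit kernels both types have the same width, which matches the earlier remark in this thread that the type bug should not affect the reported config.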

2020-04-23 12:27:40

by Borislav Petkov

Subject: Re: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()"

On Thu, Apr 23, 2020 at 07:21:50AM -0400, Qian Cai wrote:
> Cool. I can only advocate to take another closer look at this patchset
> (it looks like going to break PAE without the pgprotval_t fix),
> because bugs do cluster.

So, I took the pgprotval_t fix and tested it on two boxes. I'd
appreciate it if you ran tip:x86/mm on your machine too. tip-bot
notifications coming up.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Subject: [tip: x86/mm] x86/mm: Use pgprotval_t in protval_4k_2_large() and protval_large_2_4k()

The following commit has been merged into the x86/mm branch of tip:

Commit-ID: 325518e9b743686f471e7a4ef617b57c91386795
Gitweb: https://git.kernel.org/tip/325518e9b743686f471e7a4ef617b57c91386795
Author: Christoph Hellwig <[email protected]>
AuthorDate: Wed, 22 Apr 2020 18:53:08 +02:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Thu, 23 Apr 2020 11:38:42 +02:00

x86/mm: Use pgprotval_t in protval_4k_2_large() and protval_large_2_4k()

Use the proper type for "raw" page table values.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/include/asm/pgtable_types.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 567abdb..7b6ddcf 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -478,7 +478,7 @@ static inline pteval_t pte_flags(pte_t pte)

 unsigned long cachemode2protval(enum page_cache_mode pcm);

-static inline unsigned long protval_4k_2_large(unsigned long val)
+static inline pgprotval_t protval_4k_2_large(pgprotval_t val)
 {
 	return (val & ~(_PAGE_PAT | _PAGE_PAT_LARGE)) |
 		((val & _PAGE_PAT) << (_PAGE_BIT_PAT_LARGE - _PAGE_BIT_PAT));
@@ -487,7 +487,7 @@ static inline pgprot_t pgprot_4k_2_large(pgprot_t pgprot)
 {
 	return __pgprot(protval_4k_2_large(pgprot_val(pgprot)));
 }
-static inline unsigned long protval_large_2_4k(unsigned long val)
+static inline pgprotval_t protval_large_2_4k(pgprotval_t val)
 {
 	return (val & ~(_PAGE_PAT | _PAGE_PAT_LARGE)) |
 		((val & _PAGE_PAT_LARGE) >>