2024-04-01 18:16:57

by Reinette Chatre

Subject: [PATCH V2] x86/resctrl: Fix uninitialized memory read when last CPU of domain goes offline

Tony encountered the OOPS below when the last CPU of a domain goes
offline while running a kernel built with CONFIG_NO_HZ_FULL:

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
...
RIP: 0010:__find_nth_andnot_bit+0x66/0x110
...
Call Trace:
<TASK>
? __die+0x1f/0x60
? page_fault_oops+0x176/0x5a0
? exc_page_fault+0x7f/0x260
? asm_exc_page_fault+0x22/0x30
? __pfx_resctrl_arch_offline_cpu+0x10/0x10
? __find_nth_andnot_bit+0x66/0x110
? __cancel_work+0x7d/0xc0
cpumask_any_housekeeping+0x55/0x110
mbm_setup_overflow_handler+0x40/0x70
resctrl_offline_cpu+0x101/0x110
resctrl_arch_offline_cpu+0x19/0x260
cpuhp_invoke_callback+0x156/0x6b0
? cpuhp_thread_fun+0x5f/0x250
cpuhp_thread_fun+0x1ca/0x250
? __pfx_smpboot_thread_fn+0x10/0x10
smpboot_thread_fn+0x184/0x220
kthread+0xe0/0x110
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2d/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>

The NULL pointer dereference is encountered while searching for another
online CPU in the domain (of which there are none) that can be used to
run the MBM overflow handler.

Because the kernel is configured with CONFIG_NO_HZ_FULL the search for
another CPU (in its effort to prefer those CPUs that aren't marked
nohz_full) consults the mask representing the nohz_full CPUs,
tick_nohz_full_mask. On a kernel with CONFIG_CPUMASK_OFFSTACK=y
tick_nohz_full_mask is not allocated unless the kernel is booted with
the "nohz_full=" parameter and because of that any access to
tick_nohz_full_mask needs to be guarded with tick_nohz_full_enabled().

Replace the IS_ENABLED(CONFIG_NO_HZ_FULL) check with
tick_nohz_full_enabled(). The latter ensures tick_nohz_full_mask can be
accessed safely and can be used whether the kernel is built with
CONFIG_NO_HZ_FULL enabled or not.

Fixes: a4846aaf3945 ("x86/resctrl: Add cpumask_any_housekeeping() for limbo/overflow")
Reported-by: Tony Luck <[email protected]>
Closes: https://lore.kernel.org/lkml/ZgIFT5gZgIQ9A9G7@agluck-desk3/
Suggested-by: Ingo Molnar <[email protected]>
Signed-off-by: Reinette Chatre <[email protected]>
---
Changes since v1:
- Use Ingo's suggestion that combines the two NO_HZ checks into one.
- Tony provided tags, but since the patch changed so much I did not
  apply them to this version.

arch/x86/kernel/cpu/resctrl/internal.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index c99f26ebe7a6..1a8687f8073a 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -78,7 +78,8 @@ cpumask_any_housekeeping(const struct cpumask *mask, int exclude_cpu)
 	else
 		cpu = cpumask_any_but(mask, exclude_cpu);
 
-	if (!IS_ENABLED(CONFIG_NO_HZ_FULL))
+	/* Only continue if tick_nohz_full_mask has been initialized. */
+	if (!tick_nohz_full_enabled())
 		return cpu;
 
 	/* If the CPU picked isn't marked nohz_full nothing more needs doing. */
--
2.34.1
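
Note: the failure mode described in the commit message can be modelled in
userspace. The sketch below is only an illustration under stated
assumptions: every model_* name is a hypothetical stand-in (not the
kernel's cpumask API), the mask is a single 64-bit word, and the search
logic is simplified relative to cpumask_any_housekeeping(). It shows a
mask pointer that stays NULL unless the feature was enabled at "boot",
and a run-time guard that keeps the lookup from dereferencing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: at most 64 CPUs, one bit per CPU. */
#define MODEL_NR_CPUS 64

/*
 * Stand-in for tick_nohz_full_mask with CONFIG_CPUMASK_OFFSTACK=y:
 * a pointer that remains NULL unless "nohz_full=" was passed at boot.
 */
static const uint64_t *model_nohz_full_mask;
static bool model_nohz_full_running;

/* Stand-in for tick_nohz_full_enabled(): a run-time check, not a build-time one. */
static bool model_nohz_full_enabled(void)
{
	return model_nohz_full_running;
}

/* Lowest CPU set in mask other than exclude_cpu, or MODEL_NR_CPUS if none. */
static int model_any_but(uint64_t mask, int exclude_cpu)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (cpu != exclude_cpu && (mask & (UINT64_C(1) << cpu)))
			return cpu;
	}
	return MODEL_NR_CPUS;
}

/* Simplified model of the guarded lookup; prefers CPUs not marked nohz_full. */
static int model_any_housekeeping(uint64_t mask, int exclude_cpu)
{
	int cpu = model_any_but(mask, exclude_cpu);
	int hk_cpu;

	/* Only continue if the nohz_full mask has been initialized. */
	if (!model_nohz_full_enabled())
		return cpu;

	/* Past the guard the pointer must be valid in this model. */
	assert(model_nohz_full_mask != NULL);

	/* If the CPU picked isn't marked nohz_full nothing more needs doing. */
	if (cpu < MODEL_NR_CPUS && !(*model_nohz_full_mask & (UINT64_C(1) << cpu)))
		return cpu;

	/* Otherwise try to find a housekeeping CPU to use in preference. */
	hk_cpu = model_any_but(mask & ~*model_nohz_full_mask, exclude_cpu);
	return hk_cpu < MODEL_NR_CPUS ? hk_cpu : cpu;
}
```

With the guard in place, taking the last CPU of a one-CPU domain offline
(mask 0x1, exclude_cpu 0) returns MODEL_NR_CPUS without ever touching the
NULL pointer. A build-time check in the same position would fall through
to the dereference, which is the analogue of the OOPS reported above.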



2024-04-01 19:18:44

by Moger, Babu

Subject: Re: [PATCH V2] x86/resctrl: Fix uninitialized memory read when last CPU of domain goes offline



On 4/1/24 13:16, Reinette Chatre wrote:
> Tony encountered the OOPS below when the last CPU of a domain goes
> offline while running a kernel built with CONFIG_NO_HZ_FULL:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0
> Oops: 0000 [#1] PREEMPT SMP NOPTI
> ...
> RIP: 0010:__find_nth_andnot_bit+0x66/0x110
> ...
> Call Trace:
> <TASK>
> ? __die+0x1f/0x60
> ? page_fault_oops+0x176/0x5a0
> ? exc_page_fault+0x7f/0x260
> ? asm_exc_page_fault+0x22/0x30
> ? __pfx_resctrl_arch_offline_cpu+0x10/0x10
> ? __find_nth_andnot_bit+0x66/0x110
> ? __cancel_work+0x7d/0xc0
> cpumask_any_housekeeping+0x55/0x110
> mbm_setup_overflow_handler+0x40/0x70
> resctrl_offline_cpu+0x101/0x110
> resctrl_arch_offline_cpu+0x19/0x260
> cpuhp_invoke_callback+0x156/0x6b0
> ? cpuhp_thread_fun+0x5f/0x250
> cpuhp_thread_fun+0x1ca/0x250
> ? __pfx_smpboot_thread_fn+0x10/0x10
> smpboot_thread_fn+0x184/0x220
> kthread+0xe0/0x110
> ? __pfx_kthread+0x10/0x10
> ret_from_fork+0x2d/0x50
> ? __pfx_kthread+0x10/0x10
> ret_from_fork_asm+0x1a/0x30
> </TASK>
>
> The NULL pointer dereference is encountered while searching for another
> online CPU in the domain (of which there are none) that can be used to
> run the MBM overflow handler.
>
> Because the kernel is configured with CONFIG_NO_HZ_FULL the search for
> another CPU (in its effort to prefer those CPUs that aren't marked
> nohz_full) consults the mask representing the nohz_full CPUs,
> tick_nohz_full_mask. On a kernel with CONFIG_CPUMASK_OFFSTACK=y
> tick_nohz_full_mask is not allocated unless the kernel is booted with
> the "nohz_full=" parameter and because of that any access to
> tick_nohz_full_mask needs to be guarded with tick_nohz_full_enabled().
>
> Replace the IS_ENABLED(CONFIG_NO_HZ_FULL) check with
> tick_nohz_full_enabled(). The latter ensures tick_nohz_full_mask can be
> accessed safely and can be used whether the kernel is built with
> CONFIG_NO_HZ_FULL enabled or not.
>
> Fixes: a4846aaf3945 ("x86/resctrl: Add cpumask_any_housekeeping() for limbo/overflow")
> Reported-by: Tony Luck <[email protected]>
> Closes: https://lore.kernel.org/lkml/ZgIFT5gZgIQ9A9G7@agluck-desk3/
> Suggested-by: Ingo Molnar <[email protected]>
> Signed-off-by: Reinette Chatre <[email protected]>

Reviewed-by: Babu Moger <[email protected]>


--
Thanks
Babu Moger