2023-10-24 14:28:58

by Uros Bizjak

Subject: [PATCH] x86/percpu: Return correct variable from current_top_of_stack()

current_top_of_stack() should return the variable from the __seg_gs
qualified named address space when CONFIG_USE_X86_SEG_SUPPORT
is enabled.

Fixes: ed2f752e0e0a ("x86/percpu: Introduce const-qualified const_pcpu_hot to micro-optimize code generation")
Cc: Andy Lutomirski <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Signed-off-by: Uros Bizjak <[email protected]>
---
arch/x86/include/asm/processor.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a807025a4dee..4b130d894cb6 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -534,7 +534,7 @@ static __always_inline unsigned long current_top_of_stack(void)
 	 *  entry trampoline.
 	 */
 	if (IS_ENABLED(CONFIG_USE_X86_SEG_SUPPORT))
-		return pcpu_hot.top_of_stack;
+		return const_pcpu_hot.top_of_stack;
 
 	return this_cpu_read_stable(pcpu_hot.top_of_stack);
 }
--
2.41.0


2023-10-24 15:57:17

by Borislav Petkov

Subject: Re: [PATCH] x86/percpu: Return correct variable from current_top_of_stack()

On Tue, Oct 24, 2023 at 04:28:14PM +0200, Uros Bizjak wrote:
> current_top_of_stack() should return the variable from the __seg_gs
> qualified named address space when CONFIG_USE_X86_SEG_SUPPORT
> is enabled.

I presume you're sending those two in order to fix stuff like the splat
below which fires in my guest with latest Linus + latest tip/master
lineup.

Because disabling CONFIG_USE_X86_SEG_SUPPORT fixes it.

I'm wondering whether, this close to the merge window, we should delay
all that new and fancy percpu stuff one more round until it is tested
more widely...

[ 1.623994] kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.
[ 1.627398] HugeTLB: registered 1.00 GiB page size, pre-allocated 0 pages
[ 1.627101] BUG: unable to handle page fault for address: 000000000002f0d8
[ 1.629645] HugeTLB: 16380 KiB vmemmap can be freed for a 1.00 GiB page
[ 1.628158] #PF: supervisor read access in kernel mode
[ 1.628161] #PF: error_code(0x0000) - not-present page
[ 1.628163] PGD 0 P4D 0
[ 1.628167] Oops: 0000 [#1] PREEMPT SMP
[ 1.628171] CPU: 1 PID: 10 Comm: kworker/u32:0 Not tainted 6.6.0-rc7+ #1
[ 1.631566] HugeTLB: registered 2.00 MiB page size, pre-allocated 0 pages
[ 1.629156] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 1.632494] HugeTLB: 28 KiB vmemmap can be freed for a 2.00 MiB page
[ 1.629990] Workqueue: ftrace_check_wq ftrace_check_work_func
[ 1.631041] RIP: 0010:raw_irqentry_exit_cond_resched+0x16/0x50
[ 1.631041] Code: 00 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 65 f7 05 d4 ff ef 7e ff ff ff 7f 75 21 <48> 8b 05 db ff ef 7e 48 29 e0 48 3d ff 3f 00 00 77 19 65 48 8b 05
[ 1.631041] RSP: 0018:ffffc9000005bab8 EFLAGS: 00010046
[ 1.631041] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000002f900
[ 1.631041] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffc9000005bac8
[ 1.631041] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000001
[ 1.631041] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 1.631041] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1.631041] FS: 0000000000000000(0000) GS:ffff88807da40000(0000) knlGS:0000000000000000
[ 1.631041] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.631041] CR2: 000000000002f0d8 CR3: 0000000002416000 CR4: 00000000003506f0
[ 1.631041] Call Trace:
[ 1.631041] <TASK>
[ 1.631041] ? __die+0x31/0x80
[ 1.631041] ? page_fault_oops+0x160/0x440
[ 1.631041] ? exc_page_fault+0x74/0x150
[ 1.631041] ? asm_exc_page_fault+0x26/0x30
[ 1.631041] ? raw_irqentry_exit_cond_resched+0x16/0x50
[ 1.631041] irqentry_exit+0x21/0x60
[ 1.631041] asm_sysvec_apic_timer_interrupt+0x1a/0x20
[ 1.631041] RIP: 0010:get_symbol_offset+0x26/0x60
[ 1.631041] Code: 90 90 90 90 0f 1f 44 00 00 48 89 f8 48 c1 e8 08 8b 04 85 80 4f 0b 82 48 05 88 af f1 81 81 e7 ff 00 00 00 74 25 31 c9 0f b6 10 <84> d2 79 0e 0f b6 70 01 83 e2 7f c1 e6 07 09 f2 ff c2 ff c2 ff c1

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette
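
One way to read the Code: line in the splat above: the byte in angle
brackets is the first byte of the instruction at the faulting RIP, and a
%gs-relative access would start with the 0x65 segment-override prefix.
The sketch below is a hypothetical helper (faulting_insn_has_gs_prefix()
is an illustrative name, not part of any kernel tool) that checks the
marked byte. It only looks at that one byte, so it is a heuristic, not a
disassembler.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Legacy segment-override prefix byte that selects %gs on x86. */
#define GS_PREFIX 0x65

/*
 * Hypothetical helper: inspect an oops "Code:" byte dump.
 *
 * Returns 1 if the byte marked <xx> (the first byte of the instruction
 * at the faulting RIP) is the %gs override prefix, 0 if it is any other
 * byte, and -1 if the dump contains no marker.
 */
static int faulting_insn_has_gs_prefix(const char *code)
{
	const char *mark;

	assert(code != NULL);

	mark = strchr(code, '<');
	if (!mark)
		return -1;

	/* strtoul() stops at the closing '>' after the two hex digits. */
	return strtoul(mark + 1, NULL, 16) == GS_PREFIX;
}
```

For the tail of the dump above, "75 21 <48> 8b 05 db ff ef 7e", the
helper returns 0: the marked byte is 0x48, so the load at the faulting
RIP appears to carry no %gs override, whereas the instruction shortly
before it begins with 65.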

2023-10-24 16:05:52

by Uros Bizjak

Subject: Re: [PATCH] x86/percpu: Return correct variable from current_top_of_stack()

On Tue, Oct 24, 2023 at 5:56 PM Borislav Petkov <[email protected]> wrote:
>
> On Tue, Oct 24, 2023 at 04:28:14PM +0200, Uros Bizjak wrote:
> > current_top_of_stack() should return the variable from the __seg_gs
> > qualified named address space when CONFIG_USE_X86_SEG_SUPPORT
> > is enabled.
>
> I presume you're sending those two in order to fix stuff like the splat
> below which fires in my guest with latest Linus + latest tip/master
> lineup.

Yes, the first one is the fix, the second one is only tangentially
related to the fix.

> Because disabling CONFIG_USE_X86_SEG_SUPPORT fixes it.
>
> I'm wondering whether, this close to the merge window, we should delay
> all that new and fancy percpu stuff one more round until it is tested
> more widely...

The percpu stuff won't be merged for 6.7, it will have to sit out until 6.8.

Thanks,
Uros.


>
> [ 1.623994] kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.
> [ 1.627398] HugeTLB: registered 1.00 GiB page size, pre-allocated 0 pages
> [ 1.627101] BUG: unable to handle page fault for address: 000000000002f0d8
> [ 1.629645] HugeTLB: 16380 KiB vmemmap can be freed for a 1.00 GiB page
> [ 1.628158] #PF: supervisor read access in kernel mode
> [ 1.628161] #PF: error_code(0x0000) - not-present page
> [ 1.628163] PGD 0 P4D 0
> [ 1.628167] Oops: 0000 [#1] PREEMPT SMP
> [ 1.628171] CPU: 1 PID: 10 Comm: kworker/u32:0 Not tainted 6.6.0-rc7+ #1
> [ 1.631566] HugeTLB: registered 2.00 MiB page size, pre-allocated 0 pages
> [ 1.629156] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> [ 1.632494] HugeTLB: 28 KiB vmemmap can be freed for a 2.00 MiB page
> [ 1.629990] Workqueue: ftrace_check_wq ftrace_check_work_func
> [ 1.631041] RIP: 0010:raw_irqentry_exit_cond_resched+0x16/0x50
> [ 1.631041] Code: 00 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 65 f7 05 d4 ff ef 7e ff ff ff 7f 75 21 <48> 8b 05 db ff ef 7e 48 29 e0 48 3d ff 3f 00 00 77 19 65 48 8b 05
> [ 1.631041] RSP: 0018:ffffc9000005bab8 EFLAGS: 00010046
> [ 1.631041] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000002f900
> [ 1.631041] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffc9000005bac8
> [ 1.631041] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000001
> [ 1.631041] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
> [ 1.631041] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [ 1.631041] FS: 0000000000000000(0000) GS:ffff88807da40000(0000) knlGS:0000000000000000
> [ 1.631041] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1.631041] CR2: 000000000002f0d8 CR3: 0000000002416000 CR4: 00000000003506f0
> [ 1.631041] Call Trace:
> [ 1.631041] <TASK>
> [ 1.631041] ? __die+0x31/0x80
> [ 1.631041] ? page_fault_oops+0x160/0x440
> [ 1.631041] ? exc_page_fault+0x74/0x150
> [ 1.631041] ? asm_exc_page_fault+0x26/0x30
> [ 1.631041] ? raw_irqentry_exit_cond_resched+0x16/0x50
> [ 1.631041] irqentry_exit+0x21/0x60
> [ 1.631041] asm_sysvec_apic_timer_interrupt+0x1a/0x20
> [ 1.631041] RIP: 0010:get_symbol_offset+0x26/0x60
> [ 1.631041] Code: 90 90 90 90 0f 1f 44 00 00 48 89 f8 48 c1 e8 08 8b 04 85 80 4f 0b 82 48 05 88 af f1 81 81 e7 ff 00 00 00 74 25 31 c9 0f b6 10 <84> d2 79 0e 0f b6 70 01 83 e2 7f c1 e6 07 09 f2 ff c2 ff c2 ff c1
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
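
The addresses in the quoted splat are consistent with a per-CPU variable
being read without its %gs base. As a rough sketch, assuming CR2
(0x2f0d8) is the per-CPU offset of the variable being loaded (an
inference from the trace, not something stated in the thread), the
arithmetic looks like this. The helper names are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the oops in this thread. */
#define SPLAT_GS_BASE 0xffff88807da40000ULL	/* "GS:" field */
#define SPLAT_CR2     0x000000000002f0d8ULL	/* faulting address */

/*
 * Address touched by a %gs-relative load: segment base plus offset.
 * This is what a __seg_gs qualified access is expected to produce.
 */
static uint64_t gs_relative_address(uint64_t gs_base, uint64_t offset)
{
	return gs_base + offset;
}

/*
 * Address touched by a load with no segment override. In 64-bit mode
 * the default data segment base is treated as zero, so the offset is
 * used as the address directly.
 */
static uint64_t plain_address(uint64_t offset)
{
	return offset;
}
```

Under that assumption, gs_relative_address(SPLAT_GS_BASE, SPLAT_CR2)
gives 0xffff88807da6f0d8, a plausible per-CPU address, while
plain_address(SPLAT_CR2) gives 0x2f0d8, which matches the address the
kernel failed to read.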

Subject: [tip: x86/percpu] x86/percpu: Return correct variable from current_top_of_stack()

The following commit has been merged into the x86/percpu branch of tip:

Commit-ID: 0548eb067ed664b93043e033295ca71e3e706245
Gitweb: https://git.kernel.org/tip/0548eb067ed664b93043e033295ca71e3e706245
Author: Uros Bizjak <[email protected]>
AuthorDate: Tue, 24 Oct 2023 16:28:14 +02:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Tue, 24 Oct 2023 18:37:20 +02:00

x86/percpu: Return correct variable from current_top_of_stack()

current_top_of_stack() should return the variable from the __seg_gs
qualified named address space when CONFIG_USE_X86_SEG_SUPPORT=y.

Fixes: ed2f752e0e0a ("x86/percpu: Introduce const-qualified const_pcpu_hot to micro-optimize code generation")
Signed-off-by: Uros Bizjak <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Sean Christopherson <[email protected]>
---
arch/x86/include/asm/processor.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index b47a997..f20e876 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -519,7 +519,7 @@ static __always_inline unsigned long current_top_of_stack(void)
 	 *  entry trampoline.
 	 */
 	if (IS_ENABLED(CONFIG_USE_X86_SEG_SUPPORT))
-		return pcpu_hot.top_of_stack;
+		return const_pcpu_hot.top_of_stack;
 
 	return this_cpu_read_stable(pcpu_hot.top_of_stack);
 }