2022-07-27 03:19:22

by Chen Zhongjin

[permalink] [raw]
Subject: [PATCH] Revert "x86/unwind/orc: Don't skip the first frame for inactive tasks"

This reverts commit f1d9a2abff66aa8156fbc1493abed468db63ea48.

When CONFIG_GCOV_PROFILE_ALL is enabled, show_stack() and related
functions (e.g. dump_stack) will break for x86 ORC unwinder.

Call Trace:
<TASK>
? dump_stack_lvl+0x83/0xb7
? schedule+0x1/0x190
? dump_stack+0x13/0x1f
? handler_pre0+0x3f/0x53 [kp_unwind]
...

show_trace_log_lvl() searches text address on stack to validate
whether unwind results are reliable. The code:

for (; stack < stack_info.end; stack++) {
...
if (stack == ret_addr_p)
reliable = 1;
...
if (!reliable)
continue;
...
}

This requires:

*stack* <= ret_addr_p

So that the first ret_addr_p can be found when stack++.

In normal cases the frame of show_stack() should be optimized out.
However if it is not optimized such as CONFIG_GCOV_PROFILE_ALL=y,
unwind_start() will stop at show_stack(), where:

state->sp == first_frame == *stack*

And this will causes:

ret_addr_p = unwind_get_return_address_ptr = state->sp - 1
=> *stack* > ret_addr_p

Then reliable check will ignore all unwind because first ret_addr_p
can't be found.

'f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks")'

This patch removed the equal condition when state->sp == first_frame
which makes frame of show_stack() not be skipped. But the reason to
do that is not established now:

'f2ac57a4c49d ("x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels")'

state->sp = first_frame + sizeof(*frame),

state->sp and first_frame can't be equal for inactive stack any more.

Regard this equal condition doesn't involve other cases now,
revert it to fix above problem.

After revert, stack can be printed right:

Call Trace:
<TASK>
dump_stack_lvl+0x83/0xb7
? schedule+0x1/0x190
dump_stack+0x13/0x1f
handler_pre0+0x3f/0x53 [kp_unwind]
...

Fixes: f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks")
Signed-off-by: Chen Zhongjin <[email protected]>
---
arch/x86/kernel/unwind_orc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 38185aedf7d1..514dc9ef99fe 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -708,7 +708,7 @@ void __unwind_start(struct unwind_state *state, struct task_struct *task,
/* Otherwise, skip ahead to the user-specified starting frame: */
while (!unwind_done(state) &&
(!on_stack(&state->stack_info, first_frame, sizeof(long)) ||
- state->sp < (unsigned long)first_frame))
+ state->sp <= (unsigned long)first_frame))
unwind_next_frame(state);

return;
--
2.17.1


2022-08-04 05:28:27

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH] Revert "x86/unwind/orc: Don't skip the first frame for inactive tasks"

On Wed, Jul 27, 2022 at 11:15:06AM +0800, Chen Zhongjin wrote:
> This reverts commit f1d9a2abff66aa8156fbc1493abed468db63ea48.
>
> When CONFIG_GCOV_PROFILE_ALL is enabled, show_stack() and related
> functions (e.g. dump_stack) will break for x86 ORC unwinder.
>
> Call Trace:
> <TASK>
> ? dump_stack_lvl+0x83/0xb7
> ? schedule+0x1/0x190
> ? dump_stack+0x13/0x1f
> ? handler_pre0+0x3f/0x53 [kp_unwind]
> ...
>
> show_trace_log_lvl() searches text address on stack to validate
> whether unwind results are reliable. The code:

Hi,

Thanks for the patch. The change itself makes sense, though I'm having
trouble recreating the bug described in the patch description.

I enabled CONFIG_GCOV_PROFILE_ALL and did 'echo l >
/proc/sysrq-trigger', but I got a valid stack trace:

# echo l > /proc/sysrq-trigger
[ 343.916728] sysrq: Show backtrace of all active CPUs
[ 343.917459] NMI backtrace for cpu 2
[ 343.917884] CPU: 2 PID: 1007 Comm: bash Not tainted 5.19.0-rc8+ #68
[ 343.918562] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014
[ 343.919534] Call Trace:
[ 343.919904] <TASK>
[ 343.920226] dump_stack_lvl+0xcc/0x11b
[ 343.920742] dump_stack+0x17/0x24
[ 343.921199] nmi_cpu_backtrace.cold+0xb5/0x10d
[ 343.921795] ? lapic_can_unplug_cpu+0xa0/0xa0
[ 343.922375] nmi_trigger_cpumask_backtrace+0x171/0x200
[ 343.923053] arch_trigger_cpumask_backtrace+0x21/0x30
[ 343.923599] sysrq_handle_showallcpus+0x1f/0x30
[ 343.924104] __handle_sysrq.cold+0x81/0x234
[ 343.924576] write_sysrq_trigger+0x6a/0x90
[ 343.925098] proc_reg_write+0x99/0x120
[ 343.925587] vfs_write+0x16c/0x350
[ 343.926093] ksys_write+0x8c/0x180
[ 343.926597] __x64_sys_write+0x21/0x30
[ 343.927144] do_syscall_64+0x76/0x100
[ 343.927676] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 343.928343] RIP: 0033:0x7f555e8f7b50
[ 343.928822] Code: 73 01 c3 48 8b 0d 38 83 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 79 db 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 1e e3 01 00 48 89 04 24
[ 343.930801] RSP: 002b:00007ffdd20d0978 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 343.931761] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f555e8f7b50
[ 343.932551] RDX: 0000000000000002 RSI: 000055ff471bfe20 RDI: 0000000000000001
[ 343.933347] RBP: 000055ff471bfe20 R08: 00007f555ebc2740 R09: 00007f555f80f700
[ 343.934155] R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000002
[ 343.934962] R13: 0000000000000001 R14: 00007f555ebc15e0 R15: 00007f555ebbd3c0
[ 343.935798] </TASK>
[ 343.936341] Sending NMI from CPU 2 to CPUs 0-1,3:
[ 343.937163] NMI backtrace for cpu 3
[ 343.937167] CPU: 3 PID: 600 Comm: systemd-journal Not tainted 5.19.0-rc8+ #68

Was this with an upstream kernel? Can you share the config and
toolchain versions?

--
Josh

2022-08-04 07:53:35

by Chen Zhongjin

[permalink] [raw]
Subject: Re: [PATCH] Revert "x86/unwind/orc: Don't skip the first frame for inactive tasks"


On 2022/8/4 12:55, Josh Poimboeuf wrote:
> On Wed, Jul 27, 2022 at 11:15:06AM +0800, Chen Zhongjin wrote:
>> This reverts commit f1d9a2abff66aa8156fbc1493abed468db63ea48.
>>
>> When CONFIG_GCOV_PROFILE_ALL is enabled, show_stack() and related
>> functions (e.g. dump_stack) will break for x86 ORC unwinder.
>>
>> Call Trace:
>> <TASK>
>> ? dump_stack_lvl+0x83/0xb7
>> ? schedule+0x1/0x190
>> ? dump_stack+0x13/0x1f
>> ? handler_pre0+0x3f/0x53 [kp_unwind]
>> ...
>>
>> show_trace_log_lvl() searches text address on stack to validate
>> whether unwind results are reliable. The code:
> Hi,
>
> Thanks for the patch. The change itself makes sense, though I'm having
> trouble recreating the bug described in the patch description.
>
> I enabled CONFIG_GCOV_PROFILE_ALL and did 'echo l >
> /proc/sysrq-trigger', but I got a valid stack trace:
>
> # echo l > /proc/sysrq-trigger
> [ 343.916728] sysrq: Show backtrace of all active CPUs
> [ 343.917459] NMI backtrace for cpu 2
> [ 343.917884] CPU: 2 PID: 1007 Comm: bash Not tainted 5.19.0-rc8+ #68
> [ 343.918562] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014
> [ 343.919534] Call Trace:
> [ 343.919904] <TASK>
> [ 343.920226] dump_stack_lvl+0xcc/0x11b
> [ 343.920742] dump_stack+0x17/0x24
> [ 343.921199] nmi_cpu_backtrace.cold+0xb5/0x10d
> [ 343.921795] ? lapic_can_unplug_cpu+0xa0/0xa0
> [ 343.922375] nmi_trigger_cpumask_backtrace+0x171/0x200
> [ 343.923053] arch_trigger_cpumask_backtrace+0x21/0x30
> [ 343.923599] sysrq_handle_showallcpus+0x1f/0x30
> [ 343.924104] __handle_sysrq.cold+0x81/0x234
> [ 343.924576] write_sysrq_trigger+0x6a/0x90
> [ 343.925098] proc_reg_write+0x99/0x120
> [ 343.925587] vfs_write+0x16c/0x350
> [ 343.926093] ksys_write+0x8c/0x180
> [ 343.926597] __x64_sys_write+0x21/0x30
> [ 343.927144] do_syscall_64+0x76/0x100
> [ 343.927676] entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [ 343.928343] RIP: 0033:0x7f555e8f7b50
> [ 343.928822] Code: 73 01 c3 48 8b 0d 38 83 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 79 db 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 1e e3 01 00 48 89 04 24
> [ 343.930801] RSP: 002b:00007ffdd20d0978 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [ 343.931761] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f555e8f7b50
> [ 343.932551] RDX: 0000000000000002 RSI: 000055ff471bfe20 RDI: 0000000000000001
> [ 343.933347] RBP: 000055ff471bfe20 R08: 00007f555ebc2740 R09: 00007f555f80f700
> [ 343.934155] R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000002
> [ 343.934962] R13: 0000000000000001 R14: 00007f555ebc15e0 R15: 00007f555ebbd3c0
> [ 343.935798] </TASK>
> [ 343.936341] Sending NMI from CPU 2 to CPUs 0-1,3:
> [ 343.937163] NMI backtrace for cpu 3
> [ 343.937167] CPU: 3 PID: 600 Comm: systemd-journal Not tainted 5.19.0-rc8+ #68
>
> Was this with an upstream kernel? Can you share the config and
> toolchain versions?
Hi,

My kernel is upstream and config is defconfig with only
CONFIG_GCOV_PROFILE_ALL=y then make menuconfig.

CONFIG_CC_VERSION_TEXT="gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"

And I got:

[   29.745728] NMI backtrace for cpu 1
[   29.745728] CPU: 1 PID: 135 Comm: sh Not tainted
5.19.0-02858-ge2b542100719 #58
[   29.745728] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.15.0-0-g2dd4b9b3f8404
[   29.745728] Call Trace:
[   29.745728]  <TASK>
[   29.745728]  ? dump_stack_lvl+0x72/0xa1
[   29.745728]  ? dump_stack+0x13/0x1f
[   29.745728]  ? nmi_cpu_backtrace+0x19f/0x1d0
[   29.745728]  ? lapic_can_unplug_cpu+0xf0/0xf0
[   29.745728]  ? nmi_trigger_cpumask_backtrace+0x15e/0x210
[   29.745728]  ? arch_trigger_cpumask_backtrace+0x18/0x30
[   29.745728]  ? sysrq_handle_showallcpus+0x16/0x30
[   29.745728]  ? __handle_sysrq+0xc4/0x250
[   29.745728]  ? write_sysrq_trigger+0x46/0x60
[   29.745728]  ? proc_reg_write+0xbc/0x120
[   29.745728]  ? preempt_count_add+0x8e/0x130
[   29.745728]  ? vfs_write+0xfe/0x460
[   29.745728]  ? ksys_write+0xdc/0x170
[   29.745728]  ? fpregs_assert_state_consistent+0x2d/0x90
[   29.745728]  ? __x64_sys_write+0x19/0x30
[   29.745728]  ? do_syscall_64+0x3a/0x90
[   29.745728]  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   29.745728]  </TASK>


It also happened with CONFIG_CC_VERSION_TEXT="gcc-8 (Ubuntu
8.4.0-1ubuntu1~18.04) 8.4.0".

I believe disassemble show_stack in vmlinux and if we have:

    push   %rbp

    mov    %rsp,%rbp

    ... (no regs pushed to stack)

    callq  <show_trace_log_lvl>

This can be reproduced.

Your stacktrace looks just as normal as the one in my commit message
(show_stack is optimized out).


Best,

Chen




2022-08-04 22:16:51

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH] Revert "x86/unwind/orc: Don't skip the first frame for inactive tasks"

On Thu, Aug 04, 2022 at 03:27:39PM +0800, Chen Zhongjin wrote:
> I believe disassemble show_stack in vmlinux and if we have:
>
>     push   %rbp
>
>     mov    %rsp,%rbp
>
>     ... (no regs pushed to stack)
>
>     callq  <show_trace_log_lvl>
>
> This can be reproduced.

Weird, that's what I have. This is GCC 12.1.

ffffffff81056de0 <show_stack>:
ffffffff81056de0: e8 0b 43 05 00 call ffffffff810ab0f0 <__fentry__>
ffffffff81056de5: 55 push %rbp
ffffffff81056de6: 48 83 05 b2 0f c9 02 01 addq $0x1,0x2c90fb2(%rip) # ffffffff83ce7da0 <__gcov0.show_stack>
ffffffff81056dee: 48 89 e5 mov %rsp,%rbp
ffffffff81056df1: 48 85 ff test %rdi,%rdi
ffffffff81056df4: 74 41 je ffffffff81056e37 <show_stack+0x57>
ffffffff81056df6: 48 85 f6 test %rsi,%rsi
ffffffff81056df9: 0f 85 c2 24 15 01 jne ffffffff821a92c1 <show_stack.cold+0xd>
ffffffff81056dff: 65 48 8b 04 25 c0 bd 01 00 mov %gs:0x1bdc0,%rax
ffffffff81056e08: 48 39 c7 cmp %rax,%rdi
ffffffff81056e0b: 0f 85 a3 24 15 01 jne ffffffff821a92b4 <show_stack.cold>
ffffffff81056e11: 48 83 05 af 0f c9 02 01 addq $0x1,0x2c90faf(%rip) # ffffffff83ce7dc8 <__gcov0.show_stack+0x28>
ffffffff81056e19: 48 89 ee mov %rbp,%rsi
ffffffff81056e1c: 48 89 d1 mov %rdx,%rcx
ffffffff81056e1f: 48 89 f2 mov %rsi,%rdx
ffffffff81056e22: 31 f6 xor %esi,%esi
ffffffff81056e24: e8 8e 20 15 01 call ffffffff821a8eb7 <show_trace_log_lvl>
ffffffff81056e29: 48 83 05 9f 0f c9 02 01 addq $0x1,0x2c90f9f(%rip) # ffffffff83ce7dd0 <__gcov0.show_stack+0x30>
ffffffff81056e31: 5d pop %rbp
ffffffff81056e32: e9 49 b2 5a 01 jmp ffffffff82602080 <__x86_return_thunk>
ffffffff81056e37: 48 83 05 69 0f c9 02 01 addq $0x1,0x2c90f69(%rip) # ffffffff83ce7da8 <__gcov0.show_stack+0x8>
ffffffff81056e3f: 65 48 8b 3c 25 c0 bd 01 00 mov %gs:0x1bdc0,%rdi
ffffffff81056e48: eb ac jmp ffffffff81056df6 <show_stack+0x16>
ffffffff81056e4a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)

--
Josh

2022-10-09 03:27:08

by Chen Zhongjin

[permalink] [raw]
Subject: Re: [PATCH] Revert "x86/unwind/orc: Don't skip the first frame for inactive tasks"

Hi Josh,

On 2022/8/5 6:06, Josh Poimboeuf wrote:
> On Thu, Aug 04, 2022 at 03:27:39PM +0800, Chen Zhongjin wrote:
>> I believe disassemble show_stack in vmlinux and if we have:
>>
>>     push   %rbp
>>
>>     mov    %rsp,%rbp
>>
>>     ... (no regs pushed to stack)
>>
>>     callq  <show_trace_log_lvl>
>>
>> This can be reproduced.
> Weird, that's what I have. This is GCC 12.1.
>
> ffffffff81056de0 <show_stack>:
> ffffffff81056de0: e8 0b 43 05 00 call ffffffff810ab0f0 <__fentry__>
> ffffffff81056de5: 55 push %rbp
> ffffffff81056de6: 48 83 05 b2 0f c9 02 01 addq $0x1,0x2c90fb2(%rip) # ffffffff83ce7da0 <__gcov0.show_stack>
> ffffffff81056dee: 48 89 e5 mov %rsp,%rbp
> ffffffff81056df1: 48 85 ff test %rdi,%rdi
> ffffffff81056df4: 74 41 je ffffffff81056e37 <show_stack+0x57>
> ffffffff81056df6: 48 85 f6 test %rsi,%rsi
> ffffffff81056df9: 0f 85 c2 24 15 01 jne ffffffff821a92c1 <show_stack.cold+0xd>
> ffffffff81056dff: 65 48 8b 04 25 c0 bd 01 00 mov %gs:0x1bdc0,%rax
> ffffffff81056e08: 48 39 c7 cmp %rax,%rdi
> ffffffff81056e0b: 0f 85 a3 24 15 01 jne ffffffff821a92b4 <show_stack.cold>
> ffffffff81056e11: 48 83 05 af 0f c9 02 01 addq $0x1,0x2c90faf(%rip) # ffffffff83ce7dc8 <__gcov0.show_stack+0x28>
> ffffffff81056e19: 48 89 ee mov %rbp,%rsi
> ffffffff81056e1c: 48 89 d1 mov %rdx,%rcx
> ffffffff81056e1f: 48 89 f2 mov %rsi,%rdx
> ffffffff81056e22: 31 f6 xor %esi,%esi
> ffffffff81056e24: e8 8e 20 15 01 call ffffffff821a8eb7 <show_trace_log_lvl>
> ffffffff81056e29: 48 83 05 9f 0f c9 02 01 addq $0x1,0x2c90f9f(%rip) # ffffffff83ce7dd0 <__gcov0.show_stack+0x30>
> ffffffff81056e31: 5d pop %rbp
> ffffffff81056e32: e9 49 b2 5a 01 jmp ffffffff82602080 <__x86_return_thunk>
> ffffffff81056e37: 48 83 05 69 0f c9 02 01 addq $0x1,0x2c90f69(%rip) # ffffffff83ce7da8 <__gcov0.show_stack+0x8>
> ffffffff81056e3f: 65 48 8b 3c 25 c0 bd 01 00 mov %gs:0x1bdc0,%rdi
> ffffffff81056e48: eb ac jmp ffffffff81056df6 <show_stack+0x16>
> ffffffff81056e4a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)

This problem still exist on my machine:

CONFIG: defconfig + CONFIG_GCOV_PROFILE_ALL

GCC: gcc 7.5.0 & 12.1.0

TRIGGER:

# echo l > /proc/sysrq-trigger

[  173.156737]  <TASK>
[  173.156737]  ? dump_stack_lvl+0x79/0xaf
[  173.156737]  ? dump_stack+0x13/0x1f
[  173.156737]  ? nmi_cpu_backtrace.cold+0x97/0xef
[  173.156737]  ? lapic_can_unplug_cpu+0x90/0x90
[  173.156737]  ? nmi_trigger_cpumask_backtrace+0x15d/0x1e0
[  173.156737]  ? arch_trigger_cpumask_backtrace+0x18/0x30
[  173.156737]  ? sysrq_handle_showallcpus+0x16/0x30
[  173.156737]  ? __handle_sysrq.cold+0x51/0x1ae
[  173.156737]  ? write_sysrq_trigger+0x2f/0x60
[  173.156737]  ? proc_reg_write+0x90/0x110
[  173.156737]  ? vfs_write+0x12a/0x5b0
[  173.156737]  ? handle_mm_fault+0x158/0x340
[  173.156737]  ? ksys_write+0x76/0x170
[  173.156737]  ? __x64_sys_write+0x18/0x30
[  173.156737]  ? do_syscall_64+0x6e/0x100
[  173.156737]  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  173.156737]  </TASK>

DEBUG CODE:

With your test code and unwind_debug = 1

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index a80ac029bab5..c9d7e6f69e22 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -750,17 +750,24 @@ void __unwind_start(struct unwind_state *state,
struct task_struct *task,
         * to start unwinding at.  Skip ahead until we reach it.
         */

+orc_warn_current("skipping frames: ");
        /* When starting from regs, skip the regs frame: */
        if (regs) {
                unwind_next_frame(state);
                return;
        }
+pr_info("[%pS] %lx\nsp: %lx\nfirst_frame: %lx",
+               (void *)state->ip, state->ip, state->sp,
+               (unsigned long)first_frame);

        /* Otherwise, skip ahead to the user-specified starting frame: */
        while (!unwind_done(state) &&
               (!on_stack(&state->stack_info, first_frame, sizeof(long)) ||
                        state->sp < (unsigned long)first_frame))
                unwind_next_frame(state);
+pr_info("[%pS] %lx\nsp: %lx\nfirst_frame: %lx",
+               (void *)state->ip, state->ip, state->sp,
+               (unsigned long)first_frame);

        return;

DEBUG LOG:

[  173.156737] unwind stack type:1 next_sp:0000000000000000 mask:0x2
graph_idx:0
[  173.156737] ffffad5d419b3c50: ffffad5d419b3d78 (0xffffad5d419b3d78)
[  173.156737] ffffad5d419b3c58: ffffffffb9490905
(__unwind_start+0x175/0x220)
[  173.156737] ffffad5d419b3c60: ffffad5d419b3d78 (0xffffad5d419b3d78)
[  173.156737] ffffad5d419b3c68: ffffad5d419b3d68 (0xffffad5d419b3d68)
[  173.156737] ffffad5d419b3c70: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3c78: ffff9c6a02d16580 (0xffff9c6a02d16580)
[  173.156737] ffffad5d419b3c80: ffffffffba69e5fb
(show_trace_log_lvl+0xc9/0x3fb)
[  173.156737] ffffad5d419b3c88: ffff9c6a3ec9b520 (0xffff9c6a3ec9b520)
[  173.156737] ffffad5d419b3c90: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3c98: ffffffffb953aa17
(__wake_up_klogd+0xa7/0x100)
[  173.156737] ffffad5d419b3ca0: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3ca8: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3cb0: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3cb8: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3cc0: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3cc8: 0000000000000001 (0x1)
[  173.156737] ffffad5d419b3cd0: ffffad5d419b0000 (0xffffad5d419b0000)
[  173.156737] ffffad5d419b3cd8: ffffad5d419b4000 (0xffffad5d419b4000)
[  173.156737] ffffad5d419b3ce0: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3ce8: 0000000000000002 (0x2)
[  173.156737] ffffad5d419b3cf0: ffff9c6a02d16580 (0xffff9c6a02d16580)
[  173.156737] ffffad5d419b3cf8: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3d00: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3d08: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3d10: ffffad5d419b3c60 (0xffffad5d419b3c60)
[  173.156737] ffffad5d419b3d18: ffffad5d419b3d78 (0xffffad5d419b3d78)
[  173.156737] ffffad5d419b3d20: ffffffffb949097d
(__unwind_start+0x1ed/0x220)
[  173.156737] ffffad5d419b3d28: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3d30: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3d38: a4641208047dd200 (0xa4641208047dd200)
[  173.156737] ffffad5d419b3d40: ffffffffbb2829a3
(msr_save_dmi_table+0x53e9a3/0x544c20)
[  173.156737] ffffad5d419b3d48: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3d50: 0000000000000001 (0x1)
[  173.156737] ffffad5d419b3d58: 0000000000000001 (0x1)
[  173.156737] ffffad5d419b3d60: ffff9c6a019e0800 (0xffff9c6a019e0800)
[  173.156737] ffffad5d419b3d68: ffffad5d419b3d78 (0xffffad5d419b3d78)
[  173.156737] ffffad5d419b3d70: ffffffffb943d9e4 (show_stack+0x44/0x70)
[  173.156737] ffffad5d419b3d78: 0000000000000002 (0x2)
[  173.156737] ffffad5d419b3d80: ffffffffba6c4628 (dump_stack_lvl+0x79/0xaf)
[  173.156737] ffffad5d419b3d88: 0000000000000202 (0x202)
[  173.156737] ffffad5d419b3d90: 0000000000000001 (0x1)
[  173.156737] ffffad5d419b3d98: ffffffffba6c4671 (dump_stack+0x13/0x1f)
... ...

[  173.156737] [__unwind_start+0x1ed/0x220] ffffffffb949097d
[  173.156737] sp: ffffad5d419b3c60
[  173.156737] first_frame: ffffad5d419b3d78
[  173.156737] [show_stack+0x44/0x70] ffffffffb943d9e4
[  173.156737] sp: ffffad5d419b3d78
[  173.156737] first_frame: ffffad5d419b3d78


As I said in commit log, problem is caused because we got *state->sp ==
first_frame* at the end of __unwind_start().

Could you please try this debug code on your machine and have a look if
this happened ?

Thanks for your time!


Best,

Chen

2022-10-14 23:08:45

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH] Revert "x86/unwind/orc: Don't skip the first frame for inactive tasks"

On Sun, Oct 09, 2022 at 11:14:05AM +0800, Chen Zhongjin wrote:
> Hi Josh,
>
> On 2022/8/5 6:06, Josh Poimboeuf wrote:
> > On Thu, Aug 04, 2022 at 03:27:39PM +0800, Chen Zhongjin wrote:
> > > I believe disassemble show_stack in vmlinux and if we have:
> > >
> > >     push   %rbp
> > >
> > >     mov    %rsp,%rbp
> > >
> > >     ... (no regs pushed to stack)
> > >
> > >     callq  <show_trace_log_lvl>
> > >
> > > This can be reproduced.
> > Weird, that's what I have. This is GCC 12.1.
> >
> > ffffffff81056de0 <show_stack>:
> > ffffffff81056de0: e8 0b 43 05 00 call ffffffff810ab0f0 <__fentry__>
> > ffffffff81056de5: 55 push %rbp
> > ffffffff81056de6: 48 83 05 b2 0f c9 02 01 addq $0x1,0x2c90fb2(%rip) # ffffffff83ce7da0 <__gcov0.show_stack>
> > ffffffff81056dee: 48 89 e5 mov %rsp,%rbp
> > ffffffff81056df1: 48 85 ff test %rdi,%rdi
> > ffffffff81056df4: 74 41 je ffffffff81056e37 <show_stack+0x57>
> > ffffffff81056df6: 48 85 f6 test %rsi,%rsi
> > ffffffff81056df9: 0f 85 c2 24 15 01 jne ffffffff821a92c1 <show_stack.cold+0xd>
> > ffffffff81056dff: 65 48 8b 04 25 c0 bd 01 00 mov %gs:0x1bdc0,%rax
> > ffffffff81056e08: 48 39 c7 cmp %rax,%rdi
> > ffffffff81056e0b: 0f 85 a3 24 15 01 jne ffffffff821a92b4 <show_stack.cold>
> > ffffffff81056e11: 48 83 05 af 0f c9 02 01 addq $0x1,0x2c90faf(%rip) # ffffffff83ce7dc8 <__gcov0.show_stack+0x28>
> > ffffffff81056e19: 48 89 ee mov %rbp,%rsi
> > ffffffff81056e1c: 48 89 d1 mov %rdx,%rcx
> > ffffffff81056e1f: 48 89 f2 mov %rsi,%rdx
> > ffffffff81056e22: 31 f6 xor %esi,%esi
> > ffffffff81056e24: e8 8e 20 15 01 call ffffffff821a8eb7 <show_trace_log_lvl>
> > ffffffff81056e29: 48 83 05 9f 0f c9 02 01 addq $0x1,0x2c90f9f(%rip) # ffffffff83ce7dd0 <__gcov0.show_stack+0x30>
> > ffffffff81056e31: 5d pop %rbp
> > ffffffff81056e32: e9 49 b2 5a 01 jmp ffffffff82602080 <__x86_return_thunk>
> > ffffffff81056e37: 48 83 05 69 0f c9 02 01 addq $0x1,0x2c90f69(%rip) # ffffffff83ce7da8 <__gcov0.show_stack+0x8>
> > ffffffff81056e3f: 65 48 8b 3c 25 c0 bd 01 00 mov %gs:0x1bdc0,%rdi
> > ffffffff81056e48: eb ac jmp ffffffff81056df6 <show_stack+0x16>
> > ffffffff81056e4a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
>
> This problem still exist on my machine:
>
> CONFIG: defconfig + CONFIG_GCOV_PROFILE_ALL
>
> GCC: gcc 7.5.0 & 12.1.0
>
> TRIGGER:
>
> # echo l > /proc/sysrq-trigger

Hi Chen,

Thanks for your patience. I'm able to recreate now. I still have no
idea why I couldn't recreate before.

Anyway the patch looks good. I'll run it through some more testing.

--
Josh