2019-07-11 11:54:34

by Peter Zijlstra

[permalink] [raw]
Subject: [PATCH v3 0/6] Tracing vs CR2

Hi,

Here's the latest (and hopefully final) set of tracing vs CR2 patches.

They are basically the same as v2, with only minor edits and tags collected
from the last review.

Please consider.


2019-07-16 19:38:41

by Vegard Nossum

[permalink] [raw]
Subject: Re: [PATCH v3 0/6] Tracing vs CR2


On 7/11/19 1:40 PM, Peter Zijlstra wrote:
> Hi,
>
> Here's the latest (and hopefully final) set of tracing vs CR2 patches.
>
> They are basically the same as v2, with only minor edits and tags collected
> from the last review.
>
> Please consider.
>

Hi,

I ran my own battery of tests on your patch set on top of
5ad18b2e60b75c7297a998dea702451d33a052ed and ran into this:

------------[ cut here ]------------
General protection fault in user access. Non-canonical address?
WARNING: CPU: 0 PID: 5039 at arch/x86/mm/extable.c:126
ex_handler_uaccess+0x5d/0x70
CPU: 0 PID: 5039 Comm: init Not tainted 5.2.0+ #124
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
RIP: 0010:ex_handler_uaccess+0x5d/0x70
Code: 5d 41 5c c3 e8 c4 8e 0e 00 80 3d e5 74 1e 01 00 75 d3 e8 b6 8e 0e
00 48 c7 c7 10 a7 fb 81 c6 05 d0 74 1e 01 01 e8 d1 43 01 00 <0f> 0b eb
b7 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89
RSP: 0000:fffffe000000fc48 EFLAGS: 00010086
RAX: 0000000000000000 RBX: ffffffff81c07dac RCX: ffffffff811a887c
RDX: 0000000000000000 RSI: ffffffff8289f05f RDI: 0000000000000093
RBP: fffffe000000fcb8 R08: 00000036fe0f15d3 R09: 000000000000003f
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000000d
R13: 000000000000000d R14: 0000000000000000 R15: 0000000000000000
FS: 00005555563ab8c0(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000001ff7 CR3: 000000003c804002 CR4: 00000000003606f0
DR0: 0000000040209100 DR1: 00000000402091a1 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff1 DR7: 00000000000b062a
Call Trace:
<#DB>
fixup_exception+0x50/0x6a
do_general_protection+0x40/0x160
general_protection+0x2d/0x40
RIP: 0010:arch_stack_walk_user+0x71/0x100
Code: 00 48 83 e8 10 49 39 c4 77 45 4c 8b 04 24 4c 89 e3 4d 89 fd 4c 89
fd 41 83 87 98 0a 00 00 01 0f 01 cb 0f ae e8 31 c0 4c 89 e2 <4c> 8b 33
4d 89 f4 85 c0 75 7a 48 8b 73 08 0f 01 ca 85 c0 74 1f 65
RSP: 0000:fffffe000000fd68 EFLAGS: 00050046
RAX: 0000000000000000 RBX: 854163717acc2789 RCX: ffffffff811ca27b
RDX: 854163717acc2789 RSI: 0000000040209102 RDI: fffffe000000fdb8
RBP: ffff88803d55d040 R08: ffffc9000520bf58 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 854163717acc2789
R13: ffff88803d55d040 R14: 0000000000000093 R15: ffff88803d55d040
? stack_trace_consume_entry+0x4b/0x80
? arch_stack_walk_user+0x34/0x100
? profile_setup.cold+0xc1/0xc1
stack_trace_save_user+0x71/0x9c
trace_buffer_unlock_commit_regs+0x1ae/0x270
trace_event_buffer_commit+0x90/0x240
trace_event_raw_event_preemptirq_template+0x9a/0x100
? debug+0x16/0x70
? perf_trace_preemptirq_template+0x120/0x120
? trace_hardirqs_off_thunk+0x1a/0x1c
trace_hardirqs_off_caller+0xf4/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? debug+0x11/0x70
debug+0x16/0x70
RIP: 0010:copy_user_generic_unrolled+0xa0/0xc0
Code: 7f 40 ff c9 75 b6 89 d1 83 e2 07 c1 e9 03 74 12 4c 8b 06 4c 89 07
48 8d 76 08 48 8d 7f 08 ff c9 75 ee 21 d2 74 10 89 d1 8a 06 <88> 07 48
ff c6 48 ff c7 ff c9 75 f2 31 c0 0f 01 ca c3 0f 1f 40 00
RSP: 0000:ffffc9000520be38 EFLAGS: 00040202
RAX: ffff88803d55d09c RBX: ffff88803d55d040 RCX: 0000000000000001
RDX: 0000000000000001 RSI: 0000000040209102 RDI: ffffc9000520be76
RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 00007ffffffff000
R13: 0000000040209102 R14: ffffc9000520be76 R15: 0000000000000000
</#DB>
__probe_kernel_read+0x57/0x90
is_prefetch.isra.0+0xb5/0x210
? tracer_hardirqs_on+0x53/0x1a0
__bad_area_nosemaphore+0x9e/0x220
__do_page_fault+0x483/0x630
? async_page_fault+0x8/0x40
async_page_fault+0x36/0x40
RIP: 0033:0x40209102
Code: 00 00 49 bc 00 20 23 40 00 00 00 00 49 bd 00 00 d0 40 00 00 00 00
49 be ff ff ff ff ff ff ff ff 49 bf 00 50 80 40 00 00 00 00 <9c> 48 81
0c 24 00 04 00 00 48 81 0c 24 00 00 04 00 9d ff 2c 25 00
RSP: 002b:0000000000001fff EFLAGS: 00010217
RAX: 0000000000000000 RBX: 00000000402090b0 RCX: 0000000000000001
RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000041ebb000
RBP: 854163717acc2789 R08: 0000000000000001 R09: b1f39cc399a61ebb
R10: 00007ffeab175000 R11: 0000000000000360 R12: 0000000040232000
R13: 0000000040d00000 R14: ffffffffffffffff R15: 0000000040805000
---[ end trace e5e49800ff5aa5ed ]---
PANIC: double fault, error_code: 0x0
CPU: 0 PID: 5039 Comm: init Tainted: G W 5.2.0+ #124
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000f008 EFLAGS: 00010093
RAX: 0000000000016cc0 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000016cc0 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000f088 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 00005555563ab8c0(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffe000000eff8 CR3: 000000003c804002 CR4: 00000000003606f0
DR0: 0000000040209100 DR1: 00000000402091a1 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 00000000000b062a
Call Trace:
<#DB>
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
async_page_fault+0x16/0x40
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000f148 EFLAGS: 00010093 ORIG_RAX: 0000000000000000
RAX: 0000000000016cc0 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000016cc0 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000f1c8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? async_page_fault+0x16/0x40
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
async_page_fault+0x16/0x40
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000f288 EFLAGS: 00010093 ORIG_RAX: 0000000000000000
RAX: 0000000000016cc0 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000016cc0 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000f308 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? async_page_fault+0x16/0x40
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
async_page_fault+0x16/0x40
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000f3c8 EFLAGS: 00010093 ORIG_RAX: 0000000000000000
RAX: 0000000000016cc0 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000016cc0 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000f448 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? async_page_fault+0x16/0x40
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
async_page_fault+0x16/0x40
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000f508 EFLAGS: 00010093 ORIG_RAX: 0000000000000000
RAX: 0000000000016cc0 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000016cc0 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000f588 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? async_page_fault+0x16/0x40
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
async_page_fault+0x16/0x40
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000f648 EFLAGS: 00010093 ORIG_RAX: 0000000000000000
RAX: 0000000000016cc0 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000016cc0 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000f6c8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? async_page_fault+0x16/0x40
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
async_page_fault+0x16/0x40
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000f788 EFLAGS: 00010093 ORIG_RAX: 0000000000000000
RAX: 0000000000016cc0 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000016cc0 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000f808 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? async_page_fault+0x16/0x40
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
async_page_fault+0x16/0x40
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000f8c8 EFLAGS: 00010093 ORIG_RAX: 0000000000000000
RAX: 0000000000016cc0 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000016cc0 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000f948 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? async_page_fault+0x16/0x40
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
async_page_fault+0x16/0x40
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000fa08 EFLAGS: 00010083 ORIG_RAX: 0000000000000000
RAX: 0000000000006004 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000006004 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000fa88 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? async_page_fault+0x16/0x40
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
async_page_fault+0x16/0x40
RIP: 0010:fixup_bad_iret+0x6/0x50
Code: 2d a8 00 00 00 48 39 f8 74 0b b9 15 00 00 00 48 89 c7 f3 48 a5 c3
0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 41 54 55 48 89 fd <65> 48 8b
3d a6 31 f2 7e 48 8b b5 a0 00 00 00 4c 8d a7 50 ff ff ff
RSP: 0000:fffffe000000fb48 EFLAGS: 00010082 ORIG_RAX: 0000000000000000
RAX: 800000003c804002 RBX: 0000000000000000 RCX: ffffffff81a00b97
RDX: 0000000000000000 RSI: ffffffff81a013a8 RDI: fffffe000000fb60
RBP: fffffe000000fb60 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? native_iret+0x7/0x7
? general_protection+0x8/0x40
error_entry+0xe5/0xf0
RIP: 0010:native_irq_return_iret+0x0/0x2
Code: 5b 41 5b 41 5a 41 59 41 58 58 59 5a 5e 5f 48 83 c4 08 e9 0c 00 00
00 90 90 66 2e 0f 1f 84 00 00 00 00 00 f6 44 24 20 04 75 02 <48> cf 57
0f 01 f8 0f 1f 00 66 90 0f 20 df 48 0f ba ef 3f 48 81 e7
RSP: 0000:fffffe000000fc18 EFLAGS: 00010046 ORIG_RAX: 0000000000000000
RAX: fffffe000000fe08 RBX: ffffffff81c07dac RCX: ffff88803c824000
RDX: ffffffff8126a228 RSI: 0000000040209100 RDI: 0000000000000000
RBP: fffffe000000fcb8 R08: fffffe000000fec0 R09: ffffffff8125b177
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000000d
R13: 000000000000000d R14: 0000000000000000 R15: 0000000000000000
? general_protection+0x8/0x40
? perf_exclude_event+0x67/0x90
? perf_bp_event+0x98/0xe0
RIP: 0000:0x2
Code: Bad RIP value.
RSP: 0000:0000000005080021 EFLAGS: 00000000
? ex_handler_uaccess+0x5d/0x70
? fixup_exception+0x50/0x6a
? do_general_protection+0x40/0x160
? general_protection+0x2d/0x40
? stack_trace_consume_entry+0x4b/0x80
? arch_stack_walk_user+0x71/0x100
? arch_stack_walk_user+0x34/0x100
? profile_setup.cold+0xc1/0xc1
? stack_trace_save_user+0x71/0x9c
? __this_cpu_preempt_check+0xc/0xc6
? hw_breakpoint_exceptions_notify+0x120/0x1c0
? notifier_call_chain+0x8e/0xb0
? atomic_notifier_call_chain+0x37/0x40
? notify_die+0x5c/0x80
? trace_hardirqs_off_caller+0x20/0x150
? trace_hardirqs_off_thunk+0x1a/0x1c
? debug_smp_processor_id+0x28/0xd0
? paranoid_exit+0xb/0xb0
? copy_user_enhanced_fast_string+0xe/0x20
</#DB>
WARNING: stack recursion on stack type 9
Kernel panic - not syncing: Machine halted.
CPU: 0 PID: 5039 Comm: init Tainted: G W 5.2.0+ #124
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
Call Trace:
<#DF>
dump_stack+0xe1/0x133
panic+0x159/0x3d8
? get_cpu_entry_area+0x8/0x30
df_debug+0x24/0x2d
do_double_fault+0x94/0xf0
double_fault+0x2c/0x40
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000f008 EFLAGS: 00010093
RAX: 0000000000016cc0 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000016cc0 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000f088 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? async_page_fault+0x16/0x40
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
</#DF>
<#DB>
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
async_page_fault+0x16/0x40
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000f148 EFLAGS: 00010093 ORIG_RAX: 0000000000000000
RAX: 0000000000016cc0 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000016cc0 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000f1c8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? async_page_fault+0x16/0x40
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
async_page_fault+0x16/0x40
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000f288 EFLAGS: 00010093 ORIG_RAX: 0000000000000000
RAX: 0000000000016cc0 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000016cc0 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000f308 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? async_page_fault+0x16/0x40
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
async_page_fault+0x16/0x40
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000f3c8 EFLAGS: 00010093 ORIG_RAX: 0000000000000000
RAX: 0000000000016cc0 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000016cc0 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000f448 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? async_page_fault+0x16/0x40
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
async_page_fault+0x16/0x40
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000f508 EFLAGS: 00010093 ORIG_RAX: 0000000000000000
RAX: 0000000000016cc0 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000016cc0 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000f588 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? async_page_fault+0x16/0x40
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
async_page_fault+0x16/0x40
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000f648 EFLAGS: 00010093 ORIG_RAX: 0000000000000000
RAX: 0000000000016cc0 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000016cc0 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000f6c8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? async_page_fault+0x16/0x40
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
async_page_fault+0x16/0x40
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000f788 EFLAGS: 00010093 ORIG_RAX: 0000000000000000
RAX: 0000000000016cc0 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000016cc0 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000f808 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? async_page_fault+0x16/0x40
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
async_page_fault+0x16/0x40
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000f8c8 EFLAGS: 00010093 ORIG_RAX: 0000000000000000
RAX: 0000000000016cc0 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000016cc0 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000f948 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? async_page_fault+0x16/0x40
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
async_page_fault+0x16/0x40
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50
Code: 82 e8 74 2d f8 ff 48 89 9d 10 01 00 00 48 89 ee 5b 4c 89 e7 5d 41
5c e9 8e 5d 12 00 5b b8 f4 ff ff ff 5d 41 5c c3 0f 1f 40 00 <65> 48 8b
04 25 c0 6c 01 00 65 8b 15 78 ba df 7e 81 e2 00 01 1f 00
RSP: 0000:fffffe000000fa08 EFLAGS: 00010083 ORIG_RAX: 0000000000000000
RAX: 0000000000006004 RBX: ffffffff81a01436 RCX: ffffffff81a00b97
RDX: 0000000000006004 RSI: ffffffff81a01428 RDI: ffffffff81a01436
RBP: fffffe000000fa88 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? async_page_fault+0x16/0x40
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
? async_page_fault+0x16/0x40
trace_hardirqs_off_caller+0x10/0x150
trace_hardirqs_off_thunk+0x1a/0x1c
? native_iret+0x7/0x7
? async_page_fault+0x8/0x40
async_page_fault+0x16/0x40
RIP: 0010:fixup_bad_iret+0x6/0x50
Code: 2d a8 00 00 00 48 39 f8 74 0b b9 15 00 00 00 48 89 c7 f3 48 a5 c3
0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 41 54 55 48 89 fd <65> 48 8b
3d a6 31 f2 7e 48 8b b5 a0 00 00 00 4c 8d a7 50 ff ff ff
RSP: 0000:fffffe000000fb48 EFLAGS: 00010082 ORIG_RAX: 0000000000000000
RAX: 800000003c804002 RBX: 0000000000000000 RCX: ffffffff81a00b97
RDX: 0000000000000000 RSI: ffffffff81a013a8 RDI: fffffe000000fb60
RBP: fffffe000000fb60 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
? native_iret+0x7/0x7
? general_protection+0x8/0x40
error_entry+0xe5/0xf0
RIP: 0010:native_irq_return_iret+0x0/0x2
Code: 5b 41 5b 41 5a 41 59 41 58 58 59 5a 5e 5f 48 83 c4 08 e9 0c 00 00
00 90 90 66 2e 0f 1f 84 00 00 00 00 00 f6 44 24 20 04 75 02 <48> cf 57
0f 01 f8 0f 1f 00 66 90 0f 20 df 48 0f ba ef 3f 48 81 e7
RSP: 0000:fffffe000000fc18 EFLAGS: 00010046 ORIG_RAX: 0000000000000000
RAX: fffffe000000fe08 RBX: ffffffff81c07dac RCX: ffff88803c824000
RDX: ffffffff8126a228 RSI: 0000000040209100 RDI: 0000000000000000
RBP: fffffe000000fcb8 R08: fffffe000000fec0 R09: ffffffff8125b177
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000000d
R13: 000000000000000d R14: 0000000000000000 R15: 0000000000000000
? general_protection+0x8/0x40
? perf_exclude_event+0x67/0x90
? perf_bp_event+0x98/0xe0
RIP: 0000:0x2
Code: Bad RIP value.
RSP: 0000:0000000005080021 EFLAGS: 00000000
? ex_handler_uaccess+0x5d/0x70
? fixup_exception+0x50/0x6a
? do_general_protection+0x40/0x160
? general_protection+0x2d/0x40
? stack_trace_consume_entry+0x4b/0x80
? arch_stack_walk_user+0x71/0x100
? arch_stack_walk_user+0x34/0x100
? profile_setup.cold+0xc1/0xc1
? stack_trace_save_user+0x71/0x9c
? __this_cpu_preempt_check+0xc/0xc6
? hw_breakpoint_exceptions_notify+0x120/0x1c0
? notifier_call_chain+0x8e/0xb0
? atomic_notifier_call_chain+0x37/0x40
? notify_die+0x5c/0x80
? trace_hardirqs_off_caller+0x20/0x150
? trace_hardirqs_off_thunk+0x1a/0x1c
? debug_smp_processor_id+0x28/0xd0
? paranoid_exit+0xb/0xb0
? copy_user_enhanced_fast_string+0xe/0x20
</#DB>
Kernel Offset: disabled
---[ end Kernel panic - not syncing: Machine halted. ]---

There's quite a bit to unpack there... I haven't looked into it AT ALL
yet, but at least you have the report. Will try to see if I can get a
reproducible test case.


Vegard

2019-07-16 21:55:56

by Vegard Nossum

[permalink] [raw]
Subject: Re: [PATCH v3 0/6] Tracing vs CR2


On 7/16/19 9:33 PM, Vegard Nossum wrote:
>
> On 7/11/19 1:40 PM, Peter Zijlstra wrote:
>> Hi,
>>
>> Here's the latest (and hopefully final) set of tracing vs CR2 patches.
>>
>> They are basically the same as v2, with only minor edits and tags
>> collected
>> from the last review.
>>
>> Please consider.
>>
>
> Hi,
>
> I ran my own battery of tests on your patch set on top of
> 5ad18b2e60b75c7297a998dea702451d33a052ed and ran into this:
>
> ------------[ cut here ]------------
> General protection fault in user access. Non-canonical address?
> WARNING: CPU: 0 PID: 5039 at arch/x86/mm/extable.c:126
> ex_handler_uaccess+0x5d/0x70

Got a different one:

WARNING: CPU: 0 PID: 2150 at arch/x86/kernel/traps.c:791 do_debug+0xfe/0x240
CPU: 0 PID: 2150 Comm: init Not tainted 5.2.0+ #124
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
RIP: 0010:do_debug+0xfe/0x240
Code: 05 07 3d f3 7e f6 85 91 00 00 00 02 0f 85 d8 00 00 00 49 8b 84 24
18 0b 00 00 f6 44 24 01 40 74 2f f6 85 88 00 00 00 03 75 26 <0f> 0b 80
e4 bf 49 89 84 24 18 0b 00 00 f0 41 80 0c 24 10 48 81 a5
RSP: 0000:fffffe000000ff20 EFLAGS: 00010046
RAX: 0000000000004002 RBX: 0000000000000000 RCX: ffffffff810e2f72
RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffffffff8201f090
RBP: fffffe000000ff58 R08: 0000000000000000 R09: 0000000000000005
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88803e0df040
R13: 0000000000000000 R14: 000000003d376001 R15: 0000000000000000
FS: 0000555556dbc8c0(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000041f38010 CR3: 000000003d376001 CR4: 00000000003606f0
DR0: 0000000000000001 DR1: 0000000041a4f070 DR2: 00007fff959ff000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000003b3062a
Call Trace:
<#DB>
debug+0x2d/0x70
RIP: 0010:arch_stack_walk_user+0x74/0x100
Code: e8 10 49 39 c4 77 45 4c 8b 04 24 4c 89 e3 4d 89 fd 4c 89 fd 41 83
87 98 0a 00 00 01 0f 01 cb 0f ae e8 31 c0 4c 89 e2 4c 8b 33 <4d> 89 f4
85 c0 75 7a 48 8b 73 08 0f 01 ca 85 c0 74 1f 65 48 8b 04
RSP: 0000:ffffc900030dbd68 EFLAGS: 00040046
RAX: 0000000000000000 RBX: 0000000041a4f073 RCX: ffffffff811ca27b
RDX: 0000000041a4f073 RSI: 0000000041a4f0dd RDI: ffffc900030dbdb8
RBP: ffff88803e0df040 R08: ffffc900030dbf58 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000041a4f073
R13: ffff88803e0df040 R14: 0041281000bf4800 R15: ffff88803e0df040
? stack_trace_consume_entry+0x4b/0x80
</#DB>
? profile_setup.cold+0xc1/0xc1
stack_trace_save_user+0x71/0x9c
trace_buffer_unlock_commit_regs+0x1ae/0x270
trace_event_buffer_commit+0x90/0x240
trace_event_raw_event_preemptirq_template+0x9a/0x100
? debug+0x49/0x70
? perf_trace_preemptirq_template+0x120/0x120
? trace_hardirqs_off_thunk+0x1a/0x1c
trace_hardirqs_off_caller+0xf4/0x150
? debug+0x44/0x70
trace_hardirqs_off_thunk+0x1a/0x1c
debug+0x49/0x70
RIP: 0033:0x41a4f0dd
Code: 47 11 b7 d2 36 45 6c 49 be 00 f0 9f 95 ff 7f 00 00 49 bf de a7 b3
e8 d7 21 3c 15 9c 48 81 0c 24 00 01 00 00 9d b8 62 00 00 00 <8e> c0 0f
05 66 8c c8 9c 48 81 24 24 ff fe ff ff 9d 48 89 04 25 40
RSP: 002b:0000000040901ea0 EFLAGS: 00000317
RAX: 0000000000000062 RBX: 0000000041281000 RCX: ffffffffffffffff
RDX: 00000000401c0000 RSI: 0000000041892000 RDI: 0000000041281000
RBP: 0000000041a4f073 R08: 0000000000000001 R09: 0000000000000001
R10: ffffffff917d7748 R11: 1000000000000000 R12: fdffffffffffffff
R13: 6c4536d2b71147a5 R14: 00007fff959ff000 R15: 153c21d7e8b3a7de
---[ end trace 0cd51ba690f12b47 ]---

The warning is this:

if (WARN_ON_ONCE((dr6 & DR_STEP) && !user_mode(regs))) {
/*
* Historical junk that used to handle SYSENTER
single-stepping.
* This should be unreachable now. If we survive for a
while
* without anyone hitting this warning, we'll turn this
into
* an oops.
*/
tsk->thread.debugreg6 &= ~DR_STEP;
set_tsk_thread_flag(tsk, TIF_SINGLESTEP);
regs->flags &= ~X86_EFLAGS_TF;
}

Unfortunately DR6 from the register dump has already been cleared at the
top of do_debug() and the local variable dr6 is on the stack and not
loaded into any of the registers AFAICT.

From the userspace Code: line you can clearly see it setting EFLAGS_TF,
then it seems to be trapping on the next instruction:

1b: 9c pushfq
1c: 48 81 0c 24 00 01 00 orq $0x100,(%rsp)
23: 00
24: 9d popfq
25: b8 62 00 00 00 mov $0x62,%eax
2a:* 8e c0 mov %eax,%es <-- trapping
instruction

You can see that DR1 points to 41a4f070, which is close to userspace RBP
(41a4f073), which is perhaps being accessed by stack_trace_save_user()
and causing the debug exception on a data breakpoint?

The Code: line from stack_trace_save_user() is:

27: 4c 8b 33 mov (%rbx),%r14
2a:* 4d 89 f4 mov %r14,%r12 <--
trapping instruction

with RBX == 41a4f073 so that seems to fit the theory, except I'd have
expected the "trapping instruction" to point at the memory dereference.
(But maybe it's one of those "points to return address" kind of things?)

DR7 is 03b3062a, which is..
- DR0, DR1, DR2 global breakpoints
- DR0 reads + writes
- DR1 reads + writes
- DR2 reads + writes

A second instance of the same warning:

WARNING: CPU: 0 PID: 601 at arch/x86/kernel/traps.c:791 do_debug+0xfe/0x240
CPU: 0 PID: 601 Comm: init Not tainted 5.2.0+ #124
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
RIP: 0010:do_debug+0xfe/0x240
Code: 05 07 3d f3 7e f6 85 91 00 00 00 02 0f 85 d8 00 00 00 49 8b 84 24
18 0b 00 00 f6 44 24 01 40 74 2f f6 85 88 00 00 00 03 75 26 <0f> 0b 80
41 80 0c 24 10 48 81 a5
RSP: 0000:fffffe000000ff20 EFLAGS: 00010046
RAX: 0000000000004002 RBX: 0000000000000000 RCX: ffffffff810e2f72
RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffffffff8201f090
RBP: fffffe000000ff58 R08: 0000000000000000 R09: 0000000000000005
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88803e0bd040
R13: 0000000000000000 R14: 000000003d3c6001 R15: 0000000000000000
FS: 0000555556efb8c0(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000041686010 CR3: 000000003d3c6001 CR4: 00000000003606f0
DR0: 0000000000000001 DR1: 00000000400be070 DR2: 00007ffd20c67000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000003b3062a
Call Trace:
<#DB>
debug+0x2d/0x70
RIP: 0010:arch_stack_walk_user+0x74/0x100
Code: e8 10 49 39 c4 77 45 4c 8b 04 24 4c 89 e3 4d 89 fd 4c 89 fd 41 83
87 98 0a 00 00 01 0f 01 cb 0f ae e8 31 c0 4c 89 e2 4c 8b 33 <4d> 89 f4
85 c0 74 1f 65 48 8b 04
RSP: 0000:ffffc900024f3d68 EFLAGS: 00040046
RAX: 0000000000000000 RBX: 00000000400be073 RCX: ffffffff811ca27b
RDX: 00000000400be073 RSI: 00000000400be0dd RDI: ffffc900024f3db8
RBP: ffff88803e0bd040 R08: ffffc900024f3f58 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 00000000400be073
R13: ffff88803e0bd040 R14: 00404e4000bf4800 R15: ffff88803e0bd040
? stack_trace_consume_entry+0x4b/0x80
</#DB>
? profile_setup.cold+0xc1/0xc1
stack_trace_save_user+0x71/0x9c
trace_buffer_unlock_commit_regs+0x1ae/0x270
trace_event_buffer_commit+0x90/0x240
trace_event_raw_event_preemptirq_template+0x9a/0x100
? debug+0x49/0x70
? perf_trace_preemptirq_template+0x120/0x120
? trace_hardirqs_off_thunk+0x1a/0x1c
trace_hardirqs_off_caller+0xf4/0x150
? debug+0x44/0x70
trace_hardirqs_off_thunk+0x1a/0x1c
debug+0x49/0x70
RIP: 0033:0x400be0dd
Code: 3a 51 3e 59 a9 b2 e3 49 be 00 70 c6 20 fd 7f 00 00 49 bf de a7 b3
e8 d7 21 3c 15 9c 48 81 0c 24 00 01 00 00 9d b8 62 00 00 00 <8e> c0 0f
ff ff 9d 48 89 04 25 40
RSP: 002b:00000000417d0ea0 EFLAGS: 00000317
RAX: 0000000000000062 RBX: 00000000404e4000 RCX: ffffffffffffffff
RDX: 0000000041da4000 RSI: 0000000040bb0000 RDI: 00000000404e4000
RBP: 00000000400be073 R08: 0000000000000001 R09: 0000000000000001
R10: 9c7fa8aa10386cdb R11: 1000000000000000 R12: fdffffffffffffff
R13: e3b2a9593e513a6b R14: 00007ffd20c67000 R15: 153c21d7e8b3a7de
---[ end trace beb9776710443227 ]---


Vegard

2019-07-17 01:05:44

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v3 0/6] Tracing vs CR2

On Tue, Jul 16, 2019 at 2:53 PM Vegard Nossum <[email protected]> wrote:
>
>
> On 7/16/19 9:33 PM, Vegard Nossum wrote:
> >
> > On 7/11/19 1:40 PM, Peter Zijlstra wrote:
> >> Hi,
> >>
> >> Here's the latest (and hopefully final) set of tracing vs CR2 patches.
> >>
> >> They are basically the same as v2, with only minor edits and tags
> >> collected
> >> from the last review.
> >>
> >> Please consider.
> >>
> >
> > Hi,
> >
> > I ran my own battery of tests on your patch set on top of
> > 5ad18b2e60b75c7297a998dea702451d33a052ed and ran into this:
> >

On a different thread, Peter and I decided that the last patch in this
series (the one that removes the _DEBUG stuff) is wrong. Can you see
if these are reproducible with that patch removed?

--Andy

2019-07-17 07:48:28

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v3 0/6] Tracing vs CR2

On Tue, Jul 16, 2019 at 06:02:33PM -0700, Andy Lutomirski wrote:

> On a different thread, Peter and I decided that the last patch in this
> series (the one that removes the _DEBUG stuff) is wrong. Can you see
> if these are reproducible with that patch removed?

Wrong is maybe the wrong word :-), premature maybe, we definitely want to
get there, but the #DB crud needs a wee bit of work first.

2019-07-17 07:51:45

by Vegard Nossum

[permalink] [raw]
Subject: Re: [PATCH v3 0/6] Tracing vs CR2


On 7/17/19 3:02 AM, Andy Lutomirski wrote:
> On Tue, Jul 16, 2019 at 2:53 PM Vegard Nossum <[email protected]> wrote:
>>
>>
>> On 7/16/19 9:33 PM, Vegard Nossum wrote:
>>>
>>> On 7/11/19 1:40 PM, Peter Zijlstra wrote:
>>>> Hi,
>>>>
>>>> Here's the latest (and hopefully final) set of tracing vs CR2 patches.
>>>>
>>>> They are basically the same as v2, with only minor edits and tags
>>>> collected
>>>> from the last review.
>>>>
>>>> Please consider.
>>>>
>>>
>>> Hi,
>>>
>>> I ran my own battery of tests on your patch set on top of
>>> 5ad18b2e60b75c7297a998dea702451d33a052ed and ran into this:
>>>
>
> On a different thread, Peter and I decided that the last patch in this
> series (the one that removes the _DEBUG stuff) is wrong. Can you see
> if these are reproducible with that patch removed?

Yes, without the last patch I still get this:

Run /init as init process
init[711]: segfault at 40000000 ip 000000004000000a sp 0000000040000ff8
error 7
------------[ cut here ]------------
General protection fault in user access. Non-canonical address?
WARNING: CPU: 0 PID: 711 at arch/x86/mm/extable.c:126
ex_handler_uaccess+0x5d/0x70
CPU: 0 PID: 711 Comm: init Not tainted 5.2.0+ #125
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
init[716]: segfault at 40000000 ip 000000004000000a sp 0000000040000ff8
error 7
RIP: 0010:ex_handler_uaccess+0x5d/0x70
Code: 5d 41 5c c3 e8 c4 8e 0e 00 80 3d e5 74 1e 01 00 75 d3 e8 b6 8e 0e
00 48 c7 c7 10 a7 fb 81 c6 05 d0 74 1e 01 01 e8 d1 43 01 00 <0f> 0b eb
b7 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89
RSP: 0000:ffffc9000065fa18 EFLAGS: 00010086
RAX: 0000000000000000 RBX: ffffffff81c07dac RCX: ffffffff811a887c
init[714]: segfault at 40000000 ip 000000004000000a sp 0000000040000ff8
error 7
RDX: 0000000000000000 RSI: ffffffff8289f05f RDI: 0000000000000093
RBP: ffffc9000065fa88 R08: 000000002e80b265 R09: 000000000000003f
init[718]: segfault at 40000000 ip 000000004000000a sp 0000000040000ff8
error 7
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000000d
R13: 000000000000000d R14: 0000000000000000 R15: 0000000000000000
FS: 00000000006ce880(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000003fffffe0 CR3: 000000003d2f6004 CR4: 00000000003606f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
Code: Bad RIP value.
fixup_exception+0x50/0x6a
do_general_protection+0x40/0x160
general_protection+0x2d/0x40
RIP: 0010:arch_stack_walk_user+0x71/0x100
Code: 00 48 83 e8 10 49 39 c4 77 45 4c 8b 04 24 4c 89 e3 4d 89 fd 4c 89
fd 41 83 87 98 0a 00 00 01 0f 01 cb 0f ae e8 31 c0 4c 89 e2 <4c> 8b 33
4d 89 f4 85 c0 75 7a 48 8b 73 08 0f 01 ca 85 c0 74 1f 65
[...]

This is my reproducer (as init):

#include <fcntl.h>
#include <sched.h>
#include <sys/mman.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <sys/user.h>
#include <unistd.h>
#include <wait.h>

struct child_data {
(*code)();
};

child_fn(void *arg)
{
child_data *data = arg;
mprotect(data->code, PAGE_SIZE, PROT_EXEC);
data->code();
}

int main()
{
mkdir("/sys", 7);
mount("nodev", "/sys", "sysfs", 0, "");
mount("nodev", "/sys/kernel/tracing", "tracefs", 0, "");

int tracing_options_userstacktrace =
open("/sys/kernel/tracing/options/userstacktrace", O_RDWR);
write(tracing_options_userstacktrace, "1\n", 2);

int tracing_events_preemptirq_irq_disable =
open("/sys/kernel/tracing/events/preemptirq/irq_disable/enable", O_RDWR);
write(tracing_events_preemptirq_irq_disable, "1\n", 2);

void *code = mmap(0, PAGE_SIZE, PROT_WRITE, MAP_PRIVATE |
MAP_ANONYMOUS | MAP_32BIT, 1, 0);
{
unsigned char *output = code;

*output++ = 72;
*output++ = 189;
for (int i = 0; i < 8; ++i)
*output++ = i;
}

void *child_stack = mmap(0, PAGE_SIZE, PROT_WRITE, MAP_PRIVATE |
MAP_ANONYMOUS | MAP_32BIT, 1, 0);

while (1) {
child_data data = { code };
clone(child_fn, child_stack, SIGCHLD, &data);
}
}

Compiled with -static and booted with "norandmaps" (for some reason that
makes a difference), this is 100% reproducible for me, although the
reproducer is somewhat sensitive to small changes that I don't quite
understand.


Vegard

2019-07-17 08:08:40

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v3 0/6] Tracing vs CR2

On Tue, Jul 16, 2019 at 09:33:50PM +0200, Vegard Nossum wrote:
> ------------[ cut here ]------------
> General protection fault in user access. Non-canonical address?
> WARNING: CPU: 0 PID: 5039 at arch/x86/mm/extable.c:126
> ex_handler_uaccess+0x5d/0x70
> CPU: 0 PID: 5039 Comm: init Not tainted 5.2.0+ #124
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> Ubuntu-1.8.2-1ubuntu1 04/01/2014
> RIP: 0010:ex_handler_uaccess+0x5d/0x70
> Code: 5d 41 5c c3 e8 c4 8e 0e 00 80 3d e5 74 1e 01 00 75 d3 e8 b6 8e 0e 00
> 48 c7 c7 10 a7 fb 81 c6 05 d0 74 1e 01 01 e8 d1 43 01 00 <0f> 0b eb b7 0f 1f
> 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89
> RSP: 0000:fffffe000000fc48 EFLAGS: 00010086
> RAX: 0000000000000000 RBX: ffffffff81c07dac RCX: ffffffff811a887c
> RDX: 0000000000000000 RSI: ffffffff8289f05f RDI: 0000000000000093
> RBP: fffffe000000fcb8 R08: 00000036fe0f15d3 R09: 000000000000003f
> R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000000d
> R13: 000000000000000d R14: 0000000000000000 R15: 0000000000000000
> FS: 00005555563ab8c0(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000001ff7 CR3: 000000003c804002 CR4: 00000000003606f0
> DR0: 0000000040209100 DR1: 00000000402091a1 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff1 DR7: 00000000000b062a
> Call Trace:
> <#DB>
> fixup_exception+0x50/0x6a
> do_general_protection+0x40/0x160
> general_protection+0x2d/0x40
> RIP: 0010:arch_stack_walk_user+0x71/0x100
> Code: 00 48 83 e8 10 49 39 c4 77 45 4c 8b 04 24 4c 89 e3 4d 89 fd 4c 89 fd
> 41 83 87 98 0a 00 00 01 0f 01 cb 0f ae e8 31 c0 4c 89 e2 <4c> 8b 33 4d 89 f4
> 85 c0 75 7a 48 8b 73 08 0f 01 ca 85 c0 74 1f 65
> RSP: 0000:fffffe000000fd68 EFLAGS: 00050046
> RAX: 0000000000000000 RBX: 854163717acc2789 RCX: ffffffff811ca27b
> RDX: 854163717acc2789 RSI: 0000000040209102 RDI: fffffe000000fdb8
> RBP: ffff88803d55d040 R08: ffffc9000520bf58 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 854163717acc2789
> R13: ffff88803d55d040 R14: 0000000000000093 R15: ffff88803d55d040
> ? stack_trace_consume_entry+0x4b/0x80
> ? arch_stack_walk_user+0x34/0x100
> ? profile_setup.cold+0xc1/0xc1
> stack_trace_save_user+0x71/0x9c
> trace_buffer_unlock_commit_regs+0x1ae/0x270
> trace_event_buffer_commit+0x90/0x240
> trace_event_raw_event_preemptirq_template+0x9a/0x100
> ? debug+0x16/0x70
> ? perf_trace_preemptirq_template+0x120/0x120
> ? trace_hardirqs_off_thunk+0x1a/0x1c
> trace_hardirqs_off_caller+0xf4/0x150
> trace_hardirqs_off_thunk+0x1a/0x1c
> ? debug+0x11/0x70
> debug+0x16/0x70
> RIP: 0010:copy_user_generic_unrolled+0xa0/0xc0
> Code: 7f 40 ff c9 75 b6 89 d1 83 e2 07 c1 e9 03 74 12 4c 8b 06 4c 89 07 48
> 8d 76 08 48 8d 7f 08 ff c9 75 ee 21 d2 74 10 89 d1 8a 06 <88> 07 48 ff c6 48
> ff c7 ff c9 75 f2 31 c0 0f 01 ca c3 0f 1f 40 00
> RSP: 0000:ffffc9000520be38 EFLAGS: 00040202
> RAX: ffff88803d55d09c RBX: ffff88803d55d040 RCX: 0000000000000001
> RDX: 0000000000000001 RSI: 0000000040209102 RDI: ffffc9000520be76
> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 00007ffffffff000
> R13: 0000000040209102 R14: ffffc9000520be76 R15: 0000000000000000
> </#DB>
> __probe_kernel_read+0x57/0x90
> is_prefetch.isra.0+0xb5/0x210
> ? tracer_hardirqs_on+0x53/0x1a0
> __bad_area_nosemaphore+0x9e/0x220
> __do_page_fault+0x483/0x630
> ? async_page_fault+0x8/0x40
> async_page_fault+0x36/0x40
> RIP: 0033:0x40209102
> Code: 00 00 49 bc 00 20 23 40 00 00 00 00 49 bd 00 00 d0 40 00 00 00 00 49
> be ff ff ff ff ff ff ff ff 49 bf 00 50 80 40 00 00 00 00 <9c> 48 81 0c 24 00
> 04 00 00 48 81 0c 24 00 00 04 00 9d ff 2c 25 00
> RSP: 002b:0000000000001fff EFLAGS: 00010217
> RAX: 0000000000000000 RBX: 00000000402090b0 RCX: 0000000000000001
> RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000041ebb000
> RBP: 854163717acc2789 R08: 0000000000000001 R09: b1f39cc399a61ebb
> R10: 00007ffeab175000 R11: 0000000000000360 R12: 0000000040232000
> R13: 0000000040d00000 R14: ffffffffffffffff R15: 0000000040805000
> ---[ end trace e5e49800ff5aa5ed ]---


https://lkml.kernel.org/r/[email protected]

Does something like the below help?

diff --git a/kernel/stacktrace.c b/kernel/stacktrace.c
index c8d0f05721a1..80ad4ccb7025 100644
--- a/kernel/stacktrace.c
+++ b/kernel/stacktrace.c
@@ -226,12 +226,16 @@ unsigned int stack_trace_save_user(unsigned long *store, unsigned int size)
.store = store,
.size = size,
};
+ mm_segment_t fs;

/* Trace user stack if not a kernel thread */
if (current->flags & PF_KTHREAD)
return 0;

+ fs = get_fs();
+ set_fs(USER_DS);
arch_stack_walk_user(consume_entry, &c, task_pt_regs(current));
+ set_fs(fs);
return c.len;
}
#endif

2019-07-17 08:12:54

by Vegard Nossum

[permalink] [raw]
Subject: Re: [PATCH v3 0/6] Tracing vs CR2

On 7/17/19 10:07 AM, Peter Zijlstra wrote:
> On Tue, Jul 16, 2019 at 09:33:50PM +0200, Vegard Nossum wrote:
>> ------------[ cut here ]------------
>> General protection fault in user access. Non-canonical address?
>> WARNING: CPU: 0 PID: 5039 at arch/x86/mm/extable.c:126
>> ex_handler_uaccess+0x5d/0x70
[...]
>
>
> https://lkml.kernel.org/r/[email protected]
>
> Does something like the below help?
>
> diff --git a/kernel/stacktrace.c b/kernel/stacktrace.c
> index c8d0f05721a1..80ad4ccb7025 100644
> --- a/kernel/stacktrace.c
> +++ b/kernel/stacktrace.c
> @@ -226,12 +226,16 @@ unsigned int stack_trace_save_user(unsigned long *store, unsigned int size)
> .store = store,
> .size = size,
> };
> + mm_segment_t fs;
>
> /* Trace user stack if not a kernel thread */
> if (current->flags & PF_KTHREAD)
> return 0;
>
> + fs = get_fs();
> + set_fs(USER_DS);
> arch_stack_walk_user(consume_entry, &c, task_pt_regs(current));
> + set_fs(fs);
> return c.len;
> }
> #endif
>

Yes.


Vegard

2019-07-17 09:38:00

by Eiichi Tsukata

[permalink] [raw]
Subject: Re: [PATCH v3 0/6] Tracing vs CR2



On 2019/07/17 6:51, Vegard Nossum wrote:
>
...
>
> Got a different one:
>
> WARNING: CPU: 0 PID: 2150 at arch/x86/kernel/traps.c:791 do_debug+0xfe/0x240
> CPU: 0 PID: 2150 Comm: init Not tainted 5.2.0+ #124
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
> RIP: 0010:do_debug+0xfe/0x240
...


Hello Vegard

I found a way to reproduce #DB WARNING by setting hardware watchpoint to
the address arch_stack_walk_user() will touch.


[Steps to Reproduce #DB WARNING]

poc.s:

```
.global _start

.text
_start:
# exit(0)
mov $60, %rax
xor %rdi, %rdi
syscall
```

build:

# gcc -g -c poc.s; ld -o poc poc.o

setup ftrace:

# echo 1 > options/userstacktrace
# echo 1 > events/preemptirq/irq_disable/enable

exec gdb:(set hardware watch point to $rbp)

[18:28:48 root@vm loops]# gdb ./poc
GNU gdb (GDB) Fedora 8.3-6.fc30
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./poc...
(gdb) l
1 .global _start
2
3 .text
4 _start:
5 # exit(0)
6 mov $60, %rax
7 xor %rdi, %rdi
8 syscall
(gdb) b 6
Breakpoint 1 at 0x401000: file poc.s, line 6.
(gdb) start
Function "main" not defined.
Make breakpoint pending on future shared library load? (y or [n]) n
Starting program: /root/tmp/loops/poc

Breakpoint 1, _start () at poc.s:6
6 mov $60, %rax
(gdb) set $rbp = $rsp
(gdb) p $rbp
$1 = (void *) 0x7fffffffe4b0
(gdb) rwatch *0x7fffffffe4b0
Hardware read watchpoint 2: *0x7fffffffe4b0
(gdb) c
Continuing.
[Inferior 1 (process 2744) exited normally]

dmesg:

[ 564.646159][ T2744] WARNING: CPU: 0 PID: 2744 at arch/x86/kernel/traps.c:791 do_debug+0x220/0x490
[ 564.648581][ T2744] Modules linked in:
[ 564.649530][ T2744] CPU: 0 PID: 2744 Comm: poc Tainted: G W 5.2.0+ #77
[ 564.651121][ T2744] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
[ 564.653569][ T2744] RIP: 0010:do_debug+0x220/0x490
[ 564.654847][ T2744] Code: 00 48 8b 95 60 ff ff ff 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 03 02 00 00 41 f6 87 88 00 00 00 03 75 60 <0f> 0b 4c 89 f2 49 81 e5 ff bf ff ff 48 b8 00 00 00 00 00 fc ff df
[ 564.659905][ T2744] RSP: 0000:fffffe0000014e98 EFLAGS: 00010046
[ 564.661500][ T2744] RAX: dffffc0000000000 RBX: 1fffffc0000029d8 RCX: 1ffff1100f81c2d3
[ 564.663531][ T2744] RDX: 1fffffc0000029fc RSI: 0000000000000000 RDI: ffffffff85c19f00
[ 564.665553][ T2744] RBP: fffffe0000014f48 R08: fffffe0000014fe8 R09: ffff88807c0e08a0
[ 564.667637][ T2744] R10: 0000000000000001 R11: 1ffff1100d1042ba R12: ffff88807c0e0000
[ 564.669700][ T2744] R13: 0000000000004001 R14: ffff88807c0e1698 R15: fffffe0000014f58
[ 564.671768][ T2744] FS: 0000000000000000(0000) GS:ffff888068800000(0000) knlGS:0000000000000000
[ 564.674032][ T2744] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 564.675752][ T2744] CR2: 0000000000000001 CR3: 000000005fe08002 CR4: 0000000000160ef0
[ 564.677570][ T2744] DR0: 00007fffffffe4b0 DR1: 0000000000000000 DR2: 0000000000000000
[ 564.679686][ T2744] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 00000000000f0602
[ 564.681788][ T2744] Call Trace:
[ 564.682700][ T2744] <#DB>
[ 564.683492][ T2744] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 564.684954][ T2744] ? do_int3+0x1f0/0x1f0
[ 564.686074][ T2744] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 564.687512][ T2744] debug+0x2d/0x70
[ 564.688456][ T2744] RIP: 0010:arch_stack_walk_user+0x7d/0xf2
[ 564.689899][ T2744] Code: 00 0f 85 8d 00 00 00 49 8b 87 d8 16 00 00 48 83 e8 10 49 39 c6 77 32 41 83 87 e8 15 00 00 01 0f 1f 00 0f ae e8 31 c0 49 8b 0e <85> c0 75 6d 49 8b 76 08 0f 1f 00 85 c0 74 1f 65 48 8b 04 25 00 ef
[ 564.694763][ T2744] RSP: 0000:ffff888061fb7c48 EFLAGS: 00000046
[ 564.696316][ T2744] RAX: 0000000000000000 RBX: ffff88807c0e0000 RCX: 0000000000000001
[ 564.698342][ T2744] RDX: 1ffff1100ba08e93 RSI: 0000000000401009 RDI: ffff888061fb7cbc
[ 564.700323][ T2744] RBP: ffff888061fb7c80 R08: 1ffff1100ba08e93 R09: ffff88805d04749c
[ 564.702337][ T2744] R10: ffffed100ba08e9b R11: ffff88805d0474db R12: ffff888061fb7cb0
[ 564.704359][ T2744] R13: ffff888061fb7f58 R14: 00007fffffffe4b0 R15: ffff88807c0e0000
[ 564.706413][ T2744] </#DB>
[ 564.707182][ T2744] ? stack_trace_save+0xc0/0xc0
[ 564.708447][ T2744] stack_trace_save_user+0x138/0x160
[ 564.709752][ T2744] ? stack_trace_save_tsk_reliable+0x210/0x210
[ 564.711235][ T2744] ? kasan_check_read+0x11/0x20
[ 564.712358][ T2744] trace_buffer_unlock_commit_regs+0x208/0x360
[ 564.713871][ T2744] trace_event_buffer_commit+0x1a0/0x790
[ 564.715278][ T2744] ? trace_event_buffer_reserve+0x163/0x240
[ 564.716742][ T2744] trace_event_raw_event_preemptirq_template+0x156/0x200
[ 564.718431][ T2744] ? perf_trace_preemptirq_template+0x490/0x490
[ 564.719761][ T2744] ? rcu_irq_enter_irqson+0x23/0x30
[ 564.721064][ T2744] ? trace_hardirqs_off+0x28/0x180
[ 564.722337][ T2744] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 564.723729][ T2744] ? debug+0x49/0x70
[ 564.724703][ T2744] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 564.726113][ T2744] ? perf_trace_preemptirq_template+0x490/0x490
[ 564.727721][ T2744] trace_hardirqs_off_caller+0x106/0x170
[ 564.729154][ T2744] ? debug+0x44/0x70
[ 564.730108][ T2744] trace_hardirqs_off_thunk+0x1a/0x1c
[ 564.731456][ T2744] debug+0x49/0x70
[ 564.732278][ T2744] RIP: 0033:0x401009
[ 564.733155][ T2744] Code: Bad RIP value.
[ 564.734019][ T2744] RSP: 002b:00007fffffffe4b0 EFLAGS: 00000302
[ 564.735366][ T2744] RAX: 000000000000003c RBX: 0000000000000000 RCX: 0000000000000000
[ 564.737290][ T2744] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 564.739314][ T2744] RBP: 00007fffffffe4b0 R08: 0000000000000000 R09: 0000000000000000
[ 564.741338][ T2744] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 564.743311][ T2744] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 564.745463][ T2744] irq event stamp: 3340
[ 564.746554][ T2744] hardirqs last enabled at (3339): [<ffffffff82a04c12>] trace_hardirqs_on_thunk+0x1a/0x1c
[ 564.748558][ T2744] hardirqs last disabled at (3340): [<ffffffff82d28b32>] rcu_irq_enter_irqson+0x12/0x30
[ 564.750479][ T2744] softirqs last enabled at (3330): [<ffffffff85a00634>] __do_softirq+0x634/0x9f1
[ 564.752737][ T2744] softirqs last disabled at (3319): [<ffffffff82b77920>] irq_exit+0x150/0x180
[ 564.754933][ T2744] ---[ end trace 67c6e66ff6ba5cd0 ]---


2019-07-18 08:59:50

by Peter Zijlstra

[permalink] [raw]
Subject: [PATCH] stacktrace: Force USER_DS for stack_trace_save_user()

On Wed, Jul 17, 2019 at 10:09:45AM +0200, Vegard Nossum wrote:
> On 7/17/19 10:07 AM, Peter Zijlstra wrote:

> > Does something like the below help?

> Yes.

Thanks!

---
Subject: stacktrace: Force USER_DS for stack_trace_save_user()
From: Peter Zijlstra <[email protected]>
Date: Thu Jul 18 10:47:47 CEST 2019

When walking userspace stacks, we should set USER_DS, otherwise
access_ok() will not function as expected.

Reported-by: Vegard Nossum <[email protected]>
Tested-by: Vegard Nossum <[email protected]>
Reported-by: Eiichi Tsukata <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
--- a/kernel/stacktrace.c
+++ b/kernel/stacktrace.c
@@ -226,12 +226,17 @@ unsigned int stack_trace_save_user(unsig
.store = store,
.size = size,
};
+ mm_segment_t fs;

/* Trace user stack if not a kernel thread */
if (current->flags & PF_KTHREAD)
return 0;

+ fs = get_fs();
+ set_fs(USER_DS);
arch_stack_walk_user(consume_entry, &c, task_pt_regs(current));
+ set_fs(fs);
+
return c.len;
}
#endif

2019-07-18 13:22:37

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH] stacktrace: Force USER_DS for stack_trace_save_user()

On Thu, Jul 18, 2019 at 10:57:54AM +0200, Peter Zijlstra wrote:
> On Wed, Jul 17, 2019 at 10:09:45AM +0200, Vegard Nossum wrote:
> > On 7/17/19 10:07 AM, Peter Zijlstra wrote:
>
> > > Does something like the below help?
>
> > Yes.
>
> Thanks!
>
> ---
> Subject: stacktrace: Force USER_DS for stack_trace_save_user()
> From: Peter Zijlstra <[email protected]>
> Date: Thu Jul 18 10:47:47 CEST 2019
>
> When walking userspace stacks, we should set USER_DS, otherwise
> access_ok() will not function as expected.
>
> Reported-by: Vegard Nossum <[email protected]>
> Tested-by: Vegard Nossum <[email protected]>
> Reported-by: Eiichi Tsukata <[email protected]>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>

Reviewed-by: Joel Fernandes (Google) <[email protected]>

thanks,

- Joel


> ---
> --- a/kernel/stacktrace.c
> +++ b/kernel/stacktrace.c
> @@ -226,12 +226,17 @@ unsigned int stack_trace_save_user(unsig
> .store = store,
> .size = size,
> };
> + mm_segment_t fs;
>
> /* Trace user stack if not a kernel thread */
> if (current->flags & PF_KTHREAD)
> return 0;
>
> + fs = get_fs();
> + set_fs(USER_DS);
> arch_stack_walk_user(consume_entry, &c, task_pt_regs(current));
> + set_fs(fs);
> +
> return c.len;
> }
> #endif

Subject: [tip:core/urgent] stacktrace: Force USER_DS for stack_trace_save_user()

Commit-ID: cac9b9a4b08304f11daace03b8b48659355e44c1
Gitweb: https://git.kernel.org/tip/cac9b9a4b08304f11daace03b8b48659355e44c1
Author: Peter Zijlstra <[email protected]>
AuthorDate: Thu, 18 Jul 2019 10:47:47 +0200
Committer: Thomas Gleixner <[email protected]>
CommitDate: Thu, 18 Jul 2019 16:47:24 +0200

stacktrace: Force USER_DS for stack_trace_save_user()

When walking userspace stacks, USER_DS needs to be set, otherwise
access_ok() will not function as expected.

Reported-by: Vegard Nossum <[email protected]>
Reported-by: Eiichi Tsukata <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Vegard Nossum <[email protected]>
Reviewed-by: Joel Fernandes (Google) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
kernel/stacktrace.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/kernel/stacktrace.c b/kernel/stacktrace.c
index e6a02b274b73..f5440abb7532 100644
--- a/kernel/stacktrace.c
+++ b/kernel/stacktrace.c
@@ -226,12 +226,17 @@ unsigned int stack_trace_save_user(unsigned long *store, unsigned int size)
.store = store,
.size = size,
};
+ mm_segment_t fs;

/* Trace user stack if not a kernel thread */
if (current->flags & PF_KTHREAD)
return 0;

+ fs = get_fs();
+ set_fs(USER_DS);
arch_stack_walk_user(consume_entry, &c, task_pt_regs(current));
+ set_fs(fs);
+
return c.len;
}
#endif

2019-07-18 20:28:47

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v3 0/6] Tracing vs CR2

Hi all-

I suspect that a bunch of the bugs you're all finding boil down to:

- Nested debug exceptions could corrupt the outer exception's DR6.
- Nested debug exceptions in which *both* exceptions came from the
kernel were probably all kinds of buggy
- Data breakpoints in bad places in the kernel were bad news

Could you give this not-quite-finished series a try?

https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/

2019-07-20 13:08:48

by Eiichi Tsukata

[permalink] [raw]
Subject: Re: [PATCH v3 0/6] Tracing vs CR2


On 2019/07/19 5:27, Andy Lutomirski wrote:
> Hi all-
>
> I suspect that a bunch of the bugs you're all finding boil down to:
>
> - Nested debug exceptions could corrupt the outer exception's DR6.
> - Nested debug exceptions in which *both* exceptions came from the
> kernel were probably all kinds of buggy
> - Data breakpoints in bad places in the kernel were bad news
>
> Could you give this not-quite-finished series a try?
>
> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/
>

Though I'm still trying to find out other cases(other areas which could
be buggy if we set hw breakpoints), as far as I tested, there is
no problem so far.

If I understand correctly, the call trace and the dr6 value will be:

====

debug() // dr6: 0xffff4ff0, user_mode: 1
TRACE_IRQS_OFF
arch_stack_user_walk()
debug() // dr6: 0xffff4ff1 == 0xffff4ff0 | 0xffff0ff1 ... (*)
do_debug()
WARN_ON_ONCE
do_debug() // dr6: 0xffff0ff0(cleared in the above do_debug())

(*) :
> * The Intel SDM says:
> *
> * Certain debug exceptions may clear bits 0-3. The remaining
> * contents of the DR6 register are never cleared by the
> * processor. To avoid confusion in identifying debug
> * exceptions, debug handlers should clear the register before
> * returning to the interrupted task.

====

Note: printk() in do_debug() can cause infinite loop(printk() ->
irq_disable() -> do_debug() -> printk() ...), so printk_deferred()
was preferable.

Thanks

Eiichi

2019-07-20 18:42:45

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v3 0/6] Tracing vs CR2

On Fri, Jul 19, 2019 at 8:59 PM Eiichi Tsukata <[email protected]> wrote:
>
>
> On 2019/07/19 5:27, Andy Lutomirski wrote:
> > Hi all-
> >
> > I suspect that a bunch of the bugs you're all finding boil down to:
> >
> > - Nested debug exceptions could corrupt the outer exception's DR6.
> > - Nested debug exceptions in which *both* exceptions came from the
> > kernel were probably all kinds of buggy
> > - Data breakpoints in bad places in the kernel were bad news
> >
> > Could you give this not-quite-finished series a try?
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/
> >
>
> Though I'm still trying to find out other cases(other areas which could
> be buggy if we set hw breakpoints), as far as I tested, there is
> no problem so far.
>
> If I understand correctly, the call trace and the dr6 value will be:
>
> ====
>
> debug() // dr6: 0xffff4ff0, user_mode: 1
> TRACE_IRQS_OFF
> arch_stack_user_walk()
> debug() // dr6: 0xffff4ff1 == 0xffff4ff0 | 0xffff0ff1 ... (*)
> do_debug()
> WARN_ON_ONCE
> do_debug() // dr6: 0xffff0ff0(cleared in the above do_debug())

The dr6 register will indeed be cleared like this, but the dr6
variable should still be 0xffff4ff0.

>
> (*) :
> > * The Intel SDM says:
> > *
> > * Certain debug exceptions may clear bits 0-3. The remaining
> > * contents of the DR6 register are never cleared by the
> > * processor. To avoid confusion in identifying debug
> > * exceptions, debug handlers should clear the register before
> > * returning to the interrupted task.
>
> ====
>
> Note: printk() in do_debug() can cause infinite loop(printk() ->
> irq_disable() -> do_debug() -> printk() ...), so printk_deferred()
> was preferable.
>

Shouldn't that be fixed with my patches? It should only be able to
recurse two deep: do_debug() from user mode can indeed trip
breakpoints, but the next do_debug() will clear DR7 in paranoid_entry.

2019-07-20 18:44:24

by Eiichi Tsukata

[permalink] [raw]
Subject: Re: [PATCH v3 0/6] Tracing vs CR2


On 2019/07/20 21:49, Andy Lutomirski wrote:
> On Fri, Jul 19, 2019 at 8:59 PM Eiichi Tsukata <[email protected]> wrote:
>>
...
>>
>> ====
>>
>> debug() // dr6: 0xffff4ff0, user_mode: 1
>> TRACE_IRQS_OFF
>> arch_stack_user_walk()
>> debug() // dr6: 0xffff4ff1 == 0xffff4ff0 | 0xffff0ff1 ... (*)
>> do_debug()
>> WARN_ON_ONCE
>> do_debug() // dr6: 0xffff0ff0(cleared in the above do_debug())
>
> The dr6 register will indeed be cleared like this, but the dr6
> variable should still be 0xffff4ff0.

I should have use DR6 to mean it is a register, not variable.
"dr6" was ambiguous.

>
>>
...
>>
>> Note: printk() in do_debug() can cause infinite loop(printk() ->
>> irq_disable() -> do_debug() -> printk() ...), so printk_deferred()
>> was preferable.
>>
>
> Shouldn't that be fixed with my patches? It should only be able to
> recurse two deep: do_debug() from user mode can indeed trip
> breakpoints, but the next do_debug() will clear DR7 in paranoid_entry.
>

Sorry, I missed that. Now I confirmed your patches fixed the loop.

Thanks

Eiichi