I was able to trigger this oops with the perf_fuzzer with current git.
I can reliably trigger this on my Haswell machine.
I haven't done any analysis yet, I might not have time to today, but I
wanted to report it in case the cause was obvious to someone else.
Vince
*** perf_fuzzer 0.32-rc0 *** by Vince Weaver
Linux version 4.20.0-rc1+ x86_64
Processor: Intel 6/60/3
Stopping after 30000
Watchdog enabled with timeout 60s
Will auto-exit if signal storm detected
Seeding RNG from time 1541627285
To reproduce, try:
echo 1 > /proc/sys/kernel/nmi_watchdog
echo 0 > /proc/sys/kernel/perf_event_paranoid
echo 1250 > /proc/sys/kernel/perf_event_max_sample_rate
./perf_fuzzer -s 30000 -r 1541627285
Fuzzing the following syscalls: mmap perf_event_open close read write ioctl fork prctl poll
Also attempting the following: signal-handler-on-overflow busy-instruction-loop accessing-perf-proc-and-sys-files trashing-the-mmap-page
Pid=14868, sleeping 1s
==================================================
Starting fuzzing at 2018-11-07 16:48:06
==================================================
Cannot open /sys/kernel/tracing/kprobe_events
Iteration 10000, 125098 syscalls in 4.90 s (25.525 k syscalls/s)
Open attempts: 117090 Successful: 951 Currently open: 47
EPERM : 11
ENOENT : 598
E2BIG : 10074
EBADF : 7879
EACCES : 4691
UNKNOWN 19 : 1
EINVAL : 92824
EOPNOTSUPP : 61
Trinity Type (Normal 163/29305)(Sampling 17/29139)(Global 719/29405)(Random 52/29241)
Type (Hardware 224/16272)(software 346/15851)(tracepoint 63/15585)(Cache 58/14732)(cpu 230/15625)(breakpoint 9/15556)(kprobe 0/948)(msr 7/940)(power 0/1021)(uncore_imc 0/924)(uncore_cbox_0 3/911)(uncore_cbox_1 3/957)(uncore_cbox_2 2/914)(uncore_cbox_3 2/860)(uncore_arb 3/873)(cstate_core 1/902)(cstate_pkg 0/1016)(i915 0/942)(#18 0/16)(>19 0/12245)
Close: 904/904 Successful
Read: 795/881 Successful
Write: 0/934 Successful
Ioctl: 328/952 Successful: (ENABLE 84/84)(DISABLE 76/76)(REFRESH 4/74)(RESET 68/68)(PERIOD 9/69)(SET_OUTPUT 14/66)(SET_FILTER 0/78)(ID 69/69)(SET_BPF 0/70)(PAUSE_OUTPUT 4/60)(QUERY_BPF 0/67)(MOD_ATTR 0/55)(#12 0/0)(#13 0/0)(#14 0/0)(>14 0/116)
Mmap: 442/1113 Successful: (MMAP 442/1113)(TRASH 111/160)(READ 98/100)(UNMAP 438/1010)(AUX 0/119)(AUX_READ 0/0)
Prctl: 952/952 Successful
Fork: 421/421 Successful
Poll: 889/905 Successful
Access: 113/876 Successful
Overflows: 0 Recursive: 0
SIGIOs due to RT signal queue full: 0
[91760.326510] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[91760.334876] PGD 0 P4D 0
[91760.337596] Oops: 0000 [#1] SMP PTI
[91760.341332] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G W 4.20.0-rc1+ #119
[91760.349816] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[91760.357723] RIP: 0010:perf_prepare_sample+0x82/0x4a0
[91760.363065] Code: 06 4c 89 ea 4c 89 e6 e8 3c 54 ff ff 40 f6 c5 01 0f 85 28 01 00 00 40 f6 c5 20 74 1c 48 85 ed 0f 89 04 01 00 00 49 8b 44 24 70 <48> 8b 00 8d 04 c5 08 00 00 00 66 01 43 06 f7 c5 00 04 00 00 74 41
[91760.383164] RSP: 0000:ffff88011ab83b80 EFLAGS: 00010086
[91760.388753] RAX: 0000000000000000 RBX: ffff88011ab83bd8 RCX: 000000000000001f
[91760.396373] RDX: 0000000000000000 RSI: 0000000025bbfcb9 RDI: 0000000000000000
[91760.404062] RBP: 80000000000b8165 R08: 0000000000000002 R09: 00000000000215c0
[91760.411678] R10: 00011b422ed4649b R11: 0000000000000000 R12: ffff88011ab83cc0
[91760.419287] R13: ffff8800a8c8c800 R14: ffff88011ab83c18 R15: ffffe8ffffd86300
[91760.426933] FS: 0000000000000000(0000) GS:ffff88011ab80000(0000) knlGS:0000000000000000
[91760.435616] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[91760.441735] CR2: 0000000000000000 CR3: 000000000200c002 CR4: 00000000001606e0
[91760.449369] DR0: 000000a4a7ffb768 DR1: 0000000000000000 DR2: 0000000000000000
[91760.457005] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
[91760.464641] Call Trace:
[91760.467265] <IRQ>
[91760.469427] intel_pmu_drain_bts_buffer+0x151/0x220
[91760.474650] ? intel_get_event_constraints+0x219/0x360
[91760.480145] ? perf_assign_events+0xe2/0x2a0
[91760.484732] ? select_idle_sibling+0x22/0x3a0
[91760.489403] ? __update_load_avg_se+0x1ec/0x270
[91760.494244] ? enqueue_task_fair+0x377/0xdd0
[91760.498832] ? cpumask_next_and+0x19/0x20
[91760.503105] ? load_balance+0x134/0x950
[91760.507239] ? check_preempt_curr+0x7a/0x90
[91760.511683] ? ttwu_do_wakeup+0x19/0x140
[91760.515877] x86_pmu_stop+0x3b/0x90
[91760.519606] x86_pmu_del+0x57/0x160
[91760.523343] event_sched_out.isra.106+0x81/0x170
[91760.528288] group_sched_out.part.108+0x51/0xc0
[91760.533151] __perf_event_disable+0x7f/0x160
[91760.537736] event_function+0x8c/0xd0
[91760.541671] remote_function+0x3c/0x50
[91760.545666] flush_smp_call_function_queue+0x35/0xe0
[91760.550979] smp_call_function_single_interrupt+0x3a/0xd0
[91760.556802] call_function_single_interrupt+0xf/0x20
[91760.562107] </IRQ>
[91760.564369] RIP: 0010:cpuidle_enter_state+0xb9/0x330
[91760.569671] Code: e8 ac a4 a7 ff 80 7c 24 0b 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 4c 02 00 00 31 ff e8 6e 30 ad ff fb 66 0f 1f 44 00 00 <85> ed 0f 88 1a 02 00 00 48 b8 ff ff ff ff f3 01 00 00 48 2b 1c 24
[91760.589707] RSP: 0000:ffffc900006ebea0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff04
[91760.597785] RAX: ffff88011aba1dc0 RBX: 000053749daa0731 RCX: 000000000000001f
[91760.605431] RDX: 000053749daa0731 RSI: 0000000025bbfcb9 RDI: 0000000000000000
[91760.613057] RBP: 0000000000000005 R08: 0000000000000002 R09: 00000000000215c0
[91760.620691] R10: 00011b422ed2ea3e R11: ffff88011aba0d84 R12: ffffffff820caa58
[91760.628311] R13: ffffe8ffffd93370 R14: 0000000000000005 R15: 0000000000000000
[91760.635981] do_idle+0x208/0x240
[91760.639429] cpu_startup_entry+0x19/0x20
[91760.643591] start_secondary+0x195/0x1d0
[91760.647786] secondary_startup_64+0xa4/0xb0
[91760.652249] Modules linked in: intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_codec_realtek kvm snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core irqbypass i915 snd_hwdep crct10dif_pclmul iosf_mbi drm_kms_helper tpm_tis tpm_tis_core drm snd_pcm crc32_pclmul mei_me ghash_clmulni_intel i2c_algo_bit tpm snd_timer aesni_intel rng_core evdev video mei snd wmi_bmof sg aes_x86_64 pcspkr iTCO_wdt iTCO_vendor_support soundcore wmi pcc_cpufreq crypto_simd button cryptd glue_helper binfmt_misc ip_tables x_tables autofs4 sr_mod sd_mod cdrom ahci libahci xhci_pci ehci_pci libata xhci_hcd ehci_hcd scsi_mod usbcore lpc_ich e1000e crc32c_intel i2c_i801 mfd_core usb_common fan thermal
[91760.721157] CR2: 0000000000000000
[91760.724710] ---[ end trace d94a9891f848ef0a ]---
[91760.729652] RIP: 0010:perf_prepare_sample+0x82/0x4a0
[91760.734963] Code: 06 4c 89 ea 4c 89 e6 e8 3c 54 ff ff 40 f6 c5 01 0f 85 28 01 00 00 40 f6 c5 20 74 1c 48 85 ed 0f 89 04 01 00 00 49 8b 44 24 70 <48> 8b 00 8d 04 c5 08 00 00 00 66 01 43 06 f7 c5 00 04 00 00 74 41
[91760.755044] RSP: 0000:ffff88011ab83b80 EFLAGS: 00010086
[91760.760641] RAX: 0000000000000000 RBX: ffff88011ab83bd8 RCX: 000000000000001f
[91760.768294] RDX: 0000000000000000 RSI: 0000000025bbfcb9 RDI: 0000000000000000
[91760.775906] RBP: 80000000000b8165 R08: 0000000000000002 R09: 00000000000215c0
[91760.783514] R10: 00011b422ed4649b R11: 0000000000000000 R12: ffff88011ab83cc0
[91760.791134] R13: ffff8800a8c8c800 R14: ffff88011ab83c18 R15: ffffe8ffffd86300
[91760.798742] FS: 0000000000000000(0000) GS:ffff88011ab80000(0000) knlGS:0000000000000000
[91760.807383] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[91760.813560] CR2: 0000000000000000 CR3: 000000000200c002 CR4: 00000000001606e0
[91760.821197] DR0: 000000a4a7ffb768 DR1: 0000000000000000 DR2: 0000000000000000
[91760.828806] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
[91760.836434] Kernel panic - not syncing: Fatal exception in interrupt
[91760.843232] Kernel Offset: disabled
[91760.846971] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
[91760.855081] ------------[ cut here ]------------
[91760.860023] sched: Unexpected reschedule of offline CPU#5!
[91760.865928] WARNING: CPU: 6 PID: 0 at arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x34/0x40
[91760.875711] Modules linked in: intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_codec_realtek kvm snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core irqbypass i915 snd_hwdep crct10dif_pclmul iosf_mbi drm_kms_helper tpm_tis tpm_tis_core drm snd_pcm crc32_pclmul mei_me ghash_clmulni_intel i2c_algo_bit tpm snd_timer aesni_intel rng_core evdev video mei snd wmi_bmof sg aes_x86_64 pcspkr iTCO_wdt iTCO_vendor_support soundcore wmi pcc_cpufreq crypto_simd button cryptd glue_helper binfmt_misc ip_tables x_tables autofs4 sr_mod sd_mod cdrom ahci libahci xhci_pci ehci_pci libata xhci_hcd ehci_hcd scsi_mod usbcore lpc_ich e1000e crc32c_intel i2c_i801 mfd_core usb_common fan thermal
[91760.944533] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G D W 4.20.0-rc1+ #119
[91760.953006] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[91760.960923] RIP: 0010:native_smp_send_reschedule+0x34/0x40
[91760.966790] Code: 05 41 b1 0d 01 73 15 48 8b 05 f8 2e e8 00 be fd 00 00 00 48 8b 40 30 e9 0a 95 9a 00 89 fe 48 c7 c7 a0 10 e4 81 e8 a6 94 02 00 <0f> 0b c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 05 c5 df 5d
[91760.986832] RSP: 0000:ffff88011ab835e0 EFLAGS: 00010082
[91760.992431] RAX: 0000000000000000 RBX: ffff88011ab61dc0 RCX: 0000000000000006
[91761.000060] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff88011ab965f0
[91761.007706] RBP: ffff8801153ae140 R08: 0000000000000005 R09: 000000000000000f
[91761.015340] R10: 0000000000000000 R11: ffffffff8263d76d R12: ffff8801153ae7f8
[91761.022932] R13: ffff88011ab83630 R14: 0000000000000046 R15: 0000000000021dc0
[91761.030592] FS: 0000000000000000(0000) GS:ffff88011ab80000(0000) knlGS:0000000000000000
[91761.039243] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[91761.045385] CR2: 0000000000000000 CR3: 000000000200c002 CR4: 00000000001606e0
[91761.053014] DR0: 000000a4a7ffb768 DR1: 0000000000000000 DR2: 0000000000000000
[91761.060623] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
[91761.068230] Call Trace:
[91761.070851] <IRQ>
[91761.073008] check_preempt_curr+0x7a/0x90
[91761.077317] ttwu_do_wakeup+0x19/0x140
[91761.081320] try_to_wake_up+0x1d2/0x470
[91761.085394] ? enqueue_task_fair+0x377/0xdd0
[91761.089958] __wake_up_common+0x70/0x130
[91761.094147] ep_poll_callback+0xcb/0x2b0
[91761.098347] __wake_up_common+0x70/0x130
[91761.102561] __wake_up_common_lock+0x7c/0xc0
[91761.107146] ? tick_sched_do_timer+0x60/0x60
[91761.111695] irq_work_run_list+0x4d/0x70
[91761.115900] update_process_times+0x4a/0x60
[91761.120388] tick_sched_handle+0x22/0x60
[91761.124577] tick_sched_timer+0x37/0x70
[91761.128693] __hrtimer_run_queues+0x10c/0x280
[91761.133327] hrtimer_interrupt+0x100/0x220
[91761.137705] smp_apic_timer_interrupt+0x6a/0x140
[91761.142644] apic_timer_interrupt+0xf/0x20
[91761.147022] RIP: 0010:panic+0x21b/0x261
[91761.151111] Code: eb a6 83 3d 9d 3a 5b 01 00 74 05 e8 fe 57 02 00 48 c7 c6 c0 4f 63 82 48 c7 c7 88 8c e4 81 e8 c9 fe 05 00 fb 66 0f 1f 44 00 00 <31> db e8 8d e4 0b 00 4c 39 eb 7c 1d 41 83 f4 01 48 8b 05 5d 3a 5b
[91761.171166] RSP: 0000:ffff88011ab839e0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[91761.179277] RAX: 0000000000000046 RBX: 0000000000000000 RCX: 0000000000000006
[91761.186897] RDX: 0000000000000000 RSI: 0000000000000086 RDI: ffff88011ab965f0
[91761.194514] RBP: ffff88011ab83a50 R08: 0000000000000005 R09: 000000000000000f
[91761.202142] R10: 0000000000000000 R11: ffffffff8263d76d R12: 0000000000000000
[91761.209759] R13: 0000000000000000 R14: 0000000000000009 R15: 0000000000000001
[91761.217385] ? apic_timer_interrupt+0xa/0x20
[91761.221937] oops_end.cold.10+0xc/0x18
[91761.225940] no_context+0x1bb/0x380
[91761.229666] page_fault+0x1e/0x30
[91761.233219] RIP: 0010:perf_prepare_sample+0x82/0x4a0
[91761.238515] Code: 06 4c 89 ea 4c 89 e6 e8 3c 54 ff ff 40 f6 c5 01 0f 85 28 01 00 00 40 f6 c5 20 74 1c 48 85 ed 0f 89 04 01 00 00 49 8b 44 24 70 <48> 8b 00 8d 04 c5 08 00 00 00 66 01 43 06 f7 c5 00 04 00 00 74 41
[91761.258559] RSP: 0000:ffff88011ab83b80 EFLAGS: 00010086
[91761.264134] RAX: 0000000000000000 RBX: ffff88011ab83bd8 RCX: 000000000000001f
[91761.271745] RDX: 0000000000000000 RSI: 0000000025bbfcb9 RDI: 0000000000000000
[91761.279370] RBP: 80000000000b8165 R08: 0000000000000002 R09: 00000000000215c0
[91761.286987] R10: 00011b422ed4649b R11: 0000000000000000 R12: ffff88011ab83cc0
[91761.294625] R13: ffff8800a8c8c800 R14: ffff88011ab83c18 R15: ffffe8ffffd86300
[91761.302277] intel_pmu_drain_bts_buffer+0x151/0x220
[91761.307513] ? intel_get_event_constraints+0x219/0x360
[91761.313016] ? perf_assign_events+0xe2/0x2a0
[91761.317598] ? select_idle_sibling+0x22/0x3a0
[91761.322245] ? __update_load_avg_se+0x1ec/0x270
[91761.327074] ? enqueue_task_fair+0x377/0xdd0
[91761.331641] ? cpumask_next_and+0x19/0x20
[91761.335910] ? load_balance+0x134/0x950
[91761.340013] ? check_preempt_curr+0x7a/0x90
[91761.344484] ? ttwu_do_wakeup+0x19/0x140
[91761.348686] x86_pmu_stop+0x3b/0x90
[91761.352448] x86_pmu_del+0x57/0x160
[91761.356208] event_sched_out.isra.106+0x81/0x170
[91761.361183] group_sched_out.part.108+0x51/0xc0
[91761.366020] __perf_event_disable+0x7f/0x160
[91761.370615] event_function+0x8c/0xd0
[91761.374540] remote_function+0x3c/0x50
[91761.378553] flush_smp_call_function_queue+0x35/0xe0
[91761.383855] smp_call_function_single_interrupt+0x3a/0xd0
[91761.389627] call_function_single_interrupt+0xf/0x20
[91761.394932] </IRQ>
[91761.397176] RIP: 0010:cpuidle_enter_state+0xb9/0x330
[91761.402471] Code: e8 ac a4 a7 ff 80 7c 24 0b 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 4c 02 00 00 31 ff e8 6e 30 ad ff fb 66 0f 1f 44 00 00 <85> ed 0f 88 1a 02 00 00 48 b8 ff ff ff ff f3 01 00 00 48 2b 1c 24
[91761.422518] RSP: 0000:ffffc900006ebea0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff04
[91761.430619] RAX: ffff88011aba1dc0 RBX: 000053749daa0731 RCX: 000000000000001f
[91761.438211] RDX: 000053749daa0731 RSI: 0000000025bbfcb9 RDI: 0000000000000000
[91761.445865] RBP: 0000000000000005 R08: 0000000000000002 R09: 00000000000215c0
[91761.453492] R10: 00011b422ed2ea3e R11: ffff88011aba0d84 R12: ffffffff820caa58
[91761.461094] R13: ffffe8ffffd93370 R14: 0000000000000005 R15: 0000000000000000
[91761.468736] do_idle+0x208/0x240
[91761.472189] cpu_startup_entry+0x19/0x20
[91761.476392] start_secondary+0x195/0x1d0
[91761.480576] secondary_startup_64+0xa4/0xb0
[91761.485050] ---[ end trace d94a9891f848ef0b ]---
On Thu, 8 Nov 2018, Vince Weaver wrote:
> [91760.326510] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> [91760.334876] PGD 0 P4D 0
> [91760.337596] Oops: 0000 [#1] SMP PTI
> [91760.341332] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G W 4.20.0-rc1+ #119
> [91760.349816] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
> [91760.357723] RIP: 0010:perf_prepare_sample+0x82/0x4a0
so what's the best way to do the equivelent of addr2line on something like
this, now that we aren't allowed to know the RIP anymore?
I probably knew at one point but I've spent the last 3 months doing 6502
assembly language Demoscene coding and so apparently I've forgotten
everything I once knew about x86_64 kernel interfaces. (As can be
imagined, Demoscene coding work is a lot more mentally rewarding than
perf_fuzzer work).
Vince
Vince Weaver <[email protected]> writes:
> On Thu, 8 Nov 2018, Vince Weaver wrote:
>
>> [91760.326510] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
>> [91760.334876] PGD 0 P4D 0
>> [91760.337596] Oops: 0000 [#1] SMP PTI
>> [91760.341332] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G W 4.20.0-rc1+ #119
>> [91760.349816] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
>> [91760.357723] RIP: 0010:perf_prepare_sample+0x82/0x4a0
>
> so what's the best way to do the equivelent of addr2line on something like
> this, now that we aren't allowed to know the RIP anymore?
scripts/decode_stacktrace.sh works most of the time.
Sounds like BTS needs fixing up again. Thanks for looking at it though!
Regards,
--
Alex
On Thu, 8 Nov 2018, Alexander Shishkin wrote:
> Vince Weaver <[email protected]> writes:
>
> > On Thu, 8 Nov 2018, Vince Weaver wrote:
> >
> >> [91760.326510] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> >> [91760.334876] PGD 0 P4D 0
> >> [91760.337596] Oops: 0000 [#1] SMP PTI
> >> [91760.341332] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G W 4.20.0-rc1+ #119
> >> [91760.349816] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
> >> [91760.357723] RIP: 0010:perf_prepare_sample+0x82/0x4a0
> >
> > so what's the best way to do the equivelent of addr2line on something like
> > this, now that we aren't allowed to know the RIP anymore?
>
> scripts/decode_stacktrace.sh works most of the time.
>
> Sounds like BTS needs fixing up again. Thanks for looking at it though!
In case it matters, it looks like the address of the oops comes down to
linux.git/kernel/events/core.c:6393
size += data->callchain->nr;
Vince
On Thu, Nov 08, 2018 at 11:46:41AM -0500, Vince Weaver wrote:
> On Thu, 8 Nov 2018, Alexander Shishkin wrote:
>
> > Vince Weaver <[email protected]> writes:
> >
> > > On Thu, 8 Nov 2018, Vince Weaver wrote:
> > >
> > >> [91760.326510] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> > >> [91760.334876] PGD 0 P4D 0
> > >> [91760.337596] Oops: 0000 [#1] SMP PTI
> > >> [91760.341332] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G W 4.20.0-rc1+ #119
> > >> [91760.349816] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
> > >> [91760.357723] RIP: 0010:perf_prepare_sample+0x82/0x4a0
> > >
> > > so what's the best way to do the equivelent of addr2line on something like
> > > this, now that we aren't allowed to know the RIP anymore?
> >
> > scripts/decode_stacktrace.sh works most of the time.
> >
> > Sounds like BTS needs fixing up again. Thanks for looking at it though!
>
> In case it matters, it looks like the address of the oops comes down to
>
> linux.git/kernel/events/core.c:6393
>
> size += data->callchain->nr;
>
nice ;-) we can actual fake cpu event to become the bts event
and relay on that EARLY callchain stuff
I can bring my server down by:
perf record -e cpu/event=0xc4/p -g -c 1
where 0xc4 is the branch instructions events
I guess something like below could prevent it,
but haven't tested it yet, will do next week
jirka
---
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index b7b01d762d32..1049b547fdfe 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -577,6 +577,8 @@ void intel_pmu_disable_bts(void)
update_debugctlmsr(debugctlmsr);
}
+static struct perf_callchain_entry __empty_callchain = { .nr = 0, };
+
int intel_pmu_drain_bts_buffer(void)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
@@ -612,6 +614,9 @@ int intel_pmu_drain_bts_buffer(void)
perf_sample_data_init(&data, 0, event->hw.last_period);
+ if (event->attr.sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY)
+ data.callchain = &__empty_callchain;
+
/*
* BTS leaks kernel addresses in branches across the cpl boundary,
* such as traps or system calls, so unless the user is asking for