LinuxLists.cc - [perf] perf_fuzzer causes crash in intel_pmu_drain_pebs

2021-01-28 14:29:29

Subject: [perf] perf_fuzzer causes crash in intel_pmu_drain_pebs_nhm()

Hello

the perf_fuzzer has turned up a repeatable crash on my haswell system.

addr2line is not being very helpful, it points to DECLARE_PER_CPU_FIRST.
I'll investigate more when I have the chance.

Vince

[96289.009646] BUG: kernel NULL pointer dereference, address: 0000000000000150
[96289.017094] #PF: supervisor read access in kernel mode
[96289.022588] #PF: error_code(0x0000) - not-present page
[96289.028069] PGD 0 P4D 0
[96289.030796] Oops: 0000 [#1] SMP PTI
[96289.034549] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 5.11.0-rc5+ #151
[96289.043059] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[96289.050946] RIP: 0010:intel_pmu_drain_pebs_nhm+0x464/0x5f0
[96289.056817] Code: 09 00 00 0f b6 c0 49 39 c4 74 2a 48 63 82 78 09 00 00 48 01 c5 48 39 6c 24 08 76 17 0f b6 05 14 70 3f 01 83 e0 0f 3c 03 77 a4 <48> 8b 85 90 00 00 00 eb 9f 31 ed 83 eb 01 83 fb 01 0f 85 30 ff ff
[96289.076876] RSP: 0000:ffffffff822039e0 EFLAGS: 00010097
[96289.082468] RAX: 0000000000000002 RBX: 0000000000000155 RCX: 0000000000000008
[96289.090095] RDX: ffff88811ac118a0 RSI: ffffffff82203980 RDI: ffffffff82203980
[96289.097746] RBP: 00000000000000c0 R08: 0000000000000000 R09: 0000000000000000
[96289.105376] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[96289.113008] R13: ffffffff82203bc0 R14: ffff88801c3cf800 R15: ffffffff829814a0
[96289.120671] FS: 0000000000000000(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000
[96289.129346] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[96289.135526] CR2: 0000000000000150 CR3: 000000000220c003 CR4: 00000000001706f0
[96289.143159] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[96289.150803] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[96289.158414] Call Trace:
[96289.161041] ? update_blocked_averages+0x532/0x620
[96289.166152] ? update_group_capacity+0x25/0x1d0
[96289.171025] ? cpumask_next_and+0x19/0x20
[96289.175339] ? update_sd_lb_stats.constprop.0+0x702/0x820
[96289.181105] intel_pmu_drain_pebs_buffer+0x33/0x50
[96289.186259] ? x86_pmu_commit_txn+0xbc/0xf0
[96289.190749] ? _raw_spin_lock_irqsave+0x1d/0x30
[96289.195603] ? timerqueue_add+0x64/0xb0
[96289.199720] ? update_load_avg+0x6c/0x5e0
[96289.204001] ? enqueue_task_fair+0x98/0x5a0
[96289.208464] ? timerqueue_del+0x1e/0x40
[96289.212556] ? uncore_msr_read_counter+0x10/0x20
[96289.217513] intel_pmu_pebs_disable+0x12a/0x130
[96289.222324] x86_pmu_stop+0x48/0xa0
[96289.226076] x86_pmu_del+0x40/0x160
[96289.229813] event_sched_out.isra.0+0x81/0x1e0
[96289.234602] group_sched_out.part.0+0x4f/0xc0
[96289.239257] __perf_event_disable+0xef/0x1d0
[96289.243831] event_function+0x8c/0xd0
[96289.247785] remote_function+0x3e/0x50
[96289.251797] flush_smp_call_function_queue+0x11b/0x1a0
[96289.257268] flush_smp_call_function_from_idle+0x38/0x60
[96289.262944] do_idle+0x15f/0x240
[96289.266421] cpu_startup_entry+0x19/0x20
[96289.270639] start_kernel+0x7df/0x804
[96289.274558] ? apply_microcode_early.cold+0xc/0x27
[96289.279678] secondary_startup_64_no_verify+0xb0/0xbb
[96289.285078] Modules linked in: nf_tables libcrc32c nfnetlink intel_rapl_msr intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi x86_pkg_temp_thermal ledtrig_audio intel_powerclamp snd_hda_intel coretemp snd_intel_dspcfg snd_hda_codec snd_hda_core kvm_intel kvm snd_hwdep irqbypass at24 snd_pcm tpm_tis crct10dif_pclmul snd_timer crc32_pclmul regmap_i2c wmi_bmof sg tpm_tis_core snd ghash_clmulni_intel tpm iTCO_wdt aesni_intel soundcore rng_core iTCO_vendor_support crypto_simd mei_me mei cryptd pcspkr evdev glue_helper binfmt_misc ip_tables x_tables autofs4 sr_mod sd_mod t10_pi cdrom i915 iosf_mbi ahci i2c_algo_bit libahci drm_kms_helper xhci_pci ehci_pci ehci_hcd libata xhci_hcd lpc_ich usbcore i2c_i801 drm crc32c_intel e1000e mfd_core scsi_mod usb_common i2c_smbus wmi fan thermal video button
[96289.362498] CR2: 0000000000000150
[96289.366070] ---[ end trace 80c577f99562015f ]---
[96289.371007] RIP: 0010:intel_pmu_drain_pebs_nhm+0x464/0x5f0
[96289.376868] Code: 09 00 00 0f b6 c0 49 39 c4 74 2a 48 63 82 78 09 00 00 48 01 c5 48 39 6c 24 08 76 17 0f b6 05 14 70 3f 01 83 e0 0f 3c 03 77 a4 <48> 8b 85 90 00 00 00 eb 9f 31 ed 83 eb 01 83 fb 01 0f 85 30 ff ff
[96289.396981] RSP: 0000:ffffffff822039e0 EFLAGS: 00010097
[96289.402573] RAX: 0000000000000002 RBX: 0000000000000155 RCX: 0000000000000008
[96289.410226] RDX: ffff88811ac118a0 RSI: ffffffff82203980 RDI: ffffffff82203980
[96289.417841] RBP: 00000000000000c0 R08: 0000000000000000 R09: 0000000000000000
[96289.425461] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[96289.433122] R13: ffffffff82203bc0 R14: ffff88801c3cf800 R15: ffffffff829814a0
[96289.440774] FS: 0000000000000000(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000
[96289.449374] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[96289.455507] CR2: 0000000000000150 CR3: 000000000220c003 CR4: 00000000001706f0
[96289.463119] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[96289.470764] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[96289.478408] Kernel panic - not syncing: Attempted to kill the idle task!
[96289.485598] Kernel Offset: disabled
[96289.489355] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

2021-01-28 20:02:19

by Vince Weaver

[permalink] [raw]

Subject: Re: [perf] perf_fuzzer causes crash in intel_pmu_drain_pebs_nhm()

On Thu, 28 Jan 2021, Vince Weaver wrote:

> the perf_fuzzer has turned up a repeatable crash on my haswell system.
>
> addr2line is not being very helpful, it points to DECLARE_PER_CPU_FIRST.
> I'll investigate more when I have the chance.

so I poked around some more.

This seems to be caused in

__intel_pmu_pebs_event()
get_next_pebs_record_by_bit() ds.c line 1639
get_pebs_status(at) ds.c line 1317
return ((struct pebs_record_nhm *)n)->status;

where "n" has the value of 0xc0 rather than a proper pointer.

this does seem to be repetable, but fairly deep in a fuzzing run so I
don't have a quick reproducer.

Vince

> [96289.009646] BUG: kernel NULL pointer dereference, address: 0000000000000150
> [96289.017094] #PF: supervisor read access in kernel mode
> [96289.022588] #PF: error_code(0x0000) - not-present page
> [96289.028069] PGD 0 P4D 0
> [96289.030796] Oops: 0000 [#1] SMP PTI
> [96289.034549] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 5.11.0-rc5+ #151
> [96289.043059] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
> [96289.050946] RIP: 0010:intel_pmu_drain_pebs_nhm+0x464/0x5f0
> [96289.056817] Code: 09 00 00 0f b6 c0 49 39 c4 74 2a 48 63 82 78 09 00 00 48 01 c5 48 39 6c 24 08 76 17 0f b6 05 14 70 3f 01 83 e0 0f 3c 03 77 a4 <48> 8b 85 90 00 00 00 eb 9f 31 ed 83 eb 01 83 fb 01 0f 85 30 ff ff
> [96289.076876] RSP: 0000:ffffffff822039e0 EFLAGS: 00010097
> [96289.082468] RAX: 0000000000000002 RBX: 0000000000000155 RCX: 0000000000000008
> [96289.090095] RDX: ffff88811ac118a0 RSI: ffffffff82203980 RDI: ffffffff82203980
> [96289.097746] RBP: 00000000000000c0 R08: 0000000000000000 R09: 0000000000000000
> [96289.105376] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
> [96289.113008] R13: ffffffff82203bc0 R14: ffff88801c3cf800 R15: ffffffff829814a0
> [96289.120671] FS: 0000000000000000(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000
> [96289.129346] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [96289.135526] CR2: 0000000000000150 CR3: 000000000220c003 CR4: 00000000001706f0
> [96289.143159] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [96289.150803] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
> [96289.158414] Call Trace:
> [96289.161041] ? update_blocked_averages+0x532/0x620
> [96289.166152] ? update_group_capacity+0x25/0x1d0
> [96289.171025] ? cpumask_next_and+0x19/0x20
> [96289.175339] ? update_sd_lb_stats.constprop.0+0x702/0x820
> [96289.181105] intel_pmu_drain_pebs_buffer+0x33/0x50
> [96289.186259] ? x86_pmu_commit_txn+0xbc/0xf0
> [96289.190749] ? _raw_spin_lock_irqsave+0x1d/0x30
> [96289.195603] ? timerqueue_add+0x64/0xb0
> [96289.199720] ? update_load_avg+0x6c/0x5e0
> [96289.204001] ? enqueue_task_fair+0x98/0x5a0
> [96289.208464] ? timerqueue_del+0x1e/0x40
> [96289.212556] ? uncore_msr_read_counter+0x10/0x20
> [96289.217513] intel_pmu_pebs_disable+0x12a/0x130
> [96289.222324] x86_pmu_stop+0x48/0xa0
> [96289.226076] x86_pmu_del+0x40/0x160
> [96289.229813] event_sched_out.isra.0+0x81/0x1e0
> [96289.234602] group_sched_out.part.0+0x4f/0xc0
> [96289.239257] __perf_event_disable+0xef/0x1d0
> [96289.243831] event_function+0x8c/0xd0
> [96289.247785] remote_function+0x3e/0x50
> [96289.251797] flush_smp_call_function_queue+0x11b/0x1a0
> [96289.257268] flush_smp_call_function_from_idle+0x38/0x60
> [96289.262944] do_idle+0x15f/0x240
> [96289.266421] cpu_startup_entry+0x19/0x20
> [96289.270639] start_kernel+0x7df/0x804
> [96289.274558] ? apply_microcode_early.cold+0xc/0x27
> [96289.279678] secondary_startup_64_no_verify+0xb0/0xbb
> [96289.285078] Modules linked in: nf_tables libcrc32c nfnetlink intel_rapl_msr intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi x86_pkg_temp_thermal ledtrig_audio intel_powerclamp snd_hda_intel coretemp snd_intel_dspcfg snd_hda_codec snd_hda_core kvm_intel kvm snd_hwdep irqbypass at24 snd_pcm tpm_tis crct10dif_pclmul snd_timer crc32_pclmul regmap_i2c wmi_bmof sg tpm_tis_core snd ghash_clmulni_intel tpm iTCO_wdt aesni_intel soundcore rng_core iTCO_vendor_support crypto_simd mei_me mei cryptd pcspkr evdev glue_helper binfmt_misc ip_tables x_tables autofs4 sr_mod sd_mod t10_pi cdrom i915 iosf_mbi ahci i2c_algo_bit libahci drm_kms_helper xhci_pci ehci_pci ehci_hcd libata xhci_hcd lpc_ich usbcore i2c_i801 drm crc32c_intel e1000e mfd_core scsi_mod usb_common i2c_smbus wmi fan thermal video button
> [96289.362498] CR2: 0000000000000150
> [96289.366070] ---[ end trace 80c577f99562015f ]---
> [96289.371007] RIP: 0010:intel_pmu_drain_pebs_nhm+0x464/0x5f0
> [96289.376868] Code: 09 00 00 0f b6 c0 49 39 c4 74 2a 48 63 82 78 09 00 00 48 01 c5 48 39 6c 24 08 76 17 0f b6 05 14 70 3f 01 83 e0 0f 3c 03 77 a4 <48> 8b 85 90 00 00 00 eb 9f 31 ed 83 eb 01 83 fb 01 0f 85 30 ff ff
> [96289.396981] RSP: 0000:ffffffff822039e0 EFLAGS: 00010097
> [96289.402573] RAX: 0000000000000002 RBX: 0000000000000155 RCX: 0000000000000008
> [96289.410226] RDX: ffff88811ac118a0 RSI: ffffffff82203980 RDI: ffffffff82203980
> [96289.417841] RBP: 00000000000000c0 R08: 0000000000000000 R09: 0000000000000000
> [96289.425461] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
> [96289.433122] R13: ffffffff82203bc0 R14: ffff88801c3cf800 R15: ffffffff829814a0
> [96289.440774] FS: 0000000000000000(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000
> [96289.449374] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [96289.455507] CR2: 0000000000000150 CR3: 000000000220c003 CR4: 00000000001706f0
> [96289.463119] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [96289.470764] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
> [96289.478408] Kernel panic - not syncing: Attempted to kill the idle task!
> [96289.485598] Kernel Offset: disabled
> [96289.489355] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
>
>

2021-02-11 15:23:16

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [perf] perf_fuzzer causes crash in intel_pmu_drain_pebs_nhm()

Kan, do you have time to look at this?

On Thu, Jan 28, 2021 at 02:49:47PM -0500, Vince Weaver wrote:
> On Thu, 28 Jan 2021, Vince Weaver wrote:
>
> > the perf_fuzzer has turned up a repeatable crash on my haswell system.
> >
> > addr2line is not being very helpful, it points to DECLARE_PER_CPU_FIRST.
> > I'll investigate more when I have the chance.
>
> so I poked around some more.
>
> This seems to be caused in
>
> __intel_pmu_pebs_event()
> get_next_pebs_record_by_bit() ds.c line 1639
> get_pebs_status(at) ds.c line 1317
> return ((struct pebs_record_nhm *)n)->status;
>
> where "n" has the value of 0xc0 rather than a proper pointer.
>
> this does seem to be repetable, but fairly deep in a fuzzing run so I
> don't have a quick reproducer.
>
> Vince
>
>
> > [96289.009646] BUG: kernel NULL pointer dereference, address: 0000000000000150
> > [96289.017094] #PF: supervisor read access in kernel mode
> > [96289.022588] #PF: error_code(0x0000) - not-present page
> > [96289.028069] PGD 0 P4D 0
> > [96289.030796] Oops: 0000 [#1] SMP PTI
> > [96289.034549] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 5.11.0-rc5+ #151
> > [96289.043059] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
> > [96289.050946] RIP: 0010:intel_pmu_drain_pebs_nhm+0x464/0x5f0
> > [96289.056817] Code: 09 00 00 0f b6 c0 49 39 c4 74 2a 48 63 82 78 09 00 00 48 01 c5 48 39 6c 24 08 76 17 0f b6 05 14 70 3f 01 83 e0 0f 3c 03 77 a4 <48> 8b 85 90 00 00 00 eb 9f 31 ed 83 eb 01 83 fb 01 0f 85 30 ff ff
> > [96289.076876] RSP: 0000:ffffffff822039e0 EFLAGS: 00010097
> > [96289.082468] RAX: 0000000000000002 RBX: 0000000000000155 RCX: 0000000000000008
> > [96289.090095] RDX: ffff88811ac118a0 RSI: ffffffff82203980 RDI: ffffffff82203980
> > [96289.097746] RBP: 00000000000000c0 R08: 0000000000000000 R09: 0000000000000000
> > [96289.105376] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
> > [96289.113008] R13: ffffffff82203bc0 R14: ffff88801c3cf800 R15: ffffffff829814a0
> > [96289.120671] FS: 0000000000000000(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000
> > [96289.129346] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [96289.135526] CR2: 0000000000000150 CR3: 000000000220c003 CR4: 00000000001706f0
> > [96289.143159] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [96289.150803] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
> > [96289.158414] Call Trace:
> > [96289.161041] ? update_blocked_averages+0x532/0x620
> > [96289.166152] ? update_group_capacity+0x25/0x1d0
> > [96289.171025] ? cpumask_next_and+0x19/0x20
> > [96289.175339] ? update_sd_lb_stats.constprop.0+0x702/0x820
> > [96289.181105] intel_pmu_drain_pebs_buffer+0x33/0x50
> > [96289.186259] ? x86_pmu_commit_txn+0xbc/0xf0
> > [96289.190749] ? _raw_spin_lock_irqsave+0x1d/0x30
> > [96289.195603] ? timerqueue_add+0x64/0xb0
> > [96289.199720] ? update_load_avg+0x6c/0x5e0
> > [96289.204001] ? enqueue_task_fair+0x98/0x5a0
> > [96289.208464] ? timerqueue_del+0x1e/0x40
> > [96289.212556] ? uncore_msr_read_counter+0x10/0x20
> > [96289.217513] intel_pmu_pebs_disable+0x12a/0x130
> > [96289.222324] x86_pmu_stop+0x48/0xa0
> > [96289.226076] x86_pmu_del+0x40/0x160
> > [96289.229813] event_sched_out.isra.0+0x81/0x1e0
> > [96289.234602] group_sched_out.part.0+0x4f/0xc0
> > [96289.239257] __perf_event_disable+0xef/0x1d0
> > [96289.243831] event_function+0x8c/0xd0
> > [96289.247785] remote_function+0x3e/0x50
> > [96289.251797] flush_smp_call_function_queue+0x11b/0x1a0
> > [96289.257268] flush_smp_call_function_from_idle+0x38/0x60
> > [96289.262944] do_idle+0x15f/0x240
> > [96289.266421] cpu_startup_entry+0x19/0x20
> > [96289.270639] start_kernel+0x7df/0x804
> > [96289.274558] ? apply_microcode_early.cold+0xc/0x27
> > [96289.279678] secondary_startup_64_no_verify+0xb0/0xbb
> > [96289.285078] Modules linked in: nf_tables libcrc32c nfnetlink intel_rapl_msr intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi x86_pkg_temp_thermal ledtrig_audio intel_powerclamp snd_hda_intel coretemp snd_intel_dspcfg snd_hda_codec snd_hda_core kvm_intel kvm snd_hwdep irqbypass at24 snd_pcm tpm_tis crct10dif_pclmul snd_timer crc32_pclmul regmap_i2c wmi_bmof sg tpm_tis_core snd ghash_clmulni_intel tpm iTCO_wdt aesni_intel soundcore rng_core iTCO_vendor_support crypto_simd mei_me mei cryptd pcspkr evdev glue_helper binfmt_misc ip_tables x_tables autofs4 sr_mod sd_mod t10_pi cdrom i915 iosf_mbi ahci i2c_algo_bit libahci drm_kms_helper xhci_pci ehci_pci ehci_hcd libata xhci_hcd lpc_ich usbcore i2c_i801 drm crc32c_intel e1000e mfd_core scsi_mod usb_common i2c_smbus wmi fan thermal video button
> > [96289.362498] CR2: 0000000000000150
> > [96289.366070] ---[ end trace 80c577f99562015f ]---
> > [96289.371007] RIP: 0010:intel_pmu_drain_pebs_nhm+0x464/0x5f0
> > [96289.376868] Code: 09 00 00 0f b6 c0 49 39 c4 74 2a 48 63 82 78 09 00 00 48 01 c5 48 39 6c 24 08 76 17 0f b6 05 14 70 3f 01 83 e0 0f 3c 03 77 a4 <48> 8b 85 90 00 00 00 eb 9f 31 ed 83 eb 01 83 fb 01 0f 85 30 ff ff
> > [96289.396981] RSP: 0000:ffffffff822039e0 EFLAGS: 00010097
> > [96289.402573] RAX: 0000000000000002 RBX: 0000000000000155 RCX: 0000000000000008
> > [96289.410226] RDX: ffff88811ac118a0 RSI: ffffffff82203980 RDI: ffffffff82203980
> > [96289.417841] RBP: 00000000000000c0 R08: 0000000000000000 R09: 0000000000000000
> > [96289.425461] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
> > [96289.433122] R13: ffffffff82203bc0 R14: ffff88801c3cf800 R15: ffffffff829814a0
> > [96289.440774] FS: 0000000000000000(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000
> > [96289.449374] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [96289.455507] CR2: 0000000000000150 CR3: 000000000220c003 CR4: 00000000001706f0
> > [96289.463119] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [96289.470764] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
> > [96289.478408] Kernel panic - not syncing: Attempted to kill the idle task!
> > [96289.485598] Kernel Offset: disabled
> > [96289.489355] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
> >
> >
>

2021-02-11 21:42:51

by Liang, Kan

[permalink] [raw]

Subject: Re: [perf] perf_fuzzer causes crash in intel_pmu_drain_pebs_nhm()

On 2/11/2021 9:53 AM, Peter Zijlstra wrote:
>
> Kan, do you have time to look at this?
>
> On Thu, Jan 28, 2021 at 02:49:47PM -0500, Vince Weaver wrote:
>> On Thu, 28 Jan 2021, Vince Weaver wrote:
>>
>>> the perf_fuzzer has turned up a repeatable crash on my haswell system.
>>>
>>> addr2line is not being very helpful, it points to DECLARE_PER_CPU_FIRST.
>>> I'll investigate more when I have the chance.
>>
>> so I poked around some more.
>>
>> This seems to be caused in
>>
>> __intel_pmu_pebs_event()
>> get_next_pebs_record_by_bit() ds.c line 1639
>> get_pebs_status(at) ds.c line 1317
>> return ((struct pebs_record_nhm *)n)->status;
>>
>> where "n" has the value of 0xc0 rather than a proper pointer.
>>

The issue is pretty strange. I haven't found the root cause yet.

The base->status (aka "n") was just used in the intel_pmu_drain_pebs_nhm().
for (at = base; at < top; at += x86_pmu.pebs_record_size) {
struct pebs_record_nhm *p = at;
u64 pebs_status;

pebs_status = p->status & cpuc->pebs_enabled;
pebs_status &= mask;
Then it seems to be modified to 0xc0, and crash at get_pebs_status()
based on your investigation.
The "base" is a local variable. The ds->pebs_buffer_base is assigned to
"base" at the beginning of the function. After that, no one change it.

>> this does seem to be repetable,

I'd like to reproduce it on my machine.
Is this issue only found in a Haswell client machine?

To reproduce the issue, can I use ./perf_fuzzer under
perf_event_tests/fuzzer?
Do I need to apply any parameters with ./perf_fuzzer?

Usually how long does it take to reproduce the issue?

Thanks,
Kan

>> but fairly deep in a fuzzing run so I
>> don't have a quick reproducer. >> Vince
>>
>>
>>> [96289.009646] BUG: kernel NULL pointer dereference, address: 0000000000000150
>>> [96289.017094] #PF: supervisor read access in kernel mode
>>> [96289.022588] #PF: error_code(0x0000) - not-present page
>>> [96289.028069] PGD 0 P4D 0
>>> [96289.030796] Oops: 0000 [#1] SMP PTI
>>> [96289.034549] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 5.11.0-rc5+ #151
>>> [96289.043059] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
>>> [96289.050946] RIP: 0010:intel_pmu_drain_pebs_nhm+0x464/0x5f0
>>> [96289.056817] Code: 09 00 00 0f b6 c0 49 39 c4 74 2a 48 63 82 78 09 00 00 48 01 c5 48 39 6c 24 08 76 17 0f b6 05 14 70 3f 01 83 e0 0f 3c 03 77 a4 <48> 8b 85 90 00 00 00 eb 9f 31 ed 83 eb 01 83 fb 01 0f 85 30 ff ff
>>> [96289.076876] RSP: 0000:ffffffff822039e0 EFLAGS: 00010097
>>> [96289.082468] RAX: 0000000000000002 RBX: 0000000000000155 RCX: 0000000000000008
>>> [96289.090095] RDX: ffff88811ac118a0 RSI: ffffffff82203980 RDI: ffffffff82203980
>>> [96289.097746] RBP: 00000000000000c0 R08: 0000000000000000 R09: 0000000000000000
>>> [96289.105376] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
>>> [96289.113008] R13: ffffffff82203bc0 R14: ffff88801c3cf800 R15: ffffffff829814a0
>>> [96289.120671] FS: 0000000000000000(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000
>>> [96289.129346] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [96289.135526] CR2: 0000000000000150 CR3: 000000000220c003 CR4: 00000000001706f0
>>> [96289.143159] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> [96289.150803] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
>>> [96289.158414] Call Trace:
>>> [96289.161041] ? update_blocked_averages+0x532/0x620
>>> [96289.166152] ? update_group_capacity+0x25/0x1d0
>>> [96289.171025] ? cpumask_next_and+0x19/0x20
>>> [96289.175339] ? update_sd_lb_stats.constprop.0+0x702/0x820
>>> [96289.181105] intel_pmu_drain_pebs_buffer+0x33/0x50
>>> [96289.186259] ? x86_pmu_commit_txn+0xbc/0xf0
>>> [96289.190749] ? _raw_spin_lock_irqsave+0x1d/0x30
>>> [96289.195603] ? timerqueue_add+0x64/0xb0
>>> [96289.199720] ? update_load_avg+0x6c/0x5e0
>>> [96289.204001] ? enqueue_task_fair+0x98/0x5a0
>>> [96289.208464] ? timerqueue_del+0x1e/0x40
>>> [96289.212556] ? uncore_msr_read_counter+0x10/0x20
>>> [96289.217513] intel_pmu_pebs_disable+0x12a/0x130
>>> [96289.222324] x86_pmu_stop+0x48/0xa0
>>> [96289.226076] x86_pmu_del+0x40/0x160
>>> [96289.229813] event_sched_out.isra.0+0x81/0x1e0
>>> [96289.234602] group_sched_out.part.0+0x4f/0xc0
>>> [96289.239257] __perf_event_disable+0xef/0x1d0
>>> [96289.243831] event_function+0x8c/0xd0
>>> [96289.247785] remote_function+0x3e/0x50
>>> [96289.251797] flush_smp_call_function_queue+0x11b/0x1a0
>>> [96289.257268] flush_smp_call_function_from_idle+0x38/0x60
>>> [96289.262944] do_idle+0x15f/0x240
>>> [96289.266421] cpu_startup_entry+0x19/0x20
>>> [96289.270639] start_kernel+0x7df/0x804
>>> [96289.274558] ? apply_microcode_early.cold+0xc/0x27
>>> [96289.279678] secondary_startup_64_no_verify+0xb0/0xbb
>>> [96289.285078] Modules linked in: nf_tables libcrc32c nfnetlink intel_rapl_msr intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi x86_pkg_temp_thermal ledtrig_audio intel_powerclamp snd_hda_intel coretemp snd_intel_dspcfg snd_hda_codec snd_hda_core kvm_intel kvm snd_hwdep irqbypass at24 snd_pcm tpm_tis crct10dif_pclmul snd_timer crc32_pclmul regmap_i2c wmi_bmof sg tpm_tis_core snd ghash_clmulni_intel tpm iTCO_wdt aesni_intel soundcore rng_core iTCO_vendor_support crypto_simd mei_me mei cryptd pcspkr evdev glue_helper binfmt_misc ip_tables x_tables autofs4 sr_mod sd_mod t10_pi cdrom i915 iosf_mbi ahci i2c_algo_bit libahci drm_kms_helper xhci_pci ehci_pci ehci_hcd libata xhci_hcd lpc_ich usbcore i2c_i801 drm crc32c_intel e1000e mfd_core scsi_mod usb_common i2c_smbus wmi fan thermal video button
>>> [96289.362498] CR2: 0000000000000150
>>> [96289.366070] ---[ end trace 80c577f99562015f ]---
>>> [96289.371007] RIP: 0010:intel_pmu_drain_pebs_nhm+0x464/0x5f0
>>> [96289.376868] Code: 09 00 00 0f b6 c0 49 39 c4 74 2a 48 63 82 78 09 00 00 48 01 c5 48 39 6c 24 08 76 17 0f b6 05 14 70 3f 01 83 e0 0f 3c 03 77 a4 <48> 8b 85 90 00 00 00 eb 9f 31 ed 83 eb 01 83 fb 01 0f 85 30 ff ff
>>> [96289.396981] RSP: 0000:ffffffff822039e0 EFLAGS: 00010097
>>> [96289.402573] RAX: 0000000000000002 RBX: 0000000000000155 RCX: 0000000000000008
>>> [96289.410226] RDX: ffff88811ac118a0 RSI: ffffffff82203980 RDI: ffffffff82203980
>>> [96289.417841] RBP: 00000000000000c0 R08: 0000000000000000 R09: 0000000000000000
>>> [96289.425461] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
>>> [96289.433122] R13: ffffffff82203bc0 R14: ffff88801c3cf800 R15: ffffffff829814a0
>>> [96289.440774] FS: 0000000000000000(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000
>>> [96289.449374] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [96289.455507] CR2: 0000000000000150 CR3: 000000000220c003 CR4: 00000000001706f0
>>> [96289.463119] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> [96289.470764] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
>>> [96289.478408] Kernel panic - not syncing: Attempted to kill the idle task!
>>> [96289.485598] Kernel Offset: disabled
>>> [96289.489355] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
>>>
>>>
>>

2021-02-11 22:19:14

by Vince Weaver

[permalink] [raw]

Subject: Re: [perf] perf_fuzzer causes crash in intel_pmu_drain_pebs_nhm()

On Thu, 11 Feb 2021, Liang, Kan wrote:

> > On Thu, Jan 28, 2021 at 02:49:47PM -0500, Vince Weaver wrote:
> I'd like to reproduce it on my machine.
> Is this issue only found in a Haswell client machine?
>
> To reproduce the issue, can I use ./perf_fuzzer under perf_event_tests/fuzzer?
> Do I need to apply any parameters with ./perf_fuzzer?
>
> Usually how long does it take to reproduce the issue?

On my machine if I run the commands
echo 1 > /proc/sys/kernel/nmi_watchdog
echo 0 > /proc/sys/kernel/perf_event_paranoid
echo 1000 > /proc/sys/kernel/perf_event_max_sample_rate
./perf_fuzzer -s 30000 -r 1611784483

it is repeatable within a minute, but because of the nature of the fuzzer
it probably won't work for you because the random events will diverge
based on the different configs of the system.

I can try to generate a simple reproducer, I've just been extremely busy
here at work and haven't had the chance.

If you want to try to reproduce it the hard way, run the
./fast_repro99.sh
script in the perf_fuzzer directory. It will start fuzzing. My machine
turned up the issue within a day or so.

Vince

2021-02-25 20:18:45

by Liang, Kan

[permalink] [raw]

Subject: Re: [perf] perf_fuzzer causes crash in intel_pmu_drain_pebs_nhm()

On 2/11/2021 5:14 PM, Vince Weaver wrote:
> On Thu, 11 Feb 2021, Liang, Kan wrote:
>
>>> On Thu, Jan 28, 2021 at 02:49:47PM -0500, Vince Weaver wrote:
>> I'd like to reproduce it on my machine.
>> Is this issue only found in a Haswell client machine?
>>
>> To reproduce the issue, can I use ./perf_fuzzer under perf_event_tests/fuzzer?
>> Do I need to apply any parameters with ./perf_fuzzer?
>>
>> Usually how long does it take to reproduce the issue?
>
> On my machine if I run the commands
> echo 1 > /proc/sys/kernel/nmi_watchdog
> echo 0 > /proc/sys/kernel/perf_event_paranoid
> echo 1000 > /proc/sys/kernel/perf_event_max_sample_rate
> ./perf_fuzzer -s 30000 -r 1611784483
>
> it is repeatable within a minute, but because of the nature of the fuzzer
> it probably won't work for you because the random events will diverge
> based on the different configs of the system.
>
> I can try to generate a simple reproducer, I've just been extremely busy
> here at work and haven't had the chance.
>
> If you want to try to reproduce it the hard way, run the
> ./fast_repro99.sh
> script in the perf_fuzzer directory. It will start fuzzing. My machine
> turned up the issue within a day or so.
>
Sorry for the late response. Just want to let you know I'm still trying
to reproduce the issue.

I only have a Haswell server on hand. I run the ./fast_repro99.sh on the
machine for 3 days, but I didn't observe any crash.
Now, I'm looking for a HSW client in my lab. I will check if I can
reproduce it on a client machine.

If you have a simple reproducer, please let me know.

Thanks,
Kan

2021-03-01 13:24:58

by Liang, Kan

[permalink] [raw]

Subject: Re: [perf] perf_fuzzer causes crash in intel_pmu_drain_pebs_nhm()

2021-03-04 05:49:52

by Vince Weaver

[permalink] [raw]

Subject: Re: [perf] perf_fuzzer causes crash in intel_pmu_drain_pebs_nhm()

On Mon, 1 Mar 2021, Liang, Kan wrote:

> https://lore.kernel.org/lkml/[email protected]/
> The patch is an SW workaround for some old CPUs (HSW and earlier), which may
> set 0 to the PEBS status. It adds a check in the intel_pmu_drain_pebs_nhm().
> It tries to minimize the impact of the defect by avoiding dropping the PEBS
> records which have PEBS status 0.
> But, it doesn't correct the PEBS status, which may bring problems,
> especially for the large PEBS.
> It's possible that all the PEBS records in a large PEBS have the PEBS status
> 0. If so, the first get_next_pebs_record_by_bit() in the
> __intel_pmu_pebs_event() returns NULL. The at = NULL. Since it's a large PEBS,
> the 'count' parameter must > 1. The second get_next_pebs_record_by_bit() will
> crash.
>
> Could you please revert the patch and check whether it fixes your issue?

I've reverted that patch and my test-case no longer triggers the issue.

I'll restart a longer fuzzing run to see if any other issues turn up.

Thanks,

Vince

2021-03-04 12:29:58

by Vince Weaver

[permalink] [raw]

Subject: [perf] perf_fuzzer causes unchecked MSR access error

Hello

on my Haswell machine the perf_fuzzer managed to trigger this message:

[117248.075892] unchecked MSR access error: WRMSR to 0x3f1 (tried to write 0x0400000000000000) at rIP: 0xffffffff8106e4f4 (native_write_msr+0x4/0x20)
[117248.089957] Call Trace:
[117248.092685] intel_pmu_pebs_enable_all+0x31/0x40
[117248.097737] intel_pmu_enable_all+0xa/0x10
[117248.102210] __perf_event_task_sched_in+0x2df/0x2f0
[117248.107511] finish_task_switch.isra.0+0x15f/0x280
[117248.112765] schedule_tail+0xc/0x40
[117248.116562] ret_from_fork+0x8/0x30

that shouldn't be possible, should it? MSR 0x3f1 is MSR_IA32_PEBS_ENABLE

this is on recent-git with the patch causing the pebs-related crash
reverted.

Vince

2021-03-04 12:41:25

by Stephane Eranian

[permalink] [raw]

Subject: Re: [perf] perf_fuzzer causes unchecked MSR access error

On Wed, Mar 3, 2021 at 10:16 AM Vince Weaver <[email protected]> wrote:
>
> Hello
>
> on my Haswell machine the perf_fuzzer managed to trigger this message:
>
> [117248.075892] unchecked MSR access error: WRMSR to 0x3f1 (tried to write 0x0400000000000000) at rIP: 0xffffffff8106e4f4 (native_write_msr+0x4/0x20)
> [117248.089957] Call Trace:
> [117248.092685] intel_pmu_pebs_enable_all+0x31/0x40
> [117248.097737] intel_pmu_enable_all+0xa/0x10
> [117248.102210] __perf_event_task_sched_in+0x2df/0x2f0
> [117248.107511] finish_task_switch.isra.0+0x15f/0x280
> [117248.112765] schedule_tail+0xc/0x40
> [117248.116562] ret_from_fork+0x8/0x30
>
> that shouldn't be possible, should it? MSR 0x3f1 is MSR_IA32_PEBS_ENABLE
>
Not possible, bit 58 is not defined in PEBS_ENABLE, AFAIK.

>
> this is on recent-git with the patch causing the pebs-related crash
> reverted.
>
> Vince

2021-03-04 12:46:12

by Liang, Kan

[permalink] [raw]

Subject: Re: [perf] perf_fuzzer causes unchecked MSR access error

On 3/3/2021 2:28 PM, Stephane Eranian wrote:
> On Wed, Mar 3, 2021 at 10:16 AM Vince Weaver <[email protected]> wrote:
>>
>> Hello
>>
>> on my Haswell machine the perf_fuzzer managed to trigger this message:
>>
>> [117248.075892] unchecked MSR access error: WRMSR to 0x3f1 (tried to write 0x0400000000000000) at rIP: 0xffffffff8106e4f4 (native_write_msr+0x4/0x20)
>> [117248.089957] Call Trace:
>> [117248.092685] intel_pmu_pebs_enable_all+0x31/0x40
>> [117248.097737] intel_pmu_enable_all+0xa/0x10
>> [117248.102210] __perf_event_task_sched_in+0x2df/0x2f0
>> [117248.107511] finish_task_switch.isra.0+0x15f/0x280
>> [117248.112765] schedule_tail+0xc/0x40
>> [117248.116562] ret_from_fork+0x8/0x30
>>
>> that shouldn't be possible, should it? MSR 0x3f1 is MSR_IA32_PEBS_ENABLE
>>
> Not possible, bit 58 is not defined in PEBS_ENABLE, AFAIK.
>
>>
>> this is on recent-git with the patch causing the pebs-related crash
>> reverted.
>>

We never use bit 58. It should be a new issue.
Is it repeatable?

Thanks,
Kan

2021-03-04 12:54:54

by Vince Weaver

[permalink] [raw]

Subject: Re: [perf] perf_fuzzer causes unchecked MSR access error

On Wed, 3 Mar 2021, Liang, Kan wrote:

> We never use bit 58. It should be a new issue.
> Is it repeatable?

yes, it's repeatable.

(which I'm glad to see because it looks suspiciously like a memory bit
flip)

Though since it's a WARN_ONCE I have to reboot each time I want to test.

If I get a chance I'll try to come up with a reduced test case but
probably won't have time for that today.

Vince

2021-03-04 19:36:33

by Liang, Kan

[permalink] [raw]

Subject: Re: [perf] perf_fuzzer causes unchecked MSR access error

On 3/3/2021 3:22 PM, Vince Weaver wrote:
> On Wed, 3 Mar 2021, Liang, Kan wrote:
>
>> We never use bit 58. It should be a new issue.

Actually, KVM uses it. They create a fake event called VLBR_EVENT, which
uses bit 58. It's introduced from the commit 097e4311cda9 ("perf/x86:
Add constraint to create guest LBR event without hw counter").

Since it's a fake event, it doesn't support PEBS. Perf should reject it
if it sets the precise_ip.

The below patch should fix the MSR access error.

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 5bac48d..1ea3c67 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3659,6 +3659,10 @@ static int intel_pmu_hw_config(struct perf_event
*event)
return ret;

if (event->attr.precise_ip) {
+
+ if ((event->attr.config & INTEL_ARCH_EVENT_MASK) ==
INTEL_FIXED_VLBR_EVENT)
+ return -EINVAL;
+
if (!(event->attr.freq || (event->attr.wakeup_events &&
!event->attr.watermark))) {
event->hw.flags |= PERF_X86_EVENT_AUTO_RELOAD;
if (!(event->attr.sample_type &

Thanks,
Kan