2022-12-17 12:25:28

by Borislav Petkov

Subject: amdgpu refcount saturation

Hi folks,

this is with Linus' tree from Wed:

041fae9c105a ("Merge tag 'f2fs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs")

on a CZ laptop:

[ 7.782901] [drm] initializing kernel modesetting (CARRIZO 0x1002:0x9874 0x103C:0x807E 0xC4)

The splat is kinda messy:

---

[ 7.755306] [drm] amdgpu kernel modesetting enabled.
[ 7.779110] amdgpu 0000:00:01.0: vgaarb: deactivate vga console
[ 7.780417] Console: switching to colour dummy device 80x25
[ 7.782901] [drm] initializing kernel modesetting (CARRIZO 0x1002:0x9874 0x103C:0x807E 0xC4).
[ 7.783244] [drm] register mmio base: 0xD0C00000
[ 7.783405] [drm] register mmio size: 262144
[ 7.784182] [drm] add ip block number 0 <vi_common>
[ 7.784375] [drm] add ip block number 1 <gmc_v8_0>
[ 7.784555] [drm] add ip block number 2 <cz_ih>
[ 7.784717] [drm] add ip block number 3 <gfx_v8_0>
[ 7.784925] [drm] add ip block number 4 <sdma_v3_0>
[ 7.785094] [drm] add ip block number 5 <powerplay>
[ 7.785264] [drm] add ip block number 6 <dm>
[ 7.785413] [drm] add ip block number 7 <uvd_v6_0>
[ 7.785580] [drm] add ip block number 8 <vce_v3_0>
[ 7.800919] [drm] BIOS signature incorrect 5b 7
[ 7.801095] resource sanity check: requesting [mem 0x000c0000-0x000dffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000cbfff window]
[ 7.801544] caller pci_map_rom+0x68/0x1c0 mapping multiple BARs
[ 7.801838] amdgpu 0000:00:01.0: amdgpu: Fetched VBIOS from ROM BAR
[ 7.802067] amdgpu: ATOM BIOS: SWBRT27354.001
[ 7.802272] [drm] UVD is enabled in physical mode
[ 7.802438] [drm] VCE enabled in physical mode
[ 7.802592] amdgpu 0000:00:01.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
[ 7.803100] [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
[ 7.803387] amdgpu 0000:00:01.0: amdgpu: VRAM: 512M 0x000000F400000000 - 0x000000F41FFFFFFF (512M used)
[ 7.803708] amdgpu 0000:00:01.0: amdgpu: GART: 1024M 0x000000FF00000000 - 0x000000FF3FFFFFFF
[ 7.804007] [drm] Detected VRAM RAM=512M, BAR=512M
[ 7.804174] [drm] RAM width 128bits UNKNOWN
[ 7.804703] [drm] amdgpu: 512M of VRAM memory ready
[ 7.804882] [drm] amdgpu: 7638M of GTT memory ready.
[ 7.805164] [drm] GART: num cpu pages 262144, num gpu pages 262144
[ 7.805484] [drm] PCIE GART of 1024M enabled (table at 0x000000F400A00000).
[ 7.808418] amdgpu: hwmgr_sw_init smu backed is smu8_smu
[ 7.809070] [drm] Found UVD firmware Version: 1.91 Family ID: 11
[ 7.809413] [drm] UVD ENC is disabled
[ 7.810321] [drm] Found VCE firmware Version: 52.4 Binary ID: 3
[ 7.812036] amdgpu: smu version 18.62.00
[ 7.818378] [drm] DM_PPLIB: values for Engine clock
[ 7.818566] [drm] DM_PPLIB: 300000
[ 7.818689] [drm] DM_PPLIB: 360000
[ 7.818811] [drm] DM_PPLIB: 423530
[ 7.818934] [drm] DM_PPLIB: 514290
[ 7.819056] [drm] DM_PPLIB: 626090
[ 7.819179] [drm] DM_PPLIB: 720000
[ 7.819302] [drm] DM_PPLIB: Validation clocks:
[ 7.819456] [drm] DM_PPLIB: engine_max_clock: 72000
[ 7.819633] [drm] DM_PPLIB: memory_max_clock: 80000
[ 7.819810] [drm] DM_PPLIB: level : 8
[ 7.819977] [drm] DM_PPLIB: values for Display clock
[ 7.820148] [drm] DM_PPLIB: 300000
[ 7.820271] [drm] DM_PPLIB: 400000
[ 7.820394] [drm] DM_PPLIB: 496560
[ 7.820563] [drm] DM_PPLIB: 626090
[ 7.820694] [drm] DM_PPLIB: 685720
[ 7.820857] [drm] DM_PPLIB: 757900
[ 7.820979] [drm] DM_PPLIB: Validation clocks:
[ 7.821133] [drm] DM_PPLIB: engine_max_clock: 72000
[ 7.821310] [drm] DM_PPLIB: memory_max_clock: 80000
[ 7.821487] [drm] DM_PPLIB: level : 8
[ 7.821653] [drm] DM_PPLIB: values for Memory clock
[ 7.821821] [drm] DM_PPLIB: 333000
[ 7.821944] [drm] DM_PPLIB: 800000
[ 7.822066] [drm] DM_PPLIB: Validation clocks:
[ 7.822220] [drm] DM_PPLIB: engine_max_clock: 72000
[ 7.822397] [drm] DM_PPLIB: memory_max_clock: 80000
[ 7.822574] [drm] DM_PPLIB: level : 8
[ 7.823044] [drm] Display Core initialized with v3.2.215!
[ 7.903994] [drm] UVD initialized successfully.
[ 8.103416] [drm] VCE initialized successfully.
[ 8.104616] amdgpu 0000:00:01.0: amdgpu: SE 1, SH per SE 1, CU per SH 8, active_cu_number 8
[ 8.109430] [drm] Initialized amdgpu 3.49.0 20150101 for 0000:00:01.0 on minor 0
[ 8.120099] fbcon: amdgpudrmfb (fb0) is primary device
[ 8.886332] Console: switching to colour frame buffer device 320x90
[ 8.902118] amdgpu 0000:00:01.0: [drm] fb0: amdgpudrmfb frame buffer device
[ 8.967565] process '/usr/bin/fstype' started with executable stack
[ 8.979419] PM: Image not found (code -22)
[ 9.043724] EXT4-fs (sda2): mounted filesystem c34989f9-7c8f-49ae-8285-7896af84c685 with ordered data mode. Quota mode: disabled.
[ 9.540346] systemd-udevd[1404]: /etc/udev/rules.d/storage_devices.rules:1 Invalid value for OPTIONS key, ignoring: 'all_partitions'
[ 9.766687] input: Power Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input10
[ 9.770181] ACPI: button: Power Button [PWRB]
[ 9.782936] acpi_cpufreq: overriding BIOS provided _PSD data
[ 9.784086] ACPI: AC: AC Adapter [AC] (off-line)
[ 9.789339] input: Sleep Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0E:00/input/input11
[ 9.792905] ACPI: button: Sleep Button [SLPB]
[ 9.815432] input: Lid Switch as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0D:00/input/input12
[ 9.840976] ACPI: button: Lid Switch [LID]
[ 9.842731] ACPI: battery: Slot [BAT0] (battery present)
[ 9.862066] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input13
[ 9.884592] ACPI: button: Power Button [PWRF]
[ 9.998674] cryptd: max_cpu_qlen set to 1000
[ 10.011682] input: PC Speaker as /devices/platform/pcspkr/input/input14
[ 10.019917] AVX2 version of gcm_enc/dec engaged.
[ 10.020328] AES CTR mode by8 optimization enabled
[ 10.024120] systemd-udevd[1427]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
[ 10.025845] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[ 10.089758] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[ 10.092829] cfg80211: loaded regulatory.db is malformed or signature is missing/invalid
[ 10.113060] snd_hda_intel 0000:00:01.1: Force to non-snoop mode
[ 10.114044] tg3 0000:01:00.0 eth0: Tigon3 [partno(BCM95762) rev 5762100] (PCI Express) MAC address fc:3f:db:fc:10:9f
[ 10.120860] tg3 0000:01:00.0 eth0: attached PHY is 5762C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
[ 10.121311] tg3 0000:01:00.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
[ 10.121588] tg3 0000:01:00.0 eth0: dma_rwctrl[00000001] dma_mask[64-bit]
[ 10.141135] snd_hda_intel 0000:00:01.1: bound 0000:00:01.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
[ 10.156355] Intel(R) Wireless WiFi driver for Linux
[ 10.162066] input: HDA ATI HDMI HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.1/sound/card0/input15
[ 10.163016] input: HDA ATI HDMI HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.1/sound/card0/input16
[ 10.163881] input: HDA ATI HDMI HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.1/sound/card0/input17
[ 10.165613] iwlwifi 0000:02:00.0: loaded firmware version 29.1044073957.0 7265D-29.ucode op_mode iwlmvm
[ 10.170553] snd_hda_codec_generic hdaudioC1D0: autoconfig for Generic: line_outs=1 (0x17/0x0/0x0/0x0/0x0) type:speaker
[ 10.171019] snd_hda_codec_generic hdaudioC1D0: speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
[ 10.171349] snd_hda_codec_generic hdaudioC1D0: hp_outs=1 (0x1d/0x0/0x0/0x0/0x0)
[ 10.171662] snd_hda_codec_generic hdaudioC1D0: mono: mono_out=0x0
[ 10.171901] snd_hda_codec_generic hdaudioC1D0: inputs:
[ 10.172111] snd_hda_codec_generic hdaudioC1D0: Internal Mic=0x1a
[ 10.172358] snd_hda_codec_generic hdaudioC1D0: Mic=0x19
[ 10.194871] input: HDA ATI HDMI HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.1/sound/card0/input18
[ 10.196005] input: HD-Audio Generic Mic as /devices/pci0000:00/0000:00:09.2/sound/card1/input19
[ 10.196800] input: HD-Audio Generic Headphone as /devices/pci0000:00/0000:00:09.2/sound/card1/input20
[ 10.341545] iwlwifi 0000:02:00.0: Detected Intel(R) Dual Band Wireless AC 7265, REV=0x210
[ 10.347851] thermal thermal_zone5: failed to read out thermal zone (-61)
[ 10.370248] iwlwifi 0000:02:00.0: base HW address: 18:5e:0f:ef:3f:49, OTP minor version: 0x0
[ 10.442314] systemd-udevd[1415]: Using default interface naming scheme 'v243'.
[ 10.450847] ieee80211 phy0: Selected rate control algorithm 'iwl-mvm-rs'
[ 10.460077] systemd-udevd[1415]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
[ 10.461982] systemd-udevd[1430]: Using default interface naming scheme 'v243'.
[ 10.472784] systemd-udevd[1430]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
[ 10.990278] SVM: TSC scaling supported
[ 10.992345] kvm: Nested Virtualization enabled
[ 10.994632] SVM: kvm: Nested Paging enabled
[ 10.997010] SVM: Virtual GIF supported
[ 10.999325] SVM: LBR virtualization supported

...

[ 11.923155] Adding 15721468k swap on /dev/sda1. Priority:-2 extents:1 across:15721468k SS
[ 11.959678] EXT4-fs (sda2): re-mounted c34989f9-7c8f-49ae-8285-7896af84c685. Quota mode: disabled.
[ 12.431892] device-mapper: ioctl: 4.47.0-ioctl (2022-07-28) initialised: dm-devel@redhat.com
[ 12.457215] loop: module loaded
[ 12.583033] EXT4-fs (sda5): mounted filesystem d78a2e53-75c6-4d4a-887c-f4a66a64ba8c with ordered data mode. Quota mode: disabled.
[ 12.586978] tg3 0000:01:00.0 eth0: Link is up at 100 Mbps, full duplex
[ 12.588511] tg3 0000:01:00.0 eth0: Flow control is on for TX and on for RX
[ 12.589552] /dev/stick1: Can't open blockdev
[ 12.589847] tg3 0000:01:00.0 eth0: EEE is disabled
[ 12.593385] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 15.638763] ------------[ cut here ]------------
[ 15.638772] ------------[ cut here ]------------
[ 15.638910] refcount_t: underflow; use-after-free.
[ 15.638937] WARNING: CPU: 1 PID: 1214 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110
[ 15.639052] refcount_t: saturated; leaking memory.
[ 15.639078] WARNING: CPU: 3 PID: 2437 at lib/refcount.c:19 refcount_warn_saturate+0x74/0x110
[ 15.639192] Modules linked in: loop
[ 15.639433] Modules linked in: loop
[ 15.639574] dm_crypt dm_mod edac_mce_amd
[ 15.639815] dm_crypt dm_mod
[ 15.639919] kvm_amd ccp rng_core kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel iwlmvm sha512_ssse3 sha512_generic mac80211 libarc4 snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi iwlwifi snd_hda_intel snd_intel_dspcfg snd_hda_codec aesni_intel
[ 15.639941] edac_mce_amd kvm_amd
[ 15.640045] snd_hda_core
[ 15.640047] ccp
[ 15.640049] crypto_simd pcspkr
[ 15.640167] rng_core kvm irqbypass
[ 15.640254] cryptd snd_pcm
[ 15.641109] crct10dif_pclmul
[ 15.641215] fam15h_power cfg80211
[ 15.641311] crc32_pclmul crc32c_intel
[ 15.641380] snd_timer k10temp
[ 15.641494] ghash_clmulni_intel
[ 15.641496] snd
[ 15.641499] iwlmvm sha512_ssse3 sha512_generic
[ 15.641623] rfkill tg3 soundcore
[ 15.641725] mac80211
[ 15.641727] battery acpi_cpufreq ac
[ 15.641834] libarc4 snd_hda_codec_generic
[ 15.641955] button input_leds
[ 15.642089] ledtrig_audio snd_hda_codec_hdmi
[ 15.642199] led_class psmouse
[ 15.642315] iwlwifi snd_hda_intel
[ 15.642385] serio_raw amdgpu drm_buddy
[ 15.642545] snd_intel_dspcfg snd_hda_codec
[ 15.642663] gpu_sched
[ 15.642665] aesni_intel
[ 15.642667] drm_display_helper video wmi
[ 15.642752] snd_hda_core crypto_simd

[ 15.642881] pcspkr cryptd
[ 15.643026] CPU: 1 PID: 1214 Comm: sdma1 Not tainted 6.1.0+ #1
[ 15.643135] snd_pcm fam15h_power
[ 15.643288] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.39 04/16/2019
[ 15.643291] RIP: 0010:refcount_warn_saturate+0xba/0x110
[ 15.643399] cfg80211 snd_timer k10temp
[ 15.643521] Code: 07 01 e8 3d 7b 52 00 0f 0b e9 92 ec 57 00 80 3d 0e 1e ee 07 00 75 85 48 c7 c7 78 20 fe 81 c6 05 fe 1d ee 07 01 e8 1a 7b 52 00 <0f> 0b e9 6f ec 57 00 80 3d e9 1d ee 07 00 0f 85 5e ff ff ff 48 c7
[ 15.643657] snd rfkill
[ 15.643805] RSP: 0018:ffffc90000acbe60 EFLAGS: 00010286
[ 15.643892] tg3 soundcore battery

[ 15.643986] RAX: 0000000000000026 RBX: ffff888103624040 RCX: 0000000000000027
[ 15.644127] acpi_cpufreq ac button
[ 15.644257] RDX: ffff88842f49f3c8 RSI: 0000000000000001 RDI: ffff88842f49f3c0
[ 15.644260] RBP: ffff8881051531f0 R08: 80000000fff003ff R09: ffffc90000acbe00
[ 15.644316] input_leds led_class
[ 15.644415] R10: 0000000000000001 R11: ffffffffffffffff R12: ffff88810e2bdc00
[ 15.644417] R13: ffff88810e2bdc78 R14: ffff888103624000 R15: ffff8881051cd008
[ 15.644622] psmouse
[ 15.644739] FS: 0000000000000000(0000) GS:ffff88842f480000(0000) knlGS:0000000000000000
[ 15.645011] serio_raw amdgpu drm_buddy
[ 15.645193] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 15.645195] CR2: 00005563d7a6d000 CR3: 0000000103e9c000 CR4: 00000000001506e0
[ 15.645329] gpu_sched drm_display_helper video
[ 15.645962] Call Trace:
[ 15.645965] <TASK>
[ 15.646052] wmi
[ 15.646057] CPU: 3 PID: 2437 Comm: Xorg Not tainted 6.1.0+ #1
[ 15.646235] drm_sched_entity_pop_job+0xfb/0x430 [gpu_sched]
[ 15.646357] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.39 04/16/2019
[ 15.646360] RIP: 0010:refcount_warn_saturate+0x74/0x110
[ 15.646416] drm_sched_main+0x99/0x3f0 [gpu_sched]
[ 15.646662] Code: 07 01 e8 83 7b 52 00 0f 0b e9 d8 ec 57 00 80 3d 57 1e ee 07 00 75 cb 48 c7 c7 20 20 fe 81 c6 05 47 1e ee 07 01 e8 60 7b 52 00 <0f> 0b e9 b5 ec 57 00 80 3d 32 1e ee 07 00 75 a8 48 c7 c7 48 20 fe
[ 15.646785] ? __pfx_autoremove_wake_function+0x10/0x10
[ 15.647030] RSP: 0000:ffffc90001b33ca0 EFLAGS: 00010286
[ 15.647276] ? __pfx_drm_sched_main+0x10/0x10 [gpu_sched]
[ 15.647394] RAX: 0000000000000026 RBX: ffffc90001b33ce0 RCX: 0000000000000027
[ 15.647397] RDX: ffff88842f59f3c8 RSI: 0000000000000001 RDI: ffff88842f59f3c0
[ 15.647640] kthread+0xd4/0x100
[ 15.647886] RBP: ffff888103624040 R08: 80000000fff00401 R09: ffffc90001b33c40
[ 15.647888] R10: 0000000000000001 R11: ffffffffffffffff R12: 00000000ffffffff
[ 15.647967] ? __pfx_kthread+0x10/0x10
[ 15.648244] R13: ffff888103624040 R14: ffff88810e968058 R15: ffff888000000000
[ 15.648247] ret_from_fork+0x2c/0x50
[ 15.648383] FS: 00007f385db36a00(0000) GS:ffff88842f580000(0000) knlGS:0000000000000000
[ 15.648586] </TASK>
[ 15.648588] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 15.648591] CR2: 00007f37d3016df0 CR3: 0000000103e9c000 CR4: 00000000001506e0
[ 15.648833] ---[ end trace 0000000000000000 ]---
[ 15.654004] Call Trace:
[ 15.654006] <TASK>
[ 15.654007] dma_resv_iter_walk_unlocked.part.0+0x147/0x180
[ 15.654015] dma_resv_iter_first_unlocked+0x25/0x70
[ 15.654018] dma_resv_test_signaled+0x22/0xb0
[ 15.654021] ttm_bo_vm_fault_reserved+0x43/0x350
[ 15.654886] amdgpu_gem_fault+0x7f/0xf0 [amdgpu]
[ 15.655696] __do_fault+0x41/0x240
[ 15.655974] __handle_mm_fault+0xcf4/0x1740
[ 15.655978] ? do_mmap+0x33d/0x4f0
[ 15.655981] handle_mm_fault+0xb5/0x180
[ 15.655984] do_user_addr_fault+0x19b/0x6b0
[ 15.656574] exc_page_fault+0x6d/0x140
[ 15.656715] asm_exc_page_fault+0x22/0x30
[ 15.656723] RIP: 0033:0x7f385d17b1bb
[ 15.657027] Code: 00 00 48 01 d6 48 01 d7 4c 8d 1d 00 0b 05 00 49 63 14 93 49 8d 14 13 ff e2 0f 0b 0f 1f 40 00 48 81 ea 80 00 00 00 0f 28 4e f0 <0f> 29 4f f0 0f 28 56 e0 0f 29 57 e0 0f 28 5e d0 0f 29 5f d0 0f 28
[ 15.657030] RSP: 002b:00007ffdc75943a8 EFLAGS: 00010202
[ 15.657032] RAX: 00007f37d3015000 RBX: 00007f37d3015000 RCX: 0000000000010000
[ 15.657034] RDX: 0000000000001d80 RSI: 00007f37f000ae00 RDI: 00007f37d3016e00
[ 15.658395] RBP: 0000000000000001 R08: 00007f37d3016df0 R09: 0000000000000000
[ 15.658396] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000080e1
[ 15.658397] R13: 00005563d7c23e78 R14: 00007f37f0009000 R15: 0000000000001e00
[ 15.658400] </TASK>
[ 15.659189] ---[ end trace 0000000000000000 ]---
[ 15.668798] ------------[ cut here ]------------
[ 15.669007] refcount_t: saturated; leaking memory.
[ 15.669201] WARNING: CPU: 1 PID: 2479 at lib/refcount.c:22 refcount_warn_saturate+0x51/0x110
[ 15.669492] Modules linked in: loop dm_crypt dm_mod edac_mce_amd kvm_amd ccp rng_core kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel iwlmvm sha512_ssse3 sha512_generic mac80211 libarc4 snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi iwlwifi snd_hda_intel snd_intel_dspcfg snd_hda_codec aesni_intel snd_hda_core crypto_simd pcspkr cryptd snd_pcm fam15h_power cfg80211 snd_timer k10temp snd rfkill tg3 soundcore battery acpi_cpufreq ac button input_leds led_class psmouse serio_raw amdgpu drm_buddy gpu_sched drm_display_helper video wmi
[ 15.671256] CPU: 1 PID: 2479 Comm: Xorg:cs0 Tainted: G W 6.1.0+ #1
[ 15.671510] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.39 04/16/2019
[ 15.671774] RIP: 0010:refcount_warn_saturate+0x51/0x110
[ 15.671954] Code: 84 bc 00 00 00 e9 ff ec 57 00 85 f6 74 23 80 3d 79 1e ee 07 00 75 ee 48 c7 c7 20 20 fe 81 c6 05 69 1e ee 07 01 e8 83 7b 52 00 <0f> 0b e9 d8 ec 57 00 80 3d 57 1e ee 07 00 75 cb 48 c7 c7 20 20 fe
[ 15.672626] RSP: 0018:ffffc90000beb820 EFLAGS: 00010282
[ 15.672812] RAX: 0000000000000026 RBX: ffff888103624040 RCX: 0000000000000027
[ 15.672815] RDX: ffff88842f49f3c8 RSI: 0000000000000001 RDI: ffff88842f49f3c0
[ 15.672816] RBP: 0000000000000003 R08: 0000000000000058 R09: 00000000ffefffff
[ 15.672818] R10: ffffffff8224a280 R11: 0000000000000003 R12: ffff888105153000
[ 15.672820] R13: ffff8881051c0000 R14: ffff888105493f48 R15: ffff888105153000
[ 15.672822] FS: 00007f38541ff640(0000) GS:ffff88842f480000(0000) knlGS:0000000000000000
[ 15.672824] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 15.672826] CR2: 00005563d7a6d000 CR3: 0000000103e9c000 CR4: 00000000001506e0
[ 15.672828] Call Trace:
[ 15.674888] <TASK>
[ 15.674889] amdgpu_sync_resv+0x191/0x1e0 [amdgpu]
[ 15.675956] amdgpu_vm_sdma_prepare.part.0+0x4b/0x90 [amdgpu]
[ 15.677154] amdgpu_vm_update_range+0x140/0x6e0 [amdgpu]
[ 15.678185] amdgpu_vm_bo_update+0x32e/0x570 [amdgpu]
[ 15.679068] amdgpu_vm_handle_moved+0x5e/0x120 [amdgpu]
[ 15.679999] amdgpu_cs_ioctl+0x1289/0x1e00 [amdgpu]
[ 15.681234] ? __pfx_amdgpu_cs_ioctl+0x10/0x10 [amdgpu]
[ 15.682257] drm_ioctl_kernel+0xbf/0x160
[ 15.682398] drm_ioctl+0x21c/0x500
[ 15.682517] ? __pfx_amdgpu_cs_ioctl+0x10/0x10 [amdgpu]
[ 15.683403] ? futex_wake+0x6d/0x160
[ 15.683533] amdgpu_drm_ioctl+0x5e/0xb0 [amdgpu]
[ 15.684394] __x64_sys_ioctl+0xb6/0xd0
[ 15.684589] ? exit_to_user_mode_prepare+0x97/0x140
[ 15.684774] do_syscall_64+0x3a/0x90
[ 15.684784] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 15.685094] RIP: 0033:0x7f385d12d957
[ 15.685229] Code: 3c 1c 48 f7 d8 4c 39 e0 77 b9 e8 24 ff ff ff 85 c0 78 be 4c 89 e0 5b 5d 41 5c c3 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 94 0c 00 f7 d8 64 89 01 48
[ 15.685232] RSP: 002b:00007f38541fe868 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 15.685236] RAX: ffffffffffffffda RBX: 00007f38541fe8d0 RCX: 00007f385d12d957
[ 15.685238] RDX: 00007f38541fe8d0 RSI: 00000000c0186444 RDI: 000000000000000e
[ 15.685239] RBP: 00000000c0186444 R08: 00007f38541fea00 R09: 0000000000000020
[ 15.685241] R10: 00007f38541fea00 R11: 0000000000000246 R12: 00005563d7964490
[ 15.685242] R13: 000000000000000e R14: 0000000000000000 R15: 00005563d7b4f6a0
[ 15.685245] </TASK>
[ 15.685247] ---[ end trace 0000000000000000 ]---

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


2022-12-19 08:48:37

by Christian König

Subject: Re: amdgpu refcount saturation

Thanks for the notice, going to take a look today.

Regards,
Christian.

On 17.12.22 at 12:53, Borislav Petkov wrote:
> Hi folks,
>
> this is with Linus' tree from Wed:
>
> 041fae9c105a ("Merge tag 'f2fs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs")
>
> on a CZ laptop:
>
> [ 7.782901] [drm] initializing kernel modesetting (CARRIZO 0x1002:0x9874 0x103C:0x807E 0xC4)
>
> The splat is kinda messy:
>
> ---
>
> [ 7.755306] [drm] amdgpu kernel modesetting enabled.
> [ 7.779110] amdgpu 0000:00:01.0: vgaarb: deactivate vga console
> [ 7.780417] Console: switching to colour dummy device 80x25
> [ 7.782901] [drm] initializing kernel modesetting (CARRIZO 0x1002:0x9874 0x103C:0x807E 0xC4).
> [ 7.783244] [drm] register mmio base: 0xD0C00000
> [ 7.783405] [drm] register mmio size: 262144
> [ 7.784182] [drm] add ip block number 0 <vi_common>
> [ 7.784375] [drm] add ip block number 1 <gmc_v8_0>
> [ 7.784555] [drm] add ip block number 2 <cz_ih>
> [ 7.784717] [drm] add ip block number 3 <gfx_v8_0>
> [ 7.784925] [drm] add ip block number 4 <sdma_v3_0>
> [ 7.785094] [drm] add ip block number 5 <powerplay>
> [ 7.785264] [drm] add ip block number 6 <dm>
> [ 7.785413] [drm] add ip block number 7 <uvd_v6_0>
> [ 7.785580] [drm] add ip block number 8 <vce_v3_0>
> [ 7.800919] [drm] BIOS signature incorrect 5b 7
> [ 7.801095] resource sanity check: requesting [mem 0x000c0000-0x000dffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000cbfff window]
> [ 7.801544] caller pci_map_rom+0x68/0x1c0 mapping multiple BARs
> [ 7.801838] amdgpu 0000:00:01.0: amdgpu: Fetched VBIOS from ROM BAR
> [ 7.802067] amdgpu: ATOM BIOS: SWBRT27354.001
> [ 7.802272] [drm] UVD is enabled in physical mode
> [ 7.802438] [drm] VCE enabled in physical mode
> [ 7.802592] amdgpu 0000:00:01.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
> [ 7.803100] [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
> [ 7.803387] amdgpu 0000:00:01.0: amdgpu: VRAM: 512M 0x000000F400000000 - 0x000000F41FFFFFFF (512M used)
> [ 7.803708] amdgpu 0000:00:01.0: amdgpu: GART: 1024M 0x000000FF00000000 - 0x000000FF3FFFFFFF
> [ 7.804007] [drm] Detected VRAM RAM=512M, BAR=512M
> [ 7.804174] [drm] RAM width 128bits UNKNOWN
> [ 7.804703] [drm] amdgpu: 512M of VRAM memory ready
> [ 7.804882] [drm] amdgpu: 7638M of GTT memory ready.
> [ 7.805164] [drm] GART: num cpu pages 262144, num gpu pages 262144
> [ 7.805484] [drm] PCIE GART of 1024M enabled (table at 0x000000F400A00000).
> [ 7.808418] amdgpu: hwmgr_sw_init smu backed is smu8_smu
> [ 7.809070] [drm] Found UVD firmware Version: 1.91 Family ID: 11
> [ 7.809413] [drm] UVD ENC is disabled
> [ 7.810321] [drm] Found VCE firmware Version: 52.4 Binary ID: 3
> [ 7.812036] amdgpu: smu version 18.62.00
> [ 7.818378] [drm] DM_PPLIB: values for Engine clock
> [ 7.818566] [drm] DM_PPLIB: 300000
> [ 7.818689] [drm] DM_PPLIB: 360000
> [ 7.818811] [drm] DM_PPLIB: 423530
> [ 7.818934] [drm] DM_PPLIB: 514290
> [ 7.819056] [drm] DM_PPLIB: 626090
> [ 7.819179] [drm] DM_PPLIB: 720000
> [ 7.819302] [drm] DM_PPLIB: Validation clocks:
> [ 7.819456] [drm] DM_PPLIB: engine_max_clock: 72000
> [ 7.819633] [drm] DM_PPLIB: memory_max_clock: 80000
> [ 7.819810] [drm] DM_PPLIB: level : 8
> [ 7.819977] [drm] DM_PPLIB: values for Display clock
> [ 7.820148] [drm] DM_PPLIB: 300000
> [ 7.820271] [drm] DM_PPLIB: 400000
> [ 7.820394] [drm] DM_PPLIB: 496560
> [ 7.820563] [drm] DM_PPLIB: 626090
> [ 7.820694] [drm] DM_PPLIB: 685720
> [ 7.820857] [drm] DM_PPLIB: 757900
> [ 7.820979] [drm] DM_PPLIB: Validation clocks:
> [ 7.821133] [drm] DM_PPLIB: engine_max_clock: 72000
> [ 7.821310] [drm] DM_PPLIB: memory_max_clock: 80000
> [ 7.821487] [drm] DM_PPLIB: level : 8
> [ 7.821653] [drm] DM_PPLIB: values for Memory clock
> [ 7.821821] [drm] DM_PPLIB: 333000
> [ 7.821944] [drm] DM_PPLIB: 800000
> [ 7.822066] [drm] DM_PPLIB: Validation clocks:
> [ 7.822220] [drm] DM_PPLIB: engine_max_clock: 72000
> [ 7.822397] [drm] DM_PPLIB: memory_max_clock: 80000
> [ 7.822574] [drm] DM_PPLIB: level : 8
> [ 7.823044] [drm] Display Core initialized with v3.2.215!
> [ 7.903994] [drm] UVD initialized successfully.
> [ 8.103416] [drm] VCE initialized successfully.
> [ 8.104616] amdgpu 0000:00:01.0: amdgpu: SE 1, SH per SE 1, CU per SH 8, active_cu_number 8
> [ 8.109430] [drm] Initialized amdgpu 3.49.0 20150101 for 0000:00:01.0 on minor 0
> [ 8.120099] fbcon: amdgpudrmfb (fb0) is primary device
> [ 8.886332] Console: switching to colour frame buffer device 320x90
> [ 8.902118] amdgpu 0000:00:01.0: [drm] fb0: amdgpudrmfb frame buffer device
> [ 8.967565] process '/usr/bin/fstype' started with executable stack
> [ 8.979419] PM: Image not found (code -22)
> [ 9.043724] EXT4-fs (sda2): mounted filesystem c34989f9-7c8f-49ae-8285-7896af84c685 with ordered data mode. Quota mode: disabled.
> [ 9.540346] systemd-udevd[1404]: /etc/udev/rules.d/storage_devices.rules:1 Invalid value for OPTIONS key, ignoring: 'all_partitions'
> [ 9.766687] input: Power Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input10
> [ 9.770181] ACPI: button: Power Button [PWRB]
> [ 9.782936] acpi_cpufreq: overriding BIOS provided _PSD data
> [ 9.784086] ACPI: AC: AC Adapter [AC] (off-line)
> [ 9.789339] input: Sleep Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0E:00/input/input11
> [ 9.792905] ACPI: button: Sleep Button [SLPB]
> [ 9.815432] input: Lid Switch as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0D:00/input/input12
> [ 9.840976] ACPI: button: Lid Switch [LID]
> [ 9.842731] ACPI: battery: Slot [BAT0] (battery present)
> [ 9.862066] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input13
> [ 9.884592] ACPI: button: Power Button [PWRF]
> [ 9.998674] cryptd: max_cpu_qlen set to 1000
> [ 10.011682] input: PC Speaker as /devices/platform/pcspkr/input/input14
> [ 10.019917] AVX2 version of gcm_enc/dec engaged.
> [ 10.020328] AES CTR mode by8 optimization enabled
> [ 10.024120] systemd-udevd[1427]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
> [ 10.025845] cfg80211: Loading compiled-in X.509 certificates for regulatory database
> [ 10.089758] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
> [ 10.092829] cfg80211: loaded regulatory.db is malformed or signature is missing/invalid
> [ 10.113060] snd_hda_intel 0000:00:01.1: Force to non-snoop mode
> [ 10.114044] tg3 0000:01:00.0 eth0: Tigon3 [partno(BCM95762) rev 5762100] (PCI Express) MAC address fc:3f:db:fc:10:9f
> [ 10.120860] tg3 0000:01:00.0 eth0: attached PHY is 5762C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
> [ 10.121311] tg3 0000:01:00.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
> [ 10.121588] tg3 0000:01:00.0 eth0: dma_rwctrl[00000001] dma_mask[64-bit]
> [ 10.141135] snd_hda_intel 0000:00:01.1: bound 0000:00:01.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
> [ 10.156355] Intel(R) Wireless WiFi driver for Linux
> [ 10.162066] input: HDA ATI HDMI HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.1/sound/card0/input15
> [ 10.163016] input: HDA ATI HDMI HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.1/sound/card0/input16
> [ 10.163881] input: HDA ATI HDMI HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.1/sound/card0/input17
> [ 10.165613] iwlwifi 0000:02:00.0: loaded firmware version 29.1044073957.0 7265D-29.ucode op_mode iwlmvm
> [ 10.170553] snd_hda_codec_generic hdaudioC1D0: autoconfig for Generic: line_outs=1 (0x17/0x0/0x0/0x0/0x0) type:speaker
> [ 10.171019] snd_hda_codec_generic hdaudioC1D0: speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
> [ 10.171349] snd_hda_codec_generic hdaudioC1D0: hp_outs=1 (0x1d/0x0/0x0/0x0/0x0)
> [ 10.171662] snd_hda_codec_generic hdaudioC1D0: mono: mono_out=0x0
> [ 10.171901] snd_hda_codec_generic hdaudioC1D0: inputs:
> [ 10.172111] snd_hda_codec_generic hdaudioC1D0: Internal Mic=0x1a
> [ 10.172358] snd_hda_codec_generic hdaudioC1D0: Mic=0x19
> [ 10.194871] input: HDA ATI HDMI HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.1/sound/card0/input18
> [ 10.196005] input: HD-Audio Generic Mic as /devices/pci0000:00/0000:00:09.2/sound/card1/input19
> [ 10.196800] input: HD-Audio Generic Headphone as /devices/pci0000:00/0000:00:09.2/sound/card1/input20
> [ 10.341545] iwlwifi 0000:02:00.0: Detected Intel(R) Dual Band Wireless AC 7265, REV=0x210
> [ 10.347851] thermal thermal_zone5: failed to read out thermal zone (-61)
> [ 10.370248] iwlwifi 0000:02:00.0: base HW address: 18:5e:0f:ef:3f:49, OTP minor version: 0x0
> [ 10.442314] systemd-udevd[1415]: Using default interface naming scheme 'v243'.
> [ 10.450847] ieee80211 phy0: Selected rate control algorithm 'iwl-mvm-rs'
> [ 10.460077] systemd-udevd[1415]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
> [ 10.461982] systemd-udevd[1430]: Using default interface naming scheme 'v243'.
> [ 10.472784] systemd-udevd[1430]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
> [ 10.990278] SVM: TSC scaling supported
> [ 10.992345] kvm: Nested Virtualization enabled
> [ 10.994632] SVM: kvm: Nested Paging enabled
> [ 10.997010] SVM: Virtual GIF supported
> [ 10.999325] SVM: LBR virtualization supported
>
> ...
>
> [ 11.923155] Adding 15721468k swap on /dev/sda1. Priority:-2 extents:1 across:15721468k SS
> [ 11.959678] EXT4-fs (sda2): re-mounted c34989f9-7c8f-49ae-8285-7896af84c685. Quota mode: disabled.
> [ 12.431892] device-mapper: ioctl: 4.47.0-ioctl (2022-07-28) initialised: dm-devel@redhat.com
> [ 12.457215] loop: module loaded
> [ 12.583033] EXT4-fs (sda5): mounted filesystem d78a2e53-75c6-4d4a-887c-f4a66a64ba8c with ordered data mode. Quota mode: disabled.
> [ 12.586978] tg3 0000:01:00.0 eth0: Link is up at 100 Mbps, full duplex
> [ 12.588511] tg3 0000:01:00.0 eth0: Flow control is on for TX and on for RX
> [ 12.589552] /dev/stick1: Can't open blockdev
> [ 12.589847] tg3 0000:01:00.0 eth0: EEE is disabled
> [ 12.593385] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> [ 15.638763] ------------[ cut here ]------------
> [ 15.638772] ------------[ cut here ]------------
> [ 15.638910] refcount_t: underflow; use-after-free.
> [ 15.638937] WARNING: CPU: 1 PID: 1214 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110
> [ 15.639052] refcount_t: saturated; leaking memory.
> [ 15.639078] WARNING: CPU: 3 PID: 2437 at lib/refcount.c:19 refcount_warn_saturate+0x74/0x110
> [ 15.639192] Modules linked in: loop
> [ 15.639433] Modules linked in: loop
> [ 15.639574] dm_crypt dm_mod edac_mce_amd
> [ 15.639815] dm_crypt dm_mod
> [ 15.639919] kvm_amd ccp rng_core kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel iwlmvm sha512_ssse3 sha512_generic mac80211 libarc4 snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi iwlwifi snd_hda_intel snd_intel_dspcfg snd_hda_codec aesni_intel
> [ 15.639941] edac_mce_amd kvm_amd
> [ 15.640045] snd_hda_core
> [ 15.640047] ccp
> [ 15.640049] crypto_simd pcspkr
> [ 15.640167] rng_core kvm irqbypass
> [ 15.640254] cryptd snd_pcm
> [ 15.641109] crct10dif_pclmul
> [ 15.641215] fam15h_power cfg80211
> [ 15.641311] crc32_pclmul crc32c_intel
> [ 15.641380] snd_timer k10temp
> [ 15.641494] ghash_clmulni_intel
> [ 15.641496] snd
> [ 15.641499] iwlmvm sha512_ssse3 sha512_generic
> [ 15.641623] rfkill tg3 soundcore
> [ 15.641725] mac80211
> [ 15.641727] battery acpi_cpufreq ac
> [ 15.641834] libarc4 snd_hda_codec_generic
> [ 15.641955] button input_leds
> [ 15.642089] ledtrig_audio snd_hda_codec_hdmi
> [ 15.642199] led_class psmouse
> [ 15.642315] iwlwifi snd_hda_intel
> [ 15.642385] serio_raw amdgpu drm_buddy
> [ 15.642545] snd_intel_dspcfg snd_hda_codec
> [ 15.642663] gpu_sched
> [ 15.642665] aesni_intel
> [ 15.642667] drm_display_helper video wmi
> [ 15.642752] snd_hda_core crypto_simd
>
> [ 15.642881] pcspkr cryptd
> [ 15.643026] CPU: 1 PID: 1214 Comm: sdma1 Not tainted 6.1.0+ #1
> [ 15.643135] snd_pcm fam15h_power
> [ 15.643288] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.39 04/16/2019
> [ 15.643291] RIP: 0010:refcount_warn_saturate+0xba/0x110
> [ 15.643399] cfg80211 snd_timer k10temp
> [ 15.643521] Code: 07 01 e8 3d 7b 52 00 0f 0b e9 92 ec 57 00 80 3d 0e 1e ee 07 00 75 85 48 c7 c7 78 20 fe 81 c6 05 fe 1d ee 07 01 e8 1a 7b 52 00 <0f> 0b e9 6f ec 57 00 80 3d e9 1d ee 07 00 0f 85 5e ff ff ff 48 c7
> [ 15.643657] snd rfkill
> [ 15.643805] RSP: 0018:ffffc90000acbe60 EFLAGS: 00010286
> [ 15.643892] tg3 soundcore battery
>
> [ 15.643986] RAX: 0000000000000026 RBX: ffff888103624040 RCX: 0000000000000027
> [ 15.644127] acpi_cpufreq ac button
> [ 15.644257] RDX: ffff88842f49f3c8 RSI: 0000000000000001 RDI: ffff88842f49f3c0
> [ 15.644260] RBP: ffff8881051531f0 R08: 80000000fff003ff R09: ffffc90000acbe00
> [ 15.644316] input_leds led_class
> [ 15.644415] R10: 0000000000000001 R11: ffffffffffffffff R12: ffff88810e2bdc00
> [ 15.644417] R13: ffff88810e2bdc78 R14: ffff888103624000 R15: ffff8881051cd008
> [ 15.644622] psmouse
> [ 15.644739] FS: 0000000000000000(0000) GS:ffff88842f480000(0000) knlGS:0000000000000000
> [ 15.645011] serio_raw amdgpu drm_buddy
> [ 15.645193] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 15.645195] CR2: 00005563d7a6d000 CR3: 0000000103e9c000 CR4: 00000000001506e0
> [ 15.645329] gpu_sched drm_display_helper video
> [ 15.645962] Call Trace:
> [ 15.645965] <TASK>
> [ 15.646052] wmi
> [ 15.646057] CPU: 3 PID: 2437 Comm: Xorg Not tainted 6.1.0+ #1
> [ 15.646235] drm_sched_entity_pop_job+0xfb/0x430 [gpu_sched]
> [ 15.646357] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.39 04/16/2019
> [ 15.646360] RIP: 0010:refcount_warn_saturate+0x74/0x110
> [ 15.646416] drm_sched_main+0x99/0x3f0 [gpu_sched]
> [ 15.646662] Code: 07 01 e8 83 7b 52 00 0f 0b e9 d8 ec 57 00 80 3d 57 1e ee 07 00 75 cb 48 c7 c7 20 20 fe 81 c6 05 47 1e ee 07 01 e8 60 7b 52 00 <0f> 0b e9 b5 ec 57 00 80 3d 32 1e ee 07 00 75 a8 48 c7 c7 48 20 fe
> [ 15.646785] ? __pfx_autoremove_wake_function+0x10/0x10
> [ 15.647030] RSP: 0000:ffffc90001b33ca0 EFLAGS: 00010286
> [ 15.647276] ? __pfx_drm_sched_main+0x10/0x10 [gpu_sched]
> [ 15.647394] RAX: 0000000000000026 RBX: ffffc90001b33ce0 RCX: 0000000000000027
> [ 15.647397] RDX: ffff88842f59f3c8 RSI: 0000000000000001 RDI: ffff88842f59f3c0
> [ 15.647640] kthread+0xd4/0x100
> [ 15.647886] RBP: ffff888103624040 R08: 80000000fff00401 R09: ffffc90001b33c40
> [ 15.647888] R10: 0000000000000001 R11: ffffffffffffffff R12: 00000000ffffffff
> [ 15.647967] ? __pfx_kthread+0x10/0x10
> [ 15.648244] R13: ffff888103624040 R14: ffff88810e968058 R15: ffff888000000000
> [ 15.648247] ret_from_fork+0x2c/0x50
> [ 15.648383] FS: 00007f385db36a00(0000) GS:ffff88842f580000(0000) knlGS:0000000000000000
> [ 15.648586] </TASK>
> [ 15.648588] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 15.648591] CR2: 00007f37d3016df0 CR3: 0000000103e9c000 CR4: 00000000001506e0
> [ 15.648833] ---[ end trace 0000000000000000 ]---
> [ 15.654004] Call Trace:
> [ 15.654006] <TASK>
> [ 15.654007] dma_resv_iter_walk_unlocked.part.0+0x147/0x180
> [ 15.654015] dma_resv_iter_first_unlocked+0x25/0x70
> [ 15.654018] dma_resv_test_signaled+0x22/0xb0
> [ 15.654021] ttm_bo_vm_fault_reserved+0x43/0x350
> [ 15.654886] amdgpu_gem_fault+0x7f/0xf0 [amdgpu]
> [ 15.655696] __do_fault+0x41/0x240
> [ 15.655974] __handle_mm_fault+0xcf4/0x1740
> [ 15.655978] ? do_mmap+0x33d/0x4f0
> [ 15.655981] handle_mm_fault+0xb5/0x180
> [ 15.655984] do_user_addr_fault+0x19b/0x6b0
> [ 15.656574] exc_page_fault+0x6d/0x140
> [ 15.656715] asm_exc_page_fault+0x22/0x30
> [ 15.656723] RIP: 0033:0x7f385d17b1bb
> [ 15.657027] Code: 00 00 48 01 d6 48 01 d7 4c 8d 1d 00 0b 05 00 49 63 14 93 49 8d 14 13 ff e2 0f 0b 0f 1f 40 00 48 81 ea 80 00 00 00 0f 28 4e f0 <0f> 29 4f f0 0f 28 56 e0 0f 29 57 e0 0f 28 5e d0 0f 29 5f d0 0f 28
> [ 15.657030] RSP: 002b:00007ffdc75943a8 EFLAGS: 00010202
> [ 15.657032] RAX: 00007f37d3015000 RBX: 00007f37d3015000 RCX: 0000000000010000
> [ 15.657034] RDX: 0000000000001d80 RSI: 00007f37f000ae00 RDI: 00007f37d3016e00
> [ 15.658395] RBP: 0000000000000001 R08: 00007f37d3016df0 R09: 0000000000000000
> [ 15.658396] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000080e1
> [ 15.658397] R13: 00005563d7c23e78 R14: 00007f37f0009000 R15: 0000000000001e00
> [ 15.658400] </TASK>
> [ 15.659189] ---[ end trace 0000000000000000 ]---
> [ 15.668798] ------------[ cut here ]------------
> [ 15.669007] refcount_t: saturated; leaking memory.
> [ 15.669201] WARNING: CPU: 1 PID: 2479 at lib/refcount.c:22 refcount_warn_saturate+0x51/0x110
> [ 15.669492] Modules linked in: loop dm_crypt dm_mod edac_mce_amd kvm_amd ccp rng_core kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel iwlmvm sha512_ssse3 sha512_generic mac80211 libarc4 snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi iwlwifi snd_hda_intel snd_intel_dspcfg snd_hda_codec aesni_intel snd_hda_core crypto_simd pcspkr cryptd snd_pcm fam15h_power cfg80211 snd_timer k10temp snd rfkill tg3 soundcore battery acpi_cpufreq ac button input_leds led_class psmouse serio_raw amdgpu drm_buddy gpu_sched drm_display_helper video wmi
> [ 15.671256] CPU: 1 PID: 2479 Comm: Xorg:cs0 Tainted: G W 6.1.0+ #1
> [ 15.671510] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.39 04/16/2019
> [ 15.671774] RIP: 0010:refcount_warn_saturate+0x51/0x110
> [ 15.671954] Code: 84 bc 00 00 00 e9 ff ec 57 00 85 f6 74 23 80 3d 79 1e ee 07 00 75 ee 48 c7 c7 20 20 fe 81 c6 05 69 1e ee 07 01 e8 83 7b 52 00 <0f> 0b e9 d8 ec 57 00 80 3d 57 1e ee 07 00 75 cb 48 c7 c7 20 20 fe
> [ 15.672626] RSP: 0018:ffffc90000beb820 EFLAGS: 00010282
> [ 15.672812] RAX: 0000000000000026 RBX: ffff888103624040 RCX: 0000000000000027
> [ 15.672815] RDX: ffff88842f49f3c8 RSI: 0000000000000001 RDI: ffff88842f49f3c0
> [ 15.672816] RBP: 0000000000000003 R08: 0000000000000058 R09: 00000000ffefffff
> [ 15.672818] R10: ffffffff8224a280 R11: 0000000000000003 R12: ffff888105153000
> [ 15.672820] R13: ffff8881051c0000 R14: ffff888105493f48 R15: ffff888105153000
> [ 15.672822] FS: 00007f38541ff640(0000) GS:ffff88842f480000(0000) knlGS:0000000000000000
> [ 15.672824] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 15.672826] CR2: 00005563d7a6d000 CR3: 0000000103e9c000 CR4: 00000000001506e0
> [ 15.672828] Call Trace:
> [ 15.674888] <TASK>
> [ 15.674889] amdgpu_sync_resv+0x191/0x1e0 [amdgpu]
> [ 15.675956] amdgpu_vm_sdma_prepare.part.0+0x4b/0x90 [amdgpu]
> [ 15.677154] amdgpu_vm_update_range+0x140/0x6e0 [amdgpu]
> [ 15.678185] amdgpu_vm_bo_update+0x32e/0x570 [amdgpu]
> [ 15.679068] amdgpu_vm_handle_moved+0x5e/0x120 [amdgpu]
> [ 15.679999] amdgpu_cs_ioctl+0x1289/0x1e00 [amdgpu]
> [ 15.681234] ? __pfx_amdgpu_cs_ioctl+0x10/0x10 [amdgpu]
> [ 15.682257] drm_ioctl_kernel+0xbf/0x160
> [ 15.682398] drm_ioctl+0x21c/0x500
> [ 15.682517] ? __pfx_amdgpu_cs_ioctl+0x10/0x10 [amdgpu]
> [ 15.683403] ? futex_wake+0x6d/0x160
> [ 15.683533] amdgpu_drm_ioctl+0x5e/0xb0 [amdgpu]
> [ 15.684394] __x64_sys_ioctl+0xb6/0xd0
> [ 15.684589] ? exit_to_user_mode_prepare+0x97/0x140
> [ 15.684774] do_syscall_64+0x3a/0x90
> [ 15.684784] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [ 15.685094] RIP: 0033:0x7f385d12d957
> [ 15.685229] Code: 3c 1c 48 f7 d8 4c 39 e0 77 b9 e8 24 ff ff ff 85 c0 78 be 4c 89 e0 5b 5d 41 5c c3 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 94 0c 00 f7 d8 64 89 01 48
> [ 15.685232] RSP: 002b:00007f38541fe868 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> [ 15.685236] RAX: ffffffffffffffda RBX: 00007f38541fe8d0 RCX: 00007f385d12d957
> [ 15.685238] RDX: 00007f38541fe8d0 RSI: 00000000c0186444 RDI: 000000000000000e
> [ 15.685239] RBP: 00000000c0186444 R08: 00007f38541fea00 R09: 0000000000000020
> [ 15.685241] R10: 00007f38541fea00 R11: 0000000000000246 R12: 00005563d7964490
> [ 15.685242] R13: 000000000000000e R14: 0000000000000000 R15: 00005563d7b4f6a0
> [ 15.685245] </TASK>
> [ 15.685247] ---[ end trace 0000000000000000 ]---
>

2022-12-22 22:42:13

by Michal Kubecek

[permalink] [raw]
Subject: Re: amdgpu refcount saturation

On Mon, Dec 19, 2022 at 09:23:05AM +0100, Christian König wrote:
> Am 17.12.22 um 12:53 schrieb Borislav Petkov:
> > Hi folks,
> >
> > this is with Linus' tree from Wed:
> >
> > 041fae9c105a ("Merge tag 'f2fs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs")
> >
> > on a CZ laptop:
> >
> > [ 7.782901] [drm] initializing kernel modesetting (CARRIZO 0x1002:0x9874 0x103C:0x807E 0xC4)
> >
> > The splat is kinda messy:
>
> Thanks for the notice, going to take a look today.
>
> Regards,
> Christian.

In case it might help, I have similar crashes with 6.2 merge window
snapshots on a desktop machine with a Radeon WX2100:

[ 16.045850] [drm] initializing kernel modesetting (POLARIS12 0x1002:0x6995 0x1002:0x0B0C 0x00).

The behavior seems pretty deterministic so far: the system boots
cleanly, logging into KDE is fine, but then it crashes as soon as I
start Firefox.

Unfortunately, just like Boris, I always seem to have multiple stack
traces tangled together.

Michal
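
Since the traces in both reports are interleaved across CPUs, it can
help to pull out just the warning headers before reading the rest. What
follows is a minimal sketch, not a tool used by anyone in this thread:
the function name summarize and both regular expressions are
hypothetical, and they assume only the "refcount_t: ..." and
"WARNING: CPU: ... PID: ... at file:line" line shapes visible in the
logs here. The sample text is copied from the log below.

```python
import re
from collections import Counter

# Matches the WARNING header printed for each splat (format as seen in this thread).
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at (?P<file>\S+):(?P<line>\d+)"
)
# Matches the refcount_t diagnostic that precedes each WARNING header.
REFCOUNT_RE = re.compile(r"refcount_t: (?P<msg>.+)$")


def summarize(log_text):
    """Collect (cpu, pid, location) per WARNING and count refcount_t messages."""
    warnings = []
    messages = Counter()
    for raw in log_text.splitlines():
        line = raw.rstrip()
        m = WARNING_RE.search(line)
        if m:
            location = f'{m["file"]}:{m["line"]}'
            warnings.append((int(m["cpu"]), int(m["pid"]), location))
            continue
        m = REFCOUNT_RE.search(line)
        if m:
            messages[m["msg"]] += 1
    return warnings, messages


# Lines copied from the log in this message.
sample = """\
[ 165.210008] ------------[ cut here ]------------
[ 165.215427] refcount_t: underflow; use-after-free.
[ 165.221026] WARNING: CPU: 14 PID: 1165 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110
[ 165.339553] refcount_t: saturated; leaking memory.
[ 165.339557] WARNING: CPU: 18 PID: 6237 at lib/refcount.c:19 refcount_warn_saturate+0x97/0x110
"""

warnings, messages = summarize(sample)
for cpu, pid, location in warnings:
    print(f"CPU {cpu} PID {pid} at {location}")
for msg, count in messages.items():
    print(f"{count}x {msg}")
```

This only lists which CPU/PID hit which lib/refcount.c line; it does not
untangle the call traces themselves, because the log lines shown here
carry no per-line CPU tag.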


Commit 77856d911a8c:
------------------------------------------------------------------------------
[ 165.210008] ------------[ cut here ]------------
[ 165.215427] refcount_t: underflow; use-after-free.
[ 165.221026] WARNING: CPU: 14 PID: 1165 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110
[ 165.230420] Modules linked in: echainiv esp4 af_packet tun 8021q garp mrp stp llc iscsi_ibft iscsi_boot_sysfs xt_REDIRECT xt_MASQUERADE xt_nat iptable_nat nf_nat deflate sm4_generic sm4_aesni_avx2_x86_64 xt_LOG sm4_aesni_avx_x86_64 nf_log_syslog sm4 twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common camellia_generic xt_conntrack camellia_aesni_avx2 nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 camellia_aesni_avx_x86_64 camellia_x86_64 serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic blowfish_generic blowfish_x86_64 vmnet(OE) blowfish_common ppdev parport_pc parport cast5_avx_x86_64 vmw_vsock_vmci_transport cast5_generic cast_common ipt_REJECT nf_reject_ipv4 vsock des_generic libdes sm3_generic rfkill xt_tcpudp sm3_avx_x86_64 sm3 xt_set vmw_vmci cmac xcbc iptable_filter vmmon(OE) rmd160 bpfilter dmi_sysfs ip_set_hash_ip af_key ip_set xfrm_algo nfnetlink msr hwmon_vid dm_crypt essiv authenc trusted asn1_encoder tee amdgpu
[ 165.230464] intel_rapl_msr uvcvideo videobuf2_vmalloc iommu_v2 videobuf2_memops drm_buddy i2c_dev videobuf2_v4l2 gpu_sched video intel_rapl_common snd_usb_audio videodev xfs drm_display_helper videobuf2_common snd_usbmidi_lib drm_ttm_helper ttm libcrc32c edac_mce_amd joydev mc irqbypass cec pcspkr wmi_bmof gigabyte_wmi k10temp i2c_piix4 tiny_power_button rc_core igb dca thermal button acpi_cpufreq fuse configfs ip_tables x_tables ext4 mbcache jbd2 hid_generic uas usb_storage usbhid crct10dif_pclmul crc32_pclmul crc32c_intel xhci_pci polyval_clmulni xhci_pci_renesas polyval_generic gf128mul xhci_hcd ghash_clmulni_intel sha512_ssse3 aesni_intel crypto_simd nvme cryptd usbcore ccp sr_mod sp5100_tco cdrom nvme_core wmi snd_emu10k1 snd_hwdep snd_util_mem snd_ac97_codec ac97_bus snd_pcm snd_timer snd_rawmidi snd_seq_device snd soundcore sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua
[ 165.339552] ------------[ cut here ]------------
[ 165.339552] ------------[ cut here ]------------
[ 165.339553] refcount_t: saturated; leaking memory.
[ 165.339557] WARNING: CPU: 18 PID: 6237 at lib/refcount.c:19 refcount_warn_saturate+0x97/0x110
[ 165.339562] Modules linked in: echainiv esp4 af_packet tun 8021q garp mrp stp llc iscsi_ibft iscsi_boot_sysfs xt_REDIRECT xt_MASQUERADE xt_nat iptable_nat nf_nat deflate sm4_generic sm4_aesni_avx2_x86_64 xt_LOG sm4_aesni_avx_x86_64 nf_log_syslog sm4 twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common camellia_generic xt_conntrack camellia_aesni_avx2 nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 camellia_aesni_avx_x86_64 camellia_x86_64 serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic blowfish_generic blowfish_x86_64 vmnet(OE) blowfish_common ppdev parport_pc parport cast5_avx_x86_64 vmw_vsock_vmci_transport cast5_generic cast_common ipt_REJECT nf_reject_ipv4 vsock des_generic libdes sm3_generic rfkill xt_tcpudp sm3_avx_x86_64 sm3 xt_set vmw_vmci cmac xcbc iptable_filter vmmon(OE) rmd160 bpfilter dmi_sysfs ip_set_hash_ip af_key ip_set xfrm_algo nfnetlink msr hwmon_vid dm_crypt essiv authenc trusted asn1_encoder tee amdgpu
[ 165.339588] intel_rapl_msr uvcvideo videobuf2_vmalloc iommu_v2 videobuf2_memops drm_buddy i2c_dev videobuf2_v4l2 gpu_sched video intel_rapl_common snd_usb_audio videodev xfs drm_display_helper videobuf2_common snd_usbmidi_lib drm_ttm_helper ttm libcrc32c edac_mce_amd joydev mc irqbypass cec pcspkr wmi_bmof gigabyte_wmi k10temp i2c_piix4 tiny_power_button rc_core igb dca thermal button acpi_cpufreq fuse configfs ip_tables x_tables ext4 mbcache jbd2 hid_generic uas usb_storage usbhid crct10dif_pclmul crc32_pclmul crc32c_intel xhci_pci polyval_clmulni xhci_pci_renesas polyval_generic gf128mul xhci_hcd ghash_clmulni_intel sha512_ssse3 aesni_intel crypto_simd nvme cryptd usbcore ccp sr_mod sp5100_tco cdrom nvme_core wmi snd_emu10k1 snd_hwdep snd_util_mem snd_ac97_codec ac97_bus snd_pcm snd_timer snd_rawmidi snd_seq_device snd soundcore sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua
[ 165.339613] CPU: 18 PID: 6237 Comm: Renderer Kdump: loaded Tainted: G OE 6.1.0-1-g77856d911a8c-lp153.1.g31c4e7c-default #1 openSUSE Tumbleweed (unreleased) 0a5448e9051496e556e87d2b25245d24e80e3753
[ 165.339616] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F37d 07/27/2022
[ 165.339617] RIP: 0010:refcount_warn_saturate+0x97/0x110
[ 165.339619] Code: 01 01 e8 4e 86 61 00 0f 0b c3 cc cc cc cc 80 3d d9 75 6c 01 00 75 a8 48 c7 c7 68 6c 02 93 c6 05 c9 75 6c 01 01 e8 2b 86 61 00 <0f> 0b c3 cc cc cc cc 80 3d b3 75 6c 01 00 75 85 48 c7 c7 c0 6c 02
[ 165.339620] RSP: 0018:ffffa66a028378e8 EFLAGS: 00010282
[ 165.339621] RAX: 0000000000000000 RBX: ffffa66a02837960 RCX: ffff9001feea24c8
[ 165.339622] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff9001feea24c0
[ 165.339622] RBP: ffff8ff3ef1890c0 R08: ffffffff93fd1ac0 R09: ffffa66a02837880
[ 165.339623] R10: 0000000000000001 R11: 000000000000000d R12: 00000000ffffffff
[ 165.339623] R13: ffff8ff3ef1890c0 R14: ffffa66a02837ae8 R15: 0000000000139800
[ 165.339624] FS: 00007fa0e68eb700(0000) GS:ffff9001fee80000(0000) knlGS:0000000000000000
[ 165.339625] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 165.339625] CR2: 00007fa0b2a85000 CR3: 000000017f168000 CR4: 0000000000750ee0
[ 165.339626] PKRU: 55555554
[ 165.339627] Call Trace:
[ 165.339628] <TASK>
[ 165.339628] dma_resv_iter_walk_unlocked.part.0+0x14a/0x180
[ 165.339633] dma_resv_iter_first_unlocked+0x25/0x70
[ 165.339635] amdgpu_vm_sdma_update+0x63/0x360 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.339810] amdgpu_vm_ptes_update+0x2a6/0x7e0 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.339936] amdgpu_vm_update_range+0x21b/0x750 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.340059] amdgpu_vm_bo_update+0x287/0x570 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.340181] amdgpu_gem_va_ioctl+0x4e9/0x510 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.340302] ? __pfx_amdgpu_gem_create_ioctl+0x10/0x10 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.340421] ? __pfx_amdgpu_gem_va_ioctl+0x10/0x10 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.340540] drm_ioctl_kernel+0xb4/0x140
[ 165.340543] drm_ioctl+0x1e5/0x450
[ 165.340545] ? __pfx_amdgpu_gem_va_ioctl+0x10/0x10 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.340664] amdgpu_drm_ioctl+0x49/0x80 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.340780] __x64_sys_ioctl+0x8b/0xc0
[ 165.340784] do_syscall_64+0x5c/0x90
[ 165.340787] ? __x64_sys_futex+0x81/0x1c0
[ 165.340790] ? do_user_addr_fault+0x1db/0x6a0
[ 165.340793] ? exit_to_user_mode_prepare+0x190/0x1f0
[ 165.340794] ? syscall_exit_to_user_mode+0x17/0x40
[ 165.340797] ? do_syscall_64+0x69/0x90
[ 165.340798] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 165.340800] RIP: 0033:0x7fa0eb30cc27
[ 165.340801] Code: 90 90 90 48 8b 05 69 c2 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 c2 2d 00 f7 d8 64 89 01 48
[ 165.340802] RSP: 002b:00007fa0e68e8208 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 165.340803] RAX: ffffffffffffffda RBX: 00007fa065e4fac0 RCX: 00007fa0eb30cc27
[ 165.340803] RDX: 00007fa0e68e8250 RSI: 00000000c0286448 RDI: 0000000000000008
[ 165.340804] RBP: 00007fa0e68e8250 R08: 0000000139200000 R09: 000000000000000e
[ 165.340804] R10: 000000000000009d R11: 0000000000000246 R12: 00000000c0286448
[ 165.340805] R13: 0000000000000008 R14: 0000000001010000 R15: 00007fa0ebd95100
[ 165.340806] </TASK>
[ 165.340806] ---[ end trace 0000000000000000 ]---
[ 165.340815] ------------[ cut here ]------------
[ 165.340815] refcount_t: saturated; leaking memory.
[ 165.340818] WARNING: CPU: 18 PID: 6237 at lib/refcount.c:22 refcount_warn_saturate+0x51/0x110
[ 165.340822] Modules linked in: echainiv esp4 af_packet tun 8021q garp mrp stp llc iscsi_ibft iscsi_boot_sysfs xt_REDIRECT xt_MASQUERADE xt_nat iptable_nat nf_nat deflate sm4_generic sm4_aesni_avx2_x86_64 xt_LOG sm4_aesni_avx_x86_64 nf_log_syslog sm4 twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common camellia_generic xt_conntrack camellia_aesni_avx2 nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 camellia_aesni_avx_x86_64 camellia_x86_64 serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic blowfish_generic blowfish_x86_64 vmnet(OE) blowfish_common ppdev parport_pc parport cast5_avx_x86_64 vmw_vsock_vmci_transport cast5_generic cast_common ipt_REJECT nf_reject_ipv4 vsock des_generic libdes sm3_generic rfkill xt_tcpudp sm3_avx_x86_64 sm3 xt_set vmw_vmci cmac xcbc iptable_filter vmmon(OE) rmd160 bpfilter dmi_sysfs ip_set_hash_ip af_key ip_set xfrm_algo nfnetlink msr hwmon_vid dm_crypt essiv authenc trusted asn1_encoder tee amdgpu
[ 165.340844] intel_rapl_msr uvcvideo videobuf2_vmalloc iommu_v2 videobuf2_memops drm_buddy i2c_dev videobuf2_v4l2 gpu_sched video intel_rapl_common snd_usb_audio videodev xfs drm_display_helper videobuf2_common snd_usbmidi_lib drm_ttm_helper ttm libcrc32c edac_mce_amd joydev mc irqbypass cec pcspkr wmi_bmof gigabyte_wmi k10temp i2c_piix4 tiny_power_button rc_core igb dca thermal button acpi_cpufreq fuse configfs ip_tables x_tables ext4 mbcache jbd2 hid_generic uas usb_storage usbhid crct10dif_pclmul crc32_pclmul crc32c_intel xhci_pci polyval_clmulni xhci_pci_renesas polyval_generic gf128mul xhci_hcd ghash_clmulni_intel sha512_ssse3 aesni_intel crypto_simd nvme cryptd usbcore ccp sr_mod sp5100_tco cdrom nvme_core wmi snd_emu10k1 snd_hwdep snd_util_mem snd_ac97_codec ac97_bus snd_pcm snd_timer snd_rawmidi snd_seq_device snd soundcore sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua
[ 165.340867] CPU: 18 PID: 6237 Comm: Renderer Kdump: loaded Tainted: G W OE 6.1.0-1-g77856d911a8c-lp153.1.g31c4e7c-default #1 openSUSE Tumbleweed (unreleased) 0a5448e9051496e556e87d2b25245d24e80e3753
[ 165.340868] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F37d 07/27/2022
[ 165.340869] RIP: 0010:refcount_warn_saturate+0x51/0x110
[ 165.340870] Code: 84 bc 00 00 00 c3 cc cc cc cc 85 f6 74 46 80 3d 1e 76 6c 01 00 75 ee 48 c7 c7 68 6c 02 93 c6 05 0e 76 6c 01 01 e8 71 86 61 00 <0f> 0b c3 cc cc cc cc 80 3d fa 75 6c 01 00 75 cb 48 c7 c7 90 6c 02
[ 165.340871] RSP: 0018:ffffa66a02837788 EFLAGS: 00010282
[ 165.340872] RAX: 0000000000000000 RBX: ffff8ff3ef1890c0 RCX: ffff9001feea24c8
[ 165.340873] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff9001feea24c0
[ 165.340873] RBP: ffff8ff35d95a458 R08: ffffffff93fd1ac0 R09: ffffa66a02837720
[ 165.340874] R10: 0000000000000001 R11: 0000000000001000 R12: 0000000000000000
[ 165.340874] R13: ffff8ff3ef1890f8 R14: ffff8ff30fe25f20 R15: ffff8ff30fe25f00
[ 165.340875] FS: 00007fa0e68eb700(0000) GS:ffff9001fee80000(0000) knlGS:0000000000000000
[ 165.340876] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 165.340876] CR2: 00007fa0b2a85000 CR3: 000000017f168000 CR4: 0000000000750ee0
[ 165.340877] PKRU: 55555554
[ 165.340877] Call Trace:
[ 165.340878] <TASK>
[ 165.340878] ttm_bo_add_move_fence.constprop.0+0x148/0x160 [ttm fcae6956db0c19a17837d67fe868f42f3841f057]
[ 165.340884] ttm_bo_mem_space+0xb0/0x230 [ttm fcae6956db0c19a17837d67fe868f42f3841f057]
[ 165.340888] ttm_bo_validate+0xa0/0x140 [ttm fcae6956db0c19a17837d67fe868f42f3841f057]
[ 165.340892] ? ttm_bo_validate+0x5/0x140 [ttm fcae6956db0c19a17837d67fe868f42f3841f057]
[ 165.340896] ttm_bo_init_reserved+0xdc/0x1a0 [ttm fcae6956db0c19a17837d67fe868f42f3841f057]
[ 165.340900] amdgpu_bo_create+0x192/0x450 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.341019] ? __pfx_amdgpu_bo_destroy+0x10/0x10 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.341137] ? amdgpu_bo_gpu_offset_no_check+0x26/0x50 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.341255] amdgpu_bo_create_vm+0x2e/0x80 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.341373] amdgpu_vm_pt_create+0xf5/0x260 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.341495] ? __pfx_amdgpu_bo_destroy+0x10/0x10 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.341613] amdgpu_vm_ptes_update+0x504/0x7e0 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.341735] amdgpu_vm_update_range+0x21b/0x750 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.341856] amdgpu_vm_bo_update+0x287/0x570 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.341976] amdgpu_gem_va_ioctl+0x4e9/0x510 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.342103] ? __pfx_amdgpu_gem_create_ioctl+0x10/0x10 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.342221] ? __pfx_amdgpu_gem_va_ioctl+0x10/0x10 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.342340] drm_ioctl_kernel+0xb4/0x140
[ 165.342342] drm_ioctl+0x1e5/0x450
[ 165.342343] ? __pfx_amdgpu_gem_va_ioctl+0x10/0x10 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.342462] amdgpu_drm_ioctl+0x49/0x80 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.342576] __x64_sys_ioctl+0x8b/0xc0
[ 165.342578] do_syscall_64+0x5c/0x90
[ 165.342579] ? __x64_sys_futex+0x81/0x1c0
[ 165.342581] ? do_user_addr_fault+0x1db/0x6a0
[ 165.342583] ? exit_to_user_mode_prepare+0x190/0x1f0
[ 165.342584] ? syscall_exit_to_user_mode+0x17/0x40
[ 165.342585] ? do_syscall_64+0x69/0x90
[ 165.342586] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 165.342588] RIP: 0033:0x7fa0eb30cc27
[ 165.342589] Code: 90 90 90 48 8b 05 69 c2 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 c2 2d 00 f7 d8 64 89 01 48
[ 165.342589] RSP: 002b:00007fa0e68e8208 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 165.342590] RAX: ffffffffffffffda RBX: 00007fa065e4fac0 RCX: 00007fa0eb30cc27
[ 165.342591] RDX: 00007fa0e68e8250 RSI: 00000000c0286448 RDI: 0000000000000008
[ 165.342591] RBP: 00007fa0e68e8250 R08: 0000000139200000 R09: 000000000000000e
[ 165.342592] R10: 000000000000009d R11: 0000000000000246 R12: 00000000c0286448
[ 165.342592] R13: 0000000000000008 R14: 0000000001010000 R15: 00007fa0ebd95100
[ 165.342594] </TASK>
[ 165.342594] ---[ end trace 0000000000000000 ]---
[ 165.399457] CPU: 14 PID: 1165 Comm: sdma0 Kdump: loaded Tainted: G W OE 6.1.0-1-g77856d911a8c-lp153.1.g31c4e7c-default #1 openSUSE Tumbleweed (unreleased) 0a5448e9051496e556e87d2b25245d24e80e3753
[ 165.404860] refcount_t: addition on 0; use-after-free.
[ 165.404864] WARNING: CPU: 0 PID: 2397 at lib/refcount.c:25 refcount_warn_saturate+0x74/0x110
[ 165.410175] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F37d 07/27/2022
[ 165.410176] RIP: 0010:refcount_warn_saturate+0xba/0x110
[ 165.415825] Modules linked in: echainiv esp4
[ 165.425112] Code: 01 01 e8 2b 86 61 00 0f 0b c3 cc cc cc cc 80 3d b3 75 6c 01 00 75 85 48 c7 c7 c0 6c 02 93 c6 05 a3 75 6c 01 01 e8 08 86 61 00 <0f> 0b c3 cc cc cc cc 80 3d 8e 75 6c 01 00 0f 85 5e ff ff ff 48 c7
[ 165.425114] RSP: 0018:ffffa66a010bbe68 EFLAGS: 00010286
[ 165.513048] af_packet tun 8021q
[ 165.594093]
[ 165.594094] RAX: 0000000000000000 RBX: ffff8ff38e79be78 RCX: 0000000000000000
[ 165.594095] RDX: ffff9001fedaf168 RSI: ffff9001feda24c0 RDI: ffff9001feda24c0
[ 165.613896] garp mrp stp llc
[ 165.624950] RBP: ffff8ff30fe25e28 R08: 0000000000000000 R09: 00000000fffeffff
[ 165.624951] R10: ffffa66a010bbd20 R11: ffff9001fe7fffe8 R12: ffff8ff3ef15adc0
[ 165.624951] R13: ffff8ff38e79be00 R14: ffff8ff30fe2cab8 R15: ffff8ff38e79a800
[ 165.630971] iscsi_ibft iscsi_boot_sysfs xt_REDIRECT
[ 165.650763] FS: 0000000000000000(0000) GS:ffff9001fed80000(0000) knlGS:0000000000000000
[ 165.650764] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 165.656782] xt_MASQUERADE xt_nat iptable_nat
[ 165.664746] CR2: 00007fa064012008 CR3: 0000000610610000 CR4: 0000000000750ee0
[ 165.664747] PKRU: 55555554
[ 165.672621] nf_nat deflate sm4_generic sm4_aesni_avx2_x86_64
[ 165.680654] Call Trace:
[ 165.680656] <TASK>
[ 165.688616] xt_LOG sm4_aesni_avx_x86_64 nf_log_syslog
[ 165.696582] drm_sched_entity_pop_job+0x1b7/0x430 [gpu_sched 49a78bc0ca14ba43b21b85e650a83354884c8272]
[ 165.705515] sm4 twofish_generic twofish_avx_x86_64
[ 165.712067] drm_sched_main+0xcd/0x400 [gpu_sched 49a78bc0ca14ba43b21b85e650a83354884c8272]
[ 165.720030] twofish_x86_64_3way twofish_x86_64 twofish_common camellia_generic
[ 165.723402] ? __pfx_autoremove_wake_function+0x10/0x10
[ 165.726668] xt_conntrack camellia_aesni_avx2
[ 165.729510] ? __pfx_drm_sched_main+0x10/0x10 [gpu_sched 49a78bc0ca14ba43b21b85e650a83354884c8272]
[ 165.735885] nf_conntrack nf_defrag_ipv6
[ 165.741553] kthread+0xd8/0x100
[ 165.751105] nf_defrag_ipv4 camellia_aesni_avx_x86_64 camellia_x86_64 serpent_avx2
[ 165.760657] ? __pfx_kthread+0x10/0x10
[ 165.760658] ret_from_fork+0x2c/0x50
[ 165.770457] serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic blowfish_generic
[ 165.779833] </TASK>
[ 165.779834] ---[ end trace 0000000000000000 ]---
[ 165.789370] blowfish_x86_64
[ 165.805263] BUG: unable to handle page fault for address: ffffffffc0d1d5b0
[ 165.809709] vmnet(OE)
[ 165.814401] #PF: supervisor write access in kernel mode
[ 165.814402] #PF: error_code(0x0003) - permissions violation
[ 165.814403] PGD 610615067 P4D 610615067 PUD 610617067
[ 165.818569] blowfish_common
[ 165.828559] PMD 10b3cb067 PTE 10fad5161
[ 165.828560] Oops: 0003 [#1] PREEMPT SMP NOPTI
[ 165.828562] CPU: 15 PID: 0 Comm: swapper/15 Kdump: loaded Tainted: G W OE 6.1.0-1-g77856d911a8c-lp153.1.g31c4e7c-default #1 openSUSE Tumbleweed (unreleased) 0a5448e9051496e556e87d2b25245d24e80e3753
[ 165.837584] ppdev
[ 165.842100] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F37d 07/27/2022
[ 165.842101] RIP: 0010:dma_fence_signal_timestamp_locked+0x45/0x100
[ 165.846356] parport_pc
[ 165.851213] Code: 31 c0 f0 48 0f ba 6f 30 00 0f 82 c6 00 00 00 48 8b 47 10 49 89 e4 48 89 fd 4c 89 60 08 48 89 04 24 48 8b 47 18 48 89 44 24 08 <4c> 89 20 48 89 77 10 f0 80 4f 30 02 66 90 48 8b 34 24 48 8b 1e 48
[ 165.851214] RSP: 0018:ffffa66a00584d78 EFLAGS: 00010046
[ 165.851215] RAX: ffffffffc0d1d5b0 RBX: ffff8ff3ef1890c0 RCX: 0000000000000017
[ 165.856351] parport
[ 165.862102] RDX: 0000000000bcc8f2 RSI: 0000002686959ae3 RDI: ffff8ff3ef1890c0
[ 165.862103] RBP: ffff8ff3ef1890c0 R08: 0000000000000000 R09: ffffffffffffffff
[ 165.862104] R10: 0000000000000000 R11: ffffa66a00584ff8 R12: ffffa66a00584d78
[ 165.867682] cast5_avx_x86_64
[ 165.872196] R13: 0000000000000001 R14: ffff8ff30fe20000 R15: 00000000ffffffff
[ 165.872197] FS: 0000000000000000(0000) GS:ffff9001fedc0000(0000) knlGS:0000000000000000
[ 165.872198] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 165.872198] CR2: ffffffffc0d1d5b0 CR3: 0000000610610000 CR4: 0000000000750ee0
[ 165.878040] vmw_vsock_vmci_transport
[ 165.882379] PKRU: 55555554
[ 165.882379] Call Trace:
[ 165.882380] <IRQ>
[ 165.882382] ? __pfx_drm_sched_fence_free_rcu+0x10/0x10 [gpu_sched 49a78bc0ca14ba43b21b85e650a83354884c8272]
[ 165.902085] cast5_generic
[ 165.910561] dma_fence_signal+0x2c/0x50
[ 165.910563] drm_sched_job_done.isra.0+0x5d/0x130 [gpu_sched 49a78bc0ca14ba43b21b85e650a83354884c8272]
[ 165.918524] cast_common
[ 165.926485] dma_fence_signal_timestamp_locked+0x7a/0x100
[ 165.926486] dma_fence_signal+0x2c/0x50
[ 165.934447] ipt_REJECT
[ 165.942408] amdgpu_fence_process+0xce/0x140 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.950370] nf_reject_ipv4
[ 165.953301] sdma_v3_0_process_trap_irq+0x64/0x90 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.958614] vsock
[ 165.964085] amdgpu_irq_dispatch+0x106/0x230 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 165.969664] des_generic
[ 165.979037] amdgpu_ih_process+0x66/0x100 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 166.066990] libdes
[ 166.147963] amdgpu_irq_handler+0x1f/0x60 [amdgpu a70a7113837b00a71a821b24709a752a95bf48d2]
[ 166.167669] sm3_generic
[ 166.178794] __handle_irq_event_percpu+0x46/0x190
[ 166.184816] rfkill
[ 166.204608] handle_irq_event+0x34/0x70
[ 166.204610] handle_edge_irq+0x87/0x220
[ 166.210632] xt_tcpudp
[ 166.218592] __common_interrupt+0x3e/0xa0
[ 166.218595] common_interrupt+0x7b/0xa0
[ 166.226557] sm3_avx_x86_64
[ 166.234518] </IRQ>
[ 166.234518] <TASK>
[ 166.234518] asm_common_interrupt+0x22/0x40
[ 166.242393] sm3
[ 166.250425] RIP: 0010:cpuidle_enter_state+0xd8/0x410
[ 166.250428] Code: 00 00 31 ff e8 a9 e4 7b ff 45 84 ff 74 16 9c 58 0f 1f 40 00 f6 c4 02 0f 85 18 03 00 00 31 ff e8 4e 81 83 ff fb 0f 1f 44 00 00 <45> 85 f6 0f 88 75 01 00 00 49 63 c6 4c 2b 2c 24 48 8d 14 40 48 8d
[ 166.259362] xt_set
[ 166.265911] RSP: 0018:ffffa66a001efe90 EFLAGS: 00000246
[ 166.265912] RAX: ffff9001fedf3fc0 RBX: 0000000000000001 RCX: 000000000000001f
[ 166.265913] RDX: 0000000000000000 RSI: 0000000022983930 RDI: 0000000000000000
[ 166.265914] RBP: ffff8ff3067a6800 R08: 000000269ac37f1f R09: 0000000000000008
[ 166.273875] vmw_vmci
[ 166.277332] R10: 0000000000000006 R11: 0000000000000001 R12: ffffffff9362ad80
[ 166.277333] R13: 000000269ac37f1f R14: 0000000000000001 R15: 0000000000000000
[ 166.277334] cpuidle_enter+0x29/0x40
[ 166.280531] cmac
[ 166.283370] do_idle+0x1fc/0x2a0
[ 166.283372] cpu_startup_entry+0x19/0x20
[ 166.293807] xcbc
[ 166.302649] start_secondary+0x114/0x140
[ 166.302652] secondary_startup_64_no_verify+0xe5/0xeb
[ 166.311408] iptable_filter
[ 166.320250] </TASK>
[ 166.320250] Modules linked in: echainiv esp4 af_packet tun 8021q
[ 166.329448] vmmon(OE)
[ 166.338644] garp mrp stp llc iscsi_ibft iscsi_boot_sysfs xt_REDIRECT xt_MASQUERADE xt_nat iptable_nat nf_nat deflate sm4_generic sm4_aesni_avx2_x86_64 xt_LOG sm4_aesni_avx_x86_64 nf_log_syslog sm4 twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common camellia_generic xt_conntrack camellia_aesni_avx2 nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 camellia_aesni_avx_x86_64 camellia_x86_64 serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic blowfish_generic blowfish_x86_64 vmnet(OE) blowfish_common ppdev parport_pc parport cast5_avx_x86_64 vmw_vsock_vmci_transport cast5_generic cast_common ipt_REJECT nf_reject_ipv4 vsock des_generic libdes sm3_generic rfkill xt_tcpudp sm3_avx_x86_64 sm3 xt_set vmw_vmci cmac xcbc iptable_filter vmmon(OE) rmd160 bpfilter dmi_sysfs ip_set_hash_ip af_key ip_set xfrm_algo nfnetlink msr hwmon_vid dm_crypt essiv authenc trusted asn1_encoder tee amdgpu intel_rapl_msr uvcvideo videobuf2_vmalloc iommu_v2
[ 166.338664] videobuf2_memops drm_buddy i2c_dev videobuf2_v4l2 gpu_sched video intel_rapl_common snd_usb_audio videodev xfs drm_display_helper videobuf2_common snd_usbmidi_lib drm_ttm_helper ttm libcrc32c edac_mce_amd joydev mc irqbypass cec pcspkr wmi_bmof gigabyte_wmi k10temp i2c_piix4 tiny_power_button rc_core igb dca thermal button acpi_cpufreq fuse configfs ip_tables x_tables ext4 mbcache jbd2 hid_generic uas usb_storage usbhid crct10dif_pclmul crc32_pclmul crc32c_intel xhci_pci polyval_clmulni xhci_pci_renesas polyval_generic gf128mul xhci_hcd ghash_clmulni_intel sha512_ssse3 aesni_intel crypto_simd nvme cryptd usbcore ccp sr_mod sp5100_tco cdrom nvme_core wmi snd_emu10k1 snd_hwdep snd_util_mem snd_ac97_codec ac97_bus snd_pcm snd_timer snd_rawmidi snd_seq_device snd soundcore sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua
[ 166.338688] CR2: ffffffffc0d1d5b0
------------------------------------------------------------------------------


Commit b6bb9676f216:
------------------------------------------------------------------------------
[ 87.503013] ------------[ cut here ]------------
[ 87.508432] refcount_t: addition on 0; use-after-free.
[ 87.514384] WARNING: CPU: 3 PID: 5171 at lib/refcount.c:25 refcount_warn_saturate+0x74/0x110
[ 87.523723] Modules linked in: 8021q garp mrp stp llc iscsi_ibft iscsi_boot_sysfs xt_REDIRECT xt_MASQUERADE xt_nat iptable_nat nf_nat deflate sm4_generic sm4_aesni_avx2_x86_64 xt_LOG nf_log_syslog sm4_aesni_avx_x86_64 sm4 twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 xt_conntrack twofish_common nf_conntrack camellia_generic nf_defrag_ipv6 nf_defrag_ipv4 camellia_aesni_avx2 camellia_aesni_avx_x86_64 camellia_x86_64 serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic blowfish_generic blowfish_x86_64 vmnet(OE) blowfish_common ppdev parport_pc parport ipt_REJECT cast5_avx_x86_64 vmw_vsock_vmci_transport cast5_generic nf_reject_ipv4 cast_common vsock des_generic rfkill libdes xt_tcpudp vmw_vmci sm3_generic xt_set sm3_avx_x86_64 sm3 cmac xcbc iptable_filter vmmon(OE) rmd160 bpfilter dmi_sysfs ip_set_hash_ip af_key ip_set xfrm_algo nfnetlink msr hwmon_vid dm_crypt essiv authenc trusted asn1_encoder tee amdgpu uvcvideo iommu_v2
[ 87.523777] videobuf2_vmalloc
[ 87.532990] ------------[ cut here ]------------
[ 87.610862] drm_buddy
[ 87.614674] refcount_t: saturated; leaking memory.
[ 87.614679] WARNING: CPU: 20 PID: 1152 at lib/refcount.c:22 refcount_warn_saturate+0x51/0x110
[ 87.620087] videobuf2_memops gpu_sched
[ 87.623194] Modules linked in: 8021q garp mrp stp llc
[ 87.628777] videobuf2_v4l2 video
[ 87.638150] iscsi_ibft iscsi_boot_sysfs xt_REDIRECT xt_MASQUERADE xt_nat iptable_nat nf_nat deflate sm4_generic sm4_aesni_avx2_x86_64
[ 87.642674] videodev
[ 87.648587] xt_LOG nf_log_syslog sm4_aesni_avx_x86_64 sm4 twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 xt_conntrack twofish_common
[ 87.652678] intel_rapl_msr
[ 87.665662] nf_conntrack camellia_generic nf_defrag_ipv6 nf_defrag_ipv4 camellia_aesni_avx2 camellia_aesni_avx_x86_64 camellia_x86_64 serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic blowfish_generic blowfish_x86_64 vmnet(OE) blowfish_common ppdev parport_pc parport ipt_REJECT cast5_avx_x86_64 vmw_vsock_vmci_transport cast5_generic nf_reject_ipv4 cast_common vsock des_generic rfkill libdes xt_tcpudp vmw_vmci sm3_generic xt_set sm3_avx_x86_64 sm3 cmac xcbc iptable_filter vmmon(OE) rmd160 bpfilter dmi_sysfs ip_set_hash_ip af_key ip_set xfrm_algo nfnetlink msr hwmon_vid dm_crypt essiv authenc trusted asn1_encoder tee amdgpu uvcvideo iommu_v2 videobuf2_vmalloc drm_buddy videobuf2_memops gpu_sched videobuf2_v4l2 video
[ 87.668700] snd_usb_audio
[ 87.683636] videodev intel_rapl_msr snd_usb_audio i2c_dev intel_rapl_common drm_display_helper snd_usbmidi_lib videobuf2_common xfs drm_ttm_helper libcrc32c tiny_power_button
[ 87.687191] i2c_dev
[ 87.753864] edac_mce_amd mc joydev irqbypass ttm pcspkr wmi_bmof gigabyte_wmi k10temp i2c_piix4 cec igb rc_core dca thermal button acpi_cpufreq fuse configfs ip_tables x_tables ext4 mbcache jbd2 hid_generic uas usb_storage usbhid crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel xhci_pci xhci_pci_renesas sha512_ssse3 xhci_hcd aesni_intel crypto_simd cryptd ccp usbcore nvme sp5100_tco
[ 87.757349] intel_rapl_common
[ 87.773955] sr_mod cdrom nvme_core wmi snd_emu10k1 snd_hwdep snd_util_mem snd_ac97_codec ac97_bus snd_pcm
[ 87.776898] drm_display_helper
[ 87.817003] snd_timer
[ 87.820815] snd_usbmidi_lib videobuf2_common
[ 87.831337] snd_rawmidi snd_seq_device snd soundcore sg dm_multipath
[ 87.835241] xfs
[ 87.838348] dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua
[ 87.838351] CPU: 20 PID: 1152 Comm: sdma0 Kdump: loaded Tainted: G OE 6.1.0-2-gb6bb9676f216-lp153.1.g2c89d56-default #1 openSUSE Tumbleweed (unreleased) c1e72e89c7d509c25789f27966fb394c07b2a770
[ 87.838353] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F37d 07/27/2022
[ 87.838355] RIP: 0010:refcount_warn_saturate+0x51/0x110
[ 87.838357] Code: 84 bc 00 00 00 c3 cc cc cc cc 85 f6 74 46 80 3d be 76 6c 01 00 75 ee 48 c7 c7 20 87 42 b3 c6 05 ae 76 6c 01 01 e8 b3 56 64 00 <0f> 0b c3 cc cc cc cc 80 3d 9a 76 6c 01 00 75 cb 48 c7 c7 48 87 42
[ 87.838359] RSP: 0018:ffffb65c01113e68 EFLAGS: 00010286
[ 87.838360] RAX: 0000000000000000 RBX: ffff98f20cc43780 RCX: ffff98fffef224c8
[ 87.838360] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff98fffef224c0
[ 87.838361] RBP: ffff98f103425e28 R08: ffffffffb43d6ac0 R09: ffffb65c01113e00
[ 87.843499] drm_ttm_helper
[ 87.850755] R10: 0000000000000001 R11: 0000000037313554 R12: 0000000000000000
[ 87.850756] R13: ffff98f13d4b0e00 R14: ffff98f10342cab8 R15: ffff98f13d4b1200
[ 87.850757] FS: 0000000000000000(0000) GS:ffff98fffef00000(0000) knlGS:0000000000000000
[ 87.850758] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 87.850759] CR2: 00007f19ad8b24f8 CR3: 000000016e260000 CR4: 0000000000750ee0
[ 87.853250] libcrc32c
[ 87.859518] PKRU: 55555554
[ 87.859519] Call Trace:
[ 87.859520] <TASK>
[ 87.859522] drm_sched_entity_pop_job+0x3b3/0x430 [gpu_sched 6d68b5979a6bfbb06286516e2330a5b15acb01f1]
[ 87.859529] drm_sched_main+0xcd/0x400 [gpu_sched 6d68b5979a6bfbb06286516e2330a5b15acb01f1]
[ 87.879076] tiny_power_button
[ 87.890126] ? __pfx_autoremove_wake_function+0x10/0x10
[ 87.896147] edac_mce_amd mc
[ 87.915872] ? __pfx_drm_sched_main+0x10/0x10 [gpu_sched 6d68b5979a6bfbb06286516e2330a5b15acb01f1]
[ 87.921956] joydev irqbypass ttm pcspkr
[ 87.929842] kthread+0xd8/0x100
[ 87.937870] wmi_bmof gigabyte_wmi k10temp i2c_piix4 cec igb rc_core dca thermal button
[ 87.945836] ? __pfx_kthread+0x10/0x10
[ 87.945838] ret_from_fork+0x2c/0x50
[ 87.945841] </TASK>
[ 87.945842] ---[ end trace 0000000000000000 ]---
[ 87.945847] ------------[ cut here ]------------
[ 87.945848] refcount_t: underflow; use-after-free.
[ 87.945851] WARNING: CPU: 20 PID: 1152 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110
[ 87.945869] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 87.945870] #PF: supervisor write access in kernel mode
[ 87.945871] #PF: error_code(0x0002) - not-present page
[ 87.945872] PGD 0 P4D 0
[ 87.945874] Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 87.945875] CPU: 15 PID: 0 Comm: swapper/15 Kdump: loaded Tainted: G W OE 6.1.0-2-gb6bb9676f216-lp153.1.g2c89d56-default #1 openSUSE Tumbleweed (unreleased) c1e72e89c7d509c25789f27966fb394c07b2a770
[ 87.945878] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F37d 07/27/2022
[ 87.945879] RIP: 0010:dma_fence_signal_timestamp_locked+0x34/0x100
[ 87.945882] Code: 83 ec 18 65 48 8b 04 25 28 00 00 00 48 89 44 24 10 31 c0 f0 48 0f ba 6f 30 00 0f 82 c6 00 00 00 48 8b 47 10 49 89 e4 48 89 fd <4c> 89 60 08 48 89 04 24 48 8b 47 18 48 89 44 24 08 4c 89 20 48 89
[ 87.945884] RSP: 0018:ffffb65c00584d78 EFLAGS: 00010046
[ 87.945885] RAX: 0000000000000000 RBX: ffff98f20cc43780 RCX: 0000000000000017
[ 87.945886] RDX: 0000000000fe7af8 RSI: 00000014661fbce8 RDI: ffff98f20cc43780
[ 87.945887] RBP: ffff98f20cc43780 R08: 00000000000000e0 R09: 00000000000000e0
[ 87.945887] R10: 0000000000000000 R11: ffffb65c00584ff8 R12: ffffb65c00584d78
[ 87.945888] R13: 0000000000000002 R14: ffff98f103420000 R15: 00000000ffffffff
[ 87.945889] FS: 0000000000000000(0000) GS:ffff98fffedc0000(0000) knlGS:0000000000000000
[ 87.945890] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 87.945891] CR2: 0000000000000008 CR3: 0000000133274000 CR4: 0000000000750ee0
[ 87.945892] PKRU: 55555554
[ 87.945892] Call Trace:
[ 87.945894] <IRQ>
[ 87.945895] ? __pfx_read_tsc+0x10/0x10
[ 87.945898] dma_fence_signal+0x2c/0x50
[ 87.945902] drm_sched_job_done.isra.0+0x5d/0x130 [gpu_sched 6d68b5979a6bfbb06286516e2330a5b15acb01f1]
[ 87.945908] dma_fence_signal_timestamp_locked+0x7a/0x100
[ 87.945909] dma_fence_signal+0x2c/0x50
[ 87.945911] amdgpu_fence_process+0xce/0x140 [amdgpu 8c17887efba5d4ec504ba4125e259aa15c52f60d]
[ 87.946145] sdma_v3_0_process_trap_irq+0x64/0x90 [amdgpu 8c17887efba5d4ec504ba4125e259aa15c52f60d]
[ 87.946387] amdgpu_irq_dispatch+0x106/0x230 [amdgpu 8c17887efba5d4ec504ba4125e259aa15c52f60d]
[ 87.946620] amdgpu_ih_process+0x66/0x100 [amdgpu 8c17887efba5d4ec504ba4125e259aa15c52f60d]
[ 87.946849] amdgpu_irq_handler+0x1f/0x60 [amdgpu 8c17887efba5d4ec504ba4125e259aa15c52f60d]
[ 87.947077] __handle_irq_event_percpu+0x46/0x190
[ 87.947081] handle_irq_event+0x34/0x70
[ 87.947083] handle_edge_irq+0x87/0x220
[ 87.947085] __common_interrupt+0x3e/0xa0
[ 87.947088] common_interrupt+0x7b/0xa0
[ 87.947091] </IRQ>
[ 87.947092] <TASK>
[ 87.947092] asm_common_interrupt+0x22/0x40
[ 87.947094] RIP: 0010:cpuidle_enter_state+0xd8/0x410
[ 87.947097] Code: 00 00 31 ff e8 e9 15 79 ff 45 84 ff 74 16 9c 58 0f 1f 40 00 f6 c4 02 0f 85 18 03 00 00 31 ff e8 7e b2 80 ff fb 0f 1f 44 00 00 <45> 85 f6 0f 88 75 01 00 00 49 63 c6 4c 2b 2c 24 48 8d 14 40 48 8d
[ 87.947099] RSP: 0018:ffffb65c001efe90 EFLAGS: 00000246
[ 87.947100] RAX: ffff98fffedf3fc0 RBX: 0000000000000002 RCX: 000000000000001f
[ 87.947101] RDX: 0000000000000000 RSI: 00000000229839cd RDI: 0000000000000000
[ 87.947101] RBP: ffff98f101e78800 R08: 0000001479fb6ed2 R09: 0000000000000018
[ 87.947102] R10: 00000000000009a0 R11: 00000000000005f6 R12: ffffffffb3a2ad80
[ 87.947103] R13: 0000001479fb6ed2 R14: 0000000000000002 R15: 0000000000000000
[ 87.947105] cpuidle_enter+0x29/0x40
[ 87.947106] do_idle+0x1fc/0x2a0
[ 87.947109] cpu_startup_entry+0x19/0x20
[ 87.947110] start_secondary+0x114/0x140
[ 87.947112] secondary_startup_64_no_verify+0xe5/0xeb
[ 87.947116] </TASK>
[ 87.947116] Modules linked in: 8021q garp mrp stp llc iscsi_ibft iscsi_boot_sysfs xt_REDIRECT xt_MASQUERADE xt_nat iptable_nat nf_nat deflate sm4_generic sm4_aesni_avx2_x86_64 xt_LOG nf_log_syslog sm4_aesni_avx_x86_64 sm4 twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 xt_conntrack twofish_common nf_conntrack camellia_generic nf_defrag_ipv6 nf_defrag_ipv4 camellia_aesni_avx2 camellia_aesni_avx_x86_64 camellia_x86_64 serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic blowfish_generic blowfish_x86_64 vmnet(OE) blowfish_common ppdev parport_pc parport ipt_REJECT cast5_avx_x86_64 vmw_vsock_vmci_transport cast5_generic nf_reject_ipv4 cast_common vsock des_generic rfkill libdes xt_tcpudp vmw_vmci sm3_generic xt_set sm3_avx_x86_64 sm3 cmac xcbc iptable_filter vmmon(OE) rmd160 bpfilter dmi_sysfs ip_set_hash_ip af_key ip_set xfrm_algo nfnetlink msr hwmon_vid dm_crypt essiv authenc trusted asn1_encoder tee amdgpu uvcvideo iommu_v2
[ 87.947148] videobuf2_vmalloc drm_buddy videobuf2_memops gpu_sched videobuf2_v4l2 video videodev intel_rapl_msr snd_usb_audio i2c_dev intel_rapl_common drm_display_helper snd_usbmidi_lib videobuf2_common xfs drm_ttm_helper libcrc32c tiny_power_button edac_mce_amd mc joydev irqbypass ttm pcspkr wmi_bmof gigabyte_wmi k10temp i2c_piix4 cec igb rc_core dca thermal button acpi_cpufreq fuse configfs ip_tables x_tables ext4 mbcache jbd2 hid_generic uas usb_storage usbhid crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel xhci_pci xhci_pci_renesas sha512_ssse3 xhci_hcd aesni_intel crypto_simd cryptd ccp usbcore nvme sp5100_tco sr_mod cdrom nvme_core wmi snd_emu10k1 snd_hwdep snd_util_mem snd_ac97_codec ac97_bus snd_pcm snd_timer snd_rawmidi snd_seq_device snd soundcore sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua
[ 87.947180] CR2: 0000000000000008
------------------------------------------------------------------------------

2022-12-22 23:01:25

by Borislav Petkov

Subject: Re: amdgpu refcount saturation

On Thu, Dec 22, 2022 at 10:20:37PM +0100, Michal Kubecek wrote:
> Unfortunately, just like Boris, I always seem to have multiple stack
> traces tangled together.

See if this fixes it:

https://lore.kernel.org/r/[email protected]

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette