2022-01-16 06:46:57

by Borislav Petkov

Subject: RIP: 0010:radeon_vm_fini+0x15/0x220 [radeon]

Hi folks,

so this is a *very* old K8 laptop - yap, you read it right, family 0xf.

[ 31.353032] powernow_k8: fid 0xa (1800 MHz), vid 0xa
[ 31.353569] powernow_k8: fid 0x8 (1600 MHz), vid 0xc
[ 31.354081] powernow_k8: fid 0x0 (800 MHz), vid 0x16
[ 31.354844] powernow_k8: Found 1 AMD Turion(tm) 64 Mobile Technology MT-34 (1 cpu cores) (version 2.20.00)

This is a true story.

Anyway, it blows up, see below.

Kernel is latest Linus tree, top commit is:

a33f5c380c4b ("Merge tag 'xfs-5.17-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux")

I can bisect if you don't immediately see why it blows up.

HTH.

[ 37.904144] [drm] radeon kernel modesetting enabled.
[ 37.904823] radeon 0000:01:05.0: vgaarb: deactivate vga console
[ 37.907351] Console: switching to colour dummy device 80x25
[ 37.909076] [drm] initializing kernel modesetting (RS480 0x1002:0x5955 0x10CF:0x1302 0x00).
[ 37.909767] resource sanity check: requesting [mem 0x000c0000-0x000dffff], which spans more than pnp 00:01 [mem 0x000d1800-0x000d1fff]
[ 37.911775] caller pci_map_rom+0x78/0x1d0 mapping multiple BARs
[ 37.912498] [drm] Generation 2 PCI interface, using max accessible memory
[ 37.913450] radeon 0000:01:05.0: VRAM: 64M 0x000000003C000000 - 0x000000003FFFFFFF (64M used)
[ 37.914199] radeon 0000:01:05.0: GTT: 512M 0x0000000040000000 - 0x000000005FFFFFFF
[ 37.914856] [drm] Detected VRAM RAM=64M, BAR=128M
[ 37.915758] [drm] RAM width 128bits DDR
[ 37.916181] [drm] radeon: 64M of VRAM memory ready
[ 37.917328] [drm] radeon: 512M of GTT memory ready.
[ 37.917832] [drm] GART: num cpu pages 131072, num gpu pages 131072
[ 37.923315] [drm] radeon: power management initialized
[ 37.923827] [drm] radeon: 2 quad pipes, 1 z pipes initialized.
[ 37.924372] [drm] PCIE GART of 512M enabled (table at 0x0000000008400000).
[ 37.925900] radeon 0000:01:05.0: WB enabled
[ 37.926436] radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x0000000040000000
[ 37.929710] radeon 0000:01:05.0: radeon: MSI limited to 32-bit
[ 37.930282] [drm] radeon: irq initialized.
[ 37.930826] [drm] Loading R300 Microcode
[ 38.093492] [drm] radeon: ring at 0x0000000040001000
[ 38.094022] [drm] ring test succeeded in 4 usecs
[ 38.094561] [drm] ib test succeeded in 0 usecs
[ 38.096275] [drm] Panel ID String: 1024x768
[ 38.096762] [drm] Panel Size 1024x768
[ 38.097741] [drm] Radeon Display Connectors
[ 38.098107] [drm] Connector 0:
[ 38.098500] [drm] VGA-1
[ 38.098779] [drm] DDC: 0x68 0x68 0x68 0x68 0x68 0x68 0x68 0x68
[ 38.099321] [drm] Encoders:
[ 38.099708] [drm] CRT1: INTERNAL_DAC2
[ 38.100075] [drm] Connector 1:
[ 38.100440] [drm] LVDS-1
[ 38.100761] [drm] Encoders:
[ 38.101126] [drm] LCD1: INTERNAL_LVDS
[ 38.101491] [drm] Connector 2:
[ 38.101883] [drm] SVIDEO-1
[ 38.102248] [drm] Encoders:
[ 38.102613] [drm] TV1: INTERNAL_DAC2
[ 38.103023] BUG: kernel NULL pointer dereference, address: 0000000000000023
[ 38.103564] #PF: supervisor read access in kernel mode
[ 38.104018] #PF: error_code(0x0000) - not-present page
[ 38.104472] PGD 0 P4D 0
[ 38.104728] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 38.104728] CPU: 0 PID: 349 Comm: systemd-udevd Not tainted 5.16.0+ #1
[ 38.104728] Hardware name: FUJITSU SIEMENS LIFEBOOK S2110/FJNB19A, BIOS Version 1.11 05/19/2006
[ 38.104728] RIP: 0010:radeon_vm_fini+0x15/0x220 [radeon]
[ 38.104728] Code: e1 d7 c3 ff b8 f4 ff ff ff eb b6 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 57 41 56 41 55 49 89 fd 41 54 49 89 f4 55 53 <48> 8b 46 20 48 85 c0 0f 85 ae fd 01 00 4d 8d 74 24 20 4c 89 f7 e8
[ 38.104728] RSP: 0018:ffffc900004b3a98 EFLAGS: 00010202
[ 38.104728] RAX: 0000000000000001 RBX: ffff8880081a4000 RCX: 0000000000000000
[ 38.104728] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff888008168000
[ 38.104728] RBP: ffff88800432b200 R08: 0000000000000000 R09: ffff88800432b200
[ 38.104728] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000003
[ 38.104728] R13: ffff888008168000 R14: ffff8880081a4000 R15: 0000000000000003
[ 38.104728] FS: 00007fe2cfab18c0(0000) GS:ffff888039000000(0000) knlGS:0000000000000000
[ 38.104728] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 38.104728] CR2: 0000000000000023 CR3: 000000000820e000 CR4: 00000000000006f0
[ 38.104728] Call Trace:
[ 38.104728] <TASK>
[ 38.104728] radeon_driver_open_kms+0x6b/0x1b0 [radeon]
[ 38.104728] drm_file_alloc+0x19c/0x260 [drm]
[ 38.104728] drm_client_init+0xdc/0x190 [drm]
[ 38.104728] drm_fb_helper_init+0x3a/0x60 [drm_kms_helper]
[ 38.104728] radeon_fbdev_init+0x8e/0x130 [radeon]
[ 38.104728] radeon_modeset_init.cold+0x206/0x521 [radeon]
[ 38.104728] radeon_driver_load_kms+0xe5/0x1f0 [radeon]
[ 38.104728] drm_dev_register+0xfc/0x1e0 [drm]
[ 38.104728] radeon_pci_probe+0xc6/0x100 [radeon]
[ 38.104728] pci_device_probe+0xbb/0x170
[ 38.104728] really_probe+0xca/0x3c0
[ 38.104728] __driver_probe_device+0xfe/0x180
[ 38.104728] driver_probe_device+0x2c/0xb0
[ 38.104728] __driver_attach+0xc5/0x1d0
[ 38.104728] ? __device_attach_driver+0xf0/0xf0
[ 38.104728] ? __device_attach_driver+0xf0/0xf0
[ 38.104728] bus_for_each_dev+0x7a/0xc0
[ 38.104728] ? klist_add_tail+0x4f/0x90
[ 38.104728] bus_add_driver+0x16b/0x210
[ 38.104728] driver_register+0x8b/0xe0
[ 38.104728] ? 0xffffffffa0758000
[ 38.104728] do_one_initcall+0x44/0x200
[ 38.104728] ? kmem_cache_alloc_trace+0xb3/0x1f0
[ 38.104728] do_init_module+0x5c/0x260
[ 38.104728] __do_sys_finit_module+0xca/0x140
[ 38.104728] do_syscall_64+0x3b/0x80
[ 38.104728] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 38.104728] RIP: 0033:0x7fe2cff62679
[ 38.104728] Code: 48 8d 3d 9a a1 0c 00 0f 05 eb a5 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c7 57 0c 00 f7 d8 64 89 01 48
[ 38.104728] RSP: 002b:00007ffd2a00b738 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 38.104728] RAX: ffffffffffffffda RBX: 0000562792263fb0 RCX: 00007fe2cff62679
[ 38.104728] RDX: 0000000000000000 RSI: 00005627922c0a00 RDI: 0000000000000016
[ 38.104728] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000562791f34e62
[ 38.104728] R10: 0000000000000016 R11: 0000000000000246 R12: 00005627922c0a00
[ 38.104728] R13: 0000000000000000 R14: 00005627922c5fa0 R15: 0000562792263fb0
[ 38.104728] </TASK>
[ 38.104728] Modules linked in: radeon(+) mac80211 ath drm_ttm_helper ttm cfg80211 snd_atiixp drm_kms_helper snd_ac97_codec drm ac97_bus snd_pcm powernow_k8 snd_timer rfkill pcmcia drm_panel_orientation_quirks snd libarc4 yenta_socket edac_mce_amd i2c_algo_bit pcmcia_rsrc evdev joydev fb_sys_fops soundcore syscopyarea pcspkr pcmcia_core sysfillrect input_leds sysimgblt k8temp video battery ac button b44 mii sdhci_pci iosf_mbi firewire_ohci psmouse ssb cqhci ohci_pci firewire_core sdhci led_class crc_itu_t libphy ehci_pci ohci_hcd ehci_hcd mmc_core i2c_piix4 usbcore usb_common
[ 38.104728] CR2: 0000000000000023
[ 38.137547] ---[ end trace 91f9e835d12cf639 ]---
[ 38.138012] RIP: 0010:radeon_vm_fini+0x15/0x220 [radeon]
[ 38.138690] Code: e1 d7 c3 ff b8 f4 ff ff ff eb b6 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 57 41 56 41 55 49 89 fd 41 54 49 89 f4 55 53 <48> 8b 46 20 48 85 c0 0f 85 ae fd 01 00 4d 8d 74 24 20 4c 89 f7 e8
[ 38.140044] RSP: 0018:ffffc900004b3a98 EFLAGS: 00010202
[ 38.140501] RAX: 0000000000000001 RBX: ffff8880081a4000 RCX: 0000000000000000
[ 38.141181] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff888008168000
[ 38.141808] RBP: ffff88800432b200 R08: 0000000000000000 R09: ffff88800432b200
[ 38.142462] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000003
[ 38.143089] R13: ffff888008168000 R14: ffff8880081a4000 R15: 0000000000000003
[ 38.143739] FS: 00007fe2cfab18c0(0000) GS:ffff888039000000(0000) knlGS:0000000000000000
[ 38.144369] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 38.144959] CR2: 00005627922c1218 CR3: 000000000820e000 CR4: 00000000000006f0
[ 38.190852] ath5k 0000:08:0a.0: registered as 'phy0'
[ 38.872632] ath: EEPROM regdomain: 0x67
[ 38.873033] ath: EEPROM indicates we should expect a direct regpair map
[ 38.873576] ath: Country alpha2 being used: 00
[ 38.874056] ath: Regpair used: 0x67
[ 38.874464] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[ 38.876582] ath5k: phy0: Atheros AR5414 chip found (MAC: 0xa5, PHY: 0x61)
[ 42.868873] b44 ssb0:0 eth0: Link is up at 100 Mbps, full duplex
[ 42.869437] b44 ssb0:0 eth0: Flow control is off for TX and off for RX
[ 42.870108] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 43.788129] Adding 1951740k swap on /dev/sda1. Priority:-2 extents:1 across:1951740k
[ 43.879669] EXT4-fs (sda2): re-mounted. Quota mode: disabled.
[ 44.683438] loop: module loaded
[ 47.618816] fuse: init (API version 7.36)
[ 47.839777] input: ACPI Virtual Keyboard Device as /devices/virtual/input/input14
[ 55.500008] NET: Registered PF_AX25 protocol family

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


2022-01-17 15:57:04

by Christian König

Subject: Re: RIP: 0010:radeon_vm_fini+0x15/0x220 [radeon]

Hi Borislav,

On 15.01.22 at 17:11, Borislav Petkov wrote:
> Hi folks,
>
> so this is a *very* old K8 laptop - yap, you read it right, family 0xf.
>
> [ 31.353032] powernow_k8: fid 0xa (1800 MHz), vid 0xa
> [ 31.353569] powernow_k8: fid 0x8 (1600 MHz), vid 0xc
> [ 31.354081] powernow_k8: fid 0x0 (800 MHz), vid 0x16
> [ 31.354844] powernow_k8: Found 1 AMD Turion(tm) 64 Mobile Technology MT-34 (1 cpu cores) (version 2.20.00)
>
> This is true story.

well, that hardware is ancient ^^.

Interesting to see that even that old stuff is still used.

> Anyway, it blows up, see below.
>
> Kernel is latest Linus tree, top commit is:
>
> a33f5c380c4b ("Merge tag 'xfs-5.17-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux")
>
> I can bisect if you don't see it immediately why it blows up.

I can see right away that code is being called which isn't meant for this
hardware generation.

This is extremely odd because it means that we either have recently
added a logic bug or the detection of the hardware generation doesn't
work as expected any more.

Please bisect,
Christian.


2022-01-17 17:02:01

by Jan Stancek

Subject: Re: RIP: 0010:radeon_vm_fini+0x15/0x220 [radeon]

On Mon, Jan 17, 2022 at 08:16:09AM +0100, Christian König wrote:
>Hi Borislav,
>
>On 15.01.22 at 17:11, Borislav Petkov wrote:
>>Hi folks,
>>
>>so this is a *very* old K8 laptop - yap, you read it right, family 0xf.
>>
>>[ 31.353032] powernow_k8: fid 0xa (1800 MHz), vid 0xa
>>[ 31.353569] powernow_k8: fid 0x8 (1600 MHz), vid 0xc
>>[ 31.354081] powernow_k8: fid 0x0 (800 MHz), vid 0x16
>>[ 31.354844] powernow_k8: Found 1 AMD Turion(tm) 64 Mobile Technology MT-34 (1 cpu cores) (version 2.20.00)
>>
>>This is true story.
>
>well, that hardware is ancient ^^.
>
>Interesting to see that even that old stuff is still used.
>
>>Anyway, it blows up, see below.
>>
>>Kernel is latest Linus tree, top commit is:
>>
>>a33f5c380c4b ("Merge tag 'xfs-5.17-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux")
>>
>>I can bisect if you don't see it immediately why it blows up.
>
>Immediately I see that code is called which isn't for this hardware
>generation.
>
>This is extremely odd because it means that we either have recently
>added a logic bug or the detection of the hardware generation doesn't
>work as expected any more.
>
>Please bisect,
>Christian.

I'm seeing panics like this one as well on multiple systems in the lab (e.g. ProLiant SL390s G7,
PowerEdge R805). It looks the same as what Bruno reported here:
https://lore.kernel.org/all/CA+QYu4rt2VHWzbOt-SegA9yABqC-D36PoqTZmy6DscWvp+6ZMQ@mail.gmail.com/

It started around 8d0749b4f83b - Merge tag 'drm-next-2022-01-07', running a bisect atm.

[ 15.230105] SGI XFS with ACLs, security attributes, scrub, quota, no debug enabled
[ 15.234816] XFS (sdb1): Mounting V5 Filesystem
[ 15.342261] [drm] ib test succeeded in 0 usecs
[ 15.343311] [drm] No TV DAC info found in BIOS
[ 15.344061] [drm] Radeon Display Connectors
[ 15.344330] [drm] Connector 0:
[ 15.344961] [drm] VGA-1
[ 15.345174] [drm] DDC: 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60
[ 15.345991] [drm] Encoders:
[ 15.346617] [drm] CRT1: INTERNAL_DAC1
[ 15.346942] [drm] Connector 1:
[ 15.347561] [drm] VGA-2
[ 15.347746] [drm] DDC: 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c
[ 15.348598] [drm] Encoders:
[ 15.349217] [drm] CRT2: INTERNAL_DAC2
[ 15.349521] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 15.349974] #PF: supervisor read access in kernel mode
[ 15.350305] #PF: error_code(0x0000) - not-present page
[ 15.350675] PGD 0 P4D 0
[ 15.350814] Oops: 0000 [#[ 15.431048] CPU: 0 PID: 410 Comm: systemd-udevd Tainted: G I 5.16.0 #1
[ 15.443401] XFS (sdb1): Ending clean mount
[ 15.451541] Hardware name: HP ProLiant SL390s G7/, BIOS P69 07/02/2013
[ 15.451545] RIP: 0010:radeon_vm_fini+0x174/0x300 [radeon]
[ 15.452689] Code: e8 74 cc 7a c1 eb d1 4c 8b 24 24 4d 8d 74 24 48 49 8b 5c 24 48 49 39 de 74 38 66 2e 0f 1f 84 00 00 00 00 00 66 90 4c 8d 7b a8 <48> 8b 2b 48 8d 7b 18 e8 30 1e f4 ff 48 83 c3 c0 48 89 df e8 34 f3
[ 15.454412] RSP: 0018:ffffa3494800001 R08: 0000000000200000 R09: 0000000000000000
[ 15.533944] R10: 0000000000000000 R11: ffffffffc04f7810 R12: ffff979b4ba46730
[ 15.533945] R13: ffff979d5c260000 R14: ffff979b4ba46778 R15: ffffffffffffffa8
[ 15.533947] FS: 00007f3a13141500(0000) GS:ffff979d4ba00000(0000) knlGS:0000000000000000
[ 15.533948] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 15.533950] CR2: 0000000000000000 CR3: 000000031c7fc005 CR4: 00000000000206f0
[ 15.533952] Call Trace:
[ 15.533956] <TASK>
[ 15.533959] radeon_driver_open_kms+0x118/0x180 [radeon]
[ 15.533998] drm_file_alloc+0x1a8/0x230 [drm]
[ 15.961755] drm_client_init+0x99/0x130 [drm]
[ 15.961777] drm_fb_helper_init+0x32/0x50 [drm_kms_helper]
[ 15.961809] radeon_fbdev_init+0xbc/0x110 [radeon]
[ 15.963653] radeon_modeset_init+0x857/0x9e0 [radeon]
[ 15.964003] radeon_driver_load_kms+0x19b/0x290 [radeon]
[ 15.964474] drm_dev_register+0xf5/0x2d0 [drm]
[ 15.965196] radeon_pci_probe+0xc3/0x120 [radeon]
[ 15.965972] pci_device_probe+0x185/0x220
[ 15.966225] call_driver_probe+0x32/0xd0
[ 15.966505] really_probe+0x157/0x380
[ 15.99bus_add_driver+0x111/0x210
[ 16.467150] ? 0xffffffffc0412000
[ 16.467805] driver_register+0x81/0x120
[ 16.468069] do_one_initcall+0xb0/0x290
[ 16.468359] ? down_write+0xe/0x40
[ 16.469008] ? kernfs_activate+0x28/0x130
[ 16.469267] ? kernfs_add_one+0x1c8/0x210
[ 16.469563] ? vunmap_p4d_range+0x3dc/0x420
[ 16.469858] ? __vunmap+0x1df/0x2a0
[ 16.470466] ? kmem_cache_alloc_trace+0x1a4/0x330
[ 16.471224] ? do_init_module+0x24/0x230
[ 16.471485] do_init_module+0x5a/0x230
[ 16.471779] load_module+0x145f/0x1630
[ 16.472022] ? kernel_read_file_from_fd+0x5d/0x80
[ 16.472762] __se_sys_finit_module+0x9f/0xd0
[ 16.473480] do_syscall_64+0x43/0x90
[ 16.473778] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 16.474123] RIP: 0033:0x7f3a13d11e2d
[ 16.474422] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bb 7f 0e 00 f7 d8 64 89 01 48
[ 16.476010] RSP: 002b:00007fff9cb92b78 EFLAGS: 00000246 ORIG_RAX: 000000 R08: 0000000000000000 R09: 0000000000000002
[ 16.977414] R10: 0000000000000012 R11: 0000000000000246 R12: 00007f3a13e6d43c
[ 16.978320] R13: 0000555c5eba3080 R14: 0000000000000007 R15: 0000555c5eba3d70
[ 16.979218] </TASK>
[ 16.979381] Modules linked in: xfs radeon(+) drm_ttm_helper ttm i2c_algo_bit drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel cec ata_generic ghash_clmulni_intel drm serio_raw pata_acpi hpwdt
[ 16.980516] CR2: 0000000000000000
[ 16.981179] ---[ end trace d6f7f573dad76bd2 ]---
[ 16.981861] RIP: 0010:radeon_vm_fini+0x174/0x300 [radeon]
[ 16.982257] Code: e8 74 cc 7a c1 eb d1 4c 8b 24 24 4d 8d 74 24 48 49 8b 5c 24 48 49 39 de 74 38 66 2e 0f 1f 84 00 00 00 00 00 66 90 4c 8d 7b a8 <48> 8b 2b 48 8d 7b 18 e8 30 1e f4 ff 48 83 c3 c0 48 89 df e8 34 f3
[ 16.983766] RSP: 0018:ffffa3494801f8e8 EFLAGS: 00010286
[ 16.984124] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 16.984981] RDX: 0000000000000001 RSI: ffff979b4ba46730 RDI: ffff979b4ba46750
[ 16.985898] RBP: 0000000000000001 R08: 0000000000200000 R09: 0000000000000000
[ 16.986730] R10: 0000000000000000 R11: ffffffffc04f7810 R12: 0 ES: 0000 CR0: 0000000080050033
[ 17.488057] CR2: 0000000000000000 CR3: 000000031c7fc005 CR4: 00000000000206f0
[ 17.489013] Kernel panic - not syncing: Fatal exception
[ 17.489404] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 17.490485] ---[ end Kernel panic - not syncing: Fatal exception ]---


2022-01-17 17:03:36

by Christian König

Subject: Re: RIP: 0010:radeon_vm_fini+0x15/0x220 [radeon]



On 17.01.22 at 09:42, Jan Stancek wrote:
> On Mon, Jan 17, 2022 at 08:16:09AM +0100, Christian König wrote:
>> Hi Borislav,
>>
>> On 15.01.22 at 17:11, Borislav Petkov wrote:
>>> Hi folks,
>>>
>>> so this is a *very* old K8 laptop - yap, you read it right, family 0xf.
>>>
>>> [   31.353032] powernow_k8: fid 0xa (1800 MHz), vid 0xa
>>> [   31.353569] powernow_k8: fid 0x8 (1600 MHz), vid 0xc
>>> [   31.354081] powernow_k8: fid 0x0 (800 MHz), vid 0x16
>>> [   31.354844] powernow_k8: Found 1 AMD Turion(tm) 64 Mobile
>>> Technology MT-34 (1 cpu cores) (version 2.20.00)
>>>
>>> This is true story.
>>
>> well, that hardware is ancient ^^.
>>
>> Interesting to see that even that old stuff is still used.
>>
>>> Anyway, it blows up, see below.
>>>
>>> Kernel is latest Linus tree, top commit is:
>>>
>>> a33f5c380c4b ("Merge tag 'xfs-5.17-merge-3' of
>>> git://git.kernel.org/pub/scm/fs/xfs/xfs-linux")
>>>
>>> I can bisect if you don't see it immediately why it blows up.
>>
>> Immediately I see that code is called which isn't for this hardware
>> generation.
>>
>> This is extremely odd because it means that we either have recently
>> added a logic bug or the detection of the hardware generation doesn't
>> work as expected any more.
>>
>> Please bisect,
>> Christian.
>
> I'm seeing panics like this one as well on multiple systems in the lab (e.g.
> ProLiant SL390s G7, PowerEdge R805). It looks the same as what Bruno reported here:
> https://lore.kernel.org/all/CA+QYu4rt2VHWzbOt-SegA9yABqC-D36PoqTZmy6DscWvp+6ZMQ@mail.gmail.com/
>
>
> It started around 8d0749b4f83b - Merge tag 'drm-next-2022-01-07',
> running a bisect atm.

A bisect is not necessary any more. This is probably caused by commit
ab50cb9df8896b39aae65c537a30de2c79c19735 ("drm/radeon/radeon_kms: Fix a
NULL pointer dereference in radeon_driver_open_kms()").

I'm getting other bug reports for that one as well. Going to take a look.

Regards,
Christian.


2022-01-17 17:13:23

by Borislav Petkov

Subject: Re: RIP: 0010:radeon_vm_fini+0x15/0x220 [radeon]

On Mon, Jan 17, 2022 at 08:16:09AM +0100, Christian König wrote:
> Interesting to see that even that old stuff is still used.

Well, "used" is a stretch.

This is my way of testing on K8, as pretty much all the big K8 boxes I
had access to got decommissioned, so this baby is the only real K8 hw I
have now. :-)

Lemme test your patch.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette