2023-08-10 09:59:57

by Valdis Klētnieks

Subject: next-20230726 and later - crash in radeon module during init

I am seeing the following consistent crash at boot:

[ 61.211213][ T819] [drm] radeon kernel modesetting enabled.
[ 61.584870][ T819] vga_switcheroo: detected switching method \_SB_.PCI0.GFX0.ATPX handle
[ 61.667507][ T819] ATPX version 1, functions 0x00000033
[ 61.748228][ T819] general protection fault, probably for non-canonical address 0x54080068930549a0: 0000 [#1] PREEMPT SMP
[ 61.829840][ T819] CPU: 3 PID: 819 Comm: (udev-worker) Tainted: G I T 6.5.0-rc4-next-20230804 #58 5cce04b101a5bb4a6c0368bfff037f6f096b3d3e
[ 61.911411][ T819] Hardware name: Dell Inc. Inspiron 5559/052K07, BIOS 1.9.0 09/07/2020
[ 61.993285][ T819] RIP: 0010:strnlen+0x21/0x40
[ 62.074885][ T819] Code: 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 55 48 89 e5 48 8d 14 37 31 c0 48 85 f6 74 16 48 89 f8 eb 09 48 83 c0 01 48 39 c2 74 0e <80> 38 00 75 f2 48 29 f8 5d c3 cc cc cc cc 48 89 d0 5d 48 29 f8 c3
[ 62.156529][ T819] RSP: 0018:ffffa310419979b8 EFLAGS: 00010202
[ 62.318407][ T819] RAX: 54080068930549a0 RBX: ffffa31041997a20 RCX: 0000000000000000
[ 62.400015][ T819] RDX: 54080068930549b0 RSI: 0000000000000010 RDI: 54080068930549a0
[ 62.481624][ T819] RBP: ffffa310419979b8 R08: ffff937b85579990 R09: ffffa31041997ad8
[ 62.563644][ T819] R10: ffff937b86ddae00 R11: 0000000000000000 R12: 54080068930549a0
[ 62.645194][ T819] R13: ffff937b814291b8 R14: 0000000000000001 R15: ffffa31041997b81
[ 62.726753][ T819] FS: 00007efd50479600(0000) GS:ffff937ef2e00000(0000) knlGS:0000000000000000
[ 62.808312][ T819] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 62.889830][ T819] CR2: 00007f125d30ee70 CR3: 0000000105644002 CR4: 00000000003706e0
[ 62.971390][ T819] Call Trace:
[ 63.052954][ T819] <TASK>
[ 63.134501][ T819] ? show_regs+0x64/0x70
[ 63.216058][ T819] ? die_addr+0x36/0x90
[ 63.297594][ T819] ? exc_general_protection+0x1c1/0x440
[ 63.379112][ T819] ? asm_exc_general_protection+0x2b/0x30
[ 63.460650][ T819] ? strnlen+0x21/0x40
[ 63.542209][ T819] set_dev_info+0x40/0x170
[ 63.623762][ T819] dev_printk_emit+0xa8/0xe0
[ 63.705308][ T819] __dev_printk+0x34/0x80
[ 63.786806][ T819] _dev_info+0x7a/0xa0
[ 63.868304][ T819] radeon_atpx_validate.constprop.0.isra.0+0xbc/0x100 [radeon f030e9a708043a486415a94978106b28cd7cb9a2]
[ 63.949952][ T819] radeon_atpx_detect+0x17b/0x190 [radeon f030e9a708043a486415a94978106b28cd7cb9a2]
[ 64.031547][ T819] ? __pfx_radeon_module_init+0x10/0x10 [radeon f030e9a708043a486415a94978106b28cd7cb9a2]
[ 64.113102][ T819] radeon_register_atpx_handler+0xd/0x30 [radeon f030e9a708043a486415a94978106b28cd7cb9a2]
[ 64.194721][ T819] radeon_module_init+0x84/0xff0 [radeon f030e9a708043a486415a94978106b28cd7cb9a2]
[ 64.276365][ T819] do_one_initcall+0x86/0x380
[ 64.357865][ T819] do_init_module+0x63/0x220
[ 64.439342][ T819] load_module+0x99d/0xa90

Some quick digging indicates the most likely culprit is:

commit cbd0606e6a776bf2ba10d4a6957bb7628c0da947
Author: Srinivasan Shanmugam <[email protected]>
Date: Thu Jul 20 15:39:24 2023 +0530

drm/radeon: Prefer dev_* variant over printk

Changed from pr_err/info to dev_* variants so that
we get better debug info when there are multiple GPUs
in the system.

Looks like this is the failure point because 'dev' is trashed:

+ dev_info(dev, "ATPX Hybrid Graphics\n");

But I admit I don't know the ACPI stuff well enough to see what, if
anything, is wrong with this:

+ struct acpi_device *adev = container_of(atpx->handle, struct acpi_device, handle);
+ struct device *dev = &adev->dev;

Any ideas?
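
One possible explanation, offered as a sketch rather than a confirmed diagnosis: container_of() expects the address of the member embedded in the enclosing struct, while the snippet above passes the value stored in atpx->handle. If that value is an opaque pointer to some other object, the subtraction yields an address unrelated to any struct acpi_device, which would be consistent with the garbage pointer seen in strnlen(). The userspace program below illustrates the difference using hypothetical toy types (toy_handle, toy_acpi_device, toy_device); it does not use the real kernel structures or their layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's container_of(): given a pointer to a
 * member, step back by the member's offset to reach the enclosing struct.
 * Integer arithmetic is used so the "wrong" case below is never dereferenced
 * and involves no out-of-bounds pointer arithmetic. */
#define container_of(ptr, type, member) \
	((type *)((uintptr_t)(ptr) - offsetof(type, member)))

/* Hypothetical miniature types, for illustration only. */
typedef void *toy_handle;		/* opaque pointer, like a handle */

struct toy_device {
	const char *init_name;
};

struct toy_acpi_device {
	int device_type;
	toy_handle handle;		/* its VALUE points at some other object */
	struct toy_device dev;
};

/* Correct use: pass the ADDRESS of the embedded member. */
static struct toy_acpi_device *owner_from_member(toy_handle *member)
{
	return container_of(member, struct toy_acpi_device, handle);
}

/* Mistaken pattern: pass the VALUE stored in the member. The result is
 * "wherever the handle points, minus an offset", not the owning struct. */
static struct toy_acpi_device *owner_from_value(toy_handle value)
{
	return container_of(value, struct toy_acpi_device, handle);
}

/* Returns 1 when the correct form recovers the owner and the mistaken
 * form does not. */
int toy_demo(void)
{
	int node = 0;			/* unrelated object the handle refers to */
	struct toy_acpi_device adev = {
		.device_type = 1,
		.handle = &node,
		.dev = { .init_name = "demo" },
	};

	struct toy_acpi_device *good = owner_from_member(&adev.handle);
	struct toy_acpi_device *bad = owner_from_value(adev.handle);

	return good == &adev && bad != &adev;
}
```

Calling toy_demo() from a test harness or a small main() returns 1. In the toy model, any later access through the "bad" pointer (such as taking &bad->dev and reading a name string from it) would read unrelated memory.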



2023-08-10 10:01:49

by Valdis Klētnieks

Subject: Re: next-20230726 and later - crash in radeon module during init

On Thu, 10 Aug 2023 05:35:02 -0400, "Valdis Klētnieks" said:

> I am seeing the following consistent crash at boot:

> Some quick digging indicates the most likely culprit is:
>
> commit cbd0606e6a776bf2ba10d4a6957bb7628c0da947
> Author: Srinivasan Shanmugam <[email protected]>
> Date: Thu Jul 20 15:39:24 2023 +0530
>
> drm/radeon: Prefer dev_* variant over printk

Never mind - I see it was already reverted...

