Hi,
KernelCI has identified a new boot regression on linux-next. It affects the
following platforms:
* sc7180-trogdor-kingoftown
* sc7180-trogdor-lazor-limozeen
The regression was introduced in next-20240509, and still affects today's
(next-20240514) release.
The config used was the upstream arm64 defconfig with a config fragment on top
[1].
The following stack traces are produced during boot and a usable shell is never
reached:
[ 0.381981] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
[ 0.381989] Mem abort info:
[ 0.381991] ESR = 0x0000000096000004
[ 0.381994] EC = 0x25: DABT (current EL), IL = 32 bits
[ 0.381997] SET = 0, FnV = 0
[ 0.382000] EA = 0, S1PTW = 0
[ 0.382003] FSC = 0x04: level 0 translation fault
[ 0.382006] Data abort info:
[ 0.382008] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 0.382011] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 0.382014] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 0.382017] [000000000000001c] user address but active_mm is swapper
[ 0.382021] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[ 0.382025] Modules linked in:
[ 0.382032] CPU: 4 PID: 68 Comm: kworker/u32:2 Not tainted 6.9.0-next-20240514-dirty #380
[ 0.382038] Hardware name: Google Kingoftown (DT)
[ 0.382042] Workqueue: async async_run_entry_fn
[ 0.382055] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 0.382061] pc : iommu_dma_sync_sg_for_device+0x28/0x100
[ 0.382070] lr : __dma_sync_sg_for_device+0x28/0x4c
[ 0.382080] sp : ffff800080943740
[ 0.382082] x29: ffff800080943740 x28: ffff36ee44280000 x27: ffff36ee40bd7810
[ 0.382092] x26: ffff800080943998 x25: ffff36ee44280480 x24: ffffb54600bcf0e8
[ 0.382101] x23: ffff36ee40bd7810 x22: 0000000000000001 x21: 0000000000000000
[ 0.382110] x20: ffffb54600f3d098 x19: 0000000000000000 x18: ffffb54601c1a210
[ 0.382118] x17: 000000040044ffff x16: 0000000000000000 x15: ffff36efb6d95580
[ 0.382126] x14: ffff36ee409156c0 x13: 0000000000001797 x12: 0000000000000002
[ 0.382134] x11: 0000000000000004 x10: ffff36ee4308b3d8 x9 : ffff36ee44280469
[ 0.382143] x8 : ffff36ee4308b304 x7 : 00000000ffffffff x6 : 0000000000000001
[ 0.382151] x5 : ffffb5460033a740 x4 : ffffb545ff50375c x3 : 0000000000000001
[ 0.382159] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff36ee40bd7810
[ 0.382167] Call trace:
[ 0.382170] iommu_dma_sync_sg_for_device+0x28/0x100
[ 0.382176] __dma_sync_sg_for_device+0x28/0x4c
[ 0.382183] spi_transfer_one_message+0x378/0x6e4
[ 0.382193] __spi_pump_transfer_message+0x190/0x4a4
[ 0.382199] __spi_sync+0x2a0/0x3c4
[ 0.382205] spi_sync_locked+0x10/0x1c
[ 0.382211] tpm_tis_spi_transfer_full+0x160/0x2fc
[ 0.382217] tpm_tis_spi_transfer+0x34/0x40
[ 0.382221] tpm_tis_spi_cr50_read_bytes+0x5c/0x90
[ 0.382226] tpm_tis_core_init+0xfc/0x7e0
[ 0.382231] tpm_tis_spi_init+0x54/0x70
[ 0.382236] cr50_spi_probe+0xf4/0x27c
[ 0.382241] tpm_tis_spi_driver_probe+0x34/0x64
[ 0.382245] spi_probe+0x84/0xe4
[ 0.382251] really_probe+0xbc/0x2a0
[ 0.382258] __driver_probe_device+0x78/0x12c
[ 0.382264] driver_probe_device+0x40/0x160
[ 0.382269] __device_attach_driver+0xb8/0x134
[ 0.382275] bus_for_each_drv+0x84/0xe0
[ 0.382280] __device_attach_async_helper+0xac/0xd0
[ 0.382286] async_run_entry_fn+0x34/0xe0
[ 0.382291] process_one_work+0x154/0x298
[ 0.382300] worker_thread+0x304/0x408
[ 0.382307] kthread+0x118/0x11c
[ 0.382313] ret_from_fork+0x10/0x20
[ 0.382324] Code: 2a0203f5 2a0303f6 a90363f7 aa0003f7 (b9401c20)
[ 0.382328] ---[ end trace 0000000000000000 ]---
[ 0.393379] spi_master spi6: will run message pump with realtime priority
[ 0.393896] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
[ 0.393903] Mem abort info:
[ 0.393905] ESR = 0x0000000096000004
[ 0.393908] EC = 0x25: DABT (current EL), IL = 32 bits
[ 0.393912] SET = 0, FnV = 0
[ 0.393915] EA = 0, S1PTW = 0
[ 0.393917] FSC = 0x04: level 0 translation fault
[ 0.393920] Data abort info:
[ 0.393922] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 0.393925] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 0.393928] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 0.393931] [000000000000001c] user address but active_mm is swapper
[ 0.393935] Internal error: Oops: 0000000096000004 [#2] PREEMPT SMP
[ 0.393939] Modules linked in:
[ 0.393946] CPU: 2 PID: 103 Comm: cros_ec_spi_hig Tainted: G D 6.9.0-next-20240514-dirty #380
[ 0.393953] Hardware name: Google Kingoftown (DT)
[ 0.393956] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 0.393962] pc : iommu_dma_sync_sg_for_device+0x28/0x100
[ 0.393975] lr : __dma_sync_sg_for_device+0x28/0x4c
[ 0.393985] sp : ffff800080de3aa0
[ 0.393988] x29: ffff800080de3aa0 x28: ffff36ee44281800 x27: ffff36ee40ff8010
[ 0.393997] x26: ffff800080de3cf8 x25: ffff36ee44281c80 x24: ffffb54600bcf0e8
[ 0.394006] x23: ffff36ee40ff8010 x22: 0000000000000001 x21: 0000000000000000
[ 0.394014] x20: ffffb54600f3d3d8 x19: 0000000000000000 x18: ffffb54601c1a210
[ 0.394023] x17: 0000000000010108 x16: 0000000000000000 x15: 000000000000000c
[ 0.394031] x14: 0000000000000000 x13: ffff36ee40b962b0 x12: 0000000000000000
[ 0.394039] x11: 0000000000000000 x10: 0000000000003fff x9 : ffff36ee44281c69
[ 0.394047] x8 : ffff36ee4103e704 x7 : 00000000ffffffff x6 : 0000000000000001
[ 0.394055] x5 : ffffb5460033a740 x4 : ffffb545ff50375c x3 : 0000000000000001
[ 0.394063] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff36ee40ff8010
[ 0.394071] Call trace:
[ 0.394074] iommu_dma_sync_sg_for_device+0x28/0x100
[ 0.394081] __dma_sync_sg_for_device+0x28/0x4c
[ 0.394088] spi_transfer_one_message+0x378/0x6e4
[ 0.394096] __spi_pump_transfer_message+0x190/0x4a4
[ 0.394103] __spi_sync+0x2a0/0x3c4
[ 0.394109] spi_sync_locked+0x10/0x1c
[ 0.394115] do_cros_ec_pkt_xfer_spi+0x108/0x530
[ 0.394122] cros_ec_xfer_high_pri_work+0x20/0x34
[ 0.394127] kthread_worker_fn+0xcc/0x184
[ 0.394134] kthread+0x118/0x11c
[ 0.394140] ret_from_fork+0x10/0x20
[ 0.394150] Code: 2a0203f5 2a0303f6 a90363f7 aa0003f7 (b9401c20)
[ 0.394154] ---[ end trace 0000000000000000 ]---
[ 3.654117] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
[ 3.663154] Mem abort info:
[ 3.666032] ESR = 0x0000000096000004
[ 3.669943] EC = 0x25: DABT (current EL), IL = 32 bits
[ 3.675417] SET = 0, FnV = 0
[ 3.678563] EA = 0, S1PTW = 0
[ 3.681792] FSC = 0x04: level 0 translation fault
[ 3.686808] Data abort info:
[ 3.689765] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 3.695399] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 3.700592] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 3.706050] [000000000000001c] user address but active_mm is swapper
[ 3.712576] Internal error: Oops: 0000000096000004 [#3] PREEMPT SMP
[ 3.719017] Modules linked in:
[ 3.722162] CPU: 6 PID: 11 Comm: kworker/u32:0 Tainted: G D 6.9.0-next-20240514-dirty #380
[ 3.732067] Hardware name: Google Kingoftown (DT)
[ 3.736904] Workqueue: events_unbound deferred_probe_work_func
[ 3.742907] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 3.750052] pc : iommu_dma_sync_sg_for_device+0x28/0x100
[ 3.755526] lr : __dma_sync_sg_for_device+0x28/0x4c
[ 3.760548] sp : ffff8000800ab0b0
[ 3.763953] x29: ffff8000800ab0b0 x28: ffff36ee43a6a000 x27: ffff36ee41012010
[ 3.771279] x26: ffff8000800ab2e8 x25: ffff36ee43a6a480 x24: ffffb54600bcf0e8
[ 3.778604] x23: ffff36ee41012010 x22: 0000000000000001 x21: 0000000000000000
[ 3.785928] x20: ffffb54600f3d718 x19: 0000000000000000 x18: ffffb54601c19c48
[ 3.793258] x17: 0000000000010108 x16: 0000000000000000 x15: 000000000000000c
[ 3.800589] x14: 0000000000000000 x13: ffff36ee40b962b0 x12: 0000000000000000
[ 3.807921] x11: 071c71c71c71c71c x10: 0000000000003fff x9 : ffff36ee43a6a469
[ 3.815254] x8 : ffff36ee4101cf04 x7 : 00000000ffffffff x6 : 0000000000000001
[ 3.822587] x5 : ffffb5460033a740 x4 : ffffb545ff50375c x3 : 0000000000000001
[ 3.829910] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff36ee41012010
[ 3.837234] Call trace:
[ 3.839750] iommu_dma_sync_sg_for_device+0x28/0x100
[ 3.844853] __dma_sync_sg_for_device+0x28/0x4c
[ 3.849517] spi_transfer_one_message+0x378/0x6e4
[ 3.854360] __spi_pump_transfer_message+0x190/0x4a4
[ 3.859462] __spi_sync+0x2a0/0x3c4
[ 3.863048] spi_sync+0x30/0x54
[ 3.866283] spi_mem_exec_op+0x26c/0x41c
[ 3.870321] spi_nor_read_id+0x7c/0xc4
[ 3.874180] spi_nor_detect+0x34/0x158
[ 3.878039] spi_nor_scan+0x1f0/0xef8
[ 3.881813] spi_nor_probe+0x94/0x2ec
[ 3.885587] spi_mem_probe+0x6c/0xac
[ 3.889262] spi_probe+0x84/0xe4
[ 3.892579] really_probe+0xbc/0x2a0
[ 3.896262] __driver_probe_device+0x78/0x12c
[ 3.900747] driver_probe_device+0x40/0x160
[ 3.905046] __device_attach_driver+0xb8/0x134
[ 3.909619] bus_for_each_drv+0x84/0xe0
[ 3.913568] __device_attach+0xa8/0x1b0
[ 3.917515] device_initial_probe+0x14/0x20
[ 3.921814] bus_probe_device+0xa8/0xac
[ 3.925761] device_add+0x590/0x750
[ 3.929351] __spi_add_device+0x138/0x208
[ 3.933476] of_register_spi_device+0x394/0x57c
[ 3.938139] spi_register_controller+0x394/0x760
[ 3.942888] qcom_qspi_probe+0x328/0x390
[ 3.946928] platform_probe+0x68/0xd8
[ 3.950701] really_probe+0xbc/0x2a0
[ 3.954384] __driver_probe_device+0x78/0x12c
[ 3.958869] driver_probe_device+0x40/0x160
[ 3.963169] __device_attach_driver+0xb8/0x134
[ 3.967734] bus_for_each_drv+0x84/0xe0
[ 3.971682] __device_attach+0xa8/0x1b0
[ 3.975628] device_initial_probe+0x14/0x20
[ 3.979927] bus_probe_device+0xa8/0xac
[ 3.983873] deferred_probe_work_func+0x88/0xc0
[ 3.988536] process_one_work+0x154/0x298
[ 3.992663] worker_thread+0x304/0x408
[ 3.996525] kthread+0x118/0x11c
[ 3.999847] ret_from_fork+0x10/0x20
[ 4.003534] Code: 2a0203f5 2a0303f6 a90363f7 aa0003f7 (b9401c20)
[ 4.009788] ---[ end trace 0000000000000000 ]---
Searching on lore I could only find the following series that caused another
regression, and its subsequent fix:
https://lore.kernel.org/lkml/[email protected]/
https://lore.kernel.org/all/[email protected]/
But even after reverting both, the issue was still there, so I've concluded
that they are unrelated.
Thanks,
Nícolas
#regzbot introduced: next-20240509
[1] https://pastebin.com/raw/sx4bPAa6
On Tue, May 14, 2024 at 12:41:29PM -0400, Nícolas F. R. A. Prado wrote:
> Hi,
>
> KernelCI has identified a new boot regression on linux-next. It affects the
> following platforms:
> * sc7180-trogdor-kingoftown
> * sc7180-trogdor-lazor-limozeen
>
> The regression was introduced in next-20240509, and still affects today's
> (next-20240514) release.
>
> The config used was the upstream arm64 defconfig with a config fragment on top
> [1].
>
> The following stack traces are produced during boot and a usable shell is never
> reached:
>
[..]
Tracked down the issue to commit 8cc3bad9d9d6 ("spi: Remove unneded check for
orig_nents").
#regzbot introduced: 8cc3bad9d9d6
The issue happens because, in spi_dma_sync_for_device(), the line

	dma_sync_sgtable_for_device(tx_dev, &xfer->tx_sg, DMA_TO_DEVICE);

passes a scatterlist table (xfer->tx_sg) that hasn't been initialized, so the
sgl pointer inside it is NULL. Before the patch, the orig_nents check prevented
this call, since orig_nents is 0.
The scatterlist table should be initialized in spi_map_buf_attrs(), which is
called from __spi_map_msg(). However, some debugging revealed that the
preceding check

	if (!ctlr->can_dma(ctlr, msg->spi, xfer))

causes the transfer to be skipped (can_dma() returns false), which is why the
scatterlist table isn't initialized. Despite that, ctlr->cur_msg_mapped is
still set to true, so spi_dma_sync_for_device() doesn't return early; an early
return there would avoid the issue entirely. At this point I'm confused about
why that flag tracks DMA mapping per message when the mapping is done per
transfer (and a message can contain multiple transfers). Maybe that's what
needs to change, though I'd like input from someone who is familiar with this
code.
Thanks,
Nícolas
On Wed, May 15, 2024 at 05:05:56PM -0400, Nícolas F. R. A. Prado wrote:
> On Tue, May 14, 2024 at 12:41:29PM -0400, Nícolas F. R. A. Prado wrote:
[..]
> Tracked down the issue to commit 8cc3bad9d9d6 ("spi: Remove unneded check for
> orig_nents").
>
> #regzbot introduced: 8cc3bad9d9d6
[..]
#regzbot monitor: https://lore.kernel.org/all/d8930bce-6db6-45f4-8f09-8a00fa48e607@notapiano
Hi,
On 14/05/2024 18:41, Nícolas F. R. A. Prado wrote:
> Hi,
>
> KernelCI has identified a new boot regression on linux-next. It affects the
> following platforms:
> * sc7180-trogdor-kingoftown
> * sc7180-trogdor-lazor-limozeen
I also see the regression on:
- SM8550-QRD
- SM8650-QRD
Reverting commit 8cc3bad9d9d6 ("spi: Remove unneded check for orig_nents") fixes the issue.
Thanks for reporting this,
Neil
[ 6.404623] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
[ 6.413685] Mem abort info:
[ 6.416574] ESR = 0x0000000096000006
[ 6.420436] EC = 0x25: DABT (current EL), IL = 32 bits
[ 6.425901] SET = 0, FnV = 0
[ 6.429046] EA = 0, S1PTW = 0
[ 6.432293] FSC = 0x06: level 2 translation fault
[ 6.437320] Data abort info:
[ 6.440289] ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
[ 6.445927] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 6.451121] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 6.456585] user pgtable: 4k pages, 48-bit VAs, pgdp=000000088f68b000
[ 6.463208] [000000000000001c] pgd=080000088f68d003, p4d=080000088f68d003, pud=080000088f68e003, pmd=0000000000000000
[ 6.474108] Internal error: Oops: 0000000096000006 [#1] PREEMPT SMP
[ 6.480542] Modules linked in: ucsi_glink pmic_glink_altmode goodix_berlin_spi(+) nb7vpq904m wcd939x_usbss qcom_battmgr typec_ucsi aux_hpd_bridge goodix_berlin_core crct10dif_ce hci_uart rtc_pm8xxx leds_qcom_lpg led_class_multicolor qcom_pon nvmem_qcom_spmi_sdam sm3_ce qcom_pbs btqca snd_soc_wcd939x snd_soc_sc8280xp snd_soc_wcd939x_sdw phy_qcom_eusb2_repeater snd_soc_qcom_sdw regmap_sdw qcom_spmi_temp_alarm snd_soc_qcom_common btbcm snd_soc_wcd_mbhc sm3 qcom_stats snd_soc_wcd_classh drm_dp_aux_bus sha3_ce gpu_sched sha512_ce sha512_arm64 drm_exec bluetooth qcom_q6v5_pas phy_qcom_qmp_combo qcrypto soundwire_qcom qcom_pil_info snd_soc_lpass_va_macro pinctrl_sm8650_lpass_lpi authenc snd_soc_lpass_tx_macro aux_bridge cfg80211 spi_geni_qcom i2c_qcom_geni snd_soc_lpass_rx_macro rfkill phy_qcom_snps_eusb2 dispcc_sm8650 drm_display_helper pinctrl_lpass_lpi gpi snd_soc_lpass_wsa_macro snd_soc_lpass_macro_common slimbus drm_kms_helper gpucc_sm8650 ipa qcom_q6v5 qrtr libdes phy_qcom_qmp_ufs qcom_sysmon qcom_common
[ 6.480602] qcom_glink_smem
[ 6.571649] soundwire_bus mdt_loader pmic_glink qcom_rng phy_qcom_qmp_pcie llcc_qcom ufs_qcom icc_bwmon typec rmtfs_mem pdr_interface qmi_helpers nvmem_reboot_mode socinfo fuse drm backlight ipv6
[ 6.597201] CPU: 4 PID: 241 Comm: (udev-worker) Tainted: G S 6.9.0-next-20240521 #1
[ 6.606488] Hardware name: Qualcomm Technologies, Inc. SM8650 QRD (DT)
[ 6.613189] pstate: 63400005 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[ 6.641597] lr : __dma_sync_sg_for_device+0x3c/0x40
[ 6.646632] sp : ffff800081bf3260
[ 6.660650] x26: ffff59520fbd1c80 x25: 0000000000000000 x24: ffffb46fccd24988
[ 6.660653] x23: ffff595201628410 x22: 0000000000000002 x21: 0000000000000000
[ 6.660655] x20: ffff800081bf33f0 x19: 0000000000000000 x18: 0000000000000001
[ 6.660656] x17: 0000000000000018 x16: 0000000000000100 x15: 0000000000000002
[ 6.688275] x14: 0000000000000001 x13: ffff595200995180 x12: 000000000025a5c8
[ 6.688277] x11: 0000000000000820 x10: 0000000000000001 x9 : ffff59520fbd1c69
[ 6.688279] x8 : ffff595202169704 x7 : 00000000ffffffff x6 : 0000000000000001
[ 6.688281] x5 : fffffdffbf7a8cc0 x4 : ffffb46fcc0232a4 x3 : 0000000000000002
[ 6.688283] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff595201628410
[ 6.688286] Call trace:
[ 6.688287] iommu_dma_sync_sg_for_device+0x28/0x100
[ 6.717582] __dma_sync_sg_for_device+0x3c/0x40
[ 6.717585] spi_transfer_one_message+0x358/0x680
[ 6.732229] __spi_pump_transfer_message+0x188/0x494
[ 6.732232] __spi_sync+0x2a8/0x3c4
[ 6.732234] spi_sync+0x30/0x54
[ 6.732236] goodix_berlin_spi_write+0xf8/0x164 [goodix_berlin_spi]
[ 6.739854] _regmap_raw_write_impl+0x538/0x674
[ 6.750053] _regmap_raw_write+0xb4/0x144
[ 6.750056] regmap_raw_write+0x7c/0xc0
[ 6.750058] goodix_berlin_power_on+0xb0/0x1b0 [goodix_berlin_core]
[ 6.765520] goodix_berlin_probe+0xc0/0x660 [goodix_berlin_core]
[ 6.765522] goodix_berlin_spi_probe+0x12c/0x14c [goodix_berlin_spi]
[ 6.772339] spi_probe+0x84/0xe4
[ 6.772342] really_probe+0xbc/0x29c
[ 6.784313] __driver_probe_device+0x78/0x12c
[ 6.784316] driver_probe_device+0x3c/0x15c
[ 6.784319] __driver_attach+0x90/0x19c
[ 6.784322] bus_for_each_dev+0x7c/0xdc
[ 6.794520] driver_attach+0x24/0x30
[ 6.794523] bus_add_driver+0xe4/0x208
[ 6.794526] driver_register+0x5c/0x124
[ 6.802586] __spi_register_driver+0xa4/0xe4
[ 6.802589] goodix_berlin_spi_driver_init+0x20/0x1000 [goodix_berlin_spi]
[ 6.802591] do_one_initcall+0x80/0x1c8
[ 6.902310] do_init_module+0x60/0x218
[ 6.921988] load_module+0x1bcc/0x1d8c
[ 6.925847] init_module_from_file+0x88/0xcc
[ 6.930238] __arm64_sys_finit_module+0x1dc/0x2e4
[ 6.935074] invoke_syscall+0x48/0x114
[ 6.938944] el0_svc_common.constprop.0+0xc0/0xe0
[ 6.943781] do_el0_svc+0x1c/0x28
[ 6.947195] el0_svc+0x34/0xd8
[ 6.950348] el0t_64_sync_handler+0x120/0x12c
[ 6.954833] el0t_64_sync+0x190/0x194
[ 6.958600] Code: 2a0203f5 2a0303f6 a90363f7 aa0003f7 (b9401c20)
[ 6.964859] ---[ end trace 0000000000000000 ]---