2021-12-14 05:13:27

by Joshua Scott

Subject: Unhandled prefetch abort when probing USB flash drive

Hi,

I have been running into a kernel panic when probing a USB flash drive, and was after some advice or suggestions on what might be going wrong.

First up, the details of my setup:
* Initially seen on Linux 5.7.19, but I've tested on a vanilla copy of Linux 5.15.0 without any extra modules, and the panic still occurs.
* Flash drive (lsusb): Bus 001 Device 002: ID 1005:b113 Apacer Technology, Inc. Handy Steno 2.0/HT203
* Our system is based on the Marvell 98DX323x SoC (Arm v7, based on the Armada 370/XP)

The panic occurs after sd_probe() gets called, with the USB flash drive attached. This occurs around 1 time out of 100. We first saw this during a reboot, but it can be reproduced much faster by loading and unloading the sd_mod module in a loop, to exercise the probe function.
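The load/unload loop described above could be scripted roughly as follows. This is only a sketch: the function name `stress_probe`, the `pause` settle time, and the injectable `run` parameter are illustrative assumptions, not the script that was actually used. It needs root on the target, with the flash drive attached.

```python
import subprocess
import time


def stress_probe(cycles, run=subprocess.run, pause=0.5, module="sd_mod"):
    """Load and unload `module` `cycles` times; return the cycles completed."""
    completed = 0
    for _ in range(cycles):
        # Loading the module makes the SCSI core probe attached disks,
        # which is what exercises sd_probe().
        run(["modprobe", module], check=True)
        # Hypothetical settle time between load and unload; tune as needed.
        time.sleep(pause)
        # Unload again so that the next iteration re-probes the drive.
        run(["modprobe", "-r", module], check=True)
        completed += 1
    return completed
```

Calling `stress_probe(1000)` would run a thousand probe cycles; a panic would simply stop the loop, so the last cycle count has to be logged elsewhere (for example over the serial console) if it is wanted.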

When adding some diagnostic print statements, I found that adding a delay into sd_probe(), just after the call to device_add_disk(), seems to prevent the issue.

The headline of the panic is not always the same. We have seen:
* Unhandled fault: external abort on non-linefetch (0x808) at 0x9fbfa73c
* Unhandled fault: imprecise external abort (0x1416) at 0x76f5e508
* Unhandled prefetch abort: external abort on non-linefetch (0x1008) at 0x803c8a88
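For reference, the hex value in parentheses in each headline is the fault status register, and it can be split into fields as sketched below. The bit positions follow my understanding of the ARMv7 short-descriptor FSR layout and the masks in Linux's arch/arm/mm/fault.h; `FSR_EXT` and `decode_fsr` are names made up for this sketch, and the name table holds only the two status codes seen in this thread, not the kernel's full table.

```python
# Masks for the ARMv7 short-descriptor fault status register (assumed layout).
FSR_FS3_0 = 0xF        # low four bits of the fault status
FSR_FS4 = 1 << 10      # fifth bit of the fault status
FSR_WRITE = 1 << 11    # set when the aborting data access was a write
FSR_EXT = 1 << 12      # ExT bit (hypothetical name for this sketch)

# Only the two status codes that appear in this thread; not a full table.
FAULT_NAMES = {
    0x08: "external abort on non-linefetch",
    0x16: "imprecise external abort",
}


def decode_fsr(fsr):
    """Split a raw FSR value into status code, name, write flag and ExT flag."""
    # Combine FS[3:0] with FS[4], which lives at bit 10 of the register.
    fs = (fsr & FSR_FS3_0) | ((fsr & FSR_FS4) >> 6)
    return {
        "fs": fs,
        "name": FAULT_NAMES.get(fs, "not in this sketch's table"),
        "write": bool(fsr & FSR_WRITE),
        "ext": bool(fsr & FSR_EXT),
    }
```

Under these assumptions, 0x808 decodes to status 0x08 on a write access, while 0x1416 and 0x1008 both have the ExT bit set and decode to status 0x16 and 0x08 respectively.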

Included below is an example of the panic output.

Thank you,
Joshua Scott

Unhandled prefetch abort: external abort on non-linefetch (0x1008) at 0x8018fe20
Internal error: : 1008 [#1] PREEMPT SMP ARM
Modules linked in: sd_mod(+) diag tipc platform_driver(O) ipifwd(PO) usb_storage scsi_mod [last unloaded: sd_mod]
CPU: 0 PID: 178 Comm: udevd Tainted: P O 5.7.19-at1 #39
Hardware name: Marvell Armada 370/XP (Device Tree)
PC is at sys_clock_gettime32+0x58/0xc4
LR is at ret_fast_syscall+0x0/0x54
pc : [<8018fe20>] lr : [<80100060>] psr: a00e0013
sp : 9dfcdf80 ip : 10c5387d fp : 00041150
r10: 00000107 r9 : 9dfcc000 r8 : 80100288
r7 : 00000107 r6 : 00000000 r5 : 7eddb208 r4 : 80c052c8
r3 : 8018ea90 r2 : 00000001 r1 : 7eddb208 r0 : 00000001
Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c5387d Table: 1dfc006a DAC: 00000051
Process udevd (pid: 178, stack limit = 0xfbd1de7a)
Stack: (0x9dfcdf80 to 0x9dfce000)
df80: 00000000 00000000 00000000 ed40bc3c 000000fc ed40bc3c 00000008 00000001
dfa0: 7eddb208 80100060 00000001 7eddb208 00000001 7eddb208 eec19c0e eec19c0e
dfc0: 00000001 7eddb208 00000000 00000107 000402c4 00040308 7eddbcd0 00041150
dfe0: 76f38060 7eddb1f0 0002505c 76ed9bc4 600e0010 00000001 00000000 00000000
[<8018fe20>] (sys_clock_gettime32) from [<80100060>] (ret_fast_syscall+0x0/0x54)
Exception stack(0x9dfcdfa8 to 0x9dfcdff0)
dfa0: 00000001 7eddb208 00000001 7eddb208 eec19c0e eec19c0e
dfc0: 00000001 7eddb208 00000000 00000107 000402c4 00040308 7eddbcd0 00041150
dfe0: 76f38060 7eddb1f0 0002505c 76ed9bc4
Code: e5933080 e3530000 0a000018 e5933008 (e1a0100d)
---[ end trace 98c4e7c1cd29d9e3 ]---


2021-12-14 11:21:08

by Russell King (Oracle)

Subject: Re: Unhandled prefetch abort when probing USB flash drive

On Tue, Dec 14, 2021 at 05:13:20AM +0000, Joshua Scott wrote:
> Hi,
>
> I have been running into a kernel panic when probing a USB flash drive, and was after some advice or suggestions on what might be going wrong.
>
> First up, the details of my setup:
> * Initially seen on Linux 5.7.19, but I've tested on a vanilla copy of Linux 5.15.0 without any extra modules, and the panic still occurs.
> * Flash drive (lsusb): Bus 001 Device 002: ID 1005:b113 Apacer Technology, Inc. Handy Steno 2.0/HT203
> * Our system is based on the Marvell 98DX323x SoC (Arm v7, based on the Armada 370/XP)

Does it work with any kernel? If it doesn't, then I would suspect a
hardware bug, a power supply glitch, or an SDRAM timing issue.

Why? These faults seem somewhat random and spurious. In the example
prefetch abort, it's weird on two counts:

1) "external abort on non-linefetch" means that we weren't accessing
cached memory, but the kernel is always in cached memory.
2) prefetch abort means the instruction stream failed to read from
this location, but we later see in the Code: line that we have been
able to read the instructions into the data cache.
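To make point 2 concrete, the Code: line can be mapped back to addresses with a small helper like the one below. This is a sketch that assumes ARM-mode (4-byte) instructions and that the parenthesised word is the one at PC, which is how the ARM oops dump marks it; `parse_code_line` is a hypothetical name.

```python
import re


def parse_code_line(line, pc):
    """Return (address, word, is_at_pc) tuples for an ARM-mode 'Code:' line."""
    # Each instruction is eight hex digits; the one at PC is in parentheses.
    tokens = re.findall(r"\(?[0-9a-fA-F]{8}\)?", line)
    marked = [i for i, tok in enumerate(tokens) if tok.startswith("(")]
    if len(marked) != 1:
        raise ValueError("expected exactly one parenthesised word")
    at_pc = marked[0]
    # Words before the marked one sit at lower addresses, 4 bytes apart.
    return [
        (pc + 4 * (i - at_pc), int(tok.strip("()"), 16), i == at_pc)
        for i, tok in enumerate(tokens)
    ]
```

For the oops in the original report, passing the Code: line with `pc=0x8018fe20` places the five words at 0x8018fe10 through 0x8018fe20, with e1a0100d at the faulting address, so the kernel could read that word as data even though the instruction fetch was reported as aborted.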

If the system runs fine without the flash drive attached, I would
suggest it's a power issue. Try an externally powered USB hub, so
that the hub supplies the power for the flash drive, and see whether
that makes a difference.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!