2022-04-12 20:20:48

by Zheyu Ma

Subject: [BUG] mtd: rawnand: denali_pci: page fault when probing fails

Hello,

I found a bug in the denali_pci module.
When the driver fails to probe, we will get the following splat:

[ 4.472703] denali-nand-pci 0000:00:05.0: timeout while waiting for irq 0x1000
[ 4.474071] denali-nand-pci: probe of 0000:00:05.0 failed with error -5
[ 4.473538] nand: No NAND device found
[ 4.474068] BUG: unable to handle page fault for address: ffffc90005000410
[ 4.475169] #PF: supervisor write access in kernel mode
[ 4.475579] #PF: error_code(0x0002) - not-present page
[ 4.478362] RIP: 0010:iowrite32+0x9/0x50
[ 4.486068] Call Trace:
[ 4.486269] <IRQ>
[ 4.486443] denali_isr+0x15b/0x300 [denali]
[ 4.486788] ? denali_direct_write+0x50/0x50 [denali]
[ 4.487189] __handle_irq_event_percpu+0x161/0x3b0
[ 4.487571] handle_irq_event+0x7d/0x1b0
[ 4.487884] handle_fasteoi_irq+0x2b0/0x770
[ 4.488219] __common_interrupt+0xc8/0x1b0
[ 4.488549] common_interrupt+0x9a/0xc0

It seems that the driver unmaps the memory region before disabling the IRQ.

Regards,
Zheyu Ma


2022-04-12 20:44:40

by Miquel Raynal

Subject: Re: [BUG] mtd: rawnand: denali_pci: page fault when probing fails

Hi Zheyu,

Zheyu Ma wrote on Sun, 10 Apr 2022 22:17:35 +0800:

> Hello,
>
> I found a bug in the denali_pci module.
> When the driver fails to probe, we will get the following splat:
>
> [ 4.472703] denali-nand-pci 0000:00:05.0: timeout while waiting for irq 0x1000
> [ 4.474071] denali-nand-pci: probe of 0000:00:05.0 failed with error -5
> [ 4.473538] nand: No NAND device found
> [ 4.474068] BUG: unable to handle page fault for address: ffffc90005000410
> [ 4.475169] #PF: supervisor write access in kernel mode
> [ 4.475579] #PF: error_code(0x0002) - not-present page
> [ 4.478362] RIP: 0010:iowrite32+0x9/0x50
> [ 4.486068] Call Trace:
> [ 4.486269] <IRQ>
> [ 4.486443] denali_isr+0x15b/0x300 [denali]
> [ 4.486788] ? denali_direct_write+0x50/0x50 [denali]
> [ 4.487189] __handle_irq_event_percpu+0x161/0x3b0
> [ 4.487571] handle_irq_event+0x7d/0x1b0
> [ 4.487884] handle_fasteoi_irq+0x2b0/0x770
> [ 4.488219] __common_interrupt+0xc8/0x1b0
> [ 4.488549] common_interrupt+0x9a/0xc0
>
> It seems that the driver unmaps the memory region before disabling the IRQ.

Thanks for the report! The mapping is done with devm_ helpers and so is
the IRQ registration, so it's slightly more complicated than just
moving a function call in the remove path, apparently. Would you mind
investigating and proposing a patch?

Thanks,
Miquèl
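
To make the ordering question above concrete, here is a small userspace
model in plain C, not kernel code. It relies on one known property of
devres: managed resources are released in reverse order of registration,
and only after a failed probe has returned. Everything named fake_* or
probe_fails_* is hypothetical and exists only for this sketch. The first
scenario assumes an error path that unmaps by hand while the IRQ is
managed; that is an assumption for illustration, not something confirmed
in this thread.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RES 4

struct fake_dev;
typedef void (*release_fn)(struct fake_dev *);

/* Hypothetical stand-in for a device with a devres-style release list. */
struct fake_dev {
	release_fn release[MAX_RES]; /* managed actions, in registration order */
	size_t nres;
	int mapped;   /* 1 while the register window is mapped */
	int irq_live; /* 1 while the interrupt handler may still run */
};

static void fake_unmap(struct fake_dev *d)    { d->mapped = 0; }
static void fake_free_irq(struct fake_dev *d) { d->irq_live = 0; }

/* Register a managed release action (models what a devm_ helper does). */
static void fake_devm_add(struct fake_dev *d, release_fn fn)
{
	assert(d->nres < MAX_RES);
	d->release[d->nres++] = fn;
}

/* An interrupt arriving now would write to unmapped registers. */
static int irq_would_fault(const struct fake_dev *d)
{
	return d->irq_live && !d->mapped;
}

/* Release managed resources newest first; report any unsafe window. */
static int fake_release_all(struct fake_dev *d)
{
	int unsafe = 0;

	while (d->nres > 0) {
		d->release[--d->nres](d);
		unsafe |= irq_would_fault(d);
	}
	return unsafe;
}

/* Scenario A (assumed): mapping unmapped by hand, IRQ managed. */
int probe_fails_manual_unmap(void)
{
	struct fake_dev d = {0};
	int unsafe;

	d.mapped = 1;                     /* manual mapping, not managed */
	d.irq_live = 1;
	fake_devm_add(&d, fake_free_irq); /* managed IRQ */

	/* Probe fails: the error path unmaps before returning. */
	fake_unmap(&d);
	unsafe = irq_would_fault(&d);

	/* Managed resources are released only after probe has returned. */
	unsafe |= fake_release_all(&d);
	return unsafe;
}

/* Scenario B: both managed, mapping registered before the IRQ. */
int probe_fails_all_managed(void)
{
	struct fake_dev d = {0};

	d.mapped = 1;
	fake_devm_add(&d, fake_unmap);
	d.irq_live = 1;
	fake_devm_add(&d, fake_free_irq);

	/* Probe fails: no manual cleanup, release runs newest first. */
	return fake_release_all(&d);
}
```

In this model probe_fails_manual_unmap() returns 1, because the handler
is still registered after the window is gone, and
probe_fails_all_managed() returns 0, because the IRQ is freed before the
mapping. Whether either scenario matches the actual driver needs to be
checked against its probe and error paths.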