From commit a304e1b82808 ("[PATCH] Debug shared irqs"), there is a test
to make sure the shared irq handler should be able to handle the unexpected
event after deregistration. For this case, let's apply MT76_REMOVED flag to
indicate the device was removed and do not run into the resource access
anymore.
BUG: KASAN: use-after-free in mt7921_irq_handler+0xd8/0x100 [mt7921e]
Read of size 8 at addr ffff88824a7d3b78 by task rmmod/11115
CPU: 28 PID: 11115 Comm: rmmod Tainted: G W L 5.17.0 #10
Hardware name: Micro-Star International Co., Ltd. MS-7D73/MPG B650I
EDGE WIFI (MS-7D73), BIOS 1.81 01/05/2024
Call Trace:
<TASK>
dump_stack_lvl+0x6f/0xa0
print_address_description.constprop.0+0x1f/0x190
? mt7921_irq_handler+0xd8/0x100 [mt7921e]
? mt7921_irq_handler+0xd8/0x100 [mt7921e]
kasan_report.cold+0x7f/0x11b
? mt7921_irq_handler+0xd8/0x100 [mt7921e]
mt7921_irq_handler+0xd8/0x100 [mt7921e]
free_irq+0x627/0xaa0
devm_free_irq+0x94/0xd0
? devm_request_any_context_irq+0x160/0x160
? kobject_put+0x18d/0x4a0
mt7921_pci_remove+0x153/0x190 [mt7921e]
pci_device_remove+0xa2/0x1d0
__device_release_driver+0x346/0x6e0
driver_detach+0x1ef/0x2c0
bus_remove_driver+0xe7/0x2d0
? __check_object_size+0x57/0x310
pci_unregister_driver+0x26/0x250
__do_sys_delete_module+0x307/0x510
? free_module+0x6a0/0x6a0
? fpregs_assert_state_consistent+0x4b/0xb0
? rcu_read_lock_sched_held+0x10/0x70
? syscall_enter_from_user_mode+0x20/0x70
? trace_hardirqs_on+0x1c/0x130
do_syscall_64+0x5c/0x80
? trace_hardirqs_on_prepare+0x72/0x160
? do_syscall_64+0x68/0x80
? trace_hardirqs_on_prepare+0x72/0x160
entry_SYSCALL_64_after_hwframe+0x44/0xae
Reported-by: Mikhail Gavrilov <[email protected]>
Closes: https://lore.kernel.org/linux-wireless/CABXGCsOdvVwdLmSsC8TZ1jF0UOg_F_W3wqLECWX620PUkvNk=A@mail.gmail.com/
Fixes: 9270270d6219 ("wifi: mt76: mt7921: fix PCI DMA hang after reboot")
Tested-by: Mikhail Gavrilov <[email protected]>
Signed-off-by: Deren Wu <[email protected]>
---
drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 1 +
drivers/net/wireless/mediatek/mt76/mt792x_dma.c | 2 ++
2 files changed, 3 insertions(+)
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
index 57903c6e4f11..2f04d6658b6b 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
@@ -387,6 +387,7 @@ static void mt7921_pci_remove(struct pci_dev *pdev)
struct mt792x_dev *dev = container_of(mdev, struct mt792x_dev, mt76);
mt7921e_unregister_device(dev);
+ set_bit(MT76_REMOVED, &mdev->phy.state);
devm_free_irq(&pdev->dev, pdev->irq, dev);
mt76_free_device(&dev->mt76);
pci_free_irq_vectors(pdev);
diff --git a/drivers/net/wireless/mediatek/mt76/mt792x_dma.c b/drivers/net/wireless/mediatek/mt76/mt792x_dma.c
index 488326ce5ed4..3893dbe866fe 100644
--- a/drivers/net/wireless/mediatek/mt76/mt792x_dma.c
+++ b/drivers/net/wireless/mediatek/mt76/mt792x_dma.c
@@ -12,6 +12,8 @@ irqreturn_t mt792x_irq_handler(int irq, void *dev_instance)
{
struct mt792x_dev *dev = dev_instance;
+ if (test_bit(MT76_REMOVED, &dev->mt76.phy.state))
+ return IRQ_NONE;
mt76_wr(dev, dev->irq_map->host_irq_enable, 0);
if (!test_bit(MT76_STATE_INITIALIZED, &dev->mphy.state))
--
2.18.0
> -----Original Message-----
> From: Deren Wu <[email protected]>
> Sent: Saturday, January 13, 2024 5:00 PM
> To: Felix Fietkau <[email protected]>; Lorenzo Bianconi <[email protected]>
> Cc: Sean Wang <[email protected]>; Soul Huang <[email protected]>; Ming Yen Hsieh
> <[email protected]>; Leon Yen <[email protected]>; Eric-SY Chang
> <[email protected]>; KM Lin <[email protected]>; Robin Chiu <[email protected]>; CH Yeh
> <[email protected]>; Posh Sun <[email protected]>; Quan Zhou <[email protected]>; Ryder Lee
> <[email protected]>; Shayne Chen <[email protected]>; linux-wireless
> <[email protected]>; linux-mediatek <[email protected]>; Deren Wu
> <[email protected]>
> Subject: [PATCH 1/2] wifi: mt76: mt7921e: fix use-after-free in free_irq()
>
> From commit a304e1b82808 ("[PATCH] Debug shared irqs"), there is a test
> to make sure the shared irq handler should be able to handle the unexpected
> event after deregistration. For this case, let's apply MT76_REMOVED flag to
> indicate the device was removed and do not run into the resource access
> anymore.
>
> BUG: KASAN: use-after-free in mt7921_irq_handler+0xd8/0x100 [mt7921e]
> Read of size 8 at addr ffff88824a7d3b78 by task rmmod/11115
> CPU: 28 PID: 11115 Comm: rmmod Tainted: G W L 5.17.0 #10
> Hardware name: Micro-Star International Co., Ltd. MS-7D73/MPG B650I
> EDGE WIFI (MS-7D73), BIOS 1.81 01/05/2024
> Call Trace:
> <TASK>
> dump_stack_lvl+0x6f/0xa0
> print_address_description.constprop.0+0x1f/0x190
> ? mt7921_irq_handler+0xd8/0x100 [mt7921e]
> ? mt7921_irq_handler+0xd8/0x100 [mt7921e]
> kasan_report.cold+0x7f/0x11b
> ? mt7921_irq_handler+0xd8/0x100 [mt7921e]
> mt7921_irq_handler+0xd8/0x100 [mt7921e]
> free_irq+0x627/0xaa0
> devm_free_irq+0x94/0xd0
> ? devm_request_any_context_irq+0x160/0x160
> ? kobject_put+0x18d/0x4a0
> mt7921_pci_remove+0x153/0x190 [mt7921e]
> pci_device_remove+0xa2/0x1d0
> __device_release_driver+0x346/0x6e0
> driver_detach+0x1ef/0x2c0
> bus_remove_driver+0xe7/0x2d0
> ? __check_object_size+0x57/0x310
> pci_unregister_driver+0x26/0x250
> __do_sys_delete_module+0x307/0x510
> ? free_module+0x6a0/0x6a0
> ? fpregs_assert_state_consistent+0x4b/0xb0
> ? rcu_read_lock_sched_held+0x10/0x70
> ? syscall_enter_from_user_mode+0x20/0x70
> ? trace_hardirqs_on+0x1c/0x130
> do_syscall_64+0x5c/0x80
> ? trace_hardirqs_on_prepare+0x72/0x160
> ? do_syscall_64+0x68/0x80
> ? trace_hardirqs_on_prepare+0x72/0x160
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> Reported-by: Mikhail Gavrilov <[email protected]>
> Closes:
> https://lore.kernel.org/linux-wireless/[email protected].
> com/
> Fixes: 9270270d6219 ("wifi: mt76: mt7921: fix PCI DMA hang after reboot")
> Tested-by: Mikhail Gavrilov <[email protected]>
> Signed-off-by: Deren Wu <[email protected]>
> ---
> drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 1 +
> drivers/net/wireless/mediatek/mt76/mt792x_dma.c | 2 ++
> 2 files changed, 3 insertions(+)
>
> diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> index 57903c6e4f11..2f04d6658b6b 100644
> --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> @@ -387,6 +387,7 @@ static void mt7921_pci_remove(struct pci_dev *pdev)
> struct mt792x_dev *dev = container_of(mdev, struct mt792x_dev, mt76);
>
> mt7921e_unregister_device(dev);
> + set_bit(MT76_REMOVED, &mdev->phy.state);
Can it do below like mt7921_pci_suspend() to safely stop interrupt handler?
Instead of setting a flag.
synchronize_irq(pdev->irq);
tasklet_kill(&mdev->irq_tasklet);
> devm_free_irq(&pdev->dev, pdev->irq, dev);
> mt76_free_device(&dev->mt76);
> pci_free_irq_vectors(pdev);
> diff --git a/drivers/net/wireless/mediatek/mt76/mt792x_dma.c
> b/drivers/net/wireless/mediatek/mt76/mt792x_dma.c
> index 488326ce5ed4..3893dbe866fe 100644
> --- a/drivers/net/wireless/mediatek/mt76/mt792x_dma.c
> +++ b/drivers/net/wireless/mediatek/mt76/mt792x_dma.c
> @@ -12,6 +12,8 @@ irqreturn_t mt792x_irq_handler(int irq, void *dev_instance)
> {
> struct mt792x_dev *dev = dev_instance;
>
If PCI is removed, is it still safe to access 'dev_instance'?
> + if (test_bit(MT76_REMOVED, &dev->mt76.phy.state))
> + return IRQ_NONE;
> mt76_wr(dev, dev->irq_map->host_irq_enable, 0);
>
> if (!test_bit(MT76_STATE_INITIALIZED, &dev->mphy.state))
> --
> 2.18.0
>
On Mon, 2024-01-15 at 02:03 +0000, Ping-Ke Shih wrote:
>
>
> > -----Original Message-----
> > From: Deren Wu <[email protected]>
> > Sent: Saturday, January 13, 2024 5:00 PM
> > To: Felix Fietkau <[email protected]>; Lorenzo Bianconi <
> [email protected]>
> > Cc: Sean Wang <[email protected]>; Soul Huang <
> [email protected]>; Ming Yen Hsieh
> > <[email protected]>; Leon Yen <[email protected]>;
> Eric-SY Chang
> > <[email protected]>; KM Lin <[email protected]>; Robin
> Chiu <[email protected]>; CH Yeh
> > <[email protected]>; Posh Sun <[email protected]>; Quan Zhou
> <[email protected]>; Ryder Lee
> > <[email protected]>; Shayne Chen <[email protected]>;
> linux-wireless
> > <[email protected]>; linux-mediatek <
> [email protected]>; Deren Wu
> > <[email protected]>
> > Subject: [PATCH 1/2] wifi: mt76: mt7921e: fix use-after-free in
> free_irq()
> >
> > From commit a304e1b82808 ("[PATCH] Debug shared irqs"), there is a
> test
> > to make sure the shared irq handler should be able to handle the
> unexpected
> > event after deregistration. For this case, let's apply MT76_REMOVED
> flag to
> > indicate the device was removed and do not run into the resource
> access
> > anymore.
> >
> > BUG: KASAN: use-after-free in mt7921_irq_handler+0xd8/0x100
> [mt7921e]
> > Read of size 8 at addr ffff88824a7d3b78 by task rmmod/11115
> > CPU: 28 PID: 11115 Comm: rmmod Tainted: G W L 5.17.0
> #10
> > Hardware name: Micro-Star International Co., Ltd. MS-7D73/MPG B650I
> > EDGE WIFI (MS-7D73), BIOS 1.81 01/05/2024
> > Call Trace:
> > <TASK>
> > dump_stack_lvl+0x6f/0xa0
> > print_address_description.constprop.0+0x1f/0x190
> > ? mt7921_irq_handler+0xd8/0x100 [mt7921e]
> > ? mt7921_irq_handler+0xd8/0x100 [mt7921e]
> > kasan_report.cold+0x7f/0x11b
> > ? mt7921_irq_handler+0xd8/0x100 [mt7921e]
> > mt7921_irq_handler+0xd8/0x100 [mt7921e]
> > free_irq+0x627/0xaa0
> > devm_free_irq+0x94/0xd0
> > ? devm_request_any_context_irq+0x160/0x160
> > ? kobject_put+0x18d/0x4a0
> > mt7921_pci_remove+0x153/0x190 [mt7921e]
> > pci_device_remove+0xa2/0x1d0
> > __device_release_driver+0x346/0x6e0
> > driver_detach+0x1ef/0x2c0
> > bus_remove_driver+0xe7/0x2d0
> > ? __check_object_size+0x57/0x310
> > pci_unregister_driver+0x26/0x250
> > __do_sys_delete_module+0x307/0x510
> > ? free_module+0x6a0/0x6a0
> > ? fpregs_assert_state_consistent+0x4b/0xb0
> > ? rcu_read_lock_sched_held+0x10/0x70
> > ? syscall_enter_from_user_mode+0x20/0x70
> > ? trace_hardirqs_on+0x1c/0x130
> > do_syscall_64+0x5c/0x80
> > ? trace_hardirqs_on_prepare+0x72/0x160
> > ? do_syscall_64+0x68/0x80
> > ? trace_hardirqs_on_prepare+0x72/0x160
> > entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > Reported-by: Mikhail Gavrilov <[email protected]>
> > Closes:
> >
> https://lore.kernel.org/linux-wireless/[email protected]
> .
> > com/
> > Fixes: 9270270d6219 ("wifi: mt76: mt7921: fix PCI DMA hang after
> reboot")
> > Tested-by: Mikhail Gavrilov <[email protected]>
> > Signed-off-by: Deren Wu <[email protected]>
> > ---
> > drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 1 +
> > drivers/net/wireless/mediatek/mt76/mt792x_dma.c | 2 ++
> > 2 files changed, 3 insertions(+)
> >
> > diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> > b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> > index 57903c6e4f11..2f04d6658b6b 100644
> > --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> > +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> > @@ -387,6 +387,7 @@ static void mt7921_pci_remove(struct pci_dev
> *pdev)
> > struct mt792x_dev *dev = container_of(mdev, struct
> mt792x_dev, mt76);
> >
> > mt7921e_unregister_device(dev);
> > + set_bit(MT76_REMOVED, &mdev->phy.state);
>
> Can it do below like mt7921_pci_suspend() to safely stop interrupt
> handler?
> Instead of setting a flag.
>
> synchronize_irq(pdev->irq);
> tasklet_kill(&mdev->irq_tasklet);
>
Here is the snapshot. The code is trying to direct access this irq
handler after deregisering, for IRQF_SHARED case. synchronize_irq() and
tasklet_kill() are all done in previous steps. We need to stop the
extra call here. If there are any alternative, that would be
appreciated.
/*
* It's a shared IRQ -- the driver ought to be prepared for an IRQ
* event to happen even now it's being freed, so let's make sure that
* is so by doing an extra call to the handler ....
*
* ( We do this after actually deregistering it, to make sure that a
* 'real' IRQ doesn't run in parallel with our fake. )
*/
if (action->flags & IRQF_SHARED) {
local_irq_save(flags);
action->handler(irq, dev_id);
local_irq_restore(flags);
}
https://elixir.bootlin.com/linux/v6.7/source/kernel/irq/manage.c#L1949
>
> > devm_free_irq(&pdev->dev, pdev->irq, dev);
> > mt76_free_device(&dev->mt76);
> > pci_free_irq_vectors(pdev);
> > diff --git a/drivers/net/wireless/mediatek/mt76/mt792x_dma.c
> > b/drivers/net/wireless/mediatek/mt76/mt792x_dma.c
> > index 488326ce5ed4..3893dbe866fe 100644
> > --- a/drivers/net/wireless/mediatek/mt76/mt792x_dma.c
> > +++ b/drivers/net/wireless/mediatek/mt76/mt792x_dma.c
> > @@ -12,6 +12,8 @@ irqreturn_t mt792x_irq_handler(int irq, void
> *dev_instance)
> > {
> > struct mt792x_dev *dev = dev_instance;
> >
>
> If PCI is removed, is it still safe to access 'dev_instance'?
>
For this case, we are running into devm_free_irq() and the instance is
still alive. The irq handler should be destroyed before the PCI is
removed.
> > + if (test_bit(MT76_REMOVED, &dev->mt76.phy.state))
> > + return IRQ_NONE;
> > mt76_wr(dev, dev->irq_map->host_irq_enable, 0);
> >
> > if (!test_bit(MT76_STATE_INITIALIZED, &dev->mphy.state))
> > --
> > 2.18.0
> >
>
From: Deren Wu (武德仁) <[email protected]>
Sent: Monday, January 15, 2024 8:18 PM
To: [email protected]; Ping-Ke Shih <[email protected]>; [email protected]
Cc: Mingyen Hsieh (謝明諺) <[email protected]>; [email protected]; Leon Yen (顏良儒) <[email protected]>; Shayne Chen (陳軒丞) <[email protected]>; Quan Zhou (周全) <[email protected]>; Sean Wang <[email protected]>; KM Lin (林昆民) <[email protected]>; Soul Huang (黃至昶) <[email protected]>; Posh Sun (孫瑞廷) <[email protected]>; Eric-SY Chang (張書源) <[email protected]>; CH Yeh (葉志豪) <[email protected]>; Robin Chiu (邱國濱) <[email protected]>; Ryder Lee <[email protected]>; [email protected]
Subject: Re: [PATCH 1/2] wifi: mt76: mt7921e: fix use-after-free in free_irq()
>
> Here is the snapshot. The code is trying to direct access this irq
> handler after deregisering, for IRQF_SHARED case. synchronize_irq() and
> tasklet_kill() are all done in previous steps. We need to stop the
> extra call here. If there are any alternative, that would be
> appreciated.
>
> /*
> * It's a shared IRQ -- the driver ought to be prepared for an IRQ
> * event to happen even now it's being freed, so let's make sure that
> * is so by doing an extra call to the handler ....
> *
> * ( We do this after actually deregistering it, to make sure that a
> * 'real' IRQ doesn't run in parallel with our fake. )
> */
> if (action->flags & IRQF_SHARED) {
> local_irq_save(flags);
> action->handler(irq, dev_id);
> local_irq_restore(flags);
> }
>
I missed this point. Sorry for the noise.