2021-05-19 19:20:15

by Zheyu Ma

Subject: [PATCH] net/qla3xxx: fix schedule while atomic in ql_sem_spinlock

When 'ql_sem_spinlock' is called, the driver has already acquired the
'hw_lock' spin lock, so it must not call 'ssleep', which sleeps, in
atomic context.

This bug can be fixed by unlocking before calling 'ssleep'.

The kernel log reveals it:

[ 3.238124] BUG: scheduling while atomic: swapper/0/1/0x00000002
[ 3.238748] 2 locks held by swapper/0/1:
[ 3.239151] #0: ffff88810177b240 (&dev->mutex){....}-{3:3}, at: __device_driver_lock+0x41/0x60
[ 3.240026] #1: ffff888107c60e28 (&qdev->hw_lock){....}-{2:2}, at: ql3xxx_probe+0x2aa/0xea0
[ 3.240873] Modules linked in:
[ 3.241187] irq event stamp: 460854
[ 3.241541] hardirqs last enabled at (460853): [<ffffffff843051bf>] _raw_spin_unlock_irqrestore+0x4f/0x70
[ 3.242245] hardirqs last disabled at (460854): [<ffffffff843058ca>] _raw_spin_lock_irqsave+0x2a/0x70
[ 3.242245] softirqs last enabled at (446076): [<ffffffff846002e4>] __do_softirq+0x2e4/0x4b1
[ 3.242245] softirqs last disabled at (446069): [<ffffffff811ba5e0>] irq_exit_rcu+0x100/0x110
[ 3.242245] Preemption disabled at:
[ 3.242245] [<ffffffff828ca5ba>] ql3xxx_probe+0x2aa/0xea0
[ 3.242245] Kernel panic - not syncing: scheduling while atomic
[ 3.242245] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc1-00145-gee7dc339169-dirty #16
[ 3.242245] Call Trace:
[ 3.242245] dump_stack+0xba/0xf5
[ 3.242245] ? ql3xxx_probe+0x1f0/0xea0
[ 3.242245] panic+0x15a/0x3f2
[ 3.242245] ? vprintk+0x76/0x150
[ 3.242245] ? ql3xxx_probe+0x2aa/0xea0
[ 3.242245] __schedule_bug+0xae/0xe0
[ 3.242245] __schedule+0x72e/0xa00
[ 3.242245] schedule+0x43/0xf0
[ 3.242245] schedule_timeout+0x28b/0x500
[ 3.242245] ? del_timer_sync+0xf0/0xf0
[ 3.242245] ? msleep+0x2f/0x70
[ 3.242245] msleep+0x59/0x70
[ 3.242245] ql3xxx_probe+0x307/0xea0
[ 3.242245] ? _raw_spin_unlock_irqrestore+0x3a/0x70
[ 3.242245] ? pci_device_remove+0x110/0x110
[ 3.242245] local_pci_probe+0x45/0xa0
[ 3.242245] pci_device_probe+0x12b/0x1d0
[ 3.242245] really_probe+0x2a9/0x610
[ 3.242245] driver_probe_device+0x90/0x1d0
[ 3.242245] ? mutex_lock_nested+0x1b/0x20
[ 3.242245] device_driver_attach+0x68/0x70
[ 3.242245] __driver_attach+0x124/0x1b0
[ 3.242245] ? device_driver_attach+0x70/0x70
[ 3.242245] bus_for_each_dev+0xbb/0x110
[ 3.242245] ? rdinit_setup+0x45/0x45
[ 3.242245] driver_attach+0x27/0x30
[ 3.242245] bus_add_driver+0x1eb/0x2a0
[ 3.242245] driver_register+0xa9/0x180
[ 3.242245] __pci_register_driver+0x82/0x90
[ 3.242245] ? yellowfin_init+0x25/0x25
[ 3.242245] ql3xxx_driver_init+0x23/0x25
[ 3.242245] do_one_initcall+0x7f/0x3d0
[ 3.242245] ? rdinit_setup+0x45/0x45
[ 3.242245] ? rcu_read_lock_sched_held+0x4f/0x80
[ 3.242245] kernel_init_freeable+0x2aa/0x301
[ 3.242245] ? rest_init+0x2c0/0x2c0
[ 3.242245] kernel_init+0x18/0x190
[ 3.242245] ? rest_init+0x2c0/0x2c0
[ 3.242245] ? rest_init+0x2c0/0x2c0
[ 3.242245] ret_from_fork+0x1f/0x30
[ 3.242245] Dumping ftrace buffer:
[ 3.242245] (ftrace buffer empty)
[ 3.242245] Kernel Offset: disabled
[ 3.242245] Rebooting in 1 seconds.

Reported-by: Zheyu Ma <[email protected]>
Signed-off-by: Zheyu Ma <[email protected]>
---
drivers/net/ethernet/qlogic/qla3xxx.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qla3xxx.c b/drivers/net/ethernet/qlogic/qla3xxx.c
index 214e347097a7..af7c142a066f 100644
--- a/drivers/net/ethernet/qlogic/qla3xxx.c
+++ b/drivers/net/ethernet/qlogic/qla3xxx.c
@@ -114,7 +114,9 @@ static int ql_sem_spinlock(struct ql3_adapter *qdev,
 		value = readl(&port_regs->CommonRegs.semaphoreReg);
 		if ((value & (sem_mask >> 16)) == sem_bits)
 			return 0;
+		spin_unlock_irq(&qdev->hw_lock);
 		ssleep(1);
+		spin_lock_irq(&qdev->hw_lock);
 	} while (--seconds);
 	return -1;
 }
--
2.17.1



2021-05-19 20:22:02

by David Miller

Subject: Re: [PATCH] net/qla3xxx: fix schedule while atomic in ql_sem_spinlock

From: Zheyu Ma <[email protected]>
Date: Wed, 19 May 2021 06:49:14 +0000

> When calling the 'ql_sem_spinlock', the driver has already acquired the
> spin lock, so the driver should not call 'ssleep' in atomic context.
>
> This bug can be fixed by unlocking before calling 'ssleep'.
...
> diff --git a/drivers/net/ethernet/qlogic/qla3xxx.c b/drivers/net/ethernet/qlogic/qla3xxx.c
> index 214e347097a7..af7c142a066f 100644
> --- a/drivers/net/ethernet/qlogic/qla3xxx.c
> +++ b/drivers/net/ethernet/qlogic/qla3xxx.c
> @@ -114,7 +114,9 @@ static int ql_sem_spinlock(struct ql3_adapter *qdev,
>  		value = readl(&port_regs->CommonRegs.semaphoreReg);
>  		if ((value & (sem_mask >> 16)) == sem_bits)
>  			return 0;
> +		spin_unlock_irq(&qdev->hw_lock);
>  		ssleep(1);
> +		spin_lock_irq(&qdev->hw_lock);
>  	} while (--seconds);
>  	return -1;
>  }

Are you sure dropping the lock like this does not introduce a race condition?

Thank you.
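
To make the concern above concrete: a helper that releases a lock taken by
its caller lets other contexts change the protected state while the helper
waits. The following is a user-space model in C, not driver code. The names
dev_model, model_lock(), model_unlock(), intruder() and helper_drops_lock()
are hypothetical, the boolean only stands in for qdev->hw_lock, and the
"intruder" is an assumed stand-in for whatever else might run during
ssleep(1).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: lock_held stands in for qdev->hw_lock. */
struct dev_model {
	bool lock_held;
	int shared_state;	/* made-up state that the lock protects */
};

typedef void (*intruder_fn)(struct dev_model *);

static void model_lock(struct dev_model *d)
{
	assert(!d->lock_held);
	d->lock_held = true;
}

static void model_unlock(struct dev_model *d)
{
	assert(d->lock_held);
	d->lock_held = false;
}

/* Another context that grabs the lock while it is free. */
static void intruder(struct dev_model *d)
{
	model_lock(d);
	d->shared_state += 10;
	model_unlock(d);
}

/*
 * Called with the lock held. It reads the state, drops the lock around
 * the wait, retakes it, and then writes back a value computed from the
 * stale read, so any update made during the wait is lost.
 */
static void helper_drops_lock(struct dev_model *d, intruder_fn during_wait)
{
	int snapshot = d->shared_state;

	model_unlock(d);
	if (during_wait)
		during_wait(d);	/* stands in for the sleep window */
	model_lock(d);

	d->shared_state = snapshot + 1;
}
```

If a caller takes the lock, calls helper_drops_lock(&d, intruder) and then
unlocks, shared_state ends at 1 rather than 11: the update made during the
wait is overwritten. Whether the real driver has state exposed in this way
is not established in this thread.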

2021-05-21 04:31:28

by Zheyu Ma

Subject: Re: [PATCH] net/qla3xxx: fix schedule while atomic in ql_sem_spinlock

On Thu, May 20, 2021 at 3:26 AM David Miller <[email protected]> wrote:
>
> From: Zheyu Ma <[email protected]>
> Date: Wed, 19 May 2021 06:49:14 +0000
>
> > When calling the 'ql_sem_spinlock', the driver has already acquired the
> > spin lock, so the driver should not call 'ssleep' in atomic context.
> >
> > This bug can be fixed by unlocking before calling 'ssleep'.
> ...
> > diff --git a/drivers/net/ethernet/qlogic/qla3xxx.c b/drivers/net/ethernet/qlogic/qla3xxx.c
> > index 214e347097a7..af7c142a066f 100644
> > --- a/drivers/net/ethernet/qlogic/qla3xxx.c
> > +++ b/drivers/net/ethernet/qlogic/qla3xxx.c
> > @@ -114,7 +114,9 @@ static int ql_sem_spinlock(struct ql3_adapter *qdev,
> >  		value = readl(&port_regs->CommonRegs.semaphoreReg);
> >  		if ((value & (sem_mask >> 16)) == sem_bits)
> >  			return 0;
> > +		spin_unlock_irq(&qdev->hw_lock);
> >  		ssleep(1);
> > +		spin_lock_irq(&qdev->hw_lock);
> >  	} while (--seconds);
> >  	return -1;
> >  }
>
> Are you sure dropping the lock like this does not introduce a race condition?
>
> Thank you.

Thanks for your comment. It is indeed inappropriate to release the
lock here; I will send a second version of the patch.

Zheyu Ma
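
One possible direction for a revision, sketched under assumptions: keep the
lock held and busy-wait between polls instead of sleeping (in the kernel,
mdelay() busy-waits, whereas ssleep() sleeps). The thread does not say what
the second version actually does. Below is a user-space model in C of such
a polling loop. The names fake_regs, fake_writel(), fake_readl(),
busy_delay() and sem_poll_locked() are hypothetical, and the mask and bit
values in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware semaphore register. */
struct fake_regs {
	uint32_t semaphore_reg;
	unsigned int grant_after;	/* fake hardware grants on this write */
	unsigned int writes;
	unsigned int delays;
	bool lock_held;			/* models qdev->hw_lock */
};

static void fake_writel(struct fake_regs *r, uint32_t v)
{
	assert(r->lock_held);	/* register access only under the lock */
	r->writes++;
	if (r->writes >= r->grant_after)
		r->semaphore_reg = v & 0xffffu;
}

static uint32_t fake_readl(const struct fake_regs *r)
{
	assert(r->lock_held);
	return r->semaphore_reg;
}

/* Stand-in for a busy-wait delay: it never sleeps or drops the lock. */
static void busy_delay(struct fake_regs *r)
{
	r->delays++;
}

/*
 * Same loop shape as ql_sem_spinlock(), but the wait between polls is a
 * busy-wait, so the caller's lock stays held throughout. tries must be
 * at least 1. Returns 0 on success, -1 if the semaphore was never granted.
 */
static int sem_poll_locked(struct fake_regs *r, uint32_t sem_mask,
			   uint32_t sem_bits, unsigned int tries)
{
	do {
		fake_writel(r, sem_mask | sem_bits);
		uint32_t value = fake_readl(r);

		if ((value & (sem_mask >> 16)) == sem_bits)
			return 0;
		busy_delay(r);
	} while (--tries);
	return -1;
}
```

With lock_held set, grant_after = 2, sem_mask = 0x00070000 and
sem_bits = 0x4, a call with tries = 3 returns 0 after one delay; with
grant_after = 5 it returns -1 after three delays. The trade-off is that a
real busy-wait of this length keeps the CPU spinning with the lock held.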