2018-03-14 16:39:35

by Jonathan Toppins

[permalink] [raw]
Subject: [PATCH net] tg3: prevent scheduling while atomic splat

The problem was introduced in commit
506b0a395f26 ("[netdrv] tg3: APE heartbeat changes"). The bug occurs
because tp->lock spinlock is held which is obtained in tg3_start
by way of tg3_full_lock(), line 11571. The documentation for usleep_range()
specifically states it cannot be used inside a spinlock.

Fixes: 506b0a395f26 ("[netdrv] tg3: APE heartbeat changes")
Signed-off-by: Jonathan Toppins <[email protected]>
---

Notes:
The thing I need reviewed from Broadcom is if the udelay should be 20
instead of 10, due to any timing changes introduced by the offending
patch.

drivers/net/ethernet/broadcom/tg3.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index c1841db1b500..f2593978ae75 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -820,7 +820,7 @@ static int tg3_ape_event_lock(struct tg3 *tp, u32 timeout_us)

tg3_ape_unlock(tp, TG3_APE_LOCK_MEM);

- usleep_range(10, 20);
+ udelay(10);
timeout_us -= (timeout_us > 10) ? 10 : timeout_us;
}

--
2.13.6



2018-03-14 17:24:49

by Michael Chan

[permalink] [raw]
Subject: Re: [PATCH net] tg3: prevent scheduling while atomic splat

On Wed, Mar 14, 2018 at 9:36 AM, Jonathan Toppins <[email protected]> wrote:
> The problem was introduced in commit
> 506b0a395f26 ("[netdrv] tg3: APE heartbeat changes"). The bug occurs
> because tp->lock spinlock is held which is obtained in tg3_start
> by way of tg3_full_lock(), line 11571. The documentation for usleep_range()
> specifically states it cannot be used inside a spinlock.
>
> Fixes: 506b0a395f26 ("[netdrv] tg3: APE heartbeat changes")
> Signed-off-by: Jonathan Toppins <[email protected]>
> ---
>
> Notes:
> The thing I need reviewed from Broadcom is if the udelay should be 20
> instead of 10, due to any timing changes introduced by the offending
> patch.

Thanks. 10 us is correct.

As a future improvement, we might want to see if we can release the
spinlock and go back to usleep_range(). The wait time is potentially
up to 20 msec which is quite long.

Acked-by: Michael Chan <[email protected]>

2018-03-14 17:29:05

by Jonathan Toppins

[permalink] [raw]
Subject: Re: [PATCH net] tg3: prevent scheduling while atomic splat

On 03/14/2018 01:22 PM, Michael Chan wrote:
> On Wed, Mar 14, 2018 at 9:36 AM, Jonathan Toppins <[email protected]> wrote:
>> The problem was introduced in commit
>> 506b0a395f26 ("[netdrv] tg3: APE heartbeat changes"). The bug occurs
>> because tp->lock spinlock is held which is obtained in tg3_start
>> by way of tg3_full_lock(), line 11571. The documentation for usleep_range()
>> specifically states it cannot be used inside a spinlock.
>>
>> Fixes: 506b0a395f26 ("[netdrv] tg3: APE heartbeat changes")
>> Signed-off-by: Jonathan Toppins <[email protected]>
>> ---
>>
>> Notes:
>> The thing I need reviewed from Broadcom is if the udelay should be 20
>> instead of 10, due to any timing changes introduced by the offending
>> patch.
>
> Thanks. 10 us is correct.
>
> As a future improvement, we might want to see if we can release the
> spinlock and go back to usleep_range(). The wait time is potentially
> up to 20 msec which is quite long.

Agreed, glad it is not just me wondering why a lock needs to be held for
reads. :-)

2018-03-14 17:46:15

by David Miller

[permalink] [raw]
Subject: Re: [PATCH net] tg3: prevent scheduling while atomic splat

From: Michael Chan <[email protected]>
Date: Wed, 14 Mar 2018 10:22:51 -0700

> On Wed, Mar 14, 2018 at 9:36 AM, Jonathan Toppins <[email protected]> wrote:
>> The problem was introduced in commit
>> 506b0a395f26 ("[netdrv] tg3: APE heartbeat changes"). The bug occurs
>> because tp->lock spinlock is held which is obtained in tg3_start
>> by way of tg3_full_lock(), line 11571. The documentation for usleep_range()
>> specifically states it cannot be used inside a spinlock.
>>
>> Fixes: 506b0a395f26 ("[netdrv] tg3: APE heartbeat changes")
>> Signed-off-by: Jonathan Toppins <[email protected]>
>> ---
>>
>> Notes:
>> The thing I need reviewed from Broadcom is if the udelay should be 20
>> instead of 10, due to any timing changes introduced by the offending
>> patch.
>
> Thanks. 10 us is correct.
>
> As a future improvement, we might want to see if we can release the
> spinlock and go back to usleep_range(). The wait time is potentially
> up to 20 msec which is quite long.
>
> Acked-by: Michael Chan <[email protected]>

Applied, thanks everyone.