2023-09-08 13:50:53

by Sascha Hauer

[permalink] [raw]
Subject: [PATCH] net: macb: fix sleep inside spinlock

macb_set_tx_clk() is called under a spinlock but itself calls clk_set_rate()
which can sleep. This results in:

| BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
| pps pps1: new PPS source ptp1
| in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 40, name: kworker/u4:3
| preempt_count: 1, expected: 0
| RCU nest depth: 0, expected: 0
| 4 locks held by kworker/u4:3/40:
| #0: ffff000003409148
| macb ff0c0000.ethernet: gem-ptp-timer ptp clock registered.
| ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
| #1: ffff8000833cbdd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
| #2: ffff000004f01578 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x44/0x4e8
| #3: ffff000004f06f50 (&bp->lock){....}-{3:3}, at: macb_mac_link_up+0x40/0x2ac
| irq event stamp: 113998
| hardirqs last enabled at (113997): [<ffff800080e8503c>] _raw_spin_unlock_irq+0x30/0x64
| hardirqs last disabled at (113998): [<ffff800080e84478>] _raw_spin_lock_irqsave+0xac/0xc8
| softirqs last enabled at (113608): [<ffff800080010630>] __do_softirq+0x430/0x4e4
| softirqs last disabled at (113597): [<ffff80008001614c>] ____do_softirq+0x10/0x1c
| CPU: 0 PID: 40 Comm: kworker/u4:3 Not tainted 6.5.0-11717-g9355ce8b2f50-dirty #368
| Hardware name: ... ZynqMP ... (DT)
| Workqueue: events_power_efficient phylink_resolve
| Call trace:
| dump_backtrace+0x98/0xf0
| show_stack+0x18/0x24
| dump_stack_lvl+0x60/0xac
| dump_stack+0x18/0x24
| __might_resched+0x144/0x24c
| __might_sleep+0x48/0x98
| __mutex_lock+0x58/0x7b0
| mutex_lock_nested+0x24/0x30
| clk_prepare_lock+0x4c/0xa8
| clk_set_rate+0x24/0x8c
| macb_mac_link_up+0x25c/0x2ac
| phylink_resolve+0x178/0x4e8
| process_one_work+0x1ec/0x51c
| worker_thread+0x1ec/0x3e4
| kthread+0x120/0x124
| ret_from_fork+0x10/0x20

The obvious fix is to move the call to macb_set_tx_clk() out of the
protected area. This seems safe as rx and tx are both disabled anyway at
this point.
It is however not entirely clear what the spinlock shall protect. It
could be the read-modify-write access to the NCFGR register, but this
is accessed in macb_set_rx_mode() and macb_set_rxcsum_feature() as well
without holding the spinlock. It could also be the register accesses
done in mog_init_rings() or macb_init_buffers(), but again these
functions are called without holding the spinlock in macb_hresp_error_task().
The locking seems fishy in this driver and it might deserve another look
before this patch is applied.

Fixes: 633e98a711ac0 ("net: macb: use resolved link config in mac_link_up()")
Signed-off-by: Sascha Hauer <[email protected]>
---
drivers/net/ethernet/cadence/macb_main.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 31f664ee4d778..b940dcd3ace68 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -756,8 +756,6 @@ static void macb_mac_link_up(struct phylink_config *config,
if (rx_pause)
ctrl |= MACB_BIT(PAE);

- macb_set_tx_clk(bp, speed);
-
/* Initialize rings & buffers as clearing MACB_BIT(TE) in link down
* cleared the pipeline and control registers.
*/
@@ -777,6 +775,9 @@ static void macb_mac_link_up(struct phylink_config *config,

spin_unlock_irqrestore(&bp->lock, flags);

+ if (!(bp->caps & MACB_CAPS_MACB_IS_EMAC))
+ macb_set_tx_clk(bp, speed);
+
/* Enable Rx and Tx; Enable PTP unicast */
ctrl = macb_readl(bp, NCR);
if (gem_has_ptp(bp))
--
2.39.2


2023-09-12 13:32:17

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH] net: macb: fix sleep inside spinlock

Hello:

This patch was applied to netdev/net.git (main)
by Paolo Abeni <[email protected]>:

On Fri, 8 Sep 2023 13:29:13 +0200 you wrote:
> macb_set_tx_clk() is called under a spinlock but itself calls clk_set_rate()
> which can sleep. This results in:
>
> | BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
> | pps pps1: new PPS source ptp1
> | in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 40, name: kworker/u4:3
> | preempt_count: 1, expected: 0
> | RCU nest depth: 0, expected: 0
> | 4 locks held by kworker/u4:3/40:
> | #0: ffff000003409148
> | macb ff0c0000.ethernet: gem-ptp-timer ptp clock registered.
> | ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
> | #1: ffff8000833cbdd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
> | #2: ffff000004f01578 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x44/0x4e8
> | #3: ffff000004f06f50 (&bp->lock){....}-{3:3}, at: macb_mac_link_up+0x40/0x2ac
> | irq event stamp: 113998
> | hardirqs last enabled at (113997): [<ffff800080e8503c>] _raw_spin_unlock_irq+0x30/0x64
> | hardirqs last disabled at (113998): [<ffff800080e84478>] _raw_spin_lock_irqsave+0xac/0xc8
> | softirqs last enabled at (113608): [<ffff800080010630>] __do_softirq+0x430/0x4e4
> | softirqs last disabled at (113597): [<ffff80008001614c>] ____do_softirq+0x10/0x1c
> | CPU: 0 PID: 40 Comm: kworker/u4:3 Not tainted 6.5.0-11717-g9355ce8b2f50-dirty #368
> | Hardware name: ... ZynqMP ... (DT)
> | Workqueue: events_power_efficient phylink_resolve
> | Call trace:
> | dump_backtrace+0x98/0xf0
> | show_stack+0x18/0x24
> | dump_stack_lvl+0x60/0xac
> | dump_stack+0x18/0x24
> | __might_resched+0x144/0x24c
> | __might_sleep+0x48/0x98
> | __mutex_lock+0x58/0x7b0
> | mutex_lock_nested+0x24/0x30
> | clk_prepare_lock+0x4c/0xa8
> | clk_set_rate+0x24/0x8c
> | macb_mac_link_up+0x25c/0x2ac
> | phylink_resolve+0x178/0x4e8
> | process_one_work+0x1ec/0x51c
> | worker_thread+0x1ec/0x3e4
> | kthread+0x120/0x124
> | ret_from_fork+0x10/0x20
>
> [...]

Here is the summary with links:
- net: macb: fix sleep inside spinlock
https://git.kernel.org/netdev/net/c/403f0e771457

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


2023-09-13 03:24:39

by Paolo Abeni

[permalink] [raw]
Subject: Re: [PATCH] net: macb: fix sleep inside spinlock

On Fri, 2023-09-08 at 13:29 +0200, Sascha Hauer wrote:
> macb_set_tx_clk() is called under a spinlock but itself calls clk_set_rate()
> which can sleep. This results in:
>
> > BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
> > pps pps1: new PPS source ptp1
> > in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 40, name: kworker/u4:3
> > preempt_count: 1, expected: 0
> > RCU nest depth: 0, expected: 0
> > 4 locks held by kworker/u4:3/40:
> > #0: ffff000003409148
> > macb ff0c0000.ethernet: gem-ptp-timer ptp clock registered.
> > ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
> > #1: ffff8000833cbdd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
> > #2: ffff000004f01578 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x44/0x4e8
> > #3: ffff000004f06f50 (&bp->lock){....}-{3:3}, at: macb_mac_link_up+0x40/0x2ac
> > irq event stamp: 113998
> > hardirqs last enabled at (113997): [<ffff800080e8503c>] _raw_spin_unlock_irq+0x30/0x64
> > hardirqs last disabled at (113998): [<ffff800080e84478>] _raw_spin_lock_irqsave+0xac/0xc8
> > softirqs last enabled at (113608): [<ffff800080010630>] __do_softirq+0x430/0x4e4
> > softirqs last disabled at (113597): [<ffff80008001614c>] ____do_softirq+0x10/0x1c
> > CPU: 0 PID: 40 Comm: kworker/u4:3 Not tainted 6.5.0-11717-g9355ce8b2f50-dirty #368
> > Hardware name: ... ZynqMP ... (DT)
> > Workqueue: events_power_efficient phylink_resolve
> > Call trace:
> > dump_backtrace+0x98/0xf0
> > show_stack+0x18/0x24
> > dump_stack_lvl+0x60/0xac
> > dump_stack+0x18/0x24
> > __might_resched+0x144/0x24c
> > __might_sleep+0x48/0x98
> > __mutex_lock+0x58/0x7b0
> > mutex_lock_nested+0x24/0x30
> > clk_prepare_lock+0x4c/0xa8
> > clk_set_rate+0x24/0x8c
> > macb_mac_link_up+0x25c/0x2ac
> > phylink_resolve+0x178/0x4e8
> > process_one_work+0x1ec/0x51c
> > worker_thread+0x1ec/0x3e4
> > kthread+0x120/0x124
> > ret_from_fork+0x10/0x20
>
> The obvious fix is to move the call to macb_set_tx_clk() out of the
> protected area. This seems safe as rx and tx are both disabled anyway at
> this point.
> It is however not entirely clear what the spinlock shall protect. It
> could be the read-modify-write access to the NCFGR register, but this
> is accessed in macb_set_rx_mode() and macb_set_rxcsum_feature() as well
> without holding the spinlock. It could also be the register accesses
> done in mog_init_rings() or macb_init_buffers(), but again these
> functions are called without holding the spinlock in macb_hresp_error_task().
> The locking seems fishy in this driver and it might deserve another look
> before this patch is applied.

macb_set_tx_clk() moved under the bp->lock scope as a consequence of
the blamed commit. Such commit moved a bunch of code originally in
macb_mac_config() and under the bp->lock lock into macb_mac_link_up().

I guess that the lock was added to the latter function to be on the
"safe side", but it looks like macb_set_tx_clk() does not need it.

The patch LGTM, thanks!

Paolo