The system hangs in the dsa_tag_8021q_port_setup() ->
stmmac_vlan_rx_add_vid() call path.

I found that calling pm_runtime_put() in stmmac_dvr_probe()
disables the clock.

First, when the kernel is compiled with CONFIG_PM=y, stmmac's
suspend/resume support is active.

Second, with stmmac as the DSA master, dsa_tag_8021q_port_setup()
calls back into stmmac_vlan_rx_add_vid() when the DSA driver starts.
The system then hangs because stmmac_vlan_rx_add_vid() accesses its
registers after stmmac's clock has been disabled.

Add pm_runtime_resume_and_get() to stmmac_vlan_rx_add_vid() so that
the clock is running while the registers are accessed.

Signed-off-by: Yan Wang <[email protected]>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index d7fcab057032..f9cd063f1fe3 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -6350,6 +6350,10 @@ static int stmmac_vlan_rx_add_vid(struct net_device *ndev, __be16 proto, u16 vid
 	bool is_double = false;
 	int ret;
 
+	ret = pm_runtime_resume_and_get(priv->device);
+	if (ret < 0)
+		return ret;
+
 	if (be16_to_cpu(proto) == ETH_P_8021AD)
 		is_double = true;
 
@@ -6357,16 +6361,18 @@ static int stmmac_vlan_rx_add_vid(struct net_device *ndev, __be16 proto, u16 vid
 	ret = stmmac_vlan_update(priv, is_double);
 	if (ret) {
 		clear_bit(vid, priv->active_vlans);
-		return ret;
+		goto err_pm_put;
 	}
 
 	if (priv->hw->num_vlan) {
 		ret = stmmac_add_hw_vlan_rx_fltr(priv, ndev, priv->hw, proto, vid);
 		if (ret)
-			return ret;
+			goto err_pm_put;
 	}
+err_pm_put:
+	pm_runtime_put(priv->device);
 
-	return 0;
+	return ret;
 }
 
 static int stmmac_vlan_rx_kill_vid(struct net_device *ndev, __be16 proto, u16 vid)
--
2.17.1
On 4/13/2023 8:06 AM, Yan Wang wrote:
> The system hangs in the dsa_tag_8021q_port_setup() ->
> stmmac_vlan_rx_add_vid() call path.
>
> I found that calling pm_runtime_put() in stmmac_dvr_probe()
> disables the clock.
>
> First, when the kernel is compiled with CONFIG_PM=y, stmmac's
> suspend/resume support is active.
>
> Second, with stmmac as the DSA master, dsa_tag_8021q_port_setup()
> calls back into stmmac_vlan_rx_add_vid() when the DSA driver starts.
> The system then hangs because stmmac_vlan_rx_add_vid() accesses its
> registers after stmmac's clock has been disabled.
>
> Add pm_runtime_resume_and_get() to stmmac_vlan_rx_add_vid() so that
> the clock is running while the registers are accessed.
>
> Signed-off-by: Yan Wang <[email protected]>
This looks identical to the net fix you posted at [1]. I don't think we
need both?
[1]:
https://lore.kernel.org/netdev/KL1PR01MB5448020DE191340AE64530B0E6989@KL1PR01MB5448.apcprd01.prod.exchangelabs.com/
On 4/13/23 10:07, Jacob Keller wrote:
>
>
> On 4/13/2023 8:06 AM, Yan Wang wrote:
>> The system hangs in the dsa_tag_8021q_port_setup() ->
>> stmmac_vlan_rx_add_vid() call path.
>>
>> I found that calling pm_runtime_put() in stmmac_dvr_probe()
>> disables the clock.
>>
>> First, when the kernel is compiled with CONFIG_PM=y, stmmac's
>> suspend/resume support is active.
>>
>> Second, with stmmac as the DSA master, dsa_tag_8021q_port_setup()
>> calls back into stmmac_vlan_rx_add_vid() when the DSA driver starts.
>> The system then hangs because stmmac_vlan_rx_add_vid() accesses its
>> registers after stmmac's clock has been disabled.
>>
>> Add pm_runtime_resume_and_get() to stmmac_vlan_rx_add_vid() so that
>> the clock is running while the registers are accessed.
>>
>> Signed-off-by: Yan Wang <[email protected]>
>
> This looks identical to the net fix you posted at [1]. I don't think we
> need both?
>
> [1]:
> https://lore.kernel.org/netdev/KL1PR01MB5448020DE191340AE64530B0E6989@KL1PR01MB5448.apcprd01.prod.exchangelabs.com/
Unfortunately both still lack a proper Fixes: tag, and this is a bug fix.
--
Florian
On 4/13/2023 10:15 AM, Florian Fainelli wrote:
> On 4/13/23 10:07, Jacob Keller wrote:
>>
>>
>> On 4/13/2023 8:06 AM, Yan Wang wrote:
>>> The system hangs in the dsa_tag_8021q_port_setup() ->
>>> stmmac_vlan_rx_add_vid() call path.
>>>
>>> I found that calling pm_runtime_put() in stmmac_dvr_probe()
>>> disables the clock.
>>>
>>> First, when the kernel is compiled with CONFIG_PM=y, stmmac's
>>> suspend/resume support is active.
>>>
>>> Second, with stmmac as the DSA master, dsa_tag_8021q_port_setup()
>>> calls back into stmmac_vlan_rx_add_vid() when the DSA driver starts.
>>> The system then hangs because stmmac_vlan_rx_add_vid() accesses its
>>> registers after stmmac's clock has been disabled.
>>>
>>> Add pm_runtime_resume_and_get() to stmmac_vlan_rx_add_vid() so that
>>> the clock is running while the registers are accessed.
>>>
>>> Signed-off-by: Yan Wang <[email protected]>
>>
>> This looks identical to the net fix you posted at [1]. I don't think we
>> need both?
>>
>> [1]:
>> https://lore.kernel.org/netdev/KL1PR01MB5448020DE191340AE64530B0E6989@KL1PR01MB5448.apcprd01.prod.exchangelabs.com/
>
> Unfortunately both still lack a proper Fixes: tag, and this is a bug fix.
Good point. Yan, please identify the appropriate fixes tag and send a v2
of the net fix.
Thanks,
Jake
On Thu, Apr 13, 2023 at 10:15:55AM -0700, Florian Fainelli wrote:
> On 4/13/23 10:07, Jacob Keller wrote:
> > On 4/13/2023 8:06 AM, Yan Wang wrote:
> > > The system hangs in the dsa_tag_8021q_port_setup() ->
> > > stmmac_vlan_rx_add_vid() call path.
> > >
> > > I found that calling pm_runtime_put() in stmmac_dvr_probe()
> > > disables the clock.
> > >
> > > First, when the kernel is compiled with CONFIG_PM=y, stmmac's
> > > suspend/resume support is active.
> > >
> > > Second, with stmmac as the DSA master, dsa_tag_8021q_port_setup()
> > > calls back into stmmac_vlan_rx_add_vid() when the DSA driver starts.
> > > The system then hangs because stmmac_vlan_rx_add_vid() accesses its
> > > registers after stmmac's clock has been disabled.
> > >
> > > Add pm_runtime_resume_and_get() to stmmac_vlan_rx_add_vid() so that
> > > the clock is running while the registers are accessed.
> > >
> > > Signed-off-by: Yan Wang <[email protected]>
> >
> > This looks identical to the net fix you posted at [1]. I don't think we
> > need both?
> >
> > [1]:
> > https://lore.kernel.org/netdev/KL1PR01MB5448020DE191340AE64530B0E6989@KL1PR01MB5448.apcprd01.prod.exchangelabs.com/
>
> Unfortunately both still lack a proper Fixes: tag, and this is a bug fix.
> --
> Florian
>
I guess that would be:

Fixes: 5ec55823438e ("net: stmmac: add clocks management for gmac driver")

although in this case, that would be only part of the story. That commit
split the runtime PM handling between stmmac_vlan_rx_add_vid() and
stmmac_vlan_rx_kill_vid() in a strange way, where an added VLAN RX
filter takes a refcount on the device, and a deleted filter drops
the refcount.

That is... strange?! But it worked in a way, I guess.

Then commit b3dcb3127786 ("net: stmmac: correct clocks enabled in
stmmac_vlan_rx_kill_vid()") came a few months later and blamed that
oddity on a bad merge conflict resolution... ?! Basically, from what I
can tell, it is this later commit that broke things, by using
runtime PM only for stmmac_vlan_rx_kill_vid() but not for
stmmac_vlan_rx_add_vid().
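
To make the refcount story above easier to follow, here is a rough
userspace model of the three schemes. It is only a sketch, not kernel
code: struct mock_dev and every mock_*/add_vid_*/kill_vid_* name are
made up for illustration, a "bad access" stands in for the hang, and I
am assuming that b3dcb3127786 left stmmac_vlan_rx_kill_vid() with a
get/put pair around its register accesses.

```c
/* Hypothetical stand-in for the device's runtime PM state. */
struct mock_dev {
	int usage;      /* runtime PM usage count */
	int bad_access; /* register accesses made with the clock off */
};

void mock_get(struct mock_dev *d) { d->usage++; }
void mock_put(struct mock_dev *d) { d->usage--; }

/* A register access is only safe while a reference is held. */
void mock_reg_access(struct mock_dev *d)
{
	if (d->usage <= 0)
		d->bad_access++; /* on real hardware: the system hang */
}

/* Scheme A (5ec55823438e as described): add takes, kill drops. */
void add_vid_split(struct mock_dev *d)
{
	mock_get(d);
	mock_reg_access(d);
}

void kill_vid_split(struct mock_dev *d)
{
	mock_reg_access(d);
	mock_put(d);
}

/* Scheme B (after b3dcb3127786, assumed): only kill is bracketed. */
void add_vid_unguarded(struct mock_dev *d)
{
	mock_reg_access(d); /* no reference held here */
}

void kill_vid_bracketed(struct mock_dev *d)
{
	mock_get(d);
	mock_reg_access(d);
	mock_put(d);
}

/* Scheme C (the proposed patch): add is bracketed as well. */
void add_vid_bracketed(struct mock_dev *d)
{
	mock_get(d);
	mock_reg_access(d);
	mock_put(d);
}
```

In this model, scheme A never touches registers with the clock off but
keeps the device awake for as long as a filter exists, scheme B makes
the unguarded access in the add path, and scheme C leaves the usage
count balanced after every call.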