2021-08-12 20:44:21

by Alex Elder

[permalink] [raw]
Subject: [PATCH net-next 4/6] net: ipa: ensure hardware has power in ipa_start_xmit()

We need to ensure the hardware is powered when we transmit a packet.
But if it's not, we can't block to wait for it. So asynchronously
request power in ipa_start_xmit(), and only proceed if the return
value indicates the power state is active.

If the hardware is not active, a runtime resume request will have
been initiated. In that case, stop the network stack from further
transmit attempts until the resume completes. Return NETDEV_TX_BUSY,
to retry sending the packet once the queue is restarted.

If the power request returns an error (other than -EINPROGRESS,
which just means a resume requested elsewhere isn't complete), just
drop the packet.

Signed-off-by: Alex Elder <[email protected]>
---
drivers/net/ipa/ipa_modem.c | 30 +++++++++++++++++++++++++++++-
1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ipa/ipa_modem.c b/drivers/net/ipa/ipa_modem.c
index 0a3b034614b61..aa1b483d9f7db 100644
--- a/drivers/net/ipa/ipa_modem.c
+++ b/drivers/net/ipa/ipa_modem.c
@@ -106,6 +106,7 @@ static int ipa_start_xmit(struct sk_buff *skb, struct net_device *netdev)
struct ipa_endpoint *endpoint;
struct ipa *ipa = priv->ipa;
u32 skb_len = skb->len;
+ struct device *dev;
int ret;

if (!skb_len)
@@ -115,7 +116,31 @@ static int ipa_start_xmit(struct sk_buff *skb, struct net_device *netdev)
if (endpoint->data->qmap && skb->protocol != htons(ETH_P_MAP))
goto err_drop_skb;

+ /* The hardware must be powered for us to transmit */
+ dev = &ipa->pdev->dev;
+ ret = pm_runtime_get(dev);
+ if (ret < 1) {
+ /* If a resume won't happen, just drop the packet */
+ if (ret < 0 && ret != -EINPROGRESS) {
+ pm_runtime_put_noidle(dev);
+ goto err_drop_skb;
+ }
+
+ /* No power (yet). Stop the network stack from transmitting
+ * until we're resumed; ipa_modem_resume() arranges for the
+ * TX queue to be started again.
+ */
+ netif_stop_queue(netdev);
+
+ (void)pm_runtime_put(dev);
+
+ return NETDEV_TX_BUSY;
+ }
+
ret = ipa_endpoint_skb_tx(endpoint, skb);
+
+ (void)pm_runtime_put(dev);
+
if (ret) {
if (ret != -E2BIG)
return NETDEV_TX_BUSY;
@@ -201,7 +226,10 @@ void ipa_modem_suspend(struct net_device *netdev)
*
* Re-enable transmit on the modem network device. This is called
* in (power management) work queue context, scheduled when resuming
- * the modem.
+ * the modem. We can't enable the queue directly in ipa_modem_resume()
+ * because transmits restart the instant the queue is awakened; but the
+ * device power state won't be ACTIVE until *after* ipa_modem_resume()
+ * returns.
*/
static void ipa_modem_wake_queue_work(struct work_struct *work)
{
--
2.27.0


2021-08-14 00:47:50

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net-next 4/6] net: ipa: ensure hardware has power in ipa_start_xmit()

On Thu, 12 Aug 2021 14:50:33 -0500 Alex Elder wrote:
> + /* The hardware must be powered for us to transmit */
> + dev = &ipa->pdev->dev;
> + ret = pm_runtime_get(dev);
> + if (ret < 1) {
> + /* If a resume won't happen, just drop the packet */
> + if (ret < 0 && ret != -EINPROGRESS) {
> + pm_runtime_put_noidle(dev);
> + goto err_drop_skb;
> + }

This is racy, what if the pm work gets scheduled on another CPU and
calls wake right here (i.e. before you call netif_stop_queue())?
The queue may never get woken up?

> + /* No power (yet). Stop the network stack from transmitting
> + * until we're resumed; ipa_modem_resume() arranges for the
> + * TX queue to be started again.
> + */
> + netif_stop_queue(netdev);
> +
> + (void)pm_runtime_put(dev);
> +
> + return NETDEV_TX_BUSY;

2021-08-14 02:26:53

by Alex Elder

[permalink] [raw]
Subject: Re: [PATCH net-next 4/6] net: ipa: ensure hardware has power in ipa_start_xmit()

On 8/13/21 7:46 PM, Jakub Kicinski wrote:
> On Thu, 12 Aug 2021 14:50:33 -0500 Alex Elder wrote:
>> + /* The hardware must be powered for us to transmit */
>> + dev = &ipa->pdev->dev;
>> + ret = pm_runtime_get(dev);
>> + if (ret < 1) {
>> + /* If a resume won't happen, just drop the packet */
>> + if (ret < 0 && ret != -EINPROGRESS) {
>> + pm_runtime_put_noidle(dev);
>> + goto err_drop_skb;
>> + }
>
> This is racy, what if the pm work gets scheduled on another CPU and
> calls wake right here (i.e. before you call netif_stop_queue())?
> The queue may never get woken up?

I haven't been seeing this happen but I think you may be right.

I did think about this race, but I think I was relying on the
PM work queue to somehow avoid the problem. I need to think
about this again after a good night's sleep. I might need
to add an atomic flag or something.

-Alex

>> + /* No power (yet). Stop the network stack from transmitting
>> + * until we're resumed; ipa_modem_resume() arranges for the
>> + * TX queue to be started again.
>> + */
>> + netif_stop_queue(netdev);
>> +
>> + (void)pm_runtime_put(dev);
>> +
>> + return NETDEV_TX_BUSY;

2021-08-16 14:18:04

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net-next 4/6] net: ipa: ensure hardware has power in ipa_start_xmit()

On Fri, 13 Aug 2021 21:25:23 -0500 Alex Elder wrote:
> > This is racy, what if the pm work gets scheduled on another CPU and
> > calls wake right here (i.e. before you call netif_stop_queue())?
> > The queue may never get woken up?
>
> I haven't been seeing this happen but I think you may be right.
>
> I did think about this race, but I think I was relying on the
> PM work queue to somehow avoid the problem. I need to think
> about this again after a good night's sleep. I might need
> to add an atomic flag or something.

Maybe add a spin lock? Seems like the whole wake up path will be
expensive enough for a spin lock to be in the noise. You can always
add complexity later.

2021-08-16 14:25:05

by Alex Elder

[permalink] [raw]
Subject: Re: [PATCH net-next 4/6] net: ipa: ensure hardware has power in ipa_start_xmit()

On 8/16/21 9:15 AM, Jakub Kicinski wrote:
> On Fri, 13 Aug 2021 21:25:23 -0500 Alex Elder wrote:
>>> This is racy, what if the pm work gets scheduled on another CPU and
>>> calls wake right here (i.e. before you call netif_stop_queue())?
>>> The queue may never get woken up?
>>
>> I haven't been seeing this happen but I think you may be right.
>>
>> I did think about this race, but I think I was relying on the
>> PM work queue to somehow avoid the problem. I need to think
>> about this again after a good night's sleep. I might need
>> to add an atomic flag or something.
>
> Maybe add a spin lock? Seems like the whole wake up path will be
> expensive enough for a spin lock to be in the noise. You can always
> add complexity later.

Exactly what I just decided after trying to work out a
clever way without using a spinlock... I'll be sending
out a fix today. Thanks.

-Alex

2021-08-16 17:59:55

by Alex Elder

[permalink] [raw]
Subject: Re: [PATCH net-next 4/6] net: ipa: ensure hardware has power in ipa_start_xmit()

On 8/16/21 9:20 AM, Alex Elder wrote:
> On 8/16/21 9:15 AM, Jakub Kicinski wrote:
>> On Fri, 13 Aug 2021 21:25:23 -0500 Alex Elder wrote:
>>>> This is racy, what if the pm work gets scheduled on another CPU and
>>>> calls wake right here (i.e. before you call netif_stop_queue())?
>>>> The queue may never get woken up?
>>>
>>> I haven't been seeing this happen but I think you may be right.
>>>
>>> I did think about this race, but I think I was relying on the
>>> PM work queue to somehow avoid the problem.  I need to think
>>> about this again after a good night's sleep.  I might need
>>> to add an atomic flag or something.
>>
>> Maybe add a spin lock?  Seems like the whole wake up path will be
>> expensive enough for a spin lock to be in the noise. You can always
>> add complexity later.
>
> Exactly what I just decided after trying to work out a
> clever way without using a spinlock...  I'll be sending
> out a fix today.  Thanks.

I'm finding this isn't an easy problem to solve (or even think
about). While I ponder the best course of action I'm going
to send out another series (i.e., *before* I send a fix for
this issue) because I'd like to get everything I have out
for review this week. I *will* address this potential race
one way or another, possibly later today.

-Alex

>
>                     -Alex
>

2021-08-16 20:21:18

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net-next 4/6] net: ipa: ensure hardware has power in ipa_start_xmit()

On Mon, 16 Aug 2021 12:56:40 -0500 Alex Elder wrote:
> I'm finding this isn't an easy problem to solve (or even think
> about). While I ponder the best course of action I'm going
> to send out another series (i.e., *before* I send a fix for
> this issue) because I'd like to get everything I have out
> for review this week. I *will* address this potential race
> one way or another, possibly later today.

Alright, thanks for the heads up.