Received: by 2002:a05:6a10:8a4d:0:0:0:0 with SMTP id dn13csp1281548pxb; Fri, 13 Aug 2021 19:35:34 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyeeSXAoktnEoQPfjzt2LYi3IiKp2sfXrXOhrH8PBda9N977toGdqJ3sw+nPjdfeLdZmVUF X-Received: by 2002:a92:bf02:: with SMTP id z2mr4021625ilh.213.1628908534625; Fri, 13 Aug 2021 19:35:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1628908534; cv=none; d=google.com; s=arc-20160816; b=xMIJvhgnzRrsW5oWCjfnrb7VbGT0z9LOzlBUG0CaDZ++hNvwQbvMwVEuXKE+Z35qnx v+ND8Ud9oWmF3059tzfYM0CFwzdAb03BeFPh1BYFJfN1xrJWPumsmTXXTdtPnT18FLRd AdNXoOuXZLkqNqk8A5tjCBjkcDEuz8GWhER9gHukGH1qSL4ODx+UcoR5Ase9+f6vgutO 1Oe16jhiB4aZ10nIzYwK0vNkcCxQf28ZnmK6maMaAnDJthuZFVp+UThKGuJ57oopPN5V b50jVcG96LlQ+18WA5qczEgTVAtDyaqGZjsiL5IIkLpsaOmV/SYAPj71JIz+8EoAjT6f 8U5Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject:dkim-signature; bh=PU2ldgkvXDgGX6HP14m612YtpPPuLizt585RKwAdEhM=; b=qGcP73EBa+uEItCCIKiuzD0cCGD77FErjL7p7aDHCBEbZ7dQ7/RqOBq6pMGKS7EGyf f9RvteYZ3Z3T+oE+3jch9Y6I2NG9zyyoUrN7CJ+yYwpfGXQC/TyFQUOheNpzpPQoPRNm uje0nmU0Au39yPv1n74q/Y09lBNpQ7l1ZIvfDwRgxe/CkRDuKEHg4GYD6ZyPdFzsfEXH mKuca4ZpUB298U1apfjqQPZUncneyqe+CP6PihNIlbvIHsONDLSjNgsFXQANKj8iPJ5u DExyvkxlHHsfjx99mP7at2Ea7ZEA2rTzJpyJkXDOjLejs5S0kwgqMfooyt0JQpjcRjEY eAog== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=UOPvdnsI; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id c1si3234118ilj.76.2021.08.13.19.35.23; Fri, 13 Aug 2021 19:35:34 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=UOPvdnsI; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236717AbhHNCdA (ORCPT + 99 others); Fri, 13 Aug 2021 22:33:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60484 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236545AbhHNCc7 (ORCPT ); Fri, 13 Aug 2021 22:32:59 -0400 Received: from mail-io1-xd2d.google.com (mail-io1-xd2d.google.com [IPv6:2607:f8b0:4864:20::d2d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2A074C0617AD for ; Fri, 13 Aug 2021 19:32:32 -0700 (PDT) Received: by mail-io1-xd2d.google.com with SMTP id d22so15745964ioy.11 for ; Fri, 13 Aug 2021 19:32:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=PU2ldgkvXDgGX6HP14m612YtpPPuLizt585RKwAdEhM=; b=UOPvdnsIe89kQQTVGUFkmBkWKiXvoGuV2j2U+CzJEH8HtnUesVCvpOnRVv7YVGfnGu GFKi5FGJKj26MbVjp4715QjO5iFMV+tqAoLEThdHrpmmLAC+TOTheTigKi2MIqA8eV+f geeuif3GnheJn6P4iOfw5xu0VmUIhS1ksdP2ihtd3jHJ/Acs09v9kmyaPUgDLj/GjFfc 9/C5NrHQIoItmV0jdkovGeR9NwTGBcEE1XImeS9FjNG4wXq7ECR/eq6glku3qSmyfZBs NWftk4GBMlXPWJ9EqXGWRRgNeLB9OgJfCrQ6c/TAbeK7GA+Vt5fI7hB5QOoBe7phr5cc Az8w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=PU2ldgkvXDgGX6HP14m612YtpPPuLizt585RKwAdEhM=; b=hDbiWhWz09YmmY4vLKLhZ0iwl6UWmGBcXjGJYe5nIH9TcvOYl12PZQfIWL/0gL7117 8xUsBWcINtcwFNskLcYsCoL9wEidVKe06BPh3AguDevO/eLdXHR3l3TqDxVP+8JqR4dL 0MrSblO+hSmasnNJYSlGgn4HegJb4mUwnnkbm0+9FKrC5dhtV4Kcxgk7kOnUehD4ikEn aibVovI1MvxWDeHBbF2SOcILTbln60Nb3XskpwH5lX2WrjvNBbsWSWzw77nzHdf2xZ17 5w4bL2UMOxaxLo1tL+OawExahLr8hn3jacLcEtkbgWNMXU5Uvt7liYOOa4ySzKyBJHaT WttA== X-Gm-Message-State: AOAM5339FLQa1yoB2YV/ZAKgpGZHmIj9QKIMoD7R91ABJaQPNGaIeV0B xTiL16gTFdGT7vQdAkVXLtZchEroirVHQC0c X-Received: by 2002:a6b:7209:: with SMTP id n9mr4373858ioc.67.1628908351375; Fri, 13 Aug 2021 19:32:31 -0700 (PDT) Received: from [172.22.22.26] (c-73-185-129-58.hsd1.mn.comcast.net. [73.185.129.58]) by smtp.googlemail.com with ESMTPSA id j20sm1941644ile.17.2021.08.13.19.32.30 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 13 Aug 2021 19:32:31 -0700 (PDT) Subject: Re: [PATCH net-next 3/6] net: ipa: re-enable transmit in PM WQ context To: Jakub Kicinski Cc: davem@davemloft.net, bjorn.andersson@linaro.org, evgreen@chromium.org, cpratapa@codeaurora.org, subashab@codeaurora.org, elder@kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org References: <20210812195035.2816276-1-elder@linaro.org> <20210812195035.2816276-4-elder@linaro.org> <20210813174406.5e7fc350@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com> From: Alex Elder Message-ID: <1e17b9a9-4f04-bae7-b113-26c1944abbfc@linaro.org> Date: Fri, 13 Aug 2021 21:32:30 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <20210813174406.5e7fc350@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 8/13/21 7:44 PM, Jakub Kicinski wrote: > On Thu, 12 Aug 2021 14:50:32 -0500 Alex Elder wrote: >> +/** >> + * ipa_modem_wake_queue_work() - enable modem netdev queue >> + * @work: Work structure >> + * >> + * Re-enable transmit on the modem network device. This is called >> + * in (power management) work queue context, scheduled when resuming >> + * the modem. >> + */ >> +static void ipa_modem_wake_queue_work(struct work_struct *work) >> +{ >> + struct ipa_priv *priv = container_of(work, struct ipa_priv, work); >> + >> + netif_wake_queue(priv->ipa->modem_netdev); >> +} >> + >> /** ipa_modem_resume() - resume callback for runtime_pm >> * @dev: pointer to device >> * >> @@ -205,7 +226,8 @@ void ipa_modem_resume(struct net_device *netdev) >> ipa_endpoint_resume_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_TX]); >> ipa_endpoint_resume_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]); >> >> - netif_wake_queue(netdev); >> + /* Arrange for the TX queue to be restarted */ >> + (void)queue_pm_work(&priv->work); >> } > > Why move the wake call to a work queue, tho? It's okay to call it > from any context. The issue isn't about the context in which is run (well, not really, not in the sense you're talking about). The issue has to do with the PM ->runtime_resume function running concurrent with the network ->start_xmit function. We need the hardware powered in ipa_start_xmit(). So we call pm_runtime_get(), which will not block and which will indicate in its return value whether power: is active (return is 1); will be active once the resume underway completes (return is -EINPROGRESS); will be active once suspend underway and a delayed resume completes (return is 0); or will be active once the newly-scheduled resume completes (return is 0, scheduled on PM work queue). We don't expect any other error, but if we get one we drop the packet. If the return value is 1, power is active and we transmit the packet. If the return value indicates power is not active, but will be, we stop the TX queue. No other packets should be passed to ->start_xmit until TX is started again. We wish to restart the TX queue when the ipa_runtime_resume() completes. Here is the call path: ipa_runtime_resume() This is the ->runtime_resume PM op ipa_endpoint_resume() ipa_modem_resume() netif_wake_queue() Without this patch The instant netif_wake_queue() is called, we start getting calls to ipa_start_xmit(), which again attempts to transmit the SKB that caused the queue to be stopped. And there is a good chance that when that is called, the ipa_runtime_resume() PM callback is still executing, and not complete. In that case, we'll *again* get an -EINPROGRESS back from pm_runtime_get() in ipa_start_xmit(), and we stop the TX queue again. Basically, we're stuck. All we need is for the TX queue to be started *after* the PM ->runtime_resume callback completes and marks the the PM runtime status ACTIVE. Scheduling this on the PM workqueue ensures this will happen then, if we happen to be running ipa_runtime_resume() via that workqueue. If not, there's a bit of a race but it should resolve (but I think here lies the specific race you mentioned in the other message). I'm open to other suggestions, but my hope was to at least explain why I did it this way. I'll think about it over the weekend and will send a new version of the series when I come up with a solution. Thank you very much for the review. -Alex