Date: Tue, 16 Apr 2019 10:27:09 +0200
From: Stanislaw Gruszka
To: Lorenzo Bianconi
Cc: Lorenzo Bianconi, nbd@nbd.name, linux-wireless@vger.kernel.org
Subject: Re: [PATCH] mt76: usb: fix possible memory leak during suspend/resume
Message-ID: <20190416082708.GB2833@redhat.com>
References: <20190412145442.GA2539@redhat.com>
	<20190412153509.GB3156@localhost.localdomain>
	<20190412162746.GC3156@localhost.localdomain>
	<20190413083050.GA7434@redhat.com>
	<20190413101056.GA7940@localhost.localdomain>
	<20190415115352.GA4143@redhat.com>
	<20190415150405.GA14449@localhost.localdomain>
	<20190416080436.GA2833@redhat.com>
	<20190416081241.GA11046@localhost.localdomain>
In-Reply-To: <20190416081241.GA11046@localhost.localdomain>
User-Agent: Mutt/1.11.3 (2019-02-01)
X-Mailing-List: linux-wireless@vger.kernel.org

On Tue, Apr 16, 2019 at 10:12:42AM +0200, Lorenzo Bianconi wrote:
> > On Mon, Apr 15, 2019 at 05:04:06PM +0200, Lorenzo Bianconi wrote:
> > > > On Sat, Apr 13, 2019 at 12:10:59PM +0200, Lorenzo Bianconi wrote:
> > > > > > On Fri, Apr 12, 2019 at 06:27:48PM +0200, Lorenzo Bianconi wrote:
> > > > > > > > > On Fri, Apr 12, 2019 at 02:27:16PM +0200, Lorenzo Bianconi wrote:
> > > > > > > > > > Disable mt76u_tx_tasklet at the end of mt76u_stop_queues in order to
> > > > > > > > > > properly deallocate all pending skbs during suspend/resume phase
> > > > > > > > >
> > > > > > > > > On suspend/resume tx skb's are processed after tasklet_enable()
> > > > > > > > > in resume callback. There is issue with device removal though
> > > > > > > > > (during suspend or otherwise).
> > > > > > > >
> > > > > > > > Hi Stanislaw,
> > > > > > > >
> > > > > > > > I guess the right moment to deallocate the skbs is during suspend since resume
> > > > > > > > can happen in very far future
> > > > > >
> > > > > > Yes, it's better to free on suspend, but in practice does not really matter since
> > > > > > system is disabled till resume.
> > > > > >
> > > > > > > > > > Fixes: b40b15e1521f ("mt76: add usb support to mt76 layer")
> > > > > > > > > > Signed-off-by: Lorenzo Bianconi
> > > > > > > > > > ---
> > > > > > > > > >  drivers/net/wireless/mediatek/mt76/usb.c | 4 ++--
> > > > > > > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c
> > > > > > > > > > index a3acc070063a..575207133775 100644
> > > > > > > > > > --- a/drivers/net/wireless/mediatek/mt76/usb.c
> > > > > > > > > > +++ b/drivers/net/wireless/mediatek/mt76/usb.c
> > > > > > > > > > @@ -842,10 +842,10 @@ static void mt76u_stop_tx(struct mt76_dev *dev)
> > > > > > > > > >  void mt76u_stop_queues(struct mt76_dev *dev)
> > > > > > > > > >  {
> > > > > > > > > >  	tasklet_disable(&dev->usb.rx_tasklet);
> > > > > > > > > > -	tasklet_disable(&dev->usb.tx_tasklet);
> > > > > > > > > > -
> > > > > > > > > >  	mt76u_stop_rx(dev);
> > > > > > > > > > +
> > > > > > > > > >  	mt76u_stop_tx(dev);
> > > > > > > > > > +	tasklet_disable(&dev->usb.tx_tasklet);
> > > > > > > > >
> > > > > > > > > If tasklet is scheduled and we disable it and never enable, we end up
> > > > > > > > > with infinite loop in tasklet_action_common(). This patch make the
> > > > > > > > > problem less reproducible since tasklet_disable() is moved after
> > > > > > > > > usb_kill_urb() -> tasklet_schedule(), but it is still possible.
> > > > > > > >
> > > > > > > > I can see the point here. Maybe we can just run tasklet_kill instead of
> > > > > > > > tasklet_disable here (at least for tx one)
> > > > > >
> > > > > > I think you have right as tasklet_kill() will wait for scheduled tasklet .
> > > > > > Originally in my patch (see below) I used wait_event as I thought
> > > > > > tasklet_kill() may prevent scheduled tasklet to be executed (hence cause
> > > > > > leak) but that seems to be not true.
> > > > >
> > > > > I agree with rx side (good catch!!), but on tx one I guess usb_kill_urb()
> > > > > is already waiting for tx pending so we just need to use tasklet_kill
> > > > > at the end of mt76u_stop_queues, in this way we will free pending skbs during
> > > > > suspend
> > > >
> > > > I looked more into that and there are some issues with this approach.
> > > > tx_tasklet do mt76_txq_schedule() which can queue tx frames. Also we
> > > > do not free skb's that require status check and dev->usb.stat_work
> > > > is already (correctly) stopped on mac80211.stop.
> > >
> > > right
> > >
> > > >
> > > > I'll use wait_event(dev->tx_wait) on mac80211 stop to handle those
> > > > issues correctly.
> > >
> > > ack
> > >
> > > >
> > > > Stanislaw
> > >
> > > during device removal I guess we should also flush skbs in status queue, doing
> > > something like (after commit 0b5f71304cd9 (mt76: introduce mt76_free_device
> > > routine))
> > >
> > > diff --git a/drivers/net/wireless/mediatek/mt76/mt76x0/usb.c b/drivers/net/wireless/mediatek/mt76/mt76x0/usb.c
> > > index 1ef00e971cfa..d4d1eb003148 100644
> > > --- a/drivers/net/wireless/mediatek/mt76/mt76x0/usb.c
> > > +++ b/drivers/net/wireless/mediatek/mt76/mt76x0/usb.c
> > > @@ -299,7 +299,7 @@ static void mt76x0_disconnect(struct usb_interface *usb_intf)
> > >  	if (!initalized)
> > >  		return;
> > >
> > > -	ieee80211_unregister_hw(dev->mt76.hw);
> > > +	mt76_unregister_device(&dev->mt76);
> >
> > mt76_unregister_device() free mmio dma. I've added mt76_tx_status_check()
> > on mt76u_stop_tx() routine instead.
>
> nope, after commit 0b5f71304cd98fb7b3b5b3a633e470bea979fe94
> (https://github.com/nbd168/wireless/commit/0b5f71304cd98fb7b3b5b3a633e470bea979fe94)
> it can be used even for usb

Ok, but as you pointed out before, the 'right moment to deallocate the skbs is
during suspend', so I still prefer to flush statuses there.

Stanislaw