Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755209AbbHXRBE (ORCPT ); Mon, 24 Aug 2015 13:01:04 -0400 Received: from collab.rosalab.ru ([195.19.76.181]:35915 "EHLO collab.rosalab.ru" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753858AbbHXRBB (ORCPT ); Mon, 24 Aug 2015 13:01:01 -0400 Subject: Re: [PATCH] usbnet: Fix two races between usbnet_stop() and the BH To: =?UTF-8?Q?Bj=c3=b8rn_Mork?= References: <55CE1D7E.2070400@rosalab.ru> <1439571516-11862-1-git-send-email-eugene.shatokhin@rosalab.ru> <20150818.185407.1667358232705414236.davem@davemloft.net> <55D436D5.6010105@rosalab.ru> <87k2sreefu.fsf@nemi.mork.no> <55D46F85.50608@rosalab.ru> <877fore9yn.fsf@nemi.mork.no> <55DB0C1C.5030607@rosalab.ru> <87io84byr8.fsf@nemi.mork.no> Cc: David Miller , oneukum@suse.com, netdev@vger.kernel.org, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org From: Eugene Shatokhin Organization: ROSA Message-ID: <55DB4DC0.7030807@rosalab.ru> Date: Mon, 24 Aug 2015 20:00:48 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.0.1 MIME-Version: 1.0 In-Reply-To: <87io84byr8.fsf@nemi.mork.no> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6742 Lines: 163 24.08.2015 16:29, Bjørn Mork пишет: > Eugene Shatokhin writes: > >> 19.08.2015 15:31, Bjørn Mork пишет: >>> Eugene Shatokhin writes: >>> >>>> The problem is not in the reordering but rather in the fact that >>>> "dev->flags = 0" is not necessarily atomic >>>> w.r.t. "clear_bit(EVENT_RX_KILL, &dev->flags)", and vice versa. >>>> >>>> So the following might be possible, although unlikely: >>>> >>>> CPU0 CPU1 >>>> clear_bit: read dev->flags >>>> clear_bit: clear EVENT_RX_KILL in the read value >>>> >>>> dev->flags=0; >>>> >>>> clear_bit: write updated dev->flags >>>> >>>> As a result, dev->flags may become non-zero again. >>> >>> Ah, right. Thanks for explaining. >>> >>>> I cannot prove yet that this is an impossible situation. If anyone >>>> can, please explain. If so, this part of the patch will not be needed. >>> >>> I wonder if we could simply move the dev->flags = 0 down a few lines to >>> fix both issues? It doesn't seem to do anything useful except for >>> resetting the flags to a sane initial state after the device is down. >>> >>> Stopping the tasklet rescheduling etc depends only on netif_running(), >>> which will be false when usbnet_stop is called. There is no need to >>> touch dev->flags for this to happen. >> >> That was one of the first ideas we discussed here. Unfortunately, it >> is probably not so simple. >> >> Setting dev->flags to 0 makes some delayed operations do nothing and, >> among other things, not to reschedule usbnet_bh(). > > Yes, but I believe that is merely a side effect. You should never need > to clear multiple flags to get the desired behaviour. > >> As you can see in drivers/net/usb/usbnet.c, usbnet_bh() can be called >> as a tasklet function and as a timer function in a number of >> situations (look for the usage of dev->bh and dev->delay there). >> >> netif_running() is indeed false when usbnet_stop() runs, usbnet_stop() >> also disables Tx. This seems to be enough for many cases where >> usbnet_bh() is scheduled, but I am not so sure about the remaining >> ones, namely: >> >> 1. A work function, usbnet_deferred_kevent(), may reschedule >> usbnet_bh(). Looks like the workqueue is only stopped in >> usbnet_disconnect(), so a work item might be processed while >> usbnet_stop() works. Setting dev->flags to 0 makes the work function >> do nothing, by the way. See also the comment in usbnet_stop() about >> this. >> >> A work item may be placed to this workqueue in a number of ways, by >> both usbnet module and the mini-drivers. It is not too easy to track >> all these situations. > > That's an understatement :) > > > >> 2. rx_complete() and tx_complete() may schedule execution of >> usbnet_bh() as a tasklet or a timer function. These two are URB >> completion callbacks. >> >> It seems, new Rx and Tx URBs cannot be submitted when usbnet_stop() >> clears dev->flags, indeed. But it does not prevent the completion >> handlers for the previously submitted URBs from running concurrently >> with usbnet_stop(). The latter waits for them to complete (via >> usbnet_terminate_urbs(dev)) but only if FLAG_AVOID_UNLINK_URBS is not >> set in info->flags. rndis_wlan, however, sets this flag for a few >> hardware models. So - no guarantees here as well. > > FLAG_AVOID_UNLINK_URBS looks like it should be replaced by the newer > ability to keep the status urb active. I believe that must have been the > real reason for adding it, based on the commit message and the effect > the flag will have: > > commit 1487cd5e76337555737cbc55d7d83f41460d198f > Author: Jussi Kivilinna > Date: Thu Jul 30 19:41:20 2009 +0300 > > usbnet: allow "minidriver" to prevent urb unlinking on usbnet_stop > > rndis_wlan devices freeze after running usbnet_stop several times. It appears > that firmware freezes in state where it does not respond to any RNDIS commands > and device have to be physically unplugged/replugged. This patch lets > minidrivers to disable unlink_urbs on usbnet_stop through new info flag. > > Signed-off-by: Jussi Kivilinna > Cc: David Brownell > Signed-off-by: John W. Linville > > > > The rx urbs will not be resubmitted in any case, and there are of course > no tx urbs being submitted. So the only effect of this flag is on the > status/interrupt urb, which I can imagine some RNDIS devices wants > active all the time. > > So FLAG_AVOID_UNLINK_URBS should probably be removed and replaced calls > to usbnet_status_start() and usbnet_status_stop(). This will require > testing on some of the devices with the original firmware problem > however. > > In any case: I do not think this flag should be considered when trying > to make usbnet_stop behaviour saner. It's only purpose is to > deliberately break usbnet_stop by not actually stopping. > > >> If someone could list the particular bits of dev->flags that should be >> cleared to make sure no deferred call could reschedule usbnet_bh(), >> etc... Well, it would be enough to clear these first and use >> dev->flags = 0 later, after tasklet_kill() and del_timer_sync(). I >> cannot point out these particular bits now. > > > I don't think any of the flags must be cleared. The sequence > > dev_close(dev->net); > usbnet_terminate_urbs(dev); > usbnet_purge_paused_rxq(dev); > del_timer_sync (&dev->delay); > tasklet_kill (&dev->bh); > > should prevent any rescheduling of usbnet_bh If so, then, I suppose, one could ignore that FLAG_AVOID_UNLINK_URBS for now and just move dev->flags = 0 down as you suggested and as we thought before. The patch will become simpler, indeed. > >> Besides, it is possible, although unlikely, that new event bits will >> be added to dev->flags in the future. And one will need to keep track >> of these to see if they should be cleared as well. I'd prever to play >> safer for now and clear them all. > > I don't think we should ever make a flag which will _have_ to be reset > for usbnet_stop. The only reason for clearing all flags is to reset the > state before the next open. > > Yes, I see that we currently need to clear EVENT_DEV_OPEN in > usbnet_stop, but I really don't see what this flag gives us which isn't > already provided by netif_running(). It looks like a duplicate. > Regards, Eugene -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/