From: Bjørn Mork
To: Eugene Shatokhin
Cc: David Miller, oneukum@suse.com, netdev@vger.kernel.org, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] usbnet: Fix two races between usbnet_stop() and the BH
Date: Wed, 19 Aug 2015 12:54:29 +0200

Eugene Shatokhin writes:

> 19.08.2015 04:54, David Miller wrote:
>> From: Eugene Shatokhin
>> Date: Fri, 14 Aug 2015 19:58:36 +0300
>>
>>> 2. The second race is on dev->flags.
>>>
>>> dev->flags is set to 0 here:
>>> *0 usbnet_stop (usbnet.c:816)
>>>     /* deferred work (task, timer, softirq) must also stop.
>>>      * can't flush_scheduled_work() until we drop rtnl (later),
>>>      * else workers could deadlock; so make workers a NOP.
>>>      */
>>>     dev->flags = 0;
>>>     del_timer_sync (&dev->delay);
>>>     tasklet_kill (&dev->bh);
>>>
>>> And here, the code clears EVENT_RX_KILL bit in dev->flags, which may
>>> execute concurrently with the above operation:
>>> *0 clear_bit (bitops.h:113, inlined)
>>> *1 usbnet_bh (usbnet.c:1475)
>>>     /* restart RX again after disabling due to high error rate */
>>>     clear_bit(EVENT_RX_KILL, &dev->flags);
>>>
>>> It seems, setting dev->flags to 0 is not necessarily atomic w.r.t.
>>> clear_bit() and other bit operations with dev->flags. It is safer to
>>> make it atomic and this way, make the race harmless.
>>>
>>> While at it, the checking of EVENT_NO_RUNTIME_PM bit of dev->flags in
>>> usbnet_stop() was fixed too: the bit should be checked before dev->flags
>>> is cleared.
>>
>> The fix for this is excessive.
>>
>> Instead of all of this madness, looping over expensive clear_bit()
>> atomics, just do whatever it takes to make sure that usbnet_bh() is
>> quiesced and cannot execute any more. Then you can safely clear
>> dev->flags normally.
>>
>
> If I understand it correctly, it is to make sure usbnet_bh() is not
> scheduled again that dev->flags should be set to 0 first, one way or
> another. That is what this madness is for.

Assuming there is a race which may reorder these, exactly what
difference does it make wrt EVENT_RX_KILL if you do

a) clear_bit(EVENT_RX_KILL, &dev->flags);
   dev->flags = 0;

or

b) dev->flags = 0;
   clear_bit(EVENT_RX_KILL, &dev->flags);

AFAICS, the result will be a cleared EVENT_RX_KILL bit in either case.

The EVENT_NO_RUNTIME_PM bug should definitely be fixed. Please split
that out as a separate fix.
It's a separate issue, and should be backported to all maintained stable
releases it applies to (anything from v3.8 and newer)


Bjørn
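
The a) and b) orderings from the message can be played out in a small
userspace sketch. This is only an illustration under stated assumptions:
sketch_clear_bit() is a hypothetical, non-atomic stand-in for the kernel's
clear_bit(), the bit numbers are assumed for the example and should be
checked against usbnet.h, and the code models the two sequential orders
only, not real concurrency between usbnet_stop() and usbnet_bh().

```c
#include <assert.h>

/* Bit numbers assumed for illustration; verify against usbnet.h. */
#define EVENT_NO_RUNTIME_PM 9
#define EVENT_RX_KILL       10

/* Hypothetical userspace stand-in for the kernel's clear_bit(). */
static void sketch_clear_bit(int nr, unsigned long *addr)
{
	*addr &= ~(1UL << nr);
}

/* Order a): the BH clears the bit first, then the stop path zeroes flags. */
static unsigned long order_a(unsigned long flags)
{
	sketch_clear_bit(EVENT_RX_KILL, &flags);
	flags = 0;
	return flags;
}

/* Order b): the stop path zeroes flags first, then the BH clears the bit. */
static unsigned long order_b(unsigned long flags)
{
	flags = 0;
	sketch_clear_bit(EVENT_RX_KILL, &flags);
	return flags;
}

/* Both orders end with EVENT_RX_KILL (and every other bit) cleared. */
static void check_both_orders(void)
{
	unsigned long start = (1UL << EVENT_RX_KILL) |
			      (1UL << EVENT_NO_RUNTIME_PM);

	assert(order_a(start) == 0);
	assert(order_b(start) == 0);
}
```

Calling check_both_orders() passes silently, which matches the point made
in the message: as far as EVENT_RX_KILL is concerned, either order leaves
the bit cleared.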