Subject: Re: [PATCH] usbnet: Fix two races between usbnet_stop() and the BH
To: David Miller
Cc: oneukum@suse.com, netdev@vger.kernel.org, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org
From: Eugene Shatokhin
Organization: ROSA
Message-ID: <55D436D5.6010105@rosalab.ru>
Date: Wed, 19 Aug 2015 10:57:09 +0300
In-Reply-To: <20150818.185407.1667358232705414236.davem@davemloft.net>
References: <55CE1D7E.2070400@rosalab.ru> <1439571516-11862-1-git-send-email-eugene.shatokhin@rosalab.ru> <20150818.185407.1667358232705414236.davem@davemloft.net>

On 19.08.2015 04:54, David Miller wrote:
> From: Eugene Shatokhin
> Date: Fri, 14 Aug 2015 19:58:36 +0300
>
>> 2. The second race is on dev->flags.
>>
>> dev->flags is set to 0 here:
>> *0 usbnet_stop (usbnet.c:816)
>>     /* deferred work (task, timer, softirq) must also stop.
>>      * can't flush_scheduled_work() until we drop rtnl (later),
>>      * else workers could deadlock; so make workers a NOP.
>>      */
>>     dev->flags = 0;
>>     del_timer_sync (&dev->delay);
>>     tasklet_kill (&dev->bh);
>>
>> And here, the code clears EVENT_RX_KILL bit in dev->flags, which may
>> execute concurrently with the above operation:
>> *0 clear_bit (bitops.h:113, inlined)
>> *1 usbnet_bh (usbnet.c:1475)
>>     /* restart RX again after disabling due to high error rate */
>>     clear_bit(EVENT_RX_KILL, &dev->flags);
>>
>> It seems, setting dev->flags to 0 is not necessarily atomic w.r.t.
>> clear_bit() and other bit operations with dev->flags. It is safer to
>> make it atomic and this way, make the race harmless.
>>
>> While at it, the checking of EVENT_NO_RUNTIME_PM bit of dev->flags in
>> usbnet_stop() was fixed too: the bit should be checked before dev->flags
>> is cleared.
>
> The fix for this is excessive.
>
> Instead of all of this madness, looping over expensive clear_bit()
> atomics, just do whatever it takes to make sure that usbnet_bh() is
> quiesced and cannot execute any more. Then you can safely clear
> dev->flags normally.
>

If I understand correctly, dev->flags has to be set to 0 first, one way
or another, precisely to make sure that usbnet_bh() is not scheduled
again. That is what this madness is for.

tasklet_kill() will then wait for the already running instance of
usbnet_bh(), if there is one. After that, the BH is guaranteed not to be
running and will not be rescheduled.

As for the performance concerns, I doubt that usbnet_stop() is anywhere
on the critical path. I have been testing this patch for some time and
haven't seen any new performance issues with it yet.

If needed, the time spent in usbnet_stop() can be measured before and
after this patch and compared, to estimate the impact on overall
performance.
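
For illustration only, the ordering discussed above (check
EVENT_NO_RUNTIME_PM first, then clear dev->flags with per-bit atomic
operations) can be sketched in userspace with C11 atomics. This is not
the driver code and not the patch itself: the sketch_*() helpers are
hypothetical stand-ins for the kernel's set_bit(), clear_bit() and
test_bit(), and the bit numbers are made up for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption: illustrative bit numbers, not the values used by usbnet. */
enum { EVENT_RX_KILL = 0, EVENT_NO_RUNTIME_PM = 1 };

#define FLAG_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical userspace analogue of set_bit(): one atomic read-modify-write. */
static void sketch_set_bit(unsigned nr, atomic_ulong *flags)
{
    assert(nr < FLAG_BITS);
    atomic_fetch_or(flags, 1UL << nr);
}

/* Hypothetical userspace analogue of clear_bit(). */
static void sketch_clear_bit(unsigned nr, atomic_ulong *flags)
{
    assert(nr < FLAG_BITS);
    atomic_fetch_and(flags, ~(1UL << nr));
}

/* Hypothetical userspace analogue of test_bit(). */
static bool sketch_test_bit(unsigned nr, atomic_ulong *flags)
{
    assert(nr < FLAG_BITS);
    return (atomic_load(flags) >> nr) & 1UL;
}

/* Clear the whole word one bit at a time, so that every update is an
 * atomic operation and cannot tear against a concurrent clear_bit(). */
static void sketch_clear_all_bits(atomic_ulong *flags)
{
    for (unsigned nr = 0; nr < FLAG_BITS; nr++)
        sketch_clear_bit(nr, flags);
}

/* Stop-path ordering: sample the PM bit BEFORE the flags are cleared,
 * then clear them. Returns the sampled value of EVENT_NO_RUNTIME_PM. */
static bool sketch_stop(atomic_ulong *flags)
{
    bool no_runtime_pm = sketch_test_bit(EVENT_NO_RUNTIME_PM, flags);

    sketch_clear_all_bits(flags);
    /* In the driver, del_timer_sync() and tasklet_kill() follow here. */
    return no_runtime_pm;
}
```

Calling sketch_stop() on a word with EVENT_NO_RUNTIME_PM set returns
true and leaves the word at 0; if the bit were tested after the clear,
it would always read as unset.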

Regards,
Eugene