Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754759AbbGXRiU (ORCPT ); Fri, 24 Jul 2015 13:38:20 -0400 Received: from collab.rosalab.ru ([195.19.76.181]:46820 "EHLO collab.rosalab.ru" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754379AbbGXRiR (ORCPT ); Fri, 24 Jul 2015 13:38:17 -0400 Subject: Re: Several races in "usbnet" module (kernel 4.1.x) To: Oliver Neukum References: <55AD3A41.2040100@rosalab.ru> <1437480243.3823.5.camel@suse.com> Cc: netdev@vger.kernel.org, linux-usb@vger.kernel.org, LKML From: Eugene Shatokhin Organization: ROSA Message-ID: <55B27805.90601@rosalab.ru> Date: Fri, 24 Jul 2015 20:38:13 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.0.1 MIME-Version: 1.0 In-Reply-To: <1437480243.3823.5.camel@suse.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5531 Lines: 159 21.07.2015 15:04, Oliver Neukum пишет: > On Mon, 2015-07-20 at 21:13 +0300, Eugene Shatokhin wrote: >> Hi, >> >> I have recently found several data races in "usbnet" module, checked on >> vanilla kernel 4.1.0 on x86_64. The races do actually happen, I have >> confirmed it by adding delays and using hardware breakpoints to detect >> the conflicting memory accesses (with RaceHound tool, >> https://github.com/winnukem/racehound). >> >> I have not analyzed yet how harmful these races are (if they are), but >> it is better to report them anyway, I think. >> >> Everything was checked using YOTA 4G LTE Modem that works via "usbnet" >> and "cdc_ether" kernel modules. >> -------------------------- >> >> [Race #1] >> >> Race on skb_queue ('next' pointer) between usbnet_stop() and rx_complete(). >> >> Reproduced that by unplugging the device while the system was >> downloading a large file from the Net. >> >> Here is part of the call stack with the code where the changes to the >> queue happen: >> >> #0 __skb_unlink (skbuff.h:1517) >> prev->next = next; >> #1 defer_bh (usbnet.c:430) >> spin_lock_irqsave(&list->lock, flags); >> old_state = entry->state; >> entry->state = state; >> __skb_unlink(skb, list); >> spin_unlock(&list->lock); >> spin_lock(&dev->done.lock); >> __skb_queue_tail(&dev->done, skb); >> if (dev->done.qlen == 1) >> tasklet_schedule(&dev->bh); >> spin_unlock_irqrestore(&dev->done.lock, flags); >> #2 rx_complete (usbnet.c:640) >> state = defer_bh(dev, skb, &dev->rxq, state); >> >> At the same time, the following code repeatedly checks if the queue is >> empty and reads the same values concurrently with the above changes: >> >> #0 usbnet_terminate_urbs (usbnet.c:765) >> /* maybe wait for deletions to finish. */ >> while (!skb_queue_empty(&dev->rxq) >> && !skb_queue_empty(&dev->txq) >> && !skb_queue_empty(&dev->done)) { >> schedule_timeout(msecs_to_jiffies(UNLINK_TIMEOUT_MS)); >> set_current_state(TASK_UNINTERRUPTIBLE); >> netif_dbg(dev, ifdown, dev->net, >> "waited for %d urb completions\n", temp); >> } >> #1 usbnet_stop (usbnet.c:806) >> if (!(info->flags & FLAG_AVOID_UNLINK_URBS)) >> usbnet_terminate_urbs(dev); >> >> For example, it is possible that the skb is removed from dev->rxq by >> __skb_unlink() before the check "!skb_queue_empty(&dev->rxq)" in >> usbnet_terminate_urbs() is made. It is also possible in this case that >> the skb is added to dev->done queue after "!skb_queue_empty(&dev->done)" >> is checked. So usbnet_terminate_urbs() may stop waiting and return while >> dev->done queue still has an item. > > Hi, > > your analysis is correct and it looks like in addition to your proposed > fix locking needs to be simplified and a common lock to be taken. > Suggestions? Just an idea, I haven't tested it. How about moving the operations with dev->done under &list->lock in defer_bh, while keeping dev->done.lock too and changing usbnet_terminate_urbs() as described below? Like this: @@ -428,12 +428,12 @@ static enum skb_state defer_bh(struct usbnet *dev, struct sk_buff *skb, old_state = entry->state; entry->state = state; __skb_unlink(skb, list); - spin_unlock(&list->lock); spin_lock(&dev->done.lock); __skb_queue_tail(&dev->done, skb); if (dev->done.qlen == 1) tasklet_schedule(&dev->bh); - spin_unlock_irqrestore(&dev->done.lock, flags); + spin_unlock(&dev->done.lock); + spin_unlock_irqrestore(&list->lock, flags); return old_state; } ------------------- usbnet_terminate_urbs() can then be changed as follows: @@ -749,6 +749,20 @@ EXPORT_SYMBOL_GPL(usbnet_unlink_rx_urbs); /*-------------------------------------------------------------------------*/ +static void wait_skb_queue_empty(struct sk_buff_head *q) +{ + unsigned long flags; + + spin_lock_irqsave(&q->lock, flags); + while (!skb_queue_empty(q)) { + spin_unlock_irqrestore(&q->lock, flags); + schedule_timeout(msecs_to_jiffies(UNLINK_TIMEOUT_MS)); + set_current_state(TASK_UNINTERRUPTIBLE); + spin_lock_irqsave(&q->lock, flags); + } + spin_unlock_irqrestore(&q->lock, flags); +} + // precondition: never called in_interrupt static void usbnet_terminate_urbs(struct usbnet *dev) { @@ -762,14 +776,11 @@ static void usbnet_terminate_urbs(struct usbnet *dev) unlink_urbs(dev, &dev->rxq); /* maybe wait for deletions to finish. */ - while (!skb_queue_empty(&dev->rxq) - && !skb_queue_empty(&dev->txq) - && !skb_queue_empty(&dev->done)) { - schedule_timeout(msecs_to_jiffies(UNLINK_TIMEOUT_MS)); - set_current_state(TASK_UNINTERRUPTIBLE); - netif_dbg(dev, ifdown, dev->net, - "waited for %d urb completions\n", temp); - } + wait_skb_queue_empty(&dev->rxq); + wait_skb_queue_empty(&dev->txq); + wait_skb_queue_empty(&dev->done); + netif_dbg(dev, ifdown, dev->net, + "waited for %d urb completions\n", temp); set_current_state(TASK_RUNNING); remove_wait_queue(&dev->wait, &wait); } ------------------- This way, when usbnet_terminate_urbs() finds dev->rxq or dev->txq empty, the skbs from these queues, if there were any, have already been queued to dev->done. At the first glance, moving the code under list->lock in defer_bh() should not produce deadlocks. Still, I suppose, it is better to use lockdep to be sure. Regards, Eugene -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/