Date: Fri, 31 Aug 2018 23:08:33 -0700 (PDT)
Message-Id: <20180831.230833.1182096350264602698.davem@davemloft.net>
From: David Miller
To: decui@microsoft.com
Cc: kys@microsoft.com, haiyangz@microsoft.com, sthemmin@microsoft.com, netdev@vger.kernel.org, jopoulso@microsoft.com, olaf@aepfle.de, jasowang@redhat.com, linux-kernel@vger.kernel.org, marcelo.cerri@canonical.com, apw@canonical.com, devel@linuxdriverproject.org, vkuznets@redhat.com
Subject: Re: [PATCH v2] hv_netvsc: Fix a deadlock by getting rtnl lock earlier in netvsc_probe()
X-Mailing-List: linux-kernel@vger.kernel.org

From: Dexuan Cui
Date: Thu, 30 Aug 2018 05:42:13 +0000

> This patch fixes the race between netvsc_probe() and
> rndis_set_subchannel(), which can cause a deadlock.
>
> These are the related 3 paths which show the deadlock:
 ...
> Before path #1 finishes, path #2 can start to run, because just before
> the "bus_probe_device(dev);" in device_add() in path #1, there is a line
> "kobject_uevent(&dev->kobj, KOBJ_ADD);", so systemd-udevd can
> immediately try to load hv_netvsc and hence path #2 can start to run.
>
> Next, path #2 offloads the subchannel's initialization to a workqueue,
> i.e. path #3, so we can end up in a deadlock situation like this:
>
> Path #2 gets the device lock, and is trying to get the rtnl lock;
> Path #3 gets the rtnl lock and is waiting for all the subchannel messages
> to be processed;
> Path #1 is trying to get the device lock, but since #2 is not releasing
> the device lock, path #1 has to sleep; since the VMBus messages are
> processed one by one, this means the sub-channel messages can't be
> processed, so #3 has to sleep with the rtnl lock held, and finally #2
> has to sleep... Now all the 3 paths are sleeping and we hit the deadlock.
>
> With the patch, we can make sure #2 gets both the device lock and the
> rtnl lock together, gets its job done, and releases the locks, so #1
> and #3 will not be blocked forever.
>
> Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
> Signed-off-by: Dexuan Cui

Applied and queued up for -stable.
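
For readers following the three-path argument, the circular wait can be written down as a small wait-for graph. The following is only a userspace sketch, not kernel code and not part of the patch. The names PATH1, PATH2, PATH3, NOBODY, before_patch, after_patch and has_deadlock() are hypothetical and exist only for this illustration. The "after" table assumes, per the description above, that path #2 takes the device lock and the rtnl lock together and therefore waits on nobody.

```c
#include <stdbool.h>

/* Hypothetical labels for the three paths in the commit message. */
enum { PATH1 = 0, PATH2 = 1, PATH3 = 2, NPATHS = 3, NOBODY = -1 };

/*
 * waits_on[p] names the path that must make progress before path p
 * can continue, or NOBODY if p is not blocked.
 *
 * Before the patch (as described above):
 *   #1 needs the device lock held by #2,
 *   #2 needs the rtnl lock held by #3,
 *   #3 needs sub-channel messages that only #1 can process.
 */
const int before_patch[NPATHS] = { PATH2, PATH3, PATH1 };

/*
 * After the patch (assumption: #2 holds both locks up front):
 *   #1 waits for #2 to drop the device lock,
 *   #2 waits on nobody,
 *   #3 waits for #2 to drop the rtnl lock.
 */
const int after_patch[NPATHS] = { PATH2, NOBODY, PATH2 };

/* Returns true if following the wait-for edges leads back to the start. */
bool has_deadlock(const int waits_on[NPATHS])
{
    for (int start = 0; start < NPATHS; start++) {
        int cur = waits_on[start];

        /* A cycle among NPATHS nodes closes within NPATHS hops. */
        for (int hops = 0; hops < NPATHS && cur != NOBODY; hops++) {
            if (cur == start)
                return true;
            cur = waits_on[cur];
        }
    }
    return false;
}
```

In this model has_deadlock(before_patch) reports the #1 -> #2 -> #3 -> #1 cycle, while has_deadlock(after_patch) finds none, because #2 no longer waits on anything and both other paths simply queue behind it.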