Date: Thu, 11 Jan 2018 10:26:22 -0500 (EST)
Message-Id: <20180111.102622.769744562294438306.davem@davemloft.net>
To: ross.lagerwall@citrix.com
Cc: xen-devel@lists.xenproject.org, boris.ostrovsky@oracle.com,
    jgross@suse.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] xen-netfront: Fix race between device setup and open
From: David Miller
In-Reply-To: <20180111093638.28937-3-ross.lagerwall@citrix.com>
References: <20180111093638.28937-1-ross.lagerwall@citrix.com>
    <20180111093638.28937-3-ross.lagerwall@citrix.com>

From: Ross Lagerwall
Date: Thu, 11 Jan 2018 09:36:38 +0000

> When a netfront device is set up, it registers a netdev fairly early
> on, before it has set up the queues and is actually usable. A
> userspace tool like NetworkManager will immediately try to open it
> and access its state as soon as it appears. The bug can be reproduced
> by hotplugging VIFs until the VM runs out of grant refs: the driver
> registers the netdev but fails to set up any queues (since there are
> no more grant refs). In the meantime, NetworkManager opens the device
> and the kernel crashes trying to access the queues (of which there
> are none).
>
> Fix this in two ways:
> * For initial setup, register the netdev much later, after the queues
>   are set up. This avoids the race entirely.
> * During a suspend/resume cycle, the frontend reconnects to the
>   backend and the queues are recreated. It is possible (though highly
>   unlikely) to race with something opening the device and accessing
>   the queues after they have been destroyed but before they have been
>   recreated. Extend the region covered by the rtnl semaphore to
>   protect against this race. There is a possibility that we fail to
>   recreate the queues, so check for this in the open function.
>
> Signed-off-by: Ross Lagerwall

Where are patch 1/2 and the 0/2 header posting, which explain what this
patch series is doing, how it is doing it, and why it is doing it that
way?

Thanks.
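For readers following the fix being described, here is a minimal sketch
of the open-path guard mentioned in the last point of the quoted commit
message. It is an illustration under assumptions, not the patch itself:
the netfront_info layout and the np->queues field are simplified stand-ins
modeled on the driver. The key property it relies on is that ndo_open is
always invoked with the rtnl semaphore held, so a resume path that
destroys and recreates the queues entirely under rtnl cannot interleave
with open.

#include <linux/netdevice.h>
#include <linux/errno.h>

/* Illustrative private state; the real driver's netfront_info holds
 * much more, but only the queue pointer matters for this sketch. */
struct netfront_info {
	struct netfront_queue *queues;	/* NULL until queues are set up */
	unsigned int num_queues;
};

static int xennet_open(struct net_device *dev)
{
	struct netfront_info *np = netdev_priv(dev);

	/* ndo_open is called with the rtnl semaphore held.  Resume
	 * destroys and recreates the queues while also holding rtnl,
	 * so if we get here and there are no queues, initial setup or
	 * reconnection failed and the device is not usable. */
	if (!np->queues)
		return -ENODEV;

	/* ... normal open path: enable NAPI, start the tx queues ... */
	return 0;
}

Together with registering the netdev only after queue setup succeeds,
a guard of this shape means the open path can never dereference a queue
array that does not exist.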