Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S934938AbcJTIvL (ORCPT ); Thu, 20 Oct 2016 04:51:11 -0400
Received: from mx1.redhat.com ([209.132.183.28]:43720 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S934884AbcJTIvI (ORCPT ); Thu, 20 Oct 2016 04:51:08 -0400
From: Vitaly Kuznetsov
To: Stephen Hemminger
Cc: "netdev@vger.kernel.org", "devel@linuxdriverproject.org",
        "linux-kernel@vger.kernel.org", KY Srinivasan, Haiyang Zhang
Subject: Re: [PATCH net-next] hv_netvsc: fix a race between netvsc_send() and netvsc_init_buf()
References: <1476885181-3456-1-git-send-email-vkuznets@redhat.com>
Date: Thu, 20 Oct 2016 10:51:04 +0200
In-Reply-To: (Stephen Hemminger's message of "Thu, 20 Oct 2016 08:36:21 +0000")
Message-ID: <8737jr1k07.fsf@vitty.brq.redhat.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
        (mx1.redhat.com [10.5.110.31]); Thu, 20 Oct 2016 08:51:07 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2568
Lines: 52

Stephen Hemminger writes:

> Do we need ACCESS_ONCE() here to avoid check/use issues?
>

I don't think we do: this is the only place in the function where we
read the variable, so a normal read is enough. We're not trying to
synchronize with netvsc_init_buf(), as that would require locking. If
we read a stale NULL value after it was already updated on a different
CPU we're fine, we'll just return -EAGAIN.
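For illustration only (not part of the patch): a small userspace sketch of
the two ways the check could be written. ACCESS_ONCE_SKETCH() is a local
stand-in modelled on the kernel's ACCESS_ONCE() volatile cast, and
struct sketch_net_device, send_plain() and send_once() are hypothetical
names invented here. It assumes GCC or Clang, since it uses __typeof__.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Local stand-in for the kernel's ACCESS_ONCE(); an assumption made for
 * this sketch, relying on the GCC/Clang __typeof__ extension.
 */
#define ACCESS_ONCE_SKETCH(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical, stripped-down device: only the field under discussion. */
struct sketch_net_device {
	unsigned long *send_section_map;	/* NULL until buffers are set up */
};

/* Plain read: the pointer is read exactly once and only tested for NULL,
 * so a stale NULL simply turns into -EAGAIN.
 */
static int send_plain(struct sketch_net_device *nd)
{
	assert(nd != NULL);
	if (!nd->send_section_map)
		return -EAGAIN;
	return 0;
}

/* Marked read: snapshot the pointer into a local and use only the local
 * afterwards. This is the shape one would want if the value were both
 * checked and dereferenced later in the same function.
 */
static int send_once(struct sketch_net_device *nd)
{
	unsigned long *map;

	assert(nd != NULL);
	map = ACCESS_ONCE_SKETCH(nd->send_section_map);
	if (!map)
		return -EAGAIN;
	return 0;
}
```

Both variants return the same result for the same pointer value. The
difference only matters when the function goes on to use the value it has
just checked.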

> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
> Sent: Wednesday, October 19, 2016 2:53 PM
> To: netdev@vger.kernel.org
> Cc: Stephen Hemminger ; devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; KY Srinivasan ; Haiyang Zhang
> Subject: [PATCH net-next] hv_netvsc: fix a race between netvsc_send() and netvsc_init_buf()
>
> Fix in commit 880988348270 ("hv_netvsc: set nvdev link after populating
> chn_table") turns out to be incomplete. A crash in
> netvsc_get_next_send_section() is observed on mtu change when the device
> is under load. The race I identified is: if we get to netvsc_send() after
> we set net_device_ctx->nvdev link in netvsc_device_add() but before we
> finish netvsc_connect_vsp()->netvsc_init_buf() send_section_map is not
> allocated and we crash. Unfortunately we can't set net_device_ctx->nvdev
> link after the netvsc_init_buf() call as during the negotiation we need
> to receive packets and on the receive path we check for it. It would
> probably be possible to split nvdev into a pair of nvdev_in and nvdev_out
> links and check them accordingly in get_outbound_net_device()/
> get_inbound_net_device() but this looks like an overkill.
>
> Check that send_section_map is allocated in netvsc_send().
>
> Signed-off-by: Vitaly Kuznetsov
> ---
>  drivers/net/hyperv/netvsc.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
> index 720b5fa..e2bfaac 100644
> --- a/drivers/net/hyperv/netvsc.c
> +++ b/drivers/net/hyperv/netvsc.c
> @@ -888,6 +888,13 @@ int netvsc_send(struct hv_device *device,
>  	if (!net_device)
>  		return -ENODEV;
>
> +	/* We may race with netvsc_connect_vsp()/netvsc_init_buf() and get
> +	 * here before the negotiation with the host is finished and
> +	 * send_section_map may not be allocated yet.
> +	 */
> +	if (!net_device->send_section_map)
> +		return -EAGAIN;
> +
>  	out_channel = net_device->chn_table[q_idx];
>
>  	packet->send_buf_index = NETVSC_INVALID_INDEX;
> --
> 2.7.4

--
Vitaly
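
For readers who want to see the window described in the changelog, here is
a single-threaded userspace sketch of the ordering. It is not driver code:
the sketch_* structures and functions are hypothetical stand-ins for
net_device_ctx->nvdev, netvsc_device_add(), netvsc_init_buf() and
netvsc_send(), and the real functions do considerably more.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the netvsc device. */
struct sketch_nvdev {
	unsigned long *send_section_map;	/* allocated late, during negotiation */
};

/* Hypothetical stand-in for net_device_ctx. */
struct sketch_ctx {
	struct sketch_nvdev *nvdev;	/* link checked on send and receive paths */
};

/* Step 1: the link is published early, because the receive path needs it
 * while the negotiation with the host is still in progress.
 */
static void sketch_device_add(struct sketch_ctx *ctx, struct sketch_nvdev *nv)
{
	nv->send_section_map = NULL;
	ctx->nvdev = nv;
}

/* Step 2: the send section map is allocated only later. */
static int sketch_init_buf(struct sketch_nvdev *nv, size_t nlongs)
{
	nv->send_section_map = calloc(nlongs, sizeof(unsigned long));
	return nv->send_section_map ? 0 : -ENOMEM;
}

/* Send path with the added check: between step 1 and step 2 the link is
 * set but the map is still missing, so ask the caller to retry instead
 * of touching the map.
 */
static int sketch_send(struct sketch_ctx *ctx)
{
	struct sketch_nvdev *nv = ctx->nvdev;

	if (!nv)
		return -ENODEV;
	if (!nv->send_section_map)
		return -EAGAIN;
	return 0;	/* the real code would pick a send section here */
}
```

Calling sketch_send() before sketch_device_add() gives -ENODEV, between
the two steps it gives -EAGAIN, and after sketch_init_buf() succeeds it
gives 0.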