From: Vitaly Kuznetsov
To: Dexuan Cui
Cc: devel@linuxdriverproject.org, KY Srinivasan, Haiyang Zhang, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Drivers: hv: vmbus: prevent new subchannel creation on device shutdown
References: <1436789934-11566-1-git-send-email-vkuznets@redhat.com> <19f503e369b04d01b79a1bde866a39f8@SIXPR30MB031.064d.mgd.msft.net>
Date: Tue, 14 Jul 2015 18:02:54 +0200
In-Reply-To: <19f503e369b04d01b79a1bde866a39f8@SIXPR30MB031.064d.mgd.msft.net> (Dexuan Cui's message of "Tue, 14 Jul 2015 11:41:44 +0000")
Message-ID: <87lheiybf5.fsf@vitty.brq.redhat.com>

Dexuan Cui writes:

>> -----Original Message-----
>> From: Vitaly Kuznetsov
>> Sent: Monday, July 13, 2015 20:19
>> Subject: [PATCH] Drivers: hv: vmbus: prevent new subchannel creation on device
>> shutdown
>>
>> When a new subchannel offer from host comes during device shutdown (e.g.
>> when a netvsc/storvsc module is unloaded shortly after it was loaded) a
>> crash can happen as vmbus_process_offer() is not anyhow serialized with
>> vmbus_remove().
>
> How about vmbus_onoffer_rescind()?
> It's not serialized with vmbus_remove() either, so I think there is an issue too?
>
> I remember when 'rmmod hv_netvsc', we get a rescind-offer message for
> each subchannel.
>

True, I think we have a race with rescind messages as well; we just
never saw crashes there for some reason. I'll think about how we can
make the fix more general.

>> As an example we can try calling subchannel create
>> callback when the module is already unloaded.
>> The following crash was observed while keeping loading/unloading hv_netvsc
>> module on 64-CPU guest:
>>
>> hv_netvsc vmbus_14: net device safe to remove
>> BUG: unable to handle kernel paging request at 000000000000a118
>> IP: [] netvsc_sc_open+0x31/0xb0 [hv_netvsc]
>> PGD 1f3946a067 PUD 1f38a5f067 PMD 0
>> Oops: 0000 [#1] SMP
>> ...
>> Call Trace:
>> [] vmbus_onoffer+0x477/0x530 [hv_vmbus]
>> [] ? move_linked_works+0x5f/0x80
>> [] vmbus_onmessage+0x33/0xa0 [hv_vmbus]
>> [] vmbus_onmessage_work+0x21/0x30 [hv_vmbus]
>> [] process_one_work+0x18e/0x4e0
>> ...
>>
>> The issue cannot be solved by just resetting sc_creation_callback on
>> driver removal as while we search for the parent channel with channel_lock
>> held we release it after the channel was found and it can disappear beneath
>> our feet while we're still in vmbus_process_offer();
>>
>> Introduce new sc_create_lock mutex and take it in vmbus_remove() to ensure
>> no new subchannels are created after we started the removal procedure.
>> Check its state with mutex_trylock in vmbus_process_offer().
>
> In my 8-CPU VM, I can very easily reproduce the panic by
> 1. running
> while ((1)); do modprobe -r hv_netvsc; modprobe hv_netvsc; sleep 10; done.
>
> and
> 2. in vmbus_onoffer_rescind(), we sleep 3s after a subchannel is added into
> the primary channel's sc_list (and before the sc_creation_callback is invoked):
> (I added line 275)
>
> 262         if (!fnew) {
> 263                 /*
> 264                  * Check to see if this is a sub-channel.
> 265                  */
> 266                 if (newchannel->offermsg.offer.sub_channel_index != 0) {
> 267                         /*
> 268                          * Process the sub-channel.
> 269                          */
> 270                         newchannel->primary_channel = channel;
> 271                         spin_lock_irqsave(&channel->lock, flags);
> 272                         list_add_tail(&newchannel->sc_list, &channel->sc_list);
> 273                         channel->num_sc++;
> 274                         spin_unlock_irqrestore(&channel->lock, flags);
> 275                         ssleep(3);
> 276                 } else
> 277                         goto err_free_chan;
> 278         }

It is possible to see crashes even without such instrumentation; more
CPUs and a slower host will do the job.

Thanks,

--
  Vitaly
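
For readers following the thread, the trylock scheme from the patch
description quoted above can be sketched in user space roughly as below.
This is only an illustration with pthreads, not the kernel patch itself:
struct channel, process_offer(), remove_begin(), remove_end() and the
'removing' flag are hypothetical names made up for the sketch; only
sc_create_lock and sc_creation_callback come from the discussion.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a primary channel. */
struct channel {
	pthread_mutex_t sc_create_lock;     /* name from the patch description */
	bool removing;                      /* assumption: set once removal starts */
	void (*sc_creation_callback)(int sc_index);
};

static int created_count;                   /* demo only: counts callback calls */

/* Demo callback standing in for a driver's subchannel-open hook. */
static void demo_sc_open(int sc_index)
{
	(void)sc_index;
	created_count++;
}

static void channel_init(struct channel *ch, void (*cb)(int))
{
	int rc = pthread_mutex_init(&ch->sc_create_lock, NULL);

	assert(rc == 0);
	(void)rc;
	ch->removing = false;
	ch->sc_creation_callback = cb;
}

/*
 * Offer path: never blocks. If removal holds the lock, or removal has
 * already happened, the offer is dropped and the callback is not called.
 * Returns true only when the callback was invoked.
 */
static bool process_offer(struct channel *ch, int sc_index)
{
	bool created = false;

	if (pthread_mutex_trylock(&ch->sc_create_lock) != 0)
		return false;               /* removal in progress */

	if (!ch->removing && ch->sc_creation_callback) {
		ch->sc_creation_callback(sc_index);
		created = true;
	}
	pthread_mutex_unlock(&ch->sc_create_lock);
	return created;
}

/* Removal path: take the lock first so no new subchannel can be created. */
static void remove_begin(struct channel *ch)
{
	pthread_mutex_lock(&ch->sc_create_lock);
	ch->removing = true;
	ch->sc_creation_callback = NULL;
}

/* Called once teardown is finished; later offers still see 'removing'. */
static void remove_end(struct channel *ch)
{
	pthread_mutex_unlock(&ch->sc_create_lock);
}
```

The offer path uses trylock rather than a blocking lock so that message
processing is never stalled behind a removal; the extra flag is there
only so that offers arriving after remove_end() are rejected too.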