Subject: Re: BUG due to "xen-netback: protect resource cleaning on XenBus disconnect"
To: Paul Durrant, 'Juergen Gross', Wei Liu
CC: xen-devel, Linux Kernel Mailing List, netdev@vger.kernel.org, Boris Ostrovsky, David Miller
From: Igor Druzhinin
Date: Thu, 2 Mar 2017 14:55:57 +0000

On 02/03/17 12:19, Paul Durrant wrote:
>> -----Original Message-----
>> From: Juergen Gross [mailto:jgross@suse.com]
>> Sent: 02 March 2017 12:13
>> To: Wei Liu
>> Cc: Igor Druzhinin; xen-devel <xen-devel@lists.xenproject.org>; Linux
>> Kernel Mailing List <linux-kernel@vger.kernel.org>;
>> netdev@vger.kernel.org; Boris Ostrovsky; David Miller; Paul Durrant
>> Subject: Re: BUG due to "xen-netback: protect resource cleaning on XenBus
>> disconnect"
>>
>> On 02/03/17 13:06, Wei Liu wrote:
>>> On Thu, Mar 02, 2017 at 12:56:20PM +0100, Juergen Gross wrote:
>>>> With commits f16f1df65 and 9a6cdf52b we get in our Xen testing:
>>>>
>>>> [  174.512861] switch: port 2(vif3.0) entered disabled state
>>>> [  174.522735] BUG: sleeping function called from invalid context at
>>>> /home/build/linux-linus/mm/vmalloc.c:1441
>>>> [  174.523451] in_atomic(): 1, irqs_disabled(): 0, pid: 28, name: xenwatch
>>>> [  174.524131] CPU: 1 PID: 28 Comm: xenwatch Tainted: G        W
>>>> 4.10.0upstream-11073-g4977ab6-dirty #1
>>>> [  174.524819] Hardware name: MSI MS-7680/H61M-P23 (MS-7680), BIOS V17.0
>>>> 03/14/2011
>>>> [  174.525517] Call Trace:
>>>> [  174.526217]  show_stack+0x23/0x60
>>>> [  174.526899]  dump_stack+0x5b/0x88
>>>> [  174.527562]  ___might_sleep+0xde/0x130
>>>> [  174.528208]  __might_sleep+0x35/0xa0
>>>> [  174.528840]  ? _raw_spin_unlock_irqrestore+0x13/0x20
>>>> [  174.529463]  ? __wake_up+0x40/0x50
>>>> [  174.530089]  remove_vm_area+0x20/0x90
>>>> [  174.530724]  __vunmap+0x1d/0xc0
>>>> [  174.531346]  ? delete_object_full+0x13/0x20
>>>> [  174.531973]  vfree+0x40/0x80
>>>> [  174.532594]  set_backend_state+0x18a/0xa90
>>>> [  174.533221]  ? dwc_scan_descriptors+0x24d/0x430
>>>> [  174.533850]  ? kfree+0x5b/0xc0
>>>> [  174.534476]  ? xenbus_read+0x3d/0x50
>>>> [  174.535101]  ? xenbus_read+0x3d/0x50
>>>> [  174.535718]  ? xenbus_gather+0x31/0x90
>>>> [  174.536332]  ? ___might_sleep+0xf6/0x130
>>>> [  174.536945]  frontend_changed+0x6b/0xd0
>>>> [  174.537565]  xenbus_otherend_changed+0x7d/0x80
>>>> [  174.538185]  frontend_changed+0x12/0x20
>>>> [  174.538803]  xenwatch_thread+0x74/0x110
>>>> [  174.539417]  ? woken_wake_function+0x20/0x20
>>>> [  174.540049]  kthread+0xe5/0x120
>>>> [  174.540663]  ? xenbus_printf+0x50/0x50
>>>> [  174.541278]  ? __kthread_init_worker+0x40/0x40
>>>> [  174.541898]  ret_from_fork+0x21/0x2c
>>>> [  174.548635] switch: port 2(vif3.0) entered disabled state
>>>>
>>>> I believe calling vfree() while holding a spin_lock isn't a good idea.
>>>>
>>>
>>> Use vfree_atomic() instead?
>>
>> Hmm, isn't this overkill here?
>>
>> You can just save the address in a local variable and do the vfree()
>> after releasing the lock.
>>
>
> Yep, that's what I was thinking. Patch coming shortly.
>
> Paul

We have an internal patch, tested just recently, that avoids using
spinlocks here. As our testing revealed, calling vfree() inside the
spinlock-protected section is not the worst that could happen. I
switched to RCU to protect the data from being released while it is
still in use. I'll share the patch today.

Igor

>
>>
>> Juergen
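
The fix Juergen suggests and Paul picks up is a common kernel pattern:
because vfree() may sleep, detach the vmalloc()ed pointer while the lock
is held and free it only after the lock is dropped. A minimal sketch,
with hypothetical structure and function names rather than the actual
xen-netback code:

#include <linux/spinlock.h>
#include <linux/vmalloc.h>

/* Hypothetical backend state, for illustration only. */
struct backend_state {
        spinlock_t lock;
        void *queues;           /* vmalloc()ed resource */
};

static void backend_cleanup(struct backend_state *be)
{
        void *to_free;

        spin_lock(&be->lock);
        to_free = be->queues;   /* detach the pointer under the lock */
        be->queues = NULL;      /* concurrent users now find no queues */
        spin_unlock(&be->lock);

        vfree(to_free);         /* vfree() may sleep, so call it unlocked */
}

This keeps the critical section short and moves the sleeping call to a
context where sleeping is legal; vfree(NULL) is a no-op, so the cleanup
path stays safe even if the pointer was already detached.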
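Wei's vfree_atomic() suggestion would instead keep the call inside the
locked section: vfree_atomic() does not sleep but defers the actual
release to a workqueue, which is why Juergen considers it overkill when
simply reordering works. Sketched against the same hypothetical struct:

static void backend_cleanup_atomic(struct backend_state *be)
{
        spin_lock(&be->lock);
        /* Legal in atomic context: the free is deferred to a
         * workqueue instead of sleeping here. */
        vfree_atomic(be->queues);
        be->queues = NULL;
        spin_unlock(&be->lock);
}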
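Igor's alternative drops the spinlock for this purpose and uses RCU, so
readers are protected from the memory being released underneath them. A
hedged sketch of the general shape (hypothetical names, not the patch he
posted): readers dereference the pointer inside an RCU read-side
section, and the cleanup path unpublishes the pointer, waits for all
readers, and only then calls vfree() in sleepable context.

#include <linux/rcupdate.h>
#include <linux/vmalloc.h>

/* Hypothetical RCU-protected variant of the same state. */
struct backend_state_rcu {
        void __rcu *queues;
};

/* Reader: must not sleep inside the read-side critical section. */
static void backend_access(struct backend_state_rcu *be)
{
        void *q;

        rcu_read_lock();
        q = rcu_dereference(be->queues);
        if (q) {
                /* ... use q; it cannot be vfree()d under us ... */
        }
        rcu_read_unlock();
}

/* Cleanup: runs in sleepable context, e.g. the xenwatch thread. */
static void backend_cleanup_rcu(struct backend_state_rcu *be)
{
        void *q = rcu_dereference_protected(be->queues, 1);

        RCU_INIT_POINTER(be->queues, NULL);
        synchronize_rcu();      /* may sleep; waits out all readers of q */
        vfree(q);
}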