Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965085AbdCWOWT (ORCPT ); Thu, 23 Mar 2017 10:22:19 -0400 Received: from verein.lst.de ([213.95.11.211]:49883 "EHLO newverein.lst.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933717AbdCWOWS (ORCPT ); Thu, 23 Mar 2017 10:22:18 -0400 Date: Thu, 23 Mar 2017 15:22:15 +0100 From: Christoph Hellwig To: Jason Wang Cc: Laura Abbott , Christoph Hellwig , "Michael S. Tsirkin" , Linux Kernel Mailing List , virtualization@lists.linux-foundation.org Subject: Re: [REGRESSION] 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues") causes crashes in guest Message-ID: <20170323142215.GA30988@lst.de> References: <43ffa887-a20d-f836-cba8-11196924d82d@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.5.17 (2007-11-01) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2504 Lines: 61 On Thu, Mar 23, 2017 at 01:13:50PM +0800, Jason Wang wrote: > > > On 2017年03月23日 08:30, Laura Abbott wrote: >> Hi, >> >> Fedora has received multiple reports of crashes when running >> 4.11 as a guest >> >> https://bugzilla.redhat.com/show_bug.cgi?id=1430297 >> https://bugzilla.redhat.com/show_bug.cgi?id=1434462 >> https://bugzilla.kernel.org/show_bug.cgi?id=194911 >> https://bugzilla.redhat.com/show_bug.cgi?id=1433899 >> >> The crashes are not always consistent but they are generally >> some flavor of oops or GPF in virtio related code. Multiple people >> have done bisections (Thank you Thorsten Leemhuis and >> Richard W.M. Jones) and found this commit to be at fault >> >> 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507 is the first bad commit >> commit 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507 >> Author: Christoph Hellwig >> Date: Sun Feb 5 18:15:19 2017 +0100 >> >> virtio_pci: use shared interrupts for virtqueues >> This lets IRQ layer handle dispatching IRQs to separate handlers >> for the >> case where we don't have per-VQ MSI-X vectors, and allows us to greatly >> simplify the code based on the assumption that we always have interrupt >> vector 0 (legacy INTx or config interrupt for MSI-X) available, and >> any other interrupt is request/freed throught the VQ, even if the >> actual interrupt line might be shared in some cases. >> This allows removing a great deal of variables keeping track of >> the >> interrupt state in struct virtio_pci_device, as we can now simply walk the >> list of VQs and deal with per-VQ interrupt handlers there, and only treat >> vector 0 special. >> Additionally clean up the VQ allocation code to properly unwind >> on error >> instead of having a single global cleanup label, which is error prone, >> and in this case also leads to more code. >> Signed-off-by: Christoph Hellwig >> Signed-off-by: Michael S. Tsirkin >> >> :040000 040000 79a8267ffb73f9d244267c5f68365305bddd4696 8832a160b978710bbd24ba6966f462b3faa27fcc M drivers >> >> It doesn't revert cleanly so we haven't been able to do a clean >> test. Any ideas? >> >> Thanks, >> Laura > > Hello: > > Can you try the attached patch to see if it solves the problem? (At least > it silent KASan warnings for me). This looks like a correct fix to me, independent of fixing the original bug or not: Reviewed-by: Christoph Hellwig