Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753209AbdCWAav (ORCPT ); Wed, 22 Mar 2017 20:30:51 -0400 Received: from mail-qk0-f172.google.com ([209.85.220.172]:36154 "EHLO mail-qk0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752190AbdCWAad (ORCPT ); Wed, 22 Mar 2017 20:30:33 -0400 To: Christoph Hellwig , "Michael S. Tsirkin" , Jason Wang Cc: Linux Kernel Mailing List , virtualization@lists.linux-foundation.org From: Laura Abbott Subject: [REGRESSION] 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues") causes crashes in guest Message-ID: <43ffa887-a20d-f836-cba8-11196924d82d@redhat.com> Date: Wed, 22 Mar 2017 17:30:27 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.0 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1983 Lines: 48 Hi, Fedora has received multiple reports of crashes when running 4.11 as a guest https://bugzilla.redhat.com/show_bug.cgi?id=1430297 https://bugzilla.redhat.com/show_bug.cgi?id=1434462 https://bugzilla.kernel.org/show_bug.cgi?id=194911 https://bugzilla.redhat.com/show_bug.cgi?id=1433899 The crashes are not always consistent but they are generally some flavor of oops or GPF in virtio related code. Multiple people have done bisections (Thank you Thorsten Leemhuis and Richard W.M. Jones) and found this commit to be at fault 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507 is the first bad commit commit 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507 Author: Christoph Hellwig Date: Sun Feb 5 18:15:19 2017 +0100 virtio_pci: use shared interrupts for virtqueues This lets IRQ layer handle dispatching IRQs to separate handlers for the case where we don't have per-VQ MSI-X vectors, and allows us to greatly simplify the code based on the assumption that we always have interrupt vector 0 (legacy INTx or config interrupt for MSI-X) available, and any other interrupt is request/freed throught the VQ, even if the actual interrupt line might be shared in some cases. This allows removing a great deal of variables keeping track of the interrupt state in struct virtio_pci_device, as we can now simply walk the list of VQs and deal with per-VQ interrupt handlers there, and only treat vector 0 special. Additionally clean up the VQ allocation code to properly unwind on error instead of having a single global cleanup label, which is error prone, and in this case also leads to more code. Signed-off-by: Christoph Hellwig Signed-off-by: Michael S. Tsirkin :040000 040000 79a8267ffb73f9d244267c5f68365305bddd4696 8832a160b978710bbd24ba6966f462b3faa27fcc M drivers It doesn't revert cleanly so we haven't been able to do a clean test. Any ideas? Thanks, Laura