Received: by 2002:a05:6a10:5bc5:0:0:0:0 with SMTP id os5csp2770724pxb; Tue, 12 Oct 2021 13:09:21 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxmXW7yEJRo0A4HeL00KTiq9c5+7YZZtgx5/zo83KnFrvtxiulmohadSgwuz7Ny8rqdnfM8 X-Received: by 2002:a63:5b09:: with SMTP id p9mr24220858pgb.321.1634069361100; Tue, 12 Oct 2021 13:09:21 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1634069361; cv=none; d=google.com; s=arc-20160816; b=waIbZYuCqwaAY0aLUWyge0BUOBU3SjF1m1MDTemEzQq+gDoM1LjOdO3BA/tZ8hlXfu kpogoXNOFl70TCK7uehWekXXpgceeSPmMn8trpL+WBIlk7sDQ++VcM5J3DMDuHiRkqMK Js9292iyvU2qFI7QKzpL0hxdeSXIeinzgI/vjIe2UVSwibTMzy9U6jG7lYth44EMkWWC A61wnr7RCHk2Y0So+tiJR+qlD28ErFfZBuTb8O14kxIRa5L1QoGBfYi8TuBgSZ16mwM3 +HJYdWG+T/5a+2jZh/6v1WlM8gvkuWJkWeNqjNa/Ix43XMNsWyi4T0KEcV2+UaD8/zls NIyg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date :dkim-signature; bh=Izu+rZ+bydrqP5Ea4+zsV5r5PHcw8FRNGYQmKnuuTsQ=; b=fYXe2TKx0AGqrUQXL+2qV661TncwV3NVshuG9wI0DSHTY6N85PtDvx9lnMsulgce1K 9Lpaq8Hia+tRGDRxuzRa9BxDMEhs6IgyzuEGyvodb/i4A/jt+w8FX59tInZkBy9pyMRl 05ALq83ww9bbdJbm6n2Qp7/H7WBsE5ZFfi7oQAefScBPxwrTCDgQlyWN+mnIGB8LMhxU WrolOGnyDJxFTl8WhSy0jD/STY76I1I/kf8Qtzo1cf1hLFzTzBjN11JdYGxUTr6Ix5UB HZKFXLy3XUBxc8tZV8VbxJTzwg7kXD9/8gfBJinTh4lnEfM703J2a4JYx5GN8cqCm0pR CpMg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Qesu7kYU; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id s12si18627496pgf.475.2021.10.12.13.09.07; Tue, 12 Oct 2021 13:09:21 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Qesu7kYU; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234182AbhJLUHY (ORCPT + 99 others); Tue, 12 Oct 2021 16:07:24 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:43178 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234100AbhJLUHX (ORCPT ); Tue, 12 Oct 2021 16:07:23 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1634069121; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Izu+rZ+bydrqP5Ea4+zsV5r5PHcw8FRNGYQmKnuuTsQ=; b=Qesu7kYU2iLqRQ/fK5gVfhKvN2RjN+d0ZA7SsNAGuzMoVjp88GuNedbUKXUYJN9b9lu6Lb 4c11FDzauZzDSEMOFjNPLN5/0xC5wDQ2+ERddqu5rDztXopnu97EhPSV/5mH+bE1kGDJda ZMpAq8lDokzoO+PsjI/D12gVpI+uDBA= Received: from mail-ot1-f70.google.com (mail-ot1-f70.google.com [209.85.210.70]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-536-0-3oVFwvMRqlR90ZgltCQg-1; Tue, 12 Oct 2021 16:05:19 -0400 X-MC-Unique: 0-3oVFwvMRqlR90ZgltCQg-1 Received: by mail-ot1-f70.google.com with SMTP id u19-20020a0568301f1300b005472c85a1feso315782otg.12 for ; Tue, 12 Oct 2021 13:05:19 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Izu+rZ+bydrqP5Ea4+zsV5r5PHcw8FRNGYQmKnuuTsQ=; b=YvxWreoq21lHBsqAnpx7Mzyl/lfNGwKsjUhbbCDz5JCH2q6ap7mb1SJSeD4NPdKw8w F3hiA/RJIJyyxFsreRu/A43k7/qujtSWhc8DK1lQ7BJyOiAjPpvPmDwv66NkAsBvCj1H 6CuPg9nqnaBhALhZzo6xShdqYdl8ol6fNs4P30NxLC1yWf0xk9huY3Mq6Oq420vGRspK s1UZgWyvrG6lkcpjtcpiOYOL8KRtMNa4bzn9EJPn2HMkkD2UIcEd1eHBnRqgrlI1usfu MHmSqL3+h24IadIuSjeZHYd/YilOnsFNCJgJEFks5xUOuUb+nu9vdHR0qgbvy7PrA6Io 4lXQ== X-Gm-Message-State: AOAM532PyzKk9AIxxF2bV+qx0yiYRgiL1qOwbyaVQnVbd4X7kZA/7biP N+6Tz3q2pnFKnmZsThka2G72AV7ulO1Kq9SFe840sf0oHBdbIsWJZPjoYvQ1u5jBz0YpM9o5Nrn zfCtEdrmLp/sjQMqvFKdIDRXf X-Received: by 2002:a9d:1b7:: with SMTP id e52mr23201513ote.352.1634069118749; Tue, 12 Oct 2021 13:05:18 -0700 (PDT) X-Received: by 2002:a9d:1b7:: with SMTP id e52mr23201490ote.352.1634069118419; Tue, 12 Oct 2021 13:05:18 -0700 (PDT) Received: from redhat.com ([38.15.36.239]) by smtp.gmail.com with ESMTPSA id s18sm2135912otd.55.2021.10.12.13.05.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 12 Oct 2021 13:05:18 -0700 (PDT) Date: Tue, 12 Oct 2021 14:05:16 -0600 From: Alex Williamson To: Matthew Ruffell Cc: linux-pci@vger.kernel.org, lkml , kvm@vger.kernel.org, nathan.langford@xcelesunifiedtechnologies.com Subject: Re: [PROBLEM] Frequently get "irq 31: nobody cared" when passing through 2x GPUs that share same pci switch via vfio Message-ID: <20211012140516.6838248b.alex.williamson@redhat.com> In-Reply-To: References: <20210914104301.48270518.alex.williamson@redhat.com> <9e8d0e9e-1d94-35e8-be1f-cf66916c24b2@canonical.com> <20210915103235.097202d2.alex.williamson@redhat.com> <2fadf33d-8487-94c2-4460-2a20fdb2ea12@canonical.com> <20211005171326.3f25a43a.alex.williamson@redhat.com> X-Mailer: Claws Mail 3.18.0 (GTK+ 2.24.33; x86_64-redhat-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 12 Oct 2021 17:58:07 +1300 Matthew Ruffell wrote: > Hi Alex, > > On Wed, Oct 6, 2021 at 12:13 PM Alex Williamson > wrote: > > With both of these together, I'm so far able to prevent an interrupt > > storm for these cards. I'd say the patch below is still extremely > > experimental, and I'm not sure how to get around the really hacky bit, > > but it would be interesting to see if it resolves the original issue. > > I've not yet tested this on a variety of devices, so YMMV. Thanks, > > Thank you very much for your analysis and for the experimental patch, and we > have excellent news to report. > > I sent Nathan a test kernel built on 5.14.0, and he has been running the > reproducer for a few days now. > > Nathan writes: > > > I've been testing heavily with the reproducer for a few days using all 8 GPUs > > and with the MSI fix for the audio devices in the guest disabled, i.e. a pretty > > much worst case scenario. As a control with kernel 5.14 (unpatched), the system > > locked up in 2,2,6,1, and 4 VM reset iterations, all in less than 10 minutes > > each time. With the patched kernel I'm currently at 1226 iterations running for > > 2 days 10 hours with no failures. This is excellent. FYI, I have disabled the > > dyndbg setting. > > The system is stable, and your patch sounds very promising. Great, I also ran a VM reboot loop for several days with all 6 GPUs assigned, no interrupt issues. > Nathan does have a small side effect to report: > > > The only thing close to an issue that I have is that I still get frequent > > "irq 112: nobody cared" and "Disabling IRQ #112" errors. They just no longer > > lockup the system. If I watch the reproducer time between VM resets, I've > > noticed that it takes longer for the VM to startup after one of these > > "nobody cared" errors, and thus it takes longer until I can reset the VM again. > > I believe slow guest behavior in this disabled IRQ scenario is expected though? > > Full dmesg: > https://paste.ubuntu.com/p/hz8WdPZmNZ/ > > I had a look at all the lspci Nathan has provided me in the past, but 112 isn't > listed. I will ask Nathan for a fresh lspci so we can see what device it is. > The interesting thing is that we still hit __report_bad_irq() for 112 when we > have previously disabled it, typically after 1000+ seconds has gone by. The device might need to be operating in INTx mode, or at least had been at some point, to get the register filled. It's essentially just a scratch register on the card that gets filled when the interrupt is configured. Each time we register a new handler for the irq the masking due to spurious interrupt will be removed, but if it's actually causing the VM boot to take longer that suggests to me that the guest driver is stalled, perhaps because it's expecting an interrupt that's now masked in the host. This could also be caused by a device that gets incorrectly probed for PCI-2.3 compliant interrupt masking. For probing we can really only test that we have the ability to set the DisINTx bit, we can only hope that the hardware folks also properly implemented the INTx status bit to indicate the device is signaling INTx. We should really figure out which device this is so that we can focus on whether it's another shared interrupt issue or something specific to the device. I'm also confused why this doesn't trigger the same panic/kexec as we were seeing with the other interrupt lines. Are there some downstream patches or configs missing here that would promote these to more fatal errors? > We think your patch fixes the interrupt storm issues. We are happy to continue > testing for as much as you need, and we are happy to test any followup patch > revisions. > > Is there anything you can do to feel more comfortable about the > PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG dev flag hack? While it works, I can see why > you might not want to land it in mainline. Yeah, it's a huge hack. I wonder if we could look at the interrupt status and conditional'ize clearing DisINTx based on lack of a pending interrupt. It seems somewhat reasonable not to clear the bit masking the interrupt if we know it's pending and know there's no handler for it. I'll try to check if that's possible. Thanks, Alex