Received: by 2002:a05:6358:1087:b0:cb:c9d3:cd90 with SMTP id j7csp2677061rwi; Fri, 21 Oct 2022 06:45:21 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7ebjkjAAgQ7Rz8rvcXCItMRDGiFlQ0VSZHAyGnd0XD6I1zmsY70nQp7AP7jgZXgdTXtBdD X-Received: by 2002:a17:90b:1e4d:b0:212:c6f4:2cff with SMTP id pi13-20020a17090b1e4d00b00212c6f42cffmr6450542pjb.240.1666359910925; Fri, 21 Oct 2022 06:45:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666359910; cv=none; d=google.com; s=arc-20160816; b=WRjL9FfeWkd1tRxMnucorpx/d7Kecub21qY7wzyizldToRXt7vMvQ29CqtNjzEDcnq GYZC7w2nV8XmIEzTX+aw88aiGBNtNMX2UhMMyOqVdrqZU1y83GNIOJ8sP8dGNrXPsmhE LGskm1twbhMEWKRzhSwe6/OWXuCctxMAlLPIFz/ucUo/jR96Bc2q/jZVzHn8vav4SrQV PQz41Bb94BRjRgqcGmojDbdk/NDJEW8NcCDCM752vA4P2QfHWn2tNq9frXLuVkiUEKWx 2rxelN/9kL5+UDTNtRkhMTpw7HJ8wbVIL+WeRjFnQQ8V5MxBl3fwNmTTPtH6bIv6l4Jv hHcQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:content-language:subject:user-agent:mime-version :date:message-id:dkim-signature; bh=EDUhBcAfNoGdnxG5oszokQRes8Xbrb0tEctEqEYIWVM=; b=CxIhcm8KLQXRhtBuVwVOPi/f+DRbjuRv931qvv9nkwuqeu3uyy+cxA/IGgXk60O6Kt tYS566uBDE7CmOgnxzc+UXpcBiva505ZDqF58ETWlKmqG5hRnj+c8IqOQ1vuiqOTDH3t yp8bdo+9g5lyfkqesQBde1j7igbsvIDdNU2u8ph8XOQ+H4ODJzZZy7xR93sDRMXydbQs srnCOJtl2uD93i26KW+ZMD058GVUOpVcayls61DpUylBrvheDITxss3T5MIlINee/ebu 0r+QD9nFj6ctePdA3n38c3znvSX9bLZgAD9Ch2z/aOqN19UkSv+IAdXetlDBbhUTLMll hVPg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@semihalf.com header.s=google header.b=IMju6rHp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=semihalf.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id i10-20020a170902c94a00b00186605b7528si2010303pla.185.2022.10.21.06.44.58; Fri, 21 Oct 2022 06:45:10 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@semihalf.com header.s=google header.b=IMju6rHp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=semihalf.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230509AbiJUNbp (ORCPT + 99 others); Fri, 21 Oct 2022 09:31:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44830 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229959AbiJUNbm (ORCPT ); Fri, 21 Oct 2022 09:31:42 -0400 Received: from mail-lj1-x22e.google.com (mail-lj1-x22e.google.com [IPv6:2a00:1450:4864:20::22e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7946E2639CB for ; Fri, 21 Oct 2022 06:31:37 -0700 (PDT) Received: by mail-lj1-x22e.google.com with SMTP id h8so3655787lja.11 for ; Fri, 21 Oct 2022 06:31:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=semihalf.com; s=google; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=EDUhBcAfNoGdnxG5oszokQRes8Xbrb0tEctEqEYIWVM=; b=IMju6rHpngmVNILCsj80Ahm46IPzd0gix6kSxPd4dMjE0kWkNqedCVZv+cgD0IpnnS /xAxQTGvl8vN7TgGx44fb20K4rVUA84my6D4N7rR3/hL//yQfqIZ54uYhQfR0zSNot8C K+SK8f1EdT5RQ1sOCQtfHWCnMU9deTxF8pAvNEm9Ewp9A8psMH7PyWOY3rruGPab2ai4 MMEQarEJTKasY1diSn4lTWJN03oCH3y6AQCWu2eg77yPDeZ1Tep4HzaKHKEO5jp6mjRq JXserd7++TqzHJ7tetmnPsyys0CjgTdtqm7AX357lZ1dHPoVJnt6DeEX11l8QtQT307s DrTg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=EDUhBcAfNoGdnxG5oszokQRes8Xbrb0tEctEqEYIWVM=; b=Z8lUrny08K+ERg2d+oD8mNc3GaUeMNaefEVZmebhr/1yCv1+Vz+wgMIcgyKE2DXLPs h23LflO2TJclUOx1iFRixK1WjZ15WA81tLuAYRzSHNfadAJ5Mc61hau24Tt9sAqXAzWE ERiNAe6PtxHku7rD4UnctLeReG5XfiCzwxpFCH4kMPpEGadV4I6X56x3FIOG4DW/mbcS TCTMdOqqHsZt6SD4IJnDJlqWrayPs5sfqwbkVfcSQwRztRRoJCbAGDmt/ZdlOFR0uK08 LP5X/xqlWNc8dpcBnLDTqW31TOWSbzDQ+uhV2jIqX1uh4BahPdcOjF6exuGXEczDc1D+ qaZg== X-Gm-Message-State: ACrzQf2lA+TdPg0ZIiiSjcnrzerIK3uFewsjs+NU+PUG6M5DDYDJDImG YTbTZka2Fo7trQqrkW4GRx32cw== X-Received: by 2002:a2e:b16b:0:b0:26e:9408:8301 with SMTP id a11-20020a2eb16b000000b0026e94088301mr6777593ljm.523.1666359095793; Fri, 21 Oct 2022 06:31:35 -0700 (PDT) Received: from [10.43.111.113] ([83.142.187.86]) by smtp.gmail.com with ESMTPSA id s5-20020a05651c048500b0026fd3f18ac6sm3410029ljc.33.2022.10.21.06.31.34 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 21 Oct 2022 06:31:35 -0700 (PDT) Message-ID: Date: Fri, 21 Oct 2022 15:31:32 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.13.0 Subject: Re: [PATCH v3 0/2] KVM: x86/ioapic: Fix oneshot interrupts forwarding Content-Language: en-US To: Sean Christopherson , Paolo Bonzini , kvm@vger.kernel.org Cc: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , linux-kernel@vger.kernel.org, Marc Zyngier , Eric Auger , Alex Williamson , Rong L Liu , Zhenyu Wang , Tomasz Nowicki , Grzegorz Jaszczyk , upstream@semihalf.com, Dmitry Torokhov , "Dong, Eddie" References: <20220818202701.3314045-1-dmy@semihalf.com> From: Dmytro Maluka In-Reply-To: <20220818202701.3314045-1-dmy@semihalf.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Paolo, Marc, Eric and others, Did you have a chance to take a look at this v3 patchset? Just to remind the context: this patchset implements a solution I proposed in [1], only that it doesn't introduce the additional "pending oneshot" state (which was Marc's concern in [2]): I realized that this extra state is not really needed, as we can just unconditionally notify the resamplefd once again at unmask, with the same end result. [1] https://lore.kernel.org/kvm/72e40c17-e5cd-1ffd-9a38-00b47e1cbd8e@semihalf.com/ [2] https://lore.kernel.org/kvm/87o7wrug0w.wl-maz@kernel.org/ On 8/18/22 22:26, Dmytro Maluka wrote: > KVM irqfd based emulation of level-triggered interrupts doesn't work > quite correctly in some cases, particularly in the case of interrupts > that are handled in a Linux guest as oneshot interrupts (IRQF_ONESHOT). > Such an interrupt is acked to the device in its threaded irq handler, > i.e. later than it is acked to the interrupt controller (EOI at the end > of hardirq), not earlier. > > Linux keeps such interrupt masked until its threaded handler finishes, > to prevent the EOI from re-asserting an unacknowledged interrupt. > However, with KVM + vfio (or whatever is listening on the resamplefd) > we always notify resamplefd at the EOI, so vfio prematurely unmasks the > host physical IRQ, thus a new physical interrupt is fired in the host. > This extra interrupt in the host is not a problem per se. The problem is > that it is unconditionally queued for injection into the guest, so the > guest sees an extra bogus interrupt. [*] > > There are observed at least 2 user-visible issues caused by those > extra erroneous interrupts for a oneshot irq in the guest: > > 1. System suspend aborted due to a pending wakeup interrupt from > ChromeOS EC (drivers/platform/chrome/cros_ec.c). > 2. Annoying "invalid report id data" errors from ELAN0000 touchpad > (drivers/input/mouse/elan_i2c_core.c), flooding the guest dmesg > every time the touchpad is touched. > > The core issue here is that by the time when the guest unmasks the IRQ, > the physical IRQ line is no longer asserted (since the guest has > acked the interrupt to the device in the meantime), yet we > unconditionally inject the interrupt queued into the guest by the > previous resampling. So to fix the issue, we need a way to detect that > the IRQ is no longer pending, and cancel the queued interrupt in this > case. > > With IOAPIC we are not able to probe the physical IRQ line state > directly (at least not if the underlying physical interrupt controller > is an IOAPIC too), so in this patch series we use irqfd resampler for > that. Namely, instead of injecting the queued interrupt, we just notify > the resampler that this interrupt is done. If the IRQ line is actually > already deasserted, we are done. If it is still asserted, a new > interrupt will be shortly triggered through irqfd and injected into the > guest. > > In the case if there is no irqfd resampler registered for this IRQ, we > cannot fix the issue, so we keep the existing behavior: immediately > unconditionally inject the queued interrupt. > > This patch series fixes the issue for x86 IOAPIC only. In the long run, > we can fix it for other irqchips and other architectures too, possibly > taking advantage of reading the physical state of the IRQ line, which is > possible with some other irqchips (e.g. with arm64 GIC, maybe even with > the legacy x86 PIC). > > [*] In this description we assume that the interrupt is a physical host > interrupt forwarded to the guest e.g. by vfio. Potentially the same > issue may occur also with a purely virtual interrupt from an > emulated device, e.g. if the guest handles this interrupt, again, as > a oneshot interrupt. > > > v3: > - Completely reworked: instead of postponing resamplefd notify until > unmask (to avoid extra interrupts in the host), resample the pending > status at unmask to avoid erroneous propagation of those extra > interrupts to the guest. > Thanks to Marc Zyngier for helping to identify the core issue, which > resulted in a simpler and probably more sensible implementation > (even though Marc's concern about presenting inaccurate pending > status to the guest is a non-issue in the case of IOAPIC, since > IOAPIC doesn't present this information anyway). > > v2: > - Fixed compilation failure on non-x86: mask_notifier_list moved from > x86 "struct kvm_arch" to generic "struct kvm". > - kvm_fire_mask_notifiers() also moved from x86 to generic code, even > though it is not called on other architectures for now. > - Instead of kvm_irq_is_masked() implemented > kvm_register_and_fire_irq_mask_notifier() to fix potential race > when reading the initial IRQ mask state. > - Renamed for clarity: > - irqfd_resampler_mask() -> irqfd_resampler_mask_notify() > - kvm_irq_has_notifier() -> kvm_irq_has_ack_notifier() > - resampler->notifier -> resampler->ack_notifier > - Reorganized code in irqfd_resampler_ack() and > irqfd_resampler_mask_notify() to make it easier to follow. > - Don't follow unwanted "return type on separate line" style for > irqfd_resampler_mask_notify(). > > Dmytro Maluka (2): > KVM: irqfd: Make resampler_list an RCU list > KVM: x86/ioapic: Resample the pending state of an IRQ when unmasking > > arch/x86/kvm/ioapic.c | 36 ++++++++++++++++++++++++++++-- > include/linux/kvm_host.h | 9 ++++++++ > include/linux/kvm_irqfd.h | 2 +- > virt/kvm/eventfd.c | 47 ++++++++++++++++++++++++++++++++------- > 4 files changed, 83 insertions(+), 11 deletions(-) >