Received: by 2002:a05:6358:e9c4:b0:b2:91dc:71ab with SMTP id hc4csp6486550rwb; Tue, 9 Aug 2022 16:46:08 -0700 (PDT) X-Google-Smtp-Source: AA6agR55I5rhEbdKx2O3eJElcwgu+eNwLBkY8NetiJvx5kJkX/zl4muWaa+lStE96Q9d1KrmSZoQ X-Received: by 2002:a17:907:75fb:b0:730:9e1f:73f1 with SMTP id jz27-20020a17090775fb00b007309e1f73f1mr17704840ejc.495.1660088768325; Tue, 09 Aug 2022 16:46:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1660088768; cv=none; d=google.com; s=arc-20160816; b=KrlklgQyyI06u+Dgl5YFkk1UZDEl4NBUwkLQXgq7EYJlGMZpsmPQho3r91Cb/yKjw3 Z7Cx6J42ED3YpIa7JAEJYhTHj5/iI90y6EcaKXbuNVf/9ny13oREDLpk6St6Kd8gIGB1 sm7Z8FaiDRIvoKhDg4F3QODahNimWjkF6Hog8Pl7sQ/SOkF67L6rXtYqENtkaPSKFe0i VLOuREa729phTnQmhNDGlZyY+tOr9KZqz903c8bfYwJCiAzSA8fCggEtST9qevLOb7tw sut6RZPzBqO9tHOj87XxHKjIBcomhWOHV1S2rfegAvhlanVKT6An1uFnnw2TEnJfMCJ8 Lhdg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject:dkim-signature; bh=iZRwQQqJlrwLCsKGlbObwH22s/7VDIFNBtxo7IR0iGM=; b=fIJIZeGrWOFZyiIMwEHBp07EkM0+OvonO7E1WDokMiBFjjTIdekiz2PJqb4zpj9cVF UlpUvZI1XHVEbNRQssTalLbCPMnVWK1WmJqhAFecqEWdrVmnBTsHXsAVDXC5hzHTC3gP kZZs+Eo5g/vhc7BjGwLk+0qX7RbuX7SVuks0T1G8ENVvvgLqGaf+p719e7W6UWFoCS9B JBkqLxlWFpxuL5h70wykCO1xww8RF3RhnqmjHouwR4I/7qip9TwJh+z0J1EZeX8JRBEg gP/KrdDPwl735llTodhfmFHevbmuUuRTXUI+epq7s90MDXNjFdd2MBNxuNd5c9OMrYIp A7jA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@semihalf.com header.s=google header.b="phR66Nr/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=semihalf.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id qa42-20020a17090786aa00b0072b2e31d066si3297892ejc.642.2022.08.09.16.45.40; Tue, 09 Aug 2022 16:46:08 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@semihalf.com header.s=google header.b="phR66Nr/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=semihalf.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229539AbiHIXah (ORCPT + 99 others); Tue, 9 Aug 2022 19:30:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48804 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229480AbiHIXag (ORCPT ); Tue, 9 Aug 2022 19:30:36 -0400 Received: from mail-lf1-x133.google.com (mail-lf1-x133.google.com [IPv6:2a00:1450:4864:20::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 05B5D4B48F for ; Tue, 9 Aug 2022 16:30:34 -0700 (PDT) Received: by mail-lf1-x133.google.com with SMTP id z6so11568852lfu.9 for ; Tue, 09 Aug 2022 16:30:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=semihalf.com; s=google; h=content-transfer-encoding:content-language:in-reply-to:mime-version :user-agent:date:message-id:from:references:cc:to:subject:from:to:cc; bh=iZRwQQqJlrwLCsKGlbObwH22s/7VDIFNBtxo7IR0iGM=; b=phR66Nr/T+MuFGa2vBt/kFwbuJkQC6v36BuX7nbJpfOOFVL6bDB6NzutH535gBOgxG EJ6DsY13KJ/BobITbT+druSIqjWbJpVSOyiP0NSOQh/U+hPf/UaXLGgHxWhakaWZ/6/G O3eqhGdsvnOH6IhZEZeIUSYjAMCJDtRgDDCqfh7oStDhQJMlX4cfV7bnURNaCT4Bqhwx bYDUkyJWvVeB7kxY3xmrSxuVylGEMufLS78IXMI3EAYzewz9sqzVbxukFq54gZD7ZmQ4 aZAB95yWTBYeYECyKhTfzZNJ3V1mrIdFyg2L+MEz9Gf0XjH6GP3ETbQ0jwF8fnOQ/vHa qfAg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:content-language:in-reply-to:mime-version :user-agent:date:message-id:from:references:cc:to:subject :x-gm-message-state:from:to:cc; bh=iZRwQQqJlrwLCsKGlbObwH22s/7VDIFNBtxo7IR0iGM=; b=O7DM4p0qJaWbT8abiHtIj5t2DCQ6tHvdgmSgHKhsxU+w0EemuldH4RM5N7FO0b5elO O81HinP48zFFmgn1uSSkJ6dPxyIHzec98mGEdQAGQOzT+sP5+DReZStASnxh8tU42G2I 3w6ytzqJw3+SYccvZSe7D7gy8E/ArP6WP3euo8pafjP3moURYOuS0HabZp6Jq13B4/Es KCZMOXh+TKucQBVtRqooRctCy8U15cVFHcdFLfta1Yh0SBpxK+u+PJs9TmbMAlaiL0Fz mqftgWAfCdnlRGAIYYRlXs/bSbbvcx9nwfnVN1TGIUZyAj4WVJ1WT3h8V2zvgAFLKyDg bzYA== X-Gm-Message-State: ACgBeo1vVkU09OygnrHOUS+pPAISBBMz+psTYib/wlXmSTLvH3GsGx8w Ysgrro7S8/bPbcEOWHLPncB+2w== X-Received: by 2002:ac2:4832:0:b0:48b:1899:b364 with SMTP id 18-20020ac24832000000b0048b1899b364mr8928921lft.534.1660087832351; Tue, 09 Aug 2022 16:30:32 -0700 (PDT) Received: from ?IPv6:2a02:a31b:33d:9c00:463a:87e3:44fc:2b2f? ([2a02:a31b:33d:9c00:463a:87e3:44fc:2b2f]) by smtp.gmail.com with ESMTPSA id be36-20020a05651c172400b0025e48907929sm156503ljb.23.2022.08.09.16.30.30 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 09 Aug 2022 16:30:31 -0700 (PDT) Subject: Re: [PATCH v2 0/5] KVM: Fix oneshot interrupts forwarding To: "Dong, Eddie" , "Christopherson,, Sean" , Paolo Bonzini , "kvm@vger.kernel.org" Cc: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "x86@kernel.org" , "H. Peter Anvin" , "linux-kernel@vger.kernel.org" , Eric Auger , Alex Williamson , "Liu, Rong L" , Zhenyu Wang , Tomasz Nowicki , Grzegorz Jaszczyk , "upstream@semihalf.com" , Dmitry Torokhov , Marc Zyngier References: <20220805193919.1470653-1-dmy@semihalf.com> From: Dmytro Maluka Message-ID: Date: Wed, 10 Aug 2022 01:30:29 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 8/9/22 10:01 PM, Dong, Eddie wrote: > > >> -----Original Message----- >> From: Dmytro Maluka >> Sent: Tuesday, August 9, 2022 12:24 AM >> To: Dong, Eddie ; Christopherson,, Sean >> ; Paolo Bonzini ; >> kvm@vger.kernel.org >> Cc: Thomas Gleixner ; Ingo Molnar ; >> Borislav Petkov ; Dave Hansen ; >> x86@kernel.org; H. Peter Anvin ; linux- >> kernel@vger.kernel.org; Eric Auger ; Alex >> Williamson ; Liu, Rong L ; >> Zhenyu Wang ; Tomasz Nowicki >> ; Grzegorz Jaszczyk ; >> upstream@semihalf.com; Dmitry Torokhov >> Subject: Re: [PATCH v2 0/5] KVM: Fix oneshot interrupts forwarding >> >> On 8/9/22 1:26 AM, Dong, Eddie wrote: >>>> >>>> The existing KVM mechanism for forwarding of level-triggered >>>> interrupts using resample eventfd doesn't work quite correctly in the >>>> case of interrupts that are handled in a Linux guest as oneshot >>>> interrupts (IRQF_ONESHOT). Such an interrupt is acked to the device >>>> in its threaded irq handler, i.e. later than it is acked to the >>>> interrupt controller (EOI at the end of hardirq), not earlier. The >>>> existing KVM code doesn't take that into account, which results in >>>> erroneous extra interrupts in the guest caused by premature re-assert of an >> unacknowledged IRQ by the host. >>> >>> Interesting... How it behaviors in native side? >> >> In native it behaves correctly, since Linux masks such a oneshot interrupt at the >> beginning of hardirq, so that the EOI at the end of hardirq doesn't result in its >> immediate re-assert, and then unmasks it later, after its threaded irq handler >> completes. >> >> In handle_fasteoi_irq(): >> >> if (desc->istate & IRQS_ONESHOT) >> mask_irq(desc); >> >> handle_irq_event(desc); >> >> cond_unmask_eoi_irq(desc, chip); >> >> >> and later in unmask_threaded_irq(): >> >> unmask_irq(desc); >> >> I also mentioned that in patch #3 description: >> "Linux keeps such interrupt masked until its threaded handler finishes, to >> prevent the EOI from re-asserting an unacknowledged interrupt. > > That makes sense. Can you include the full story in cover letter too? Ok, I will. > > >> However, with KVM + vfio (or whatever is listening on the resamplefd) we don't >> check that the interrupt is still masked in the guest at the moment of EOI. >> Resamplefd is notified regardless, so vfio prematurely unmasks the host >> physical IRQ, thus a new (unwanted) physical interrupt is generated in the host >> and queued for injection to the guest." >> > > Emulation of level triggered IRQ is a pain point ☹ > I read we need to emulate the "level" of the IRQ pin (connecting from device to IRQchip, i.e. ioapic here). > Technically, the guest can change the polarity of vIOAPIC, which will lead to a new virtual IRQ > even w/o host side interrupt. Thanks, interesting point. Do you mean that this behavior (a new vIRQ as a result of polarity change) may already happen with the existing KVM code? It doesn't seem so to me. AFAICT, KVM completely ignores the vIOAPIC polarity bit, in particular it doesn't handle change of the polarity by the guest (i.e. doesn't update the virtual IRR register, and so on), so it shouldn't result in a new interrupt. Since commit 100943c54e09 ("kvm: x86: ignore ioapic polarity") there seems to be an assumption that KVM interpretes the IRQ level value as active (asserted) vs inactive (deasserted) rather than high vs low, i.e. the polarity doesn't matter to KVM. So, since both sides (KVM emulating the IOAPIC, and vfio/whatever emulating an external interrupt source) seem to operate on a level of abstraction of "asserted" vs "de-asserted" interrupt state regardless of the polarity, and that seems not a bug but a feature, it seems that we don't need to emulate the IRQ level as such. Or am I missing something? OTOH, I guess this means that the existing KVM's emulation of level-triggered interrupts is somewhat limited (a guest may legitimately expect an interrupt fired as a result of polarity change, and that case is not supported by KVM). But that is rather out of scope of the oneshot interrupts issue addressed by this patchset. > "pending" field of kvm_kernel_irqfd_resampler in patch 3 means more an event rather than an interrupt level. > > >>> >>>> >>>> This patch series fixes this issue (for now on x86 only) by checking >>>> if the interrupt is unmasked when we receive irq ack (EOI) and, in >>>> case if it's masked, postponing resamplefd notify until the guest unmasks it. >>>> >>>> Patches 1 and 2 extend the existing support for irq mask notifiers in >>>> KVM, which is a prerequisite needed for KVM irqfd to use mask >>>> notifiers to know when an interrupt is masked or unmasked. >>>> >>>> Patch 3 implements the actual fix: postponing resamplefd notify in >>>> irqfd until the irq is unmasked. >>>> >>>> Patches 4 and 5 just do some optional renaming for consistency, as we >>>> are now using irq mask notifiers in irqfd along with irq ack notifiers. >>>> >>>> Please see individual patches for more details. >>>> >>>> v2: >>>> - Fixed compilation failure on non-x86: mask_notifier_list moved from >>>> x86 "struct kvm_arch" to generic "struct kvm". >>>> - kvm_fire_mask_notifiers() also moved from x86 to generic code, even >>>> though it is not called on other architectures for now. >>>> - Instead of kvm_irq_is_masked() implemented >>>> kvm_register_and_fire_irq_mask_notifier() to fix potential race >>>> when reading the initial IRQ mask state. >>>> - Renamed for clarity: >>>> - irqfd_resampler_mask() -> irqfd_resampler_mask_notify() >>>> - kvm_irq_has_notifier() -> kvm_irq_has_ack_notifier() >>>> - resampler->notifier -> resampler->ack_notifier >>>> - Reorganized code in irqfd_resampler_ack() and >>>> irqfd_resampler_mask_notify() to make it easier to follow. >>>> - Don't follow unwanted "return type on separate line" style for >>>> irqfd_resampler_mask_notify(). >>>> >>>> Dmytro Maluka (5): >>>> KVM: x86: Move irq mask notifiers from x86 to generic KVM >>>> KVM: x86: Add kvm_register_and_fire_irq_mask_notifier() >>>> KVM: irqfd: Postpone resamplefd notify for oneshot interrupts >>>> KVM: irqfd: Rename resampler->notifier >>>> KVM: Rename kvm_irq_has_notifier() >>>> >>>> arch/x86/include/asm/kvm_host.h | 17 +--- >>>> arch/x86/kvm/i8259.c | 6 ++ >>>> arch/x86/kvm/ioapic.c | 8 +- >>>> arch/x86/kvm/ioapic.h | 1 + >>>> arch/x86/kvm/irq_comm.c | 74 +++++++++++------ >>>> arch/x86/kvm/x86.c | 1 - >>>> include/linux/kvm_host.h | 21 ++++- >>>> include/linux/kvm_irqfd.h | 16 +++- >>>> virt/kvm/eventfd.c | 136 ++++++++++++++++++++++++++++---- >>>> virt/kvm/kvm_main.c | 1 + >>>> 10 files changed, 221 insertions(+), 60 deletions(-) >>>> >>>> -- >>>> 2.37.1.559.g78731f0fdb-goog >>>