Subject: Re: [PATCH v2 0/5] KVM: Fix oneshot interrupts forwarding
To: Marc Zyngier, eric.auger@redhat.com
Cc: "Dong, Eddie", "Christopherson,, Sean", Paolo Bonzini,
    "kvm@vger.kernel.org", Thomas Gleixner, Ingo Molnar, Borislav Petkov,
    Dave Hansen, "x86@kernel.org", "H. Peter Anvin",
    "linux-kernel@vger.kernel.org", Alex Williamson, "Liu, Rong L",
    Zhenyu Wang, Tomasz Nowicki, Grzegorz Jaszczyk,
    "upstream@semihalf.com", Dmitry Torokhov
References: <20220805193919.1470653-1-dmy@semihalf.com>
    <87o7wsbngz.wl-maz@kernel.org>
    <8ff76b5e-ae28-70c8-2ec5-01662874fb15@redhat.com>
    <87r11ouu9y.wl-maz@kernel.org>
From: Dmytro Maluka
Message-ID: <72e40c17-e5cd-1ffd-9a38-00b47e1cbd8e@semihalf.com>
Date: Wed, 10 Aug 2022 19:02:29 +0200
In-Reply-To: <87r11ouu9y.wl-maz@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Marc,

On 8/10/22 3:01 PM, Marc Zyngier wrote:
> On Wed, 10 Aug 2022 09:12:18 +0100,
> Eric Auger wrote:
>>
>> Hi Marc,
>>
>> On 8/10/22 08:51, Marc Zyngier wrote:
>>> On Wed, 10 Aug 2022 00:30:29 +0100,
>>> Dmytro Maluka wrote:
>>>> On 8/9/22 10:01 PM, Dong, Eddie wrote:
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Dmytro Maluka
>>>>>> Sent: Tuesday, August 9, 2022 12:24 AM
>>>>>> To: Dong, Eddie; Christopherson,, Sean; Paolo Bonzini;
>>>>>> kvm@vger.kernel.org
>>>>>> Cc: Thomas Gleixner; Ingo Molnar; Borislav Petkov; Dave Hansen;
>>>>>> x86@kernel.org; H. Peter Anvin; linux-kernel@vger.kernel.org;
>>>>>> Eric Auger; Alex Williamson; Liu, Rong L; Zhenyu Wang;
>>>>>> Tomasz Nowicki; Grzegorz Jaszczyk; upstream@semihalf.com;
>>>>>> Dmitry Torokhov
>>>>>> Subject: Re: [PATCH v2 0/5] KVM: Fix oneshot interrupts forwarding
>>>>>>
>>>>>> On 8/9/22 1:26 AM, Dong, Eddie wrote:
>>>>>>>> The existing KVM mechanism for forwarding of level-triggered
>>>>>>>> interrupts using resample eventfd doesn't work quite correctly in the
>>>>>>>> case of interrupts that are handled in a Linux guest as oneshot
>>>>>>>> interrupts (IRQF_ONESHOT). Such an interrupt is acked to the device
>>>>>>>> in its threaded irq handler, i.e. later than it is acked to the
>>>>>>>> interrupt controller (EOI at the end of hardirq), not earlier. The
>>>>>>>> existing KVM code doesn't take that into account, which results in
>>>>>>>> erroneous extra interrupts in the guest caused by premature re-assert of an
>>>>>> unacknowledged IRQ by the host.
>>>>>>> Interesting... How it behaviors in native side?
>>>>>> In native it behaves correctly, since Linux masks such a oneshot interrupt at the
>>>>>> beginning of hardirq, so that the EOI at the end of hardirq doesn't result in its
>>>>>> immediate re-assert, and then unmasks it later, after its threaded irq handler
>>>>>> completes.
>>>>>>
>>>>>> In handle_fasteoi_irq():
>>>>>>
>>>>>>     if (desc->istate & IRQS_ONESHOT)
>>>>>>         mask_irq(desc);
>>>>>>
>>>>>>     handle_irq_event(desc);
>>>>>>
>>>>>>     cond_unmask_eoi_irq(desc, chip);
>>>>>>
>>>>>>
>>>>>> and later in unmask_threaded_irq():
>>>>>>
>>>>>>     unmask_irq(desc);
>>>>>>
>>>>>> I also mentioned that in patch #3 description:
>>>>>> "Linux keeps such interrupt masked until its threaded handler finishes, to
>>>>>> prevent the EOI from re-asserting an unacknowledged interrupt.
>>>>> That makes sense. Can you include the full story in cover letter too?
>>>> Ok, I will.
>>>>
>>>>>
>>>>>> However, with KVM + vfio (or whatever is listening on the resamplefd) we don't
>>>>>> check that the interrupt is still masked in the guest at the moment of EOI.
>>>>>> Resamplefd is notified regardless, so vfio prematurely unmasks the host
>>>>>> physical IRQ, thus a new (unwanted) physical interrupt is generated in the host
>>>>>> and queued for injection to the guest."
>>> Sorry to barge in pretty late in the conversation (just been Cc'd on
>>> this), but why shouldn't the resamplefd be notified? If there has been
>> yeah sorry to get you involved here ;-)
>
> No problem!
>
>>> an EOI, a new level must be made visible to the guest interrupt
>>> controller, no matter what the state of the interrupt masking is.
>>>
>>> Whether this new level is actually *presented* to a vCPU is another
>>> matter entirely, and is arguably a problem for the interrupt
>>> controller emulation.
>>
>> FWIU on guest EOI the physical line is still asserted so the pIRQ is
>> immediatly re-sampled by the interrupt controller (because the
>> resamplefd unmasked the physical IRQ) and recorded as a guest IRQ
>> (although it is masked at guest level). When the guest actually unmasks
>> the vIRQ we do not get a chance to re-evaluate the physical line level.
>
> Indeed, and maybe this is what should be fixed instead of moving the
> resampling point around (I was suggesting something along these lines
> in [1]).
>
> We already do this on arm64 for the timer, and it should be easy
> enough it generalise to any interrupt backed by the GIC (there is an
> in-kernel API to sample the pending state). No idea how that translate
> for other architectures though.

Actually I'm now thinking about changing the behavior implemented in my
patchset, which is:

1. If vEOI happens for a masked vIRQ, don't notify resamplefd, so that
   no new physical IRQ is generated, and the vIRQ is not set as pending.

2. After this vIRQ is unmasked by the guest, notify resamplefd.

to the following one:

1. If vEOI happens for a masked vIRQ, notify resamplefd as usual, but
   also remember this vIRQ as, let's call it, "pending oneshot".

2. A new physical IRQ is immediately generated, so the vIRQ is properly
   set as pending.

3. After the vIRQ is unmasked by the guest, check and find out that it
   is not just pending but also "pending oneshot", so don't deliver it
   to a vCPU. Instead, immediately notify resamplefd once again.

In other words, don't avoid extra physical interrupts in the host
(rather, use those extra interrupts for properly updating the pending
state of the vIRQ) but avoid propagating those extra interrupts to the
guest.

Does this sound reasonable to you?

Your suggestion to sample the pending state of the physical IRQ sounds
interesting too. But as you said, it's yet to be checked how feasible it
would be on architectures other than arm64. Also it assumes that the IRQ
in question is a forwarded physical interrupt, while I can imagine that
KVM's resamplefd could in principle also be useful for implementing
purely emulated interrupts.

Do you see any advantages of sampling the physical IRQ pending state
over remembering the "pending oneshot" state as described above?

>
>     M.
>
> [1] https://lore.kernel.org/r/87mtccbie4.wl-maz@kernel.org
>
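
For illustration, the three-step "pending oneshot" flow proposed above could be modeled roughly as below. This is only a toy user-space sketch, not KVM code: struct virq and every function name here are hypothetical, the "resamplefd listener" is reduced to a function call, and the sketch assumes that a vIRQ re-latched by the second resamplefd notification is then delivered normally (the mail does not spell that part out).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one level-triggered vIRQ. All names are hypothetical. */
struct virq {
	bool line_asserted;   /* device still drives the physical line */
	bool masked;          /* vIRQ masked by the guest (oneshot in flight) */
	bool pending;         /* vIRQ latched as pending in the virtual irqchip */
	bool pending_oneshot; /* pending was latched by an EOI of a masked vIRQ */
	int resample_count;   /* resamplefd notifications so far */
	int delivered_count;  /* interrupts actually presented to the vCPU */
};

/* Present the vIRQ to the vCPU if it is pending and not masked. */
static void try_deliver(struct virq *v)
{
	if (v->pending && !v->masked) {
		v->pending = false;
		v->delivered_count++;
	}
}

/* A physical IRQ fired in the host: latch it as pending for the guest. */
static void host_irq(struct virq *v)
{
	v->pending = true;
	try_deliver(v);
}

/*
 * Notify resamplefd: the listener (e.g. vfio) unmasks the host IRQ, so a
 * still-asserted line immediately produces a new physical interrupt.
 */
static void notify_resamplefd(struct virq *v)
{
	v->resample_count++;
	if (v->line_asserted)
		host_irq(v);
}

/* Steps 1 and 2: EOI always resamples; remember if the vIRQ was masked. */
static void guest_eoi(struct virq *v)
{
	if (v->masked)
		v->pending_oneshot = true;
	notify_resamplefd(v);
}

/* Step 3: on unmask, drop a "pending oneshot" vIRQ and resample again. */
static void guest_unmask(struct virq *v)
{
	v->masked = false;
	if (v->pending && v->pending_oneshot) {
		v->pending = false;
		v->pending_oneshot = false;
		notify_resamplefd(v);
		return;
	}
	v->pending_oneshot = false;
	try_deliver(v);
}

/*
 * One oneshot cycle: hardirq masks and EOIs, the threaded handler may ack
 * the device, then the guest unmasks. Returns interrupts seen by the vCPU.
 */
static int run_oneshot_cycle(bool device_acked)
{
	struct virq v = { .line_asserted = true };

	host_irq(&v);            /* first, genuine interrupt is delivered */
	v.masked = true;         /* guest masks at the start of hardirq */
	guest_eoi(&v);           /* EOI at the end of hardirq, line still high */
	assert(v.pending && v.pending_oneshot && v.resample_count == 1);
	if (device_acked)
		v.line_asserted = false; /* threaded handler acks the device */
	guest_unmask(&v);
	return v.delivered_count;
}
```

Under these assumptions, run_oneshot_cycle(true) yields one delivered interrupt, since the extra physical IRQ only updates the pending state and is dropped at unmask time. run_oneshot_cycle(false) yields two, because the line is still asserted when resamplefd is notified the second time.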