Date: Thu, 11 Aug 2022 13:35:45 +0100
Message-ID: <87mtcbufdq.wl-maz@kernel.org>
From: Marc Zyngier
To: Dmytro Maluka
Cc: "Dong, Eddie", "Christopherson,, Sean", Paolo Bonzini,
 "kvm@vger.kernel.org", Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 Dave Hansen, "x86@kernel.org", "H. Peter Anvin",
 "linux-kernel@vger.kernel.org", Eric Auger, Alex Williamson,
 "Liu, Rong L", Zhenyu Wang, Tomasz Nowicki, Grzegorz Jaszczyk,
 "upstream@semihalf.com", Dmitry Torokhov
Subject: Re: [PATCH v2 0/5] KVM: Fix oneshot interrupts forwarding
In-Reply-To: <3bdcda9f-ac2f-14df-2932-cf16912fe71b@semihalf.com>
References: <20220805193919.1470653-1-dmy@semihalf.com>
 <87o7wsbngz.wl-maz@kernel.org>
 <3bdcda9f-ac2f-14df-2932-cf16912fe71b@semihalf.com>

On Wed, 10 Aug 2022 18:06:53 +0100,
Dmytro Maluka wrote:
>
> Hi Marc,
>
> On 8/10/22 8:51 AM, Marc Zyngier wrote:
> > On Wed, 10 Aug 2022 00:30:29 +0100,
> > Dmytro Maluka wrote:
> >>
> >> On 8/9/22 10:01 PM, Dong, Eddie wrote:
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Dmytro Maluka
> >>>> Sent: Tuesday, August 9, 2022 12:24 AM
> >>>> To: Dong, Eddie; Christopherson,, Sean; Paolo Bonzini;
> >>>> kvm@vger.kernel.org
> >>>> Cc: Thomas Gleixner; Ingo Molnar; Borislav Petkov; Dave Hansen;
> >>>> x86@kernel.org; H. Peter Anvin; linux-kernel@vger.kernel.org;
> >>>> Eric Auger; Alex Williamson; Liu, Rong L; Zhenyu Wang;
> >>>> Tomasz Nowicki; Grzegorz Jaszczyk; upstream@semihalf.com;
> >>>> Dmitry Torokhov
> >>>> Subject: Re: [PATCH v2 0/5] KVM: Fix oneshot interrupts forwarding
> >>>>
> >>>> On 8/9/22 1:26 AM, Dong, Eddie wrote:
> >>>>>>
> >>>>>> The existing KVM mechanism for forwarding of level-triggered
> >>>>>> interrupts using resample eventfd doesn't work quite correctly in the
> >>>>>> case of interrupts that are handled in a Linux guest as oneshot
> >>>>>> interrupts (IRQF_ONESHOT). Such an interrupt is acked to the device
> >>>>>> in its threaded irq handler, i.e. later than it is acked to the
> >>>>>> interrupt controller (EOI at the end of hardirq), not earlier.
> >>>>>> The existing KVM code doesn't take that into account, which results in
> >>>>>> erroneous extra interrupts in the guest caused by premature re-assert
> >>>>>> of an unacknowledged IRQ by the host.
> >>>>>
> >>>>> Interesting... How it behaviors in native side?
> >>>>
> >>>> In native it behaves correctly, since Linux masks such a oneshot interrupt at the
> >>>> beginning of hardirq, so that the EOI at the end of hardirq doesn't result in its
> >>>> immediate re-assert, and then unmasks it later, after its threaded irq handler
> >>>> completes.
> >>>>
> >>>> In handle_fasteoi_irq():
> >>>>
> >>>> 	if (desc->istate & IRQS_ONESHOT)
> >>>> 		mask_irq(desc);
> >>>>
> >>>> 	handle_irq_event(desc);
> >>>>
> >>>> 	cond_unmask_eoi_irq(desc, chip);
> >>>>
> >>>>
> >>>> and later in unmask_threaded_irq():
> >>>>
> >>>> 	unmask_irq(desc);
> >>>>
> >>>> I also mentioned that in patch #3 description:
> >>>> "Linux keeps such interrupt masked until its threaded handler finishes, to
> >>>> prevent the EOI from re-asserting an unacknowledged interrupt.
> >>>
> >>> That makes sense. Can you include the full story in cover letter too?
> >>
> >> Ok, I will.
> >>
> >>>
> >>>
> >>>> However, with KVM + vfio (or whatever is listening on the resamplefd) we don't
> >>>> check that the interrupt is still masked in the guest at the moment of EOI.
> >>>> Resamplefd is notified regardless, so vfio prematurely unmasks the host
> >>>> physical IRQ, thus a new (unwanted) physical interrupt is generated in the host
> >>>> and queued for injection to the guest."
> >
> > Sorry to barge in pretty late in the conversation (just been Cc'd on
> > this), but why shouldn't the resamplefd be notified? If there has been
> > an EOI, a new level must be made visible to the guest interrupt
> > controller, no matter what the state of the interrupt masking is.
> >
> > Whether this new level is actually *presented* to a vCPU is another
> > matter entirely, and is arguably a problem for the interrupt
> > controller emulation.
> >
> > For example on arm64, we expect to be able to read the pending state
> > of an interrupt from the guest irrespective of the masking state of
> > that interrupt. Any change to the interrupt flow should preserve this.
>
> I'd like to understand the problem better, so could you please give some
> examples of cases where it is required/useful/desirable to read the
> correct pending state of a guest interrupt?

I'm not sure I understand the question. It is *always* desirable to
present the correct information to the guest.

For example, a guest could periodically poll the pending interrupt
registers and only enable interrupts that are pending. Is it a good
idea? No. Is it expected to work? Absolutely.

And yes, we go out of our way to make sure these things actually work,
because one day or another, you'll find a guest that does exactly
that.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
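
The distinction drawn above, between a level being visible to the
interrupt controller and an interrupt being presented to a vCPU, can be
illustrated with a toy model. This is only a sketch under stated
assumptions: struct toy_irqchip and the toy_* helpers are hypothetical
names invented for illustration and are not KVM, vfio or GIC code. The
pending bits track the line level regardless of masking, and only the
pending-and-enabled set is deliverable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_IRQS 32u

/* Hypothetical toy interrupt controller; not taken from KVM. */
struct toy_irqchip {
	uint32_t pending;	/* line level seen by the controller, one bit per IRQ */
	uint32_t enabled;	/* guest-controlled enable (unmask) bits */
};

/* Update the line level (e.g. after a resample); masking is not consulted. */
static void toy_set_level(struct toy_irqchip *c, unsigned int irq, bool level)
{
	assert(irq < NR_IRQS);
	if (level)
		c->pending |= UINT32_C(1) << irq;
	else
		c->pending &= ~(UINT32_C(1) << irq);
}

/* Guest read of the pending register: valid irrespective of masking. */
static uint32_t toy_read_pending(const struct toy_irqchip *c)
{
	return c->pending;
}

/* Only interrupts that are both pending and enabled reach the vCPU. */
static uint32_t toy_deliverable(const struct toy_irqchip *c)
{
	return c->pending & c->enabled;
}

/* The polling guest described above: enable exactly what is pending. */
static void toy_guest_poll(struct toy_irqchip *c)
{
	c->enabled = toy_read_pending(c);
}
```

In this model a masked but asserted line still shows up in
toy_read_pending(), so the polling guest can find it and enable it; if
the level were hidden while the interrupt is masked, that guest would
never see it.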