Received: by 2002:a05:7412:8d10:b0:f3:1519:9f41 with SMTP id bj16csp1227489rdb; Wed, 6 Dec 2023 12:09:34 -0800 (PST) X-Google-Smtp-Source: AGHT+IGPvGP1LPD3/AGbp3M3Jqfb+ekamUp5hEcDDoIUQX70eFk3WTLmhWvPJcjSPzmUQwFVo5Xd X-Received: by 2002:a05:6a00:14ca:b0:6cb:8c70:4790 with SMTP id w10-20020a056a0014ca00b006cb8c704790mr1321883pfu.1.1701893373764; Wed, 06 Dec 2023 12:09:33 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1701893373; cv=none; d=google.com; s=arc-20160816; b=k3IG8E4WjSAymrKh4ZorwhHNg0rI3E6YkqmumbSaQEg7mwEnIJdEQaZ69qxkyMKpa3 kIL/oyRq/tt6rno6nuR8PREwRGkPx23D+0M3/9nezE8m3CzG+oArlgAdwIGkPfe9oyMq awE29F4+mFe17ykUyO8gd7kUOfsYGnoF4FfYTUOPxWrvKnOJGrc/Ipz69BDgqwzT0aoF CxeFzok0bSTSL3IQOwi55fHGg6P8BkuaRGgZEiopDHGcLNn1OACvXv6ZhPOMaCZQHcDp KNBxHAASgKXOvafbd43EtycEyDOfEwRF/h9ge2jJYI/u2ZOJt/8vRU0mhpWHsINfRX0O J5uQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:message-id:date:references :in-reply-to:subject:cc:to:dkim-signature:dkim-signature:from; bh=7dbW0vrOnX4IbvYPSelJWPZTu1xdvo3BzQO97dlKHxY=; fh=zmmCNmWX/qYnCe8iaDOJ+ow6H2W6Ybs9ZtfrTe1mIv0=; b=qxZU9z486DCUVdofyUHYTCjC411AcAppVOeOXLPpbemIcHBocOYrsbvUGExIRg1SXN rouFvm2zvCRDXX608QH93QmlvGTbq2TrbA2j8wZsf/bE9dcKRfl2frLY9VOif/F/S3ep TbvPDIx/qackcbBLdaiMtjXMWSTJNVI3ODppoaF/AaLr+JjPqLYm2NW6UTKgu42fHqQ2 IRGjsZ0g7wIEBKr7/CpxheFm1qNyqPnWnBsYLNKbTV1dwMSn6oQMRGAbZpMjusrV5OhB w+FOppnGus/wvOIePlKdtD3A0lghFg+ntcnJ0PbO4jymutgOITfrNLMcqBkOO2lHaiyu /UEw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=qeV8sL4N; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Return-Path: Received: from groat.vger.email (groat.vger.email. [23.128.96.35]) by mx.google.com with ESMTPS id t24-20020a62d158000000b006cb997a7799si454242pfl.67.2023.12.06.12.09.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Dec 2023 12:09:33 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) client-ip=23.128.96.35; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=qeV8sL4N; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by groat.vger.email (Postfix) with ESMTP id 8A05D80263A2; Wed, 6 Dec 2023 12:09:29 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at groat.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1442862AbjLFUJO (ORCPT + 99 others); Wed, 6 Dec 2023 15:09:14 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57656 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1442850AbjLFUJN (ORCPT ); Wed, 6 Dec 2023 15:09:13 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8E8FDC6; Wed, 6 Dec 2023 12:09:19 -0800 (PST) From: Thomas Gleixner DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1701893358; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=7dbW0vrOnX4IbvYPSelJWPZTu1xdvo3BzQO97dlKHxY=; b=qeV8sL4NtDdpC+v98nxJ4YFKVOOWfHp2g6n3wWz51mR3auxn6LBI0HtkY4MvRXiDWwNVoP wNpoHq/gBo82SQ6AAblKYvofi0cnZSmpkGq+AJgot54INj94Wsb7mnVGhhS7TWa4aBA0ro XZh/QZPNCZgbSR68EModhMZezBhyZYqi50JYNEbFW7YGIoxwicQc6OW327a/9s0kmYvbIt L6VsFYRDQfBpTt/alXFH1z9XxlwgKo6yaVs477xVLupTh6J/7Pwzlb1NcwQpJMdVuc+oN8 98LsEDjHF5IjePvq9BSSpYXSKNgCObJMxVu8faYOVYl54tC684CMmgwiz2fObg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1701893358; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=7dbW0vrOnX4IbvYPSelJWPZTu1xdvo3BzQO97dlKHxY=; b=wO9GqRfsT0j4miTOei94cHhFBtNkngyWls/haTanPD0XtoG5yo1CZiFdGQTTKhwHhkTvkZ +Egma6PkYZuw3mBQ== To: Jacob Pan , LKML , X86 Kernel , iommu@lists.linux.dev, Lu Baolu , kvm@vger.kernel.org, Dave Hansen , Joerg Roedel , "H. Peter Anvin" , Borislav Petkov , Ingo Molnar Cc: Raj Ashok , "Tian, Kevin" , maz@kernel.org, peterz@infradead.org, seanjc@google.com, Robin Murphy , Jacob Pan Subject: Re: [PATCH RFC 10/13] x86/irq: Handle potential lost IRQ during migration and CPU offline In-Reply-To: <20231112041643.2868316-11-jacob.jun.pan@linux.intel.com> References: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> <20231112041643.2868316-11-jacob.jun.pan@linux.intel.com> Date: Wed, 06 Dec 2023 21:09:17 +0100 Message-ID: <87a5qnum8i.ffs@tglx> MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on groat.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (groat.vger.email [0.0.0.0]); Wed, 06 Dec 2023 12:09:29 -0800 (PST) On Sat, Nov 11 2023 at 20:16, Jacob Pan wrote: > Though IRTE modification for IRQ affinity change is a atomic operation, > it does not guarantee the timing of IRQ posting at PID. No acronyms please. > considered the following scenario: > Device system agent iommu memory CPU/LAPIC > 1 FEEX_XXXX > 2 Interrupt request > 3 Fetch IRTE -> > 4 ->Atomic Swap PID.PIR(vec) > Push to Global Observable(GO) > 5 if (ON*) > i done;* > else > 6 send a notification -> > > * ON: outstanding notification, 1 will suppress new notifications > > If IRQ affinity change happens between 3 and 5 in IOMMU, old CPU's PIR could > have pending bit set for the vector being moved. We must check PID.PIR > to prevent the lost of interrupts. We must check nothing. We must ensure that the code is correct, right? > Suggested-by: Thomas Gleixner > Signed-off-by: Jacob Pan > --- > arch/x86/kernel/apic/vector.c | 8 +++++++- > arch/x86/kernel/irq.c | 20 +++++++++++++++++--- > 2 files changed, 24 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c > index 319448d87b99..14fc33cfdb37 100644 > --- a/arch/x86/kernel/apic/vector.c > +++ b/arch/x86/kernel/apic/vector.c > @@ -19,6 +19,7 @@ > #include > #include > #include > +#include > #include > > #include > @@ -978,9 +979,14 @@ static void __vector_cleanup(struct vector_cleanup *cl, bool check_irr) > * Do not check IRR when called from lapic_offline(), because > * fixup_irqs() was just called to scan IRR for set bits and > * forward them to new destination CPUs via IPIs. > + * > + * If the vector to be cleaned is delivered as posted intr, > + * it is possible that the interrupt has been posted but > + * not made to the IRR due to coalesced notifications. not made to? > + * Therefore, check PIR to see if the interrupt was posted. > */ > irr = check_irr ? apic_read(APIC_IRR + (vector / 32 * 0x10)) : 0; > - if (irr & (1U << (vector % 32))) { > + if (irr & (1U << (vector % 32)) || is_pi_pending_this_cpu(vector)) { The comment above this code clearly explains what check_irr is about. Why would the PIR pending check have different rules? Just because its PIR, right? > > +/* > + * Check if a given vector is pending in APIC IRR or PIR if posted interrupt > + * is enabled for coalesced interrupt delivery (CID). > + */ > +static inline bool is_vector_pending(unsigned int vector) > +{ > + unsigned int irr; > + > + irr = apic_read(APIC_IRR + (vector / 32 * 0x10)); > + if (irr & (1 << (vector % 32))) > + return true; > + > + return is_pi_pending_this_cpu(vector); > +} Why is this outside of the #ifdef region? Just because there was space to put it, right? And of course we need the same thing open coded in two locations. What's wrong with using this inline function in __vector_cleanup() too? if (check_irr && vector_is_pending(vector)) { pr_warn_once(...); .... } That would make the logic of __vector_cleanup() correct _AND_ share the code.