From: Thomas Gleixner
To: "Raj, Ashok"
Cc: "Raj, Ashok", Evan Green, Mathias Nyman, x86@kernel.org,
 linux-pci, LKML, Bjorn Helgaas, "Ghorai, Sukumar",
 "Amara, Madhusudanarao", "Nandamuri, Srikanth", Ashok Raj
Subject: Re: MSI interrupt for xhci still lost on 5.6-rc6 after cpu hotplug
In-Reply-To: <20200507175715.GA22426@otc-nc-03>
References: <20200501184326.GA17961@araj-mobl1.jf.intel.com>
 <878si6rx7f.fsf@nanos.tec.linutronix.de>
 <20200505201616.GA15481@otc-nc-03>
 <875zdarr4h.fsf@nanos.tec.linutronix.de>
 <20200507121850.GB85463@otc-nc-03>
 <87wo5nj48a.fsf@nanos.tec.linutronix.de>
 <20200507175715.GA22426@otc-nc-03>
Date: Thu, 07 May 2020 21:41:36 +0200
Message-ID: <87blmzedn3.fsf@nanos.tec.linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

Ashok,

"Raj, Ashok" writes:
>
> I think i got mixed up with logical apic id and logical cpu :-(

Stuff happens.

>     <idle>-0     [000] d.h.  44.376659: msi_set_affinity: quirk[1] new vector allocated, new apic = 2 vector = 33 this apic = 0
>     <idle>-0     [000] d.h.  44.376684: msi_set_affinity: Direct Update: irq 123 Ovec=33 Oapic 0 Nvec 33 Napic 2
>     <idle>-0     [000] d.h.  44.376685: xhci_irq: xhci irq
>     <idle>-0     [001] d.h.  44.376750: msi_set_affinity: quirk[1] new vector allocated, new apic = 2 vector = 33 this apic = 2
>     <idle>-0     [001] d.h.  44.376774: msi_set_affinity: Direct Update: irq 123 Ovec=33 Oapic 2 Nvec 33 Napic 2
>     <idle>-0     [001] d.h.  44.376776: xhci_irq: xhci irq
>     <idle>-0     [001] d.h.  44.395824: xhci_irq: xhci irq
>     <...>-14     [001] d..1  44.400666: msi_set_affinity: quirk[1] new vector allocated, new apic = 6 vector = 33 this apic = 2
>     <...>-14     [001] d..1  44.400691: msi_set_affinity: Direct Update: irq 123 Ovec=33 Oapic 2 Nvec 33 Napic 6
>     <idle>-0     [003] d.h.  44.421021: xhci_irq: xhci irq
>     <idle>-0     [003] d.h.  44.421135: xhci_irq: xhci irq
> migration/3-24 [003] d..1  44.421784: msi_set_affinity: quirk[1] new vector allocated, new apic = 0 vector = 33 this apic = 6
> migration/3-24 [003] d..1  44.421803: msi_set_affinity: Direct Update: irq 123 Ovec=33 Oapic 6 Nvec 33 Napic 0

So this last one is a direct update. Straightforward: it moves from one
CPU to the other on the same vector number.

And that's the case where we either expect the interrupt to come in on
CPU3 or on CPU0.

There is actually an example in the trace:

    <idle>-0     [000] d.h.  40.616467: msi_set_affinity: quirk[1] new vector allocated, new apic = 2 vector = 33 this apic = 0
    <idle>-0     [000] d.h.  40.616488: msi_set_affinity: Direct Update: irq 123 Ovec=33 Oapic 0 Nvec 33 Napic 2
    <idle>-0     [000] d.h.  40.616488: xhci_irq: xhci irq
    <idle>-0     [001] d.h.  40.616504: xhci_irq: xhci irq

> migration/3-24 [003] d..1  44.421784: msi_set_affinity: quirk[1] new vector allocated, new apic = 0 vector = 33 this apic = 6
> migration/3-24 [003] d..1  44.421803: msi_set_affinity: Direct Update: irq 123 Ovec=33 Oapic 6 Nvec 33 Napic 0

But as this last one is the migration thread, aka stomp machine, I
assume this is a hotplug operation. Which means the CPU cannot handle
interrupts anymore. In that case we check the old vector on the
unplugged CPU in fixup_irqs() and do the retrigger from there.

Can you please add tracing to that one as well?

Thanks,

        tglx
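To make the question above easier to check against a longer trace, here is a rough sketch of a script that pairs each "Direct Update" line with the CPUs on which xhci_irq fired before the next update. It is not part of the kernel or of the debug patch. The function name pair_updates_with_irqs and the SAMPLE data are made up for illustration, and the regular expressions assume the ad-hoc trace format quoted in this thread rather than any stable tracepoint format. It reports CPU numbers and APIC IDs separately and does not try to map one to the other.

```python
import re

# Assumed format: the ad-hoc msi_set_affinity debug lines quoted in this thread.
UPDATE_RE = re.compile(
    r"\[(?P<cpu>\d+)\]\s+\S+\s+(?P<ts>\d+\.\d+): msi_set_affinity: "
    r"Direct Update: irq (?P<irq>\d+) Ovec=(?P<ovec>\d+) Oapic (?P<oapic>\d+) "
    r"Nvec (?P<nvec>\d+) Napic (?P<napic>\d+)"
)
# Assumed format: the xhci_irq debug line quoted in this thread.
IRQ_RE = re.compile(r"\[(?P<cpu>\d+)\]\s+\S+\s+(?P<ts>\d+\.\d+): xhci_irq:")


def pair_updates_with_irqs(lines):
    """Group xhci_irq events under the most recent Direct Update line."""
    results = []
    current = None
    for line in lines:
        m = UPDATE_RE.search(line)
        if m:
            current = {
                "ts": float(m["ts"]),
                "update_cpu": int(m["cpu"]),   # CPU that wrote the MSI message
                "old_apic": int(m["oapic"]),
                "new_apic": int(m["napic"]),
                "irq_cpus": [],                # CPUs that saw xhci_irq afterwards
            }
            results.append(current)
            continue
        m = IRQ_RE.search(line)
        if m and current is not None:
            current["irq_cpus"].append(int(m["cpu"]))
    return results


# Lines copied from the trace quoted above.
SAMPLE = """\
<idle>-0 [000] d.h. 40.616467: msi_set_affinity: quirk[1] new vector allocated, new apic = 2 vector = 33 this apic = 0
<idle>-0 [000] d.h. 40.616488: msi_set_affinity: Direct Update: irq 123 Ovec=33 Oapic 0 Nvec 33 Napic 2
<idle>-0 [000] d.h. 40.616488: xhci_irq: xhci irq
<idle>-0 [001] d.h. 40.616504: xhci_irq: xhci irq
migration/3-24 [003] d..1 44.421784: msi_set_affinity: quirk[1] new vector allocated, new apic = 0 vector = 33 this apic = 6
migration/3-24 [003] d..1 44.421803: msi_set_affinity: Direct Update: irq 123 Ovec=33 Oapic 6 Nvec 33 Napic 0
"""

if __name__ == "__main__":
    for upd in pair_updates_with_irqs(SAMPLE.splitlines()):
        note = "" if upd["irq_cpus"] else "  <-- no xhci_irq seen after this update"
        print(f'{upd["ts"]:.6f} apic {upd["old_apic"]} -> {upd["new_apic"]} '
              f'irq on cpus {upd["irq_cpus"]}{note}')
```

An update with an empty irq_cpus list is only a hint about where to look. In the hotplug case the interrupt may legitimately be retriggered from fixup_irqs(), which this trace does not cover yet.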