From: Thomas Gleixner
To: Evan Green
Cc: Mathias Nyman, x86@kernel.org, linux-pci, LKML, Bjorn Helgaas,
    "Ghorai, Sukumar", "Amara, Madhusudanarao", "Nandamuri, Srikanth"
Subject: Re: MSI interrupt for xhci still lost on 5.6-rc6 after cpu hotplug
References: <806c51fa-992b-33ac-61a9-00a606f82edb@linux.intel.com>
    <87d0974akk.fsf@nanos.tec.linutronix.de>
    <87r1xjp3gn.fsf@nanos.tec.linutronix.de>
    <878sjqfvmi.fsf@nanos.tec.linutronix.de>
Date: Tue, 24 Mar 2020 20:03:44 +0100
Message-ID: <87tv2dd17z.fsf@nanos.tec.linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

Evan Green writes:
> On Mon, Mar 23, 2020 at 5:24 PM Thomas Gleixner wrote:
>> And of course all of this is so well documented that all of us can
>> clearly figure out what's going on...
>
> I won't pretend to know what's going on, so I'll preface this by
> labeling it all as "flailing", but:
>
> I wonder if there's some way the interrupt can get delayed between
> XHCI snapping the torn value and it finding its way into the IRR. For
> instance, if xhci read this value at the start of their interrupt
> moderation timer period, that would be awful (I hope they don't do
> this). One test patch would be to carve out 8 vectors reserved for
> xhci on all cpus. Whenever you change the affinity, the assigned
> vector is always reserved_base + cpu_number. That lets you exercise
> the affinity switching code, but in a controlled manner where torn
> interrupts could be easily seen (ie hey I got an interrupt on cpu 4's
> vector but I'm cpu 2). I might struggle to write such a change, but in
> theory it's doable.

Well, the point is that we don't see a spurious interrupt on any
CPU. We added a traceprintk into do_IRQ() and that would immediately
tell us where the thing goes off into lala land. Which it didn't.

> I was alternately trying to build a theory in my head about the write
> somehow being posted and getting out of order, but I don't think that
> can happen.

If that happens then the lost XHCI interrupt is the least of your
worries.

Thanks,

        tglx
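
For readers following the thread, the test scheme Evan proposes above
(assigned vector is always reserved_base + cpu_number, so a misdelivered
interrupt identifies the CPU it was meant for) can be sketched in plain
userspace C. This is only an illustration of the arithmetic, not kernel
code and not a patch from this thread: RESERVED_BASE, RESERVED_COUNT and
all function names are hypothetical, and the base value 0xe0 is an
arbitrary assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical values: the mail only says "8 vectors reserved for xhci". */
#define RESERVED_BASE   0xe0u  /* assumed first reserved vector, illustrative */
#define RESERVED_COUNT  8u     /* one reserved vector per CPU, 8 CPUs assumed */

/* Vector that would be programmed when the interrupt is affine to `cpu`. */
static unsigned int reserved_vector_for_cpu(unsigned int cpu)
{
    assert(cpu < RESERVED_COUNT);
    return RESERVED_BASE + cpu;
}

/* True if `vector` lies inside the reserved range. */
static bool is_reserved_vector(unsigned int vector)
{
    return vector >= RESERVED_BASE &&
           vector < RESERVED_BASE + RESERVED_COUNT;
}

/*
 * If `vector` is a reserved vector that belongs to a CPU other than
 * `this_cpu`, return the owning CPU number (the "torn" case from the
 * mail: "I got an interrupt on cpu 4's vector but I'm cpu 2").
 * Return -1 if the vector is outside the reserved range or was
 * delivered to the CPU that owns it.
 */
static int torn_vector_owner(unsigned int this_cpu, unsigned int vector)
{
    if (!is_reserved_vector(vector))
        return -1;

    unsigned int owner = vector - RESERVED_BASE;
    return owner == this_cpu ? -1 : (int)owner;
}
```

With these assumed values, torn_vector_owner(2, reserved_vector_for_cpu(4))
reports CPU 4 as the owner, which is the mismatch Evan describes. As the
reply notes, the existing trace in do_IRQ() showed no spurious interrupt
on any CPU, so a check of this kind would be expected to stay silent.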