Subject: Re: [PATCH] x86/hotplug: Silence APIC only after all irq's are migrated
To: Ashok Raj, linux-kernel@vger.kernel.org, tglx@linutronix.de
Cc: Sukumar Ghorai, Srikanth Nandamuri, Evan Green, Bjorn Helgaas, stable@vger.kernel.org
References: <20200814213842.31151-1-ashok.raj@intel.com>
From: Mathias Nyman
Message-ID: <563d77c0-ab5b-df9c-8f4d-b16a0d1211a2@linux.intel.com>
Date: Thu, 20 Aug 2020 15:11:45 +0300
In-Reply-To: <20200814213842.31151-1-ashok.raj@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 15.8.2020 0.38, Ashok Raj wrote:
> When offlining CPU's, fixup_irqs() migrates all interrupts away from the
> outgoing CPU to an online CPU. It's always possible the device sent an
> interrupt to the previous CPU destination. Pending interrupt bit in IRR in
> lapic identifies such interrupts.
> apic_soft_disable() will not capture any
> new interrupts in IRR. This causes interrupts from device to be lost during
> cpu offline. The issue was found when explicitly setting MSI affinity to a
> CPU and immediately offlining it. It was simple to recreate with a USB
> ethernet device and doing I/O to it while the CPU is offlined. Lost
> interrupts happen even when Interrupt Remapping is enabled.
>
> Current code does apic_soft_disable() before migrating interrupts.
>
> native_cpu_disable()
> {
>     ...
>     apic_soft_disable();
>     cpu_disable_common();
>       --> fixup_irqs(); // Too late to capture anything in IRR.
> }
>
> Just flipping the above call sequence seems to hit the IRR checks
> and the lost interrupt is fixed for both legacy MSI and when
> interrupt remapping is enabled.
>
> Fixes: 60dcaad5736f ("x86/hotplug: Silence APIC and NMI when CPU is dead")
> Link: https://lore.kernel.org/lkml/875zdarr4h.fsf@nanos.tec.linutronix.de/
> Signed-off-by: Ashok Raj
>
> To: linux-kernel@vger.kernel.org
> To: Thomas Gleixner
> Cc: Sukumar Ghorai
> Cc: Srikanth Nandamuri
> Cc: Evan Green
> Cc: Mathias Nyman
> Cc: Bjorn Helgaas
> Cc: stable@vger.kernel.org
> ---

This fixes the lost xhci interrupt for me.
Before this patch an MSI interrupt was lost after ~200 cycles of toggling CPUs
offline/online under heavy USB traffic.

With this patch I ran 3x2000 cycles without any issues (Comet Lake, patch on
top of 5.8).

Tried both with and without CONFIG_IRQ_REMAP. No issues seen.

Tested-by: Mathias Nyman
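
For anyone following along, the ordering problem described in the quoted
commit message can be sketched as a toy user-space model. This is not kernel
code: the struct and every toy_* name are made up for illustration, and the
only behaviour assumed is what the commit message states, namely that a
soft-disabled APIC latches nothing new in IRR and that the fixup step can
only move what it finds pending there.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one local APIC (hypothetical, illustration only). */
struct toy_lapic {
    bool     soft_enabled; /* models the software-enable state */
    uint64_t irr;          /* one pending bit per toy vector, 0..63 */
};

/* A device raises 'vector' at this CPU; a soft-disabled APIC latches nothing. */
void toy_deliver(struct toy_lapic *apic, unsigned vector)
{
    if (apic->soft_enabled)
        apic->irr |= UINT64_C(1) << vector;
}

/*
 * Stand-in for the IRR check done while migrating interrupts: move every
 * pending bit from the outgoing CPU to the target and return how many moved.
 */
int toy_fixup_irqs(struct toy_lapic *outgoing, struct toy_lapic *target)
{
    int moved = 0;

    for (unsigned v = 0; v < 64; v++) {
        uint64_t bit = UINT64_C(1) << v;

        if (outgoing->irr & bit) {
            outgoing->irr &= ~bit;
            target->irr |= bit;
            moved++;
        }
    }
    return moved;
}

/*
 * Offline the toy CPU while one interrupt is in flight to it.
 * disable_first == true  models the old order (silence APIC, then fixup).
 * disable_first == false models the patched order (fixup, then silence).
 * Returns the number of interrupts that survived the migration.
 */
int toy_offline(bool disable_first)
{
    struct toy_lapic outgoing = { .soft_enabled = true, .irr = 0 };
    struct toy_lapic target   = { .soft_enabled = true, .irr = 0 };
    const unsigned vector = 5; /* arbitrary toy vector */
    int moved;

    if (disable_first)
        outgoing.soft_enabled = false;

    /* The device still targets the outgoing CPU at this point. */
    toy_deliver(&outgoing, vector);

    moved = toy_fixup_irqs(&outgoing, &target);

    if (!disable_first)
        outgoing.soft_enabled = false;

    return moved;
}
```

In this model toy_offline(true) returns 0 (the interrupt is lost) and
toy_offline(false) returns 1 (the interrupt is seen in IRR and moved), which
is the difference the reordering is meant to make.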