Received: by 2002:a25:8b91:0:0:0:0:0 with SMTP id j17csp5109242ybl; Wed, 22 Jan 2020 10:28:29 -0800 (PST) X-Google-Smtp-Source: APXvYqw+SzjkIdn9Eb3I68djHA8E8UNN5wIgs5lA5k1ON3S9T33T9rFgdXXz123sGGbmAGBp+Snq X-Received: by 2002:a05:6808:2ce:: with SMTP id a14mr7851089oid.82.1579717709819; Wed, 22 Jan 2020 10:28:29 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1579717709; cv=none; d=google.com; s=arc-20160816; b=XEekAnjjsVMqMoezSIwvylUrpBAinUlY5xWMkOirBc2yMX8VeRJZE/a7P6Pt9B0VMX nAKcMFey8Pa6hqEZdUHxnZKcJRTM9m3+ploKbzAebfDxWBpaQE512a6jh7hXGIKS62vP chJzGHm5qOxSiEheLMHo3cI1rb77wjSGy0o4Jr2JvEzYZfaxvw5q9jAU7t1MRPTxfK9E 2kakoiHVWJeoFLn3HQU8sY1vJbsiFikCrcGGksRINLUGz7f738357ZHwC5LPNtt9i+XG ALUFng2nXsA9Sub0oTsXcl299ekWx+MLJWiSiE5OzxO+aNIoevX5AUgsoEtz915/Xz4a ZVeQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=bBMN4EhSOng7OnOM0rFt2rQoKyMKqooeWsHDlQ/9cwM=; b=CPNetNYNnE3rGk6Zu56AKYzHtQkK9pk5mniGOID01HZqiZt4C8qEo24j2x5zR5bSTU UOVPli04Sy4EbeBbBS1Mx+ZgC5wdWUkLFVHS/g2K34GcInAtobhsJ3cUiFMntj1otat9 Zror0xxckya6b+U3OOE7oPEYLbNtQQPk3H89oT1i5pn1yUKRG/dgc2zAs18m3yCnrvHA fo5D0X7/5F2/LYPz7Ll53aNXwQhL+Jr0z0lfT9VSNv/cHKMhoEQrH9edFADYFWEo1YOQ gLmHa6twDC4Ow9E+qeACJj6j5LAeUscBq7o+nVnkI8/mzJ3rvgFfH+SVw/sq4kcp0Qk9 5pKQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=l0BjLgYL; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id v23si23091022otk.321.2020.01.22.10.28.16; Wed, 22 Jan 2020 10:28:29 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=l0BjLgYL; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726101AbgAVS1Q (ORCPT + 99 others); Wed, 22 Jan 2020 13:27:16 -0500 Received: from mail-lj1-f193.google.com ([209.85.208.193]:46844 "EHLO mail-lj1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725933AbgAVS1Q (ORCPT ); Wed, 22 Jan 2020 13:27:16 -0500 Received: by mail-lj1-f193.google.com with SMTP id m26so59688ljc.13 for ; Wed, 22 Jan 2020 10:27:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=bBMN4EhSOng7OnOM0rFt2rQoKyMKqooeWsHDlQ/9cwM=; b=l0BjLgYLS21obPxMv7d6xqzMaOed/W5XTPs9Jiy2R5BcIFcoPH/KAbR0zP72cJuPhM 6fni2+ohfrz9SGtbVkxchp8s/haS3M73YZjtu7djO8sq/aDCiz60GbWTYw2kbyOsKIHo SWbU9PtSdm2UWgu/bArY2QGZaEjTux8sunsCQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=bBMN4EhSOng7OnOM0rFt2rQoKyMKqooeWsHDlQ/9cwM=; b=TPgiMfSPNusnVdGoxJpBP7oj/Yk/BQZzXZ8fMe30AAtJcJvRuK+Pt7c4t18a5NxeUl mIMVveCXSTrzkazHITp4vEKUmseRk1fi1sp9unepuXl/+llAijs42Y54n40Qb9ugKQm8 c4pBn9w0TV++k/wFwUsGWeuB5Sc2Hx0tvYFes+DTcQVQAiI2fWTSyodix7pUV04vEqBh JxEJTnIhwEjbYdkdza9xt57Gu+weC61Xg1Mrag8NBMxQP9fhjFxhduMgStfsLnCbjrnw Tm2K9DX5YE84UqLyNhyAcfcy4zLmgiTdtZOh93+Fayh+I6m+D7jJZrkOy2cucek+F4fA qEVg== X-Gm-Message-State: APjAAAXUA4RLySx4AjiU4g0pzwnW0H76fxwg573HI78PIDP4eBCzMq6R yJgF1oyohnlOGlHNNRDnbSO6c8Z6pvs= X-Received: by 2002:a2e:6c06:: with SMTP id h6mr19820192ljc.246.1579717633345; Wed, 22 Jan 2020 10:27:13 -0800 (PST) Received: from mail-lf1-f41.google.com (mail-lf1-f41.google.com. [209.85.167.41]) by smtp.gmail.com with ESMTPSA id o6sm870961lfg.11.2020.01.22.10.27.11 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 22 Jan 2020 10:27:12 -0800 (PST) Received: by mail-lf1-f41.google.com with SMTP id f24so361062lfh.3 for ; Wed, 22 Jan 2020 10:27:11 -0800 (PST) X-Received: by 2002:a05:6512:2035:: with SMTP id s21mr2260437lfs.99.1579717631192; Wed, 22 Jan 2020 10:27:11 -0800 (PST) MIME-Version: 1.0 References: <20200116133102.1.I9c7e72144ef639cc135ea33ef332852a6b33730f@changeid> <20200122172816.GA139285@google.com> In-Reply-To: <20200122172816.GA139285@google.com> From: Evan Green Date: Wed, 22 Jan 2020 10:26:34 -0800 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH] PCI/MSI: Avoid torn updates to MSI pairs To: Bjorn Helgaas Cc: linux-pci@vger.kernel.org, LKML , Thomas Gleixner , Marc Zyngier , Christoph Hellwig , Rajat Jain Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jan 22, 2020 at 9:28 AM Bjorn Helgaas wrote: > > [+cc Thomas, Marc, Christoph, Rajat] > > On Thu, Jan 16, 2020 at 01:31:28PM -0800, Evan Green wrote: > > __pci_write_msi_msg() updates three registers in the device: address > > high, address low, and data. On x86 systems, address low contains > > CPU targeting info, and data contains the vector. The order of writes > > is address, then data. > > > > This is problematic if an interrupt comes in after address has > > been written, but before data is updated, and the SMP affinity of > > the interrupt is changing. In this case, the interrupt targets the > > wrong vector on the new CPU. > > > > This case is pretty easy to stumble into using xhci and CPU hotplugging. > > Create a script that targets interrupts at a set of cores and then > > offlines those cores. Put some stress on USB, and then watch xhci lose > > an interrupt and die. > > > > Avoid this by disabling MSIs during the update. > > > > Signed-off-by: Evan Green > > --- > > > > > > Bjorn, > > I was unsure whether disabling MSIs temporarily is actually an okay > > thing to do. I considered using the mask bit, but got the impression > > that not all devices support the mask bit. Let me know if this going to > > cause problems or there's a better way. I can include the repro > > script I used to cause mayhem if needed. > > I suspect this *is* a problem because I think disabling MSI doesn't > disable interrupts; it just means the device will interrupt using INTx > instead of MSI. And the driver is probably not prepared to handle > INTx. > > PCIe r5.0, sec 7.7.1.2, seems relevant: "If MSI and MSI-X are both > disabled, the Function requests servicing using INTx interrupts (if > supported)." > > Maybe the IRQ guys have ideas about how to solve this? > But don't we already do this in __pci_restore_msi_state(): pci_intx_for_msi(dev, 0); pci_msi_set_enable(dev, 0); arch_restore_msi_irqs(dev); I'd think if there were a chance for a line-based interrupt to get in and wedge itself, it would already be happening there. One other way you could avoid torn MSI writes would be to ensure that if you migrate IRQs across cores, you keep the same x86 vector number. That way the address portion would be updated, and data doesn't change, so there's no window. But that may not actually be feasible. -Evan