From: Evan Green
Date: Fri, 24 Jan 2020 13:53:15 -0800
Subject: Re: [PATCH v2] PCI/MSI: Avoid torn updates to MSI pairs
To: Thomas Gleixner
Cc: Rajat Jain, Bjorn Helgaas, linux-pci, Linux Kernel Mailing List, x86@kernel.org
References: <20200117162444.v2.1.I9c7e72144ef639cc135ea33ef332852a6b33730f@changeid>
 <87y2tytv5i.fsf@nanos.tec.linutronix.de>
 <87eevqkpgn.fsf@nanos.tec.linutronix.de>
 <87d0b82a9o.fsf@nanos.tec.linutronix.de>
In-Reply-To: <87d0b82a9o.fsf@nanos.tec.linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jan 24, 2020 at 6:34 AM Thomas Gleixner wrote:
>
> Evan,
>
> Evan Green writes:
> > I did another experiment that I think lends credibility to my torn MSI
> > hypothesis. I have the following change:
> >
> > And indeed, I get a machine check, despite the fact that MSI_DATA is
> > overwritten just after address is updated.
>
> I don't have to understand why a SoC released in 2019 still has
> unmaskable MSI especially as Inhell's own XHCI spec clearly documents
> and recommends MSI-X.
>
> While your workaround (disabling MSI) works in this particular case it's
> not really a good option:
>
>  1) Quite some devices have a bug where the legacy INTX disable does not
>     work reliably or is outright broken. That means MSI disable will
>     reroute to INTX.
>
>  2) I dug out old debug data which confirms that some silly devices
>     lose interrupts across MSI disable/reenable if the INTX fallback is
>     disabled.
>
>     And no, it's not a random weird device, it's part of a chipset which
>     was pretty popular a few years ago. I leave it as an exercise for
>     the reader to guess the vendor.
>
> Can you please apply the patch below? It enforces an IPI to the new
> vector/target CPU when the interrupt is MSI w/o masking. It should
> cure the issue. It goes without saying that I'm not proud of it.

I'll feel just as dirty putting a tested-by on it :)

I don't think this patch is complete. As written, it creates "recovery
interrupts" for MSIs that are not maskable. However, in the
pci_msi_domain_write_msg() path, which is the one I seem to use, we make
no effort to mask the MSI while changing affinity. So at the very least
it would need a follow-on patch that attempts to mask the MSI, for MSIs
that are maskable. __pci_restore_msi_state(), called in the resume path,
does have this masking, but for some reason pci_msi_domain_write_msg()
does not.

I'm also a bit concerned about all the spurious interrupts we'll be
introducing. Not just the retriggering introduced here, but the fact
that we never dealt with the torn interrupt. So in my case, XHCI will be
sending an interrupt on the old vector to the new CPU, and that vector
could be registered to anything there. I'm worried that not every driver
in the system is hardened against receiving interrupts it's not prepared
for.
Perhaps the driver misbehaves, or perhaps it's a "bad" interrupt like the
MCE interrupt that takes the system down. (I realize the MCE interrupt
itself is not in the device vector region, but some other bad interrupt
then).

Now that you're on board with the torn write theory, what do you think
about my "transit vector" proposal? The idea is this:
 - Reserve a single vector number on all CPUs for interrupts in
   transit between CPUs.
 - Interrupts in transit between CPUs are added to some sort of list,
   or maybe the transit vector itself.
 - __pci_msi_write_msg() would, after proper abstractions, essentially
   be doing this:
       pci_write(MSI_DATA, TRANSIT_VECTOR);
       pci_write(MSI_ADDRESS, new_affinity);
       pci_write(MSI_DATA, new_vector);
 - In the rare torn case I've found here, the interrupt will come in
   on , or .
 - The ISR for TRANSIT_VECTOR would go through and call the ISR for
   every IRQ in transit across CPUs. This does still result in a couple
   extra ISR calls, since multiple interrupts might be in transit
   across CPUs, but at least it's very rare.
 - CPU hotplug would keep the same logic it already has, retriggering
   TRANSIT_VECTOR if it happened to land on .
 - When the interrupt is confirmed on , remove the ISR from the
   TRANSIT_VECTOR list.

If you think it's a worthwhile idea I can try to code it up.

I've been running your patch for about 30 minutes, with no repro case.

-Evan
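For illustration only, the three-write sequence from the proposal can be
modelled in plain C. This is a toy sketch, not kernel code: struct
msi_regs, struct msi_trace, the helper names and the TRANSIT_VECTOR value
are all hypothetical, and the "address" and "data" fields merely stand in
for target CPU and vector. Each write records the register pair a device
could latch at that moment, so the two orderings can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value, chosen only for this sketch. */
#define TRANSIT_VECTOR 0xEFu
#define MAX_STATES 4

/* Simplified stand-in for a device's MSI registers (not the PCI layout). */
struct msi_regs {
	uint32_t address;	/* stands in for the target CPU */
	uint32_t data;		/* stands in for the vector */
};

/* Every register pair a device could observe between individual writes. */
struct msi_trace {
	struct msi_regs seen[MAX_STATES];
	size_t count;
};

static void trace_record(struct msi_trace *t, const struct msi_regs *r)
{
	assert(t->count < MAX_STATES);
	t->seen[t->count++] = *r;
}

/* Plain update: address first, then data. */
static void update_direct(struct msi_regs *r, uint32_t new_addr,
			  uint32_t new_vec, struct msi_trace *t)
{
	r->address = new_addr;
	trace_record(t, r);	/* new address paired with old vector */
	r->data = new_vec;
	trace_record(t, r);
}

/* Proposed ordering: park the data on the transit vector first. */
static void update_via_transit(struct msi_regs *r, uint32_t new_addr,
			       uint32_t new_vec, struct msi_trace *t)
{
	r->data = TRANSIT_VECTOR;
	trace_record(t, r);	/* old address, transit vector */
	r->address = new_addr;
	trace_record(t, r);	/* new address, transit vector */
	r->data = new_vec;
	trace_record(t, r);	/* new address, new vector */
}

/* True if any observable state pairs the new address with the old vector. */
static bool trace_has_torn_state(const struct msi_trace *t,
				 uint32_t new_addr, uint32_t old_vec)
{
	for (size_t i = 0; i < t->count; i++)
		if (t->seen[i].address == new_addr &&
		    t->seen[i].data == old_vec)
			return true;
	return false;
}
```

In this model the plain update exposes the pair (new address, old vector),
while the transit ordering only ever exposes the transit vector during the
move. That is also why the transit vector would have to be reserved on
every CPU: it can be seen with either the old or the new address.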