From: Thomas Gleixner
To: Evan Green
Cc: Rajat Jain, Bjorn Helgaas, linux-pci, Linux Kernel Mailing List, x86@kernel.org, Marc Zyngier
Subject: Re: [PATCH v2] PCI/MSI: Avoid torn updates to MSI pairs
Date: Tue, 28 Jan 2020 23:48:32 +0100
Message-ID: <87pnf342pr.fsf@nanos.tec.linutronix.de>

Evan,

Evan Green writes:
> On Tue, Jan 28, 2020 at 6:38 AM Thomas Gleixner wrote:
>> The patch is only lightly tested, but so far it survived.
>>
>
> Hi Thomas,
> Thanks for the patch, I gave it a try. I get the following splat, then a hang:
>
> [ 62.238406] CPU0
> [ 62.241135] ----
> [ 62.243863] lock(vector_lock);
> [ 62.247467] lock(vector_lock);
> [ 62.251071]
> [ 62.251071] *** DEADLOCK ***
> [ 62.251071]
> [ 62.257687] May be due to missing lock nesting notation
> [ 62.257687]
> [ 62.265274] 2 locks held by migration/1/17:
> [ 62.269946] #0: 00000000cfa9d8c3 (&irq_desc_lock_class){-.-.}, at: irq_migrate_all_off_this_cpu+0x44/0x28f
> [ 62.280846] #1: 000000006885da2d (vector_lock){-.-.}, at: msi_set_affinity+0x13c/0x27b
> [ 62.289801]
> [ 62.289801] stack backtrace:
> [ 62.294669] CPU: 1 PID: 17 Comm: migration/1 Not tainted 4.19.96 #2
> [ 62.310713] Call Trace:
> [ 62.313446] dump_stack+0xac/0x11e
> [ 62.317255] __lock_acquire+0x64f/0x19bc
> [ 62.321646] ? find_held_lock+0x3d/0xb8
> [ 62.325936] ? pci_conf1_write+0x4f/0xdf
> [ 62.330320] lock_acquire+0x1b2/0x1fa
> [ 62.334413] ? apic_retrigger_irq+0x31/0x63
> [ 62.339097] _raw_spin_lock_irqsave+0x51/0x7d
> [ 62.343972] ? apic_retrigger_irq+0x31/0x63
> [ 62.348646] apic_retrigger_irq+0x31/0x63
> [ 62.353124] msi_set_affinity+0x25a/0x27b

Bah. I'm sure I looked at that call chain, noticed the double vector
lock and then forgot.

Delta patch below.

Thanks,

        tglx

8<--------------
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -64,6 +64,7 @@ msi_set_affinity(struct irq_data *irqd,
 	struct irq_cfg old_cfg, *cfg = irqd_cfg(irqd);
 	struct irq_data *parent = irqd->parent_data;
 	unsigned int cpu;
+	bool pending;
 	int ret;
 
 	/* Save the current configuration */
@@ -147,9 +148,13 @@ msi_set_affinity(struct irq_data *irqd,
 	 * vector/CPU. Check whether the transition raced with a device
 	 * interrupt and is pending in the local APICs IRR.
 	 */
-	if (lapic_vector_set_in_irr(cfg->vector))
-		irq_data_get_irq_chip(irqd)->irq_retrigger(irqd);
+	pending = lapic_vector_set_in_irr(cfg->vector);
+	unlock_vector_lock();
+
+	if (pending)
+		irq_data_get_irq_chip(irqd)->irq_retrigger(irqd);
+
 	return ret;
 }
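
For readers following the thread, the ordering change in the delta patch can be sketched in userspace. This is only an illustration, not the kernel code: toy_lock, irr_pending, retrigger(), set_affinity_nested() and set_affinity_fixed() are hypothetical stand-ins for vector_lock, the IRR check, apic_retrigger_irq() and the two versions of msi_set_affinity(). The toy lock is single-threaded and reports a self-deadlock as -EDEADLK instead of hanging, roughly the way lockdep flags it in the splat above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, single-threaded stand-in for a non-recursive spinlock. */
struct toy_lock {
	bool held;
};

static struct toy_lock vector_lock;	/* stand-in for the kernel's vector_lock */
static bool irr_pending = true;		/* stand-in for the local APIC IRR bit */
static int retriggered;			/* counts successful retriggers */

/* Report a second acquisition as an error instead of spinning forever. */
static int toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return -EDEADLK;
	l->held = true;
	return 0;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);	/* releasing an unheld lock is a caller bug */
	l->held = false;
}

/* Like the retrigger callback in the trace: it takes the lock itself. */
static int retrigger(void)
{
	int err = toy_lock_acquire(&vector_lock);

	if (err)
		return err;
	retriggered++;
	toy_lock_release(&vector_lock);
	return 0;
}

/* Ordering before the delta patch: retrigger is called with the lock held. */
static int set_affinity_nested(void)
{
	int err = toy_lock_acquire(&vector_lock);

	if (err)
		return err;
	if (irr_pending)
		err = retrigger();	/* second acquisition: -EDEADLK */
	toy_lock_release(&vector_lock);
	return err;
}

/* Ordering after the delta patch: sample under the lock, act after unlock. */
static int set_affinity_fixed(void)
{
	bool pending;
	int err = toy_lock_acquire(&vector_lock);

	if (err)
		return err;
	pending = irr_pending;
	toy_lock_release(&vector_lock);

	return pending ? retrigger() : 0;
}
```

Calling set_affinity_nested() yields -EDEADLK and leaves retriggered at 0, while set_affinity_fixed() returns 0 and bumps retriggered to 1. The sketch only models the lock ordering; it says nothing about whether sampling the pending state before dropping the lock is race-free in the real code.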