Date: Sat, 5 Jan 2019 10:51:16 -0700
From: Jason Gunthorpe
To: David Gibson
Cc: Leon Romanovsky, davem@davemloft.net, saeedm@mellanox.com,
    ogerlitz@mellanox.com, tariqt@mellanox.com, bhelgaas@google.com,
    linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
    netdev@vger.kernel.org, alex.williamson@redhat.com,
    linux-pci@vger.kernel.org, linux-rdma@vger.kernel.org, sbest@redhat.com,
    paulus@samba.org, benh@kernel.crashing.org
Subject: Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]
Message-ID: <20190105175116.GB14238@ziepe.ca>
References: <20181206041951.22413-1-david@gibson.dropbear.id.au>
    <20181206064509.GM15544@mtr-leonro.mtl.com>
    <20190104034401.GA2801@umbus.fritz.box>
In-Reply-To: <20190104034401.GA2801@umbus.fritz.box>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jan 04, 2019 at 02:44:01PM +1100, David Gibson wrote:
> On Thu, Dec 06, 2018 at 08:45:09AM +0200, Leon
> Romanovsky wrote:
> > On Thu, Dec 06, 2018 at 03:19:51PM +1100, David Gibson wrote:
> > > Mellanox ConnectX-5 IB cards (MT27800) seem to cause a call trace when
> > > unbound from their regular driver and attached to vfio-pci in order to pass
> > > them through to a guest.
> > >
> > > This goes away if the disable_idle_d3 option is used, so it looks like a
> > > problem with the hardware handling D3 state.  To fix that more permanently,
> > > use a device quirk to disable D3 state for these devices.
> > >
> > > We do this by renaming the existing quirk_no_ata_d3() more generally and
> > > attaching it to the ConnectX-[45] devices (0x15b3:0x1013).
> > >
> > > Signed-off-by: David Gibson
> > >  drivers/pci/quirks.c | 17 +++++++++++------
> > >  1 file changed, 11 insertions(+), 6 deletions(-)
> > >
> >
> > Hi David,
> >
> > Thank for your patch,
> >
> > I would like to reproduce the calltrace before moving forward,
> > but have trouble to reproduce the original issue.
> >
> > I'm working with vfio-pci and CX-4/5 cards on daily basis,
> > tried manually enter into D3 state now, and it worked for me.
>
> Interesting.  I've investigated this further, though I don't have as
> many new clues as I'd like.  The problem occurs reliably, at least on
> one particular type of machine (a POWER8 "Garrison" with ConnectX-4).
> I don't yet know if it occurs with other machines, I'm having trouble
> getting access to other machines with a suitable card.  I didn't
> manage to reproduce it on a different POWER8 machine with a
> ConnectX-5, but I don't know if it's the difference in machine or
> difference in card revision that's important.

Making sure the card has the latest firmware is always good advice.

> So possibilities that occur to me:
>   * It's something specific about how the vfio-pci driver uses D3
>     state - have you tried rebinding your device to vfio-pci?
>   * It's something specific about POWER, either the kernel or the PCI
>     bridge hardware
>   * It's something specific about this particular type of machine

Does the EEH indicate what happened to actually trigger it?

Jason
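
For anyone following the thread without the patch to hand, the quirk approach in the quoted description can be pictured with a toy model. This is only a sketch in Python, not kernel code: the names `PciDevice`, `NO_D3`, `FIXUPS`, `apply_fixups` and `may_enter_d3` are made up for illustration, and the only device ID used is the 0x15b3:0x1013 pair quoted above. The idea is that a table keyed on vendor/device ID marks matching devices so that the power-management path never puts them into D3.

```python
from dataclasses import dataclass, field

# Hypothetical flag name for this model: "never put this device into D3".
NO_D3 = "no_d3"


@dataclass
class PciDevice:
    """Toy stand-in for a PCI device; not a kernel structure."""
    vendor: int
    device: int
    flags: set = field(default_factory=set)


def quirk_no_d3(dev: PciDevice) -> None:
    """Mark the device so that it is never moved into D3."""
    dev.flags.add(NO_D3)


# Quirk table keyed on (vendor, device). The single entry is the ID
# quoted in the patch description; nothing else is assumed here.
FIXUPS = {
    (0x15B3, 0x1013): quirk_no_d3,
}


def apply_fixups(dev: PciDevice) -> PciDevice:
    """Run the matching quirk, if any, when the device is discovered."""
    quirk = FIXUPS.get((dev.vendor, dev.device))
    if quirk is not None:
        quirk(dev)
    return dev


def may_enter_d3(dev: PciDevice) -> bool:
    """Power-management check: D3 is allowed unless a quirk forbids it."""
    return NO_D3 not in dev.flags


if __name__ == "__main__":
    quirked = apply_fixups(PciDevice(0x15B3, 0x1013))
    # Arbitrary placeholder IDs for a device with no quirk entry.
    plain = apply_fixups(PciDevice(0x1234, 0x5678))
    print(may_enter_d3(quirked), may_enter_d3(plain))
```

The point of doing it with a table rather than a driver option is that the restriction follows the device whichever driver it is bound to, which is what matters once the card is handed to vfio-pci.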