Subject: Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]
From: Benjamin Herrenschmidt
To: Alexey Kardashevskiy, Jason Gunthorpe, David Gibson
Cc: Leon Romanovsky, linux-rdma@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, sbest@redhat.com, saeedm@mellanox.com, alex.williamson@redhat.com, paulus@samba.org, linux-pci@vger.kernel.org, bhelgaas@google.com, ogerlitz@mellanox.com, linuxppc-dev@lists.ozlabs.org, davem@davemloft.net, tariqt@mellanox.com
Date: Wed, 09 Jan 2019 18:24:00 +1100
In-Reply-To: <06c4612c-8409-ea7d-4f7c-4c010d8ecc01@ozlabs.ru>
References: <20181206041951.22413-1-david@gibson.dropbear.id.au> <20181206064509.GM15544@mtr-leonro.mtl.com> <20190104034401.GA2801@umbus.fritz.box> <20190105175116.GB14238@ziepe.ca> <06c4612c-8409-ea7d-4f7c-4c010d8ecc01@ozlabs.ru>

On Wed, 2019-01-09 at 15:53 +1100, Alexey Kardashevskiy wrote:
> "A PCI completion timeout occurred for an outstanding PCI-E transaction"
> it is.
>
> This is how I bind the device to vfio:
>
> echo vfio-pci > '/sys/bus/pci/devices/0000:01:00.0/driver_override'
> echo vfio-pci > '/sys/bus/pci/devices/0000:01:00.1/driver_override'
> echo '0000:01:00.0' > '/sys/bus/pci/devices/0000:01:00.0/driver/unbind'
> echo '0000:01:00.1' > '/sys/bus/pci/devices/0000:01:00.1/driver/unbind'
> echo '0000:01:00.0' > /sys/bus/pci/drivers/vfio-pci/bind
> echo '0000:01:00.1' > /sys/bus/pci/drivers/vfio-pci/bind
>
>
> and I noticed that EEH only happens with the last command. The order
> (.0,.1 or .1,.0) does not matter, it seems that putting one function to
> D3 is fine but putting another one when the first one is already in D3 -
> produces EEH. And I do not recall ever seeing this on the firestone
> machine. Weird.

Putting all functions into D3 is what allows the device to actually go
into D3.

Does it work with other devices? We do have that bug on early P9
revisions where the attempt to bring the link to L1 as part of the
D3 process fails in horrible ways. I thought P8 would be OK, but maybe
not ...

Otherwise, it might be that our timeouts are too low (you may want to
talk to our PCIe guys internally).

Cheers,
Ben.
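
For reference, the bind sequence quoted above can be wrapped in a small
shell function so the same steps can be repeated against another device
when checking whether the failure is specific to this card. This is only
a sketch: the function name bind_to_vfio and the SYSFS_ROOT variable are
made up for illustration (SYSFS_ROOT defaults to /sys and exists only so
the function can be tried against a scratch directory), and on a real
system it needs root and the vfio-pci driver loaded.

```shell
#!/bin/sh
# Sketch of the quoted bind sequence. SYSFS_ROOT and bind_to_vfio are
# hypothetical names, not taken from the original mail.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

bind_to_vfio() {
    # Step 1: tell the PCI core that vfio-pci should claim each function.
    for bdf in "$@"; do
        echo vfio-pci > "$SYSFS_ROOT/bus/pci/devices/$bdf/driver_override"
    done
    # Step 2: unbind each function from its current driver, if it has one.
    for bdf in "$@"; do
        unbind="$SYSFS_ROOT/bus/pci/devices/$bdf/driver/unbind"
        if [ -e "$unbind" ]; then
            echo "$bdf" > "$unbind"
        fi
    done
    # Step 3: bind each function to vfio-pci, in the order given.
    for bdf in "$@"; do
        echo "$bdf" > "$SYSFS_ROOT/bus/pci/drivers/vfio-pci/bind"
    done
}
```

Usage would be something like "bind_to_vfio 0000:01:00.0 0000:01:00.1",
with the functions of whichever device is under test.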