Date: Sun, 19 Jan 2020 20:33:26 -0600
From: Bjorn Helgaas
To: Alexandru Gagniuc, Keith Busch, Jens Axboe, Christoph Hellwig,
 Sagi Grimberg, David Airlie, Daniel Vetter
Cc: Jan Vesely, Lukas Wunner, Alex Williamson, Austin Bolen, Shyam Iyer,
 Sinan Kaya, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Issues with
 "PCI/LINK: Report degraded links via link bandwidth notification"
Message-ID: <20200120023326.GA149019@google.com>
In-Reply-To: <20200115221008.GA191037@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

[+cc NVMe, GPU driver folks]

On Wed, Jan 15, 2020 at 04:10:08PM -0600, Bjorn Helgaas wrote:
> I think we have a problem with link bandwidth change notifications
> (see https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/pcie/bw_notification.c).
>
> Here's a recent bug report where Jan reported "_tons_" of these
> notifications on an nvme device:
> https://bugzilla.kernel.org/show_bug.cgi?id=206197
>
> There was similar discussion involving GPU drivers at
> https://lore.kernel.org/r/20190429185611.121751-2-helgaas@kernel.org
>
> The current solution is the CONFIG_PCIE_BW config option, which
> disables the messages completely.  That option defaults to "off" (no
> messages), but even so, I think it's a little problematic.
>
> Users are not really in a position to figure out whether it's safe to
> enable.  All they can do is experiment and see whether it works with
> their current mix of devices and drivers.
>
> I don't think it's currently useful for distros because it's a
> compile-time switch, and distros cannot predict what system configs
> will be used, so I don't think they can enable it.
>
> Does anybody have proposals for making it smarter about distinguishing
> real problems from intentional power management, or maybe interfaces
> drivers could use to tell us when we should ignore bandwidth changes?

NVMe, GPU folks, do your drivers or devices change PCIe link
speed/width for power saving or other reasons?  When CONFIG_PCIE_BW=y,
the PCI core interprets changes like that as problems that need to be
reported.

If drivers do change link speed/width, can you point me to where
that's done?  Would it be feasible to add some sort of PCI core
interface so the driver could say "ignore" or "pay attention to"
subsequent link changes?

Or maybe there would even be a way to move the link change itself into
the PCI core, so the core would be aware of what's going on?

Bjorn
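
To make the "ignore" / "pay attention to" idea a bit more concrete,
here is a rough userspace C model of what such an interface might look
like.  This is only a strawman: struct bw_dev, bw_notify_ignore(),
bw_notify_attend() and bw_notify_event() are hypothetical names that do
not exist in the kernel, and the speed/width numbers are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model only; none of these names are real kernel API. */
struct link_state {
	unsigned int speed;	/* illustrative speed index, higher is faster */
	unsigned int width;	/* number of lanes */
};

struct bw_dev {
	const char *name;
	struct link_state expected;	/* link the core treats as healthy */
	int ignore_depth;		/* > 0 while a driver changes the link on purpose */
	unsigned int reports;		/* degradation reports emitted so far */
};

/* Driver says "ignore": subsequent link changes are intentional. Nestable. */
static void bw_notify_ignore(struct bw_dev *dev)
{
	assert(dev);
	dev->ignore_depth++;
}

/* Driver says "pay attention to": undo one earlier bw_notify_ignore(). */
static void bw_notify_attend(struct bw_dev *dev)
{
	assert(dev && dev->ignore_depth > 0);
	dev->ignore_depth--;
}

/*
 * Model of the core's reaction to one bandwidth change notification.
 * Returns true if the change was reported as a degraded link.
 */
static bool bw_notify_event(struct bw_dev *dev, struct link_state now)
{
	assert(dev);

	if (dev->ignore_depth > 0)
		return false;	/* driver told us this change is intentional */

	if (now.speed >= dev->expected.speed &&
	    now.width >= dev->expected.width)
		return false;	/* link is at least as good as expected */

	dev->reports++;
	printf("%s: link degraded to speed %u, width x%u\n",
	       dev->name, now.speed, now.width);
	return true;
}
```

In this model a driver would bracket its own power-management
transition with bw_notify_ignore() and bw_notify_attend(), so only
changes outside that window are reported.  A counter rather than a flag
is just one possible choice; it lets nested callers suppress reports
without stepping on each other.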