Received: by 2002:a25:ad19:0:0:0:0:0 with SMTP id y25csp6603590ybi; Wed, 31 Jul 2019 17:51:19 -0700 (PDT) X-Google-Smtp-Source: APXvYqyT9YDtbZDNFdl0yGGQtQREbywzREgqr7bAnYVdPGcOo7jkqsd46lbMbEZJikS5T4n4jRWU X-Received: by 2002:a17:90a:2641:: with SMTP id l59mr5304187pje.55.1564620679161; Wed, 31 Jul 2019 17:51:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1564620679; cv=none; d=google.com; s=arc-20160816; b=LbqcHqfcRtS408znHwd9EFFtRQ/Rd7cyme3zU6I0k3ul2QxceZDaQOwmOK9hP7PmPk IefhV5QZdIwYSLaaS4rHZTWtrODAYI9CXZkQG/+BXsirHOjAreZp5xivlTU7C91Offsh 795A5R62QsJu+cD7tlrgwsFxu/hwRjdgEglhU7tBAXBwLJF0t1UYbgmtl1BWM9hbCK3p nmS6vjOFmT2KnKtmt4Rr9BkR6bQJra3upwWd/irUsQbeqL7MDtsXFRx30nx0kQQmZ+LI M+ul7vYv5MV9P7dIF8DQ5SZhyBozlL4y4gbnrr7v5Z35Rl9qELBTkx8ARz2qKko70zt+ w16w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :user-agent:organization:references:in-reply-to:date:cc:to:from :subject:message-id; bh=NPkDfDpHDw3YeQl+T3W6AvueVUYarX7de1k02h9BSIY=; b=HjqUzpQzUSus2ZHxpBa+ycwrQsr060MAI9VlRO7oRpZFr5bHbI60qS8HIQJnJuVl0r 8pXGY8jv6H0Kz+YcQ7PGX7micxUL3P3yWbTzVp05uGklfeHMGk7X4bKvM3d6gzZiG8DF oRhgvPY7RXxW3dXf+BkU+J9B4hHcMSOo+Mil6dR4ABhXZ0KlnUb5IV+bNeDxkCojZVx5 juq66sBA+i1h8XbGO0pUA8O7CCO38WJ1sTEmRGensxu3GKfaljeAoz2qrqJzWVjrm2yC /AwynO9MkDVaC9c7te8KK36dKBoynGuENDX5DLak+STlsHLmNZUxVZnZHaX2KfJKcHAy JlfA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id p4si17231752pgk.496.2019.07.31.17.51.02; Wed, 31 Jul 2019 17:51:19 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727370AbfHAAu1 (ORCPT + 99 others); Wed, 31 Jul 2019 20:50:27 -0400 Received: from mail-qk1-f195.google.com ([209.85.222.195]:35096 "EHLO mail-qk1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725942AbfHAAu1 (ORCPT ); Wed, 31 Jul 2019 20:50:27 -0400 Received: by mail-qk1-f195.google.com with SMTP id r21so50738262qke.2 for ; Wed, 31 Jul 2019 17:50:26 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:subject:from:to:cc:date:in-reply-to :references:organization:user-agent:mime-version :content-transfer-encoding; bh=NPkDfDpHDw3YeQl+T3W6AvueVUYarX7de1k02h9BSIY=; b=Idd42Fx8c7Sz4VPriWIBGmfebqKpa0lycZKRmCUqZS77Rey912da1RQ7Gp/nJVmiWc a7uIMTdNuKrvShYTPS1D68P4NI8cqWFpI/KjzrcY7sKMxtWBqiYGNeFAdAbBlT0GAo80 3HNRpMA2JVN7uy3lvQ5J8nLC0ISDuWb6SKSrqjU/KNjk10fGRLrO+qSEz+dA0TDqTRs0 lWYipq6gScTusL1xtBjOt6b9P+QAYy9SxdMW1tFmHUipR+6SLBBmYBbAPKFfCdjWyQKI 4guPH45hi4BspkWnDd1k2tUErZJKVhlYLp2iyj2/k//j69+fdLfzoa2CtYMRYo71/yG7 X8FA== X-Gm-Message-State: APjAAAVTdtQht3tYXe4s3yyWM0j1H6q6uXUwt0i4Nasg5TEkUv/R4hgF ZMArawzFRvh8G2RUrvWarCJOQw== X-Received: by 2002:a05:620a:16a6:: with SMTP id s6mr82383362qkj.39.1564620625950; Wed, 31 Jul 2019 17:50:25 -0700 (PDT) Received: from dhcp-10-20-1-11.bss.redhat.com ([144.121.20.162]) by smtp.gmail.com with ESMTPSA id a6sm31199887qth.76.2019.07.31.17.50.21 (version=TLS1_3 cipher=AEAD-AES256-GCM-SHA384 bits=256/256); Wed, 31 Jul 2019 17:50:24 -0700 (PDT) Message-ID: <97d4312d883f3c031c5a6144a2f5f5dd74dfe4df.camel@redhat.com> Subject: Re: [PATCH] Revert "PCI: Enable NVIDIA HDA controllers" From: Lyude Paul To: Karol Herbst , Lukas Wunner Cc: nouveau , Linux PCI , Daniel Drake , Bjorn Helgaas , Aaron Plattner , Peter Wu , Ilia Mirkin , Maik Freudenberg , LKML Date: Wed, 31 Jul 2019 20:50:20 -0400 In-Reply-To: References: <20190731201927.22054-1-lyude@redhat.com> <20190731211842.befvpoyudrm2subf@wunner.de> Organization: Red Hat Content-Type: text/plain; charset="UTF-8" User-Agent: Evolution 3.32.4 (3.32.4-1.fc30) MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 2019-07-31 at 17:35 -0400, Lyude Paul wrote: > On Wed, 2019-07-31 at 23:26 +0200, Karol Herbst wrote: > > On Wed, Jul 31, 2019 at 11:18 PM Lukas Wunner wrote: > > > On Wed, Jul 31, 2019 at 04:19:27PM -0400, Lyude Paul wrote: > > > > While this fixes audio for a number of users, this commit has the > > > > sideaffect of breaking the BIOS workaround that's required to make the > > > > GPU on the nvidia P50 work, by causing the GPU's PCI device function > > > > to > > > > stop working after it's been set to multifunction mode. > > > > > > This is missing a reference to the commit introducing the P50 quirk, > > > which is e0547c81bfcf ("PCI: Reset Lenovo ThinkPad P50 nvgpu at boot > > > if necessary"). > > > > > > Please describe in more detail how the GPU's PCI function stops working. > > > Does it respond with "all ones" when accessing MMIO? > > > Do MMIO accesses cause the system to hang? > > > > > > Could you provide lspci -vvxx output for the GPU and its associated > > > HDA controller with and without b516ea586d71? > > > > > > Does this machine have external display connectors via which audio > > > can be streamed? > > > > > > > > > > I'm not really holding my breath on this patch to being accepted: > > > > there's a good chance there's a better solution for this (and I'm > > > > going > > > > to continue investigating for one after sending this patch), this is > > > > more just to start a conversation on what the proper way to fix this > > > > is. > > > > > > Posting as an RFC might have been more appropriate then. > > > > > > > no, a revert is actually appropriate. If a commit fixes something, > > but breaks something else, it gets either reverted or fixed. If nobody > > fixes it, then revert it is. > > To answer Lukas's question btw: most of the details on how things break are > back in the original commit (sorry for forgetting the reference!), there's a > _lot_ of explanation there that I'd rather not retype, so just refer back to > the commit and bug @ https://bugs.freedesktop.org/show_bug.cgi?id=75985 > > Additionally, there was some extra discussion providing some more detail in > the email thread that I had with Bjorn: > > https://lkml.org/lkml/2019/2/12/1172 > > As for how this commit breaks the workaround: it seems that when we enable > the > HDA controller and put the GPU into multifunction mode, the function-level > reset stops working and thus we can't reset the GPU anymore. Currently I can > see a couple of solutions (again, please feel free to suggest more!): > > * Just revert the commit. We should do this if necessary, but of course I'd > much rather try finding a fix first > * Disable the HDA controller temporarily when a GPU reset is neded in > quirk_reset_lenovo_thinkpad_p50_nvgpu(), then call the function level > reset, then re-enable the HDA controller. I have no idea if this actually > works yet, but I'm about to try this on my system > * Get quirk_reset_lenovo_thinkpad_p50_nvgpu() to run before > quirk_nvidia_hda(). This would probably be fine, but we would need to > rework some stuff in the PCI subsystem (maybe it already has a way to do > this? haven't checked yet) so that we could perform an flr probe early > enough to perform the quirk Good news! After some investigation looks like that function level reset actually does work, just that after we put it in multifunction mode pci_parent_bus_reset() sees multiple devices on the bus and returns -ENOTTY as a result. So I should definitely be able to come up with a fix for this other then reverting this :). Will send out patches soon > > > > So, I'm kind of confused about why exactly this was implemented as an > > > > early boot quirk in the first place. If we're seeing the GPU's PCI > > > > device, we already know the GPU is there. Shouldn't we be able to > > > > check > > > > for the existence of the HDA device once we probe the GPU in nouveau? > > > > > > I think a motivation to keep this generic was to make it work with > > > other drivers besides nouveau, specifically Nvidia's proprietary driver. > > > nouveau might not even be enabled. > > > > > > > > > > that still doesn't explain why this was implemented as an early quirk > > > > > > This isn't an early quirk. Those live in arch/x86/kernel/early- > > > quirks.c. > > > This is just a PCI quirk executed on device enumeration and on resume. > > > Devices aren't necessarily enumerated only on boot, e.g. think > > > Thunderbolt. > > > > > > Thanks, > > > > > > Lukas -- Cheers, Lyude Paul