Date: Wed, 5 May 2021 14:15:01 +0200
From: Pali Rohár
To: Shanker R Donthineni
Cc: Bjorn Helgaas, Alex Williamson, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, Sinan Kaya, Vikram Sethi, Amey Narkhede
Subject: Re: [PATCH v4 2/2] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs
Message-ID: <20210505121501.54dlrussyk7kij5d@pali>
References: <20210430170151.GA660969@bjorn-Precision-5520> <52c89d4e-6b26-6c56-d71e-508a715394ab@nvidia.com>
In-Reply-To: <52c89d4e-6b26-6c56-d71e-508a715394ab@nvidia.com>

On Friday 30 April 2021 17:11:23 Shanker R Donthineni wrote:
> Thanks Bjorn for reviewing patch.
>
> On 4/30/21 12:01 PM, Bjorn Helgaas wrote:
> > On Wed, Apr 28, 2021 at 07:49:07PM -0500, Shanker Donthineni wrote:
> >> On select platforms, some Nvidia GPU devices do not work with SBR.
> >> Triggering SBR would leave the device inoperable for the current
> >> system boot. It requires a system hard-reboot to get the GPU device
> >> back to normal operating condition post-SBR. For the affected
> >> devices, enable NO_BUS_RESET quirk to fix the issue.
> > Since 1/2 adds _RST support, should I infer that _RST works on these
> > Nvidia GPUs even though SBR does not? If so, how does _RST do the
> > reset?
> Yes, _RST method works but not SBR. The _RST method in DSDT-AML uses
> platform-specific initialization steps outside of the GPU BARs for resetting
> the GPU device.

Hello! If I understand this "reset" issue correctly, the affected PCIe
GPU devices cannot be reset via PCI Secondary Bus Reset (PCIe Warm
Reset), so a special, platform-specific reset needs to be issued
instead, and the code for this platform-specific reset is included in
the ACPI DSDT table.

But because the ACPI DSDT table is part of the BIOS/firmware and not
part of the PCIe GPU device itself, this kind of reset is available to
the Linux kernel only when the motherboard vendor (or whoever burns the
BIOS/firmware into the motherboard EEPROM) includes this specific code.
Am I right?
So if this PCIe GPU device is connected to another motherboard or
another system, this special platform reset in the ACPI DSDT is not
available. What does the default ACPI _RST() method do on motherboards
without this special platform reset hook? It probably would not be able
to reset these PCIe GPU devices if the standard SBR cannot reset them.

Would it not be better to include "native" Linux code for resetting
these PCIe devices?

Please correct me if my assumption is wrong or if I have misunderstood
this issue.

> > Do you have a root cause for why SBR doesn't work?
> It is a hardware implementation specific issue. GPU end-point device
> is inoperative after receiving SBR from the RP/SwitchPort. This quirk is
> to prevent SBR.
>
> > I'm not super
> > confident that we perform resets correctly in general, and if the
> > problem is an issue in Linux, it'd be nice to fix that.
> We have not seen any issue with Linux SBR implementation.
>
> >> This issue will be fixed in the next generation of hardware.
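
To make the concern above concrete, here is a minimal userspace sketch
of the situation being asked about. It is not kernel code and does not
use any kernel API: the names (model_dev, pick_reset, the enum values)
are hypothetical, the ordering "_RST first, then SBR" is an assumption,
and other reset methods such as FLR are deliberately left out. It only
models two facts from the thread: the quirk forbids SBR, and _RST
exists only if the platform firmware provides it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only; these names are not kernel identifiers. */
enum reset_method {
	RESET_NONE,      /* no usable reset method */
	RESET_ACPI_RST,  /* platform-specific reset via the DSDT _RST method */
	RESET_SBR        /* PCI Secondary Bus Reset */
};

struct model_dev {
	bool no_bus_reset; /* models the NO_BUS_RESET quirk being set */
	bool fw_has_rst;   /* models firmware providing _RST for this device */
};

/*
 * Pick a reset method for the modelled device.
 * Assumption: _RST is preferred when present, SBR is the fallback,
 * and the quirk removes SBR from consideration.
 */
static enum reset_method pick_reset(const struct model_dev *d)
{
	assert(d != NULL);

	if (d->fw_has_rst)
		return RESET_ACPI_RST;
	if (!d->no_bus_reset)
		return RESET_SBR;
	return RESET_NONE;
}
```

In this model, a quirked GPU on a board whose firmware lacks _RST ends
up with RESET_NONE, which is exactly the case the question about other
motherboards is asking about.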