Received: by 2002:ac0:a581:0:0:0:0:0 with SMTP id m1-v6csp2474123imm; Sat, 30 Jun 2018 21:40:32 -0700 (PDT) X-Google-Smtp-Source: AAOMgpdJitlRE8jIlCrnfQS11O0zREjpJn8pa5b3tLicnZ1jE2y++YtCMd83e2094zBMP0AbIeje X-Received: by 2002:a62:6698:: with SMTP id s24-v6mr20598887pfj.243.1530420032287; Sat, 30 Jun 2018 21:40:32 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1530420032; cv=none; d=google.com; s=arc-20160816; b=AyREN0rC7my+P+fl1TJPnti1/bft253/MlcXFzYISHRLTpScE7Yh3lBRYOsVjBRcZZ AhZlr3cHK5zGTqW/G1BS/R0nmiDBiU5PSX4SNxG3wixQblYoUt1OGM96+23H1wxvQY2l KWF41XmhVbfc09hTyYIuQ0MCjHICiD7fcQlLtgNevihcsyhNJv6SiaA43RwOmLDNdBBk KP+aRPybxLeg9dlyZ8Q344OCJyfWrZ5cdbl9jDaO37O3zpfSwwrPUQifntDQAGZ4dQdC UgPWa5SNwbrgBi+S9tDG7aOlZUfJzJsT3uyORBKA/moHL9jFbx4dRkvElZH5y9TDsNib PuJg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dkim-signature :arc-authentication-results; bh=IWtqc0nNyT3q1rY3QA1g++ui7vzbh58HnPXDWcEipbE=; b=cNqJtwI68AtnnIqKD30ycsWVxHci+UFXuJflOTgcEVb4OTP/mtORJt6b0ZIm9L0uRa 8U12z2ShDq0JhpvRgbbyzRdIXhmpISUlmQrCKvYjfZ3ykfRdLuQDcTgi+XlmyJaPdXoB V54z28IlZ2t4i3qQqYIbtvla2NDf4MyLop5aHVhXAY3dM46c2f8RPIKLuFIzFCNTmUNX 1YEbI8rVhblYGDnHYUIbE8RiOMUuLWh2UR2Mr35d8TJmkX3FBBEgDLrcMkBUJv5kJu9X VRxyy1snK5KaMsgxiSdRiuMobnQ4tv2ELrYs+uqqcnRWmFlDAIf85yWYEKTDse2EQlby y4ew== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b="JQ9/nIrS"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id k72-v6si13501685pfj.141.2018.06.30.21.40.18; Sat, 30 Jun 2018 21:40:32 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b="JQ9/nIrS"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751898AbeGAEjG (ORCPT + 99 others); Sun, 1 Jul 2018 00:39:06 -0400 Received: from mail-ot0-f196.google.com ([74.125.82.196]:37571 "EHLO mail-ot0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751409AbeGAEjC (ORCPT ); Sun, 1 Jul 2018 00:39:02 -0400 Received: by mail-ot0-f196.google.com with SMTP id q11-v6so11852801otk.4; Sat, 30 Jun 2018 21:39:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=IWtqc0nNyT3q1rY3QA1g++ui7vzbh58HnPXDWcEipbE=; b=JQ9/nIrSFsN/7B3J+dC2cUUP3TLXqH48p9vAtXxsaROd/ElUoQfvABeXbSbuMafQdV XDM/f+5TSIHrxdgojQd25bAXuRN+KQi2NoeJw6BCH6ekNTsGNFGyLuTO00CRBUvHtcJy FkNtUcx/YI34pkqjzliepmG/sZWoI0Z3VJ0hWrEOXqcxyRina6EdzKLYfVhX0aaNsjGf YRDoC1rw6r/3+Yo9IJvDoARz5xnH3ALn1VBklHeLbxYXNVbKw7hCiClLBzGjsX0dIzOq Dh6mXg/1FmBC204dp8kxC0LX6uM2pqzcv1P7G/FFATILL+9oQmr2vJ90bVTg/0fSPsIE 62Cw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=IWtqc0nNyT3q1rY3QA1g++ui7vzbh58HnPXDWcEipbE=; b=TSLcp3xtuR/UefwRbO7t5/zmfW+axQn/RokQFWpKOVX39pZXum/TqdoOBbA2XsLADX /GzZHRQiDyZskQPN8z1207c43zjM3DfSWud4TBHcePY2KeIXZ0AJDDAMpGkrpz751Ptz qCL+gDHMLESPb6WcG3mdym0SGD+8juSJUXQu7dedTC0QudTC9t7Ce0yxnCb2K4VlN46R maQHizcKkpScWV2xCVJiVWf7dI3xtlnv+c6Iypghu+8k6CxBJhtDmoAIM3ALLXpNVKUn Enarhf5V/OO715LDNru6/DH+9PL4zGgHjnJvzVGux8gUtNn9maUOvLs8hKue0agNNeTR oP8w== X-Gm-Message-State: APt69E1IjjIyO7IpGi3J78oAgAhy7g8I8zA7jrPpuWAWgY2fPic2XIqt rkS3ch1SSm/bbjFBC9Kl0HU= X-Received: by 2002:a9d:61c9:: with SMTP id h9-v6mr222050otk.4.1530419942190; Sat, 30 Jun 2018 21:39:02 -0700 (PDT) Received: from nuclearis2.gtech ([2601:2c1:8500:500b:da50:e6ff:fe00:4a96]) by smtp.gmail.com with ESMTPSA id z17-v6sm1784459otd.9.2018.06.30.21.39.00 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 30 Jun 2018 21:39:01 -0700 (PDT) Subject: Re: [PATCH v2] PCI/AER: Fix aerdrv loading with "pcie_ports=native" parameter To: Bjorn Helgaas Cc: bhelgaas@google.com, keith.busch@intel.com, alex_gagniuc@dellteam.com, austin_bolen@dell.com, shyam_iyer@dell.com, Frederick Lawler , Greg Kroah-Hartman , Oza Pawandeep , linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org, Borislav Petkov References: <20180619195835.5423-1-mr.nuke.me@gmail.com> <20180630213140.GG9547@bhelgaas-glaptop.roam.corp.google.com> From: Alex G Message-ID: Date: Sat, 30 Jun 2018 23:39:00 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: <20180630213140.GG9547@bhelgaas-glaptop.roam.corp.google.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 06/30/2018 04:31 PM, Bjorn Helgaas wrote: > [+cc Borislav, linux-acpi, since this involves APEI/HEST] Borislav is not the relevant maintainer here, since we're not contingent on APEI handling. I think Keith has a lot more experience with this part of the kernel. > On Tue, Jun 19, 2018 at 02:58:20PM -0500, Alexandru Gagniuc wrote: >> According to the documentation, "pcie_ports=native", linux should use >> native AER and DPC services. While that is true for the _OSC method >> parsing, this is not the only place that is checked. Should the HEST >> table list PCIe ports as firmware-first, linux will not use native >> services. > > Nothing in ACPI-land looks at pcie_ports_native. How should ACPI > things work in the "pcie_ports=native" case? I guess we still have to > expect to receive error records from the firmware, because it may > certainly send us non-PCI errors (machine checks, etc) and maybe even > some PCI errors (even if the Linux AER driver claims AER interrupts, > we don't know what notification mechanisms the firmware may be using). I think ACPI land shouldn't care about this. We care about it from the PCIe stand point at the interface with ACPI. FW might see a delta in the sense that we request control of some features via _OSC, which we otherwise would not do without pcie_ports=native. > I guess best-case, we'll get ACPI error records for all non-PCI > things, and the Linux AER driver will see all the AER errors. It might affect FW's ability to catch errors, but that's dependent on the root port implementation. > Worst-case, I don't really know what to expect. Duplicate reporting > of AER errors via firmware and Linux AER driver? Some kind of > confusion about who acknowledges and clears them? Once user enters pcie_ports=native, all bets are off: you broke the contract you have with the FW -- whether or not you have this patch. > Out of curiosity, what is your use case for "pcie_ports=native"? > Presumably there's something that works better when using it, and > things work even *better* with this patch? Corectness. It bothers me that actual behavior does not match the documentation: native Use native PCIe services associated with PCIe ports unconditionally. > I know people do use it, because I often see it mentioned in forums > and bug reports, but I really don't expect it to work very well > because we're ignoring the usage model the firmware is designed > around. My unproven suspicion is that most uses are in the black > magic category of "there's a bug here, and we don't know how to fix > it, but pcie_ports=native makes it work better". There exist cases that firmware didn't consider. I would not call them "firmware bugs", but there are cases where the user understands the platform better than firmware. Example: on certain PCIe switches, a hardware PCIe error may bring the switch downstream ports into a state where they stop notifying hotplug events. Depending on the platform, firmware may or may not fix this condition, but "pcie_ports=native" enables DPC. DPC contains the error without the switch downstream port entering the weird error state in the first place. All bets are off at this point. > Obviously I would much rather find and fix those bugs so people > wouldn't have to stumble over the problem in the first place. Using native services when firmware asks us not to is a crapshoot every time. I don't condone the use of this feature, but if we do get a pcie_ports=native request, we should at least honor it. >> This happens because aer_acpi_firmware_first() doesn't take >> 'pcie_ports' into account. This is wrong. DPC uses the same logic when >> it decides whether to load or not, so fixing this also fixes DPC not >> loading. >> >> Signed-off-by: Alexandru Gagniuc >> --- >> drivers/pci/pcie/aer.c | 5 ++++- >> 1 file changed, 4 insertions(+), 1 deletion(-) >> >> Changes since v1: >> - Re-tested with latest and greatest (v4.18-rc1) -- works great >> >> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c >> index a2e88386af28..98ced0f7c850 100644 >> --- a/drivers/pci/pcie/aer.c >> +++ b/drivers/pci/pcie/aer.c >> @@ -291,7 +291,7 @@ static void aer_set_firmware_first(struct pci_dev *pci_dev) >> >> rc = apei_hest_parse(aer_hest_parse, &info); >> >> - if (rc) >> + if (rc || pcie_ports_native) >> pci_dev->__aer_firmware_first = 0; >> else >> pci_dev->__aer_firmware_first = info.firmware_first; >> @@ -327,6 +327,9 @@ bool aer_acpi_firmware_first(void) >> apei_hest_parse(aer_hest_parse, &info); >> aer_firmware_first = info.firmware_first; >> parsed = true; >> + if (pcie_ports_native) >> + aer_firmware_first = 0; >> + >> } >> return aer_firmware_first; >> } >> -- >> 2.14.3 >>