Date: Wed, 7 Jul 2021 17:10:42 -0500
From: Bjorn Helgaas
To: Pali Rohár
Cc: Aaron Ma, jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com,
 davem@davemloft.net, kuba@kernel.org, intel-wired-lan@lists.osuosl.org,
 netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 Krzysztof Wilczyński, linux-pci@vger.kernel.org
Subject: Re: [PATCH 1/2] igc: don't rd/wr iomem when PCI is
 removed
Message-ID: <20210707221042.GA939059@bjorn-Precision-5520>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20210707215337.lwbgvb6lxs3gmsbb@pali>
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 07, 2021 at 11:53:37PM +0200, Pali Rohár wrote:
> On Tuesday 06 July 2021 15:12:41 Bjorn Helgaas wrote:
> > On Fri, Jul 02, 2021 at 12:51:19PM +0800, Aaron Ma wrote:
> > > Check PCI state when rd/wr iomem.
> > > Implement wr32 function as rd32 too.
> > >
> > > When unplug TBT dock with i225, rd/wr PCI iomem will cause error log:
> > > Trace:
> > > BUG: unable to handle page fault for address: 000000000000b604
> > > Oops: 0000 [#1] SMP NOPTI
> > > RIP: 0010:igc_rd32+0x1c/0x90 [igc]
> > > Call Trace:
> > > igc_ptp_suspend+0x6c/0xa0 [igc]
> > > igc_ptp_stop+0x12/0x50 [igc]
> > > igc_remove+0x7f/0x1c0 [igc]
> > > pci_device_remove+0x3e/0xb0
> > > __device_release_driver+0x181/0x240
> > >
> > > Signed-off-by: Aaron Ma
> > > ---
> > >  drivers/net/ethernet/intel/igc/igc_main.c | 16 ++++++++++++++++
> > >  drivers/net/ethernet/intel/igc/igc_regs.h |  7 ++-----
> > >  2 files changed, 18 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
> > > index f1adf154ec4a..606b72cb6193 100644
> > > --- a/drivers/net/ethernet/intel/igc/igc_main.c
> > > +++ b/drivers/net/ethernet/intel/igc/igc_main.c
> > > @@ -5292,6 +5292,10 @@ u32 igc_rd32(struct igc_hw *hw, u32 reg)
> > >  	u8 __iomem *hw_addr = READ_ONCE(hw->hw_addr);
> > >  	u32 value = 0;
> > >
> > > +	if (igc->pdev &&
> > > +		igc->pdev->error_state == pci_channel_io_perm_failure)
> > > +		return 0;
>
> > I don't think this solves the problem.
> >
> >   - Driver calls igc_rd32().
> >
> >   - "if (pci_channel_io_perm_failure)" evaluates to false (error_state
> >     does not indicate an error).
> >
> >   - Device is unplugged.
> >
> >   - igc_rd32() calls readl(), which performs MMIO read, which fails
> >     because the device is no longer present.  readl() returns ~0 on
> >     most platforms.
> >
> >   - Same page fault occurs.
>
> Hi Bjorn! I think that backtrace show that this error happens when PCIe
> hotplug get interrupt that device was unplugged and PCIe hotplug code
> calls remove/unbind procedure to stop unplugged driver.
>
> And in this case really does not make sense to try issuing MMIO read,
> device is already unplugged.
>
> I looked that PCIe hotplug driver calls pci_dev_set_disconnected() when
> this unplug interrupt happens and pci_dev_set_disconnected() just sets
> pci_channel_io_perm_failure flag.
>
> drivers/pci/pci.h provides function pci_dev_is_disconnected() which
> checks if that flag pci_channel_io_perm_failure is set.
>
> So I think that pci_dev_is_disconnected() is useful and could be
> exported also to drivers (like this one) so they can check if
> pci_dev_set_disconnected() was called in past and PCI driver is now in
> unbind/cleanup/remove state because PCIe device is already disconnected
> and not accessible anymore.
>
> But maybe this check should be on other place in driver unbound
> procedure and not in general MMIO read function?

If we add the check as proposed in this patch, I think people will
read it and think this is the correct way to avoid MMIO errors.  It
does happen to avoid some MMIO errors, but it cannot avoid them all,
so it's not a complete solution and it gives a false sense of
security.

A complete solution requires a test *after* the MMIO read.  If you
have the test after the read, you don't really need one before.  Sure,
testing before means you can avoid one MMIO read failure in some
cases.  But avoiding that failure costs quite a lot in code clutter.

> > The only way is to check *after* the MMIO read to see whether an error
> > occurred.  On most platforms that means checking for ~0 data.  If you
> > see that, a PCI error *may* have occurred.
> >
> > If you know that ~0 can never be valid, e.g., if you're reading a
> > register where ~0 is not a valid value, you know for sure that an
> > error has occurred.
> >
> > If ~0 might be a valid value, e.g., if you're reading a buffer that
> > contains arbitrary data, you have to look harder.  You might read a
> > register than cannot contain ~0, and see if you get the data you
> > expect.  Or you might read the Vendor ID or something from config
> > space.
> >
> > >  	value = readl(&hw_addr[reg]);
> > >
> > >  	/* reads should not return all F's */
> > > @@ -5308,6 +5312,18 @@ u32 igc_rd32(struct igc_hw *hw, u32 reg)
> > >  	return value;
> > >  }
> > >
> > > +void igc_wr32(struct igc_hw *hw, u32 reg, u32 val)
> > > +{
> > > +	struct igc_adapter *igc = container_of(hw, struct igc_adapter, hw);
> > > +	u8 __iomem *hw_addr = READ_ONCE(hw->hw_addr);
> > > +
> > > +	if (igc->pdev &&
> > > +		igc->pdev->error_state == pci_channel_io_perm_failure)
> > > +		return;
> > > +
> > > +	writel((val), &hw_addr[(reg)]);
> > > +}
> > > +
> > >  int igc_set_spd_dplx(struct igc_adapter *adapter, u32 spd, u8 dplx)
> > >  {
> > >  	struct igc_mac_info *mac = &adapter->hw.mac;
> > > diff --git a/drivers/net/ethernet/intel/igc/igc_regs.h b/drivers/net/ethernet/intel/igc/igc_regs.h
> > > index cc174853554b..eb4be87d0e8b 100644
> > > --- a/drivers/net/ethernet/intel/igc/igc_regs.h
> > > +++ b/drivers/net/ethernet/intel/igc/igc_regs.h
> > > @@ -260,13 +260,10 @@ struct igc_hw;
> > >  u32 igc_rd32(struct igc_hw *hw, u32 reg);
> > >
> > >  /* write operations, indexed using DWORDS */
> > > -#define wr32(reg, val) \
> > > -do { \
> > > -	u8 __iomem *hw_addr = READ_ONCE((hw)->hw_addr); \
> > > -	writel((val), &hw_addr[(reg)]); \
> > > -} while (0)
> > > +void igc_wr32(struct igc_hw *hw, u32 reg, u32 val);
> > >
> > >  #define rd32(reg) (igc_rd32(hw, reg))
> > > +#define wr32(reg, val) (igc_wr32(hw, reg, val))
> > >
> > >  #define wrfl() ((void)rd32(IGC_STATUS))
> > >
> > > --
> > > 2.30.2
> > >