From: Eric Dumazet
Date: Fri, 26 Aug 2022 09:18:58 -0700
Subject: Re: [PATCH v2] tg3: Disable tg3 device on system reboot to avoid triggering AER
To: Kai-Heng Feng
Cc: siva.kallam@broadcom.com, prashant@broadcom.com, mchan@broadcom.com,
    Josef Bacik, "David S. Miller", Jakub Kicinski, Paolo Abeni, netdev, LKML
References: <20220826002530.1153296-1-kai.heng.feng@canonical.com>
In-Reply-To: <20220826002530.1153296-1-kai.heng.feng@canonical.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Aug 25, 2022 at 5:25 PM Kai-Heng Feng wrote:
>
> Commit d60cd06331a3 ("PM: ACPI: reboot: Use S5 for reboot") caused a
> reboot hang on one Dell servers so the commit was reverted.
>
> Someone managed to collect the AER log and it's caused by MSI:
> [ 148.762067] ACPI: Preparing to enter system sleep state S5
> [ 148.794638] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 5
> [ 148.803731] {1}[Hardware Error]: event severity: recoverable
> [ 148.810191] {1}[Hardware Error]: Error 0, type: fatal
> [ 148.816088] {1}[Hardware Error]: section_type: PCIe error
> [ 148.822391] {1}[Hardware Error]: port_type: 0, PCIe end point
> [ 148.829026] {1}[Hardware Error]: version: 3.0
> [ 148.834266] {1}[Hardware Error]: command: 0x0006, status: 0x0010
> [ 148.841140] {1}[Hardware Error]: device_id: 0000:04:00.0
> [ 148.847309] {1}[Hardware Error]: slot: 0
> [ 148.852077] {1}[Hardware Error]: secondary_bus: 0x00
> [ 148.857876] {1}[Hardware Error]: vendor_id: 0x14e4, device_id: 0x165f
> [ 148.865145] {1}[Hardware Error]: class_code: 020000
> [ 148.870845] {1}[Hardware Error]: aer_uncor_status: 0x00100000, aer_uncor_mask: 0x00010000
> [ 148.879842] {1}[Hardware Error]: aer_uncor_severity: 0x000ef030
> [ 148.886575] {1}[Hardware Error]: TLP Header: 40000001 0000030f 90028090 00000000
> [ 148.894823] tg3 0000:04:00.0: AER: aer_status: 0x00100000, aer_mask: 0x00010000
> [ 148.902795] tg3 0000:04:00.0: AER: [20] UnsupReq (First)
> [ 148.910234] tg3 0000:04:00.0: AER: aer_layer=Transaction Layer, aer_agent=Requester ID
> [ 148.918806] tg3 0000:04:00.0: AER: aer_uncor_severity: 0x000ef030
> [ 148.925558] tg3 0000:04:00.0: AER: TLP Header: 40000001 0000030f 90028090 00000000
>
> The MSI is probably raised by incoming packets, so power down the device
> and disable bus mastering to stop the traffic, as user confirmed this
> approach works.
>
> In addition to that, be extra safe and cancel reset task if it's running.
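
For readers decoding the register values in the quoted log, here is a minimal userspace sketch. It assumes the standard PCIe AER uncorrectable-status bit layout, where bit 16 is Unexpected Completion and bit 20 is Unsupported Request. The helper names (aer_decode, aer_decode_quoted_log) are hypothetical and are not kernel APIs; the three constants are copied from the log above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bits of the PCIe AER Uncorrectable Error Status register that show up
 * in the quoted log. */
#define AER_UNC_UNX_COMP  (UINT32_C(1) << 16) /* Unexpected Completion */
#define AER_UNC_UNSUP_REQ (UINT32_C(1) << 20) /* Unsupported Request */

struct aer_report {
	uint32_t unmasked; /* status bits not hidden by the mask */
	int lowest_bit;    /* lowest set bit in unmasked, or -1 if none */
	int fatal;         /* 1 if that bit is set in the severity register */
};

/* Hypothetical helper, not a kernel API: apply the mask, find the lowest
 * reported bit, and look up its severity. */
struct aer_report aer_decode(uint32_t status, uint32_t mask, uint32_t severity)
{
	struct aer_report r = { status & ~mask, -1, 0 };

	for (int bit = 0; bit < 32; bit++) {
		if (r.unmasked & (UINT32_C(1) << bit)) {
			r.lowest_bit = bit;
			r.fatal = (int)((severity >> bit) & 1u);
			break;
		}
	}
	return r;
}

/* Decode the values taken from the quoted log and print the result. */
void aer_decode_quoted_log(void)
{
	struct aer_report r = aer_decode(0x00100000u, 0x00010000u, 0x000ef030u);

	assert(AER_UNC_UNX_COMP == 0x00010000u); /* the only masked bit */
	assert(r.unmasked == AER_UNC_UNSUP_REQ); /* matches "[20] UnsupReq" */
	printf("unmasked=0x%08x bit=%d fatal=%d\n",
	       (unsigned)r.unmasked, r.lowest_bit, r.fatal);
}
```

With the logged values this reports bit 20 as the only unmasked error, and that bit is clear in aer_uncor_severity (0x000ef030), which is consistent with the "recoverable" event severity in the log.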
>
> Cc: Josef Bacik
> Link: https://lore.kernel.org/all/b8db79e6857c41dab4ef08bdf826ea7c47e3bafc.1615947283.git.josef@toxicpanda.com/
> BugLink: https://bugs.launchpad.net/bugs/1917471
> Signed-off-by: Kai-Heng Feng
> ---
> v2:
>  - Move tg3_reset_task_cancel() outside of rtnl_lock() to prevent
>    deadlock.
>

It seems tg3_reset_task_cancel() is already called while rtnl is
held/owned. Should we worry about that?

>  drivers/net/ethernet/broadcom/tg3.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
> index db1e9d810b416..89889d8150da1 100644
> --- a/drivers/net/ethernet/broadcom/tg3.c
> +++ b/drivers/net/ethernet/broadcom/tg3.c
> @@ -18076,16 +18076,20 @@ static void tg3_shutdown(struct pci_dev *pdev)
>  	struct net_device *dev = pci_get_drvdata(pdev);
>  	struct tg3 *tp = netdev_priv(dev);
>
> +	tg3_reset_task_cancel(tp);
> +
>  	rtnl_lock();
> +
>  	netif_device_detach(dev);
>
>  	if (netif_running(dev))
>  		dev_close(dev);
>
> -	if (system_state == SYSTEM_POWER_OFF)
> -		tg3_power_down(tp);
> +	tg3_power_down(tp);
>
>  	rtnl_unlock();
> +
> +	pci_disable_device(pdev);
>  }
>
>  /**
> --
> 2.36.1
>
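
To make the ordering concern from the v2 note concrete, here is a userspace analogy using pthreads. It is only a sketch of the general hazard: waiting for a worker to finish while holding a lock that the worker itself needs can block forever. Everything in it is a stand-in and an assumption: big_lock plays the role of rtnl, reset_worker plays the role of the reset task, and shutdown_demo is a hypothetical name. Whether tg3's reset task actually takes rtnl on every path, and whether the existing callers that cancel it under rtnl are therefore at risk, is the open question above and is not answered by this sketch.

```c
#include <pthread.h>
#include <stdatomic.h>

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER; /* stand-in for rtnl */
static atomic_int stop_requested;
static int work_rounds; /* protected by big_lock */

/* Stand-in for a reset worker that needs big_lock to make progress. */
static void *reset_worker(void *arg)
{
	(void)arg;
	while (!atomic_load(&stop_requested)) {
		pthread_mutex_lock(&big_lock);
		work_rounds++;
		pthread_mutex_unlock(&big_lock);
	}
	return NULL;
}

/* Hypothetical shutdown path. Returns 0 on success, -1 on thread errors. */
int shutdown_demo(void)
{
	pthread_t worker;

	if (pthread_create(&worker, NULL, reset_worker, NULL) != 0)
		return -1;

	/*
	 * Step 1: stop the worker and wait for it WITHOUT holding big_lock.
	 * If big_lock were held here, the worker could be blocked in
	 * pthread_mutex_lock() and the join below would never return.
	 */
	atomic_store(&stop_requested, 1);
	if (pthread_join(worker, NULL) != 0)
		return -1;

	/* Step 2: only now take the lock for the rest of the teardown. */
	pthread_mutex_lock(&big_lock);
	/* ... detach, close, power down would go here ... */
	pthread_mutex_unlock(&big_lock);

	return 0;
}
```

Calling shutdown_demo() from a main() should return 0. Moving the join inside the locked region is the pattern the v2 change avoids in tg3_shutdown().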