From: Chen Yu
To: intel-wired-lan@lists.osuosl.org
Cc: "Neftin, Sasha", Len Brown, "Rafael J. Wysocki", "Brandt, Todd E", Zhang Rui, Tony Nguyen, Jesse Brandeburg, linux-kernel@vger.kernel.org, Chen Yu
Subject: [PATCH 0/4][RFC] Disable e1000e power management if hardware error is detected
Date: Wed, 11 Nov 2020 13:50:35 +0800

This is a trial patchset that aims to cope with an intermittently
triggered hardware error during system resume.

On some platforms a NIC hardware error was detected during resume from
S3, causing the NIC to not fully initialize and to remain in an unstable
state afterwards. As a consequence the system then fails to suspend due
to the incorrect NIC status. In theory, if the NIC could not be
initialized after resume, it should not do system/runtime suspend/resume
afterwards. There are two proposals to deal with this situation:

Either:
1. Each time before the NIC suspends, check the status of the NIC by
   querying the corresponding registers, and bypass the suspend callback
   on this NIC if it is unstable.

Or:
2. During NIC resume, if the hardware error is detected, remove the NIC
   from the power management list entirely.

Proposal 2 was chosen in this patch set because:
1. Proposal 1 requires the e1000e driver to query the status of the NIC.
   However, there seem to be no specific registers on the e1000e for
   querying the result of NIC initialization.
2. Proposal 1 only bypasses the suspend process; the power management
   framework is still aware of this NIC, which might lead to race
   conditions.
3. Proposal 2 is a clean, platform-independent solution: not only e1000e
   but also other drivers could leverage this generic mechanism in the
   future.

Comments appreciated.

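For illustration only, the effect of proposal 2 can be modelled with a
small userspace C sketch. This is not kernel code and not the code from
this series: struct toy_dev, struct toy_pm_list and the toy_*() helpers
are all hypothetical names, and the flat array merely stands in for the
PM core's device list. Removal is deferred until after the resume walk
so that the list is not modified while it is being iterated; that
deferral is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_DEVS 8

/* Hypothetical stand-in for a device known to a PM core. */
struct toy_dev {
	const char *name;
	bool hw_error_on_resume;	/* simulated: init fails during resume */
	int suspend_calls;		/* times the suspend callback ran */
};

/* Hypothetical stand-in for the PM core's list of managed devices. */
struct toy_pm_list {
	struct toy_dev *devs[TOY_MAX_DEVS];
	size_t count;
};

void toy_pm_add(struct toy_pm_list *pm, struct toy_dev *dev)
{
	assert(pm->count < TOY_MAX_DEVS);
	pm->devs[pm->count++] = dev;
}

/* Drop a device from the list; later PM passes no longer see it. */
void toy_pm_remove(struct toy_pm_list *pm, struct toy_dev *dev)
{
	for (size_t i = 0; i < pm->count; i++) {
		if (pm->devs[i] != dev)
			continue;
		/* Close the gap so the remaining order is preserved. */
		for (size_t j = i + 1; j < pm->count; j++)
			pm->devs[j - 1] = pm->devs[j];
		pm->count--;
		return;
	}
}

/* Resume pass: collect failing devices, then remove them afterwards. */
void toy_resume_all(struct toy_pm_list *pm)
{
	struct toy_dev *failed[TOY_MAX_DEVS];
	size_t nfailed = 0;

	for (size_t i = 0; i < pm->count; i++)
		if (pm->devs[i]->hw_error_on_resume)
			failed[nfailed++] = pm->devs[i];

	for (size_t i = 0; i < nfailed; i++)
		toy_pm_remove(pm, failed[i]);
}

/* Suspend pass: only devices still on the list get their callback. */
void toy_suspend_all(struct toy_pm_list *pm)
{
	for (size_t i = 0; i < pm->count; i++)
		pm->devs[i]->suspend_calls++;
}
```

In this model, a device whose resume fails is skipped by every later
toy_suspend_all() pass without the suspend path having to query any
device status, which is the property that distinguishes proposal 2 from
proposal 1.
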
Chen Yu (4):
  e1000e: save the return value of e1000e_reset()
  PM: sleep: export device_pm_remove() for driver use
  e1000e: Introduce workqueue to disable the power management
  e1000e: Disable the power management if hardware error detected during resume

 drivers/base/power/main.c                  |  1 +
 drivers/base/power/power.h                 |  8 -------
 drivers/net/ethernet/intel/e1000e/e1000.h  |  1 +
 drivers/net/ethernet/intel/e1000e/netdev.c | 27 ++++++++++++++++++----
 include/linux/pm.h                         | 12 ++++++++++
 5 files changed, 37 insertions(+), 12 deletions(-)

-- 
2.17.1