From: "Kuppuswamy, Sathyanarayanan"
To: Sinan Kaya, Bjorn Helgaas
Cc: bhelgaas@google.com, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, ashok.raj@intel.com, Jay Vosburgh
Subject: Re: [PATCH v3 1/1] PCI/ERR: Fix reset logic in pcie_do_recovery() call
Date: Mon, 28 Sep 2020 10:15:44 -0700
Message-ID: <8a3aeb3c-83c4-8626-601d-360946d55dd8@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 9/28/20 4:17 AM, Sinan Kaya wrote:
> On 9/27/2020 10:43 PM, Kuppuswamy, Sathyanarayanan wrote:
>> FATAL + no-hotplug - In this case, link will still be reseted. But
>> currently driver state is not properly restored. So I attempted
>> to restore it using pci_reset_bus().
>>
>>          status = reset_link(dev);
>> -        if (status != PCI_ERS_RESULT_RECOVERED) {
>> +        if (status == PCI_ERS_RESULT_RECOVERED) {
>> +            status = PCI_ERS_RESULT_NEED_RESET;
>>
>> ...
>>
>>      if (status == PCI_ERS_RESULT_NEED_RESET) {
>>          /*
>> -         * TODO: Should call platform-specific
>> -         * functions to reset slot before calling
>> -         * drivers' slot_reset callbacks?
>> +         * TODO: Optimize the call to pci_reset_bus()
>> +         *
>> +         * There are two components to pci_reset_bus().
>> +         *
>> +         * 1. Do platform specific slot/bus reset.
>> +         * 2. Save/Restore all devices in the bus.
>> +         *
>> +         * For hotplug capable devices and fatal errors,
>> +         * device is already in reset state due to link
>> +         * reset. So repeating platform specific slot/bus
>> +         * reset via pci_reset_bus() call is redundant. So
>> +         * can optimize this logic and conditionally call
>> +         * pci_reset_bus().
>>           */
>> +        pci_reset_bus(dev);
>
> I think we have to go to remove/rescan for this case as you also
> mentioned above. There is no state to save. All BAR assignments
> are gone. Entire device programming is also lost.
>
> I don't think pci_reset_bus() can recover from this situation safely.
> It will make things worse by saving/restoring the hardware default
> state.
>
> This should remove/rescan logic should be inside DPC's slot_reset()
> function BTW. Not here.

Since there is no state restoration for FATAL errors, I am wondering
whether the calls to ->error_detected(), ->mmio_enabled() and
->slot_reset() are required at all?

Let me know your comments on the following pseudo code.

if (fatal error && hotplug_supported)
	do nothing // if fatal triggered by DPC, clear DPC state.

if (fatal error && no-hotplug)
	perform slot_reset and re-enumerate affected devices.

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
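
For discussion, the pseudo code above could be modeled as a small
standalone C sketch. This is only an illustration of the proposed
decision, not kernel code: every name in it (sketch_severity,
sketch_action, sketch_plan, sketch_plan_recovery()) is hypothetical and
does not exist in drivers/pci. The non-fatal branch is an assumption,
since the pseudo code only covers fatal errors, and the pseudo code does
not say whether DPC state should be cleared in the no-hotplug case, so
the sketch leaves that flag unset there.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical types for illustration; not part of the kernel PCI core. */
enum sketch_severity { SKETCH_NONFATAL, SKETCH_FATAL };

enum sketch_action {
	SKETCH_RUN_DRIVER_CALLBACKS,  /* assumed: existing callback flow stays as-is */
	SKETCH_LEAVE_TO_HOTPLUG,      /* "do nothing" branch of the pseudo code */
	SKETCH_RESET_AND_REENUMERATE, /* slot reset, then re-enumerate devices */
};

struct sketch_plan {
	enum sketch_action action;
	bool clear_dpc_state;
};

/* Map the pseudo code's two fatal-error branches onto a plan. */
static struct sketch_plan sketch_plan_recovery(enum sketch_severity sev,
					       bool hotplug_supported,
					       bool triggered_by_dpc)
{
	struct sketch_plan plan = { SKETCH_RUN_DRIVER_CALLBACKS, false };

	assert(sev == SKETCH_NONFATAL || sev == SKETCH_FATAL);

	/* Assumption: non-fatal errors are outside the proposal. */
	if (sev != SKETCH_FATAL)
		return plan;

	if (hotplug_supported) {
		/* Fatal + hotplug: do nothing, but clear DPC state if DPC fired. */
		plan.action = SKETCH_LEAVE_TO_HOTPLUG;
		plan.clear_dpc_state = triggered_by_dpc;
	} else {
		/* Fatal + no-hotplug: slot reset and re-enumeration. */
		plan.action = SKETCH_RESET_AND_REENUMERATE;
	}

	return plan;
}
```

Keeping the decision in one side-effect-free function makes it easy to
see that driver callbacks are skipped on both fatal branches, which is
the question raised above.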