Subject: Re: [PATCH 4/6] nvme-pci: break up nvme_timeout and nvme_dev_disable
To: Keith Busch
Cc: axboe@fb.com, hch@lst.de, sagi@grimberg.me, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
References: <1517554849-7802-1-git-send-email-jianchao.w.wang@oracle.com> <1517554849-7802-5-git-send-email-jianchao.w.wang@oracle.com> <20180202183103.GI24417@localhost.localdomain>
From: "jianchao.wang"
Message-ID: <584a703f-98e9-a07b-ad44-6fa51b119c02@oracle.com>
Date: Mon, 5 Feb 2018 10:22:17 +0800
In-Reply-To: <20180202183103.GI24417@localhost.localdomain>

Hi Keith

Thanks for your kind response and comments. They are really appreciated.

On 02/03/2018 02:31 AM, Keith Busch wrote:
> On Fri, Feb 02, 2018 at 03:00:47PM +0800, Jianchao Wang wrote:
>> Currently, the complicated relationship between nvme_dev_disable
>> and nvme_timeout has become a devil that will introduce many
>> circular pattern which may trigger deadlock or IO hang. Let's
>> enumerate the tangles between them:
>> - nvme_timeout has to invoke nvme_dev_disable to stop the
>>   controller doing DMA access before free the request.
>> - nvme_dev_disable has to depend on nvme_timeout to complete
>>   adminq requests to set HMB or delete sq/cq when the controller
>>   has no response.
>> - nvme_dev_disable will race with nvme_timeout when cancels the
>>   outstanding requests.
>
> Your patch is releasing a command back to the OS with the
> PCI controller bus master still enabled. This could lead to data or
> memory corruption.
>

nvme_timeout can return one of two values:

  BLK_EH_HANDLED
  BLK_EH_NOT_HANDLED

In the first case, the patch disables the controller. The controller
then stops processing any outstanding commands and deletes the sq/cq
queues, as the protocol specifies. It looks like that is still not
enough, so I will also disable _pci_ in nvme_pci_disable_dev_directly
in the next version. Many thanks for pointing this out.

In the second case, nvme_timeout returns BLK_EH_NOT_HANDLED, and
blk_mq_rq_timed_out does nothing for it. All of the commands will be
handled once everything has been disabled.

> In any case, it's not as complicated as you're making it out to
> be. It'd be easier to just enforce the existing rule that commands
> issued in the disabling path not depend on completions or timeout
> handling. All of commands issued in this path already do this except
> for HMB disabling. Let's just fix that command, right?
>

We would still be left with nvme_timeout invoking nvme_dev_disable and
no way to synchronize on the outstanding requests. This is a real devil
and will block other improvements.

This patch does just two things:

1. Grab all the previously outstanding requests with blk_abort_request,
   then release them after the controller is completely disabled or
   shut down. Consequently, during the disable/shutdown and
   initialization procedures, the nvme_timeout path only needs to serve
   them. This also ensures that _no_ outstanding requests remain after
   nvme_dev_disable.

2. Fail the adminq commands issued during the disable/shutdown and
   initialization procedures when the controller does not respond.
   This takes two steps: disable the controller/pci, then complete the
   command.

With these changes, nvme_timeout no longer needs to invoke
nvme_dev_disable, and nvme_dev_disable becomes independent.

Please consider this.

Many thanks
Jianchao
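
The ordering described in the two points above can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions, not the patch itself: every type and function name below (toy_ctrl, toy_grab_outstanding, toy_disable, toy_release_grabbed, toy_fail_admin_cmd, toy_dev_disable) is hypothetical, and none of it is a kernel API. It only shows that requests are grabbed first, the device is turned off second, and requests are handed back last.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model. Every name here is hypothetical, not a kernel API. */
#define TOY_NR_REQS 4

enum toy_req_state { TOY_IDLE, TOY_INFLIGHT, TOY_GRABBED, TOY_COMPLETED };

struct toy_ctrl {
	bool enabled;                         /* device may still DMA while true */
	enum toy_req_state reqs[TOY_NR_REQS]; /* stand-in for the tag set */
};

/* Point 1, first half: take ownership of every outstanding request
 * (the mail does this with blk_abort_request). */
size_t toy_grab_outstanding(struct toy_ctrl *c)
{
	size_t grabbed = 0;

	for (size_t i = 0; i < TOY_NR_REQS; i++) {
		if (c->reqs[i] == TOY_INFLIGHT) {
			c->reqs[i] = TOY_GRABBED;
			grabbed++;
		}
	}
	return grabbed;
}

/* Stand-in for disabling the controller and PCI. */
void toy_disable(struct toy_ctrl *c)
{
	c->enabled = false;
}

/* Point 1, second half: hand grabbed requests back only once the device
 * is off. */
size_t toy_release_grabbed(struct toy_ctrl *c)
{
	size_t released = 0;

	/* Returning a request while DMA is still possible risks corruption. */
	assert(!c->enabled);
	for (size_t i = 0; i < TOY_NR_REQS; i++) {
		if (c->reqs[i] == TOY_GRABBED) {
			c->reqs[i] = TOY_COMPLETED;
			released++;
		}
	}
	return released;
}

/* Point 2: an admin command that times out during disable is failed in
 * two steps (disable, then complete) without the full disable path. */
void toy_fail_admin_cmd(struct toy_ctrl *c, size_t tag)
{
	toy_disable(c);
	c->reqs[tag] = TOY_COMPLETED;
}

/* Disable path that does not depend on the timeout handler. */
size_t toy_dev_disable(struct toy_ctrl *c)
{
	toy_grab_outstanding(c);
	toy_disable(c);
	return toy_release_grabbed(c);
}
```

Calling toy_dev_disable on a controller with in-flight requests returns the number of requests that were held back and then completed. The assert in toy_release_grabbed encodes the invariant discussed in this thread: nothing goes back to the caller while the device is still enabled.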