Date: Wed, 18 Apr 2018 23:40:39 +0800
From: Ming Lei
To: "jianchao.wang"
Cc: axboe@fb.com, sagi@grimberg.me, linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, keith.busch@intel.com, hch@lst.de
Subject: Re: PATCH V4 0/5 nvme-pci: fixes on nvme_timeout and nvme_dev_disable
Message-ID: <20180418154032.GA22533@ming.t460p>
References: <1520489971-31174-1-git-send-email-jianchao.w.wang@oracle.com> <20180417151700.GC16286@ming.t460p>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 18, 2018 at 10:24:28PM +0800, jianchao.wang wrote:
> Hi Ming
>
> On 04/17/2018 11:17 PM, Ming Lei wrote:
> > Looks blktest(block/011) can trigger IO hang easily on NVMe PCI device,
> > and all are related with nvme_dev_disable():
> >
> > 1) admin queue may be disabled by nvme_dev_disable() from timeout path
> > during resetting, then reset can't move on
> >
> > 2) the nvme_dev_disable() called from nvme_reset_work() may cause double
> > completion on timed-out request
> >
> > So could you share us what your plan is about this patchset?
>
> Regarding to this patchset, it is mainly to fix the dependency between
> nvme_timeout and nvme_dev_disable, as your can see:
> nvme_timeout will invoke nvme_dev_disable, and nvme_dev_disable have to
> depend on nvme_timeout when controller no response.

Do you mean nvme_disable_io_queues()? If yes, this one has already been
handled by wait_for_completion_io_timeout(), and it looks like the block
timeout can simply be disabled. Or are there others?

> Till now, some parts
> of the patchset looks bad and seem to have a lot of work need to be done.
> :)

Yeah, this part is much more complicated than I thought.

I think it is a good topic to discuss at the coming LSF/MM. The NVMe
timeout (EH) may need to be refactored/cleaned up, and the current issues
should be addressed in a clean way.

Guys, are there other issues wrt. NVMe timeout & reset except for the
above?

Thanks,
Ming