Date: Fri, 2 Feb 2018 11:31:04 -0700
From: Keith Busch
To: Jianchao Wang
Cc: axboe@fb.com, hch@lst.de, sagi@grimberg.me, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/6] nvme-pci: break up nvme_timeout and nvme_dev_disable
Message-ID: <20180202183103.GI24417@localhost.localdomain>
In-Reply-To: <1517554849-7802-5-git-send-email-jianchao.w.wang@oracle.com>

On Fri, Feb 02, 2018 at 03:00:47PM +0800, Jianchao Wang wrote:
> Currently, the complicated relationship between nvme_dev_disable
> and nvme_timeout has become a devil that will introduce many
> circular pattern which may trigger deadlock or IO hang. Let's
> enumerate the tangles between them:
> - nvme_timeout has to invoke nvme_dev_disable to stop the
>   controller doing DMA access before free the request.
> - nvme_dev_disable has to depend on nvme_timeout to complete
>   adminq requests to set HMB or delete sq/cq when the controller
>   has no response.
> - nvme_dev_disable will race with nvme_timeout when cancels the
>   outstanding requests.

Your patch is releasing a command back to the OS with the PCI
controller bus master still enabled. This could lead to data or memory
corruption.

In any case, it's not as complicated as you're making it out to be.
It'd be easier to just enforce the existing rule that commands issued
in the disabling path not depend on completions or timeout handling.
All of the commands issued in this path already do this except for HMB
disabling. Let's just fix that command, right?
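
To illustrate the rule above, here is a minimal userspace sketch. It is not a patch and not the driver's code: struct fake_ctrl, poll_once(), disable_hmb_bounded(), dev_disable() and the poll limit of 3 are all hypothetical names and values chosen for illustration. The point it tries to show is that a command issued from the disabling path polls a bounded number of times and then gives up, so it never waits on a completion or on the timeout handler, and bus mastering is turned off before anything is handed back.

```c
/* Userspace sketch only; every name here is hypothetical, not a kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct fake_ctrl {
    bool responsive;   /* false models a controller that stopped answering */
    bool bus_master;   /* true while the device may still DMA */
    bool hmb_enabled;  /* host memory buffer still handed to the device */
    int  polls;        /* how many times the disable path polled */
};

/* Pretend to poll the completion queue once; true if the command completed. */
static bool poll_once(struct fake_ctrl *c)
{
    c->polls++;
    return c->responsive;
}

/*
 * Issue the "disable HMB" command from the disable path.
 * It polls at most max_polls times and then gives up, so it never
 * blocks on a completion interrupt or on the timeout handler.
 */
static bool disable_hmb_bounded(struct fake_ctrl *c, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if (poll_once(c)) {
            c->hmb_enabled = false;
            return true;
        }
    }
    return false; /* controller is dead; the caller carries on with teardown */
}

/* Disable path: always makes forward progress, even with a dead controller. */
void dev_disable(struct fake_ctrl *c)
{
    assert(c != NULL);
    (void)disable_hmb_bounded(c, 3); /* 3 is an arbitrary illustrative bound */
    c->bus_master = false;           /* stop DMA before any request is released */
}
```

With an unresponsive fake controller, dev_disable() returns after three polls with bus_master cleared and hmb_enabled still set; with a responsive one it returns after a single poll with both cleared. In neither case does the disable path depend on the timeout handler running.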