Date: Fri, 22 May 2015 15:11:44 +0000 (UTC)
From: Keith Busch
To: Parav Pandit
cc: Keith Busch, linux-nvme@lists.infradead.org, Matthew Wilcox, Jens Axboe, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] NVMe: Avoid interrupt disable during queue init.
References: <1432253553-17045-1-git-send-email-parav.pandit@avagotech.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 22 May 2015, Parav Pandit wrote:
> On Fri, May 22, 2015 at 8:18 PM, Keith Busch wrote:
>> The rcu protection on nvme queues was removed with the blk-mq conversion
>> as we rely on that layer for h/w access.
>
> o.k. But above is at level where data I/Os are not even active. Its
> between nvme_kthread and nvme_resume() from power management
> subsystem.
> I must be missing something.

On resume, everything is already reaped from the queues, so there should
be no harm letting the kthread poll an inactive queue. The proposal to
remove the q_lock during queue init makes it possible for the thread to
see the wrong cq phase bit and mess up the completion queue's head by
reaping non-existent entries.

But beyond nvme_resume, it appears a race condition is possible in any
scenario where a device is reinitialized and cannot create the same number
of IO queues as it had originally.
Part of the problem is there doesn't seem to be a way to change a
tagset's nr_hw_queues after it was created. The conditions that lead to
this scenario should be uncommon, so I haven't given it much thought; I
need to untangle dynamic namespaces first. :)
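
To make the phase bit concern above concrete, here is a rough userspace
sketch, not driver code. Everything named toy_* is hypothetical, the
queue depth of 4 is arbitrary, and the split of queue init into two
steps (reset head/phase, then zero the entries) is an assumption made
only to show one possible interleaving when a poller is not excluded by
a lock.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define Q_DEPTH 4

/* Simplified completion entry: bit 0 of status is the phase tag. */
struct cqe {
    uint16_t command_id;
    uint16_t status;
};

/* Hypothetical stand-in for the per-queue completion state. */
struct toy_cq {
    struct cqe cqes[Q_DEPTH];
    uint16_t head;
    uint8_t phase;
};

/* Assumed init step 1: reset the consumer index and expected phase. */
static void toy_init_indices(struct toy_cq *q)
{
    q->head = 0;
    q->phase = 1;
}

/* Assumed init step 2: clear stale completion entries. */
static void toy_init_entries(struct toy_cq *q)
{
    memset(q->cqes, 0, sizeof(q->cqes));
}

/* Poller: consume entries whose phase tag matches the expected phase. */
static int toy_reap(struct toy_cq *q)
{
    int reaped = 0;

    assert(q->head < Q_DEPTH);
    while ((q->cqes[q->head].status & 1) == q->phase) {
        reaped++;
        if (++q->head == Q_DEPTH) {
            q->head = 0;
            q->phase = !q->phase; /* expected phase flips on wrap */
        }
    }
    return reaped;
}

/* Build a queue that has already completed and reaped one full lap. */
static void toy_make_used(struct toy_cq *q)
{
    uint16_t i;

    toy_init_indices(q);
    toy_init_entries(q);
    for (i = 0; i < Q_DEPTH; i++) {
        q->cqes[i].command_id = i;
        q->cqes[i].status = 1; /* device wrote phase tag 1 */
    }
    toy_reap(q); /* all reaped; stale phase-1 entries stay in memory */
}

/* Poller runs between the two init steps, as if no lock were held. */
static int phantom_reaps_unlocked(void)
{
    struct toy_cq q;
    int n;

    toy_make_used(&q);
    toy_init_indices(&q);
    n = toy_reap(&q); /* stale entries match the freshly reset phase */
    toy_init_entries(&q);
    return n;
}

/* Poller only runs after both init steps, as if init held the lock. */
static int phantom_reaps_locked(void)
{
    struct toy_cq q;

    toy_make_used(&q);
    toy_init_indices(&q);
    toy_init_entries(&q);
    return toy_reap(&q);
}
```

In this toy model, phantom_reaps_unlocked() returns 4 (every stale entry
is consumed again and the head wraps), while phantom_reaps_locked()
returns 0. The real driver's state and ordering may differ; the point is
only that a poller observing a half-initialized queue can match phase
tags that no longer mean anything.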