Received: by 2002:a05:7412:419a:b0:f3:1519:9f41 with SMTP id i26csp4034540rdh; Tue, 28 Nov 2023 10:01:26 -0800 (PST) X-Google-Smtp-Source: AGHT+IG7fKzBiPXg9iDlpbi2HVhXRkOI3aGAldOqA2LCfas0GUoQpOu4pwog+mCmvz/WiydrrVhu X-Received: by 2002:a05:6808:114f:b0:3b6:dd3c:5f78 with SMTP id u15-20020a056808114f00b003b6dd3c5f78mr22162706oiu.44.1701194485885; Tue, 28 Nov 2023 10:01:25 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1701194485; cv=none; d=google.com; s=arc-20160816; b=D3WQwT9AIvpVCg4V6GvPRE+g2603mJTSJkR58e9g7wK/ygCkZe+TqN62h86lzE27h1 2nAusaH9yXwXbVNprxpM66hhy627vewnVct2x2Ili9bnnh49AUSbskGr75opTWq/r4qV iGN/r1BgydbssD+hpmtSoIReFtYVpPba8oFCSenqzoplzQx4ztRxhWno1/nmy1QMhNH8 QAyc5wigc/Yqi6/7hoU5fytUmss9lIXzAZZ/IHK1SKXr03YcjjmkjXSjhT2+YKznJvux 6TcUAYjLHQk0U3cllSPcUlKQ9MMhJZRjU+gzysyfzDgP3Ws7SVe1RpgDnxguTDVeCIYH d3qw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=cZa2rbawZm9YETONcae7fFTCBGipq6vD0hlgjn4LWVk=; fh=5mojPe5YBkw8wSsBqPMEN+GdmQasbk3f6eIjhVSpkdk=; b=01R7eqzs46Tes505PlQ6KPlGjg/42Pg8fjyFkLS3f/ut19R4881F6LuZvJnU+iN1jd +OpvACRI7tGpezlpRajIS/ik+3XBIDmy5MoKzvNrRKY/07uxU/01pk5zswzi7dguhGie 5p+UEPlL6prQY2pLuEjQDA8J3ZyU/qAIS9Lgdcoaw+rqNHWIocsXkMyrB5mANpLmtbF0 c/zO52XGttnGaLdQbl2fVjNQWS4cHBEVJjGSsMVd6PxVYD7gNA68NyyjZVTi/jceJ8Jl hz51Bks0r/QnHLVn1UiOMd03SecKhKgu+r1QNY3pyjmcMxXLr/U3itrMn2UHZzFBix2G 2NQg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=HYV2VFHp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from snail.vger.email (snail.vger.email. [23.128.96.37]) by mx.google.com with ESMTPS id 14-20020aca280e000000b003b5ff071627si4626593oix.286.2023.11.28.10.01.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 28 Nov 2023 10:01:25 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=HYV2VFHp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id DEE9C80417DA; Tue, 28 Nov 2023 10:00:43 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229949AbjK1SAd (ORCPT + 99 others); Tue, 28 Nov 2023 13:00:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48492 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1346520AbjK1SA1 (ORCPT ); Tue, 28 Nov 2023 13:00:27 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E4AE8D5B for ; Tue, 28 Nov 2023 10:00:33 -0800 (PST) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 08EC0C433C8; Tue, 28 Nov 2023 18:00:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1701194433; bh=/gFbOX4F6s+/kOnZpinHF5H/p/09WqRtksDYYY0r3d0=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=HYV2VFHp/dijSQG9xQiRQATmwkzl/4Q4UefhRaedMHB4yKrhw/ZKm5CobKmeG5+Jz WXnTWvJDKtiFYaU5pl6sYW+wRbzlMWIcFMu8f7bnPXuIQ4JgmodElqyeR7rn8M7oaM i7YMRYf4Vc/C4A2eSFT15X0gu4XfPoCWmkpoJVNVdWyM5Id/7KxX4PIcZ04rP++n8G UfPsOnalrMVyYiv7aM8jnmFSjmZXBTVmhTpdvyENerqqoASAbTunzYeQxzTQbLaCSl E6vdSel6v6OhKJ8Lz0fBOvbk12LJbgoBaCNMMAVh1drSNy/gh33KBODAQ400OV6sg5 Hcm6wctJ6LKOg== Date: Tue, 28 Nov 2023 11:00:30 -0700 From: Keith Busch To: Sagi Grimberg Cc: yaoma , axboe@kernel.dk, hch@lst.de, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, kanie@linux.alibaba.com Subject: Re: [PATCH] nvme: fix deadlock between reset and scan Message-ID: References: <1700737213-110685-1-git-send-email-yaoma@linux.alibaba.com> <65b0c372-b308-46dd-c2f2-a5ddb50adb10@linux.alibaba.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Tue, 28 Nov 2023 10:00:44 -0800 (PST) On Tue, Nov 28, 2023 at 12:13:59PM +0200, Sagi Grimberg wrote: > > > On 11/28/23 08:22, yaoma wrote: > > Hi Keith Busch > > > > Thanks for your reply. > > > > The idea to avoid such a deadlock between nvme_reset and nvme_scan is to > > ensure that no namespace can be added to ctrl->namespaces after > > nvme_start_freeze has already been called. We can achieve this goal by > > assessing the ctrl->state after we have already acquired the > > ctrl->namespaces_rwsem lock, to decide whether to add the namespace to > > the list or not. > > 1. After we determine that ctrl->state is LIVE, it may be immediately > > changed to another state. However, since we have already acquired the > > lock, other tasks cannot access ctrl->namespace, so we can still safely > > add the namespace to the list. After acquiring the lock, > > nvme_start_freeze will freeze all ns->q in the list, including any newly > > added namespaces. > > 2. Before the completion of nvme_reset, ctrl->state will not be changed > > to LIVE, so we will not add any more namespaces to the list. All ns->q > > in the list is frozen, so nvme_wait_freeze can exit normally. > > I agree with the analysis, there is a hole between start_freeze and > freeze_wait that a scan may add a ns to the ctrl ns list. > > However the fix should be to mark the ctrl with say NVME_CTRL_FROZEN > flag set in nvme_freeze_start and cleared in nvme_unfreeze (similar > to what we did with quiesce). Then the scan can check it before adding > the new namespace (under the namespaces_rwsem). Could we just make sure that scan_work isn't running? If we reset a live controller, then we're not depending on reset_work to unblock scan_work, and can let scan_work end gracefully. The scan_work can't be rescheduled again while in the resetting state. --- diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index fad4cccce745c..5d6305475bad5 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -2701,8 +2701,10 @@ static void nvme_reset_work(struct work_struct *work) * If we're called to reset a live controller first shut it down before * moving on. */ - if (dev->ctrl.ctrl_config & NVME_CC_ENABLE) + if (dev->ctrl.ctrl_config & NVME_CC_ENABLE) { + flush_work(&dev->ctrl.scan_work); nvme_dev_disable(dev, false); + } nvme_sync_queues(&dev->ctrl); mutex_lock(&dev->shutdown_lock); --