Received: by 2002:a05:6a10:7420:0:0:0:0 with SMTP id hk32csp272523pxb; Tue, 15 Feb 2022 13:01:45 -0800 (PST) X-Google-Smtp-Source: ABdhPJyNZJieqUdNQ/yZmG/6IeHXV7V5U1Un4MgLX4y+w8yDrcZWhy1tsAKT2nfmgKMp/mAUVKbi X-Received: by 2002:aa7:c947:: with SMTP id h7mr749062edt.447.1644958905120; Tue, 15 Feb 2022 13:01:45 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1644958905; cv=none; d=google.com; s=arc-20160816; b=dPLZAlzt2XvCYkoDZhoTlqNeMCZUlf97gyupn2TNrjFdwfRAts9n8FsbTAZ1q6knYQ sFMx/2dG6CkBrFqXTieP9fOj/c7QSlnS/CCcfyWBhHVzBZtLQtYtM0qRHrC9QoXxTnrM ioeHO0y3sgvvGr0xcWxAyRJeVurGrRd1F0c5ojljfxP+pLDI851tvTpZaV3z3KuHybEt SzY8hLrkI9A/pS+wmTDObWodb/aSOIB/BCS+qnd6aWj3V9q6sVXdH++FHhbQ+gKr01XK Ofk5IvJbIZWKRn6ox9nxGNC81ELfczKwuSli0bUy6xM588DLQZdSEcCHCnjElkTzI9E2 UbbQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=4j6YcMQ+/aGLTz1M18kZKsEcPuNovrJW30rItZ8Q2Fc=; b=nHmwpEjej4aJlHqWOMw1v4fzm6IdE9/4J6onlCAbug5IX7uircN/XJgWutF7P71l3H 1UWj5nr+QT37vbrDn4ROh+GPaD3/eeRTZqnb10PsVF7rsCum3BWtNfYg0sGkwOGHNp08 Ux1YDO0HCF8czMwKzcswSX8DoiUOcqru83G36xY1arCacAYbzzM5YipRRmUPRsQf/hBR /EJhbBz1CqLzve37xa4bwE9M5r74NXQJcuvLKQyik93xxhcw97DtADspkdQIEmNZ+q63 dvKDFQYNm8wAjKm122Mm9mYSOMFMDJJo0eCqPedAMVQSmPqiV7vay5HSoj10JzQmm6P0 9ljg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b="o/Oe5c/7"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id f12si512331edy.78.2022.02.15.13.01.22; Tue, 15 Feb 2022 13:01:45 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b="o/Oe5c/7"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240928AbiBOTPM (ORCPT + 99 others); Tue, 15 Feb 2022 14:15:12 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:38874 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234514AbiBOTPL (ORCPT ); Tue, 15 Feb 2022 14:15:11 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 37FD5108189; Tue, 15 Feb 2022 11:15:01 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id CB31361743; Tue, 15 Feb 2022 19:15:00 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A36CBC340EB; Tue, 15 Feb 2022 19:14:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1644952499; bh=AjJtRhsxaMSmw5bCKP9unHA02aPAd54DWxZLx21Wz6M=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=o/Oe5c/72fDzB3xtvF/HWUtrR4/Pl3HqbbE/VMGjrs07aQhZNPWjF2EkhDhCfzRtO EHueLKYLA4M7CQLQKG0cOmXYT72Dscuo7STaTpYQpuL5kOb/JJxHjZlU76eXQfWFB+ LQl3V/C9dIX/Pi+erkqAyJVSZjYk/FrvbQZQnRyNNRmjHYkBu9pqHZUjeE0MWbCt8i Koc01uXydjUxaYFXkGYxldhM212ffpI6zgD7wy/ogAkltZtpodQmM0zY4x9TKiH1rP bntDyx4HJ+DhwmexzG/asgjqAEf51Izas/3RtVToyGQR5aJKgrqyV3N7niQE4Z35aB JSgPRFsQiTlJQ== Date: Tue, 15 Feb 2022 11:14:56 -0800 From: Keith Busch To: Christoph Hellwig Cc: Markus =?iso-8859-1?Q?Bl=F6chl?= , Jens Axboe , Sagi Grimberg , linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, Stefan Roese Subject: Re: [RFC PATCH] nvme: prevent hang on surprise removal of NVMe disk Message-ID: <20220215191456.GB1934598@dhcp-10-100-145-180.wdc.com> References: <20220214095107.3t5en5a3tosaeoo6@ipetronik.com> <20220215152240.GB1663897@dhcp-10-100-145-180.wdc.com> <20220215184704.GB24543@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20220215184704.GB24543@lst.de> X-Spam-Status: No, score=-7.2 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Feb 15, 2022 at 07:47:04PM +0100, Christoph Hellwig wrote: > On Tue, Feb 15, 2022 at 07:22:40AM -0800, Keith Busch wrote: > > I can't actually tell if not checking the DYING flag check was > > intentional or not, since the comments in blk_queue_start_drain() say > > otherwise. > > > > Christoph, do you know the intention here? Should __bio_queue_enter() > > check the queue DYING flag, or do you prefer drivers explicity set the > > disk state like this? It looks to me the queue flags should be checked > > since that's already tied to the freeze wait_queue_head_t. > > It was intentional but maybe not fully thought out. Do you remember why > we're doing the manual setting of the dying flag instead of just calling > del_gendisk early on in nvme? Because calling del_gendisk is supposed > to be all that a tree needs to do. When the driver concludes new requests can't ever succeed, we had been setting the queue to DYING first so new requests can't enter, which can prevent forward progress. AFAICT, just calling del_gendisk() is fine for a graceful removal. It calls fsync_bdev() to flush out pending writes before setting the disk state to "DEAD". Setting the queue to dying first will "freeze" the queue, which is why fsync_bdev() is blocked. We were relying on the queue DYING flag to prevent that from blocking. Perhaps another way to do this might be to remove the queue DYING setting, and let the driver internally fail new requests instead? There may be some issues with doing it that way IIRC, but blk-mq is a bit evolved from where we started, so I'd need to test it out to confirm.