Date: Tue, 15 Feb 2022 11:37:22 -0800
From: Keith Busch
To: Christoph Hellwig
Cc: Markus Blöchl, Jens Axboe, Sagi Grimberg,
    linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
    linux-kernel@vger.kernel.org, Stefan Roese
Subject: Re: [RFC PATCH] nvme: prevent hang on surprise removal of NVMe disk
Message-ID: <20220215193722.GD1934598@dhcp-10-100-145-180.wdc.com>
References: <20220214095107.3t5en5a3tosaeoo6@ipetronik.com> <20220215191731.GB25076@lst.de>
In-Reply-To: <20220215191731.GB25076@lst.de>

On Tue, Feb 15, 2022 at 08:17:31PM +0100, Christoph Hellwig wrote:
> On Mon, Feb 14, 2022 at 10:51:07AM +0100, Markus Blöchl wrote:
> > After the surprise removal of a mounted NVMe disk the pciehp task
> > reliably hangs forever with a trace similar to this one:
>
> Do you have a specific reproducer?  At least with doing a
>
>   echo 1 > /sys/.../remove
>
> while running fsx on a file system I can't actually reproduce it.

That's a graceful removal. You need to do something to terminate the
connection without the driver knowing about it.
If you don't have a hotplug capable system, you can do something slightly
destructive to the PCI link to force an ungraceful teardown, though you'll
need to wait for IO timeout before anything interesting will happen.

  # setpci -s "${slot}" CAP_EXP+10.w=10:10

The "$slot" needs to be the B:D.f of the bridge connecting to your nvme
end device. An example getting it for a non-multipath PCIe attached nvme0n1:

  # readlink -f /sys/block/nvme0n1/device | grep -Eo '[0-9a-f]{4,5}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]' | tail -2 | head -1
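
The two commands above can be combined into a small dry-run sketch. It
applies the same grep/tail/head pipeline to a resolved sysfs path and
prints the setpci invocation instead of executing it. The function name
bridge_from_path and the sample path are hypothetical, chosen only for
illustration; on a real system the path would come from the readlink
command above. As far as setpci(8) and the PCIe capability layout go,
CAP_EXP+10.w is the Link Control register and 10:10 is value:mask, which
sets bit 4 (Link Disable), but treat that reading as an assumption to
verify against your own hardware documentation.

```shell
#!/bin/sh
# Sketch only: pick the bridge (second-to-last PCI address) out of a
# resolved sysfs device path, using the pipeline from the mail.
bridge_from_path() {
    printf '%s\n' "$1" |
        grep -Eo '[0-9a-f]{4,5}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]' |
        tail -2 | head -1
}

# Hypothetical path for illustration; on a live system use:
#   path=$(readlink -f /sys/block/nvme0n1/device)
path='/sys/devices/pci0000:00/0000:00:1d.0/0000:3d:00.0/nvme/nvme0'

slot=$(bridge_from_path "$path")
echo "bridge: $slot"

# Dry run: print the destructive command rather than executing it.
echo "would run: setpci -s \"$slot\" CAP_EXP+10.w=10:10"
```

With the sample path this prints "bridge: 0000:00:1d.0". Note that if the
path contains only one PCI address (no bridge between the root bus and the
device), the pipeline returns the end device itself, so check the result
before running setpci for real.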