Received: by 2002:a25:824b:0:0:0:0:0 with SMTP id d11csp735861ybn; Wed, 2 Oct 2019 05:29:10 -0700 (PDT) X-Google-Smtp-Source: APXvYqz/EUlMQI9xX9davFbO06SPQ/pANPFjbvsf1ZpiaQkvIRfSB4NfHcnsrf6qoQxntuRCgOqZ X-Received: by 2002:a50:d98a:: with SMTP id w10mr3456762edj.276.1570019350738; Wed, 02 Oct 2019 05:29:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1570019350; cv=none; d=google.com; s=arc-20160816; b=gPFWLf7Re1MtE7/XMK823CsgtLVKtag7JMsZCjGbq2Nfcc1GWBBbwVQ51h7TL+ZS+H fWokmeVTWlqnC33Y2EGw48tP1lOZ7k6JRzfi8uO8f6RGQ8PEanSPl7/zp9jMgmgzj8AN LT5p1i0eMDayIgKl6aC1OzdWD/ZjGhh7EX+EdX7GdPyIi6sgTxLQiib09ch5t+XZyiZ+ 0nvrakOn9YI+qwbfhRXPX5BIKNeLXj65UBpRGUHoO5lc7aalCkqJeceZWxXwo3E621Y5 ldHwa3zE2J+W08HkuCpJGH5ubYdxUZDZha6S8P/r7jNJdwyG8d61frO24v05Z4wfowWK P1KQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:organization :content-disposition:mime-version:message-id:subject:cc:to:from:date; bh=GZ0hppdxGbro4PpTns5IxtxuZw3g2SLpxcIdPeNpPW4=; b=S0zNOnSMKUBuzz+GOWyl80LHbI1l9zm5+AWW6egyQz576sOrpKOpmRS3ZKDM7bDoCB J/uxSzseszQJYBfDlVoJ/Scf+Es1ASul4fZA7q3UA+GTJlxNp5Jdu/BOX3xiWZr8kB5R WNyS4NBeFB6aru5nTA4kusQMWAtAWXsqokC2ExKPyqPb5mQHaHeGqvdYuvbOpNaBmyBY 5S3fczXT+mjXWyE2w412wL55I7C5ZAnGzpctG6HRoWhRg/hKnx1/IYhDFMOzVtzfwSuv IYIb1XEIpx8vKsPqgV/kxuY0u4tYfLbgMC4utshww6eCzkZ+33gjJfelmsJq9u4qLOk8 e2UA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id n3si12984561edc.151.2019.10.02.05.28.46; Wed, 02 Oct 2019 05:29:10 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726771AbfJBMVk (ORCPT + 99 others); Wed, 2 Oct 2019 08:21:40 -0400 Received: from mga12.intel.com ([192.55.52.136]:7742 "EHLO mga12.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725747AbfJBMVk (ORCPT ); Wed, 2 Oct 2019 08:21:40 -0400 X-Amp-Result: UNKNOWN X-Amp-Original-Verdict: FILE UNKNOWN X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 02 Oct 2019 05:21:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.64,574,1559545200"; d="scan'208";a="205328880" Received: from lahna.fi.intel.com (HELO lahna) ([10.237.72.157]) by fmsmga001.fm.intel.com with SMTP; 02 Oct 2019 05:21:37 -0700 Received: by lahna (sSMTP sendmail emulation); Wed, 02 Oct 2019 15:21:36 +0300 Date: Wed, 2 Oct 2019 15:21:36 +0300 From: Mika Westerberg To: Tejun Heo Cc: "Rafael J. Wysocki" , AceLan Kao , Jens Axboe , Jan Kara , Greg Kroah-Hartman , linux-kernel@vger.kernel.org Subject: System hangs if NVMe/SSD is removed during suspend Message-ID: <20191002122136.GD2819@lahna.fi.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo User-Agent: Mutt/1.12.1 (2019-06-15) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Tejun, In a system with Thunderbolt connected NVMe or SSD entering system suspend, detaching the NVMe/SSD and resuming the system hangs (see also https://bugzilla.kernel.org/show_bug.cgi?id=204385). Triggering sysrq-w I see this: [ 113.093783] Workqueue: nvme-wq nvme_remove_dead_ctrl_work [nvme] [ 113.095156] Call Trace: [ 113.096234] ? __schedule+0x2c5/0x630 [ 113.097409] ? wait_for_completion+0xa4/0x120 [ 113.098639] schedule+0x3e/0xc0 [ 113.099769] schedule_timeout+0x1c9/0x320 [ 113.100973] ? resched_curr+0x1f/0xd0 [ 113.102146] ? wait_for_completion+0xa4/0x120 [ 113.103379] wait_for_completion+0xc3/0x120 [ 113.104595] ? wake_up_q+0x60/0x60 [ 113.105749] __flush_work+0x131/0x1e0 [ 113.106925] ? flush_workqueue_prep_pwqs+0x130/0x130 [ 113.108215] bdi_unregister+0xb9/0x130 [ 113.109403] del_gendisk+0x2d2/0x2e0 [ 113.110580] nvme_ns_remove+0xed/0x110 [nvme_core] [ 113.111853] nvme_remove_namespaces+0x96/0xd0 [nvme_core] [ 113.113177] nvme_remove+0x5b/0x160 [nvme] [ 113.114391] pci_device_remove+0x36/0x90 [ 113.115590] device_release_driver_internal+0xdf/0x1c0 [ 113.116893] nvme_remove_dead_ctrl_work+0x14/0x30 [nvme] [ 113.118217] process_one_work+0x1c2/0x3f0 [ 113.119434] worker_thread+0x48/0x3e0 [ 113.120619] kthread+0x100/0x140 [ 113.121772] ? current_work+0x30/0x30 [ 113.122955] ? kthread_park+0x80/0x80 [ 113.124142] ret_from_fork+0x35/0x40 The exact place is in wb_shutdown(): /* * Drain work list and shutdown the delayed_work. !WB_registered * tells wb_workfn() that @wb is dying and its work_list needs to * be drained no matter what. */ mod_delayed_work(bdi_wq, &wb->dwork, 0); flush_delayed_work(&wb->dwork); Now bdi_wq is marked as WQ_FREEZABLE and at this time we are still resuming devices so I think it is still frozen. This basically results that the resume process is waiting for bdi_unregister() but it cannot progress because its workqueue is still frozen. I saw you "dealt" similar situation for libata with commit 85fbd722ad0f ("libata, freezer: avoid block device removal while system is frozen") and that there was discussion around this here: https://marc.info/?l=linux-kernel&m=138695698516487 but from that discussion I don't see more generic solution to be implemented. Any ideas we should fix this properly? I mean nowadays Thunderbolt connected storage is quite common and I suppose situation like this might wery well happen, say you have a dock with SSD connected and you disconnect it when the laptop is suspended. I've been using following hack to prevent the issue but I'm quite sure there is a better solution ;-) Thanks in advance! drivers/ata/libata-scsi.c | 2 ++ mm/backing-dev.c | 4 ++-- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index 76d0f9de767b..3fea8d72f6f9 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -4791,6 +4791,7 @@ void ata_scsi_hotplug(struct work_struct *work) return; } +#if 0 /* * XXX - UGLY HACK * @@ -4810,6 +4811,7 @@ void ata_scsi_hotplug(struct work_struct *work) #ifdef CONFIG_FREEZER while (pm_freezing) msleep(10); +#endif #endif DPRINTK("ENTER\n"); diff --git a/mm/backing-dev.c b/mm/backing-dev.c index e8e89158adec..8e77711d5dd0 100644 --- a/mm/backing-dev.c +++ b/mm/backing-dev.c @@ -236,8 +236,8 @@ static int __init default_bdi_init(void) { int err; - bdi_wq = alloc_workqueue("writeback", WQ_MEM_RECLAIM | WQ_FREEZABLE | - WQ_UNBOUND | WQ_SYSFS, 0); + bdi_wq = alloc_workqueue("writeback", WQ_MEM_RECLAIM | WQ_UNBOUND | + WQ_SYSFS, 0); if (!bdi_wq) return -ENOMEM;