Subject: Re: [PATCH 1/2] bdi: Do not use freezable workqueue
From: Jens Axboe
To: Mika Westerberg, "Rafael J . Wysocki"
Cc: Pavel Machek, Jan Kara, Tejun Heo, Greg Kroah-Hartman, Sebastian Andrzej Siewior, Thomas Gleixner, AceLan Kao, linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
Date: Fri, 4 Oct 2019 07:22:25 -0600
Message-ID: <0002b2f3-d17c-0d49-52f4-b2ce31832e6c@kernel.dk>
In-Reply-To: <20191004100025.70798-1-mika.westerberg@linux.intel.com>
References: <20191004100025.70798-1-mika.westerberg@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 10/4/19 4:00 AM, Mika Westerberg wrote:
> A removable block device, such as NVMe or SSD connected over Thunderbolt
> can be hot-removed any time including when the system is suspended.
> When device is hot-removed during suspend and the system gets resumed, kernel
> first resumes devices and then thaws the userspace including freezable
> workqueues. What happens in that case is that the NVMe driver notices
> that the device is unplugged and removes it from the system. This ends
> up calling bdi_unregister() for the gendisk which then schedules
> wb_workfn() to be run one more time.
>
> However, since the bdi_wq is still frozen flush_delayed_work() call in
> wb_shutdown() blocks forever halting system resume process. User sees
> this as hang as nothing is happening anymore.
>
> Triggering sysrq-w reveals this:
>
>   Workqueue: nvme-wq nvme_remove_dead_ctrl_work [nvme]
>   Call Trace:
>    ? __schedule+0x2c5/0x630
>    ? wait_for_completion+0xa4/0x120
>    schedule+0x3e/0xc0
>    schedule_timeout+0x1c9/0x320
>    ? resched_curr+0x1f/0xd0
>    ? wait_for_completion+0xa4/0x120
>    wait_for_completion+0xc3/0x120
>    ? wake_up_q+0x60/0x60
>    __flush_work+0x131/0x1e0
>    ? flush_workqueue_prep_pwqs+0x130/0x130
>    bdi_unregister+0xb9/0x130
>    del_gendisk+0x2d2/0x2e0
>    nvme_ns_remove+0xed/0x110 [nvme_core]
>    nvme_remove_namespaces+0x96/0xd0 [nvme_core]
>    nvme_remove+0x5b/0x160 [nvme]
>    pci_device_remove+0x36/0x90
>    device_release_driver_internal+0xdf/0x1c0
>    nvme_remove_dead_ctrl_work+0x14/0x30 [nvme]
>    process_one_work+0x1c2/0x3f0
>    worker_thread+0x48/0x3e0
>    kthread+0x100/0x140
>    ? current_work+0x30/0x30
>    ? kthread_park+0x80/0x80
>    ret_from_fork+0x35/0x40
>
> This is not limited to NVMes so exactly same issue can be reproduced by
> hot-removing SSD (over Thunderbolt) while the system is suspended.
>
> Prevent this from happening by removing WQ_FREEZABLE from bdi_wq.

This series looks good to me. I don't think there's a reason for the
workers to be marked freezable.

-- 
Jens Axboe
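
Illustration (not part of the patch or of the original message): the hang
described in the quoted text can be modelled in userspace. The sketch below is
a toy Python model, not kernel code. ToyWorkqueue, queue_and_flush and
device_removal_completes are hypothetical names made up for this example, and
the timeout stands in for what is an unbounded wait in the kernel.

```python
import queue
import threading


class ToyWorkqueue:
    """Toy userspace model of a workqueue that may be frozen (not kernel code)."""

    def __init__(self, freezable):
        self.freezable = freezable
        self._thawed = threading.Event()
        self._thawed.set()  # queues start out running
        self._items = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def freeze(self):
        # Only a freezable queue stops executing work across suspend.
        if self.freezable:
            self._thawed.clear()

    def thaw(self):
        self._thawed.set()

    def _worker(self):
        while True:
            fn, done = self._items.get()
            self._thawed.wait()  # a frozen queue holds its work items back
            fn()
            done.set()

    def queue_and_flush(self, fn, timeout):
        """Queue fn and wait for it; return False if it did not finish in time."""
        done = threading.Event()
        self._items.put((fn, done))
        return done.wait(timeout)


def device_removal_completes(freezable):
    """Model device removal that happens before workqueues are thawed."""
    wq = ToyWorkqueue(freezable)
    wq.freeze()  # system was suspended; resume has not thawed workqueues yet
    # Removal queues one last work item and waits for it to finish.
    return wq.queue_and_flush(lambda: None, timeout=0.2)


if __name__ == "__main__":
    print("freezable queue, removal completes:", device_removal_completes(True))
    print("non-freezable queue, removal completes:", device_removal_completes(False))
```

In this model the flush on the freezable queue times out because the only
thing that could unblock it, the thaw, comes later in the same resume
sequence. The non-freezable queue runs the item immediately, which is the
behaviour the patch aims for by dropping WQ_FREEZABLE from bdi_wq.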