Message-ID: <2646e8a3-cc9f-c2c5-e4d6-c86de6e1b739@I-love.SAKURA.ne.jp>
Date: Sun, 10 Jul 2022 11:25:08 +0900
MIME-Version: 1.0
User-Agent:
 Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.11.0
Subject: [PATCH v2 3/4] PM: hibernate: allow wait_for_device_probe() to timeout when resuming from hibernation
Content-Language: en-US
From: Tetsuo Handa
To: Greg KH, Oliver Neukum, Wedson Almeida Filho, "Rafael J. Wysocki", Arjan van de Ven, Len Brown, Dmitry Vyukov
Cc: linux-pm@vger.kernel.org, LKML
References: <03096156-3478-db03-c015-28643479116c@I-love.SAKURA.ne.jp> <48d01ce7-e028-c103-ea7f-5a4ea4c8930b@I-love.SAKURA.ne.jp>
In-Reply-To: <48d01ce7-e028-c103-ea7f-5a4ea4c8930b@I-love.SAKURA.ne.jp>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Mailing-List: linux-kernel@vger.kernel.org

syzbot is reporting a hung task at misc_open() [1], for there is a race
window of AB-BA deadlock which involves the probe_count variable. Even with
"char: misc: allow calling open() callback without misc_mtx held" and
"PM: hibernate: call wait_for_device_probe() without
system_transition_mutex held", wait_for_device_probe() from
snapshot_open() can sleep forever if probe_count cannot become 0.

Since snapshot_open() is a userland-driven hibernation/resume request,
it should be acceptable to fail if something is wrong. Users would not
want to wait for hours if a device stopped responding. Therefore,
introduce a killable version of wait_for_device_probe() with a timeout.

According to Oliver Neukum, there are SCSI commands that can run for more
than 60 seconds. Therefore, this patch chooses 5 minutes for the timeout.

Link: https://syzkaller.appspot.com/bug?extid=358c9ab4c93da7b7238c [1]
Reported-by: syzbot
Signed-off-by: Tetsuo Handa
Cc: Greg KH
Cc: Oliver Neukum
Cc: Wedson Almeida Filho
Cc: Rafael J. Wysocki
Cc: Arjan van de Ven
---
 drivers/base/dd.c             | 14 ++++++++++++++
 include/linux/device/driver.h |  1 +
 kernel/power/user.c           |  9 +++++++--
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 70f79fc71539..3136b8403bb0 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -724,6 +724,20 @@ void wait_for_device_probe(void)
 }
 EXPORT_SYMBOL_GPL(wait_for_device_probe);
 
+void wait_for_device_probe_killable_timeout(unsigned long timeout)
+{
+	/* wait for probe timeout */
+	wait_event(probe_timeout_waitqueue, !driver_deferred_probe_timeout);
+
+	/* wait for the deferred probe workqueue to finish */
+	flush_work(&deferred_probe_work);
+
+	/* wait for the known devices to complete their probing */
+	wait_event_killable_timeout(probe_waitqueue,
+				    atomic_read(&probe_count) == 0, timeout);
+	async_synchronize_full();
+}
+
 static int __driver_probe_device(struct device_driver *drv, struct device *dev)
 {
 	int ret = 0;
diff --git a/include/linux/device/driver.h b/include/linux/device/driver.h
index 7acaabde5396..4ee909144470 100644
--- a/include/linux/device/driver.h
+++ b/include/linux/device/driver.h
@@ -129,6 +129,7 @@ extern struct device_driver *driver_find(const char *name,
 					 struct bus_type *bus);
 extern int driver_probe_done(void);
 extern void wait_for_device_probe(void);
+extern void wait_for_device_probe_killable_timeout(unsigned long timeout);
 void __init wait_for_init_devices_probe(void);
 
 /* sysfs interface for exporting driver attributes */
diff --git a/kernel/power/user.c b/kernel/power/user.c
index db98a028dfdd..32dd5a855e8c 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -58,8 +58,13 @@ static int snapshot_open(struct inode *inode, struct file *filp)
 		/* The image device should be already ready. */
 		break;
 	default: /* Resuming */
-		/* We may need to wait for the image device to appear. */
-		wait_for_device_probe();
+		/*
+		 * Since the image device might not be ready, try to wait up to
+		 * 5 minutes. We should not wait forever, for we might get stuck
+		 * due to unresponsive devices and/or new probe events which
+		 * are irrelevant to the image device keep coming in.
+		 */
+		wait_for_device_probe_killable_timeout(300 * HZ);
 		break;
 	}
 
-- 
2.18.4
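
Editorial note on the timeout and its result. The patch passes 300 * HZ
(5 minutes in jiffies) to wait_for_device_probe_killable_timeout(), and
the new function returns void, so snapshot_open() cannot tell a timeout
or a fatal signal from a completed probe. The following is a minimal
userspace sketch, not part of the patch, of the jiffies arithmetic and of
one way a caller could map a wait_event_killable_timeout()-style result
to an error code. The names sketch_secs_to_jiffies(),
sketch_probe_wait_status() and SKETCH_ERESTARTSYS are hypothetical and
exist only for illustration. The choice of -ETIMEDOUT is likewise an
assumption, not something the patch specifies.

```c
#include <assert.h>
#include <errno.h>

/*
 * Assumption: local stand-in for the kernel-internal ERESTARTSYS (512),
 * which userspace <errno.h> does not define.
 */
#define SKETCH_ERESTARTSYS 512

/* Convert a timeout in seconds to jiffies for a given tick rate (HZ). */
unsigned long sketch_secs_to_jiffies(unsigned long secs, unsigned long hz)
{
	assert(hz > 0);
	return secs * hz;
}

/*
 * Hypothetical helper: map a wait_event_killable_timeout()-style result
 * to an errno-style status.
 *   ret <  0: the wait was interrupted by a fatal signal; pass it through
 *   ret == 0: the timeout elapsed and the condition was still false
 *   ret >  0: the condition became true (value is the remaining jiffies)
 */
int sketch_probe_wait_status(long ret)
{
	if (ret < 0)
		return (int)ret;
	if (ret == 0)
		return -ETIMEDOUT;	/* assumed error code for illustration */
	return 0;
}
```

With HZ=250, sketch_secs_to_jiffies(300, 250) yields 75000 jiffies. A
caller wanting to fail the open on timeout could return the status from
sketch_probe_wait_status() instead of discarding it. Whether that is
desirable for snapshot_open() is a design decision for the patch author
and reviewers.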