From: Li Jinlin
Subject: [PATCH v3] scsi: core: Fix hang of freezing queue between blocking and running device
Date: Tue, 24 Aug 2021 10:59:21 +0800
Message-ID: <20210824025921.3277629-1-lijinlin3@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Li Jinlin

We found a hang. The steps to reproduce are as follows:

  1. Block the device via scsi_device_set_state()

  2. dd if=/dev/sda of=/mnt/t.log bs=1M count=10

  3. echo none > /sys/block/sda/queue/scheduler

  4. echo "running" >/sys/block/sda/device/state

Steps 3 and 4 should both complete once step 4 has run, but they hang.

  CPU#0               CPU#1                CPU#2
  ---------------     ----------------     ----------------
                                           Step 1: blocking device

                                           Step 2: dd xxxx
                                                  ^^^^^^ get request
                                                         q_usage_counter++

                      Step 3: switching scheduler
                      elv_iosched_store
                        elevator_switch
                          blk_mq_freeze_queue
                            blk_freeze_queue
                              > blk_freeze_queue_start
                                ^^^^^^ mq_freeze_depth++

                              > blk_mq_run_hw_queues
                                ^^^^^^ can't run queue when dev blocked

                              > blk_mq_freeze_queue_wait
                                ^^^^^^ Hang here!!!
                                       wait q_usage_counter==0

  Step 4: running device
  store_state_field
    scsi_rescan_device
      scsi_attach_vpd
        scsi_vpd_inquiry
          __scsi_execute
            blk_get_request
              blk_mq_alloc_request
                blk_queue_enter
                  ^^^^^^ Hang here!!!
                         wait mq_freeze_depth==0

    blk_mq_run_hw_queues
      ^^^^^^ dispatch IO, q_usage_counter will reduce to zero

                            blk_mq_unfreeze_queue
                              ^^^^^ mq_freeze_depth--

Steps 3 and 4 wait for each other: the freeze in step 3 waits for
q_usage_counter to reach zero, which requires the queue to be run, while
the rescan in step 4 waits for mq_freeze_depth to reach zero before the
queue is ever run.

To fix this, run the queue before rescanning the device when the device
state changes to SDEV_RUNNING.

Fixes: f0f82e2476f6 ("scsi: core: Fix capacity set to zero after offlinining device")
Signed-off-by: Li Jinlin
Signed-off-by: Qiu Laibin
---
Changes since v2 (sent with Message-ID
20210809141308.3700854-1-lijinlin3@huawei.com):
 - Expanded code comment in store_state_field() as suggested by Bart

 drivers/scsi/scsi_sysfs.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index ae9bfc658203..c0d31119d6d7 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -808,12 +808,15 @@ store_state_field(struct device *dev, struct device_attribute *attr,
 	ret = scsi_device_set_state(sdev, state);
 	/*
 	 * If the device state changes to SDEV_RUNNING, we need to
-	 * rescan the device to revalidate it, and run the queue to
-	 * avoid I/O hang.
+	 * run the queue to avoid I/O hang, and rescan the device
+	 * to revalidate it. Running the queue first is necessary
+	 * because another thread may be waiting inside
+	 * blk_mq_freeze_queue_wait() and because that call may be
+	 * waiting for pending I/O to finish.
 	 */
 	if (ret == 0 && state == SDEV_RUNNING) {
-		scsi_rescan_device(dev);
 		blk_mq_run_hw_queues(sdev->request_queue, true);
+		scsi_rescan_device(dev);
 	}
 	mutex_unlock(&sdev->state_mutex);
 
-- 
2.27.0