From: Yu Kuai
Subject: [PATCH 07/12] md: don't fail action_store() if sync_thread is not registered
Date: Mon, 3 Jun 2024 20:58:10 +0800
Message-ID: <20240603125815.2199072-8-yukuai3@huawei.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20240603125815.2199072-1-yukuai3@huawei.com>
References: <20240603125815.2199072-1-yukuai3@huawei.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain
X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To
 kwepemm600009.china.huawei.com (7.193.23.164)

MD_RECOVERY_RUNNING is always set when trying to register a new
sync_thread. However, if md_start_sync() turns out to do nothing,
MD_RECOVERY_RUNNING is cleared again. During this race window,
action_store() returns -EBUSY, which causes some mdadm tests to fail.
For example:

The test 07reshape5intr adds a new disk to the array, then starts a
reshape:

mdadm /dev/md0 --add /dev/xxx
mdadm --grow /dev/md0 -n 3

add_bound_rdev(), triggered by mdadm --add, sets MD_RECOVERY_NEEDED;
then, during the race window, mdadm --grow fails.

Fix the problem by waiting in action_store() during the race window and
failing only if sync_thread is registered.

Signed-off-by: Yu Kuai
---
 drivers/md/md.c | 85 +++++++++++++++++++------------------------------
 drivers/md/md.h |  2 --
 2 files changed, 33 insertions(+), 54 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 5e3c3c109412..3890ae86449a 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -752,7 +752,6 @@ int mddev_init(struct mddev *mddev)
 
 	mutex_init(&mddev->open_mutex);
 	mutex_init(&mddev->reconfig_mutex);
-	mutex_init(&mddev->sync_mutex);
 	mutex_init(&mddev->suspend_mutex);
 	mutex_init(&mddev->bitmap_info.mutex);
 	INIT_LIST_HEAD(&mddev->disks);
@@ -5020,34 +5019,6 @@ void md_unfrozen_sync_thread(struct mddev *mddev)
 }
 EXPORT_SYMBOL_GPL(md_unfrozen_sync_thread);
 
-static void idle_sync_thread(struct mddev *mddev)
-{
-	mutex_lock(&mddev->sync_mutex);
-	clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-
-	if (mddev_lock(mddev)) {
-		mutex_unlock(&mddev->sync_mutex);
-		return;
-	}
-
-	stop_sync_thread(mddev, false);
-	mutex_unlock(&mddev->sync_mutex);
-}
-
-static void frozen_sync_thread(struct mddev *mddev)
-{
-	mutex_lock(&mddev->sync_mutex);
-	set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-
-	if (mddev_lock(mddev)) {
-		mutex_unlock(&mddev->sync_mutex);
-		return;
-	}
-
-	stop_sync_thread(mddev, false);
-	mutex_unlock(&mddev->sync_mutex);
-}
-
 static int mddev_start_reshape(struct mddev *mddev)
 {
 	int ret;
@@ -5055,24 +5026,13 @@ static int mddev_start_reshape(struct mddev *mddev)
 	if (mddev->pers->start_reshape == NULL)
 		return -EINVAL;
 
-	ret = mddev_lock(mddev);
-	if (ret)
-		return ret;
-
-	if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
-		mddev_unlock(mddev);
-		return -EBUSY;
-	}
-
 	if (mddev->reshape_position == MaxSector ||
 	    mddev->pers->check_reshape == NULL ||
 	    mddev->pers->check_reshape(mddev)) {
 		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 		ret = mddev->pers->start_reshape(mddev);
-		if (ret) {
-			mddev_unlock(mddev);
+		if (ret)
 			return ret;
-		}
 	} else {
 		/*
 		 * If reshape is still in progress, and md_check_recovery() can
@@ -5082,7 +5042,6 @@ static int mddev_start_reshape(struct mddev *mddev)
 		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 	}
 
-	mddev_unlock(mddev);
 	sysfs_notify_dirent_safe(mddev->sysfs_degraded);
 	return 0;
 }
@@ -5096,36 +5055,53 @@ action_store(struct mddev *mddev, const char *page, size_t len)
 	if (!mddev->pers || !mddev->pers->sync_request)
 		return -EINVAL;
 
+retry:
+	if (work_busy(&mddev->sync_work))
+		flush_work(&mddev->sync_work);
+
+	ret = mddev_lock(mddev);
+	if (ret)
+		return ret;
+
+	if (work_busy(&mddev->sync_work)) {
+		mddev_unlock(mddev);
+		goto retry;
+	}
+
 	action = md_sync_action_by_name(page);
 
 	/* TODO: mdadm rely on "idle" to start sync_thread. */
 	if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
 		switch (action) {
 		case ACTION_FROZEN:
-			frozen_sync_thread(mddev);
-			return len;
+			md_frozen_sync_thread(mddev);
+			ret = len;
+			goto out;
 		case ACTION_IDLE:
-			idle_sync_thread(mddev);
+			md_idle_sync_thread(mddev);
 			break;
 		case ACTION_RESHAPE:
 		case ACTION_RECOVER:
 		case ACTION_CHECK:
 		case ACTION_REPAIR:
 		case ACTION_RESYNC:
-			return -EBUSY;
+			ret = -EBUSY;
+			goto out;
 		default:
-			return -EINVAL;
+			ret = -EINVAL;
+			goto out;
 		}
 	} else {
 		switch (action) {
 		case ACTION_FROZEN:
 			set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-			return len;
+			ret = len;
+			goto out;
 		case ACTION_RESHAPE:
 			clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 			ret = mddev_start_reshape(mddev);
 			if (ret)
-				return ret;
+				goto out;
 			break;
 		case ACTION_RECOVER:
 			clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
@@ -5143,7 +5119,8 @@ action_store(struct mddev *mddev, const char *page, size_t len)
 			clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 			break;
 		default:
-			return -EINVAL;
+			ret = -EINVAL;
+			goto out;
 		}
 	}
 
@@ -5151,14 +5128,18 @@ action_store(struct mddev *mddev, const char *page, size_t len)
 		/* A write to sync_action is enough to justify
 		 * canceling read-auto mode
 		 */
-		flush_work(&mddev->sync_work);
 		mddev->ro = MD_RDWR;
 		md_wakeup_thread(mddev->sync_thread);
 	}
+
 	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
 	md_wakeup_thread(mddev->thread);
 	sysfs_notify_dirent_safe(mddev->sysfs_action);
-	return len;
+	ret = len;
+
+out:
+	mddev_unlock(mddev);
+	return ret;
 }
 
 static struct md_sysfs_entry md_scan_mode =
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 018f3292a25c..f7afc5a46031 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -595,8 +595,6 @@ struct mddev {
 	 */
 	struct list_head deleting;
 
-	/* Used to synchronize idle and frozen for action_store() */
-	struct mutex sync_mutex;
 	/* The sequence number for sync thread */
 	atomic_t sync_seq;
 
-- 
2.39.2