From: linan666@huaweicloud.com
To: song@kernel.org, yukuai3@huawei.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
    linan666@huaweicloud.com, yi.zhang@huawei.com, houtao1@huawei.com,
    yangerkun@huawei.com
Subject: [PATCH 2/2] md: create symlink with disk holder after mddev resume
Date: Thu, 21 Dec 2023 15:11:09 +0800
Message-Id: <20231221071109.1562530-3-linan666@huaweicloud.com>
In-Reply-To: <20231221071109.1562530-1-linan666@huaweicloud.com>
References: <20231221071109.1562530-1-linan666@huaweicloud.com>

From: Li Nan <linan666@huaweicloud.com>

There is a risk of deadlock when a process takes disk->open_mutex after
suspending the mddev, because another process may already hold open_mutex
while submitting I/O. For example:

T1                              T2
blkdev_open
 bdev_open_by_dev
  mutex_lock(&disk->open_mutex)
                                md_ioctl
                                 mddev_suspend_and_lock
                                  mddev_suspend
                                 md_add_new_disk
                                  bind_rdev_to_array
                                   bd_link_disk_holder
                                    //wait open_mutex
  blkdev_get_whole
   bdev_disk_changed
    efi_partition
     read_lba
      ...
       md_submit_bio
        md_handle_request
         //wait resume

Fix it by taking disk->open_mutex only after the mddev has been resumed:
iterate over mddev->disks and create the symlink for each rdev that does
not have one yet. Also move bd_unlink_disk_holder() to mddev_unlock(). By
that point the rdev has already been deleted from mddev->disks, which
avoids concurrent bind and unbind.

Fixes: 1b0a2d950ee2 ("md: use new apis to suspend array for ioctls involed array reconfiguration")
Signed-off-by: Li Nan <linan666@huaweicloud.com>
---
 drivers/md/md.c | 39 +++++++++++++++++++++++++++++----------
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index d6612b922c76..c128570f2a5d 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -521,6 +521,20 @@ void mddev_resume(struct mddev *mddev)
 }
 EXPORT_SYMBOL_GPL(mddev_resume);
 
+static void md_link_disk_holder(struct mddev *mddev)
+{
+	struct md_rdev *rdev;
+
+	rcu_read_lock();
+	rdev_for_each_rcu(rdev, mddev) {
+		if (test_bit(SymlinkCreated, &rdev->flags))
+			continue;
+		if (!bd_link_disk_holder(rdev->bdev, mddev->gendisk))
+			set_bit(SymlinkCreated, &rdev->flags);
+	}
+	rcu_read_unlock();
+}
+
 /*
  * Generic flush handling for md
  */
@@ -902,6 +916,11 @@ void mddev_unlock(struct mddev *mddev)
 
 	list_for_each_entry_safe(rdev, tmp, &delete, same_set) {
 		list_del_init(&rdev->same_set);
+		if (test_bit(SymlinkCreated, &rdev->flags)) {
+			bd_unlink_disk_holder(rdev->bdev, rdev->mddev->gendisk);
+			clear_bit(SymlinkCreated, &rdev->flags);
+		}
+		rdev->mddev = NULL;
 		kobject_del(&rdev->kobj);
 		export_rdev(rdev, mddev);
 	}
@@ -2526,8 +2545,6 @@ static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev)
 		sysfs_get_dirent_safe(rdev->kobj.sd, "bad_blocks");
 
 	list_add_rcu(&rdev->same_set, &mddev->disks);
-	if (!bd_link_disk_holder(rdev->bdev, mddev->gendisk))
-		set_bit(SymlinkCreated, &rdev->flags);
 
 	/* May as well allow recovery to be retried once */
 	mddev->recovery_disabled++;
@@ -2562,14 +2579,9 @@ static void md_kick_rdev_from_array(struct md_rdev *rdev)
 {
 	struct mddev *mddev = rdev->mddev;
 
-	if (test_bit(SymlinkCreated, &rdev->flags)) {
-		bd_unlink_disk_holder(rdev->bdev, rdev->mddev->gendisk);
-		clear_bit(SymlinkCreated, &rdev->flags);
-	}
 	list_del_rcu(&rdev->same_set);
 	pr_debug("md: unbind<%pg>\n", rdev->bdev);
 	mddev_destroy_serial_pool(rdev->mddev, rdev);
-	rdev->mddev = NULL;
 	sysfs_remove_link(&rdev->kobj, "block");
 	sysfs_put(rdev->sysfs_state);
 	sysfs_put(rdev->sysfs_unack_badblocks);
@@ -4667,8 +4679,10 @@ new_dev_store(struct mddev *mddev, const char *buf, size_t len)
 	if (err)
 		export_rdev(rdev, mddev);
 	mddev_unlock_and_resume(mddev);
-	if (!err)
+	if (!err) {
+		md_link_disk_holder(mddev);
 		md_new_event();
+	}
 	return err ? err : len;
 }
 
@@ -6606,6 +6620,7 @@ static void autorun_devices(int part)
 			}
 			autorun_array(mddev);
 			mddev_unlock_and_resume(mddev);
+			md_link_disk_holder(mddev);
 		}
 		/* on success, candidates will be empty, on error
 		 * it won't...
@@ -7832,8 +7847,12 @@ static int md_ioctl(struct block_device *bdev, blk_mode_t mode,
 	    err != -EINVAL)
 		mddev->hold_active = 0;
 
-	md_ioctl_need_suspend(cmd) ? mddev_unlock_and_resume(mddev) :
-				     mddev_unlock(mddev);
+	if (md_ioctl_need_suspend(cmd)) {
+		mddev_unlock_and_resume(mddev);
+		md_link_disk_holder(mddev);
+	} else {
+		mddev_unlock(mddev);
+	}
 
 out:
 	if(did_set_md_closing)
-- 
2.39.2
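
For readers outside the md code, the deferred-linking step described in the
commit message can be modelled in user space. This is only a sketch and not
part of the patch: struct toy_rdev, toy_link_holder() and toy_link_pending()
are hypothetical stand-ins for struct md_rdev, bd_link_disk_holder() and
md_link_disk_holder(), and there is no real locking or sysfs involved. It
shows just the pattern of walking the device list after resume, skipping
devices that already have their link, and leaving failed ones to be retried
on a later pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct md_rdev (not a kernel type). */
struct toy_rdev {
	bool symlink_created;  /* models the SymlinkCreated flag bit */
	bool link_should_fail; /* lets a caller simulate a failed link attempt */
};

/* Models bd_link_disk_holder(): returns 0 on success, nonzero on failure. */
static int toy_link_holder(const struct toy_rdev *rdev)
{
	return rdev->link_should_fail ? -1 : 0;
}

/*
 * Models md_link_disk_holder(): intended to be called only after the array
 * has been resumed. Devices that are already linked are skipped, and the
 * flag is set only when linking succeeds, so a failed device is picked up
 * again by the next call. Returns the number of links created in this pass.
 */
static size_t toy_link_pending(struct toy_rdev *rdevs, size_t n)
{
	size_t created = 0;

	assert(rdevs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (rdevs[i].symlink_created)
			continue;
		if (toy_link_holder(&rdevs[i]) == 0) {
			rdevs[i].symlink_created = true;
			created++;
		}
	}
	return created;
}
```

Because the flag is checked per device, calling the helper after every
resume is harmless: a pass over an array whose devices are all linked
creates nothing.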