From: Yu Kuai
Subject: [PATCH v2 00/11] dm-raid: fix v6.7 regressions
Date: Wed, 24 Jan 2024 17:14:10 +0800
Message-ID: <20240124091421.1261579-1-yukuai3@huawei.com>
X-Mailer: git-send-email 2.39.2
X-Mailing-List: linux-kernel@vger.kernel.org

First regression related to stop sync thread:

The lifetime of sync_thread is designed as follows:

1) Decide to start sync_thread, set MD_RECOVERY_NEEDED, and wake up the
daemon thread;
2) The daemon thread detects that MD_RECOVERY_NEEDED is set, then sets
MD_RECOVERY_RUNNING and registers sync_thread;
3) Execute md_do_sync() for the actual work; when it is done or
interrupted, it sets MD_RECOVERY_DONE and wakes up the daemon thread;
4) The daemon thread detects that MD_RECOVERY_DONE is set, then clears
MD_RECOVERY_RUNNING and unregisters sync_thread;

In v6.7, we fixed md/raid to follow this design with commit f52f5c71f3d4
("md: fix stopping sync thread"). However, dm-raid was not considered at
that time, and the following tests hang:

shell/integrity-caching.sh
shell/lvconvert-raid-reshape.sh

This patch set fixes the broken tests with patches 1-4:
- patch 1 fixes step 4) being broken by a suspended array;
- patch 2 fixes step 4) being broken by a read-only array;
- patch 3 fixes step 3) being broken because md_do_sync() doesn't set
MD_RECOVERY_DONE. Note that this patch introduces a new problem, data
corruption, which is fixed by later patches;
- patch 4 fixes step 1) being broken because sync_thread is registered
and MD_RECOVERY_RUNNING is set directly;

With patches 1-4, the above tests no longer hang; however, they still
fail and complain that ext4 is corrupted.

Second regression related to frozen sync thread:

Note that for raid456, if reshape is interrupted, calling
"pers->start_reshape" will corrupt data. This is because dm-raid relies
on md_do_sync() not setting MD_RECOVERY_DONE, so that a new sync_thread
won't be registered, and patch 3 breaks exactly this.

- Patches 5-6 fix this problem by interrupting reshape and freezing
sync_thread in dm_suspend(), then unfreezing it and continuing reshape
in dm_resume(). It's verified that the dm-raid tests no longer complain
that ext4 is corrupted.
- Patch 7 fixes the problem that raid_message() calls
md_reap_sync_thread() directly, without holding 'reconfig_mutex'.

Last regression related to dm-raid456 IO concurrent with reshape:

For raid456, if reshape is still in progress, IO across the reshape
position waits for reshape to make progress. However, for dm-raid, in
the following cases reshape will never make progress, hence IO will
hang:

1) the array is read-only;
2) MD_RECOVERY_WAIT is set;
3) MD_RECOVERY_FROZEN is set;

After commit c467e97f079f ("md/raid6: use valid sector values to
determine if an I/O should wait on the reshape") fixed the problem that
IO across the reshape position didn't wait for reshape, the dm-raid test
shell/lvconvert-raid-reshape.sh started to hang at raid5_make_request().

For md/raid, the problem doesn't exist because:

1) If the array is read-only, it can be switched to read-write via
ioctl/sysfs;
2) md/raid never sets MD_RECOVERY_WAIT;
3) If MD_RECOVERY_FROZEN is set, mddev_suspend() doesn't hold
'reconfig_mutex' anymore, so the flag can be cleared and reshape can
continue via the sysfs api 'sync_action'.

However, I'm not sure yet how to avoid the problem in dm-raid.

- Patches 9-11 fix this problem by detecting the above 3 cases in
dm_suspend() and failing those IO directly.

If users really hit the IO error, it means they were reading wrong data
before c467e97f079f. It is safe to read/write the array once reshape
makes progress successfully.

Tests:

I have already run the following two tests many times and verified that
they no longer fail:

shell/integrity-caching.sh
shell/lvconvert-raid-reshape.sh

The other tests are still running. However, I'm sending this patchset
now in case people think the fixes are not appropriate. Running the full
test suite takes a long time in my VM, and I'll post the full results
soon.
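
As a reading aid for the sync_thread lifetime in steps 1)-4) and the
three cases in which reshape cannot make progress, here is a minimal
userspace sketch. It is not kernel code: the MD_RECOVERY_* names mirror
the flags discussed above, but the bit values, struct toy_mddev and all
toy_* helpers are hypothetical and exist only for illustration. Clearing
MD_RECOVERY_DONE in step 4 is an assumption of this model so that the
cycle can restart.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names follow the cover letter; the bit values are arbitrary. */
#define MD_RECOVERY_NEEDED  0x01u
#define MD_RECOVERY_RUNNING 0x02u
#define MD_RECOVERY_DONE    0x04u
#define MD_RECOVERY_WAIT    0x08u
#define MD_RECOVERY_FROZEN  0x10u

/* Hypothetical stand-in for the array state; not the kernel's struct mddev. */
struct toy_mddev {
	unsigned int recovery;        /* bitmask of the flags above */
	bool sync_thread_registered;
	bool read_only;
};

/* Step 1: request a sync. The real code would also wake the daemon thread. */
static void toy_request_sync(struct toy_mddev *m)
{
	m->recovery |= MD_RECOVERY_NEEDED;
}

/* Steps 2 and 4: one pass of the daemon thread over the flags. */
static void toy_daemon_pass(struct toy_mddev *m)
{
	if (m->recovery & MD_RECOVERY_DONE) {
		/* Step 4: the sync work has ended, so unregister the thread. */
		m->recovery &= ~(MD_RECOVERY_DONE | MD_RECOVERY_RUNNING);
		m->sync_thread_registered = false;
		return;
	}
	if ((m->recovery & MD_RECOVERY_NEEDED) &&
	    !(m->recovery & MD_RECOVERY_RUNNING)) {
		/* Step 2: start the sync and register the thread. */
		m->recovery &= ~MD_RECOVERY_NEEDED;
		m->recovery |= MD_RECOVERY_RUNNING;
		m->sync_thread_registered = true;
	}
}

/* Step 3: the sync work itself; finished or interrupted, it sets DONE. */
static void toy_do_sync(struct toy_mddev *m)
{
	assert(m->recovery & MD_RECOVERY_RUNNING);
	m->recovery |= MD_RECOVERY_DONE;
}

/* The three cases in which reshape can never make progress. */
static bool toy_reshape_cannot_progress(const struct toy_mddev *m)
{
	return m->read_only ||
	       (m->recovery & (MD_RECOVERY_WAIT | MD_RECOVERY_FROZEN)) != 0;
}
```

Calling toy_request_sync(), toy_daemon_pass(), toy_do_sync() and
toy_daemon_pass() in that order walks one full lifetime and leaves all
flags clear again. If step 3 never sets MD_RECOVERY_DONE, the second
daemon pass has nothing to reap and the thread stays registered, which
is the hang the first four patches address.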

Yu Kuai (11):
  md: don't ignore suspended array in md_check_recovery()
  md: don't ignore read-only array in md_check_recovery()
  md: make sure md_do_sync() will set MD_RECOVERY_DONE
  md: don't register sync_thread for reshape directly
  md: export helpers to stop sync_thread
  dm-raid: really frozen sync_thread during suspend
  md/dm-raid: don't call md_reap_sync_thread() directly
  dm-raid: remove mddev_suspend/resume()
  dm-raid: add a new helper prepare_suspend() in md_personality
  md: export helper md_is_rdwr()
  md/raid456: fix a deadlock for dm-raid456 while io concurrent with
    reshape

 drivers/md/dm-raid.c |  76 +++++++++++++++++++++----------
 drivers/md/md.c      | 104 ++++++++++++++++++++++++++++---------------
 drivers/md/md.h      |  16 +++++++
 drivers/md/raid10.c  |  16 +------
 drivers/md/raid5.c   |  61 +++++++++++++------------
 5 files changed, 171 insertions(+), 102 deletions(-)

-- 
2.39.2