Subject: Re: [PATCH -next] md: split MD_RECOVERY_NEEDED out of mddev_resume
To: Song Liu, Yu Kuai
Cc: agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, dm-devel@lists.linux.dev, janpieter.sollie@edpnet.be, linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yi.zhang@huawei.com, yangerkun@huawei.com, "yukuai (C)"
References: <20231204031703.3102254-1-yukuai1@huaweicloud.com>
From: Yu Kuai
Message-ID: <269ac5cb-aa09-02ca-4150-c90cd5a72e06@huaweicloud.com>
Date: Wed, 6 Dec 2023 19:36:41 +0800

Hi,

On 2023/12/06 16:30, Song Liu wrote:
> On Sun, Dec 3, 2023 at 7:18 PM Yu Kuai wrote:
>>
>> From: Yu Kuai
>>
>> New mddev_resume() calls are added to synchronize IO with array
>> reconfiguration, however, this introduces a regression while adding it in
>> md_start_sync():
>>
>> 1) someone sets MD_RECOVERY_NEEDED first;
>> 2) daemon thread grabs reconfig_mutex, then clears MD_RECOVERY_NEEDED and
>>    queues a new sync work;
>> 3) daemon thread releases reconfig_mutex;
>> 4) in md_start_sync
>>    a) check that there are spares that can be added/removed, then suspend
>>       the array;
>>    b) remove_and_add_spares may not be called, or called without really
>>       adding/removing spares;
>>    c) resume the array, then set MD_RECOVERY_NEEDED again!
>>
>> Loop between 2 - 4, then mddev_suspend() will be called quite often; as a
>> consequence, normal IO will be quite slow.
>>
>> Fix this problem by splitting MD_RECOVERY_NEEDED out of mddev_resume(), so
>> that md_start_sync() won't set such flag and hence the loop will be broken.
>
> I hope we don't leak set_bit MD_RECOVERY_NEEDED to all call
> sites of mddev_resume().

There are also some other mddev_resume() calls that were added later and
don't need recovery, so md_start_sync() is not the only place:

 - md_setup_drive
 - rdev_attr_store
 - suspend_lo_store
 - suspend_hi_store
 - autorun_devices
 - md_ioctl
 - r5c_disable_writeback_async
 - error path from new_dev_store(), ...

I'm not sure adding a new helper is a good idea, because all of the above
APIs would have to use the new helper as well.

>
> How about something like the following instead?
>
> Please also incorporate feedback from Paul in the next version.

Of course.

Thanks,
Kuai

>
> Thanks,
> Song
>
> diff --git i/drivers/md/md.c w/drivers/md/md.c
> index c94373d64f2c..2d53e1b57070 100644
> --- i/drivers/md/md.c
> +++ w/drivers/md/md.c
> @@ -490,7 +490,7 @@ int mddev_suspend(struct mddev *mddev, bool interruptible)
>  }
>  EXPORT_SYMBOL_GPL(mddev_suspend);
>
> -void mddev_resume(struct mddev *mddev)
> +static void __mddev_resume(struct mddev *mddev, bool recovery_needed)
>  {
>  	lockdep_assert_not_held(&mddev->reconfig_mutex);
>
> @@ -507,12 +507,18 @@ void mddev_resume(struct mddev *mddev)
>  	percpu_ref_resurrect(&mddev->active_io);
>  	wake_up(&mddev->sb_wait);
>
> -	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
> +	if (recovery_needed)
> +		set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
>  	md_wakeup_thread(mddev->thread);
>  	md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
>
>  	mutex_unlock(&mddev->suspend_mutex);
>  }
> +
> +void mddev_resume(struct mddev *mddev)
> +{
> +	__mddev_resume(mddev, true);
> +}
>  EXPORT_SYMBOL_GPL(mddev_resume);
>
>  /*
> @@ -9403,7 +9409,9 @@ static void md_start_sync(struct work_struct *ws)
>  		goto not_running;
>  	}
>
> -	suspend ? mddev_unlock_and_resume(mddev) : mddev_unlock(mddev);
> +	mddev_unlock(mddev);
> +	if (suspend)
> +		__mddev_resume(mddev, false);
>  	md_wakeup_thread(mddev->sync_thread);
>  	sysfs_notify_dirent_safe(mddev->sysfs_action);
>  	md_new_event();
> @@ -9415,7 +9423,9 @@ static void md_start_sync(struct work_struct *ws)
>  	clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
>  	clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
>  	clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
> -	suspend ? mddev_unlock_and_resume(mddev) : mddev_unlock(mddev);
> +	mddev_unlock(mddev);
> +	if (suspend)
> +		__mddev_resume(mddev, false);
>
>  	wake_up(&resync_wait);
>  	if (test_and_clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery) &&
>
> .
>
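
For reference, the loop described in steps 1) to 4) of the quoted commit
message can be modelled in userspace. This is only a sketch under
simplifying assumptions, not kernel code: every model_* name is
hypothetical, locking and the work queue are left out, and max_rounds
exists only to bound the simulation. It compares a resume path that always
sets the flag with one that leaves it alone, as in the proposed
__mddev_resume(mddev, false).

```c
/*
 * Userspace model of the suspend/resume loop (NOT kernel code).
 * All model_* names are hypothetical stand-ins for the md.c logic.
 */
#include <assert.h>
#include <stdbool.h>

struct model_mddev {
	bool recovery_needed;   /* stands in for MD_RECOVERY_NEEDED */
	unsigned int suspends;  /* how many times the array was suspended */
};

/* Models the resume path: the flag is set only when the caller asks. */
static void model_resume(struct model_mddev *m, bool recovery_needed)
{
	if (recovery_needed)
		m->recovery_needed = true;
}

/* Models md_start_sync() when no spare ends up being added or removed. */
static void model_start_sync(struct model_mddev *m, bool resume_sets_flag)
{
	/* Step 2 already cleared the flag before queueing this work. */
	assert(!m->recovery_needed);

	m->suspends++;                      /* step 4a: suspend the array */
	/* step 4b: nothing is actually added or removed */
	model_resume(m, resume_sets_flag);  /* step 4c: resume the array */
}

/* Runs the daemon for at most max_rounds; returns the number of suspends. */
unsigned int model_daemon(bool resume_sets_flag, unsigned int max_rounds)
{
	/* Step 1: someone set the flag before the daemon runs. */
	struct model_mddev m = { .recovery_needed = true, .suspends = 0 };

	for (unsigned int i = 0; i < max_rounds && m.recovery_needed; i++) {
		m.recovery_needed = false;              /* step 2: clear, queue work */
		model_start_sync(&m, resume_sets_flag); /* step 4 */
	}
	return m.suspends;
}
```

In this model, model_daemon(true, 1000) suspends the array in every round
until the bound is reached, while model_daemon(false, 1000) suspends it
once and then stops, which is the behaviour the patch is after.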