From: Song Liu
Date: Mon, 27 Nov 2023 15:32:59 -0800
Subject: Re: [PATCH -next v2] md: synchronize flush io with array reconfiguration
To: Yu Kuai
Cc: maan@systemlinux.org, neilb@suse.de, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yi.zhang@huawei.com, yangerkun@huawei.com
References: <20231125065419.3518254-1-yukuai1@huaweicloud.com>

On Mon, Nov 27, 2023 at 2:16 PM Song Liu wrote:
>
> On Fri, Nov 24, 2023 at 10:54 PM Yu Kuai wrote:
> >
> > From: Yu Kuai
> >
> > Currently rcu is used to protect iterating rdev from submit_flushes():
> >
> > submit_flushes                  remove_and_add_spares
> >                                 synchronize_rcu
> >                                 pers->hot_remove_disk()
> >  rcu_read_lock()
> >  rdev_for_each_rcu
> >   if (rdev->raid_disk >= 0)
> >                                 rdev->raid_disk = -1;
> >    atomic_inc(&rdev->nr_pending)
> >    rcu_read_unlock()
> >    bi = bio_alloc_bioset()
> >    bi->bi_end_io = md_end_flush
> >    bi->private = rdev
> >    submit_bio
> >    // issue io for removed rdev
> >
> > Fix this problem by grabbing 'active_io' before iterating rdev, making sure
> > that remove_and_add_spares() won't run concurrently with submit_flushes().
> >
> > Fixes: a2826aa92e2e ("md: support barrier requests on all personalities.")
> > Signed-off-by: Yu Kuai
> > ---
> > Changes v2:
> >  - Add WARN_ON in case md_flush_request() is not called from
> >    md_handle_request() in future.
> >
> >  drivers/md/md.c | 22 ++++++++++++++++------
> >  1 file changed, 16 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > index 86efc9c2ae56..2ffedc39edd6 100644
> > --- a/drivers/md/md.c
> > +++ b/drivers/md/md.c
> > @@ -538,6 +538,9 @@ static void md_end_flush(struct bio *bio)
> >         rdev_dec_pending(rdev, mddev);
> >
> >         if (atomic_dec_and_test(&mddev->flush_pending)) {
> > +               /* The pair is percpu_ref_tryget() from md_flush_request() */
> > +               percpu_ref_put(&mddev->active_io);
> > +
> >                 /* The pre-request flush has finished */
> >                 queue_work(md_wq, &mddev->flush_work);
> >         }
> > @@ -557,12 +560,8 @@ static void submit_flushes(struct work_struct *ws)
> >         rdev_for_each_rcu(rdev, mddev)
> >                 if (rdev->raid_disk >= 0 &&
> >                     !test_bit(Faulty, &rdev->flags)) {
> > -                       /* Take two references, one is dropped
> > -                        * when request finishes, one after
> > -                        * we reclaim rcu_read_lock
> > -                        */
> >                         struct bio *bi;
> > -                       atomic_inc(&rdev->nr_pending);
> > +
> >                         atomic_inc(&rdev->nr_pending);
> >                         rcu_read_unlock();
> >                         bi = bio_alloc_bioset(rdev->bdev, 0,
> > @@ -573,7 +572,6 @@ static void submit_flushes(struct work_struct *ws)
> >                         atomic_inc(&mddev->flush_pending);
> >                         submit_bio(bi);
> >                         rcu_read_lock();
> > -                       rdev_dec_pending(rdev, mddev);
> >                 }
> >         rcu_read_unlock();
> >         if (atomic_dec_and_test(&mddev->flush_pending))
> > @@ -626,6 +624,18 @@ bool md_flush_request(struct mddev *mddev, struct bio *bio)
> >         /* new request after previous flush is completed */
> >         if (ktime_after(req_start, mddev->prev_flush_start)) {
> >                 WARN_ON(mddev->flush_bio);
> > +               /*
> > +                * Grab a reference to make sure mddev_suspend() will wait for
> > +                * this flush to be done.
> > +                *
> > +                * md_flush_request() is called under md_handle_request() and
> > +                * 'active_io' is already grabbed, hence percpu_ref_tryget()
> > +                * won't fail, percpu_ref_tryget_live() can't be used because
> > +                * percpu_ref_kill() can be called by mddev_suspend()
> > +                * concurrently.
> > +                */
> > +               if (WARN_ON(percpu_ref_tryget(&mddev->active_io)))
>
> This should be "if (!WARN_ON(..))", right?
>
> Song
>
> > +                       percpu_ref_get(&mddev->active_io);

Actually, we can just use percpu_ref_get(), no?

Thanks,
Song
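
For anyone following the WARN_ON() question: WARN_ON() evaluates to the truth value of its condition, and percpu_ref_tryget() returns true on success. Below is a minimal userspace sketch of what the quoted hunk would do as posted, compared with a plain get. It assumes a mock counter in place of percpu_ref; warn_on(), warn_count, struct mock_ref and the grab_*() helpers are hypothetical names for illustration, not kernel API.

```c
#include <stdbool.h>
#include <stdio.h>

/* Number of warnings emitted so far (illustration only). */
static int warn_count;

/* Userspace stand-in for WARN_ON(): reports when the condition is
 * true and evaluates to the condition's truth value. */
static bool warn_on(bool cond, const char *expr)
{
    if (cond) {
        warn_count++;
        fprintf(stderr, "WARNING: %s\n", expr);
    }
    return cond;
}
#define WARN_ON(cond) warn_on(!!(cond), #cond)

/* Hypothetical mock of a reference counter; NOT the percpu_ref API. */
struct mock_ref {
    unsigned long count;
};

/* Mimics percpu_ref_tryget(): succeeds only if the count is non-zero. */
static bool mock_ref_tryget(struct mock_ref *ref)
{
    if (ref->count == 0)
        return false;
    ref->count++;
    return true;
}

/* Mimics percpu_ref_get(): unconditional increment. */
static void mock_ref_get(struct mock_ref *ref)
{
    ref->count++;
}

/* Shape of the v2 hunk as posted: the warning fires when tryget
 * succeeds, and a second reference is then taken on top of it. */
static void grab_as_posted(struct mock_ref *ref)
{
    if (WARN_ON(mock_ref_tryget(ref)))
        mock_ref_get(ref);
}

/* Shape of the plain-get alternative: exactly one extra reference. */
static void grab_plain(struct mock_ref *ref)
{
    mock_ref_get(ref);
}
```

Starting from a count of 1 (standing in for the reference md_handle_request() already holds), grab_as_posted() leaves the count at 3 and emits one warning, while grab_plain() leaves it at 2 with no warning.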