Subject: Re: [PATCH v3] md: fix duplicate filename for rdev
To: Song Liu, Yu Kuai
Cc: pmenzel@molgen.mpg.de, akpm@linux-foundation.org, neilb@suse.de,
    linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
    yi.zhang@huawei.com, yangerkun@huawei.com, "yukuai (C)"
References: <20230523012727.3042247-1-yukuai1@huaweicloud.com>
From: Yu Kuai
Date: Wed, 24 May 2023 09:33:06 +0800
Hi,

On 2023/05/24 2:05, Song Liu wrote:
> On Mon, May 22, 2023 at 6:30 PM Yu Kuai wrote:
>>
>> From: Yu Kuai
>>
>> Commit 5792a2856a63 ("[PATCH] md: avoid a deadlock when removing a device
>> from an md array via sysfs") delays the deletion of rdev. However, this
>> introduces a window in which the rdev can be added again before the
>> deletion is done, and sysfs will complain about a duplicate filename.
>>
>> Follow-up patches tried to fix this problem by flushing the workqueue,
>> but flush_rdev_wq() is just dead code, because md_kick_rdev_from_array()
>> proceeds as follows:
>>
>> 1) list_del_rcu(&rdev->same_set);
>> 2) synchronize_rcu();
>> 3) queue_work(md_rdev_misc_wq, &rdev->del_work);
>>
>> So in flush_rdev_wq(), if the rdev is found in the list, work_pending()
>> can never return true; and once the work is queued, the rdev can no
>> longer be found in the list.
>>
>> flush_rdev_wq() could be replaced by flush_workqueue() directly, but
>> this approach is not good:
>> - the workqueue is global, so synchronizing across all raid disks is
>>   not necessary;
>> - flush_workqueue() can't be called under 'reconfig_mutex', so there is
>>   still a small window between flush_workqueue() and mddev_lock() in
>>   which other contexts can queue new work, and the problem is not
>>   solved completely.
>>
>> sysfs already has APIs that allow an entry to delete itself from its
>> own writer, and these APIs, specifically
>> sysfs_break/unbreak_active_protection(), are used here to delete the
>> rdev synchronously. Therefore, the above commit can be reverted, and
>> the sysfs duplicate filename is avoided.
>>
>> A new mdadm regression test is proposed as well ([1]).
>>
>> [1] https://lore.kernel.org/linux-raid/20230428062845.1975462-1-yukuai1@huaweicloud.com/
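For reference, the dead check being reverted looks roughly like this --
a sketch from memory of drivers/md/md.c, not a verbatim copy;
rdev_for_each_rcu() is the usual md list iterator:

static void flush_rdev_wq(struct mddev *mddev)
{
	struct md_rdev *rdev;

	rcu_read_lock();
	rdev_for_each_rcu(rdev, mddev)
		/* Dead check: an rdev still on the list has not had
		 * del_work queued yet, and once del_work is queued
		 * the rdev is already off the list, so this can
		 * never fire.
		 */
		if (work_pending(&rdev->del_work)) {
			flush_workqueue(md_rdev_misc_wq);
			break;
		}
	rcu_read_unlock();
}

The synchronize_rcu() in step 2 above is what guarantees the two cases
can never overlap.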
>> Fixes: 5792a2856a63 ("[PATCH] md: avoid a deadlock when removing a device from an md array via sysfs")
>> Signed-off-by: Yu Kuai
>
> Thanks for the fix! I made the following changes and applied it
> to md-next:
>
> 1. remove md_rdev->del_work, which is not used any more;
> 2. change list_empty_safe to list_empty protected by the mutex, as
>    list_empty_safe doesn't seem safe here.

Yes, this makes sense: we must make sure the caller won't see a stale
rdev through sysfs:

t1: remove rdev                 t2:
mutex_lock(reconfig_mutex)
list_add
mutex_unlock(reconfig_mutex)
                                mutex_lock(reconfig_mutex)
                                mutex_unlock(reconfig_mutex)
mutex_lock(delete_mutex)
list_del_init
                                list_empty_careful
                                -> the list is empty now, so return;
                                   the caller will think the rdev is
                                   removed. However, since export_rdev()
                                   has not been called yet, adding this
                                   rdev again will fail.
kobject_del

Holding the mutex is safe, and I think performance should be OK because
removing an rdev is not a hot path (a sketch of what the check could
look like is at the end of this mail).

If we don't want to hold a new mutex in the hot path, perhaps
list_empty_careful() can work with the following change, which removes
the rdev from the list only after its sysfs entries are removed:

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 296885798a2b..84dce5822f91 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -758,8 +758,8 @@ static void md_free_rdev(struct mddev *mddev)
 	mutex_lock(&mddev->delete_mutex);
 	list_for_each_entry_safe(rdev, tmp, &mddev->deleting, same_set) {
-		list_del_init(&rdev->same_set);
 		kobject_del(&rdev->kobj);
+		list_del_init(&rdev->same_set);
 		export_rdev(rdev);
 	}
 	mutex_unlock(&mddev->delete_mutex);

> Please let me know if either change doesn't make sense.
>
> Thanks,
> Song
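To spell out change 2, the caller-side check could look something like
the sketch below. The helper name rdev_delete_pending() is made up for
illustration; 'deleting' and 'delete_mutex' are the names used in the
diff above.

/*
 * md_free_rdev() holds delete_mutex around list_del_init(),
 * kobject_del() and export_rdev(), so under the same mutex the
 * list can never appear empty while a removal is still in flight.
 */
static bool rdev_delete_pending(struct mddev *mddev)
{
	bool pending;

	mutex_lock(&mddev->delete_mutex);
	pending = !list_empty(&mddev->deleting);
	mutex_unlock(&mddev->delete_mutex);

	return pending;
}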