Subject: Re: md_raid: mdX_raid6 looping after sync_action "check" to "idle" transition
From: Yu Kuai
To: Dragan Stancevic, Yu Kuai, song@kernel.org
Cc: buczek@molgen.mpg.de, guoqing.jiang@linux.dev, it+raid@molgen.mpg.de, linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, msmith626@gmail.com, "yangerkun@huawei.com", "yukuai (C)"
Date: Wed, 30 Aug 2023 09:36:53 +0800
Message-ID: <0d79d1f9-00e8-93be-3c7c-244030521cd7@huaweicloud.com>
References: <20230822211627.1389410-1-dragan@stancevic.com> <2061b123-6332-1456-e7c3-b713752527fb@stancevic.com> <07d5c7c2-c444-8747-ed6d-ca24231decd8@huaweicloud.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 2023/08/29 4:32, Dragan Stancevic wrote:
> Just a followup on 6.1 testing. I tried reproducing this problem for 5
> days with 6.1.42 kernel without your patches and I was not able to
> reproduce it.
>
> It seems that 6.1 has some other code that prevents this from happening.
>

I see that there are lots of patches for raid456 between 5.10 and 6.1.
However, I remember that I was able to reproduce the deadlock after 6.1, and
it's true that it's not easy to reproduce; see below:

https://lore.kernel.org/linux-raid/e9067438-d713-f5f3-0d3d-9e6b0e9efa0e@huaweicloud.com/

My guess is that the deadlock is harder to reproduce on 6.1 than on 5.10 due
to some changes inside raid456.

By the way, raid10 had a similar deadlock, and it can be fixed the same way,
so it makes sense to backport these patches.

https://lore.kernel.org/r/20230529132037.2124527-5-yukuai1@huaweicloud.com

Thanks,
Kuai

> On 5.10 I can reproduce it within minutes to an hour.
>