Message-ID: <951ee1ca-88e9-3c62-fbf9-a147451b443e@deltatee.com>
Date: Tue, 24 May 2022 09:45:33 -0600
From: Logan Gunthorpe
To: Christoph Hellwig
Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
 Song Liu, Guoqing Jiang, Xiao Ni, Stephen Bates, Martin Oliveira,
 David Sloan
Subject: Re: [PATCH v1 14/15] md: Ensure resync is reported after it starts
References: <20220519191311.17119-1-logang@deltatee.com>
 <20220519191311.17119-15-logang@deltatee.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2022-05-21 05:51, Christoph Hellwig wrote:
> On Thu, May 19, 2022 at 01:13:10PM -0600, Logan Gunthorpe wrote:
>> The 07layouts test in mdadm fails on some systems.
>> The failure presents itself as the backup file not being removed
>> before the next layout is grown into:
>>
>>   mdadm: /dev/md0: cannot create backup file /tmp/md-test-backup:
>>       File exists
>>
>> This is because the background mdadm process, which is responsible for
>> cleaning up this backup file gets into an infinite loop waiting for
>> the reshape to start. mdadm checks the mdstat file if a reshape is
>> going and, if it is not, it waits for an event on the file or times
>> out in 5 seconds. On faster machines, the reshape may complete before
>> the 5 seconds times out, and thus the background mdadm process loops
>> waiting for a reshape to start that has already occurred.
>>
>> mdadm reads the mdstat file to start, but mdstat does not report that the
>> reshape has begun, even though it has indeed begun. So the mdstat_wait()
>> call (in mdadm) which polls on the mdstat file won't ever return until
>> timing out.
>>
>> The reason mdstat reports the reshape has started is due to an issue
>> in status_resync(). recovery_active is subtracted from curr_resync which
>> will result in a value of zero for the first chunk of reshaped data, and
>> the resulting read will report no reshape in progress.
>>
>> To fix this, if "resync - recovery_active" is zero: force the value to
>> be 4 so the code reports a resync in progress.
>>
>> Signed-off-by: Logan Gunthorpe
>> ---
>>  drivers/md/md.c | 12 ++++++++++--
>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index 8273ac5eef06..dbac63c8e35c 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -8022,10 +8022,18 @@ static int status_resync(struct seq_file *seq, struct mddev *mddev)
>>  		if (test_bit(MD_RECOVERY_DONE, &mddev->recovery))
>>  			/* Still cleaning up */
>>  			resync = max_sectors;
>> -	} else if (resync > max_sectors)
>> +	} else if (resync > max_sectors) {
>>  		resync = max_sectors;
>> -	else
>> +	} else {
>>  		resync -= atomic_read(&mddev->recovery_active);
>> +		if (!resync) {
>> +			/*
>> +			 * Resync has started, but if it's zero, ensure
>> +			 * it is still reported, by forcing it to be 4
>> +			 */
>> +			resync = 4;
>
> Where does this magic 4 come from?

There are a bunch of existing magic numbers in this code. Just before
this hunk there's an if (resync <= 3) and just after the code there's an
if (resync < 3). There's a comment in md_do_sync() indicating overloaded
values for 1 and 2.

I can try and turn this into an enum for v2.

Logan
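
For illustration only, here is a rough, standalone sketch of what replacing the magic numbers with an enum might look like. It is not the kernel code and not necessarily what v2 will contain. All names below (resync_state, RESYNC_*, reported_resync) are hypothetical, the meanings of the values 1 and 2 are deliberately left unnamed because this thread does not spell them out, and treating 3 as the "active" threshold is an assumption based on the comparisons mentioned above. The function models only the last two branches of the quoted hunk and assumes recovery_active never exceeds curr_resync.

```c
#include <stdint.h>

/*
 * Hypothetical names for the overloaded small values. The thread only
 * says that 1 and 2 are overloaded and that the surrounding code
 * compares against 3, so the meanings here are assumptions.
 */
enum resync_state {
	RESYNC_NONE         = 0, /* reads back as "no resync in progress" */
	RESYNC_OVERLOADED_1 = 1, /* special value, meaning not stated here */
	RESYNC_OVERLOADED_2 = 2, /* special value, meaning not stated here */
	RESYNC_ACTIVE       = 3, /* assumed threshold used by the checks */
	RESYNC_REPORT_FLOOR = 4, /* the "magic 4" from the patch */
};

/*
 * Simplified model of the final else-if/else branches in the hunk:
 * clamp to max_sectors, otherwise subtract the in-flight amount and
 * never let a started resync read back as zero.
 * Assumes recovery_active <= curr_resync (no unsigned underflow).
 */
uint64_t reported_resync(uint64_t curr_resync, uint64_t max_sectors,
			 uint64_t recovery_active)
{
	uint64_t resync = curr_resync;

	if (resync > max_sectors)
		return max_sectors;

	resync -= recovery_active;
	if (resync == RESYNC_NONE)
		resync = RESYNC_REPORT_FLOOR;

	return resync;
}
```

With this shape, the case from the commit message (everything submitted so far is still in flight, so the subtraction yields zero) returns RESYNC_REPORT_FLOOR, which is above both the <= 3 and < 3 comparisons mentioned above.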