From: Logan Gunthorpe
To: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, Song Liu
Cc: Christoph Hellwig, Guoqing Jiang, Xiao Ni, Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe
Date: Thu, 19 May 2022 13:13:11 -0600
Message-Id: <20220519191311.17119-16-logang@deltatee.com>
In-Reply-To: <20220519191311.17119-1-logang@deltatee.com>
References: <20220519191311.17119-1-logang@deltatee.com>
Subject: [PATCH v1 15/15] md: Notify sysfs sync_completed in md_reap_sync_thread()

The mdadm test 07layouts randomly produces a kernel hung task deadlock.
The deadlock is caused by the suspend_lo/suspend_hi files being set by
the mdadm background process during reshape and not being cleared
because the process hangs. (Leaving aside the issue of the fragility of
freezing kernel tasks by buggy userspace processes...)

When the background mdadm process hangs, it is waiting (without a
timeout) on a change to the sync_completed file signalling that the
reshape has completed. The process is woken up a couple of times when
the reshape finishes, but it is woken up before MD_RECOVERY_RUNNING is
cleared, so sync_completed_show() reports 0 instead of "none".

To fix this, notify the sysfs file in md_reap_sync_thread() after
MD_RECOVERY_RUNNING has been cleared. This wakes up mdadm and causes it
to continue and write to suspend_lo/suspend_hi to allow IO to continue.

Signed-off-by: Logan Gunthorpe
---
 drivers/md/md.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index dbac63c8e35c..54108de46679 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9478,6 +9478,7 @@ void md_reap_sync_thread(struct mddev *mddev, bool reconfig_mutex_held)
 	wake_up(&resync_wait);
 	/* flag recovery needed just to double check */
 	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+	sysfs_notify_dirent_safe(mddev->sysfs_completed);
 	sysfs_notify_dirent_safe(mddev->sysfs_action);
 	md_new_event();
 	if (mddev->event_work.func)
-- 
2.30.2
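
For reference, the userspace side of the race described above can be
sketched as follows. This is not mdadm's code; it is a minimal
illustration, under stated assumptions, of how a process might wait on a
sysfs attribute such as sync_completed with a bounded timeout instead of
blocking forever. The helper names (attr_is_none, wait_for_notify,
read_attr, wait_until_none) and the retry budget are hypothetical. The
sketch assumes the attribute reads "none" once the sync thread is done,
as the commit message says, and relies on sysfs reporting a
notification to poll() as POLLPRI|POLLERR.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: 1 if the attribute text is "none" (an optional
 * trailing newline is ignored), 0 otherwise. */
int attr_is_none(const char *buf)
{
	assert(buf != NULL);
	size_t len = strcspn(buf, "\n");
	return len == 4 && strncmp(buf, "none", 4) == 0;
}

/* Wait for a notification on fd for at most timeout_ms milliseconds.
 * Only POLLPRI|POLLERR are requested, since sysfs files always report
 * themselves as readable.
 * Returns 1 if notified, 0 on timeout, -1 on error. */
int wait_for_notify(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI | POLLERR };
	int ret;

	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);

	if (ret < 0)
		return -1;
	return ret > 0 ? 1 : 0;
}

/* Re-read the attribute from the start; a sysfs file has to be read
 * again (after seeking to 0) before the next poll() can block. */
int read_attr(int fd, char *buf, size_t size)
{
	ssize_t n;

	if (size == 0 || lseek(fd, 0, SEEK_SET) < 0)
		return -1;
	n = read(fd, buf, size - 1);
	if (n < 0)
		return -1;
	buf[n] = '\0';
	return 0;
}

/* Returns 1 once the attribute reads "none", 0 if max_wakeups rounds
 * pass without seeing it, -1 on error. A wakeup that arrives before
 * the value changes (the race in this patch) just costs one round. */
int wait_until_none(int fd, int timeout_ms, int max_wakeups)
{
	char buf[64];

	for (int i = 0; i < max_wakeups; i++) {
		if (read_attr(fd, buf, sizeof(buf)) < 0)
			return -1;
		if (attr_is_none(buf))
			return 1;
		if (wait_for_notify(fd, timeout_ms) < 0)
			return -1;
	}
	return 0;
}
```

A caller would open the array's sync_completed attribute read-only and
call wait_until_none() with a timeout of its choosing. With a timeout, a
missed notification delays the waiter rather than hanging it, though the
kernel-side notification added by this patch is still what makes the
wakeup prompt.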