Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753614AbdHVTUb (ORCPT ); Tue, 22 Aug 2017 15:20:31 -0400
Received: from mail.linuxfoundation.org ([140.211.169.12]:44598 "EHLO
	mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753476AbdHVTOt (ORCPT );
	Tue, 22 Aug 2017 15:14:49 -0400
From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Dominik Brodowski,
	David R, NeilBrown, Shaohua Li
Subject: [PATCH 4.12 10/41] md: always clear ->safemode when md_check_recovery gets the mddev lock.
Date: Tue, 22 Aug 2017 12:13:38 -0700
Message-Id: <20170822190942.325317697@linuxfoundation.org>
X-Mailer: git-send-email 2.14.1
In-Reply-To: <20170822190941.918296529@linuxfoundation.org>
References: <20170822190941.918296529@linuxfoundation.org>
User-Agent: quilt/0.65
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1931
Lines: 56

4.12-stable review patch.  If anyone has any objections, please let me know.

------------------

From: NeilBrown

commit 33182d15c6bf182f7ae32a66ea4a547d979cd6d7 upstream.

If ->safemode == 1, md_check_recovery() will try to get the mddev lock
and perform various other checks.
If mddev->in_sync is zero, it will call set_in_sync, and clear
->safemode.  However if mddev->in_sync is not zero, ->safemode will not
be cleared.

When md_check_recovery() drops the mddev lock, the thread is woken
up again.  Normally it would just check if there was anything else to
do, find nothing, and go to sleep.  However as ->safemode was not
cleared, it will take the mddev lock again, then wake itself up
when unlocking.

This results in an infinite loop, repeatedly calling
md_check_recovery(), which RCU or the soft-lockup detector
will eventually complain about.

Prior to commit 4ad23a976413 ("MD: use per-cpu counter for
writes_pending"), safemode would only be set to one when the
writes_pending counter reached zero, and would be cleared again
when writes_pending is incremented.  Since that patch, safemode
is set more freely, but is not reliably cleared.

So in md_check_recovery() clear ->safemode before checking ->in_sync.

Fixes: 4ad23a976413 ("MD: use per-cpu counter for writes_pending")
Reported-by: Dominik Brodowski
Reported-by: David R
Signed-off-by: NeilBrown
Signed-off-by: Shaohua Li
Signed-off-by: Greg Kroah-Hartman

---
 drivers/md/md.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8639,6 +8639,9 @@ void md_check_recovery(struct mddev *mdd
 	if (mddev_trylock(mddev)) {
 		int spares = 0;
 
+		if (mddev->safemode == 1)
+			mddev->safemode = 0;
+
 		if (mddev->ro) {
 			struct md_rdev *rdev;
 			if (!mddev->external && mddev->in_sync)
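
For readers who want to see the wake-up loop from the changelog in
isolation, here is a rough user-space sketch.  It is not kernel code:
struct model_mddev, model_check_recovery() and model_run_thread() are
hypothetical names invented for this illustration, locking is reduced
to a comment, and "unlock wakes the thread" is modelled by setting a
flag.  Only the ->safemode / ->in_sync behaviour described above is
represented.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few mddev fields the changelog discusses. */
struct model_mddev {
	int safemode;
	int in_sync;
	bool thread_woken;	/* models the md thread having been woken */
};

/*
 * One modelled pass of md_check_recovery().  Returns true if the
 * (imaginary) mddev lock was taken during this pass.
 */
static bool model_check_recovery(struct model_mddev *m, bool with_fix)
{
	if (m->safemode != 1)
		return false;		/* nothing to do: thread may sleep */

	/* mddev lock is considered held from here on */

	if (with_fix && m->safemode == 1)
		m->safemode = 0;	/* the three lines added by the patch */

	if (!m->in_sync) {
		m->in_sync = 1;		/* stands in for set_in_sync() */
		m->safemode = 0;
	}

	/* dropping the lock wakes the thread again */
	m->thread_woken = true;
	return true;
}

/*
 * Run the modelled thread until it goes to sleep or max_passes is
 * reached.  Returns the number of passes performed.
 */
static int model_run_thread(struct model_mddev *m, bool with_fix,
			    int max_passes)
{
	int passes = 0;

	m->thread_woken = true;
	while (m->thread_woken && passes < max_passes) {
		m->thread_woken = false;
		model_check_recovery(m, with_fix);
		passes++;
	}
	return passes;
}
```

In this model, starting from safemode == 1 and in_sync == 1, the
unpatched variant (with_fix == false) only stops when it hits
max_passes, which corresponds to the loop that RCU or the soft-lockup
detector reports.  With with_fix == true the thread settles after two
passes: one that clears ->safemode and wakes itself on unlock, and one
that finds nothing to do.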