Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S934858AbcCPIaQ (ORCPT); Wed, 16 Mar 2016 04:30:16 -0400
Received: from mail.kernel.org ([198.145.29.136]:59989 "EHLO mail.kernel.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S966001AbcCPIKc (ORCPT); Wed, 16 Mar 2016 04:10:32 -0400
From: lizf@kernel.org
To: stable@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, NeilBrown, Zefan Li
Subject: [PATCH 3.4 057/107] md/raid1: extend spinlock to protect raid1_end_read_request against inconsistencies
Date: Wed, 16 Mar 2016 16:05:51 +0800
Message-Id: <1458115601-5762-57-git-send-email-lizf@kernel.org>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1458115541-5712-1-git-send-email-lizf@kernel.org>
References: <1458115541-5712-1-git-send-email-lizf@kernel.org>
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

From: NeilBrown

3.4.111-rc1 review patch.  If anyone has any objections, please let me know.

------------------


commit 423f04d63cf421ea436bcc5be02543d549ce4b28 upstream.

raid1_end_read_request() assumes that the In_sync bits are consistent
with the ->degraded count.
raid1_spare_active updates the In_sync bit before the ->degraded count
and so exposes an inconsistency, as does error().
So extend the spinlock in raid1_spare_active() and error() to hide those
inconsistencies.

This should probably be part of commit 34cab6f42003 ("md/raid1: fix test
for 'was read error from last working device'."), as it addresses the
same issue.  It fixes the same bug and should go to -stable for the same
reasons.

Fixes: 76073054c95b ("md/raid1: clean up read_balance.")
Signed-off-by: NeilBrown
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li
---
 drivers/md/raid1.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 27af2f3..189eedb 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1250,6 +1250,7 @@ static void error(struct mddev *mddev, struct md_rdev *rdev)
 {
 	char b[BDEVNAME_SIZE];
 	struct r1conf *conf = mddev->private;
+	unsigned long flags;
 
 	/*
 	 * If it is not operational, then we have already marked it as dead
@@ -1269,6 +1270,7 @@ static void error(struct mddev *mddev, struct md_rdev *rdev)
 		return;
 	}
 	set_bit(Blocked, &rdev->flags);
+	spin_lock_irqsave(&conf->device_lock, flags);
 	if (test_and_clear_bit(In_sync, &rdev->flags)) {
 		unsigned long flags;
 		spin_lock_irqsave(&conf->device_lock, flags);
@@ -1281,6 +1283,7 @@ static void error(struct mddev *mddev, struct md_rdev *rdev)
 		set_bit(MD_RECOVERY_INTR, &mddev->recovery);
 	} else
 		set_bit(Faulty, &rdev->flags);
+	spin_unlock_irqrestore(&conf->device_lock, flags);
 	set_bit(MD_CHANGE_DEVS, &mddev->flags);
 	printk(KERN_ALERT
 	       "md/raid1:%s: Disk failure on %s, disabling device.\n"
@@ -1334,7 +1337,10 @@ static int raid1_spare_active(struct mddev *mddev)
 	 * Find all failed disks within the RAID1 configuration
	 * and mark them readable.
	 * Called under mddev lock, so rcu protection not needed.
+	 * device_lock used to avoid races with raid1_end_read_request
+	 * which expects 'In_sync' flags and ->degraded to be consistent.
 	 */
+	spin_lock_irqsave(&conf->device_lock, flags);
 	for (i = 0; i < conf->raid_disks; i++) {
 		struct md_rdev *rdev = conf->mirrors[i].rdev;
 		struct md_rdev *repl = conf->mirrors[conf->raid_disks + i].rdev;
@@ -1364,7 +1370,6 @@ static int raid1_spare_active(struct mddev *mddev)
 			sysfs_notify_dirent_safe(rdev->sysfs_state);
 		}
 	}
-	spin_lock_irqsave(&conf->device_lock, flags);
 	mddev->degraded -= count;
 	spin_unlock_irqrestore(&conf->device_lock, flags);
 
-- 
1.9.1
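
For context on the race this patch closes, the sketch below shows the two
sides of conf->device_lock that must stay consistent.  It is a simplified
paraphrase, not the literal 3.4 source: the reader side follows the check
added upstream by commit 34cab6f42003 ("md/raid1: fix test for 'was read
error from last working device'."), the writer side follows the upstream
form of this patch, and names such as conf, mddev, rdev, uptodate and
flags are shorthand for the fields raid1.c actually uses.

	/*
	 * Reader side: raid1_end_read_request() deciding whether a failed
	 * read came from the last working device.  ->degraded and the
	 * In_sync bit are sampled inside one critical section, so any
	 * writer must update both under the same lock.
	 */
	spin_lock_irqsave(&conf->device_lock, flags);
	if (mddev->degraded == conf->raid_disks ||
	    (mddev->degraded == conf->raid_disks - 1 &&
	     test_bit(In_sync, &rdev->flags)))
		uptodate = 1;	/* don't retry: no better mirror is left */
	spin_unlock_irqrestore(&conf->device_lock, flags);

	/*
	 * Writer side: the shape this patch gives error() upstream.  The
	 * In_sync flip and the ->degraded bump sit inside one critical
	 * section, so the reader above can never observe In_sync cleared
	 * while ->degraded still holds the old value.
	 */
	spin_lock_irqsave(&conf->device_lock, flags);
	if (test_and_clear_bit(In_sync, &rdev->flags)) {
		mddev->degraded++;
		set_bit(Faulty, &rdev->flags);
	} else
		set_bit(Faulty, &rdev->flags);
	spin_unlock_irqrestore(&conf->device_lock, flags);

raid1_spare_active() is the mirror image of the writer side: it sets
In_sync and then decrements ->degraded, and taking device_lock before the
loop (as the hunk above does) puts both updates inside one critical
section.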