Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755778AbaBGMbs (ORCPT ); Fri, 7 Feb 2014 07:31:48 -0500 Received: from youngberry.canonical.com ([91.189.89.112]:42068 "EHLO youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753356AbaBGLuy (ORCPT ); Fri, 7 Feb 2014 06:50:54 -0500 From: Luis Henriques To: linux-kernel@vger.kernel.org, stable@vger.kernel.org, kernel-team@lists.ubuntu.com Cc: NeilBrown , Luis Henriques Subject: [PATCH 3.11 138/233] md/raid5: fix long-standing problem with bitmap handling on write failure. Date: Fri, 7 Feb 2014 11:45:57 +0000 Message-Id: <1391773652-25214-139-git-send-email-luis.henriques@canonical.com> X-Mailer: git-send-email 1.8.3.2 In-Reply-To: <1391773652-25214-1-git-send-email-luis.henriques@canonical.com> References: <1391773652-25214-1-git-send-email-luis.henriques@canonical.com> X-Extended-Stable: 3.11 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 3.11.10.4 -stable review patch. If anyone has any objections, please let me know. ------------------ From: NeilBrown commit 9f97e4b128d2ea90a5f5063ea0ee3b0911f4c669 upstream. Before a write starts we set a bit in the write-intent bitmap. When the write completes we clear that bit if the write was successful to all devices. However if the write wasn't fully successful we should not clear the bit. If the faulty drive is subsequently re-added, the fact that the bit is still set ensure that we will re-write the data that is missing. This logic is mediated by the STRIPE_DEGRADED flag - we only clear the bitmap bit when this flag is not set. Currently we correctly set the flag if a write starts when some devices are failed or missing. But we do *not* set the flag if some device failed during the write attempt. This is wrong and can result in clearing the bit inappropriately. So: set the flag when a write fails. This bug has been present since bitmaps were introduces, so the fix is suitable for any -stable kernel. Reported-by: Ethan Wilson Signed-off-by: NeilBrown Signed-off-by: Luis Henriques --- drivers/md/raid5.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index dce8be8..a40b969 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -1893,6 +1893,7 @@ static void raid5_end_write_request(struct bio *bi, int error) set_bit(R5_MadeGoodRepl, &sh->dev[i].flags); } else { if (!uptodate) { + set_bit(STRIPE_DEGRADED, &sh->state); set_bit(WriteErrorSeen, &rdev->flags); set_bit(R5_WriteError, &sh->dev[i].flags); if (!test_and_set_bit(WantReplacement, &rdev->flags)) -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/