From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Liu Bo, David Sterba, Sasha Levin
Subject: [PATCH 4.4 038/268] Btrfs: fix scrub to repair raid6 corruption
Date: Mon, 28 May 2018 12:00:12 +0200
Message-Id: <20180528100206.374694208@linuxfoundation.org>
X-Mailer: git-send-email 2.17.0
In-Reply-To: <20180528100202.045206534@linuxfoundation.org>
References: <20180528100202.045206534@linuxfoundation.org>
User-Agent: quilt/0.65
X-stable: review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Liu Bo

[ Upstream commit 762221f095e3932669093466aaf4b85ed9ad2ac1 ]

The raid6 corruption in question is as follows.  Suppose that all disks
can be read without problems, but the content that was read out doesn't
match its checksum.  Currently, for raid6, btrfs retries at most twice:

 - the 1st retry rebuilds with all other stripes, which eventually is a
   raid5 xor rebuild,
 - if the 1st fails, the 2nd retry deliberately fails parity p so that
   it does a raid6 style rebuild.

However, chances are that another non-parity stripe also has corrupted
content, so the above retries are not able to return correct content.

We've fixed normal reads to rebuild raid6 correctly with more retries in
the patch "Btrfs: make raid6 rebuild retry more"[1]; this patch fixes
scrub to do exactly the same rebuild process.

[1]: https://patchwork.kernel.org/patch/10091755/

Signed-off-by: Liu Bo
Signed-off-by: David Sterba
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
---
 fs/btrfs/raid56.c  |   18 ++++++++++++++----
 fs/btrfs/volumes.c |    9 ++++++++-
 2 files changed, 22 insertions(+), 5 deletions(-)

--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -2160,11 +2160,21 @@ int raid56_parity_recover(struct btrfs_r
 	}
 
 	/*
-	 * reconstruct from the q stripe if they are
-	 * asking for mirror 3
+	 * Loop retry:
+	 * for 'mirror == 2', reconstruct from all other stripes.
+	 * for 'mirror_num > 2', select a stripe to fail on every retry.
 	 */
-	if (mirror_num == 3)
-		rbio->failb = rbio->real_stripes - 2;
+	if (mirror_num > 2) {
+		/*
+		 * 'mirror == 3' is to fail the p stripe and
+		 * reconstruct from the q stripe.  'mirror > 3' is to
+		 * fail a data stripe and reconstruct from p+q stripe.
+		 */
+		rbio->failb = rbio->real_stripes - (mirror_num - 1);
+		ASSERT(rbio->failb > 0);
+		if (rbio->failb <= rbio->faila)
+			rbio->failb--;
+	}
 
 	ret = lock_stripe_add(rbio);
 
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5056,7 +5056,14 @@ int btrfs_num_copies(struct btrfs_fs_inf
 	else if (map->type & BTRFS_BLOCK_GROUP_RAID5)
 		ret = 2;
 	else if (map->type & BTRFS_BLOCK_GROUP_RAID6)
-		ret = 3;
+		/*
+		 * There could be two corrupted data stripes, we need
+		 * to loop retry in order to rebuild the correct data.
+		 *
+		 * Fail a stripe at a time on every retry except the
+		 * stripe under reconstruction.
+		 */
+		ret = map->num_stripes;
 	else
 		ret = 1;
 	free_extent_map(em);
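
For review purposes, the stripe selection done by the raid56.c hunk can be modelled outside the kernel. The following is only a minimal sketch under stated assumptions: pick_failb() and its plain-int parameters are hypothetical names that do not exist in btrfs, the stripe layout is assumed to be data stripes first, then p, then q, and -1 is used here to mean "no second stripe is failed".

```c
#include <assert.h>

/*
 * Hypothetical user-space model of the retry logic in the hunk above.
 * Not kernel API: the name, the parameters and the -1 sentinel are
 * illustrative assumptions.
 *
 * Assumed layout: data stripes 0 .. real_stripes - 3,
 * p at real_stripes - 2, q at real_stripes - 1.
 */
static int pick_failb(int real_stripes, int faila, int mirror_num)
{
	int failb = -1;	/* mirror_num <= 2: nothing extra is failed */

	if (mirror_num > 2) {
		/* mirror 3 fails p; every later mirror steps one stripe down. */
		failb = real_stripes - (mirror_num - 1);
		assert(failb > 0);
		/* Skip over the stripe that is already being rebuilt. */
		if (failb <= faila)
			failb--;
	}
	return failb;
}
```

With 6 stripes (4 data + p + q) and faila == 1, mirrors 3 through 6 fail stripes 4 (p), 3, 2 and 0 in turn, so every stripe other than q and the one under reconstruction is tried once. That matches btrfs_num_copies() returning map->num_stripes in the volumes.c hunk.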