Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S964842AbdCJJdE (ORCPT ); Fri, 10 Mar 2017 04:33:04 -0500 Received: from mail.linuxfoundation.org ([140.211.169.12]:42498 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934071AbdCJJc5 (ORCPT ); Fri, 10 Mar 2017 04:32:57 -0500 From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Heinz Mauelshagen , Mike Snitzer Subject: [PATCH 4.10 060/167] dm raid: fix data corruption on reshape request Date: Fri, 10 Mar 2017 10:08:23 +0100 Message-Id: <20170310084000.562411920@linuxfoundation.org> X-Mailer: git-send-email 2.12.0 In-Reply-To: <20170310083956.767605269@linuxfoundation.org> References: <20170310083956.767605269@linuxfoundation.org> User-Agent: quilt/0.65 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2635 Lines: 70 4.10-stable review patch. If anyone has any objections, please let me know. ------------------ From: Heinz Mauelshagen commit d36a19541fe8f392778ac137d60f9be8dfdd8f9d upstream. The lvm2 sequence to manage dm-raid constructor flags that trigger a rebuild or a reshape is defined as: 1) load table with flags (e.g. rebuild/delta_disks/data_offset) 2) clear out the flags in lvm2 metadata 3) store the lvm2 metadata, reload the table to reset the flags previously established during the initial load (1) -- in order to prevent repeatedly requesting a rebuild or a reshape on activation Currently, loading an inactive table with rebuild/reshape flags specified will cause dm-raid to rebuild/reshape on resume and thus start updating the raid metadata (about the progress). When the second table reload, to reset the flags, occurs the constructor accesses the volatile progress state kept in the raid superblocks. Because the active mapping is still processing the rebuild/reshape, that position will be stale by the time the device is resumed. In the reshape case, this causes data corruption by processing already reshaped stripes again. In the rebuild case, it does _not_ cause data corruption but instead involves superfluous rebuilds. Fix by keeping the raid set frozen during the first resume and then allow the rebuild/reshape during the second resume. Fixes: 9dbd1aa3a ("dm raid: add reshaping support to the target") Signed-off-by: Heinz Mauelshagen Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman --- drivers/md/dm-raid.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -3626,6 +3626,8 @@ static int raid_preresume(struct dm_targ return r; } +#define RESUME_STAY_FROZEN_FLAGS (CTR_FLAG_DELTA_DISKS | CTR_FLAG_DATA_OFFSET) + static void raid_resume(struct dm_target *ti) { struct raid_set *rs = ti->private; @@ -3643,7 +3645,15 @@ static void raid_resume(struct dm_target mddev->ro = 0; mddev->in_sync = 0; - clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery); + /* + * Keep the RAID set frozen if reshape/rebuild flags are set. + * The RAID set is unfrozen once the next table load/resume, + * which clears the reshape/rebuild flags, occurs. + * This ensures that the constructor for the inactive table + * retrieves an up-to-date reshape_position. + */ + if (!(rs->ctr_flags & RESUME_STAY_FROZEN_FLAGS)) + clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery); if (mddev->suspended) mddev_resume(mddev);