From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, NeilBrown, Goldwyn Rodrigues, Shaohua Li
Subject: [PATCH 4.4 072/105] md: fix two problems with setting the "re-add" device state.
Date: Sun, 1 Jul 2018 18:02:22 +0200
Message-Id: <20180701153154.669895647@linuxfoundation.org>
In-Reply-To: <20180701153149.382300170@linuxfoundation.org>
References: <20180701153149.382300170@linuxfoundation.org>
X-Mailer: git-send-email 2.18.0
User-Agent: quilt/0.65
X-stable: review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: NeilBrown

commit 011abdc9df559ec75779bb7c53a744c69b2a94c6 upstream.
If "re-add" is written to the "state" file for a device
which is faulty, this has an effect similar to removing
and re-adding the device.  It should take up the same slot
in the array that it previously had, and an accelerated
(e.g. bitmap-based) rebuild should happen.

The slot that "it previously had" is determined by
rdev->saved_raid_disk.  However this is not set when a device
fails (only when a device is added), and it is cleared when
resync completes.  This means that "re-add" will normally work
once, but may not work a second time.

This patch includes two fixes.
1/ when a device fails, record the ->raid_disk value in
   ->saved_raid_disk before clearing ->raid_disk
2/ when "re-add" is written to a device for which
   ->saved_raid_disk is not set, fail.

I think this is suitable for stable as it can cause re-adding
a device to be forced to do a full resync which takes a lot
longer and so puts data at more risk.

Cc: stable@vger.kernel.org (v4.1)
Fixes: 97f6cd39da22 ("md-cluster: re-add capabilities")
Signed-off-by: NeilBrown
Reviewed-by: Goldwyn Rodrigues
Signed-off-by: Shaohua Li
Signed-off-by: Greg Kroah-Hartman

---
 drivers/md/md.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2690,7 +2690,8 @@ state_store(struct md_rdev *rdev, const
 			err = 0;
 		}
 	} else if (cmd_match(buf, "re-add")) {
-		if (test_bit(Faulty, &rdev->flags) && (rdev->raid_disk == -1)) {
+		if (test_bit(Faulty, &rdev->flags) && (rdev->raid_disk == -1) &&
+			rdev->saved_raid_disk >= 0) {
 			/* clear_bit is performed _after_ all the devices
 			 * have their local Faulty bit cleared. If any writes
 			 * happen in the meantime in the local node, they
@@ -8153,6 +8154,7 @@ static int remove_and_add_spares(struct
 			if (mddev->pers->hot_remove_disk(
 				    mddev, rdev) == 0) {
 				sysfs_unlink_rdev(mddev, rdev);
+				rdev->saved_raid_disk = rdev->raid_disk;
 				rdev->raid_disk = -1;
 				removed++;
 			}