From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, NeilBrown, Goldwyn Rodrigues, Shaohua Li
Subject: [PATCH 4.9 066/101] md: fix two problems with setting the "re-add" device state.
Date: Sun, 1 Jul 2018 18:21:52 +0200
Message-Id: <20180701160759.803610690@linuxfoundation.org>
In-Reply-To: <20180701160757.138608453@linuxfoundation.org>
References: <20180701160757.138608453@linuxfoundation.org>
User-Agent: quilt/0.65
X-stable: review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: NeilBrown

commit 011abdc9df559ec75779bb7c53a744c69b2a94c6 upstream.
If "re-add" is written to the "state" file for a device which is
faulty, this has an effect similar to removing and re-adding the
device.  It should take up the same slot in the array that it
previously had, and an accelerated (e.g. bitmap-based) rebuild
should happen.

The slot that "it previously had" is determined by
rdev->saved_raid_disk.  However this is not set when a device fails
(only when a device is added), and it is cleared when resync
completes.  This means that "re-add" will normally work once, but
may not work a second time.

This patch includes two fixes.
1/ when a device fails, record the ->raid_disk value in
   ->saved_raid_disk before clearing ->raid_disk
2/ when "re-add" is written to a device for which
   ->saved_raid_disk is not set, fail.

I think this is suitable for stable as it can cause re-adding a
device to be forced to do a full resync which takes a lot longer and
so puts data at more risk.

Cc: (v4.1)
Fixes: 97f6cd39da22 ("md-cluster: re-add capabilities")
Signed-off-by: NeilBrown
Reviewed-by: Goldwyn Rodrigues
Signed-off-by: Shaohua Li
Signed-off-by: Greg Kroah-Hartman

---
 drivers/md/md.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2694,7 +2694,8 @@ state_store(struct md_rdev *rdev, const
 			err = 0;
 		}
 	} else if (cmd_match(buf, "re-add")) {
-		if (test_bit(Faulty, &rdev->flags) && (rdev->raid_disk == -1)) {
+		if (test_bit(Faulty, &rdev->flags) && (rdev->raid_disk == -1) &&
+		    rdev->saved_raid_disk >= 0) {
 			/* clear_bit is performed _after_ all the devices
 			 * have their local Faulty bit cleared. If any writes
 			 * happen in the meantime in the local node, they
@@ -8272,6 +8273,7 @@ static int remove_and_add_spares(struct
 			if (mddev->pers->hot_remove_disk(
 				    mddev, rdev) == 0) {
 				sysfs_unlink_rdev(mddev, rdev);
+				rdev->saved_raid_disk = rdev->raid_disk;
 				rdev->raid_disk = -1;
 				removed++;
 			}