Received: by 2002:a05:7412:31a9:b0:e2:908c:2ebd with SMTP id et41csp4019782rdb; Thu, 14 Sep 2023 09:22:30 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFOkMPhWhOo8Xhwuk+F2Qnekg2nE3huETW/yEzW4w38p5rIeJHZtM2XqrezQSxgaBwV6pOe X-Received: by 2002:a05:6a20:8e19:b0:111:a0e5:d2b7 with SMTP id y25-20020a056a208e1900b00111a0e5d2b7mr5605121pzj.4.1694708549933; Thu, 14 Sep 2023 09:22:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1694708549; cv=none; d=google.com; s=arc-20160816; b=jedtFiQPLwMKpgSVoaXwHHdl1BGdI1wRsXcJ/NEsTs6axQYFjwnOnnFCmBM7A2naP7 TlSKKADkxKdRnbBOm8jAabjT7Vyh+2HRHD4nukJP964TNpHfZTXvIVLZY5pmevUb0CL6 Eeb2huREKXnSO59AvTmtefVVR3o0iy00RZyO3nSe4+Ja1G6Z7gu4vHgFwVoiYxssEBVd hoeyOwuHytsHXW/Oet1yNvTYQBm/vun8IO9tybSarnokPRI4PvdWPvA4v3tzhIGdW//L ZpjU3mJ7T1nyziTv8d/feQhu2gheNdFikcTHS2KvfKGwHVLIz/Sc76h0QcSVus0x2fha 3ENQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:in-reply-to:references:message-id :content-transfer-encoding:mime-version:subject:date:from :wdcironportexception:ironport-sdr:ironport-sdr:dkim-signature; bh=8y1EeDaMR+GSyK/kgCi4ehz+VjxbFJjveJx4bL89Qjs=; fh=bp3vMIo1qVzFCHckMTFYF7uCNt+DyLY9wZgFAQuWS8M=; b=Elgwzl+CN9PHHYomYJ2aKFuCi9YIuwDUS6FNhpeO01gAHG9Vao98wyBKQsm3Vs8QqL zTMmTdfN3DOKLusTCx1eCoa6pVV8WClsLuNR+Jfk36SN6wFgB8UlKB10slBq2WnxiwAP uxcwRXYYMaDHLQpJaHnSNzX0sJTJ1VrIz3FhX4lpKWf/ZXrKd3X5R3vXFMfV9q2jhxOY /uyaagpPkDEsa9jPM6RKjddLnl3KDD7CAeiT3lFqEFosP4KwzV3Wq2t6ui6BvYLHj1Hh MHGYFFBNKz481W5snhwkRahqWIA5foHPNIsoU6yXo0KbuuQV5uAvS79BBKbGe8O0VdJ5 iD3A== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@wdc.com header.s=dkim.wdc.com header.b=jmbVSmtJ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.34 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=wdc.com Return-Path: Received: from howler.vger.email (howler.vger.email. [23.128.96.34]) by mx.google.com with ESMTPS id t186-20020a6381c3000000b00573f6ee58fasi1747077pgd.376.2023.09.14.09.22.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Sep 2023 09:22:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.34 as permitted sender) client-ip=23.128.96.34; Authentication-Results: mx.google.com; dkim=fail header.i=@wdc.com header.s=dkim.wdc.com header.b=jmbVSmtJ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.34 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=wdc.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by howler.vger.email (Postfix) with ESMTP id D7121870657C; Thu, 14 Sep 2023 09:07:38 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at howler.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241847AbjINQHj (ORCPT + 99 others); Thu, 14 Sep 2023 12:07:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57866 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241842AbjINQHR (ORCPT ); Thu, 14 Sep 2023 12:07:17 -0400 Received: from esa3.hgst.iphmx.com (esa3.hgst.iphmx.com [216.71.153.141]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AF0C71FC0; Thu, 14 Sep 2023 09:07:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1694707633; x=1726243633; h=from:date:subject:mime-version:content-transfer-encoding: message-id:references:in-reply-to:to:cc; bh=d42cqIbxUEKquSGYYk3w7AaeNWr6XkbxX7jX+1lrcSo=; b=jmbVSmtJtHnOCE8OJ/jJIZIpeQA1y7J+rmnkzVDJDkR2ul4vkMx8p1bj +gVqr4fFXnA3WmfhHYdrEkHPXTX2XP8STblejG1TcS0QWOtt17nYkU0s8 jaIJEPLom/MzPj86iZySvkzl47/E55U6io2UHKbmdANnbMq+YgFkPcjYx skVqXVfN+p1vv4LlF2ohegDCRS3k93PohFV520pzsw+e8E/lQd46RJQJk Xt21rdB5LhTdViJflWoU/scU2cOQW8MP5B9AL16f03u5UyMnJqly2y9Xg VsBEz02gJEgQP4sm5QDR5L5QBEOyAWZteMcM98nq+vOaj+E/4FFOdXPOv A==; X-CSE-ConnectionGUID: 0uOTb0vlSEKK3KzsROHjqQ== X-CSE-MsgGUID: A/zWuJGlQEKHl+e3YSgVqA== X-IronPort-AV: E=Sophos;i="6.02,146,1688400000"; d="scan'208";a="248490548" Received: from uls-op-cesaip02.wdc.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 15 Sep 2023 00:07:13 +0800 IronPort-SDR: tE8UV3isZTCeH8JBePxSF48I6xApbZ1v5lROxiXoV2WOJP/aiaWfa5CqJfPIyziyZ3el4PE1Ay 7ugOjKT2wfWKIuO9liiCJgKyAMheujaGRsGDVCLqWUNigyNB9Lsxm3wzXPFWuuCMvJHxfHW5Am tX4FY+CCW5DDNbEgT1yFiUpJEZLIpJ6srA6AQAuVG6YVwdfqBAJNpwUHZ2KPsukKS34wwCJbLY pEw+Y8pqDq3MQKDYZi6Ij/S0F+Fcs/m4oUX0srj1zKm8E/nkvq1tnG4MTFkFe1lSdKhT8hk4Y6 vyA= Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 14 Sep 2023 08:14:14 -0700 IronPort-SDR: +CvEQs5BsLVIGqrJTcYE4ynxuB6mAsOW32+/ktK2G+qnaqNoOsaHOCA1meZtH54DhD6PD7ua2y LxE80qHHk9E5wJ9udvAndm7GJibMTl3flaUMJhhG5GzI0udbWToFcw7wLpXuuVmmf6B7Ims6Cc qqr+t3O0YLJGBAfXchgnhc7O1VYxvKSFqBeL+axmn5NdQwNuGajdUtVgn3CF4TbYM9viKO0adv fvcoUk4RnNMdH5T8Qt7HR/jkbtyTa/e8IEbyda12y71VCsXkcN9WYxfPF1cDW9jUWON9FrAS2K 8oQ= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.6]) by uls-op-cesaip01.wdc.com with ESMTP; 14 Sep 2023 09:07:13 -0700 From: Johannes Thumshirn Date: Thu, 14 Sep 2023 09:07:02 -0700 Subject: [PATCH v9 07/11] btrfs: zoned: allow zoned RAID MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Message-Id: <20230914-raid-stripe-tree-v9-7-15d423829637@wdc.com> References: <20230914-raid-stripe-tree-v9-0-15d423829637@wdc.com> In-Reply-To: <20230914-raid-stripe-tree-v9-0-15d423829637@wdc.com> To: Chris Mason , Josef Bacik , David Sterba Cc: Christoph Hellwig , Naohiro Aota , Qu Wenruo , Damien Le Moal , linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org, Johannes Thumshirn X-Mailer: b4 0.12.3 X-Developer-Signature: v=1; a=ed25519-sha256; t=1694707621; l=7309; i=johannes.thumshirn@wdc.com; s=20230613; h=from:subject:message-id; bh=d42cqIbxUEKquSGYYk3w7AaeNWr6XkbxX7jX+1lrcSo=; b=4/dBOqQnXE+SB9BAc5TTUm8Hj74OPgfEK2eZ0jSSAYeQEGSnsoYpo5DiQkBY1JNdG6XLmyNsX xALtVkIJA/PCryIF+haERYDDbxiG6ErT2y5g3+Zd4xoSRQBYR+fVcKd X-Developer-Key: i=johannes.thumshirn@wdc.com; a=ed25519; pk=TGmHKs78FdPi+QhrViEvjKIGwReUGCfa+3LEnGoR2KM= Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (howler.vger.email [0.0.0.0]); Thu, 14 Sep 2023 09:07:39 -0700 (PDT) When we have a raid-stripe-tree, we can do RAID0/1/10 on zoned devices for data block-groups. For meta-data block-groups, we don't actually need anything special, as all meta-data I/O is protected by the btrfs_zoned_meta_io_lock() already. Signed-off-by: Johannes Thumshirn --- fs/btrfs/raid-stripe-tree.h | 7 ++- fs/btrfs/volumes.c | 2 + fs/btrfs/zoned.c | 144 ++++++++++++++++++++++++++++++++++++++++++-- 3 files changed, 148 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h index 5d9629a815c1..f31292ab9030 100644 --- a/fs/btrfs/raid-stripe-tree.h +++ b/fs/btrfs/raid-stripe-tree.h @@ -6,6 +6,11 @@ #ifndef BTRFS_RAID_STRIPE_TREE_H #define BTRFS_RAID_STRIPE_TREE_H +#define BTRFS_RST_SUPP_BLOCK_GROUP_MASK (BTRFS_BLOCK_GROUP_DUP |\ + BTRFS_BLOCK_GROUP_RAID1_MASK |\ + BTRFS_BLOCK_GROUP_RAID0 |\ + BTRFS_BLOCK_GROUP_RAID10) + struct btrfs_io_context; struct btrfs_io_stripe; struct btrfs_ordered_extent; @@ -32,7 +37,7 @@ static inline bool btrfs_need_stripe_tree_update(struct btrfs_fs_info *fs_info, if (type != BTRFS_BLOCK_GROUP_DATA) return false; - if (profile & BTRFS_BLOCK_GROUP_RAID1_MASK) + if (profile & BTRFS_RST_SUPP_BLOCK_GROUP_MASK) return true; return false; diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 2326dbcf85f6..dc311e38eb11 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6541,6 +6541,8 @@ int btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, * I/O context structure. */ if (smap && num_alloc_stripes == 1 && + !(btrfs_need_stripe_tree_update(fs_info, map->type) && + op != BTRFS_MAP_READ) && !((map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) && mirror_num > 1)) { ret = set_io_stripe(fs_info, op, logical, length, smap, map, stripe_index, stripe_offset, stripe_nr); diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c index d05510cb2cb2..ce2846c944d2 100644 --- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -1397,9 +1397,11 @@ static int btrfs_load_block_group_dup(struct btrfs_block_group *bg, struct zone_info *zone_info, unsigned long *active) { - if (map->type & BTRFS_BLOCK_GROUP_DATA) { - btrfs_err(bg->fs_info, - "zoned: profile DUP not yet supported on data bg"); + struct btrfs_fs_info *fs_info = bg->fs_info; + + if (map->type & BTRFS_BLOCK_GROUP_DATA && + !fs_info->stripe_root) { + btrfs_err(fs_info, "zoned: data DUP profile needs stripe_root"); return -EINVAL; } @@ -1433,6 +1435,133 @@ static int btrfs_load_block_group_dup(struct btrfs_block_group *bg, return 0; } +static int btrfs_load_block_group_raid1(struct btrfs_block_group *bg, + struct map_lookup *map, + struct zone_info *zone_info, + unsigned long *active) +{ + struct btrfs_fs_info *fs_info = bg->fs_info; + int i; + + if (map->type & BTRFS_BLOCK_GROUP_DATA && + !fs_info->stripe_root) { + btrfs_err(fs_info, "zoned: data %s needs stripe_root", + btrfs_bg_type_to_raid_name(map->type)); + return -EINVAL; + + } + + for (i = 0; i < map->num_stripes; i++) { + if (zone_info[i].alloc_offset == WP_MISSING_DEV || + zone_info[i].alloc_offset == WP_CONVENTIONAL) + continue; + + if ((zone_info[0].alloc_offset != zone_info[i].alloc_offset) && + !btrfs_test_opt(fs_info, DEGRADED)) { + btrfs_err(fs_info, + "zoned: write pointer offset mismatch of zones in %s profile", + btrfs_bg_type_to_raid_name(map->type)); + return -EIO; + } + if (test_bit(0, active) != test_bit(i, active)) { + if (!btrfs_test_opt(fs_info, DEGRADED) && + !btrfs_zone_activate(bg)) { + return -EIO; + } + } else { + if (test_bit(0, active)) + set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, + &bg->runtime_flags); + } + /* + * In case a device is missing we have a cap of 0, so don't + * use it. + */ + bg->zone_capacity = min_not_zero(zone_info[0].capacity, + zone_info[1].capacity); + } + + if (zone_info[0].alloc_offset != WP_MISSING_DEV) + bg->alloc_offset = zone_info[0].alloc_offset; + else + bg->alloc_offset = zone_info[i - 1].alloc_offset; + + return 0; +} + +static int btrfs_load_block_group_raid0(struct btrfs_block_group *bg, + struct map_lookup *map, + struct zone_info *zone_info, + unsigned long *active) +{ + struct btrfs_fs_info *fs_info = bg->fs_info; + + if (map->type & BTRFS_BLOCK_GROUP_DATA && + !fs_info->stripe_root) { + btrfs_err(fs_info, "zoned: data %s needs stripe_root", + btrfs_bg_type_to_raid_name(map->type)); + return -EINVAL; + + } + + for (int i = 0; i < map->num_stripes; i++) { + if (zone_info[i].alloc_offset == WP_MISSING_DEV || + zone_info[i].alloc_offset == WP_CONVENTIONAL) + continue; + + if (test_bit(0, active) != test_bit(i, active)) { + if (!btrfs_zone_activate(bg)) + return -EIO; + } else { + if (test_bit(0, active)) + set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, + &bg->runtime_flags); + } + bg->zone_capacity += zone_info[i].capacity; + bg->alloc_offset += zone_info[i].alloc_offset; + } + + return 0; +} + +static int btrfs_load_block_group_raid10(struct btrfs_block_group *bg, + struct map_lookup *map, + struct zone_info *zone_info, + unsigned long *active) +{ + struct btrfs_fs_info *fs_info = bg->fs_info; + + if (map->type & BTRFS_BLOCK_GROUP_DATA && + !fs_info->stripe_root) { + btrfs_err(fs_info, "zoned: data %s needs stripe_root", + btrfs_bg_type_to_raid_name(map->type)); + return -EINVAL; + + } + + for (int i = 0; i < map->num_stripes; i++) { + if (zone_info[i].alloc_offset == WP_MISSING_DEV || + zone_info[i].alloc_offset == WP_CONVENTIONAL) + continue; + + if (test_bit(0, active) != test_bit(i, active)) { + if (!btrfs_zone_activate(bg)) + return -EIO; + } else { + if (test_bit(0, active)) + set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, + &bg->runtime_flags); + } + + if ((i % map->sub_stripes) == 0) { + bg->zone_capacity += zone_info[i].capacity; + bg->alloc_offset += zone_info[i].alloc_offset; + } + } + + return 0; +} + int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new) { struct btrfs_fs_info *fs_info = cache->fs_info; @@ -1525,11 +1654,18 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new) ret = btrfs_load_block_group_dup(cache, map, zone_info, active); break; case BTRFS_BLOCK_GROUP_RAID1: + case BTRFS_BLOCK_GROUP_RAID1C3: + case BTRFS_BLOCK_GROUP_RAID1C4: + ret = btrfs_load_block_group_raid1(cache, map, zone_info, active); + break; case BTRFS_BLOCK_GROUP_RAID0: + ret = btrfs_load_block_group_raid0(cache, map, zone_info, active); + break; case BTRFS_BLOCK_GROUP_RAID10: + ret = btrfs_load_block_group_raid10(cache, map, zone_info, active); + break; case BTRFS_BLOCK_GROUP_RAID5: case BTRFS_BLOCK_GROUP_RAID6: - /* non-single profiles are not supported yet */ default: btrfs_err(fs_info, "zoned: profile %s not yet supported", btrfs_bg_type_to_raid_name(map->type)); -- 2.41.0