Subject: Re: [RFC PATCH 03/17] btrfs: Check and enable HMZONED mode
To: Naohiro Aota, David Sterba, "linux-btrfs@vger.kernel.org"
Cc: Chris Mason, Josef Bacik, linux-kernel@vger.kernel.org, Damien Le Moal, Bart Van Assche, Matias Bjorling
References: <20180809180450.5091-1-naota@elisp.net> <20180809180450.5091-4-naota@elisp.net> <51ed0d0b-7574-b9a9-bae5-2cc8042913e6@suse.com> <20180810131558.gadsij5g7tshfg5u@zazie>
From: Hannes Reinecke
Message-ID: <6df03389-5127-28ac-f14b-05846bdd896f@suse.com>
Date: Fri, 10 Aug 2018 15:41:18 +0200
In-Reply-To: <20180810131558.gadsij5g7tshfg5u@zazie>

On 08/10/2018 03:15 PM, Naohiro Aota wrote:
> On Fri, Aug 10, 2018 at 02:25:33PM +0200, Hannes Reinecke wrote:
>> On 08/09/2018 08:04 PM, Naohiro Aota wrote:
>>> HMZONED mode cannot be used together with the RAID5/6 profile. Introduce
>>> the function btrfs_check_hmzoned_mode() to check this. This function will
>>> also check if HMZONED flag is enabled on the file system and if the file
>>> system consists of zoned devices with equal zone size.
>>>
>>> Additionally, as updates to the space cache are in-place, the space cache
>>> cannot be located over sequential zones and there are no guarantees that the
>>> device will have enough conventional zones to store this cache. Resolve
>>> this problem by completely disabling the space cache. This does not
>>> introduce any problems with sequential block groups: all the free space is
>>> located after the allocation pointer and no free space before the pointer.
>>> There is no need to have such cache.
>>>
>>> Signed-off-by: Damien Le Moal
>>> Signed-off-by: Naohiro Aota
>>> ---
>>>  fs/btrfs/ctree.h       |  3 ++
>>>  fs/btrfs/dev-replace.c |  7 ++++
>>>  fs/btrfs/disk-io.c     |  7 ++++
>>>  fs/btrfs/super.c       | 12 +++---
>>>  fs/btrfs/volumes.c     | 87 ++++++++++++++++++++++++++++++++++++++++++
>>>  fs/btrfs/volumes.h     |  1 +
>>>  6 files changed, 112 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>>> index 66f1d3895bca..14f880126532 100644
>>> --- a/fs/btrfs/ctree.h
>>> +++ b/fs/btrfs/ctree.h
>>> @@ -763,6 +763,9 @@ struct btrfs_fs_info {
>>>  	struct btrfs_root *uuid_root;
>>>  	struct btrfs_root *free_space_root;
>>>
>>> +	/* Zone size when in HMZONED mode */
>>> +	u64 zone_size;
>>> +
>>>  	/* the log root tree is a directory of all the other log roots */
>>>  	struct btrfs_root *log_root_tree;
>>>
>>> diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
>>> index dec01970d8c5..839a35008fd8 100644
>>> --- a/fs/btrfs/dev-replace.c
>>> +++ b/fs/btrfs/dev-replace.c
>>> @@ -202,6 +202,13 @@ static int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
>>>  		return PTR_ERR(bdev);
>>>  	}
>>>
>>> +	if ((bdev_zoned_model(bdev) == BLK_ZONED_HM &&
>>> +	     !btrfs_fs_incompat(fs_info, HMZONED)) ||
>>> +	    (!bdev_is_zoned(bdev) && btrfs_fs_incompat(fs_info, HMZONED))) {
>>> +		ret = -EINVAL;
>>> +		goto error;
>>> +	}
>>> +
>>>  	filemap_write_and_wait(bdev->bd_inode->i_mapping);
>>>
>>>  	devices = &fs_info->fs_devices->devices;
>>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>>> index 5124c15705ce..14f284382ba7 100644
>>> --- a/fs/btrfs/disk-io.c
>>> +++ b/fs/btrfs/disk-io.c
>>> @@ -3057,6 +3057,13 @@ int open_ctree(struct super_block *sb,
>>>
>>>  	btrfs_free_extra_devids(fs_devices, 1);
>>>
>>> +	ret = btrfs_check_hmzoned_mode(fs_info);
>>> +	if (ret) {
>>> +		btrfs_err(fs_info, "failed to init hmzoned mode: %d",
>>> +				ret);
>>> +		goto fail_block_groups;
>>> +	}
>>> +
>>>  	ret = btrfs_sysfs_add_fsid(fs_devices, NULL);
>>>  	if (ret) {
>>>  		btrfs_err(fs_info, "failed to init sysfs fsid interface: %d",
>>> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
>>> index 5fdd95e3de05..cc812e459197 100644
>>> --- a/fs/btrfs/super.c
>>> +++ b/fs/btrfs/super.c
>>> @@ -435,11 +435,13 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
>>>  	bool saved_compress_force;
>>>  	int no_compress = 0;
>>>
>>> -	cache_gen = btrfs_super_cache_generation(info->super_copy);
>>> -	if (btrfs_fs_compat_ro(info, FREE_SPACE_TREE))
>>> -		btrfs_set_opt(info->mount_opt, FREE_SPACE_TREE);
>>> -	else if (cache_gen)
>>> -		btrfs_set_opt(info->mount_opt, SPACE_CACHE);
>>> +	if (!btrfs_fs_incompat(info, HMZONED)) {
>>> +		cache_gen = btrfs_super_cache_generation(info->super_copy);
>>> +		if (btrfs_fs_compat_ro(info, FREE_SPACE_TREE))
>>> +			btrfs_set_opt(info->mount_opt, FREE_SPACE_TREE);
>>> +		else if (cache_gen)
>>> +			btrfs_set_opt(info->mount_opt, SPACE_CACHE);
>>> +	}
>>>
>>>  	/*
>>>  	 * Even the options are empty, we still need to do extra check
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index 35b3a2187653..ba7ebb80de4d 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -1293,6 +1293,80 @@ int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
>>>  	return ret;
>>>  }
>>>
>>> +int btrfs_check_hmzoned_mode(struct btrfs_fs_info *fs_info)
>>> +{
>>> +	struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
>>> +	struct btrfs_device *device;
>>> +	u64 hmzoned_devices = 0;
>>> +	u64 nr_devices = 0;
>>> +	u64 zone_size = 0;
>>> +	int incompat_hmzoned = btrfs_fs_incompat(fs_info, HMZONED);
>>> +	int ret = 0;
>>> +
>>> +	/* Count zoned devices */
>>> +	list_for_each_entry(device, &fs_devices->devices, dev_list) {
>>> +		if (!device->bdev)
>>> +			continue;
>>> +		if (bdev_zoned_model(device->bdev) == BLK_ZONED_HM ||
>>> +		    (bdev_zoned_model(device->bdev) == BLK_ZONED_HA &&
>>> +		     incompat_hmzoned)) {
>>> +			hmzoned_devices++;
>>> +			if (!zone_size) {
>>> +				zone_size = device->zone_size;
>>> +			} else if (device->zone_size != zone_size) {
>>> +				btrfs_err(fs_info,
>>> +					  "Zoned block devices must have equal zone sizes");
>>> +				ret = -EINVAL;
>>> +				goto out;
>>> +			}
>>> +		}
>>> +		nr_devices++;
>>> +	}
>>> +
>>> +	if (!hmzoned_devices && incompat_hmzoned) {
>>> +		/* No zoned block device, disable HMZONED */
>>> +		btrfs_err(fs_info, "HMZONED enabled file system should have zoned devices");
>>> +		ret = -EINVAL;
>>> +		goto out;
>>> +	}
>>> +
>>> +	fs_info->zone_size = zone_size;
>>> +
>>> +	if (hmzoned_devices != nr_devices) {
>>> +		btrfs_err(fs_info,
>>> +			  "zoned devices mixed with regular devices");
>>> +		ret = -EINVAL;
>>> +		goto out;
>>> +	}
>>> +
>> This breaks existing setups; as we're not checking if the device
>> specified by fs_info is a zoned device we'll fail here for normal devices.
>
> Ah, I forgot to deal with the normal devices when I converted the HMZONED
> mount flag to an incompat flag.
>
>> You need this patch to fix it:
>
> Thank you for fixing this. It's exactly what I wanted to do. I'll fix it
> in the next version.
>
Thanks.
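
For reference, the failure on normal devices can be reproduced with a
minimal user-space sketch of the quoted part of btrfs_check_hmzoned_mode().
Everything here is a simplified stand-in: the enum, struct model_device,
model_check_hmzoned() and the skip_plain_fs switch are hypothetical names,
the !device->bdev skip is left out, and the early return guarded by
skip_plain_fs is only an assumed fix for illustration, since the actual fix
patch is not included in this message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the zoned model of a block device. */
enum zoned_model { ZONED_NONE, ZONED_HA, ZONED_HM };

/* Hypothetical, simplified view of one device of the file system. */
struct model_device {
	enum zoned_model model;
	uint64_t zone_size;	/* only meaningful for zoned devices */
};

/*
 * Models only the quoted part of btrfs_check_hmzoned_mode().
 * skip_plain_fs == false: same checks as the patch as posted.
 * skip_plain_fs == true:  accept a file system that has no zoned
 *                         devices and no HMZONED flag (assumed fix).
 */
static int model_check_hmzoned(const struct model_device *devs, size_t n,
			       bool incompat_hmzoned, bool skip_plain_fs,
			       uint64_t *zone_size_out)
{
	uint64_t hmzoned_devices = 0, nr_devices = 0, zone_size = 0;
	size_t i;

	assert(devs != NULL || n == 0);

	/* Count zoned devices and require equal zone sizes. */
	for (i = 0; i < n; i++) {
		if (devs[i].model == ZONED_HM ||
		    (devs[i].model == ZONED_HA && incompat_hmzoned)) {
			hmzoned_devices++;
			if (!zone_size)
				zone_size = devs[i].zone_size;
			else if (devs[i].zone_size != zone_size)
				return -EINVAL;
		}
		nr_devices++;
	}

	/* HMZONED flag set, but no zoned device present. */
	if (!hmzoned_devices && incompat_hmzoned)
		return -EINVAL;

	*zone_size_out = zone_size;

	/* Assumed fix: a regular file system on regular devices is fine. */
	if (skip_plain_fs && !hmzoned_devices)
		return 0;

	/* As posted, 0 zoned devices != n regular devices fails here. */
	if (hmzoned_devices != nr_devices)
		return -EINVAL;

	return 0;
}
```

Calling model_check_hmzoned() with two ZONED_NONE devices and
incompat_hmzoned == false returns -EINVAL when skip_plain_fs is false and 0
when it is true, which matches the breakage described above.
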
Other than that it seems to be holding up quite well; did a full
'git clone && make oldconfig && make -j 16' on the upstream linux kernel
with no problems at all.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   zSeries & Storage
hare@suse.com			       +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)