Date: Thu, 19 Aug 2021 06:28:04 +0800
From: Liu Bo
To: Gao Xiang
Cc: linux-erofs@lists.ozlabs.org, Chao Yu, LKML, Peng Tao, Eryu Guan, Liu Jiang, Joseph Qi
Subject: Re: [PATCH 1/2] erofs: introduce chunk-based file on-disk format
Message-ID: <20210818222804.GA73193@rsjd01523.et2sqa>
Reply-To: bo.liu@linux.alibaba.com
References: <20210818070713.4437-1-hsiangkao@linux.alibaba.com>
In-Reply-To: <20210818070713.4437-1-hsiangkao@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 18, 2021 at 03:07:12PM +0800, Gao Xiang wrote:
> Currently, uncompressed data except for tail-packing inline is
> consecutive on disk.
>
> In order to support chunk-based data deduplication, add a new
> corresponding inode data layout.
>
> In the future, the data source of chunks can be either (un)compressed.
>
> Signed-off-by: Gao Xiang
> ---
>  Documentation/filesystems/erofs.rst | 16 ++++++++++--
>  fs/erofs/erofs_fs.h                 | 40 +++++++++++++++++++++++++++--
>  2 files changed, 52 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/filesystems/erofs.rst b/Documentation/filesystems/erofs.rst
> index 868e3972227f..b46d0fc46eb6 100644
> --- a/Documentation/filesystems/erofs.rst
> +++ b/Documentation/filesystems/erofs.rst
> @@ -156,13 +156,14 @@ may not. All metadatas can be now observed in two different spaces (views):
>
>      Xattrs, extents, data inline are followed by the corresponding inode with
>      proper alignment, and they could be optional for different data mappings.
> -    _currently_ total 4 valid data mappings are supported:
> +    _currently_ total 5 data layouts are supported:
>
>      == ====================================================================
>       0 flat file data without data inline (no extent);
>       1 fixed-sized output data compression (with non-compacted indexes);
>       2 flat file data with tail packing data inline (no extent);
> -     3 fixed-sized output data compression (with compacted indexes, v5.3+).
> +     3 fixed-sized output data compression (with compacted indexes, v5.3+);
> +     4 chunk-based file (v5.15+).
>      == ====================================================================
>
>      The size of the optional xattrs is indicated by i_xattr_count in inode
> @@ -213,6 +214,17 @@ Note that apart from the offset of the first filename, nameoff0 also indicates
>  the total number of directory entries in this block since it is no need to
>  introduce another on-disk field at all.
>
> +Chunk-based file
> +----------------
> +In order to support chunk-based file deduplication, a new inode data layout has
> +been supported since Linux v5.15: Files are split in equal-sized data chunks
> +with ``extents`` area of the inode metadata indicating how to get the chunk
> +data: these can be simply as a 4-byte block address array or in the 8-byte
> +chunk index form (see struct erofs_inode_chunk_index in erofs_fs.h for more
> +details.)
> +
> +By the way, chunk-based files are all uncompressed for now.
> +
>  Data compression
>  ----------------
>  EROFS implements LZ4 fixed-sized output compression which generates fixed-sized
> diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
> index 0f8da74570b4..6210fe434930 100644
> --- a/fs/erofs/erofs_fs.h
> +++ b/fs/erofs/erofs_fs.h
> @@ -4,6 +4,7 @@
>   *
>   * Copyright (C) 2017-2018 HUAWEI, Inc.
>   *             https://www.huawei.com/
> + * Copyright (C) 2021, Alibaba Cloud
>   */
>  #ifndef __EROFS_FS_H
>  #define __EROFS_FS_H
> @@ -19,10 +20,12 @@
>  #define EROFS_FEATURE_INCOMPAT_LZ4_0PADDING     0x00000001
>  #define EROFS_FEATURE_INCOMPAT_COMPR_CFGS       0x00000002
>  #define EROFS_FEATURE_INCOMPAT_BIG_PCLUSTER     0x00000002
> +#define EROFS_FEATURE_INCOMPAT_CHUNKED_FILE     0x00000004
>  #define EROFS_ALL_FEATURE_INCOMPAT \
>          (EROFS_FEATURE_INCOMPAT_LZ4_0PADDING | \
>           EROFS_FEATURE_INCOMPAT_COMPR_CFGS | \
> -         EROFS_FEATURE_INCOMPAT_BIG_PCLUSTER)
> +         EROFS_FEATURE_INCOMPAT_BIG_PCLUSTER | \
> +         EROFS_FEATURE_INCOMPAT_CHUNKED_FILE)
>
>  #define EROFS_SB_EXTSLOT_SIZE 16
>
> @@ -64,13 +67,16 @@ struct erofs_super_block {
>   * inode, [xattrs], last_inline_data, ... | ... | no-holed data
>   * 3 - inode compression D:
>   * inode, [xattrs], map_header, extents ... | ...
> - * 4~7 - reserved
> + * 4 - inode chunk-based E:
> + * inode, [xattrs], chunk indexes ... | ...
> + * 5~7 - reserved
>   */
>  enum {
>          EROFS_INODE_FLAT_PLAIN = 0,
>          EROFS_INODE_FLAT_COMPRESSION_LEGACY = 1,
>          EROFS_INODE_FLAT_INLINE = 2,
>          EROFS_INODE_FLAT_COMPRESSION = 3,
> +        EROFS_INODE_CHUNK_BASED = 4,
>          EROFS_INODE_DATALAYOUT_MAX
>  };
>
> @@ -90,6 +96,19 @@ static inline bool erofs_inode_is_data_compressed(unsigned int datamode)
>  #define EROFS_I_ALL \
>          ((1 << (EROFS_I_DATALAYOUT_BIT + EROFS_I_DATALAYOUT_BITS)) - 1)
>
> +/* indicate chunk blkbits, thus `chunksize = blocksize << chunk blkbits' */

A typo in the quotation marks.

(`chunksize = ) should be ('chunksize =)

Otherwise it looks good.

Reviewed-by: Liu Bo

thanks,
liubo

> +#define EROFS_CHUNK_FORMAT_BLKBITS_MASK 0x001F
> +/* with chunk indexes or just a 4-byte blkaddr array */
> +#define EROFS_CHUNK_FORMAT_INDEXES 0x0020
> +
> +#define EROFS_CHUNK_FORMAT_ALL \
> +        (EROFS_CHUNK_FORMAT_BLKBITS_MASK | EROFS_CHUNK_FORMAT_INDEXES)
> +
> +struct erofs_inode_chunk_info {
> +        __le16 format; /* chunk blkbits */
> +        __le16 reserved;
> +};
> +
>  /* 32-byte reduced form of an ondisk inode */
>  struct erofs_inode_compact {
>          __le16 i_format; /* inode format hints */
> @@ -107,6 +126,9 @@ struct erofs_inode_compact {
>
>                  /* for device files, used to indicate old/new device # */
>                  __le32 rdev;
> +
> +                /* for chunk-based files, it contains the summary info */
> +                struct erofs_inode_chunk_info c;
>          } i_u;
>          __le32 i_ino; /* only used for 32-bit stat compatibility */
>          __le16 i_uid;
> @@ -135,6 +157,9 @@ struct erofs_inode_extended {
>
>                  /* for device files, used to indicate old/new device # */
>                  __le32 rdev;
> +
> +                /* for chunk-based files, it contains the summary info */
> +                struct erofs_inode_chunk_info c;
>          } i_u;
>
>          /* only used for 32-bit stat compatibility */
> @@ -204,6 +229,15 @@ static inline unsigned int erofs_xattr_entry_size(struct erofs_xattr_entry *e)
>                  e->e_name_len + le16_to_cpu(e->e_value_size));
>  }
>
> +/* represent a zeroed chunk (hole) */
> +#define EROFS_NULL_ADDR -1
> +
> +struct erofs_inode_chunk_index {
> +        __le32 blkaddr;
> +        __le16 device_id; /* back-end storage id, always 0 for now */
> +        __le16 reserved; /* reserved, don't care */
> +};
> +
>  /* maximum supported size of a physical compression cluster */
>  #define Z_EROFS_PCLUSTER_MAX_SIZE (1024 * 1024)
>
> @@ -338,6 +372,8 @@ static inline void erofs_check_ondisk_layout_definitions(void)
>          BUILD_BUG_ON(sizeof(struct erofs_inode_extended) != 64);
>          BUILD_BUG_ON(sizeof(struct erofs_xattr_ibody_header) != 12);
>          BUILD_BUG_ON(sizeof(struct erofs_xattr_entry) != 4);
> +        BUILD_BUG_ON(sizeof(struct erofs_inode_chunk_info) != 4);
> +        BUILD_BUG_ON(sizeof(struct erofs_inode_chunk_index) != 8);
>          BUILD_BUG_ON(sizeof(struct z_erofs_map_header) != 8);
>          BUILD_BUG_ON(sizeof(struct z_erofs_vle_decompressed_index) != 8);
>          BUILD_BUG_ON(sizeof(struct erofs_dirent) != 12);
> --
> 2.24.4
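
For readers following the format description in the quoted documentation hunk, here is a minimal userspace sketch of how the `format` field and the 4-byte block address array might be interpreted. It is not taken from the patch: the helper names (chunk_size, chunk_entry_size, chunk_map_block) are hypothetical, plain uint16_t/uint32_t stand in for __le16/__le32 (values are assumed to be already converted to host endianness), and it assumes each array entry holds the first block of a chunk whose blocks are laid out consecutively. Only the three macros are copied from the patch.

```c
#include <stdint.h>

/* Constants copied from the quoted patch. */
#define EROFS_CHUNK_FORMAT_BLKBITS_MASK 0x001F
#define EROFS_CHUNK_FORMAT_INDEXES      0x0020
#define EROFS_NULL_ADDR                 -1

/* Hypothetical helper: chunksize = blocksize << chunk blkbits, in bytes. */
static uint64_t chunk_size(unsigned int blkszbits, uint16_t format)
{
    return 1ULL << (blkszbits + (format & EROFS_CHUNK_FORMAT_BLKBITS_MASK));
}

/* Hypothetical helper: bytes per entry in the extents area
 * (8-byte chunk index form or 4-byte blkaddr array form). */
static unsigned int chunk_entry_size(uint16_t format)
{
    return (format & EROFS_CHUNK_FORMAT_INDEXES) ? 8 : 4;
}

/*
 * Hypothetical lookup for the 4-byte blkaddr array form.
 * Returns 1 and stores the block address covering `offset`,
 * or returns 0 if the chunk is a hole (EROFS_NULL_ADDR).
 * Assumption: blocks inside one chunk are consecutive on disk.
 */
static int chunk_map_block(const uint32_t *blkaddrs, unsigned int blkszbits,
                           uint16_t format, uint64_t offset, uint32_t *blkaddr)
{
    uint64_t csize = chunk_size(blkszbits, format);
    uint32_t start = blkaddrs[offset / csize];

    if (start == (uint32_t)EROFS_NULL_ADDR)
        return 0; /* zeroed chunk, nothing to read */

    /* Add the block offset within the chunk to the chunk's first block. */
    *blkaddr = start + (uint32_t)((offset % csize) >> blkszbits);
    return 1;
}
```

With 4 KiB blocks (blkszbits = 12) and chunk blkbits = 2, each chunk covers 16 KiB, so file offset 36874 falls in chunk 2, block 1 of that chunk.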