2020-11-02 12:27:15

by Chao Yu

[permalink] [raw]
Subject: [PATCH] f2fs: compress: support chksum

This patch supports to store chksum value with compressed
data, and verify the integrality of compressed data while
reading the data.

The feature can be enabled through specifying mount option
'compress_chksum'.

Signed-off-by: Chao Yu <[email protected]>
---
Documentation/filesystems/f2fs.rst | 1 +
fs/f2fs/compress.c | 20 ++++++++++++++++++++
fs/f2fs/f2fs.h | 13 ++++++++++++-
fs/f2fs/inode.c | 3 +++
fs/f2fs/super.c | 9 +++++++++
include/linux/f2fs_fs.h | 2 +-
6 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
index b8ee761c9922..985ae7d35066 100644
--- a/Documentation/filesystems/f2fs.rst
+++ b/Documentation/filesystems/f2fs.rst
@@ -260,6 +260,7 @@ compress_extension=%s Support adding specified extension, so that f2fs can enab
For other files, we can still enable compression via ioctl.
Note that, there is one reserved special extension '*', it
can be set to enable compression for all files.
+compress_chksum Support verifying chksum of raw data in compressed cluster.
inlinecrypt When possible, encrypt/decrypt the contents of encrypted
files using the blk-crypto framework rather than
filesystem-layer encryption. This allows the use of
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 14262e0f1cd6..a4e0d2c745b6 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -602,6 +602,7 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
f2fs_cops[fi->i_compress_algorithm];
unsigned int max_len, new_nr_cpages;
struct page **new_cpages;
+ u32 chksum = 0;
int i, ret;

trace_f2fs_compress_pages_start(cc->inode, cc->cluster_idx,
@@ -655,6 +656,11 @@ static int f2fs_compress_pages(struct compress_ctx *cc)

cc->cbuf->clen = cpu_to_le32(cc->clen);

+ if (fi->i_compress_flag & 1 << COMPRESS_CHKSUM)
+ chksum = f2fs_crc32(F2FS_I_SB(cc->inode),
+ cc->cbuf->cdata, cc->clen);
+ cc->cbuf->chksum = cpu_to_le32(chksum);
+
for (i = 0; i < COMPRESS_DATA_RESERVED_SIZE; i++)
cc->cbuf->reserved[i] = cpu_to_le32(0);

@@ -721,6 +727,7 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
(struct decompress_io_ctx *)page_private(page);
struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
struct f2fs_inode_info *fi= F2FS_I(dic->inode);
+ struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
const struct f2fs_compress_ops *cops =
f2fs_cops[fi->i_compress_algorithm];
int ret;
@@ -790,6 +797,19 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)

ret = cops->decompress_pages(dic);

+ if (!ret && fi->i_compress_flag & 1 << COMPRESS_CHKSUM) {
+ u32 provided = le32_to_cpu(dic->cbuf->chksum);
+ u32 calculated = f2fs_crc32(sbi, dic->cbuf->cdata, dic->clen);
+
+ if (provided != calculated) {
+ printk_ratelimited(
+ "%sF2FS-fs (%s): checksum invalid, nid = %lu, %x vs %x",
+ KERN_INFO, sbi->sb->s_id, dic->inode->i_ino,
+ provided, calculated);
+ ret = -EFSCORRUPTED;
+ }
+ }
+
out_vunmap_cbuf:
vm_unmap_ram(dic->cbuf, dic->nr_cpages);
out_vunmap_rbuf:
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 99bcf4b44a9c..2ae254ab7b7d 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -147,7 +147,8 @@ struct f2fs_mount_info {

/* For compression */
unsigned char compress_algorithm; /* algorithm type */
- unsigned compress_log_size; /* cluster log size */
+ unsigned char compress_log_size; /* cluster log size */
+ bool compress_chksum; /* compressed data chksum */
unsigned char compress_ext_cnt; /* extension count */
unsigned char extensions[COMPRESS_EXT_NUM][F2FS_EXTENSION_LEN]; /* extensions */
};
@@ -731,6 +732,7 @@ struct f2fs_inode_info {
atomic_t i_compr_blocks; /* # of compressed blocks */
unsigned char i_compress_algorithm; /* algorithm type */
unsigned char i_log_cluster_size; /* log of cluster size */
+ unsigned short i_compress_flag; /* compress flag */
unsigned int i_cluster_size; /* cluster size */
};

@@ -1270,9 +1272,15 @@ enum compress_algorithm_type {
COMPRESS_MAX,
};

+enum compress_flag {
+ COMPRESS_CHKSUM,
+ COMPRESS_MAX_FLAG,
+};
+
#define COMPRESS_DATA_RESERVED_SIZE 5
struct compress_data {
__le32 clen; /* compressed data size */
+ __le32 chksum; /* compressed data chksum */
__le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
u8 cdata[]; /* compressed data */
};
@@ -3882,6 +3890,9 @@ static inline void set_compress_context(struct inode *inode)
F2FS_OPTION(sbi).compress_algorithm;
F2FS_I(inode)->i_log_cluster_size =
F2FS_OPTION(sbi).compress_log_size;
+ F2FS_I(inode)->i_compress_flag =
+ F2FS_OPTION(sbi).compress_chksum ?
+ 1 << COMPRESS_CHKSUM : 0;
F2FS_I(inode)->i_cluster_size =
1 << F2FS_I(inode)->i_log_cluster_size;
F2FS_I(inode)->i_flags |= F2FS_COMPR_FL;
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 657db2fb6739..de8f7fc89efa 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -456,6 +456,7 @@ static int do_read_inode(struct inode *inode)
le64_to_cpu(ri->i_compr_blocks));
fi->i_compress_algorithm = ri->i_compress_algorithm;
fi->i_log_cluster_size = ri->i_log_cluster_size;
+ fi->i_compress_flag = ri->i_compress_flag;
fi->i_cluster_size = 1 << fi->i_log_cluster_size;
set_inode_flag(inode, FI_COMPRESSED_FILE);
}
@@ -634,6 +635,8 @@ void f2fs_update_inode(struct inode *inode, struct page *node_page)
&F2FS_I(inode)->i_compr_blocks));
ri->i_compress_algorithm =
F2FS_I(inode)->i_compress_algorithm;
+ ri->i_compress_flag =
+ cpu_to_le16(F2FS_I(inode)->i_compress_flag);
ri->i_log_cluster_size =
F2FS_I(inode)->i_log_cluster_size;
}
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 00eff2f51807..f8de4d83a5be 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -146,6 +146,7 @@ enum {
Opt_compress_algorithm,
Opt_compress_log_size,
Opt_compress_extension,
+ Opt_compress_chksum,
Opt_atgc,
Opt_err,
};
@@ -214,6 +215,7 @@ static match_table_t f2fs_tokens = {
{Opt_compress_algorithm, "compress_algorithm=%s"},
{Opt_compress_log_size, "compress_log_size=%u"},
{Opt_compress_extension, "compress_extension=%s"},
+ {Opt_compress_chksum, "compress_chksum"},
{Opt_atgc, "atgc"},
{Opt_err, NULL},
};
@@ -934,10 +936,14 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount)
F2FS_OPTION(sbi).compress_ext_cnt++;
kfree(name);
break;
+ case Opt_compress_chksum:
+ F2FS_OPTION(sbi).compress_chksum = true;
+ break;
#else
case Opt_compress_algorithm:
case Opt_compress_log_size:
case Opt_compress_extension:
+ case Opt_compress_chksum:
f2fs_info(sbi, "compression options not supported");
break;
#endif
@@ -1523,6 +1529,9 @@ static inline void f2fs_show_compress_options(struct seq_file *seq,
seq_printf(seq, ",compress_extension=%s",
F2FS_OPTION(sbi).extensions[i]);
}
+
+ if (F2FS_OPTION(sbi).compress_chksum)
+ seq_puts(seq, ",compress_chksum");
}

static int f2fs_show_options(struct seq_file *seq, struct dentry *root)
diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
index a5dbb57a687f..7dc2a06cf19a 100644
--- a/include/linux/f2fs_fs.h
+++ b/include/linux/f2fs_fs.h
@@ -273,7 +273,7 @@ struct f2fs_inode {
__le64 i_compr_blocks; /* # of compressed blocks */
__u8 i_compress_algorithm; /* compress algorithm */
__u8 i_log_cluster_size; /* log of cluster size */
- __le16 i_padding; /* padding */
+ __le16 i_compress_flag; /* compress flag */
__le32 i_extra_end[0]; /* for attribute size calculation */
} __packed;
__le32 i_addr[DEF_ADDRS_PER_INODE]; /* Pointers to data blocks */
--
2.26.2


2020-11-02 16:36:54

by Jaegeuk Kim

[permalink] [raw]
Subject: Re: [PATCH] f2fs: compress: support chksum

On 11/02, Chao Yu wrote:
> This patch supports to store chksum value with compressed
> data, and verify the integrality of compressed data while
> reading the data.
>
> The feature can be enabled through specifying mount option
> 'compress_chksum'.
>
> Signed-off-by: Chao Yu <[email protected]>
> ---
> Documentation/filesystems/f2fs.rst | 1 +
> fs/f2fs/compress.c | 20 ++++++++++++++++++++
> fs/f2fs/f2fs.h | 13 ++++++++++++-
> fs/f2fs/inode.c | 3 +++
> fs/f2fs/super.c | 9 +++++++++
> include/linux/f2fs_fs.h | 2 +-
> 6 files changed, 46 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
> index b8ee761c9922..985ae7d35066 100644
> --- a/Documentation/filesystems/f2fs.rst
> +++ b/Documentation/filesystems/f2fs.rst
> @@ -260,6 +260,7 @@ compress_extension=%s Support adding specified extension, so that f2fs can enab
> For other files, we can still enable compression via ioctl.
> Note that, there is one reserved special extension '*', it
> can be set to enable compression for all files.
> +compress_chksum Support verifying chksum of raw data in compressed cluster.
> inlinecrypt When possible, encrypt/decrypt the contents of encrypted
> files using the blk-crypto framework rather than
> filesystem-layer encryption. This allows the use of
> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
> index 14262e0f1cd6..a4e0d2c745b6 100644
> --- a/fs/f2fs/compress.c
> +++ b/fs/f2fs/compress.c
> @@ -602,6 +602,7 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
> f2fs_cops[fi->i_compress_algorithm];
> unsigned int max_len, new_nr_cpages;
> struct page **new_cpages;
> + u32 chksum = 0;
> int i, ret;
>
> trace_f2fs_compress_pages_start(cc->inode, cc->cluster_idx,
> @@ -655,6 +656,11 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
>
> cc->cbuf->clen = cpu_to_le32(cc->clen);
>
> + if (fi->i_compress_flag & 1 << COMPRESS_CHKSUM)
> + chksum = f2fs_crc32(F2FS_I_SB(cc->inode),
> + cc->cbuf->cdata, cc->clen);
> + cc->cbuf->chksum = cpu_to_le32(chksum);
> +
> for (i = 0; i < COMPRESS_DATA_RESERVED_SIZE; i++)
> cc->cbuf->reserved[i] = cpu_to_le32(0);
>
> @@ -721,6 +727,7 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
> (struct decompress_io_ctx *)page_private(page);
> struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
> struct f2fs_inode_info *fi= F2FS_I(dic->inode);
> + struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
> const struct f2fs_compress_ops *cops =
> f2fs_cops[fi->i_compress_algorithm];
> int ret;
> @@ -790,6 +797,19 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
>
> ret = cops->decompress_pages(dic);
>
> + if (!ret && fi->i_compress_flag & 1 << COMPRESS_CHKSUM) {
> + u32 provided = le32_to_cpu(dic->cbuf->chksum);
> + u32 calculated = f2fs_crc32(sbi, dic->cbuf->cdata, dic->clen);
> +
> + if (provided != calculated) {
> + printk_ratelimited(
> + "%sF2FS-fs (%s): checksum invalid, nid = %lu, %x vs %x",
> + KERN_INFO, sbi->sb->s_id, dic->inode->i_ino,
> + provided, calculated);
> + ret = -EFSCORRUPTED;

Do we need to change fsck.f2fs to recover this?

> + }
> + }
> +
> out_vunmap_cbuf:
> vm_unmap_ram(dic->cbuf, dic->nr_cpages);
> out_vunmap_rbuf:
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index 99bcf4b44a9c..2ae254ab7b7d 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -147,7 +147,8 @@ struct f2fs_mount_info {
>
> /* For compression */
> unsigned char compress_algorithm; /* algorithm type */
> - unsigned compress_log_size; /* cluster log size */
> + unsigned char compress_log_size; /* cluster log size */
> + bool compress_chksum; /* compressed data chksum */
> unsigned char compress_ext_cnt; /* extension count */
> unsigned char extensions[COMPRESS_EXT_NUM][F2FS_EXTENSION_LEN]; /* extensions */
> };
> @@ -731,6 +732,7 @@ struct f2fs_inode_info {
> atomic_t i_compr_blocks; /* # of compressed blocks */
> unsigned char i_compress_algorithm; /* algorithm type */
> unsigned char i_log_cluster_size; /* log of cluster size */
> + unsigned short i_compress_flag; /* compress flag */
> unsigned int i_cluster_size; /* cluster size */
> };
>
> @@ -1270,9 +1272,15 @@ enum compress_algorithm_type {
> COMPRESS_MAX,
> };
>
> +enum compress_flag {
> + COMPRESS_CHKSUM,
> + COMPRESS_MAX_FLAG,
> +};
> +
> #define COMPRESS_DATA_RESERVED_SIZE 5
> struct compress_data {
> __le32 clen; /* compressed data size */
> + __le32 chksum; /* compressed data chksum */
> __le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
> u8 cdata[]; /* compressed data */
> };
> @@ -3882,6 +3890,9 @@ static inline void set_compress_context(struct inode *inode)
> F2FS_OPTION(sbi).compress_algorithm;
> F2FS_I(inode)->i_log_cluster_size =
> F2FS_OPTION(sbi).compress_log_size;
> + F2FS_I(inode)->i_compress_flag =
> + F2FS_OPTION(sbi).compress_chksum ?
> + 1 << COMPRESS_CHKSUM : 0;
> F2FS_I(inode)->i_cluster_size =
> 1 << F2FS_I(inode)->i_log_cluster_size;
> F2FS_I(inode)->i_flags |= F2FS_COMPR_FL;
> diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
> index 657db2fb6739..de8f7fc89efa 100644
> --- a/fs/f2fs/inode.c
> +++ b/fs/f2fs/inode.c
> @@ -456,6 +456,7 @@ static int do_read_inode(struct inode *inode)
> le64_to_cpu(ri->i_compr_blocks));
> fi->i_compress_algorithm = ri->i_compress_algorithm;
> fi->i_log_cluster_size = ri->i_log_cluster_size;
> + fi->i_compress_flag = ri->i_compress_flag;
> fi->i_cluster_size = 1 << fi->i_log_cluster_size;
> set_inode_flag(inode, FI_COMPRESSED_FILE);
> }
> @@ -634,6 +635,8 @@ void f2fs_update_inode(struct inode *inode, struct page *node_page)
> &F2FS_I(inode)->i_compr_blocks));
> ri->i_compress_algorithm =
> F2FS_I(inode)->i_compress_algorithm;
> + ri->i_compress_flag =
> + cpu_to_le16(F2FS_I(inode)->i_compress_flag);
> ri->i_log_cluster_size =
> F2FS_I(inode)->i_log_cluster_size;
> }
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 00eff2f51807..f8de4d83a5be 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -146,6 +146,7 @@ enum {
> Opt_compress_algorithm,
> Opt_compress_log_size,
> Opt_compress_extension,
> + Opt_compress_chksum,
> Opt_atgc,
> Opt_err,
> };
> @@ -214,6 +215,7 @@ static match_table_t f2fs_tokens = {
> {Opt_compress_algorithm, "compress_algorithm=%s"},
> {Opt_compress_log_size, "compress_log_size=%u"},
> {Opt_compress_extension, "compress_extension=%s"},
> + {Opt_compress_chksum, "compress_chksum"},
> {Opt_atgc, "atgc"},
> {Opt_err, NULL},
> };
> @@ -934,10 +936,14 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount)
> F2FS_OPTION(sbi).compress_ext_cnt++;
> kfree(name);
> break;
> + case Opt_compress_chksum:
> + F2FS_OPTION(sbi).compress_chksum = true;
> + break;
> #else
> case Opt_compress_algorithm:
> case Opt_compress_log_size:
> case Opt_compress_extension:
> + case Opt_compress_chksum:
> f2fs_info(sbi, "compression options not supported");
> break;
> #endif
> @@ -1523,6 +1529,9 @@ static inline void f2fs_show_compress_options(struct seq_file *seq,
> seq_printf(seq, ",compress_extension=%s",
> F2FS_OPTION(sbi).extensions[i]);
> }
> +
> + if (F2FS_OPTION(sbi).compress_chksum)
> + seq_puts(seq, ",compress_chksum");
> }
>
> static int f2fs_show_options(struct seq_file *seq, struct dentry *root)
> diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
> index a5dbb57a687f..7dc2a06cf19a 100644
> --- a/include/linux/f2fs_fs.h
> +++ b/include/linux/f2fs_fs.h
> @@ -273,7 +273,7 @@ struct f2fs_inode {
> __le64 i_compr_blocks; /* # of compressed blocks */
> __u8 i_compress_algorithm; /* compress algorithm */
> __u8 i_log_cluster_size; /* log of cluster size */
> - __le16 i_padding; /* padding */
> + __le16 i_compress_flag; /* compress flag */
> __le32 i_extra_end[0]; /* for attribute size calculation */
> } __packed;
> __le32 i_addr[DEF_ADDRS_PER_INODE]; /* Pointers to data blocks */
> --
> 2.26.2

2020-11-02 17:40:15

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH] f2fs: compress: support chksum

Hi Chao,

I love your patch! Perhaps something to improve:

[auto build test WARNING on f2fs/dev-test]
[also build test WARNING on v5.10-rc2 next-20201102]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Chao-Yu/f2fs-compress-support-chksum/20201102-202700
base: https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git dev-test
config: x86_64-randconfig-s022-20201102 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.3-56-gc09e8239-dirty
# https://github.com/0day-ci/linux/commit/0216dd41faf31554616bde7437532abd1dc18fbb
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Chao-Yu/f2fs-compress-support-chksum/20201102-202700
git checkout 0216dd41faf31554616bde7437532abd1dc18fbb
# save the attached .config to linux build tree
make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=x86_64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>


"sparse warnings: (new ones prefixed by >>)"
>> fs/f2fs/inode.c:459:45: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned short i_compress_flag @@ got restricted __le16 [usertype] i_compress_flag @@
>> fs/f2fs/inode.c:459:45: sparse: expected unsigned short i_compress_flag
>> fs/f2fs/inode.c:459:45: sparse: got restricted __le16 [usertype] i_compress_flag

vim +459 fs/f2fs/inode.c

332
333 static int do_read_inode(struct inode *inode)
334 {
335 struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
336 struct f2fs_inode_info *fi = F2FS_I(inode);
337 struct page *node_page;
338 struct f2fs_inode *ri;
339 projid_t i_projid;
340 int err;
341
342 /* Check if ino is within scope */
343 if (f2fs_check_nid_range(sbi, inode->i_ino))
344 return -EINVAL;
345
346 node_page = f2fs_get_node_page(sbi, inode->i_ino);
347 if (IS_ERR(node_page))
348 return PTR_ERR(node_page);
349
350 ri = F2FS_INODE(node_page);
351
352 inode->i_mode = le16_to_cpu(ri->i_mode);
353 i_uid_write(inode, le32_to_cpu(ri->i_uid));
354 i_gid_write(inode, le32_to_cpu(ri->i_gid));
355 set_nlink(inode, le32_to_cpu(ri->i_links));
356 inode->i_size = le64_to_cpu(ri->i_size);
357 inode->i_blocks = SECTOR_FROM_BLOCK(le64_to_cpu(ri->i_blocks) - 1);
358
359 inode->i_atime.tv_sec = le64_to_cpu(ri->i_atime);
360 inode->i_ctime.tv_sec = le64_to_cpu(ri->i_ctime);
361 inode->i_mtime.tv_sec = le64_to_cpu(ri->i_mtime);
362 inode->i_atime.tv_nsec = le32_to_cpu(ri->i_atime_nsec);
363 inode->i_ctime.tv_nsec = le32_to_cpu(ri->i_ctime_nsec);
364 inode->i_mtime.tv_nsec = le32_to_cpu(ri->i_mtime_nsec);
365 inode->i_generation = le32_to_cpu(ri->i_generation);
366 if (S_ISDIR(inode->i_mode))
367 fi->i_current_depth = le32_to_cpu(ri->i_current_depth);
368 else if (S_ISREG(inode->i_mode))
369 fi->i_gc_failures[GC_FAILURE_PIN] =
370 le16_to_cpu(ri->i_gc_failures);
371 fi->i_xattr_nid = le32_to_cpu(ri->i_xattr_nid);
372 fi->i_flags = le32_to_cpu(ri->i_flags);
373 if (S_ISREG(inode->i_mode))
374 fi->i_flags &= ~F2FS_PROJINHERIT_FL;
375 bitmap_zero(fi->flags, FI_MAX);
376 fi->i_advise = ri->i_advise;
377 fi->i_pino = le32_to_cpu(ri->i_pino);
378 fi->i_dir_level = ri->i_dir_level;
379
380 f2fs_init_extent_tree(inode, node_page);
381
382 get_inline_info(inode, ri);
383
384 fi->i_extra_isize = f2fs_has_extra_attr(inode) ?
385 le16_to_cpu(ri->i_extra_isize) : 0;
386
387 if (f2fs_sb_has_flexible_inline_xattr(sbi)) {
388 fi->i_inline_xattr_size = le16_to_cpu(ri->i_inline_xattr_size);
389 } else if (f2fs_has_inline_xattr(inode) ||
390 f2fs_has_inline_dentry(inode)) {
391 fi->i_inline_xattr_size = DEFAULT_INLINE_XATTR_ADDRS;
392 } else {
393
394 /*
395 * Previous inline data or directory always reserved 200 bytes
396 * in inode layout, even if inline_xattr is disabled. In order
397 * to keep inline_dentry's structure for backward compatibility,
398 * we get the space back only from inline_data.
399 */
400 fi->i_inline_xattr_size = 0;
401 }
402
403 if (!sanity_check_inode(inode, node_page)) {
404 f2fs_put_page(node_page, 1);
405 return -EFSCORRUPTED;
406 }
407
408 /* check data exist */
409 if (f2fs_has_inline_data(inode) && !f2fs_exist_data(inode))
410 __recover_inline_status(inode, node_page);
411
412 /* try to recover cold bit for non-dir inode */
413 if (!S_ISDIR(inode->i_mode) && !is_cold_node(node_page)) {
414 f2fs_wait_on_page_writeback(node_page, NODE, true, true);
415 set_cold_node(node_page, false);
416 set_page_dirty(node_page);
417 }
418
419 /* get rdev by using inline_info */
420 __get_inode_rdev(inode, ri);
421
422 if (S_ISREG(inode->i_mode)) {
423 err = __written_first_block(sbi, ri);
424 if (err < 0) {
425 f2fs_put_page(node_page, 1);
426 return err;
427 }
428 if (!err)
429 set_inode_flag(inode, FI_FIRST_BLOCK_WRITTEN);
430 }
431
432 if (!f2fs_need_inode_block_update(sbi, inode->i_ino))
433 fi->last_disk_size = inode->i_size;
434
435 if (fi->i_flags & F2FS_PROJINHERIT_FL)
436 set_inode_flag(inode, FI_PROJ_INHERIT);
437
438 if (f2fs_has_extra_attr(inode) && f2fs_sb_has_project_quota(sbi) &&
439 F2FS_FITS_IN_INODE(ri, fi->i_extra_isize, i_projid))
440 i_projid = (projid_t)le32_to_cpu(ri->i_projid);
441 else
442 i_projid = F2FS_DEF_PROJID;
443 fi->i_projid = make_kprojid(&init_user_ns, i_projid);
444
445 if (f2fs_has_extra_attr(inode) && f2fs_sb_has_inode_crtime(sbi) &&
446 F2FS_FITS_IN_INODE(ri, fi->i_extra_isize, i_crtime)) {
447 fi->i_crtime.tv_sec = le64_to_cpu(ri->i_crtime);
448 fi->i_crtime.tv_nsec = le32_to_cpu(ri->i_crtime_nsec);
449 }
450
451 if (f2fs_has_extra_attr(inode) && f2fs_sb_has_compression(sbi) &&
452 (fi->i_flags & F2FS_COMPR_FL)) {
453 if (F2FS_FITS_IN_INODE(ri, fi->i_extra_isize,
454 i_log_cluster_size)) {
455 atomic_set(&fi->i_compr_blocks,
456 le64_to_cpu(ri->i_compr_blocks));
457 fi->i_compress_algorithm = ri->i_compress_algorithm;
458 fi->i_log_cluster_size = ri->i_log_cluster_size;
> 459 fi->i_compress_flag = ri->i_compress_flag;
460 fi->i_cluster_size = 1 << fi->i_log_cluster_size;
461 set_inode_flag(inode, FI_COMPRESSED_FILE);
462 }
463 }
464
465 F2FS_I(inode)->i_disk_time[0] = inode->i_atime;
466 F2FS_I(inode)->i_disk_time[1] = inode->i_ctime;
467 F2FS_I(inode)->i_disk_time[2] = inode->i_mtime;
468 F2FS_I(inode)->i_disk_time[3] = F2FS_I(inode)->i_crtime;
469 f2fs_put_page(node_page, 1);
470
471 stat_inc_inline_xattr(inode);
472 stat_inc_inline_inode(inode);
473 stat_inc_inline_dir(inode);
474 stat_inc_compr_inode(inode);
475 stat_add_compr_blocks(inode, atomic_read(&fi->i_compr_blocks));
476
477 return 0;
478 }
479

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (7.60 kB)
.config.gz (29.32 kB)
Download all attachments

2020-11-03 02:05:10

by Chao Yu

[permalink] [raw]
Subject: Re: [PATCH] f2fs: compress: support chksum

On 2020/11/3 0:31, Jaegeuk Kim wrote:
> On 11/02, Chao Yu wrote:
>> This patch supports to store chksum value with compressed
>> data, and verify the integrality of compressed data while
>> reading the data.
>>
>> The feature can be enabled through specifying mount option
>> 'compress_chksum'.
>>
>> Signed-off-by: Chao Yu <[email protected]>
>> ---
>> Documentation/filesystems/f2fs.rst | 1 +
>> fs/f2fs/compress.c | 20 ++++++++++++++++++++
>> fs/f2fs/f2fs.h | 13 ++++++++++++-
>> fs/f2fs/inode.c | 3 +++
>> fs/f2fs/super.c | 9 +++++++++
>> include/linux/f2fs_fs.h | 2 +-
>> 6 files changed, 46 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
>> index b8ee761c9922..985ae7d35066 100644
>> --- a/Documentation/filesystems/f2fs.rst
>> +++ b/Documentation/filesystems/f2fs.rst
>> @@ -260,6 +260,7 @@ compress_extension=%s Support adding specified extension, so that f2fs can enab
>> For other files, we can still enable compression via ioctl.
>> Note that, there is one reserved special extension '*', it
>> can be set to enable compression for all files.
>> +compress_chksum Support verifying chksum of raw data in compressed cluster.
>> inlinecrypt When possible, encrypt/decrypt the contents of encrypted
>> files using the blk-crypto framework rather than
>> filesystem-layer encryption. This allows the use of
>> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
>> index 14262e0f1cd6..a4e0d2c745b6 100644
>> --- a/fs/f2fs/compress.c
>> +++ b/fs/f2fs/compress.c
>> @@ -602,6 +602,7 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
>> f2fs_cops[fi->i_compress_algorithm];
>> unsigned int max_len, new_nr_cpages;
>> struct page **new_cpages;
>> + u32 chksum = 0;
>> int i, ret;
>>
>> trace_f2fs_compress_pages_start(cc->inode, cc->cluster_idx,
>> @@ -655,6 +656,11 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
>>
>> cc->cbuf->clen = cpu_to_le32(cc->clen);
>>
>> + if (fi->i_compress_flag & 1 << COMPRESS_CHKSUM)
>> + chksum = f2fs_crc32(F2FS_I_SB(cc->inode),
>> + cc->cbuf->cdata, cc->clen);
>> + cc->cbuf->chksum = cpu_to_le32(chksum);
>> +
>> for (i = 0; i < COMPRESS_DATA_RESERVED_SIZE; i++)
>> cc->cbuf->reserved[i] = cpu_to_le32(0);
>>
>> @@ -721,6 +727,7 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
>> (struct decompress_io_ctx *)page_private(page);
>> struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
>> struct f2fs_inode_info *fi= F2FS_I(dic->inode);
>> + struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
>> const struct f2fs_compress_ops *cops =
>> f2fs_cops[fi->i_compress_algorithm];
>> int ret;
>> @@ -790,6 +797,19 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
>>
>> ret = cops->decompress_pages(dic);
>>
>> + if (!ret && fi->i_compress_flag & 1 << COMPRESS_CHKSUM) {
>> + u32 provided = le32_to_cpu(dic->cbuf->chksum);
>> + u32 calculated = f2fs_crc32(sbi, dic->cbuf->cdata, dic->clen);
>> +
>> + if (provided != calculated) {
>> + printk_ratelimited(
>> + "%sF2FS-fs (%s): checksum invalid, nid = %lu, %x vs %x",
>> + KERN_INFO, sbi->sb->s_id, dic->inode->i_ino,
>> + provided, calculated);
>> + ret = -EFSCORRUPTED;
>
> Do we need to change fsck.f2fs to recover this?

Yes, prepared to update inode layout in fsck.f2fs w/ kernel side change.

Thanks,

>
>> + }
>> + }
>> +
>> out_vunmap_cbuf:
>> vm_unmap_ram(dic->cbuf, dic->nr_cpages);
>> out_vunmap_rbuf:
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>> index 99bcf4b44a9c..2ae254ab7b7d 100644
>> --- a/fs/f2fs/f2fs.h
>> +++ b/fs/f2fs/f2fs.h
>> @@ -147,7 +147,8 @@ struct f2fs_mount_info {
>>
>> /* For compression */
>> unsigned char compress_algorithm; /* algorithm type */
>> - unsigned compress_log_size; /* cluster log size */
>> + unsigned char compress_log_size; /* cluster log size */
>> + bool compress_chksum; /* compressed data chksum */
>> unsigned char compress_ext_cnt; /* extension count */
>> unsigned char extensions[COMPRESS_EXT_NUM][F2FS_EXTENSION_LEN]; /* extensions */
>> };
>> @@ -731,6 +732,7 @@ struct f2fs_inode_info {
>> atomic_t i_compr_blocks; /* # of compressed blocks */
>> unsigned char i_compress_algorithm; /* algorithm type */
>> unsigned char i_log_cluster_size; /* log of cluster size */
>> + unsigned short i_compress_flag; /* compress flag */
>> unsigned int i_cluster_size; /* cluster size */
>> };
>>
>> @@ -1270,9 +1272,15 @@ enum compress_algorithm_type {
>> COMPRESS_MAX,
>> };
>>
>> +enum compress_flag {
>> + COMPRESS_CHKSUM,
>> + COMPRESS_MAX_FLAG,
>> +};
>> +
>> #define COMPRESS_DATA_RESERVED_SIZE 5
>> struct compress_data {
>> __le32 clen; /* compressed data size */
>> + __le32 chksum; /* compressed data chksum */
>> __le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
>> u8 cdata[]; /* compressed data */
>> };
>> @@ -3882,6 +3890,9 @@ static inline void set_compress_context(struct inode *inode)
>> F2FS_OPTION(sbi).compress_algorithm;
>> F2FS_I(inode)->i_log_cluster_size =
>> F2FS_OPTION(sbi).compress_log_size;
>> + F2FS_I(inode)->i_compress_flag =
>> + F2FS_OPTION(sbi).compress_chksum ?
>> + 1 << COMPRESS_CHKSUM : 0;
>> F2FS_I(inode)->i_cluster_size =
>> 1 << F2FS_I(inode)->i_log_cluster_size;
>> F2FS_I(inode)->i_flags |= F2FS_COMPR_FL;
>> diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
>> index 657db2fb6739..de8f7fc89efa 100644
>> --- a/fs/f2fs/inode.c
>> +++ b/fs/f2fs/inode.c
>> @@ -456,6 +456,7 @@ static int do_read_inode(struct inode *inode)
>> le64_to_cpu(ri->i_compr_blocks));
>> fi->i_compress_algorithm = ri->i_compress_algorithm;
>> fi->i_log_cluster_size = ri->i_log_cluster_size;
>> + fi->i_compress_flag = ri->i_compress_flag;
>> fi->i_cluster_size = 1 << fi->i_log_cluster_size;
>> set_inode_flag(inode, FI_COMPRESSED_FILE);
>> }
>> @@ -634,6 +635,8 @@ void f2fs_update_inode(struct inode *inode, struct page *node_page)
>> &F2FS_I(inode)->i_compr_blocks));
>> ri->i_compress_algorithm =
>> F2FS_I(inode)->i_compress_algorithm;
>> + ri->i_compress_flag =
>> + cpu_to_le16(F2FS_I(inode)->i_compress_flag);
>> ri->i_log_cluster_size =
>> F2FS_I(inode)->i_log_cluster_size;
>> }
>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>> index 00eff2f51807..f8de4d83a5be 100644
>> --- a/fs/f2fs/super.c
>> +++ b/fs/f2fs/super.c
>> @@ -146,6 +146,7 @@ enum {
>> Opt_compress_algorithm,
>> Opt_compress_log_size,
>> Opt_compress_extension,
>> + Opt_compress_chksum,
>> Opt_atgc,
>> Opt_err,
>> };
>> @@ -214,6 +215,7 @@ static match_table_t f2fs_tokens = {
>> {Opt_compress_algorithm, "compress_algorithm=%s"},
>> {Opt_compress_log_size, "compress_log_size=%u"},
>> {Opt_compress_extension, "compress_extension=%s"},
>> + {Opt_compress_chksum, "compress_chksum"},
>> {Opt_atgc, "atgc"},
>> {Opt_err, NULL},
>> };
>> @@ -934,10 +936,14 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount)
>> F2FS_OPTION(sbi).compress_ext_cnt++;
>> kfree(name);
>> break;
>> + case Opt_compress_chksum:
>> + F2FS_OPTION(sbi).compress_chksum = true;
>> + break;
>> #else
>> case Opt_compress_algorithm:
>> case Opt_compress_log_size:
>> case Opt_compress_extension:
>> + case Opt_compress_chksum:
>> f2fs_info(sbi, "compression options not supported");
>> break;
>> #endif
>> @@ -1523,6 +1529,9 @@ static inline void f2fs_show_compress_options(struct seq_file *seq,
>> seq_printf(seq, ",compress_extension=%s",
>> F2FS_OPTION(sbi).extensions[i]);
>> }
>> +
>> + if (F2FS_OPTION(sbi).compress_chksum)
>> + seq_puts(seq, ",compress_chksum");
>> }
>>
>> static int f2fs_show_options(struct seq_file *seq, struct dentry *root)
>> diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
>> index a5dbb57a687f..7dc2a06cf19a 100644
>> --- a/include/linux/f2fs_fs.h
>> +++ b/include/linux/f2fs_fs.h
>> @@ -273,7 +273,7 @@ struct f2fs_inode {
>> __le64 i_compr_blocks; /* # of compressed blocks */
>> __u8 i_compress_algorithm; /* compress algorithm */
>> __u8 i_log_cluster_size; /* log of cluster size */
>> - __le16 i_padding; /* padding */
>> + __le16 i_compress_flag; /* compress flag */
>> __le32 i_extra_end[0]; /* for attribute size calculation */
>> } __packed;
>> __le32 i_addr[DEF_ADDRS_PER_INODE]; /* Pointers to data blocks */
>> --
>> 2.26.2
> .
>

2020-11-03 02:41:47

by Chao Yu

[permalink] [raw]
Subject: Re: [f2fs-dev] [PATCH] f2fs: compress: support chksum

On 2020/11/3 10:02, Chao Yu wrote:
> On 2020/11/3 0:31, Jaegeuk Kim wrote:
>> On 11/02, Chao Yu wrote:
>>> This patch supports to store chksum value with compressed
>>> data, and verify the integrality of compressed data while
>>> reading the data.
>>>
>>> The feature can be enabled through specifying mount option
>>> 'compress_chksum'.
>>>
>>> Signed-off-by: Chao Yu <[email protected]>
>>> ---
>>> Documentation/filesystems/f2fs.rst | 1 +
>>> fs/f2fs/compress.c | 20 ++++++++++++++++++++
>>> fs/f2fs/f2fs.h | 13 ++++++++++++-
>>> fs/f2fs/inode.c | 3 +++
>>> fs/f2fs/super.c | 9 +++++++++
>>> include/linux/f2fs_fs.h | 2 +-
>>> 6 files changed, 46 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
>>> index b8ee761c9922..985ae7d35066 100644
>>> --- a/Documentation/filesystems/f2fs.rst
>>> +++ b/Documentation/filesystems/f2fs.rst
>>> @@ -260,6 +260,7 @@ compress_extension=%s Support adding specified extension, so that f2fs can enab
>>> For other files, we can still enable compression via ioctl.
>>> Note that, there is one reserved special extension '*', it
>>> can be set to enable compression for all files.
>>> +compress_chksum Support verifying chksum of raw data in compressed cluster.
>>> inlinecrypt When possible, encrypt/decrypt the contents of encrypted
>>> files using the blk-crypto framework rather than
>>> filesystem-layer encryption. This allows the use of
>>> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
>>> index 14262e0f1cd6..a4e0d2c745b6 100644
>>> --- a/fs/f2fs/compress.c
>>> +++ b/fs/f2fs/compress.c
>>> @@ -602,6 +602,7 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
>>> f2fs_cops[fi->i_compress_algorithm];
>>> unsigned int max_len, new_nr_cpages;
>>> struct page **new_cpages;
>>> + u32 chksum = 0;
>>> int i, ret;
>>>
>>> trace_f2fs_compress_pages_start(cc->inode, cc->cluster_idx,
>>> @@ -655,6 +656,11 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
>>>
>>> cc->cbuf->clen = cpu_to_le32(cc->clen);
>>>
>>> + if (fi->i_compress_flag & 1 << COMPRESS_CHKSUM)
>>> + chksum = f2fs_crc32(F2FS_I_SB(cc->inode),
>>> + cc->cbuf->cdata, cc->clen);
>>> + cc->cbuf->chksum = cpu_to_le32(chksum);
>>> +
>>> for (i = 0; i < COMPRESS_DATA_RESERVED_SIZE; i++)
>>> cc->cbuf->reserved[i] = cpu_to_le32(0);
>>>
>>> @@ -721,6 +727,7 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
>>> (struct decompress_io_ctx *)page_private(page);
>>> struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
>>> struct f2fs_inode_info *fi= F2FS_I(dic->inode);
>>> + struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
>>> const struct f2fs_compress_ops *cops =
>>> f2fs_cops[fi->i_compress_algorithm];
>>> int ret;
>>> @@ -790,6 +797,19 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
>>>
>>> ret = cops->decompress_pages(dic);
>>>
>>> + if (!ret && fi->i_compress_flag & 1 << COMPRESS_CHKSUM) {
>>> + u32 provided = le32_to_cpu(dic->cbuf->chksum);
>>> + u32 calculated = f2fs_crc32(sbi, dic->cbuf->cdata, dic->clen);
>>> +
>>> + if (provided != calculated) {
>>> + printk_ratelimited(
>>> + "%sF2FS-fs (%s): checksum invalid, nid = %lu, %x vs %x",
>>> + KERN_INFO, sbi->sb->s_id, dic->inode->i_ino,
>>> + provided, calculated);
>>> + ret = -EFSCORRUPTED;
>>
>> Do we need to change fsck.f2fs to recover this?

However, we don't know which one is correct, compressed data or chksum value?
if compressed data was corrupted, repairing chksum value doesn't help.

Or how about adding chksum values for both raw data and compressed data.

#define COMPRESS_DATA_RESERVED_SIZE 3
struct compress_data {
__le32 clen; /* compressed data size */
+ __le32 raw_chksum; /* raw data chksum */
+ __le32 compress_chksum; /* compressed data chksum */
__le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
u8 cdata[]; /* compressed data */
};

raw_chksum compress_chksum
match match -> data is verified, pass
not match match -> repair raw_chksum
matcth not match -> repair compress_chksum
not match not match -> corrupted, can not repair

Thanks,

>
> Yes, prepared to update inode layout in fsck.f2fs w/ kernel side change. >
> Thanks,
>
>>
>>> + }
>>> + }
>>> +
>>> out_vunmap_cbuf:
>>> vm_unmap_ram(dic->cbuf, dic->nr_cpages);
>>> out_vunmap_rbuf:
>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>> index 99bcf4b44a9c..2ae254ab7b7d 100644
>>> --- a/fs/f2fs/f2fs.h
>>> +++ b/fs/f2fs/f2fs.h
>>> @@ -147,7 +147,8 @@ struct f2fs_mount_info {
>>>
>>> /* For compression */
>>> unsigned char compress_algorithm; /* algorithm type */
>>> - unsigned compress_log_size; /* cluster log size */
>>> + unsigned char compress_log_size; /* cluster log size */
>>> + bool compress_chksum; /* compressed data chksum */
>>> unsigned char compress_ext_cnt; /* extension count */
>>> unsigned char extensions[COMPRESS_EXT_NUM][F2FS_EXTENSION_LEN]; /* extensions */
>>> };
>>> @@ -731,6 +732,7 @@ struct f2fs_inode_info {
>>> atomic_t i_compr_blocks; /* # of compressed blocks */
>>> unsigned char i_compress_algorithm; /* algorithm type */
>>> unsigned char i_log_cluster_size; /* log of cluster size */
>>> + unsigned short i_compress_flag; /* compress flag */
>>> unsigned int i_cluster_size; /* cluster size */
>>> };
>>>
>>> @@ -1270,9 +1272,15 @@ enum compress_algorithm_type {
>>> COMPRESS_MAX,
>>> };
>>>
>>> +enum compress_flag {
>>> + COMPRESS_CHKSUM,
>>> + COMPRESS_MAX_FLAG,
>>> +};
>>> +
>>> #define COMPRESS_DATA_RESERVED_SIZE 5
>>> struct compress_data {
>>> __le32 clen; /* compressed data size */
>>> + __le32 chksum; /* compressed data chksum */
>>> __le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
>>> u8 cdata[]; /* compressed data */
>>> };
>>> @@ -3882,6 +3890,9 @@ static inline void set_compress_context(struct inode *inode)
>>> F2FS_OPTION(sbi).compress_algorithm;
>>> F2FS_I(inode)->i_log_cluster_size =
>>> F2FS_OPTION(sbi).compress_log_size;
>>> + F2FS_I(inode)->i_compress_flag =
>>> + F2FS_OPTION(sbi).compress_chksum ?
>>> + 1 << COMPRESS_CHKSUM : 0;
>>> F2FS_I(inode)->i_cluster_size =
>>> 1 << F2FS_I(inode)->i_log_cluster_size;
>>> F2FS_I(inode)->i_flags |= F2FS_COMPR_FL;
>>> diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
>>> index 657db2fb6739..de8f7fc89efa 100644
>>> --- a/fs/f2fs/inode.c
>>> +++ b/fs/f2fs/inode.c
>>> @@ -456,6 +456,7 @@ static int do_read_inode(struct inode *inode)
>>> le64_to_cpu(ri->i_compr_blocks));
>>> fi->i_compress_algorithm = ri->i_compress_algorithm;
>>> fi->i_log_cluster_size = ri->i_log_cluster_size;
>>> + fi->i_compress_flag = ri->i_compress_flag;
>>> fi->i_cluster_size = 1 << fi->i_log_cluster_size;
>>> set_inode_flag(inode, FI_COMPRESSED_FILE);
>>> }
>>> @@ -634,6 +635,8 @@ void f2fs_update_inode(struct inode *inode, struct page *node_page)
>>> &F2FS_I(inode)->i_compr_blocks));
>>> ri->i_compress_algorithm =
>>> F2FS_I(inode)->i_compress_algorithm;
>>> + ri->i_compress_flag =
>>> + cpu_to_le16(F2FS_I(inode)->i_compress_flag);
>>> ri->i_log_cluster_size =
>>> F2FS_I(inode)->i_log_cluster_size;
>>> }
>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>> index 00eff2f51807..f8de4d83a5be 100644
>>> --- a/fs/f2fs/super.c
>>> +++ b/fs/f2fs/super.c
>>> @@ -146,6 +146,7 @@ enum {
>>> Opt_compress_algorithm,
>>> Opt_compress_log_size,
>>> Opt_compress_extension,
>>> + Opt_compress_chksum,
>>> Opt_atgc,
>>> Opt_err,
>>> };
>>> @@ -214,6 +215,7 @@ static match_table_t f2fs_tokens = {
>>> {Opt_compress_algorithm, "compress_algorithm=%s"},
>>> {Opt_compress_log_size, "compress_log_size=%u"},
>>> {Opt_compress_extension, "compress_extension=%s"},
>>> + {Opt_compress_chksum, "compress_chksum"},
>>> {Opt_atgc, "atgc"},
>>> {Opt_err, NULL},
>>> };
>>> @@ -934,10 +936,14 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount)
>>> F2FS_OPTION(sbi).compress_ext_cnt++;
>>> kfree(name);
>>> break;
>>> + case Opt_compress_chksum:
>>> + F2FS_OPTION(sbi).compress_chksum = true;
>>> + break;
>>> #else
>>> case Opt_compress_algorithm:
>>> case Opt_compress_log_size:
>>> case Opt_compress_extension:
>>> + case Opt_compress_chksum:
>>> f2fs_info(sbi, "compression options not supported");
>>> break;
>>> #endif
>>> @@ -1523,6 +1529,9 @@ static inline void f2fs_show_compress_options(struct seq_file *seq,
>>> seq_printf(seq, ",compress_extension=%s",
>>> F2FS_OPTION(sbi).extensions[i]);
>>> }
>>> +
>>> + if (F2FS_OPTION(sbi).compress_chksum)
>>> + seq_puts(seq, ",compress_chksum");
>>> }
>>>
>>> static int f2fs_show_options(struct seq_file *seq, struct dentry *root)
>>> diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
>>> index a5dbb57a687f..7dc2a06cf19a 100644
>>> --- a/include/linux/f2fs_fs.h
>>> +++ b/include/linux/f2fs_fs.h
>>> @@ -273,7 +273,7 @@ struct f2fs_inode {
>>> __le64 i_compr_blocks; /* # of compressed blocks */
>>> __u8 i_compress_algorithm; /* compress algorithm */
>>> __u8 i_log_cluster_size; /* log of cluster size */
>>> - __le16 i_padding; /* padding */
>>> + __le16 i_compress_flag; /* compress flag */
>>> __le32 i_extra_end[0]; /* for attribute size calculation */
>>> } __packed;
>>> __le32 i_addr[DEF_ADDRS_PER_INODE]; /* Pointers to data blocks */
>>> --
>>> 2.26.2
>> .
>>
>
>
> _______________________________________________
> Linux-f2fs-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> .
>

2020-11-03 10:37:39

by David Laight

[permalink] [raw]
Subject: RE: [f2fs-dev] [PATCH] f2fs: compress: support chksum

From: Chao Yu
> Sent: 03 November 2020 02:37
...
> >> Do we need to change fsck.f2fs to recover this?
>
> However, we don't know which one is correct, compressed data or chksum value?
> if compressed data was corrupted, repairing chksum value doesn't help.
>
> Or how about adding chksum values for both raw data and compressed data.

What errors are you trying to detect?

If there are errors in the data then 'fixing' the checksum is pointless.
(You've got garbage data - might as well not have the checksum).

If you are worried about the implementation of the compression
algorithm then a checksum of the raw data is needed.

If you want to try error correcting burst errors in the compressed
data then a crc of the compressed data can be used for error correction.

OTOH the most likely error is that the file meta-data and data sector
weren't both committed to disk when the system crashed.
In which case the checksum has done its job and the file is corrupt.
fsck should probably move the file to 'lost+found' for manual checking.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2020-11-03 11:55:31

by Chao Yu

[permalink] [raw]
Subject: Re: [f2fs-dev] [PATCH] f2fs: compress: support chksum

On 2020/11/3 18:35, David Laight wrote:
> From: Chao Yu
>> Sent: 03 November 2020 02:37
> ...
>>>> Do we need to change fsck.f2fs to recover this?
>>
>> However, we don't know which one is correct, compressed data or chksum value?
>> if compressed data was corrupted, repairing chksum value doesn't help.
>>
>> Or how about adding chksum values for both raw data and compressed data.
>
> What errors are you trying to detect?

Hi,

The original intention of adding this checksum feature is for code debug
purpose when I develop compress framework in f2fs and add more compress
algorithms into the framework, it helps to find obvious implementation
bug, however this checksum feature was not fully designed, so that I didn't
upstream it at that time.

One other concern is to find any mismatch between original raw data and
persisted data, no matter how it becomes to mismatched and then return
error code to user if it detects the mismatch.

And then fsck can repair mismatched chksum in the condition one persisted
chksum matchs to calculated one, and one other doesn't.

Thanks,

>
> If there are errors in the data then 'fixing' the checksum is pointless.
> (You've got garbage data - might as well not have the checksum).
>
> If you are worried about the implementation of the compression
> algorithm then a checksum of the raw data is needed.
>
> If you want to try error correcting burst errors in the compressed
> data then a crc of the compressed data can be used for error correction.
>
> OTOH the most likely error is that the file meta-data and data sector
> weren't both committed to disk when the system crashed.
> In which case the checksum has done its job and the file is corrupt.
> fsck should probably move the file to 'lost+found' for manual checking.
>
> David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
>
> .
>

2020-11-05 04:34:31

by Chao Yu

[permalink] [raw]
Subject: Re: [f2fs-dev] [PATCH] f2fs: compress: support chksum

Jaegeuk,

Any comments?

On 2020/11/3 10:36, Chao Yu wrote:
> On 2020/11/3 10:02, Chao Yu wrote:
>> On 2020/11/3 0:31, Jaegeuk Kim wrote:
>>> On 11/02, Chao Yu wrote:
>>>> This patch supports to store chksum value with compressed
>>>> data, and verify the integrality of compressed data while
>>>> reading the data.
>>>>
>>>> The feature can be enabled through specifying mount option
>>>> 'compress_chksum'.
>>>>
>>>> Signed-off-by: Chao Yu <[email protected]>
>>>> ---
>>>> Documentation/filesystems/f2fs.rst | 1 +
>>>> fs/f2fs/compress.c | 20 ++++++++++++++++++++
>>>> fs/f2fs/f2fs.h | 13 ++++++++++++-
>>>> fs/f2fs/inode.c | 3 +++
>>>> fs/f2fs/super.c | 9 +++++++++
>>>> include/linux/f2fs_fs.h | 2 +-
>>>> 6 files changed, 46 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
>>>> index b8ee761c9922..985ae7d35066 100644
>>>> --- a/Documentation/filesystems/f2fs.rst
>>>> +++ b/Documentation/filesystems/f2fs.rst
>>>> @@ -260,6 +260,7 @@ compress_extension=%s Support adding specified extension, so that f2fs can enab
>>>> For other files, we can still enable compression via ioctl.
>>>> Note that, there is one reserved special extension '*', it
>>>> can be set to enable compression for all files.
>>>> +compress_chksum Support verifying chksum of raw data in compressed cluster.
>>>> inlinecrypt When possible, encrypt/decrypt the contents of encrypted
>>>> files using the blk-crypto framework rather than
>>>> filesystem-layer encryption. This allows the use of
>>>> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
>>>> index 14262e0f1cd6..a4e0d2c745b6 100644
>>>> --- a/fs/f2fs/compress.c
>>>> +++ b/fs/f2fs/compress.c
>>>> @@ -602,6 +602,7 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
>>>> f2fs_cops[fi->i_compress_algorithm];
>>>> unsigned int max_len, new_nr_cpages;
>>>> struct page **new_cpages;
>>>> + u32 chksum = 0;
>>>> int i, ret;
>>>>
>>>> trace_f2fs_compress_pages_start(cc->inode, cc->cluster_idx,
>>>> @@ -655,6 +656,11 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
>>>>
>>>> cc->cbuf->clen = cpu_to_le32(cc->clen);
>>>>
>>>> + if (fi->i_compress_flag & 1 << COMPRESS_CHKSUM)
>>>> + chksum = f2fs_crc32(F2FS_I_SB(cc->inode),
>>>> + cc->cbuf->cdata, cc->clen);
>>>> + cc->cbuf->chksum = cpu_to_le32(chksum);
>>>> +
>>>> for (i = 0; i < COMPRESS_DATA_RESERVED_SIZE; i++)
>>>> cc->cbuf->reserved[i] = cpu_to_le32(0);
>>>>
>>>> @@ -721,6 +727,7 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
>>>> (struct decompress_io_ctx *)page_private(page);
>>>> struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
>>>> struct f2fs_inode_info *fi= F2FS_I(dic->inode);
>>>> + struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
>>>> const struct f2fs_compress_ops *cops =
>>>> f2fs_cops[fi->i_compress_algorithm];
>>>> int ret;
>>>> @@ -790,6 +797,19 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
>>>>
>>>> ret = cops->decompress_pages(dic);
>>>>
>>>> + if (!ret && fi->i_compress_flag & 1 << COMPRESS_CHKSUM) {
>>>> + u32 provided = le32_to_cpu(dic->cbuf->chksum);
>>>> + u32 calculated = f2fs_crc32(sbi, dic->cbuf->cdata, dic->clen);
>>>> +
>>>> + if (provided != calculated) {
>>>> + printk_ratelimited(
>>>> + "%sF2FS-fs (%s): checksum invalid, nid = %lu, %x vs %x",
>>>> + KERN_INFO, sbi->sb->s_id, dic->inode->i_ino,
>>>> + provided, calculated);
>>>> + ret = -EFSCORRUPTED;
>>>
>>> Do we need to change fsck.f2fs to recover this?
>
> However, we don't know which one is correct, compressed data or chksum value?
> if compressed data was corrupted, repairing chksum value doesn't help.
>
> Or how about adding chksum values for both raw data and compressed data.
>
> #define COMPRESS_DATA_RESERVED_SIZE 3
> struct compress_data {
> __le32 clen; /* compressed data size */
> + __le32 raw_chksum; /* raw data chksum */
> + __le32 compress_chksum; /* compressed data chksum */
> __le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
> u8 cdata[]; /* compressed data */
> };
>
> raw_chksum compress_chksum
> match match -> data is verified, pass
> not match match -> repair raw_chksum
> matcth not match -> repair compress_chksum
> not match not match -> corrupted, can not repair
>
> Thanks,
>
>>
>> Yes, prepared to update inode layout in fsck.f2fs w/ kernel side change. >
>> Thanks,
>>
>>>
>>>> + }
>>>> + }
>>>> +
>>>> out_vunmap_cbuf:
>>>> vm_unmap_ram(dic->cbuf, dic->nr_cpages);
>>>> out_vunmap_rbuf:
>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>>> index 99bcf4b44a9c..2ae254ab7b7d 100644
>>>> --- a/fs/f2fs/f2fs.h
>>>> +++ b/fs/f2fs/f2fs.h
>>>> @@ -147,7 +147,8 @@ struct f2fs_mount_info {
>>>>
>>>> /* For compression */
>>>> unsigned char compress_algorithm; /* algorithm type */
>>>> - unsigned compress_log_size; /* cluster log size */
>>>> + unsigned char compress_log_size; /* cluster log size */
>>>> + bool compress_chksum; /* compressed data chksum */
>>>> unsigned char compress_ext_cnt; /* extension count */
>>>> unsigned char extensions[COMPRESS_EXT_NUM][F2FS_EXTENSION_LEN]; /* extensions */
>>>> };
>>>> @@ -731,6 +732,7 @@ struct f2fs_inode_info {
>>>> atomic_t i_compr_blocks; /* # of compressed blocks */
>>>> unsigned char i_compress_algorithm; /* algorithm type */
>>>> unsigned char i_log_cluster_size; /* log of cluster size */
>>>> + unsigned short i_compress_flag; /* compress flag */
>>>> unsigned int i_cluster_size; /* cluster size */
>>>> };
>>>>
>>>> @@ -1270,9 +1272,15 @@ enum compress_algorithm_type {
>>>> COMPRESS_MAX,
>>>> };
>>>>
>>>> +enum compress_flag {
>>>> + COMPRESS_CHKSUM,
>>>> + COMPRESS_MAX_FLAG,
>>>> +};
>>>> +
>>>> #define COMPRESS_DATA_RESERVED_SIZE 5
>>>> struct compress_data {
>>>> __le32 clen; /* compressed data size */
>>>> + __le32 chksum; /* compressed data chksum */
>>>> __le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
>>>> u8 cdata[]; /* compressed data */
>>>> };
>>>> @@ -3882,6 +3890,9 @@ static inline void set_compress_context(struct inode *inode)
>>>> F2FS_OPTION(sbi).compress_algorithm;
>>>> F2FS_I(inode)->i_log_cluster_size =
>>>> F2FS_OPTION(sbi).compress_log_size;
>>>> + F2FS_I(inode)->i_compress_flag =
>>>> + F2FS_OPTION(sbi).compress_chksum ?
>>>> + 1 << COMPRESS_CHKSUM : 0;
>>>> F2FS_I(inode)->i_cluster_size =
>>>> 1 << F2FS_I(inode)->i_log_cluster_size;
>>>> F2FS_I(inode)->i_flags |= F2FS_COMPR_FL;
>>>> diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
>>>> index 657db2fb6739..de8f7fc89efa 100644
>>>> --- a/fs/f2fs/inode.c
>>>> +++ b/fs/f2fs/inode.c
>>>> @@ -456,6 +456,7 @@ static int do_read_inode(struct inode *inode)
>>>> le64_to_cpu(ri->i_compr_blocks));
>>>> fi->i_compress_algorithm = ri->i_compress_algorithm;
>>>> fi->i_log_cluster_size = ri->i_log_cluster_size;
>>>> + fi->i_compress_flag = ri->i_compress_flag;
>>>> fi->i_cluster_size = 1 << fi->i_log_cluster_size;
>>>> set_inode_flag(inode, FI_COMPRESSED_FILE);
>>>> }
>>>> @@ -634,6 +635,8 @@ void f2fs_update_inode(struct inode *inode, struct page *node_page)
>>>> &F2FS_I(inode)->i_compr_blocks));
>>>> ri->i_compress_algorithm =
>>>> F2FS_I(inode)->i_compress_algorithm;
>>>> + ri->i_compress_flag =
>>>> + cpu_to_le16(F2FS_I(inode)->i_compress_flag);
>>>> ri->i_log_cluster_size =
>>>> F2FS_I(inode)->i_log_cluster_size;
>>>> }
>>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>>> index 00eff2f51807..f8de4d83a5be 100644
>>>> --- a/fs/f2fs/super.c
>>>> +++ b/fs/f2fs/super.c
>>>> @@ -146,6 +146,7 @@ enum {
>>>> Opt_compress_algorithm,
>>>> Opt_compress_log_size,
>>>> Opt_compress_extension,
>>>> + Opt_compress_chksum,
>>>> Opt_atgc,
>>>> Opt_err,
>>>> };
>>>> @@ -214,6 +215,7 @@ static match_table_t f2fs_tokens = {
>>>> {Opt_compress_algorithm, "compress_algorithm=%s"},
>>>> {Opt_compress_log_size, "compress_log_size=%u"},
>>>> {Opt_compress_extension, "compress_extension=%s"},
>>>> + {Opt_compress_chksum, "compress_chksum"},
>>>> {Opt_atgc, "atgc"},
>>>> {Opt_err, NULL},
>>>> };
>>>> @@ -934,10 +936,14 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount)
>>>> F2FS_OPTION(sbi).compress_ext_cnt++;
>>>> kfree(name);
>>>> break;
>>>> + case Opt_compress_chksum:
>>>> + F2FS_OPTION(sbi).compress_chksum = true;
>>>> + break;
>>>> #else
>>>> case Opt_compress_algorithm:
>>>> case Opt_compress_log_size:
>>>> case Opt_compress_extension:
>>>> + case Opt_compress_chksum:
>>>> f2fs_info(sbi, "compression options not supported");
>>>> break;
>>>> #endif
>>>> @@ -1523,6 +1529,9 @@ static inline void f2fs_show_compress_options(struct seq_file *seq,
>>>> seq_printf(seq, ",compress_extension=%s",
>>>> F2FS_OPTION(sbi).extensions[i]);
>>>> }
>>>> +
>>>> + if (F2FS_OPTION(sbi).compress_chksum)
>>>> + seq_puts(seq, ",compress_chksum");
>>>> }
>>>>
>>>> static int f2fs_show_options(struct seq_file *seq, struct dentry *root)
>>>> diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
>>>> index a5dbb57a687f..7dc2a06cf19a 100644
>>>> --- a/include/linux/f2fs_fs.h
>>>> +++ b/include/linux/f2fs_fs.h
>>>> @@ -273,7 +273,7 @@ struct f2fs_inode {
>>>> __le64 i_compr_blocks; /* # of compressed blocks */
>>>> __u8 i_compress_algorithm; /* compress algorithm */
>>>> __u8 i_log_cluster_size; /* log of cluster size */
>>>> - __le16 i_padding; /* padding */
>>>> + __le16 i_compress_flag; /* compress flag */
>>>> __le32 i_extra_end[0]; /* for attribute size calculation */
>>>> } __packed;
>>>> __le32 i_addr[DEF_ADDRS_PER_INODE]; /* Pointers to data blocks */
>>>> --
>>>> 2.26.2
>>> .
>>>
>>
>>
>> _______________________________________________
>> Linux-f2fs-devel mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>> .
>>
>
>
> _______________________________________________
> Linux-f2fs-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> .
>

2020-11-06 21:15:29

by Jaegeuk Kim

[permalink] [raw]
Subject: Re: [f2fs-dev] [PATCH] f2fs: compress: support chksum

On 11/03, Chao Yu wrote:
> On 2020/11/3 10:02, Chao Yu wrote:
> > On 2020/11/3 0:31, Jaegeuk Kim wrote:
> > > On 11/02, Chao Yu wrote:
> > > > This patch supports to store chksum value with compressed
> > > > data, and verify the integrality of compressed data while
> > > > reading the data.
> > > >
> > > > The feature can be enabled through specifying mount option
> > > > 'compress_chksum'.
> > > >
> > > > Signed-off-by: Chao Yu <[email protected]>
> > > > ---
> > > > Documentation/filesystems/f2fs.rst | 1 +
> > > > fs/f2fs/compress.c | 20 ++++++++++++++++++++
> > > > fs/f2fs/f2fs.h | 13 ++++++++++++-
> > > > fs/f2fs/inode.c | 3 +++
> > > > fs/f2fs/super.c | 9 +++++++++
> > > > include/linux/f2fs_fs.h | 2 +-
> > > > 6 files changed, 46 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
> > > > index b8ee761c9922..985ae7d35066 100644
> > > > --- a/Documentation/filesystems/f2fs.rst
> > > > +++ b/Documentation/filesystems/f2fs.rst
> > > > @@ -260,6 +260,7 @@ compress_extension=%s Support adding specified extension, so that f2fs can enab
> > > > For other files, we can still enable compression via ioctl.
> > > > Note that, there is one reserved special extension '*', it
> > > > can be set to enable compression for all files.
> > > > +compress_chksum Support verifying chksum of raw data in compressed cluster.
> > > > inlinecrypt When possible, encrypt/decrypt the contents of encrypted
> > > > files using the blk-crypto framework rather than
> > > > filesystem-layer encryption. This allows the use of
> > > > diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
> > > > index 14262e0f1cd6..a4e0d2c745b6 100644
> > > > --- a/fs/f2fs/compress.c
> > > > +++ b/fs/f2fs/compress.c
> > > > @@ -602,6 +602,7 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
> > > > f2fs_cops[fi->i_compress_algorithm];
> > > > unsigned int max_len, new_nr_cpages;
> > > > struct page **new_cpages;
> > > > + u32 chksum = 0;
> > > > int i, ret;
> > > > trace_f2fs_compress_pages_start(cc->inode, cc->cluster_idx,
> > > > @@ -655,6 +656,11 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
> > > > cc->cbuf->clen = cpu_to_le32(cc->clen);
> > > > + if (fi->i_compress_flag & 1 << COMPRESS_CHKSUM)
> > > > + chksum = f2fs_crc32(F2FS_I_SB(cc->inode),
> > > > + cc->cbuf->cdata, cc->clen);
> > > > + cc->cbuf->chksum = cpu_to_le32(chksum);
> > > > +
> > > > for (i = 0; i < COMPRESS_DATA_RESERVED_SIZE; i++)
> > > > cc->cbuf->reserved[i] = cpu_to_le32(0);
> > > > @@ -721,6 +727,7 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
> > > > (struct decompress_io_ctx *)page_private(page);
> > > > struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
> > > > struct f2fs_inode_info *fi= F2FS_I(dic->inode);
> > > > + struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
> > > > const struct f2fs_compress_ops *cops =
> > > > f2fs_cops[fi->i_compress_algorithm];
> > > > int ret;
> > > > @@ -790,6 +797,19 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
> > > > ret = cops->decompress_pages(dic);
> > > > + if (!ret && fi->i_compress_flag & 1 << COMPRESS_CHKSUM) {
> > > > + u32 provided = le32_to_cpu(dic->cbuf->chksum);
> > > > + u32 calculated = f2fs_crc32(sbi, dic->cbuf->cdata, dic->clen);
> > > > +
> > > > + if (provided != calculated) {
> > > > + printk_ratelimited(
> > > > + "%sF2FS-fs (%s): checksum invalid, nid = %lu, %x vs %x",
> > > > + KERN_INFO, sbi->sb->s_id, dic->inode->i_ino,
> > > > + provided, calculated);
> > > > + ret = -EFSCORRUPTED;
> > >
> > > Do we need to change fsck.f2fs to recover this?
>
> However, we don't know which one is correct, compressed data or chksum value?
> if compressed data was corrupted, repairing chksum value doesn't help.
>
> Or how about adding chksum values for both raw data and compressed data.
>
> #define COMPRESS_DATA_RESERVED_SIZE 3
> struct compress_data {
> __le32 clen; /* compressed data size */
> + __le32 raw_chksum; /* raw data chksum */
> + __le32 compress_chksum; /* compressed data chksum */
> __le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
> u8 cdata[]; /* compressed data */
> };
>
> raw_chksum compress_chksum
> match match -> data is verified, pass
> not match match -> repair raw_chksum
> matcth not match -> repair compress_chksum

I think only compress_chksum would be enough. BTW, can we give WARN_ON and
marking a FSCK flag without returning EFSCORRUPTED, since we don't really
know who was corrupted. If data was corrupted, we should be able to see app
corruption. In that case, we can check the kernel log. If checksum was simply
corrupted, next fsck will fix the checksum. So, in general, I hope to keep
the data as is and raise a flag by the checksum.

> not match not match -> corrupted, can not repair
>
> Thanks,
>
> >
> > Yes, prepared to update inode layout in fsck.f2fs w/ kernel side change. >
> > Thanks,
> >
> > >
> > > > + }
> > > > + }
> > > > +
> > > > out_vunmap_cbuf:
> > > > vm_unmap_ram(dic->cbuf, dic->nr_cpages);
> > > > out_vunmap_rbuf:
> > > > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > > > index 99bcf4b44a9c..2ae254ab7b7d 100644
> > > > --- a/fs/f2fs/f2fs.h
> > > > +++ b/fs/f2fs/f2fs.h
> > > > @@ -147,7 +147,8 @@ struct f2fs_mount_info {
> > > > /* For compression */
> > > > unsigned char compress_algorithm; /* algorithm type */
> > > > - unsigned compress_log_size; /* cluster log size */
> > > > + unsigned char compress_log_size; /* cluster log size */
> > > > + bool compress_chksum; /* compressed data chksum */
> > > > unsigned char compress_ext_cnt; /* extension count */
> > > > unsigned char extensions[COMPRESS_EXT_NUM][F2FS_EXTENSION_LEN]; /* extensions */
> > > > };
> > > > @@ -731,6 +732,7 @@ struct f2fs_inode_info {
> > > > atomic_t i_compr_blocks; /* # of compressed blocks */
> > > > unsigned char i_compress_algorithm; /* algorithm type */
> > > > unsigned char i_log_cluster_size; /* log of cluster size */
> > > > + unsigned short i_compress_flag; /* compress flag */
> > > > unsigned int i_cluster_size; /* cluster size */
> > > > };
> > > > @@ -1270,9 +1272,15 @@ enum compress_algorithm_type {
> > > > COMPRESS_MAX,
> > > > };
> > > > +enum compress_flag {
> > > > + COMPRESS_CHKSUM,
> > > > + COMPRESS_MAX_FLAG,
> > > > +};
> > > > +
> > > > #define COMPRESS_DATA_RESERVED_SIZE 5
> > > > struct compress_data {
> > > > __le32 clen; /* compressed data size */
> > > > + __le32 chksum; /* compressed data chksum */
> > > > __le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
> > > > u8 cdata[]; /* compressed data */
> > > > };
> > > > @@ -3882,6 +3890,9 @@ static inline void set_compress_context(struct inode *inode)
> > > > F2FS_OPTION(sbi).compress_algorithm;
> > > > F2FS_I(inode)->i_log_cluster_size =
> > > > F2FS_OPTION(sbi).compress_log_size;
> > > > + F2FS_I(inode)->i_compress_flag =
> > > > + F2FS_OPTION(sbi).compress_chksum ?
> > > > + 1 << COMPRESS_CHKSUM : 0;
> > > > F2FS_I(inode)->i_cluster_size =
> > > > 1 << F2FS_I(inode)->i_log_cluster_size;
> > > > F2FS_I(inode)->i_flags |= F2FS_COMPR_FL;
> > > > diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
> > > > index 657db2fb6739..de8f7fc89efa 100644
> > > > --- a/fs/f2fs/inode.c
> > > > +++ b/fs/f2fs/inode.c
> > > > @@ -456,6 +456,7 @@ static int do_read_inode(struct inode *inode)
> > > > le64_to_cpu(ri->i_compr_blocks));
> > > > fi->i_compress_algorithm = ri->i_compress_algorithm;
> > > > fi->i_log_cluster_size = ri->i_log_cluster_size;
> > > > + fi->i_compress_flag = ri->i_compress_flag;
> > > > fi->i_cluster_size = 1 << fi->i_log_cluster_size;
> > > > set_inode_flag(inode, FI_COMPRESSED_FILE);
> > > > }
> > > > @@ -634,6 +635,8 @@ void f2fs_update_inode(struct inode *inode, struct page *node_page)
> > > > &F2FS_I(inode)->i_compr_blocks));
> > > > ri->i_compress_algorithm =
> > > > F2FS_I(inode)->i_compress_algorithm;
> > > > + ri->i_compress_flag =
> > > > + cpu_to_le16(F2FS_I(inode)->i_compress_flag);
> > > > ri->i_log_cluster_size =
> > > > F2FS_I(inode)->i_log_cluster_size;
> > > > }
> > > > diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> > > > index 00eff2f51807..f8de4d83a5be 100644
> > > > --- a/fs/f2fs/super.c
> > > > +++ b/fs/f2fs/super.c
> > > > @@ -146,6 +146,7 @@ enum {
> > > > Opt_compress_algorithm,
> > > > Opt_compress_log_size,
> > > > Opt_compress_extension,
> > > > + Opt_compress_chksum,
> > > > Opt_atgc,
> > > > Opt_err,
> > > > };
> > > > @@ -214,6 +215,7 @@ static match_table_t f2fs_tokens = {
> > > > {Opt_compress_algorithm, "compress_algorithm=%s"},
> > > > {Opt_compress_log_size, "compress_log_size=%u"},
> > > > {Opt_compress_extension, "compress_extension=%s"},
> > > > + {Opt_compress_chksum, "compress_chksum"},
> > > > {Opt_atgc, "atgc"},
> > > > {Opt_err, NULL},
> > > > };
> > > > @@ -934,10 +936,14 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount)
> > > > F2FS_OPTION(sbi).compress_ext_cnt++;
> > > > kfree(name);
> > > > break;
> > > > + case Opt_compress_chksum:
> > > > + F2FS_OPTION(sbi).compress_chksum = true;
> > > > + break;
> > > > #else
> > > > case Opt_compress_algorithm:
> > > > case Opt_compress_log_size:
> > > > case Opt_compress_extension:
> > > > + case Opt_compress_chksum:
> > > > f2fs_info(sbi, "compression options not supported");
> > > > break;
> > > > #endif
> > > > @@ -1523,6 +1529,9 @@ static inline void f2fs_show_compress_options(struct seq_file *seq,
> > > > seq_printf(seq, ",compress_extension=%s",
> > > > F2FS_OPTION(sbi).extensions[i]);
> > > > }
> > > > +
> > > > + if (F2FS_OPTION(sbi).compress_chksum)
> > > > + seq_puts(seq, ",compress_chksum");
> > > > }
> > > > static int f2fs_show_options(struct seq_file *seq, struct dentry *root)
> > > > diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
> > > > index a5dbb57a687f..7dc2a06cf19a 100644
> > > > --- a/include/linux/f2fs_fs.h
> > > > +++ b/include/linux/f2fs_fs.h
> > > > @@ -273,7 +273,7 @@ struct f2fs_inode {
> > > > __le64 i_compr_blocks; /* # of compressed blocks */
> > > > __u8 i_compress_algorithm; /* compress algorithm */
> > > > __u8 i_log_cluster_size; /* log of cluster size */
> > > > - __le16 i_padding; /* padding */
> > > > + __le16 i_compress_flag; /* compress flag */
> > > > __le32 i_extra_end[0]; /* for attribute size calculation */
> > > > } __packed;
> > > > __le32 i_addr[DEF_ADDRS_PER_INODE]; /* Pointers to data blocks */
> > > > --
> > > > 2.26.2
> > > .
> > >
> >
> >
> > _______________________________________________
> > Linux-f2fs-devel mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> > .
> >

2020-11-09 01:52:32

by Chao Yu

[permalink] [raw]
Subject: Re: [f2fs-dev] [PATCH] f2fs: compress: support chksum

On 2020/11/7 5:12, Jaegeuk Kim wrote:
> On 11/03, Chao Yu wrote:
>> On 2020/11/3 10:02, Chao Yu wrote:
>>> On 2020/11/3 0:31, Jaegeuk Kim wrote:
>>>> On 11/02, Chao Yu wrote:
>>>>> This patch supports to store chksum value with compressed
>>>>> data, and verify the integrality of compressed data while
>>>>> reading the data.
>>>>>
>>>>> The feature can be enabled through specifying mount option
>>>>> 'compress_chksum'.
>>>>>
>>>>> Signed-off-by: Chao Yu <[email protected]>
>>>>> ---
>>>>> Documentation/filesystems/f2fs.rst | 1 +
>>>>> fs/f2fs/compress.c | 20 ++++++++++++++++++++
>>>>> fs/f2fs/f2fs.h | 13 ++++++++++++-
>>>>> fs/f2fs/inode.c | 3 +++
>>>>> fs/f2fs/super.c | 9 +++++++++
>>>>> include/linux/f2fs_fs.h | 2 +-
>>>>> 6 files changed, 46 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
>>>>> index b8ee761c9922..985ae7d35066 100644
>>>>> --- a/Documentation/filesystems/f2fs.rst
>>>>> +++ b/Documentation/filesystems/f2fs.rst
>>>>> @@ -260,6 +260,7 @@ compress_extension=%s Support adding specified extension, so that f2fs can enab
>>>>> For other files, we can still enable compression via ioctl.
>>>>> Note that, there is one reserved special extension '*', it
>>>>> can be set to enable compression for all files.
>>>>> +compress_chksum Support verifying chksum of raw data in compressed cluster.
>>>>> inlinecrypt When possible, encrypt/decrypt the contents of encrypted
>>>>> files using the blk-crypto framework rather than
>>>>> filesystem-layer encryption. This allows the use of
>>>>> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
>>>>> index 14262e0f1cd6..a4e0d2c745b6 100644
>>>>> --- a/fs/f2fs/compress.c
>>>>> +++ b/fs/f2fs/compress.c
>>>>> @@ -602,6 +602,7 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
>>>>> f2fs_cops[fi->i_compress_algorithm];
>>>>> unsigned int max_len, new_nr_cpages;
>>>>> struct page **new_cpages;
>>>>> + u32 chksum = 0;
>>>>> int i, ret;
>>>>> trace_f2fs_compress_pages_start(cc->inode, cc->cluster_idx,
>>>>> @@ -655,6 +656,11 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
>>>>> cc->cbuf->clen = cpu_to_le32(cc->clen);
>>>>> + if (fi->i_compress_flag & 1 << COMPRESS_CHKSUM)
>>>>> + chksum = f2fs_crc32(F2FS_I_SB(cc->inode),
>>>>> + cc->cbuf->cdata, cc->clen);
>>>>> + cc->cbuf->chksum = cpu_to_le32(chksum);
>>>>> +
>>>>> for (i = 0; i < COMPRESS_DATA_RESERVED_SIZE; i++)
>>>>> cc->cbuf->reserved[i] = cpu_to_le32(0);
>>>>> @@ -721,6 +727,7 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
>>>>> (struct decompress_io_ctx *)page_private(page);
>>>>> struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
>>>>> struct f2fs_inode_info *fi= F2FS_I(dic->inode);
>>>>> + struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
>>>>> const struct f2fs_compress_ops *cops =
>>>>> f2fs_cops[fi->i_compress_algorithm];
>>>>> int ret;
>>>>> @@ -790,6 +797,19 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
>>>>> ret = cops->decompress_pages(dic);
>>>>> + if (!ret && fi->i_compress_flag & 1 << COMPRESS_CHKSUM) {
>>>>> + u32 provided = le32_to_cpu(dic->cbuf->chksum);
>>>>> + u32 calculated = f2fs_crc32(sbi, dic->cbuf->cdata, dic->clen);
>>>>> +
>>>>> + if (provided != calculated) {
>>>>> + printk_ratelimited(
>>>>> + "%sF2FS-fs (%s): checksum invalid, nid = %lu, %x vs %x",
>>>>> + KERN_INFO, sbi->sb->s_id, dic->inode->i_ino,
>>>>> + provided, calculated);
>>>>> + ret = -EFSCORRUPTED;
>>>>
>>>> Do we need to change fsck.f2fs to recover this?
>>
>> However, we don't know which one is correct, compressed data or chksum value?
>> if compressed data was corrupted, repairing chksum value doesn't help.
>>
>> Or how about adding chksum values for both raw data and compressed data.
>>
>> #define COMPRESS_DATA_RESERVED_SIZE 3
>> struct compress_data {
>> __le32 clen; /* compressed data size */
>> + __le32 raw_chksum; /* raw data chksum */
>> + __le32 compress_chksum; /* compressed data chksum */
>> __le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
>> u8 cdata[]; /* compressed data */
>> };
>>
>> raw_chksum compress_chksum
>> match match -> data is verified, pass
>> not match match -> repair raw_chksum
>> matcth not match -> repair compress_chksum
>
> I think only compress_chksum would be enough. BTW, can we give WARN_ON and
> marking a FSCK flag without returning EFSCORRUPTED, since we don't really
> know who was corrupted. If data was corrupted, we should be able to see app
> corruption. In that case, we can check the kernel log. If checksum was simply

I don't think that app will always corrupt once data was corrupted, I doubt its
behavior could be slightly abnormal in some cases, e.g. it can just cause apps to
show wrong number in interaction interface.

In this case, if we fix chksum in fsck, the wrong data will never be found due to
data's chksum matches data itself after repair.

IMO, the chksum and data was a whole dataset, once they are mismatch, we can
not trust either of them, fsck should do nothing on them unless we store parity
bits or replica.

Thanks,

> corrupted, next fsck will fix the checksum. So, in general, I hope to keep
> the data as is and raise a flag by the checksum.
>
>> not match not match -> corrupted, can not repair
>>
>> Thanks,
>>
>>>
>>> Yes, prepared to update inode layout in fsck.f2fs w/ kernel side change. >
>>> Thanks,
>>>
>>>>
>>>>> + }
>>>>> + }
>>>>> +
>>>>> out_vunmap_cbuf:
>>>>> vm_unmap_ram(dic->cbuf, dic->nr_cpages);
>>>>> out_vunmap_rbuf:
>>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>>>> index 99bcf4b44a9c..2ae254ab7b7d 100644
>>>>> --- a/fs/f2fs/f2fs.h
>>>>> +++ b/fs/f2fs/f2fs.h
>>>>> @@ -147,7 +147,8 @@ struct f2fs_mount_info {
>>>>> /* For compression */
>>>>> unsigned char compress_algorithm; /* algorithm type */
>>>>> - unsigned compress_log_size; /* cluster log size */
>>>>> + unsigned char compress_log_size; /* cluster log size */
>>>>> + bool compress_chksum; /* compressed data chksum */
>>>>> unsigned char compress_ext_cnt; /* extension count */
>>>>> unsigned char extensions[COMPRESS_EXT_NUM][F2FS_EXTENSION_LEN]; /* extensions */
>>>>> };
>>>>> @@ -731,6 +732,7 @@ struct f2fs_inode_info {
>>>>> atomic_t i_compr_blocks; /* # of compressed blocks */
>>>>> unsigned char i_compress_algorithm; /* algorithm type */
>>>>> unsigned char i_log_cluster_size; /* log of cluster size */
>>>>> + unsigned short i_compress_flag; /* compress flag */
>>>>> unsigned int i_cluster_size; /* cluster size */
>>>>> };
>>>>> @@ -1270,9 +1272,15 @@ enum compress_algorithm_type {
>>>>> COMPRESS_MAX,
>>>>> };
>>>>> +enum compress_flag {
>>>>> + COMPRESS_CHKSUM,
>>>>> + COMPRESS_MAX_FLAG,
>>>>> +};
>>>>> +
>>>>> #define COMPRESS_DATA_RESERVED_SIZE 5
>>>>> struct compress_data {
>>>>> __le32 clen; /* compressed data size */
>>>>> + __le32 chksum; /* compressed data chksum */
>>>>> __le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
>>>>> u8 cdata[]; /* compressed data */
>>>>> };
>>>>> @@ -3882,6 +3890,9 @@ static inline void set_compress_context(struct inode *inode)
>>>>> F2FS_OPTION(sbi).compress_algorithm;
>>>>> F2FS_I(inode)->i_log_cluster_size =
>>>>> F2FS_OPTION(sbi).compress_log_size;
>>>>> + F2FS_I(inode)->i_compress_flag =
>>>>> + F2FS_OPTION(sbi).compress_chksum ?
>>>>> + 1 << COMPRESS_CHKSUM : 0;
>>>>> F2FS_I(inode)->i_cluster_size =
>>>>> 1 << F2FS_I(inode)->i_log_cluster_size;
>>>>> F2FS_I(inode)->i_flags |= F2FS_COMPR_FL;
>>>>> diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
>>>>> index 657db2fb6739..de8f7fc89efa 100644
>>>>> --- a/fs/f2fs/inode.c
>>>>> +++ b/fs/f2fs/inode.c
>>>>> @@ -456,6 +456,7 @@ static int do_read_inode(struct inode *inode)
>>>>> le64_to_cpu(ri->i_compr_blocks));
>>>>> fi->i_compress_algorithm = ri->i_compress_algorithm;
>>>>> fi->i_log_cluster_size = ri->i_log_cluster_size;
>>>>> + fi->i_compress_flag = ri->i_compress_flag;
>>>>> fi->i_cluster_size = 1 << fi->i_log_cluster_size;
>>>>> set_inode_flag(inode, FI_COMPRESSED_FILE);
>>>>> }
>>>>> @@ -634,6 +635,8 @@ void f2fs_update_inode(struct inode *inode, struct page *node_page)
>>>>> &F2FS_I(inode)->i_compr_blocks));
>>>>> ri->i_compress_algorithm =
>>>>> F2FS_I(inode)->i_compress_algorithm;
>>>>> + ri->i_compress_flag =
>>>>> + cpu_to_le16(F2FS_I(inode)->i_compress_flag);
>>>>> ri->i_log_cluster_size =
>>>>> F2FS_I(inode)->i_log_cluster_size;
>>>>> }
>>>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>>>> index 00eff2f51807..f8de4d83a5be 100644
>>>>> --- a/fs/f2fs/super.c
>>>>> +++ b/fs/f2fs/super.c
>>>>> @@ -146,6 +146,7 @@ enum {
>>>>> Opt_compress_algorithm,
>>>>> Opt_compress_log_size,
>>>>> Opt_compress_extension,
>>>>> + Opt_compress_chksum,
>>>>> Opt_atgc,
>>>>> Opt_err,
>>>>> };
>>>>> @@ -214,6 +215,7 @@ static match_table_t f2fs_tokens = {
>>>>> {Opt_compress_algorithm, "compress_algorithm=%s"},
>>>>> {Opt_compress_log_size, "compress_log_size=%u"},
>>>>> {Opt_compress_extension, "compress_extension=%s"},
>>>>> + {Opt_compress_chksum, "compress_chksum"},
>>>>> {Opt_atgc, "atgc"},
>>>>> {Opt_err, NULL},
>>>>> };
>>>>> @@ -934,10 +936,14 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount)
>>>>> F2FS_OPTION(sbi).compress_ext_cnt++;
>>>>> kfree(name);
>>>>> break;
>>>>> + case Opt_compress_chksum:
>>>>> + F2FS_OPTION(sbi).compress_chksum = true;
>>>>> + break;
>>>>> #else
>>>>> case Opt_compress_algorithm:
>>>>> case Opt_compress_log_size:
>>>>> case Opt_compress_extension:
>>>>> + case Opt_compress_chksum:
>>>>> f2fs_info(sbi, "compression options not supported");
>>>>> break;
>>>>> #endif
>>>>> @@ -1523,6 +1529,9 @@ static inline void f2fs_show_compress_options(struct seq_file *seq,
>>>>> seq_printf(seq, ",compress_extension=%s",
>>>>> F2FS_OPTION(sbi).extensions[i]);
>>>>> }
>>>>> +
>>>>> + if (F2FS_OPTION(sbi).compress_chksum)
>>>>> + seq_puts(seq, ",compress_chksum");
>>>>> }
>>>>> static int f2fs_show_options(struct seq_file *seq, struct dentry *root)
>>>>> diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
>>>>> index a5dbb57a687f..7dc2a06cf19a 100644
>>>>> --- a/include/linux/f2fs_fs.h
>>>>> +++ b/include/linux/f2fs_fs.h
>>>>> @@ -273,7 +273,7 @@ struct f2fs_inode {
>>>>> __le64 i_compr_blocks; /* # of compressed blocks */
>>>>> __u8 i_compress_algorithm; /* compress algorithm */
>>>>> __u8 i_log_cluster_size; /* log of cluster size */
>>>>> - __le16 i_padding; /* padding */
>>>>> + __le16 i_compress_flag; /* compress flag */
>>>>> __le32 i_extra_end[0]; /* for attribute size calculation */
>>>>> } __packed;
>>>>> __le32 i_addr[DEF_ADDRS_PER_INODE]; /* Pointers to data blocks */
>>>>> --
>>>>> 2.26.2
>>>> .
>>>>
>>>
>>>
>>> _______________________________________________
>>> Linux-f2fs-devel mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>> .
>>>
> .
>

2020-11-09 17:08:52

by Jaegeuk Kim

[permalink] [raw]
Subject: Re: [f2fs-dev] [PATCH] f2fs: compress: support chksum

On 11/09, Chao Yu wrote:
> On 2020/11/7 5:12, Jaegeuk Kim wrote:
> > On 11/03, Chao Yu wrote:
> > > On 2020/11/3 10:02, Chao Yu wrote:
> > > > On 2020/11/3 0:31, Jaegeuk Kim wrote:
> > > > > On 11/02, Chao Yu wrote:
> > > > > > This patch supports to store chksum value with compressed
> > > > > > data, and verify the integrality of compressed data while
> > > > > > reading the data.
> > > > > >
> > > > > > The feature can be enabled through specifying mount option
> > > > > > 'compress_chksum'.
> > > > > >
> > > > > > Signed-off-by: Chao Yu <[email protected]>
> > > > > > ---
> > > > > > Documentation/filesystems/f2fs.rst | 1 +
> > > > > > fs/f2fs/compress.c | 20 ++++++++++++++++++++
> > > > > > fs/f2fs/f2fs.h | 13 ++++++++++++-
> > > > > > fs/f2fs/inode.c | 3 +++
> > > > > > fs/f2fs/super.c | 9 +++++++++
> > > > > > include/linux/f2fs_fs.h | 2 +-
> > > > > > 6 files changed, 46 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
> > > > > > index b8ee761c9922..985ae7d35066 100644
> > > > > > --- a/Documentation/filesystems/f2fs.rst
> > > > > > +++ b/Documentation/filesystems/f2fs.rst
> > > > > > @@ -260,6 +260,7 @@ compress_extension=%s Support adding specified extension, so that f2fs can enab
> > > > > > For other files, we can still enable compression via ioctl.
> > > > > > Note that, there is one reserved special extension '*', it
> > > > > > can be set to enable compression for all files.
> > > > > > +compress_chksum Support verifying chksum of raw data in compressed cluster.
> > > > > > inlinecrypt When possible, encrypt/decrypt the contents of encrypted
> > > > > > files using the blk-crypto framework rather than
> > > > > > filesystem-layer encryption. This allows the use of
> > > > > > diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
> > > > > > index 14262e0f1cd6..a4e0d2c745b6 100644
> > > > > > --- a/fs/f2fs/compress.c
> > > > > > +++ b/fs/f2fs/compress.c
> > > > > > @@ -602,6 +602,7 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
> > > > > > f2fs_cops[fi->i_compress_algorithm];
> > > > > > unsigned int max_len, new_nr_cpages;
> > > > > > struct page **new_cpages;
> > > > > > + u32 chksum = 0;
> > > > > > int i, ret;
> > > > > > trace_f2fs_compress_pages_start(cc->inode, cc->cluster_idx,
> > > > > > @@ -655,6 +656,11 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
> > > > > > cc->cbuf->clen = cpu_to_le32(cc->clen);
> > > > > > + if (fi->i_compress_flag & 1 << COMPRESS_CHKSUM)
> > > > > > + chksum = f2fs_crc32(F2FS_I_SB(cc->inode),
> > > > > > + cc->cbuf->cdata, cc->clen);
> > > > > > + cc->cbuf->chksum = cpu_to_le32(chksum);
> > > > > > +
> > > > > > for (i = 0; i < COMPRESS_DATA_RESERVED_SIZE; i++)
> > > > > > cc->cbuf->reserved[i] = cpu_to_le32(0);
> > > > > > @@ -721,6 +727,7 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
> > > > > > (struct decompress_io_ctx *)page_private(page);
> > > > > > struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
> > > > > > struct f2fs_inode_info *fi= F2FS_I(dic->inode);
> > > > > > + struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
> > > > > > const struct f2fs_compress_ops *cops =
> > > > > > f2fs_cops[fi->i_compress_algorithm];
> > > > > > int ret;
> > > > > > @@ -790,6 +797,19 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
> > > > > > ret = cops->decompress_pages(dic);
> > > > > > + if (!ret && fi->i_compress_flag & 1 << COMPRESS_CHKSUM) {
> > > > > > + u32 provided = le32_to_cpu(dic->cbuf->chksum);
> > > > > > + u32 calculated = f2fs_crc32(sbi, dic->cbuf->cdata, dic->clen);
> > > > > > +
> > > > > > + if (provided != calculated) {
> > > > > > + printk_ratelimited(
> > > > > > + "%sF2FS-fs (%s): checksum invalid, nid = %lu, %x vs %x",
> > > > > > + KERN_INFO, sbi->sb->s_id, dic->inode->i_ino,
> > > > > > + provided, calculated);
> > > > > > + ret = -EFSCORRUPTED;
> > > > >
> > > > > Do we need to change fsck.f2fs to recover this?
> > >
> > > However, we don't know which one is correct, compressed data or chksum value?
> > > if compressed data was corrupted, repairing chksum value doesn't help.
> > >
> > > Or how about adding chksum values for both raw data and compressed data.
> > >
> > > #define COMPRESS_DATA_RESERVED_SIZE 3
> > > struct compress_data {
> > > __le32 clen; /* compressed data size */
> > > + __le32 raw_chksum; /* raw data chksum */
> > > + __le32 compress_chksum; /* compressed data chksum */
> > > __le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
> > > u8 cdata[]; /* compressed data */
> > > };
> > >
> > > raw_chksum compress_chksum
> > > match match -> data is verified, pass
> > > not match match -> repair raw_chksum
> > > matcth not match -> repair compress_chksum
> >
> > I think only compress_chksum would be enough. BTW, can we give WARN_ON and
> > marking a FSCK flag without returning EFSCORRUPTED, since we don't really
> > know who was corrupted. If data was corrupted, we should be able to see app
> > corruption. In that case, we can check the kernel log. If checksum was simply
>
> I don't think that app will always corrupt once data was corrupted, I doubt its
> behavior could be slightly abnormal in some cases, e.g. it can just cause apps to
> show wrong number in interaction interface.

I didn't say we can always get it. But, it's likely to happen with something
like that.

>
> In this case, if we fix chksum in fsck, the wrong data will never be found due to
> data's chksum matches data itself after repair.

At least, we should see that log as a hint.

>
> IMO, the chksum and data was a whole dataset, once they are mismatch, we can
> not trust either of them, fsck should do nothing on them unless we store parity
> bits or replica.

Yeah, the point is those are not written as a transaction. Three concerns to me:
1) agreed to see compressed data corruption by checksum, 2) I don't want to
see the error messages all the time, 3) don't touch the data, even if it was
corrupted, since there's no way to fix it.

>
> Thanks,
>
> > corrupted, next fsck will fix the checksum. So, in general, I hope to keep
> > the data as is and raise a flag by the checksum.
> >
> > > not match not match -> corrupted, can not repair
> > >
> > > Thanks,
> > >
> > > >
> > > > Yes, prepared to update inode layout in fsck.f2fs w/ kernel side change. >
> > > > Thanks,
> > > >
> > > > >
> > > > > > + }
> > > > > > + }
> > > > > > +
> > > > > > out_vunmap_cbuf:
> > > > > > vm_unmap_ram(dic->cbuf, dic->nr_cpages);
> > > > > > out_vunmap_rbuf:
> > > > > > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > > > > > index 99bcf4b44a9c..2ae254ab7b7d 100644
> > > > > > --- a/fs/f2fs/f2fs.h
> > > > > > +++ b/fs/f2fs/f2fs.h
> > > > > > @@ -147,7 +147,8 @@ struct f2fs_mount_info {
> > > > > > /* For compression */
> > > > > > unsigned char compress_algorithm; /* algorithm type */
> > > > > > - unsigned compress_log_size; /* cluster log size */
> > > > > > + unsigned char compress_log_size; /* cluster log size */
> > > > > > + bool compress_chksum; /* compressed data chksum */
> > > > > > unsigned char compress_ext_cnt; /* extension count */
> > > > > > unsigned char extensions[COMPRESS_EXT_NUM][F2FS_EXTENSION_LEN]; /* extensions */
> > > > > > };
> > > > > > @@ -731,6 +732,7 @@ struct f2fs_inode_info {
> > > > > > atomic_t i_compr_blocks; /* # of compressed blocks */
> > > > > > unsigned char i_compress_algorithm; /* algorithm type */
> > > > > > unsigned char i_log_cluster_size; /* log of cluster size */
> > > > > > + unsigned short i_compress_flag; /* compress flag */
> > > > > > unsigned int i_cluster_size; /* cluster size */
> > > > > > };
> > > > > > @@ -1270,9 +1272,15 @@ enum compress_algorithm_type {
> > > > > > COMPRESS_MAX,
> > > > > > };
> > > > > > +enum compress_flag {
> > > > > > + COMPRESS_CHKSUM,
> > > > > > + COMPRESS_MAX_FLAG,
> > > > > > +};
> > > > > > +
> > > > > > #define COMPRESS_DATA_RESERVED_SIZE 5
> > > > > > struct compress_data {
> > > > > > __le32 clen; /* compressed data size */
> > > > > > + __le32 chksum; /* compressed data chksum */
> > > > > > __le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
> > > > > > u8 cdata[]; /* compressed data */
> > > > > > };
> > > > > > @@ -3882,6 +3890,9 @@ static inline void set_compress_context(struct inode *inode)
> > > > > > F2FS_OPTION(sbi).compress_algorithm;
> > > > > > F2FS_I(inode)->i_log_cluster_size =
> > > > > > F2FS_OPTION(sbi).compress_log_size;
> > > > > > + F2FS_I(inode)->i_compress_flag =
> > > > > > + F2FS_OPTION(sbi).compress_chksum ?
> > > > > > + 1 << COMPRESS_CHKSUM : 0;
> > > > > > F2FS_I(inode)->i_cluster_size =
> > > > > > 1 << F2FS_I(inode)->i_log_cluster_size;
> > > > > > F2FS_I(inode)->i_flags |= F2FS_COMPR_FL;
> > > > > > diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
> > > > > > index 657db2fb6739..de8f7fc89efa 100644
> > > > > > --- a/fs/f2fs/inode.c
> > > > > > +++ b/fs/f2fs/inode.c
> > > > > > @@ -456,6 +456,7 @@ static int do_read_inode(struct inode *inode)
> > > > > > le64_to_cpu(ri->i_compr_blocks));
> > > > > > fi->i_compress_algorithm = ri->i_compress_algorithm;
> > > > > > fi->i_log_cluster_size = ri->i_log_cluster_size;
> > > > > > + fi->i_compress_flag = ri->i_compress_flag;
> > > > > > fi->i_cluster_size = 1 << fi->i_log_cluster_size;
> > > > > > set_inode_flag(inode, FI_COMPRESSED_FILE);
> > > > > > }
> > > > > > @@ -634,6 +635,8 @@ void f2fs_update_inode(struct inode *inode, struct page *node_page)
> > > > > > &F2FS_I(inode)->i_compr_blocks));
> > > > > > ri->i_compress_algorithm =
> > > > > > F2FS_I(inode)->i_compress_algorithm;
> > > > > > + ri->i_compress_flag =
> > > > > > + cpu_to_le16(F2FS_I(inode)->i_compress_flag);
> > > > > > ri->i_log_cluster_size =
> > > > > > F2FS_I(inode)->i_log_cluster_size;
> > > > > > }
> > > > > > diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> > > > > > index 00eff2f51807..f8de4d83a5be 100644
> > > > > > --- a/fs/f2fs/super.c
> > > > > > +++ b/fs/f2fs/super.c
> > > > > > @@ -146,6 +146,7 @@ enum {
> > > > > > Opt_compress_algorithm,
> > > > > > Opt_compress_log_size,
> > > > > > Opt_compress_extension,
> > > > > > + Opt_compress_chksum,
> > > > > > Opt_atgc,
> > > > > > Opt_err,
> > > > > > };
> > > > > > @@ -214,6 +215,7 @@ static match_table_t f2fs_tokens = {
> > > > > > {Opt_compress_algorithm, "compress_algorithm=%s"},
> > > > > > {Opt_compress_log_size, "compress_log_size=%u"},
> > > > > > {Opt_compress_extension, "compress_extension=%s"},
> > > > > > + {Opt_compress_chksum, "compress_chksum"},
> > > > > > {Opt_atgc, "atgc"},
> > > > > > {Opt_err, NULL},
> > > > > > };
> > > > > > @@ -934,10 +936,14 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount)
> > > > > > F2FS_OPTION(sbi).compress_ext_cnt++;
> > > > > > kfree(name);
> > > > > > break;
> > > > > > + case Opt_compress_chksum:
> > > > > > + F2FS_OPTION(sbi).compress_chksum = true;
> > > > > > + break;
> > > > > > #else
> > > > > > case Opt_compress_algorithm:
> > > > > > case Opt_compress_log_size:
> > > > > > case Opt_compress_extension:
> > > > > > + case Opt_compress_chksum:
> > > > > > f2fs_info(sbi, "compression options not supported");
> > > > > > break;
> > > > > > #endif
> > > > > > @@ -1523,6 +1529,9 @@ static inline void f2fs_show_compress_options(struct seq_file *seq,
> > > > > > seq_printf(seq, ",compress_extension=%s",
> > > > > > F2FS_OPTION(sbi).extensions[i]);
> > > > > > }
> > > > > > +
> > > > > > + if (F2FS_OPTION(sbi).compress_chksum)
> > > > > > + seq_puts(seq, ",compress_chksum");
> > > > > > }
> > > > > > static int f2fs_show_options(struct seq_file *seq, struct dentry *root)
> > > > > > diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
> > > > > > index a5dbb57a687f..7dc2a06cf19a 100644
> > > > > > --- a/include/linux/f2fs_fs.h
> > > > > > +++ b/include/linux/f2fs_fs.h
> > > > > > @@ -273,7 +273,7 @@ struct f2fs_inode {
> > > > > > __le64 i_compr_blocks; /* # of compressed blocks */
> > > > > > __u8 i_compress_algorithm; /* compress algorithm */
> > > > > > __u8 i_log_cluster_size; /* log of cluster size */
> > > > > > - __le16 i_padding; /* padding */
> > > > > > + __le16 i_compress_flag; /* compress flag */
> > > > > > __le32 i_extra_end[0]; /* for attribute size calculation */
> > > > > > } __packed;
> > > > > > __le32 i_addr[DEF_ADDRS_PER_INODE]; /* Pointers to data blocks */
> > > > > > --
> > > > > > 2.26.2
> > > > > .
> > > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > Linux-f2fs-devel mailing list
> > > > [email protected]
> > > > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> > > > .
> > > >
> > .
> >

2020-11-10 01:54:41

by Chao Yu

[permalink] [raw]
Subject: Re: [f2fs-dev] [PATCH] f2fs: compress: support chksum

On 2020/11/10 1:06, Jaegeuk Kim wrote:
> On 11/09, Chao Yu wrote:
>> On 2020/11/7 5:12, Jaegeuk Kim wrote:
>>> On 11/03, Chao Yu wrote:
>>>> On 2020/11/3 10:02, Chao Yu wrote:
>>>>> On 2020/11/3 0:31, Jaegeuk Kim wrote:
>>>>>> On 11/02, Chao Yu wrote:
>>>>>>> This patch supports to store chksum value with compressed
>>>>>>> data, and verify the integrality of compressed data while
>>>>>>> reading the data.
>>>>>>>
>>>>>>> The feature can be enabled through specifying mount option
>>>>>>> 'compress_chksum'.
>>>>>>>
>>>>>>> Signed-off-by: Chao Yu <[email protected]>
>>>>>>> ---
>>>>>>> Documentation/filesystems/f2fs.rst | 1 +
>>>>>>> fs/f2fs/compress.c | 20 ++++++++++++++++++++
>>>>>>> fs/f2fs/f2fs.h | 13 ++++++++++++-
>>>>>>> fs/f2fs/inode.c | 3 +++
>>>>>>> fs/f2fs/super.c | 9 +++++++++
>>>>>>> include/linux/f2fs_fs.h | 2 +-
>>>>>>> 6 files changed, 46 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
>>>>>>> index b8ee761c9922..985ae7d35066 100644
>>>>>>> --- a/Documentation/filesystems/f2fs.rst
>>>>>>> +++ b/Documentation/filesystems/f2fs.rst
>>>>>>> @@ -260,6 +260,7 @@ compress_extension=%s Support adding specified extension, so that f2fs can enab
>>>>>>> For other files, we can still enable compression via ioctl.
>>>>>>> Note that, there is one reserved special extension '*', it
>>>>>>> can be set to enable compression for all files.
>>>>>>> +compress_chksum Support verifying chksum of raw data in compressed cluster.
>>>>>>> inlinecrypt When possible, encrypt/decrypt the contents of encrypted
>>>>>>> files using the blk-crypto framework rather than
>>>>>>> filesystem-layer encryption. This allows the use of
>>>>>>> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
>>>>>>> index 14262e0f1cd6..a4e0d2c745b6 100644
>>>>>>> --- a/fs/f2fs/compress.c
>>>>>>> +++ b/fs/f2fs/compress.c
>>>>>>> @@ -602,6 +602,7 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
>>>>>>> f2fs_cops[fi->i_compress_algorithm];
>>>>>>> unsigned int max_len, new_nr_cpages;
>>>>>>> struct page **new_cpages;
>>>>>>> + u32 chksum = 0;
>>>>>>> int i, ret;
>>>>>>> trace_f2fs_compress_pages_start(cc->inode, cc->cluster_idx,
>>>>>>> @@ -655,6 +656,11 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
>>>>>>> cc->cbuf->clen = cpu_to_le32(cc->clen);
>>>>>>> + if (fi->i_compress_flag & 1 << COMPRESS_CHKSUM)
>>>>>>> + chksum = f2fs_crc32(F2FS_I_SB(cc->inode),
>>>>>>> + cc->cbuf->cdata, cc->clen);
>>>>>>> + cc->cbuf->chksum = cpu_to_le32(chksum);
>>>>>>> +
>>>>>>> for (i = 0; i < COMPRESS_DATA_RESERVED_SIZE; i++)
>>>>>>> cc->cbuf->reserved[i] = cpu_to_le32(0);
>>>>>>> @@ -721,6 +727,7 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
>>>>>>> (struct decompress_io_ctx *)page_private(page);
>>>>>>> struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
>>>>>>> struct f2fs_inode_info *fi= F2FS_I(dic->inode);
>>>>>>> + struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
>>>>>>> const struct f2fs_compress_ops *cops =
>>>>>>> f2fs_cops[fi->i_compress_algorithm];
>>>>>>> int ret;
>>>>>>> @@ -790,6 +797,19 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
>>>>>>> ret = cops->decompress_pages(dic);
>>>>>>> + if (!ret && fi->i_compress_flag & 1 << COMPRESS_CHKSUM) {
>>>>>>> + u32 provided = le32_to_cpu(dic->cbuf->chksum);
>>>>>>> + u32 calculated = f2fs_crc32(sbi, dic->cbuf->cdata, dic->clen);
>>>>>>> +
>>>>>>> + if (provided != calculated) {
>>>>>>> + printk_ratelimited(
>>>>>>> + "%sF2FS-fs (%s): checksum invalid, nid = %lu, %x vs %x",
>>>>>>> + KERN_INFO, sbi->sb->s_id, dic->inode->i_ino,
>>>>>>> + provided, calculated);
>>>>>>> + ret = -EFSCORRUPTED;
>>>>>>
>>>>>> Do we need to change fsck.f2fs to recover this?
>>>>
>>>> However, we don't know which one is correct, compressed data or chksum value?
>>>> if compressed data was corrupted, repairing chksum value doesn't help.
>>>>
>>>> Or how about adding chksum values for both raw data and compressed data.
>>>>
>>>> #define COMPRESS_DATA_RESERVED_SIZE 3
>>>> struct compress_data {
>>>> __le32 clen; /* compressed data size */
>>>> + __le32 raw_chksum; /* raw data chksum */
>>>> + __le32 compress_chksum; /* compressed data chksum */
>>>> __le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
>>>> u8 cdata[]; /* compressed data */
>>>> };
>>>>
>>>> raw_chksum compress_chksum
>>>> match match -> data is verified, pass
>>>> not match match -> repair raw_chksum
>>>> matcth not match -> repair compress_chksum
>>>
>>> I think only compress_chksum would be enough. BTW, can we give WARN_ON and
>>> marking a FSCK flag without returning EFSCORRUPTED, since we don't really
>>> know who was corrupted. If data was corrupted, we should be able to see app
>>> corruption. In that case, we can check the kernel log. If checksum was simply
>>
>> I don't think that app will always corrupt once data was corrupted, I doubt its
>> behavior could be slightly abnormal in some cases, e.g. it can just cause apps to
>> show wrong number in interaction interface.
>
> I didn't say we can always get it. But, it's likely to happen with something
> like that.
>
>>
>> In this case, if we fix chksum in fsck, the wrong data will never be found due to
>> data's chksum matches data itself after repair.
>
> At least, we should see that log as a hint.
>
>>
>> IMO, the chksum and data was a whole dataset, once they are mismatch, we can
>> not trust either of them, fsck should do nothing on them unless we store parity
>> bits or replica.
>
> Yeah, the point is those are not written as a transaction. Three concerns to me:

Just to confirm:

> 1) agreed to see compressed data corruption by checksum

return EFSBADCRC always like we did as inode chksum feature.

2) I don't want to see the error messages all the time

Something like below?

if (provided != calculated) {
if (!is_inode_flag_set(,FI_DATA_CORRUPTED)) {
printk_ratelimited();
WARN_ON()
set_inode_flag(, FI_DATA_CORRUPTED);
}
return EFSBADCRC;
}

3) don't touch the data, even if it was corrupted, since there's no way to fix it.

It seems that we don't need to change fsck.f2fs finally...

Thanks,

>
>>
>> Thanks,
>>
>>> corrupted, next fsck will fix the checksum. So, in general, I hope to keep
>>> the data as is and raise a flag by the checksum.
>>>
>>>> not match not match -> corrupted, can not repair
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>> Yes, prepared to update inode layout in fsck.f2fs w/ kernel side change. >
>>>>> Thanks,
>>>>>
>>>>>>
>>>>>>> + }
>>>>>>> + }
>>>>>>> +
>>>>>>> out_vunmap_cbuf:
>>>>>>> vm_unmap_ram(dic->cbuf, dic->nr_cpages);
>>>>>>> out_vunmap_rbuf:
>>>>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>>>>>> index 99bcf4b44a9c..2ae254ab7b7d 100644
>>>>>>> --- a/fs/f2fs/f2fs.h
>>>>>>> +++ b/fs/f2fs/f2fs.h
>>>>>>> @@ -147,7 +147,8 @@ struct f2fs_mount_info {
>>>>>>> /* For compression */
>>>>>>> unsigned char compress_algorithm; /* algorithm type */
>>>>>>> - unsigned compress_log_size; /* cluster log size */
>>>>>>> + unsigned char compress_log_size; /* cluster log size */
>>>>>>> + bool compress_chksum; /* compressed data chksum */
>>>>>>> unsigned char compress_ext_cnt; /* extension count */
>>>>>>> unsigned char extensions[COMPRESS_EXT_NUM][F2FS_EXTENSION_LEN]; /* extensions */
>>>>>>> };
>>>>>>> @@ -731,6 +732,7 @@ struct f2fs_inode_info {
>>>>>>> atomic_t i_compr_blocks; /* # of compressed blocks */
>>>>>>> unsigned char i_compress_algorithm; /* algorithm type */
>>>>>>> unsigned char i_log_cluster_size; /* log of cluster size */
>>>>>>> + unsigned short i_compress_flag; /* compress flag */
>>>>>>> unsigned int i_cluster_size; /* cluster size */
>>>>>>> };
>>>>>>> @@ -1270,9 +1272,15 @@ enum compress_algorithm_type {
>>>>>>> COMPRESS_MAX,
>>>>>>> };
>>>>>>> +enum compress_flag {
>>>>>>> + COMPRESS_CHKSUM,
>>>>>>> + COMPRESS_MAX_FLAG,
>>>>>>> +};
>>>>>>> +
>>>>>>> #define COMPRESS_DATA_RESERVED_SIZE 5
>>>>>>> struct compress_data {
>>>>>>> __le32 clen; /* compressed data size */
>>>>>>> + __le32 chksum; /* compressed data chksum */
>>>>>>> __le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
>>>>>>> u8 cdata[]; /* compressed data */
>>>>>>> };
>>>>>>> @@ -3882,6 +3890,9 @@ static inline void set_compress_context(struct inode *inode)
>>>>>>> F2FS_OPTION(sbi).compress_algorithm;
>>>>>>> F2FS_I(inode)->i_log_cluster_size =
>>>>>>> F2FS_OPTION(sbi).compress_log_size;
>>>>>>> + F2FS_I(inode)->i_compress_flag =
>>>>>>> + F2FS_OPTION(sbi).compress_chksum ?
>>>>>>> + 1 << COMPRESS_CHKSUM : 0;
>>>>>>> F2FS_I(inode)->i_cluster_size =
>>>>>>> 1 << F2FS_I(inode)->i_log_cluster_size;
>>>>>>> F2FS_I(inode)->i_flags |= F2FS_COMPR_FL;
>>>>>>> diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
>>>>>>> index 657db2fb6739..de8f7fc89efa 100644
>>>>>>> --- a/fs/f2fs/inode.c
>>>>>>> +++ b/fs/f2fs/inode.c
>>>>>>> @@ -456,6 +456,7 @@ static int do_read_inode(struct inode *inode)
>>>>>>> le64_to_cpu(ri->i_compr_blocks));
>>>>>>> fi->i_compress_algorithm = ri->i_compress_algorithm;
>>>>>>> fi->i_log_cluster_size = ri->i_log_cluster_size;
>>>>>>> + fi->i_compress_flag = ri->i_compress_flag;
>>>>>>> fi->i_cluster_size = 1 << fi->i_log_cluster_size;
>>>>>>> set_inode_flag(inode, FI_COMPRESSED_FILE);
>>>>>>> }
>>>>>>> @@ -634,6 +635,8 @@ void f2fs_update_inode(struct inode *inode, struct page *node_page)
>>>>>>> &F2FS_I(inode)->i_compr_blocks));
>>>>>>> ri->i_compress_algorithm =
>>>>>>> F2FS_I(inode)->i_compress_algorithm;
>>>>>>> + ri->i_compress_flag =
>>>>>>> + cpu_to_le16(F2FS_I(inode)->i_compress_flag);
>>>>>>> ri->i_log_cluster_size =
>>>>>>> F2FS_I(inode)->i_log_cluster_size;
>>>>>>> }
>>>>>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>>>>>> index 00eff2f51807..f8de4d83a5be 100644
>>>>>>> --- a/fs/f2fs/super.c
>>>>>>> +++ b/fs/f2fs/super.c
>>>>>>> @@ -146,6 +146,7 @@ enum {
>>>>>>> Opt_compress_algorithm,
>>>>>>> Opt_compress_log_size,
>>>>>>> Opt_compress_extension,
>>>>>>> + Opt_compress_chksum,
>>>>>>> Opt_atgc,
>>>>>>> Opt_err,
>>>>>>> };
>>>>>>> @@ -214,6 +215,7 @@ static match_table_t f2fs_tokens = {
>>>>>>> {Opt_compress_algorithm, "compress_algorithm=%s"},
>>>>>>> {Opt_compress_log_size, "compress_log_size=%u"},
>>>>>>> {Opt_compress_extension, "compress_extension=%s"},
>>>>>>> + {Opt_compress_chksum, "compress_chksum"},
>>>>>>> {Opt_atgc, "atgc"},
>>>>>>> {Opt_err, NULL},
>>>>>>> };
>>>>>>> @@ -934,10 +936,14 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount)
>>>>>>> F2FS_OPTION(sbi).compress_ext_cnt++;
>>>>>>> kfree(name);
>>>>>>> break;
>>>>>>> + case Opt_compress_chksum:
>>>>>>> + F2FS_OPTION(sbi).compress_chksum = true;
>>>>>>> + break;
>>>>>>> #else
>>>>>>> case Opt_compress_algorithm:
>>>>>>> case Opt_compress_log_size:
>>>>>>> case Opt_compress_extension:
>>>>>>> + case Opt_compress_chksum:
>>>>>>> f2fs_info(sbi, "compression options not supported");
>>>>>>> break;
>>>>>>> #endif
>>>>>>> @@ -1523,6 +1529,9 @@ static inline void f2fs_show_compress_options(struct seq_file *seq,
>>>>>>> seq_printf(seq, ",compress_extension=%s",
>>>>>>> F2FS_OPTION(sbi).extensions[i]);
>>>>>>> }
>>>>>>> +
>>>>>>> + if (F2FS_OPTION(sbi).compress_chksum)
>>>>>>> + seq_puts(seq, ",compress_chksum");
>>>>>>> }
>>>>>>> static int f2fs_show_options(struct seq_file *seq, struct dentry *root)
>>>>>>> diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
>>>>>>> index a5dbb57a687f..7dc2a06cf19a 100644
>>>>>>> --- a/include/linux/f2fs_fs.h
>>>>>>> +++ b/include/linux/f2fs_fs.h
>>>>>>> @@ -273,7 +273,7 @@ struct f2fs_inode {
>>>>>>> __le64 i_compr_blocks; /* # of compressed blocks */
>>>>>>> __u8 i_compress_algorithm; /* compress algorithm */
>>>>>>> __u8 i_log_cluster_size; /* log of cluster size */
>>>>>>> - __le16 i_padding; /* padding */
>>>>>>> + __le16 i_compress_flag; /* compress flag */
>>>>>>> __le32 i_extra_end[0]; /* for attribute size calculation */
>>>>>>> } __packed;
>>>>>>> __le32 i_addr[DEF_ADDRS_PER_INODE]; /* Pointers to data blocks */
>>>>>>> --
>>>>>>> 2.26.2
>>>>>> .
>>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Linux-f2fs-devel mailing list
>>>>> [email protected]
>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>>> .
>>>>>
>>> .
>>>
> .
>

2020-11-10 04:27:31

by Jaegeuk Kim

[permalink] [raw]
Subject: Re: [f2fs-dev] [PATCH] f2fs: compress: support chksum

On 11/10, Chao Yu wrote:
> On 2020/11/10 1:06, Jaegeuk Kim wrote:
> > On 11/09, Chao Yu wrote:
> > > On 2020/11/7 5:12, Jaegeuk Kim wrote:
> > > > On 11/03, Chao Yu wrote:
> > > > > On 2020/11/3 10:02, Chao Yu wrote:
> > > > > > On 2020/11/3 0:31, Jaegeuk Kim wrote:
> > > > > > > On 11/02, Chao Yu wrote:
> > > > > > > > This patch supports to store chksum value with compressed
> > > > > > > > data, and verify the integrality of compressed data while
> > > > > > > > reading the data.
> > > > > > > >
> > > > > > > > The feature can be enabled through specifying mount option
> > > > > > > > 'compress_chksum'.
> > > > > > > >
> > > > > > > > Signed-off-by: Chao Yu <[email protected]>
> > > > > > > > ---
> > > > > > > > Documentation/filesystems/f2fs.rst | 1 +
> > > > > > > > fs/f2fs/compress.c | 20 ++++++++++++++++++++
> > > > > > > > fs/f2fs/f2fs.h | 13 ++++++++++++-
> > > > > > > > fs/f2fs/inode.c | 3 +++
> > > > > > > > fs/f2fs/super.c | 9 +++++++++
> > > > > > > > include/linux/f2fs_fs.h | 2 +-
> > > > > > > > 6 files changed, 46 insertions(+), 2 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
> > > > > > > > index b8ee761c9922..985ae7d35066 100644
> > > > > > > > --- a/Documentation/filesystems/f2fs.rst
> > > > > > > > +++ b/Documentation/filesystems/f2fs.rst
> > > > > > > > @@ -260,6 +260,7 @@ compress_extension=%s Support adding specified extension, so that f2fs can enab
> > > > > > > > For other files, we can still enable compression via ioctl.
> > > > > > > > Note that, there is one reserved special extension '*', it
> > > > > > > > can be set to enable compression for all files.
> > > > > > > > +compress_chksum Support verifying chksum of raw data in compressed cluster.
> > > > > > > > inlinecrypt When possible, encrypt/decrypt the contents of encrypted
> > > > > > > > files using the blk-crypto framework rather than
> > > > > > > > filesystem-layer encryption. This allows the use of
> > > > > > > > diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
> > > > > > > > index 14262e0f1cd6..a4e0d2c745b6 100644
> > > > > > > > --- a/fs/f2fs/compress.c
> > > > > > > > +++ b/fs/f2fs/compress.c
> > > > > > > > @@ -602,6 +602,7 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
> > > > > > > > f2fs_cops[fi->i_compress_algorithm];
> > > > > > > > unsigned int max_len, new_nr_cpages;
> > > > > > > > struct page **new_cpages;
> > > > > > > > + u32 chksum = 0;
> > > > > > > > int i, ret;
> > > > > > > > trace_f2fs_compress_pages_start(cc->inode, cc->cluster_idx,
> > > > > > > > @@ -655,6 +656,11 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
> > > > > > > > cc->cbuf->clen = cpu_to_le32(cc->clen);
> > > > > > > > + if (fi->i_compress_flag & 1 << COMPRESS_CHKSUM)
> > > > > > > > + chksum = f2fs_crc32(F2FS_I_SB(cc->inode),
> > > > > > > > + cc->cbuf->cdata, cc->clen);
> > > > > > > > + cc->cbuf->chksum = cpu_to_le32(chksum);
> > > > > > > > +
> > > > > > > > for (i = 0; i < COMPRESS_DATA_RESERVED_SIZE; i++)
> > > > > > > > cc->cbuf->reserved[i] = cpu_to_le32(0);
> > > > > > > > @@ -721,6 +727,7 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
> > > > > > > > (struct decompress_io_ctx *)page_private(page);
> > > > > > > > struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
> > > > > > > > struct f2fs_inode_info *fi= F2FS_I(dic->inode);
> > > > > > > > + struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
> > > > > > > > const struct f2fs_compress_ops *cops =
> > > > > > > > f2fs_cops[fi->i_compress_algorithm];
> > > > > > > > int ret;
> > > > > > > > @@ -790,6 +797,19 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
> > > > > > > > ret = cops->decompress_pages(dic);
> > > > > > > > + if (!ret && fi->i_compress_flag & 1 << COMPRESS_CHKSUM) {
> > > > > > > > + u32 provided = le32_to_cpu(dic->cbuf->chksum);
> > > > > > > > + u32 calculated = f2fs_crc32(sbi, dic->cbuf->cdata, dic->clen);
> > > > > > > > +
> > > > > > > > + if (provided != calculated) {
> > > > > > > > + printk_ratelimited(
> > > > > > > > + "%sF2FS-fs (%s): checksum invalid, nid = %lu, %x vs %x",
> > > > > > > > + KERN_INFO, sbi->sb->s_id, dic->inode->i_ino,
> > > > > > > > + provided, calculated);
> > > > > > > > + ret = -EFSCORRUPTED;
> > > > > > >
> > > > > > > Do we need to change fsck.f2fs to recover this?
> > > > >
> > > > > However, we don't know which one is correct, compressed data or chksum value?
> > > > > if compressed data was corrupted, repairing chksum value doesn't help.
> > > > >
> > > > > Or how about adding chksum values for both raw data and compressed data.
> > > > >
> > > > > #define COMPRESS_DATA_RESERVED_SIZE 3
> > > > > struct compress_data {
> > > > > __le32 clen; /* compressed data size */
> > > > > + __le32 raw_chksum; /* raw data chksum */
> > > > > + __le32 compress_chksum; /* compressed data chksum */
> > > > > __le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
> > > > > u8 cdata[]; /* compressed data */
> > > > > };
> > > > >
> > > > > raw_chksum compress_chksum
> > > > > match match -> data is verified, pass
> > > > > not match match -> repair raw_chksum
> > > > > matcth not match -> repair compress_chksum
> > > >
> > > > I think only compress_chksum would be enough. BTW, can we give WARN_ON and
> > > > marking a FSCK flag without returning EFSCORRUPTED, since we don't really
> > > > know who was corrupted. If data was corrupted, we should be able to see app
> > > > corruption. In that case, we can check the kernel log. If checksum was simply
> > >
> > > I don't think that app will always corrupt once data was corrupted, I doubt its
> > > behavior could be slightly abnormal in some cases, e.g. it can just cause apps to
> > > show wrong number in interaction interface.
> >
> > I didn't say we can always get it. But, it's likely to happen with something
> > like that.
> >
> > >
> > > In this case, if we fix chksum in fsck, the wrong data will never be found due to
> > > data's chksum matches data itself after repair.
> >
> > At least, we should see that log as a hint.
> >
> > >
> > > IMO, the chksum and data was a whole dataset, once they are mismatch, we can
> > > not trust either of them, fsck should do nothing on them unless we store parity
> > > bits or replica.
> >
> > Yeah, the point is those are not written as a transaction. Three concerns to me:
>
> Just to confirm:
>
> > 1) agreed to see compressed data corruption by checksum
>
> return EFSBADCRC always like we did as inode chksum feature.

inode chksum is a bit different, since 4KB write is atomic. If chksum is
corrupted, inode should be broken.

>
> 2) I don't want to see the error messages all the time
>
> Something like below?
>
> if (provided != calculated) {
> if (!is_inode_flag_set(,FI_DATA_CORRUPTED)) {
> printk_ratelimited();
> WARN_ON()
> set_inode_flag(, FI_DATA_CORRUPTED);
> }
> return EFSBADCRC;

No, what if only checksum was wrong? I mean giving WARN_ON() and setting
FSCK_FLAG to stop endless WARN_ON().

> }
>
> 3) don't touch the data, even if it was corrupted, since there's no way to fix it.
>
> It seems that we don't need to change fsck.f2fs finally...
>
> Thanks,
>
> >
> > >
> > > Thanks,
> > >
> > > > corrupted, next fsck will fix the checksum. So, in general, I hope to keep
> > > > the data as is and raise a flag by the checksum.
> > > >
> > > > > not match not match -> corrupted, can not repair
> > > > >
> > > > > Thanks,
> > > > >
> > > > > >
> > > > > > Yes, prepared to update inode layout in fsck.f2fs w/ kernel side change. >
> > > > > > Thanks,
> > > > > >
> > > > > > >
> > > > > > > > + }
> > > > > > > > + }
> > > > > > > > +
> > > > > > > > out_vunmap_cbuf:
> > > > > > > > vm_unmap_ram(dic->cbuf, dic->nr_cpages);
> > > > > > > > out_vunmap_rbuf:
> > > > > > > > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > > > > > > > index 99bcf4b44a9c..2ae254ab7b7d 100644
> > > > > > > > --- a/fs/f2fs/f2fs.h
> > > > > > > > +++ b/fs/f2fs/f2fs.h
> > > > > > > > @@ -147,7 +147,8 @@ struct f2fs_mount_info {
> > > > > > > > /* For compression */
> > > > > > > > unsigned char compress_algorithm; /* algorithm type */
> > > > > > > > - unsigned compress_log_size; /* cluster log size */
> > > > > > > > + unsigned char compress_log_size; /* cluster log size */
> > > > > > > > + bool compress_chksum; /* compressed data chksum */
> > > > > > > > unsigned char compress_ext_cnt; /* extension count */
> > > > > > > > unsigned char extensions[COMPRESS_EXT_NUM][F2FS_EXTENSION_LEN]; /* extensions */
> > > > > > > > };
> > > > > > > > @@ -731,6 +732,7 @@ struct f2fs_inode_info {
> > > > > > > > atomic_t i_compr_blocks; /* # of compressed blocks */
> > > > > > > > unsigned char i_compress_algorithm; /* algorithm type */
> > > > > > > > unsigned char i_log_cluster_size; /* log of cluster size */
> > > > > > > > + unsigned short i_compress_flag; /* compress flag */
> > > > > > > > unsigned int i_cluster_size; /* cluster size */
> > > > > > > > };
> > > > > > > > @@ -1270,9 +1272,15 @@ enum compress_algorithm_type {
> > > > > > > > COMPRESS_MAX,
> > > > > > > > };
> > > > > > > > +enum compress_flag {
> > > > > > > > + COMPRESS_CHKSUM,
> > > > > > > > + COMPRESS_MAX_FLAG,
> > > > > > > > +};
> > > > > > > > +
> > > > > > > > #define COMPRESS_DATA_RESERVED_SIZE 5
> > > > > > > > struct compress_data {
> > > > > > > > __le32 clen; /* compressed data size */
> > > > > > > > + __le32 chksum; /* compressed data chksum */
> > > > > > > > __le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
> > > > > > > > u8 cdata[]; /* compressed data */
> > > > > > > > };
> > > > > > > > @@ -3882,6 +3890,9 @@ static inline void set_compress_context(struct inode *inode)
> > > > > > > > F2FS_OPTION(sbi).compress_algorithm;
> > > > > > > > F2FS_I(inode)->i_log_cluster_size =
> > > > > > > > F2FS_OPTION(sbi).compress_log_size;
> > > > > > > > + F2FS_I(inode)->i_compress_flag =
> > > > > > > > + F2FS_OPTION(sbi).compress_chksum ?
> > > > > > > > + 1 << COMPRESS_CHKSUM : 0;
> > > > > > > > F2FS_I(inode)->i_cluster_size =
> > > > > > > > 1 << F2FS_I(inode)->i_log_cluster_size;
> > > > > > > > F2FS_I(inode)->i_flags |= F2FS_COMPR_FL;
> > > > > > > > diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
> > > > > > > > index 657db2fb6739..de8f7fc89efa 100644
> > > > > > > > --- a/fs/f2fs/inode.c
> > > > > > > > +++ b/fs/f2fs/inode.c
> > > > > > > > @@ -456,6 +456,7 @@ static int do_read_inode(struct inode *inode)
> > > > > > > > le64_to_cpu(ri->i_compr_blocks));
> > > > > > > > fi->i_compress_algorithm = ri->i_compress_algorithm;
> > > > > > > > fi->i_log_cluster_size = ri->i_log_cluster_size;
> > > > > > > > + fi->i_compress_flag = ri->i_compress_flag;
> > > > > > > > fi->i_cluster_size = 1 << fi->i_log_cluster_size;
> > > > > > > > set_inode_flag(inode, FI_COMPRESSED_FILE);
> > > > > > > > }
> > > > > > > > @@ -634,6 +635,8 @@ void f2fs_update_inode(struct inode *inode, struct page *node_page)
> > > > > > > > &F2FS_I(inode)->i_compr_blocks));
> > > > > > > > ri->i_compress_algorithm =
> > > > > > > > F2FS_I(inode)->i_compress_algorithm;
> > > > > > > > + ri->i_compress_flag =
> > > > > > > > + cpu_to_le16(F2FS_I(inode)->i_compress_flag);
> > > > > > > > ri->i_log_cluster_size =
> > > > > > > > F2FS_I(inode)->i_log_cluster_size;
> > > > > > > > }
> > > > > > > > diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> > > > > > > > index 00eff2f51807..f8de4d83a5be 100644
> > > > > > > > --- a/fs/f2fs/super.c
> > > > > > > > +++ b/fs/f2fs/super.c
> > > > > > > > @@ -146,6 +146,7 @@ enum {
> > > > > > > > Opt_compress_algorithm,
> > > > > > > > Opt_compress_log_size,
> > > > > > > > Opt_compress_extension,
> > > > > > > > + Opt_compress_chksum,
> > > > > > > > Opt_atgc,
> > > > > > > > Opt_err,
> > > > > > > > };
> > > > > > > > @@ -214,6 +215,7 @@ static match_table_t f2fs_tokens = {
> > > > > > > > {Opt_compress_algorithm, "compress_algorithm=%s"},
> > > > > > > > {Opt_compress_log_size, "compress_log_size=%u"},
> > > > > > > > {Opt_compress_extension, "compress_extension=%s"},
> > > > > > > > + {Opt_compress_chksum, "compress_chksum"},
> > > > > > > > {Opt_atgc, "atgc"},
> > > > > > > > {Opt_err, NULL},
> > > > > > > > };
> > > > > > > > @@ -934,10 +936,14 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount)
> > > > > > > > F2FS_OPTION(sbi).compress_ext_cnt++;
> > > > > > > > kfree(name);
> > > > > > > > break;
> > > > > > > > + case Opt_compress_chksum:
> > > > > > > > + F2FS_OPTION(sbi).compress_chksum = true;
> > > > > > > > + break;
> > > > > > > > #else
> > > > > > > > case Opt_compress_algorithm:
> > > > > > > > case Opt_compress_log_size:
> > > > > > > > case Opt_compress_extension:
> > > > > > > > + case Opt_compress_chksum:
> > > > > > > > f2fs_info(sbi, "compression options not supported");
> > > > > > > > break;
> > > > > > > > #endif
> > > > > > > > @@ -1523,6 +1529,9 @@ static inline void f2fs_show_compress_options(struct seq_file *seq,
> > > > > > > > seq_printf(seq, ",compress_extension=%s",
> > > > > > > > F2FS_OPTION(sbi).extensions[i]);
> > > > > > > > }
> > > > > > > > +
> > > > > > > > + if (F2FS_OPTION(sbi).compress_chksum)
> > > > > > > > + seq_puts(seq, ",compress_chksum");
> > > > > > > > }
> > > > > > > > static int f2fs_show_options(struct seq_file *seq, struct dentry *root)
> > > > > > > > diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
> > > > > > > > index a5dbb57a687f..7dc2a06cf19a 100644
> > > > > > > > --- a/include/linux/f2fs_fs.h
> > > > > > > > +++ b/include/linux/f2fs_fs.h
> > > > > > > > @@ -273,7 +273,7 @@ struct f2fs_inode {
> > > > > > > > __le64 i_compr_blocks; /* # of compressed blocks */
> > > > > > > > __u8 i_compress_algorithm; /* compress algorithm */
> > > > > > > > __u8 i_log_cluster_size; /* log of cluster size */
> > > > > > > > - __le16 i_padding; /* padding */
> > > > > > > > + __le16 i_compress_flag; /* compress flag */
> > > > > > > > __le32 i_extra_end[0]; /* for attribute size calculation */
> > > > > > > > } __packed;
> > > > > > > > __le32 i_addr[DEF_ADDRS_PER_INODE]; /* Pointers to data blocks */
> > > > > > > > --
> > > > > > > > 2.26.2
> > > > > > > .
> > > > > > >
> > > > > >
> > > > > >
> > > > > > _______________________________________________
> > > > > > Linux-f2fs-devel mailing list
> > > > > > [email protected]
> > > > > > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> > > > > > .
> > > > > >
> > > > .
> > > >
> > .
> >

2020-11-10 06:30:35

by Chao Yu

[permalink] [raw]
Subject: Re: [f2fs-dev] [PATCH] f2fs: compress: support chksum

On 2020/11/10 12:23, Jaegeuk Kim wrote:
> On 11/10, Chao Yu wrote:
>> On 2020/11/10 1:06, Jaegeuk Kim wrote:
>>> On 11/09, Chao Yu wrote:
>>>> On 2020/11/7 5:12, Jaegeuk Kim wrote:
>>>>> On 11/03, Chao Yu wrote:
>>>>>> On 2020/11/3 10:02, Chao Yu wrote:
>>>>>>> On 2020/11/3 0:31, Jaegeuk Kim wrote:
>>>>>>>> On 11/02, Chao Yu wrote:
>>>>>>>>> This patch supports to store chksum value with compressed
>>>>>>>>> data, and verify the integrality of compressed data while
>>>>>>>>> reading the data.
>>>>>>>>>
>>>>>>>>> The feature can be enabled through specifying mount option
>>>>>>>>> 'compress_chksum'.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Chao Yu <[email protected]>
>>>>>>>>> ---
>>>>>>>>> Documentation/filesystems/f2fs.rst | 1 +
>>>>>>>>> fs/f2fs/compress.c | 20 ++++++++++++++++++++
>>>>>>>>> fs/f2fs/f2fs.h | 13 ++++++++++++-
>>>>>>>>> fs/f2fs/inode.c | 3 +++
>>>>>>>>> fs/f2fs/super.c | 9 +++++++++
>>>>>>>>> include/linux/f2fs_fs.h | 2 +-
>>>>>>>>> 6 files changed, 46 insertions(+), 2 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
>>>>>>>>> index b8ee761c9922..985ae7d35066 100644
>>>>>>>>> --- a/Documentation/filesystems/f2fs.rst
>>>>>>>>> +++ b/Documentation/filesystems/f2fs.rst
>>>>>>>>> @@ -260,6 +260,7 @@ compress_extension=%s Support adding specified extension, so that f2fs can enab
>>>>>>>>> For other files, we can still enable compression via ioctl.
>>>>>>>>> Note that, there is one reserved special extension '*', it
>>>>>>>>> can be set to enable compression for all files.
>>>>>>>>> +compress_chksum Support verifying chksum of raw data in compressed cluster.
>>>>>>>>> inlinecrypt When possible, encrypt/decrypt the contents of encrypted
>>>>>>>>> files using the blk-crypto framework rather than
>>>>>>>>> filesystem-layer encryption. This allows the use of
>>>>>>>>> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
>>>>>>>>> index 14262e0f1cd6..a4e0d2c745b6 100644
>>>>>>>>> --- a/fs/f2fs/compress.c
>>>>>>>>> +++ b/fs/f2fs/compress.c
>>>>>>>>> @@ -602,6 +602,7 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
>>>>>>>>> f2fs_cops[fi->i_compress_algorithm];
>>>>>>>>> unsigned int max_len, new_nr_cpages;
>>>>>>>>> struct page **new_cpages;
>>>>>>>>> + u32 chksum = 0;
>>>>>>>>> int i, ret;
>>>>>>>>> trace_f2fs_compress_pages_start(cc->inode, cc->cluster_idx,
>>>>>>>>> @@ -655,6 +656,11 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
>>>>>>>>> cc->cbuf->clen = cpu_to_le32(cc->clen);
>>>>>>>>> + if (fi->i_compress_flag & 1 << COMPRESS_CHKSUM)
>>>>>>>>> + chksum = f2fs_crc32(F2FS_I_SB(cc->inode),
>>>>>>>>> + cc->cbuf->cdata, cc->clen);
>>>>>>>>> + cc->cbuf->chksum = cpu_to_le32(chksum);
>>>>>>>>> +
>>>>>>>>> for (i = 0; i < COMPRESS_DATA_RESERVED_SIZE; i++)
>>>>>>>>> cc->cbuf->reserved[i] = cpu_to_le32(0);
>>>>>>>>> @@ -721,6 +727,7 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
>>>>>>>>> (struct decompress_io_ctx *)page_private(page);
>>>>>>>>> struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
>>>>>>>>> struct f2fs_inode_info *fi= F2FS_I(dic->inode);
>>>>>>>>> + struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
>>>>>>>>> const struct f2fs_compress_ops *cops =
>>>>>>>>> f2fs_cops[fi->i_compress_algorithm];
>>>>>>>>> int ret;
>>>>>>>>> @@ -790,6 +797,19 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
>>>>>>>>> ret = cops->decompress_pages(dic);
>>>>>>>>> + if (!ret && fi->i_compress_flag & 1 << COMPRESS_CHKSUM) {
>>>>>>>>> + u32 provided = le32_to_cpu(dic->cbuf->chksum);
>>>>>>>>> + u32 calculated = f2fs_crc32(sbi, dic->cbuf->cdata, dic->clen);
>>>>>>>>> +
>>>>>>>>> + if (provided != calculated) {
>>>>>>>>> + printk_ratelimited(
>>>>>>>>> + "%sF2FS-fs (%s): checksum invalid, nid = %lu, %x vs %x",
>>>>>>>>> + KERN_INFO, sbi->sb->s_id, dic->inode->i_ino,
>>>>>>>>> + provided, calculated);
>>>>>>>>> + ret = -EFSCORRUPTED;
>>>>>>>>
>>>>>>>> Do we need to change fsck.f2fs to recover this?
>>>>>>
>>>>>> However, we don't know which one is correct, compressed data or chksum value?
>>>>>> if compressed data was corrupted, repairing chksum value doesn't help.
>>>>>>
>>>>>> Or how about adding chksum values for both raw data and compressed data.
>>>>>>
>>>>>> #define COMPRESS_DATA_RESERVED_SIZE 3
>>>>>> struct compress_data {
>>>>>> __le32 clen; /* compressed data size */
>>>>>> + __le32 raw_chksum; /* raw data chksum */
>>>>>> + __le32 compress_chksum; /* compressed data chksum */
>>>>>> __le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
>>>>>> u8 cdata[]; /* compressed data */
>>>>>> };
>>>>>>
>>>>>> raw_chksum compress_chksum
>>>>>> match match -> data is verified, pass
>>>>>> not match match -> repair raw_chksum
>>>>>> matcth not match -> repair compress_chksum
>>>>>
>>>>> I think only compress_chksum would be enough. BTW, can we give WARN_ON and
>>>>> marking a FSCK flag without returning EFSCORRUPTED, since we don't really
>>>>> know who was corrupted. If data was corrupted, we should be able to see app
>>>>> corruption. In that case, we can check the kernel log. If checksum was simply
>>>>
>>>> I don't think that app will always corrupt once data was corrupted, I doubt its
>>>> behavior could be slightly abnormal in some cases, e.g. it can just cause apps to
>>>> show wrong number in interaction interface.
>>>
>>> I didn't say we can always get it. But, it's likely to happen with something
>>> like that.
>>>
>>>>
>>>> In this case, if we fix chksum in fsck, the wrong data will never be found due to
>>>> data's chksum matches data itself after repair.
>>>
>>> At least, we should see that log as a hint.
>>>
>>>>
>>>> IMO, the chksum and data was a whole dataset, once they are mismatch, we can
>>>> not trust either of them, fsck should do nothing on them unless we store parity
>>>> bits or replica.
>>>
>>> Yeah, the point is those are not written as a transaction. Three concerns to me:
>>
>> Just to confirm:
>>
>>> 1) agreed to see compressed data corruption by checksum
>>
>> return EFSBADCRC always like we did as inode chksum feature.
>
> inode chksum is a bit different, since 4KB write is atomic. If chksum is
> corrupted, inode should be broken.

Actually, I think the both results are the same, inode chksum doesn't match inode
metadata, like current case that cluster chksum doesn't match cluster data, it
doesn't matter how it becomes mismatched.

And also, in those inode corrupted cases, there should be some cases that hacker
or fuzz tester injects random data in chksum intentionally, or bit-flipping happed
on chksum value in inode, inode metadata (except inode chksum) is integrated though,
we can not distinguish such cases from case of inode metadata (except inode chksum)
corruption.

Thanks,

>
>>
>> 2) I don't want to see the error messages all the time
>>
>> Something like below?
>>
>> if (provided != calculated) {
>> if (!is_inode_flag_set(,FI_DATA_CORRUPTED)) {
>> printk_ratelimited();
>> WARN_ON()
>> set_inode_flag(, FI_DATA_CORRUPTED);
>> }
>> return EFSBADCRC;
>
> No, what if only checksum was wrong? I mean giving WARN_ON() and setting
> FSCK_FLAG to stop endless WARN_ON().
>
>> }
>>
>> 3) don't touch the data, even if it was corrupted, since there's no way to fix it.
>>
>> It seems that we don't need to change fsck.f2fs finally...
>>
>> Thanks,
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>> corrupted, next fsck will fix the checksum. So, in general, I hope to keep
>>>>> the data as is and raise a flag by the checksum.
>>>>>
>>>>>> not match not match -> corrupted, can not repair
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>> Yes, prepared to update inode layout in fsck.f2fs w/ kernel side change. >
>>>>>>> Thanks,
>>>>>>>
>>>>>>>>
>>>>>>>>> + }
>>>>>>>>> + }
>>>>>>>>> +
>>>>>>>>> out_vunmap_cbuf:
>>>>>>>>> vm_unmap_ram(dic->cbuf, dic->nr_cpages);
>>>>>>>>> out_vunmap_rbuf:
>>>>>>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>>>>>>>> index 99bcf4b44a9c..2ae254ab7b7d 100644
>>>>>>>>> --- a/fs/f2fs/f2fs.h
>>>>>>>>> +++ b/fs/f2fs/f2fs.h
>>>>>>>>> @@ -147,7 +147,8 @@ struct f2fs_mount_info {
>>>>>>>>> /* For compression */
>>>>>>>>> unsigned char compress_algorithm; /* algorithm type */
>>>>>>>>> - unsigned compress_log_size; /* cluster log size */
>>>>>>>>> + unsigned char compress_log_size; /* cluster log size */
>>>>>>>>> + bool compress_chksum; /* compressed data chksum */
>>>>>>>>> unsigned char compress_ext_cnt; /* extension count */
>>>>>>>>> unsigned char extensions[COMPRESS_EXT_NUM][F2FS_EXTENSION_LEN]; /* extensions */
>>>>>>>>> };
>>>>>>>>> @@ -731,6 +732,7 @@ struct f2fs_inode_info {
>>>>>>>>> atomic_t i_compr_blocks; /* # of compressed blocks */
>>>>>>>>> unsigned char i_compress_algorithm; /* algorithm type */
>>>>>>>>> unsigned char i_log_cluster_size; /* log of cluster size */
>>>>>>>>> + unsigned short i_compress_flag; /* compress flag */
>>>>>>>>> unsigned int i_cluster_size; /* cluster size */
>>>>>>>>> };
>>>>>>>>> @@ -1270,9 +1272,15 @@ enum compress_algorithm_type {
>>>>>>>>> COMPRESS_MAX,
>>>>>>>>> };
>>>>>>>>> +enum compress_flag {
>>>>>>>>> + COMPRESS_CHKSUM,
>>>>>>>>> + COMPRESS_MAX_FLAG,
>>>>>>>>> +};
>>>>>>>>> +
>>>>>>>>> #define COMPRESS_DATA_RESERVED_SIZE 5
>>>>>>>>> struct compress_data {
>>>>>>>>> __le32 clen; /* compressed data size */
>>>>>>>>> + __le32 chksum; /* compressed data chksum */
>>>>>>>>> __le32 reserved[COMPRESS_DATA_RESERVED_SIZE]; /* reserved */
>>>>>>>>> u8 cdata[]; /* compressed data */
>>>>>>>>> };
>>>>>>>>> @@ -3882,6 +3890,9 @@ static inline void set_compress_context(struct inode *inode)
>>>>>>>>> F2FS_OPTION(sbi).compress_algorithm;
>>>>>>>>> F2FS_I(inode)->i_log_cluster_size =
>>>>>>>>> F2FS_OPTION(sbi).compress_log_size;
>>>>>>>>> + F2FS_I(inode)->i_compress_flag =
>>>>>>>>> + F2FS_OPTION(sbi).compress_chksum ?
>>>>>>>>> + 1 << COMPRESS_CHKSUM : 0;
>>>>>>>>> F2FS_I(inode)->i_cluster_size =
>>>>>>>>> 1 << F2FS_I(inode)->i_log_cluster_size;
>>>>>>>>> F2FS_I(inode)->i_flags |= F2FS_COMPR_FL;
>>>>>>>>> diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
>>>>>>>>> index 657db2fb6739..de8f7fc89efa 100644
>>>>>>>>> --- a/fs/f2fs/inode.c
>>>>>>>>> +++ b/fs/f2fs/inode.c
>>>>>>>>> @@ -456,6 +456,7 @@ static int do_read_inode(struct inode *inode)
>>>>>>>>> le64_to_cpu(ri->i_compr_blocks));
>>>>>>>>> fi->i_compress_algorithm = ri->i_compress_algorithm;
>>>>>>>>> fi->i_log_cluster_size = ri->i_log_cluster_size;
>>>>>>>>> + fi->i_compress_flag = ri->i_compress_flag;
>>>>>>>>> fi->i_cluster_size = 1 << fi->i_log_cluster_size;
>>>>>>>>> set_inode_flag(inode, FI_COMPRESSED_FILE);
>>>>>>>>> }
>>>>>>>>> @@ -634,6 +635,8 @@ void f2fs_update_inode(struct inode *inode, struct page *node_page)
>>>>>>>>> &F2FS_I(inode)->i_compr_blocks));
>>>>>>>>> ri->i_compress_algorithm =
>>>>>>>>> F2FS_I(inode)->i_compress_algorithm;
>>>>>>>>> + ri->i_compress_flag =
>>>>>>>>> + cpu_to_le16(F2FS_I(inode)->i_compress_flag);
>>>>>>>>> ri->i_log_cluster_size =
>>>>>>>>> F2FS_I(inode)->i_log_cluster_size;
>>>>>>>>> }
>>>>>>>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>>>>>>>> index 00eff2f51807..f8de4d83a5be 100644
>>>>>>>>> --- a/fs/f2fs/super.c
>>>>>>>>> +++ b/fs/f2fs/super.c
>>>>>>>>> @@ -146,6 +146,7 @@ enum {
>>>>>>>>> Opt_compress_algorithm,
>>>>>>>>> Opt_compress_log_size,
>>>>>>>>> Opt_compress_extension,
>>>>>>>>> + Opt_compress_chksum,
>>>>>>>>> Opt_atgc,
>>>>>>>>> Opt_err,
>>>>>>>>> };
>>>>>>>>> @@ -214,6 +215,7 @@ static match_table_t f2fs_tokens = {
>>>>>>>>> {Opt_compress_algorithm, "compress_algorithm=%s"},
>>>>>>>>> {Opt_compress_log_size, "compress_log_size=%u"},
>>>>>>>>> {Opt_compress_extension, "compress_extension=%s"},
>>>>>>>>> + {Opt_compress_chksum, "compress_chksum"},
>>>>>>>>> {Opt_atgc, "atgc"},
>>>>>>>>> {Opt_err, NULL},
>>>>>>>>> };
>>>>>>>>> @@ -934,10 +936,14 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount)
>>>>>>>>> F2FS_OPTION(sbi).compress_ext_cnt++;
>>>>>>>>> kfree(name);
>>>>>>>>> break;
>>>>>>>>> + case Opt_compress_chksum:
>>>>>>>>> + F2FS_OPTION(sbi).compress_chksum = true;
>>>>>>>>> + break;
>>>>>>>>> #else
>>>>>>>>> case Opt_compress_algorithm:
>>>>>>>>> case Opt_compress_log_size:
>>>>>>>>> case Opt_compress_extension:
>>>>>>>>> + case Opt_compress_chksum:
>>>>>>>>> f2fs_info(sbi, "compression options not supported");
>>>>>>>>> break;
>>>>>>>>> #endif
>>>>>>>>> @@ -1523,6 +1529,9 @@ static inline void f2fs_show_compress_options(struct seq_file *seq,
>>>>>>>>> seq_printf(seq, ",compress_extension=%s",
>>>>>>>>> F2FS_OPTION(sbi).extensions[i]);
>>>>>>>>> }
>>>>>>>>> +
>>>>>>>>> + if (F2FS_OPTION(sbi).compress_chksum)
>>>>>>>>> + seq_puts(seq, ",compress_chksum");
>>>>>>>>> }
>>>>>>>>> static int f2fs_show_options(struct seq_file *seq, struct dentry *root)
>>>>>>>>> diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
>>>>>>>>> index a5dbb57a687f..7dc2a06cf19a 100644
>>>>>>>>> --- a/include/linux/f2fs_fs.h
>>>>>>>>> +++ b/include/linux/f2fs_fs.h
>>>>>>>>> @@ -273,7 +273,7 @@ struct f2fs_inode {
>>>>>>>>> __le64 i_compr_blocks; /* # of compressed blocks */
>>>>>>>>> __u8 i_compress_algorithm; /* compress algorithm */
>>>>>>>>> __u8 i_log_cluster_size; /* log of cluster size */
>>>>>>>>> - __le16 i_padding; /* padding */
>>>>>>>>> + __le16 i_compress_flag; /* compress flag */
>>>>>>>>> __le32 i_extra_end[0]; /* for attribute size calculation */
>>>>>>>>> } __packed;
>>>>>>>>> __le32 i_addr[DEF_ADDRS_PER_INODE]; /* Pointers to data blocks */
>>>>>>>>> --
>>>>>>>>> 2.26.2
>>>>>>>> .
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Linux-f2fs-devel mailing list
>>>>>>> [email protected]
>>>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>>>>> .
>>>>>>>
>>>>> .
>>>>>
>>> .
>>>
> .
>

2020-11-10 12:20:40

by David Laight

[permalink] [raw]
Subject: RE: [f2fs-dev] [PATCH] f2fs: compress: support chksum

From: Chao Yu
> Sent: 10 November 2020 06:28
...
> Actually, I think the both results are the same, inode chksum doesn't match inode
> metadata, like current case that cluster chksum doesn't match cluster data, it
> doesn't matter how it becomes mismatched.
>
> And also, in those inode corrupted cases, there should be some cases that hacker
> or fuzz tester injects random data in chksum intentionally, or bit-flipping happed
> on chksum value in inode, inode metadata (except inode chksum) is integrated though,
> we can not distinguish such cases from case of inode metadata (except inode chksum)
> corruption.

If you reverse the crc algotithm on the checksum error bits and get a
pattern with a small number of adjacent ones then that would be a burst
error that would generate the observed crc error.

So if you have a 32bit crc on an 8k byte buffer there are 64k bit positions
so (IIRC from my university days) you can fix an assumed 16 bit error burst.
That would be a bit dangerous through.
OTOH if you find an 8-bit error pattern you could assume (and fix)
a single byte corruption.

Whether that is worth doing depends on any assumed mechanisms
for generating corruptions.

I don't know of any ethernet hardware that uses the 32bit crc
to fix very short burst errors (or even single bit errors)
even though it could be done and burst errors are quite likely.
(Below Ge speeds.)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)