Subject: Re: [f2fs-dev] [PATCH] f2fs: compress: support chksum
To: Jaegeuk Kim
From: Chao Yu
Date: Tue, 10 Nov 2020 14:27:49 +0800
Message-ID: <513c56d7-cefd-37a8-efdf-fa1ac8c2a1d3@huawei.com>
In-Reply-To: <20201110042353.GB1598246@google.com>
References: <20201102122333.76667-1-yuchao0@huawei.com>
 <20201102163123.GD529594@google.com>
 <756e482c-b638-1c09-3868-ae45d33ed2c2@huawei.com>
 <6b5bce0e-c967-b9cf-3544-a8e65595059c@huawei.com>
 <20201106211247.GA1474936@google.com>
 <908682bb-486c-222f-bea7-43fc961ef1b0@huawei.com>
 <20201109170625.GB2129970@google.com>
 <3417aea5-ace8-74be-ec26-f491dddea676@huawei.com>
 <20201110042353.GB1598246@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2020/11/10 12:23, Jaegeuk Kim wrote:
> On 11/10, Chao Yu wrote:
>> On 2020/11/10 1:06, Jaegeuk Kim wrote:
>>> On 11/09, Chao Yu wrote:
>>>> On 2020/11/7 5:12, Jaegeuk Kim wrote:
>>>>> On 11/03, Chao Yu wrote:
>>>>>> On 2020/11/3 10:02, Chao Yu wrote:
>>>>>>> On 2020/11/3 0:31, Jaegeuk Kim wrote:
>>>>>>>> On 11/02, Chao Yu wrote:
>>>>>>>>> This patch supports to store chksum value with compressed
>>>>>>>>> data, and verify the integrality of compressed data while
>>>>>>>>> reading the data.
>>>>>>>>>
>>>>>>>>> The feature can be enabled through specifying mount option
>>>>>>>>> 'compress_chksum'.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Chao Yu
>>>>>>>>> ---
>>>>>>>>>    Documentation/filesystems/f2fs.rst |  1 +
>>>>>>>>>    fs/f2fs/compress.c                 | 20 ++++++++++++++++++++
>>>>>>>>>    fs/f2fs/f2fs.h                     | 13 ++++++++++++-
>>>>>>>>>    fs/f2fs/inode.c                    |  3 +++
>>>>>>>>>    fs/f2fs/super.c                    |  9 +++++++++
>>>>>>>>>    include/linux/f2fs_fs.h            |  2 +-
>>>>>>>>>    6 files changed, 46 insertions(+), 2 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
>>>>>>>>> index b8ee761c9922..985ae7d35066 100644
>>>>>>>>> --- a/Documentation/filesystems/f2fs.rst
>>>>>>>>> +++ b/Documentation/filesystems/f2fs.rst
>>>>>>>>> @@ -260,6 +260,7 @@ compress_extension=%s Support adding specified extension, so that f2fs can enab
>>>>>>>>>                          For other files, we can still enable compression via ioctl.
>>>>>>>>>                          Note that, there is one reserved special extension '*', it
>>>>>>>>>                          can be set to enable compression for all files.
>>>>>>>>> +compress_chksum         Support verifying chksum of raw data in compressed cluster.
>>>>>>>>>  inlinecrypt             When possible, encrypt/decrypt the contents of encrypted
>>>>>>>>>                          files using the blk-crypto framework rather than
>>>>>>>>>                          filesystem-layer encryption. This allows the use of
>>>>>>>>> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
>>>>>>>>> index 14262e0f1cd6..a4e0d2c745b6 100644
>>>>>>>>> --- a/fs/f2fs/compress.c
>>>>>>>>> +++ b/fs/f2fs/compress.c
>>>>>>>>> @@ -602,6 +602,7 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
>>>>>>>>>  				f2fs_cops[fi->i_compress_algorithm];
>>>>>>>>>  	unsigned int max_len, new_nr_cpages;
>>>>>>>>>  	struct page **new_cpages;
>>>>>>>>> +	u32 chksum = 0;
>>>>>>>>>  	int i, ret;
>>>>>>>>>  	trace_f2fs_compress_pages_start(cc->inode, cc->cluster_idx,
>>>>>>>>> @@ -655,6 +656,11 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
>>>>>>>>>  	cc->cbuf->clen = cpu_to_le32(cc->clen);
>>>>>>>>> +	if (fi->i_compress_flag & 1 << COMPRESS_CHKSUM)
>>>>>>>>> +		chksum = f2fs_crc32(F2FS_I_SB(cc->inode),
>>>>>>>>> +					cc->cbuf->cdata, cc->clen);
>>>>>>>>> +	cc->cbuf->chksum = cpu_to_le32(chksum);
>>>>>>>>> +
>>>>>>>>>  	for (i = 0; i < COMPRESS_DATA_RESERVED_SIZE; i++)
>>>>>>>>>  		cc->cbuf->reserved[i] = cpu_to_le32(0);
>>>>>>>>> @@ -721,6 +727,7 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
>>>>>>>>>  			(struct decompress_io_ctx *)page_private(page);
>>>>>>>>>  	struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
>>>>>>>>>  	struct f2fs_inode_info *fi= F2FS_I(dic->inode);
>>>>>>>>> +	struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode);
>>>>>>>>>  	const struct f2fs_compress_ops *cops =
>>>>>>>>>  			f2fs_cops[fi->i_compress_algorithm];
>>>>>>>>>  	int ret;
>>>>>>>>> @@ -790,6 +797,19 @@ void f2fs_decompress_pages(struct bio *bio, struct page *page, bool verity)
>>>>>>>>>  	ret = cops->decompress_pages(dic);
>>>>>>>>> +	if (!ret && fi->i_compress_flag & 1 << COMPRESS_CHKSUM) {
>>>>>>>>> +		u32 provided = le32_to_cpu(dic->cbuf->chksum);
>>>>>>>>> +		u32 calculated = f2fs_crc32(sbi, dic->cbuf->cdata, dic->clen);
>>>>>>>>> +
>>>>>>>>> +		if (provided != calculated) {
>>>>>>>>> +			printk_ratelimited(
>>>>>>>>> +				"%sF2FS-fs (%s): checksum invalid, nid = %lu, %x vs %x",
>>>>>>>>> +				KERN_INFO, sbi->sb->s_id, dic->inode->i_ino,
>>>>>>>>> +				provided, calculated);
>>>>>>>>> +			ret = -EFSCORRUPTED;
>>>>>>>>
>>>>>>>> Do we need to change fsck.f2fs to recover this?
>>>>>>
>>>>>> However, we don't know which one is correct, compressed data or chksum value?
>>>>>> if compressed data was corrupted, repairing chksum value doesn't help.
>>>>>>
>>>>>> Or how about adding chksum values for both raw data and compressed data.
>>>>>>
>>>>>> #define COMPRESS_DATA_RESERVED_SIZE		3
>>>>>> struct compress_data {
>>>>>> 	__le32 clen;			/* compressed data size */
>>>>>> +	__le32 raw_chksum;		/* raw data chksum */
>>>>>> +	__le32 compress_chksum;		/* compressed data chksum */
>>>>>> 	__le32 reserved[COMPRESS_DATA_RESERVED_SIZE];	/* reserved */
>>>>>> 	u8 cdata[];			/* compressed data */
>>>>>> };
>>>>>>
>>>>>> raw_chksum      compress_chksum
>>>>>> match           match           -> data is verified, pass
>>>>>> not match       match           -> repair raw_chksum
>>>>>> match           not match       -> repair compress_chksum
>>>>>
>>>>> I think only compress_chksum would be enough. BTW, can we give WARN_ON and
>>>>> marking a FSCK flag without returning EFSCORRUPTED, since we don't really
>>>>> know who was corrupted. If data was corrupted, we should be able to see app
>>>>> corruption. In that case, we can check the kernel log. If checksum was simply
>>>>
>>>> I don't think that app will always corrupt once data was corrupted, I doubt its
>>>> behavior could be slightly abnormal in some cases, e.g. it can just cause apps to
>>>> show wrong number in interaction interface.
>>>
>>> I didn't say we can always get it. But, it's likely to happen with something
>>> like that.
>>>
>>>>
>>>> In this case, if we fix chksum in fsck, the wrong data will never be found due to
>>>> data's chksum matches data itself after repair.
>>>
>>> At least, we should see that log as a hint.
>>>
>>>>
>>>> IMO, the chksum and data was a whole dataset, once they are mismatch, we can
>>>> not trust either of them, fsck should do nothing on them unless we store parity
>>>> bits or replica.
>>>
>>> Yeah, the point is those are not written as a transaction. Three concerns to me:
>>
>> Just to confirm:
>>
>>> 1) agreed to see compressed data corruption by checksum
>>
>> return EFSBADCRC always like we did as inode chksum feature.
>
> inode chksum is a bit different, since 4KB write is atomic. If chksum is
> corrupted, inode should be broken.

Actually, I think both results are the same: the inode chksum doesn't match
the inode metadata, just as in the current case the cluster chksum doesn't
match the cluster data. It doesn't matter how they became mismatched.

Also, among those inode corruption cases, there should be some where a hacker
or fuzz tester intentionally injects random data into the chksum, or where a
bit flip happened on the chksum value in the inode while the inode metadata
(except the inode chksum) is still intact. We cannot distinguish such cases
from corruption of the inode metadata itself (except the inode chksum).

Thanks,

>
>>
>> 2) I don't want to see the error messages all the time
>>
>> Something like below?
>>
>> if (provided != calculated) {
>> 	if (!is_inode_flag_set(,FI_DATA_CORRUPTED)) {
>> 		printk_ratelimited();
>> 		WARN_ON()
>> 		set_inode_flag(, FI_DATA_CORRUPTED);
>> 	}
>> 	return EFSBADCRC;
>
> No, what if only checksum was wrong? I mean giving WARN_ON() and setting
> FSCK_FLAG to stop endless WARN_ON().
>
>> }
>>
>> 3) don't touch the data, even if it was corrupted, since there's no way to fix it.
>>
>> It seems that we don't need to change fsck.f2fs finally...
>>
>> Thanks,
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>> corrupted, next fsck will fix the checksum. So, in general, I hope to keep
>>>>> the data as is and raise a flag by the checksum.
>>>>>
>>>>>> not match       not match       -> corrupted, can not repair
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>> Yes, prepared to update inode layout in fsck.f2fs w/ kernel side change.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>>
>>>>>>>>> +		}
>>>>>>>>> +	}
>>>>>>>>> +
>>>>>>>>>  out_vunmap_cbuf:
>>>>>>>>>  	vm_unmap_ram(dic->cbuf, dic->nr_cpages);
>>>>>>>>>  out_vunmap_rbuf:
>>>>>>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>>>>>>>> index 99bcf4b44a9c..2ae254ab7b7d 100644
>>>>>>>>> --- a/fs/f2fs/f2fs.h
>>>>>>>>> +++ b/fs/f2fs/f2fs.h
>>>>>>>>> @@ -147,7 +147,8 @@ struct f2fs_mount_info {
>>>>>>>>>  	/* For compression */
>>>>>>>>>  	unsigned char compress_algorithm;	/* algorithm type */
>>>>>>>>> -	unsigned compress_log_size;		/* cluster log size */
>>>>>>>>> +	unsigned char compress_log_size;	/* cluster log size */
>>>>>>>>> +	bool compress_chksum;			/* compressed data chksum */
>>>>>>>>>  	unsigned char compress_ext_cnt;		/* extension count */
>>>>>>>>>  	unsigned char extensions[COMPRESS_EXT_NUM][F2FS_EXTENSION_LEN];	/* extensions */
>>>>>>>>>  };
>>>>>>>>> @@ -731,6 +732,7 @@ struct f2fs_inode_info {
>>>>>>>>>  	atomic_t i_compr_blocks;		/* # of compressed blocks */
>>>>>>>>>  	unsigned char i_compress_algorithm;	/* algorithm type */
>>>>>>>>>  	unsigned char i_log_cluster_size;	/* log of cluster size */
>>>>>>>>> +	unsigned short i_compress_flag;		/* compress flag */
>>>>>>>>>  	unsigned int i_cluster_size;		/* cluster size */
>>>>>>>>>  };
>>>>>>>>> @@ -1270,9 +1272,15 @@ enum compress_algorithm_type {
>>>>>>>>>  	COMPRESS_MAX,
>>>>>>>>>  };
>>>>>>>>> +enum compress_flag {
>>>>>>>>> +	COMPRESS_CHKSUM,
>>>>>>>>> +	COMPRESS_MAX_FLAG,
>>>>>>>>> +};
>>>>>>>>> +
>>>>>>>>>  #define COMPRESS_DATA_RESERVED_SIZE		5
>>>>>>>>>  struct compress_data {
>>>>>>>>>  	__le32 clen;			/* compressed data size */
>>>>>>>>> +	__le32 chksum;			/* compressed data chksum */
>>>>>>>>>  	__le32 reserved[COMPRESS_DATA_RESERVED_SIZE];	/* reserved */
>>>>>>>>>  	u8 cdata[];			/* compressed data */
>>>>>>>>>  };
>>>>>>>>> @@ -3882,6 +3890,9 @@ static inline void set_compress_context(struct inode *inode)
>>>>>>>>>  			F2FS_OPTION(sbi).compress_algorithm;
>>>>>>>>>  	F2FS_I(inode)->i_log_cluster_size =
>>>>>>>>>  			F2FS_OPTION(sbi).compress_log_size;
>>>>>>>>> +	F2FS_I(inode)->i_compress_flag =
>>>>>>>>> +			F2FS_OPTION(sbi).compress_chksum ?
>>>>>>>>> +				1 << COMPRESS_CHKSUM : 0;
>>>>>>>>>  	F2FS_I(inode)->i_cluster_size =
>>>>>>>>>  			1 << F2FS_I(inode)->i_log_cluster_size;
>>>>>>>>>  	F2FS_I(inode)->i_flags |= F2FS_COMPR_FL;
>>>>>>>>> diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
>>>>>>>>> index 657db2fb6739..de8f7fc89efa 100644
>>>>>>>>> --- a/fs/f2fs/inode.c
>>>>>>>>> +++ b/fs/f2fs/inode.c
>>>>>>>>> @@ -456,6 +456,7 @@ static int do_read_inode(struct inode *inode)
>>>>>>>>>  					le64_to_cpu(ri->i_compr_blocks));
>>>>>>>>>  			fi->i_compress_algorithm = ri->i_compress_algorithm;
>>>>>>>>>  			fi->i_log_cluster_size = ri->i_log_cluster_size;
>>>>>>>>> +			fi->i_compress_flag = ri->i_compress_flag;
>>>>>>>>>  			fi->i_cluster_size = 1 << fi->i_log_cluster_size;
>>>>>>>>>  			set_inode_flag(inode, FI_COMPRESSED_FILE);
>>>>>>>>>  		}
>>>>>>>>> @@ -634,6 +635,8 @@ void f2fs_update_inode(struct inode *inode, struct page *node_page)
>>>>>>>>>  					&F2FS_I(inode)->i_compr_blocks));
>>>>>>>>>  			ri->i_compress_algorithm =
>>>>>>>>>  				F2FS_I(inode)->i_compress_algorithm;
>>>>>>>>> +			ri->i_compress_flag =
>>>>>>>>> +				cpu_to_le16(F2FS_I(inode)->i_compress_flag);
>>>>>>>>>  			ri->i_log_cluster_size =
>>>>>>>>>  				F2FS_I(inode)->i_log_cluster_size;
>>>>>>>>>  		}
>>>>>>>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>>>>>>>> index 00eff2f51807..f8de4d83a5be 100644
>>>>>>>>> --- a/fs/f2fs/super.c
>>>>>>>>> +++ b/fs/f2fs/super.c
>>>>>>>>> @@ -146,6 +146,7 @@ enum {
>>>>>>>>>  	Opt_compress_algorithm,
>>>>>>>>>  	Opt_compress_log_size,
>>>>>>>>>  	Opt_compress_extension,
>>>>>>>>> +	Opt_compress_chksum,
>>>>>>>>>  	Opt_atgc,
>>>>>>>>>  	Opt_err,
>>>>>>>>>  };
>>>>>>>>> @@ -214,6 +215,7 @@ static match_table_t f2fs_tokens = {
>>>>>>>>>  	{Opt_compress_algorithm, "compress_algorithm=%s"},
>>>>>>>>>  	{Opt_compress_log_size, "compress_log_size=%u"},
>>>>>>>>>  	{Opt_compress_extension, "compress_extension=%s"},
>>>>>>>>> +	{Opt_compress_chksum, "compress_chksum"},
>>>>>>>>>  	{Opt_atgc, "atgc"},
>>>>>>>>>  	{Opt_err, NULL},
>>>>>>>>>  };
>>>>>>>>> @@ -934,10 +936,14 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount)
>>>>>>>>>  			F2FS_OPTION(sbi).compress_ext_cnt++;
>>>>>>>>>  			kfree(name);
>>>>>>>>>  			break;
>>>>>>>>> +		case Opt_compress_chksum:
>>>>>>>>> +			F2FS_OPTION(sbi).compress_chksum = true;
>>>>>>>>> +			break;
>>>>>>>>>  #else
>>>>>>>>>  		case Opt_compress_algorithm:
>>>>>>>>>  		case Opt_compress_log_size:
>>>>>>>>>  		case Opt_compress_extension:
>>>>>>>>> +		case Opt_compress_chksum:
>>>>>>>>>  			f2fs_info(sbi, "compression options not supported");
>>>>>>>>>  			break;
>>>>>>>>>  #endif
>>>>>>>>> @@ -1523,6 +1529,9 @@ static inline void f2fs_show_compress_options(struct seq_file *seq,
>>>>>>>>>  		seq_printf(seq, ",compress_extension=%s",
>>>>>>>>>  			F2FS_OPTION(sbi).extensions[i]);
>>>>>>>>>  	}
>>>>>>>>> +
>>>>>>>>> +	if (F2FS_OPTION(sbi).compress_chksum)
>>>>>>>>> +		seq_puts(seq, ",compress_chksum");
>>>>>>>>>  }
>>>>>>>>>  static int f2fs_show_options(struct seq_file *seq, struct dentry *root)
>>>>>>>>> diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
>>>>>>>>> index a5dbb57a687f..7dc2a06cf19a 100644
>>>>>>>>> --- a/include/linux/f2fs_fs.h
>>>>>>>>> +++ b/include/linux/f2fs_fs.h
>>>>>>>>> @@ -273,7 +273,7 @@ struct f2fs_inode {
>>>>>>>>>  			__le64 i_compr_blocks;	/* # of compressed blocks */
>>>>>>>>>  			__u8 i_compress_algorithm;	/* compress algorithm */
>>>>>>>>>  			__u8 i_log_cluster_size;	/* log of cluster size */
>>>>>>>>> -			__le16 i_padding;	/* padding */
>>>>>>>>> +			__le16 i_compress_flag;	/* compress flag */
>>>>>>>>>  			__le32 i_extra_end[0];	/* for attribute size calculation */
>>>>>>>>>  		} __packed;
>>>>>>>>>  		__le32 i_addr[DEF_ADDRS_PER_INODE];	/* Pointers to data blocks */
>>>>>>>>> --
>>>>>>>>> 2.26.2
>>>>>>>> .
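
For readers following the thread, the policy discussed above (store a checksum
next to the compressed payload, verify it on read, warn only once, and leave
the data untouched) can be sketched in user-space C. This is only an
illustration under stated assumptions: demo_cluster, demo_inode, demo_crc32(),
demo_store(), demo_verify() and DEMO_EBADCRC are hypothetical names that do not
exist in the kernel, and the plain CRC-32 used here is a stand-in, not
f2fs_crc32().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DEMO_EBADCRC 1	/* placeholder error code, not the kernel's EFSBADCRC */

/* Hypothetical stand-in for the cluster header quoted in the patch. */
struct demo_cluster {
	uint32_t clen;		/* compressed data size */
	uint32_t chksum;	/* checksum of cdata[0..clen) */
	uint8_t cdata[64];	/* payload, fixed size for the demo */
};

/* Hypothetical per-file state; the flag mimics FI_DATA_CORRUPTED. */
struct demo_inode {
	bool corrupted_flag;	/* set after the first reported mismatch */
	int warnings;		/* how many warnings were printed */
};

/* Plain bitwise CRC-32 (reflected polynomial 0xEDB88320). */
static uint32_t demo_crc32(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Write path: keep the payload and its checksum together. */
static void demo_store(struct demo_cluster *c, const void *data, uint32_t len)
{
	assert(len <= sizeof(c->cdata));
	memcpy(c->cdata, data, len);
	c->clen = len;
	c->chksum = demo_crc32(c->cdata, c->clen);
}

/*
 * Read path: report a mismatch on every call, but print the warning only
 * the first time, and never modify the data or the stored checksum.
 */
static int demo_verify(struct demo_inode *ino, const struct demo_cluster *c)
{
	uint32_t calculated = demo_crc32(c->cdata, c->clen);

	if (c->chksum == calculated)
		return 0;

	if (!ino->corrupted_flag) {
		fprintf(stderr, "checksum invalid, %x vs %x\n",
			(unsigned)c->chksum, (unsigned)calculated);
		ino->warnings++;
		ino->corrupted_flag = true;
	}
	return -DEMO_EBADCRC;
}
```

Whether the read should fail or only warn on mismatch is exactly the open
question in the thread; the sketch returns an error, which corresponds to the
"return EFSBADCRC always" variant rather than a settled design.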