Subject: Re: [f2fs-dev] [PATCH] f2fs: compress: fix zstd data corruption
From: Chao Yu
To: Daeho Jeong
Date: Fri, 8 May 2020 14:42:41 +0800
Message-ID: <2a241a80-2597-ef9e-62b5-cf2b8bdb33c4@huawei.com>
References: <20200508011603.54553-1-yuchao0@huawei.com> <0d41e29e-c601-e016-e471-aed184ca4a6a@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2020/5/8 11:30, Daeho Jeong wrote:
> I am a little bit confused.
>
> With compress_log_size=2 (4 pages), every compression algorithm will set
> cc->nr_cpages to 5 pages, like below:
>
>         max_len = COMPRESS_HEADER_SIZE + cc->clen;
>         cc->nr_cpages = DIV_ROUND_UP(max_len, PAGE_SIZE);
>
>         cc->cpages = f2fs_kzalloc(sbi, sizeof(struct page *) *
>                                         cc->nr_cpages, GFP_NOFS);
>
> Then we call cops->compress_pages(cc), and the returned length of the
> compressed data is stored in cc->clen in every case. If cc->clen is larger
> than max_len, we give up compression:
>
>         ret = cops->compress_pages(cc);
>         if (ret)
>                 goto out_vunmap_cbuf;
>
>         max_len = PAGE_SIZE * (cc->cluster_size - 1) - COMPRESS_HEADER_SIZE;
>
>         if (cc->clen > max_len) {
>                 ret = -EAGAIN;
>                 goto out_vunmap_cbuf;
>         }
>
> So, with your patch, we will use just 3 pages for ZSTD but still 5 pages for
> LZO and LZ4. My question was whether it is also possible to decrease the
> compression buffer size for LZO and LZ4 to 3 pages, like the ZSTD case.
> I was just curious about that. :)

I guess we can change LZ4 as we did for the ZSTD case, since it supports
partial compression:

- lz4_compress_pages
 - LZ4_compress_default
  - LZ4_compress_fast
   - LZ4_compress_fast_extState
        if (maxOutputSize < LZ4_COMPRESSBOUND(inputSize))
    - LZ4_compress_generic(..., limitedOutput, ...)
     - if (outputLimited && boundary_check_condition)
               return 0;

And for the LZO case, it looks like we have to keep allocating 5 pages for the
worst compression case, as it doesn't support partial compression as far as I
checked.
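Something like the below is roughly what I have in mind for LZ4 -- just an
untested sketch, assuming lz4_init_compress_ctx() would be changed to preset
cc->clen to PAGE_SIZE * (cc->cluster_size - 1) - COMPRESS_HEADER_SIZE instead
of LZ4_compressBound(cc->rlen):

static int lz4_compress_pages(struct compress_ctx *cc)
{
	int len;

	/* cc->clen is assumed to hold the reduced destination size here */
	len = LZ4_compress_default(cc->rbuf, cc->cbuf->cdata, cc->rlen,
					cc->clen, cc->private);
	if (!len) {
		/*
		 * In limitedOutput mode LZ4_compress_generic() returns 0 when
		 * the compressed data cannot fit into maxOutputSize, so treat
		 * it as "store this cluster uncompressed" rather than -EIO.
		 */
		return -EAGAIN;
	}

	cc->clen = len;
	return 0;
}

That way the only behavioral change is that a zero return from
LZ4_compress_default() is no longer treated as an I/O error, mirroring what
the zstd path does now.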
Thanks,

>
>
> On Fri, May 8, 2020 at 11:48 AM, Chao Yu wrote:
>
> > Hi Daeho,
> >
> > On 2020/5/8 9:28, Daeho Jeong wrote:
> > > Hi Chao,
> > >
> > > IIUC, you are trying not to use ZSTD_compressBound() to save memory
> > > space. Am I right?
> > >
> > > Then, how about LZ4_compressBound() for LZ4 and lzo1x_worst_compress()
> > > for LZO?
> >
> > Oops, it looks like those limits were used wrongly...
> >
> > #define LZ4_COMPRESSBOUND(isize)        (\
> >         (unsigned int)(isize) > (unsigned int)LZ4_MAX_INPUT_SIZE \
> >         ? 0 \
> >         : (isize) + ((isize)/255) + 16)
> >
> > #define lzo1x_worst_compress(x) ((x) + ((x) / 16) + 64 + 3 + 2)
> >
> > The newly calculated bound is larger than the target buffer size.
> >
> > However, the comments on LZ4_compress_default() say:
> >
> > ...
> >  * @maxOutputSize: full or partial size of buffer 'dest'
> >  *      which must be already allocated
> > ...
> > int LZ4_compress_default(const char *source, char *dest, int inputSize,
> >         int maxOutputSize, void *wrkmem);
> >
> > And @out_len in lzo1x_1_compress() is an output parameter that returns
> > the length of data the compressor wrote into the @out buffer.
> >
> > Let me know if I missed something.
> >
> > Thanks,
> >
> > > Could we save more memory space for these two cases, like ZSTD?
> > > As you know, we are using a 5-page compression buffer for LZ4 and LZO
> > > with compress_log_size=2, and if the compressed data doesn't fit in
> > > 3 pages, it returns -EAGAIN to give up compressing that one.
> > >
> > > Thanks,
> > >
> > > On Fri, May 8, 2020 at 10:17 AM, Chao Yu wrote:
> > >
> > >> During zstd compression, ZSTD_endStream() may return a non-zero value
> > >> because the destination buffer is full while there is still compressed
> > >> data remaining in the intermediate buffer. It means that the zstd
> > >> algorithm cannot save at least one block of space, so let's just write
> > >> back the raw data instead of the compressed data. This fixes data
> > >> corruption when decompressing an incompletely stored compressed cluster.
> > >>
> > >> Signed-off-by: Daeho Jeong
> > >> Signed-off-by: Chao Yu
> > >> ---
> > >>  fs/f2fs/compress.c | 7 +++++++
> > >>  1 file changed, 7 insertions(+)
> > >>
> > >> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
> > >> index c22cc0d37369..5e4947250262 100644
> > >> --- a/fs/f2fs/compress.c
> > >> +++ b/fs/f2fs/compress.c
> > >> @@ -358,6 +358,13 @@ static int zstd_compress_pages(struct compress_ctx *cc)
> > >>                 return -EIO;
> > >>         }
> > >>
> > >> +       /*
> > >> +        * there is compressed data remained in intermediate buffer due to
> > >> +        * no more space in cbuf.cdata
> > >> +        */
> > >> +       if (ret)
> > >> +               return -EAGAIN;
> > >> +
> > >>         cc->clen = outbuf.pos;
> > >>         return 0;
> > >>  }
> > >> --
> > >> 2.18.0.rc1
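
For context, with the change above applied, the tail of zstd_compress_pages()
ends up behaving roughly as below (a paraphrased sketch, not the literal code),
which is why a non-zero, non-error return from ZSTD_endStream() can simply be
turned into the same -EAGAIN "write it uncompressed" fallback:

	ret = ZSTD_endStream(stream, &outbuf);
	if (ZSTD_isError(ret))
		return -EIO;	/* real compression error */

	/*
	 * A non-zero, non-error return value is zstd's estimate of how many
	 * bytes are still sitting in its internal buffers; since outbuf has a
	 * fixed size of roughly cluster_size - 1 pages, the cluster cannot be
	 * stored compressed, so fall back to writing the raw pages.
	 */
	if (ret)
		return -EAGAIN;

	cc->clen = outbuf.pos;
	return 0;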