Received: by 2002:a25:23cc:0:0:0:0:0 with SMTP id j195csp1397615ybj; Thu, 7 May 2020 23:56:05 -0700 (PDT) X-Google-Smtp-Source: APiQypLGqye1VVkmWPdgA9jNIGsr7i0zPz01daGX4Hx05Zd2zMSepeVdbW3nGSXu8B+Z/YP5MJZO X-Received: by 2002:aa7:d4cd:: with SMTP id t13mr944299edr.30.1588920965228; Thu, 07 May 2020 23:56:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1588920965; cv=none; d=google.com; s=arc-20160816; b=Az8fSSNPrNCXdmEwor48SZODiO8WFY3wdRG5/c5jobpPI2nqaiB1I1mZ4oAXevPnzS DkHt4TQZkmVGvkxL3f1eeXUKDx1dFyqfJ0ddZmOgFo7cOeZ3q1UnB99ZGdM8d/mMakps Cncwg0EWr5r7APFIoaHD5+V8vp881pgbh1+XPdnqsM/YFNBlmy/9yZDUuAQvQbno6QxD 1v0mW++oW022EaJ/IG9adDkz/S/LS6sgNpd4K2RCD0JQt+6offqoTKkzfttfVwrkmBli +Rtpr4e68qwS8J9Ebmr4fC/E8FArdJg2BWqvGEMnHDxeexJj+/hV72xAfrpBvyKlZfQr xPHA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=snsveWT+yJB6VQ5fgjYi4vu754HWsLYGYs2/57IiIIs=; b=aKxy9b360d8UhW7lYDOCACBY0nZDH5cFRPOnWjrQnx95j3d+Ma2Yr2VB0d/hT9Kgca DEEfzxHv4D9XtF3lSoom34HmGqW2ane/b4MPDI39EGs+WV9NbV72ujPhTnXvpJGy12vT nUuAdjMwrig00V1PiIXLoI05Wbg46I+dGUqOLCraQpuPZXj+1izQfezQW7At/LvvDGZ5 5xvN9ULCPQGCaEtnjeWiPzaxLRK0d9++7L4ttLfVDcn9BgI4hfIev3Azi93pYQ4L9cUY TzZR8u1XjZakMiPTHkb/mpv5lzQNp0lqH8hFxjmW12F/fvufZoUXhlJtOISSHLPfm+Ig /U5g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=THnNxIew; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id i12si440813edl.294.2020.05.07.23.55.42; Thu, 07 May 2020 23:56:05 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=THnNxIew; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726797AbgEHGwI (ORCPT + 99 others); Fri, 8 May 2020 02:52:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33194 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1725971AbgEHGwI (ORCPT ); Fri, 8 May 2020 02:52:08 -0400 Received: from mail-lj1-x242.google.com (mail-lj1-x242.google.com [IPv6:2a00:1450:4864:20::242]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C7718C05BD43 for ; Thu, 7 May 2020 23:52:07 -0700 (PDT) Received: by mail-lj1-x242.google.com with SMTP id f11so537802ljp.1 for ; Thu, 07 May 2020 23:52:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=snsveWT+yJB6VQ5fgjYi4vu754HWsLYGYs2/57IiIIs=; b=THnNxIewCJV4MBQ49YK3xWX6brA2V105UMphDKN/uYq0/0CIK/y4l1fcWiHiQ8m6gc P+yfSZep8hKMPuXDcEdPhTvbDB8aH8Za4pekl33uteo/mKqWS8f/A32ttRRYoAGM4F0w Qn4JpFG/AjGgkZwx+p/EHQ3Il/JOvTGx/H3epo9E+UK6ZcJeNtYvCdAxmigl91gu73Ch pmkb5eGaOds2e9IT9MK6uJRgTHoGN5235d9j0yVCDp5bQGROvfbPe4Uokz22O7+cSObp rqmM2I5SqqFV07IgcpdJTbFJIscQr2z2J3nkIA0P3W6mdaxkQ4agZQlo2xew99k1Zjhq u56Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=snsveWT+yJB6VQ5fgjYi4vu754HWsLYGYs2/57IiIIs=; b=r+9Jy1MM0HOZRfWW5SNlFoC6eT/72RdNK7DvBaKcI981/MOclDNu7VLiY1ipcgrjSF wLz+kADjlUQ9CS+mwPN/KdR4g0fXuMULqaKXhhlU6HOUD7D9CQn/poKWipyejkbcoPaQ ubnAAowUmzBl8MMWe7Ni7tZU2iKJGHP4NBiVqqxIkmOKRrbyFkMQLgxlRQNHI18tuv0K JoCKu/SC5qJyGfuJBcaIcfJAtXpVj7XTOBqtJ3n85AZkpujItzrjhDwRRK4wJ25War2L mtpNR0SY32oZXRrjzV2ysPt7K461oYp40tG7IcakyeL6BtOKh0Lv2onxdOJJ4ZRCT+g6 C0Iw== X-Gm-Message-State: AOAM531o0pL0TAj1RsDq4kH39Axy9L3Umy6XAco2ho2EbRmKuPXZJGwn 9o0rOeZ93hJnAjnOhfnZVzlixGGC3i55JsZTbe8= X-Received: by 2002:a2e:361a:: with SMTP id d26mr657621lja.171.1588920726023; Thu, 07 May 2020 23:52:06 -0700 (PDT) MIME-Version: 1.0 References: <20200508011603.54553-1-yuchao0@huawei.com> <0d41e29e-c601-e016-e471-aed184ca4a6a@huawei.com> <2a241a80-2597-ef9e-62b5-cf2b8bdb33c4@huawei.com> In-Reply-To: <2a241a80-2597-ef9e-62b5-cf2b8bdb33c4@huawei.com> From: Daeho Jeong Date: Fri, 8 May 2020 15:51:54 +0900 Message-ID: Subject: Re: [f2fs-dev] [PATCH] f2fs: compress: fix zstd data corruption To: Chao Yu Cc: jaegeuk@kernel.org, Daeho Jeong , linux-kernel@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Great, thanks for checking~! :) 2020=EB=85=84 5=EC=9B=94 8=EC=9D=BC (=EA=B8=88) =EC=98=A4=ED=9B=84 3:42, Ch= ao Yu =EB=8B=98=EC=9D=B4 =EC=9E=91=EC=84=B1: > > On 2020/5/8 11:30, Daeho Jeong wrote: > > I am a little bit confused. > > > > In compress_log=3D2 (4 pages), > > > > Every compression algorithm will set the cc->nr_cpages to 5 pages like = below. > > > > max_len =3D COMPRESS_HEADER_SIZE + cc->clen; > > cc->nr_cpages =3D DIV_ROUND_UP(max_len, PAGE_SIZE); > > > > cc->cpages =3D f2fs_kzalloc(sbi, sizeof(struct page *) * > > cc->nr_cpages, GFP_NOFS); > > > > And call cops->compress_pages(cc) and the returned length of the compre= ssed data will be set to cc->clen for every case. > > And if the cc->clen is larger than max_len, we will give up compression= . > > > > ret =3D cops->compress_pages(cc); > > if (ret) > > goto out_vunmap_cbuf; > > > > max_len =3D PAGE_SIZE * (cc->cluster_size - 1) - COMPRESS_HEADE= R_SIZE; > > > > if (cc->clen > max_len) { > > ret =3D -EAGAIN; > > goto out_vunmap_cbuf; > > } > > > > So, with your patch, we will just use 3 pages for ZSTD and 5 pages for = LZO and LZ4 now. > > My question was whether it is also possible to decrease the compression= buffer size for LZO and LZ4 to 3 pages like ZSTD case. > > I was just curious about that. :) > I guess we can change LZ4 as we did for ZSTD case, since it supports part= ially > compression: > > - lz4_compress_pages > - LZ4_compress_default > - LZ4_compress_fast > - LZ4_compress_fast_extState > if (maxOutputSize < LZ4_COMPRESSBOUND(inputSize)) > - LZ4_compress_generic(..., limitedOutput, ...) > - if (outputLimited && boundary_check_condition) return 0; > > And for LZO case, it looks we have to keep to allocate 5 pages for worst > compression case as it doesn't support partially compression as I checked= . > > Thanks, > > > > > > > 2020=EB=85=84 5=EC=9B=94 8=EC=9D=BC (=EA=B8=88) =EC=98=A4=EC=A0=84 11:4= 8, Chao Yu >=EB=8B=98=EC=9D= =B4 =EC=9E=91=EC=84=B1: > > > > Hi Daeho, > > > > On 2020/5/8 9:28, Daeho Jeong wrote: > > > Hi Chao, > > > > > > IIUC, you are trying not to use ZSTD_compressBound() to save the = memory > > > space. Am I right? > > > > > > Then, how about LZ4_compressBound() for LZ4 and lzo1x_worst_compr= ess() for > > > LZO? > > > > Oops, it looks those limits were wrongly used... > > > > #define LZ4_COMPRESSBOUND(isize) (\ > > (unsigned int)(isize) > (unsigned int)LZ4_MAX_INPUT_SIZE \ > > ? 0 \ > > : (isize) + ((isize)/255) + 16) > > > > #define lzo1x_worst_compress(x) ((x) + ((x) / 16) + 64 + 3 + 2) > > > > Newly calculated boundary size is larger than target buffer size. > > > > However comments on LZ4_compress_default() said: > > > > ... > > * @maxOutputSize: full or partial size of buffer 'dest' > > * which must be already allocated > > ... > > int LZ4_compress_default(const char *source, char *dest, int inputS= ize, > > int maxOutputSize, void *wrkmem); > > > > And @out_len in lzo1x_1_compress() was passed as an output paramete= r to > > pass length of data that compressor compressed into @out buffer. > > > > Let me know if I missed sth. > > > > Thannks, > > > > > Could we save more memory space for these two cases like ZSTD? > > > As you know, we are using 5 pages compression buffer for LZ4 and = LZO in > > > compress_log_size=3D2, > > > and if the compressed data doesn't fit in 3 pages, it returns -EA= GAIN to > > > give up compressing that one. > > > > > > Thanks, > > > > > > 2020=EB=85=84 5=EC=9B=94 8=EC=9D=BC (=EA=B8=88) =EC=98=A4=EC=A0= =84 10:17, Chao Yu >=EB=8B= =98=EC=9D=B4 =EC=9E=91=EC=84=B1: > > > > > >> During zstd compression, ZSTD_endStream() may return non-zero va= lue > > >> because distination buffer is full, but there is still compresse= d data > > >> remained in intermediate buffer, it means that zstd algorithm ca= n not > > >> save at last one block space, let's just writeback raw data inst= ead of > > >> compressed one, this can fix data corruption when decompressing > > >> incomplete stored compression data. > > >> > > >> Signed-off-by: Daeho Jeong > > > >> Signed-off-by: Chao Yu > > > >> --- > > >> fs/f2fs/compress.c | 7 +++++++ > > >> 1 file changed, 7 insertions(+) > > >> > > >> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c > > >> index c22cc0d37369..5e4947250262 100644 > > >> --- a/fs/f2fs/compress.c > > >> +++ b/fs/f2fs/compress.c > > >> @@ -358,6 +358,13 @@ static int zstd_compress_pages(struct compr= ess_ctx > > >> *cc) > > >> return -EIO; > > >> } > > >> > > >> + /* > > >> + * there is compressed data remained in intermediate buf= fer due to > > >> + * no more space in cbuf.cdata > > >> + */ > > >> + if (ret) > > >> + return -EAGAIN; > > >> + > > >> cc->clen =3D outbuf.pos; > > >> return 0; > > >> } > > >> -- > > >> 2.18.0.rc1 > > >> > > >> > > >> > > >> _______________________________________________ > > >> Linux-f2fs-devel mailing list > > >> Linux-f2fs-devel@lists.sourceforge.net > > >> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel > > >> > > > > >