From: Gao Xiang
To: Alexander Viro, Greg Kroah-Hartman, Andrew Morton, Stephen Rothwell, Theodore Ts'o, Linus Torvalds
CC: LKML, Chao Yu, Miao Xie, Li Guifu, Fang Wei, Gao Xiang
Subject: [PATCH v3 21/24] erofs: introduce LZ4 decompression inplace
Date: Mon, 22 Jul 2019 10:50:40 +0800
Message-ID: <20190722025043.166344-22-gaoxiang25@huawei.com>
In-Reply-To: <20190722025043.166344-1-gaoxiang25@huawei.com>
References: <20190722025043.166344-1-gaoxiang25@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Compressed data is usually loaded into the last pages of the
extent (the last page for 4k) for in-place decompression (more
specifically, in-place IO), as illustrated below:

         start of compressed logical extent
           |                          end of this logical extent
           |                           |
     ______v___________________________v________
... |  page 6  |  page 7  |  page 8  |  page 9  | ...
    |__________|__________|__________|__________|
           .                         ^ .        ^
           .                         |compressed|
           .                         |   data   |
           .                           .        .
           |<          dstsize        >|<margin>|
                                       oend     iend
           op                        ip

Therefore, decompression can be done in place (with no memcpy at all)
if the margin is sufficient and safe enough [1]. This is only possible
with fixed-size output compression, not with fixed-size input
compression.

With in-place decompression implemented, most in-place IO (about 99%
for enwik9) needs no memcpy, which improves sequential read
performance (see the following patches for test results).

[1] https://github.com/lz4/lz4/commit/b17f578a919b7e6b078cede2d52be29dd48c8e8c
    https://github.com/lz4/lz4/commit/5997e139f53169fa3a1c1b4418d2452a90b01602

Signed-off-by: Gao Xiang
---
 fs/erofs/decompressor.c | 36 ++++++++++++++++++++++++++++++++----
 fs/erofs/erofs_fs.h     |  2 +-
 2 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/fs/erofs/decompressor.c b/fs/erofs/decompressor.c
index 2ee10bb7e440..18fcbd75423e 100644
--- a/fs/erofs/decompressor.c
+++ b/fs/erofs/decompressor.c
@@ -14,6 +14,9 @@
 #endif
 
 #define LZ4_MAX_DISTANCE_PAGES	(DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1)
+#ifndef LZ4_DECOMPRESS_INPLACE_MARGIN
+#define LZ4_DECOMPRESS_INPLACE_MARGIN(srcsize)  (((srcsize) >> 8) + 32)
+#endif
 
 struct z_erofs_decompressor {
 	/*
@@ -112,7 +115,7 @@ static int lz4_decompress(struct z_erofs_decompress_req *rq, u8 *out)
 {
 	unsigned int inputmargin, inlen;
 	u8 *src;
-	bool copied;
+	bool copied, support_0padding;
 	int ret;
 
 	if (rq->inputsize > PAGE_SIZE)
@@ -120,13 +123,38 @@ static int lz4_decompress(struct z_erofs_decompress_req *rq, u8 *out)
 
 	src = kmap_atomic(*rq->in);
 	inputmargin = 0;
+	support_0padding = false;
+
+	/* decompression inplace is only safe when 0padding is enabled */
+	if (EROFS_SB(rq->sb)->requirements & EROFS_REQUIREMENT_LZ4_0PADDING) {
+		support_0padding = true;
+
+		while (!src[inputmargin & ~PAGE_MASK])
+			if (!(++inputmargin & ~PAGE_MASK))
+				break;
+
+		if (inputmargin >= rq->inputsize) {
+			kunmap_atomic(src);
+			return -EIO;
+		}
+	}
 
 	copied = false;
 	inlen = rq->inputsize - inputmargin;
 	if (rq->inplace_io) {
-		src = generic_copy_inplace_data(rq, src, inputmargin);
-		inputmargin = 0;
-		copied = true;
+		const uint oend = (rq->pageofs_out +
+				   rq->outputsize) & ~PAGE_MASK;
+		const uint nr = PAGE_ALIGN(rq->pageofs_out +
+					   rq->outputsize) >> PAGE_SHIFT;
+
+		if (rq->partial_decoding || !support_0padding ||
+		    rq->out[nr - 1] != rq->in[0] ||
+		    rq->inputsize - oend <
+		      LZ4_DECOMPRESS_INPLACE_MARGIN(inlen)) {
+			src = generic_copy_inplace_data(rq, src, inputmargin);
+			inputmargin = 0;
+			copied = true;
+		}
 	}
 
 	ret = LZ4_decompress_safe_partial(src + inputmargin, out,
diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
index e418725abfd6..8c64e78ceaca 100644
--- a/fs/erofs/erofs_fs.h
+++ b/fs/erofs/erofs_fs.h
@@ -17,7 +17,7 @@
  * incompatible with this kernel version.
  */
 #define EROFS_REQUIREMENT_LZ4_0PADDING	0x00000001
-#define EROFS_ALL_REQUIREMENTS		0
+#define EROFS_ALL_REQUIREMENTS		EROFS_REQUIREMENT_LZ4_0PADDING
 
 struct erofs_super_block {
 /*  0 */__le32 magic;           /* in the little endian */
-- 
2.17.1
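
For readers who want to try the in-place condition outside the kernel,
here is a minimal user-space sketch of the two checks the
decompressor.c hunk adds: skipping the leading zero padding and
deciding whether the copy can be avoided. It is an illustration under
stated assumptions, not the kernel code. SKETCH_PAGE_SIZE (4 KiB
assumed) and the sketch_* functions are hypothetical names, and the
last_out_page_is_input flag stands in for the
"rq->out[nr - 1] != rq->in[0]" page comparison. Only the margin macro
is copied from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same formula as the macro added by the patch. */
#define LZ4_DECOMPRESS_INPLACE_MARGIN(srcsize)  (((srcsize) >> 8) + 32)

/* Hypothetical stand-in for the kernel's PAGE_SIZE (4 KiB assumed). */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Count the leading zero bytes (the 0padding) of one compressed page.
 * Returns SKETCH_PAGE_SIZE when the whole page is zero, which the
 * patch treats as an error (-EIO) since no compressed data is left.
 */
static unsigned int sketch_skip_0padding(const uint8_t *page)
{
	unsigned int inputmargin = 0;

	while (inputmargin < SKETCH_PAGE_SIZE && page[inputmargin] == 0)
		++inputmargin;
	return inputmargin;
}

/*
 * Mirror the comparison in the hunk: decompress in place only if this
 * is a full decode, 0padding is supported, the last output page is the
 * input page, and the bytes left after the output end (oend) cover the
 * LZ4 in-place margin. Otherwise the caller falls back to copying.
 */
static bool sketch_can_decompress_inplace(unsigned int pageofs_out,
					  unsigned int outputsize,
					  unsigned int inputsize,
					  unsigned int inputmargin,
					  bool partial_decoding,
					  bool support_0padding,
					  bool last_out_page_is_input)
{
	unsigned int oend, inlen;

	assert(inputsize <= SKETCH_PAGE_SIZE && inputmargin < inputsize);

	/* offset of the output end within its last page */
	oend = (pageofs_out + outputsize) & (SKETCH_PAGE_SIZE - 1);
	/* compressed length once the zero padding is skipped */
	inlen = inputsize - inputmargin;

	if (partial_decoding || !support_0padding || !last_out_page_is_input)
		return false;

	return inputsize - oend >= LZ4_DECOMPRESS_INPLACE_MARGIN(inlen);
}
```

As a worked figure, a compressed length of 4096 bytes needs a margin
of (4096 >> 8) + 32 = 48 bytes, so in this sketch an output that ends
only 16 bytes before the end of the page takes the copy path.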