It seems an optimization was removed from the code without also removing
the if condition that was meant to enable it only on 64-bit platforms. As
a result, the code path responsible for decompression with a window larger
than 8MB was disabled on 32-bit platforms.

Signed-off-by: Petr Malat <[email protected]>
---
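The effect of the stale guard can be illustrated with a small standalone
sketch. All names below (seq_decoder, pick_decoder_before,
pick_decoder_after, check_selection) are hypothetical and exist only for
this illustration; they are not part of lib/zstd. The size_t_bytes
parameter stands in for sizeof(size_t) so that both platform widths can be
exercised on a single host, and the window size is assumed to fit in an
unsigned long long.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration only; not part of lib/zstd. */
enum seq_decoder { SEQ_DECODER_REGULAR, SEQ_DECODER_LONG };

/* Same threshold as in the patch: 1 << 23 bytes, i.e. 8 MiB. */
#define LONG_WINDOW_THRESHOLD (1ULL << 23)

/*
 * Selection before the patch. size_t_bytes stands in for sizeof(size_t):
 * 4 models a 32-bit platform, 8 models a 64-bit platform.
 */
static enum seq_decoder pick_decoder_before(size_t size_t_bytes,
					    unsigned long long window_size)
{
	if (size_t_bytes > 4)
		if (window_size > LONG_WINDOW_THRESHOLD)
			return SEQ_DECODER_LONG;
	return SEQ_DECODER_REGULAR;
}

/* Selection after the patch: only the window size matters. */
static enum seq_decoder pick_decoder_after(unsigned long long window_size)
{
	if (window_size > LONG_WINDOW_THRESHOLD)
		return SEQ_DECODER_LONG;
	return SEQ_DECODER_REGULAR;
}

/* Small self-check of the behaviour described in the commit message. */
void check_selection(void)
{
	unsigned long long big_window = 1ULL << 24; /* 16 MiB */

	/* Before: a 32-bit platform never reaches the long-window path. */
	assert(pick_decoder_before(4, big_window) == SEQ_DECODER_REGULAR);
	assert(pick_decoder_before(8, big_window) == SEQ_DECODER_LONG);
	/* After: the long-window path is chosen regardless of width. */
	assert(pick_decoder_after(big_window) == SEQ_DECODER_LONG);
}
```

Note that the comparison is strict, so in this sketch a window of exactly
8 MiB still selects the regular decoder both before and after the change.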
lib/zstd/decompress.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/lib/zstd/decompress.c b/lib/zstd/decompress.c
index db6761ea4deb..509a3b8d51b9 100644
--- a/lib/zstd/decompress.c
+++ b/lib/zstd/decompress.c
@@ -1457,12 +1457,8 @@ static size_t ZSTD_decompressBlock_internal(ZSTD_DCtx *dctx, void *dst, size_t d
 		ip += litCSize;
 		srcSize -= litCSize;
 	}
-	if (sizeof(size_t) > 4) /* do not enable prefetching on 32-bits x86, as it's performance detrimental */
-				/* likely because of register pressure */
-				/* if that's the correct cause, then 32-bits ARM should be affected differently */
-				/* it would be good to test this on ARM real hardware, to see if prefetch version improves speed */
-		if (dctx->fParams.windowSize > (1 << 23))
-			return ZSTD_decompressSequencesLong(dctx, dst, dstCapacity, ip, srcSize);
+	if (dctx->fParams.windowSize > (1 << 23))
+		return ZSTD_decompressSequencesLong(dctx, dst, dstCapacity, ip, srcSize);
 	return ZSTD_decompressSequences(dctx, dst, dstCapacity, ip, srcSize);
 }
--
2.20.1
On Sun, Sep 13, 2020 at 11:19 PM Petr Malat <[email protected]> wrote:
>
> It seems an optimization was removed from the code without also removing
> the if condition that was meant to enable it only on 64-bit platforms. As
> a result, the code path responsible for decompression with a window larger
> than 8MB was disabled on 32-bit platforms.
>
> Signed-off-by: Petr Malat <[email protected]>
Reviewed-by: Nick Terrell <[email protected]>

Thanks for the fix! I checked upstream, and this change corresponds to the
following upstream commit:

https://github.com/facebook/zstd/commit/8a5c0c98ae5a7884694589d7a69bc99011add94d

Thanks,
Nick Terrell