v2:
- Addressed GregKH's review and comments.
Hi,
The first patch fixes zram with lz4 compression on ppc64 (and on other
big-endian architectures with efficient unaligned access). The second
is just a cleanup.
Thanks,
Rui
Rui Salvaterra (2):
lib: lz4: fixed zram with lz4 on big endian machines
lib: lz4: cleanup unaligned access efficiency detection
lib/lz4/lz4defs.h | 25 +++++++++++++------------
1 file changed, 13 insertions(+), 12 deletions(-)
--
2.7.4
Based on Sergey's test patch [1], this fixes zram with lz4 compression
on big-endian CPUs.
Note that the 64-bit preprocessor test is not a cleanup; it's part of
the fix, since those identifiers are bogus (for example, __ppc64__
isn't defined anywhere else in the kernel, which means we'd fall into
the 32-bit definitions on ppc64).
Tested on ppc64 with no regression on x86_64.
[1] http://marc.info/?l=linux-kernel&m=145994470805853&w=4
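The byte-order mismatch behind this fix can be sketched in user space.
Reading the diff below, the old write macro stored the 16-bit value in
native order (A16() or put_unaligned()), while the read macro always
decoded it with get_unaligned_le16(), so the two only agreed on little
endian. The helpers read_le16(), write_le16() and read_native16() are
hypothetical stand-ins written for this illustration, not kernel
functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's get_unaligned_le16(). */
static uint16_t read_le16(const uint8_t *p)
{
	assert(p != NULL);
	/* Assemble byte by byte: p[0] is always the low byte. */
	return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* Hypothetical stand-in for the kernel's put_unaligned_le16(). */
static void write_le16(uint8_t *p, uint16_t v)
{
	assert(p != NULL);
	p[0] = (uint8_t)(v & 0xff);	/* low byte first */
	p[1] = (uint8_t)(v >> 8);	/* high byte second */
}

/* Native-order load: the result depends on the host's endianness. */
static uint16_t read_native16(const uint8_t *p)
{
	uint16_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));	/* safe for unaligned pointers */
	return v;
}
```

A value stored with write_le16() always round-trips through
read_le16(). read_native16() returns the same value only on a
little-endian host; on big endian it comes back byte-swapped, which is
the kind of disagreement between writer and reader that corrupts the
stream.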
Cc: [email protected]
Suggested-by: Sergey Senozhatsky <[email protected]>
Signed-off-by: Rui Salvaterra <[email protected]>
---
lib/lz4/lz4defs.h | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/lib/lz4/lz4defs.h b/lib/lz4/lz4defs.h
index abcecdc..0710a62 100644
--- a/lib/lz4/lz4defs.h
+++ b/lib/lz4/lz4defs.h
@@ -11,8 +11,7 @@
/*
* Detects 64 bits mode
*/
-#if (defined(__x86_64__) || defined(__x86_64) || defined(__amd64__) \
- || defined(__ppc64__) || defined(__LP64__))
+#if defined(CONFIG_64BIT)
#define LZ4_ARCH64 1
#else
#define LZ4_ARCH64 0
@@ -35,6 +34,10 @@ typedef struct _U64_S { u64 v; } U64_S;
#define PUT4(s, d) (A32(d) = A32(s))
#define PUT8(s, d) (A64(d) = A64(s))
+
+#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
+ (d = s - A16(p))
+
#define LZ4_WRITE_LITTLEENDIAN_16(p, v) \
do { \
A16(p) = v; \
@@ -51,10 +54,13 @@ typedef struct _U64_S { u64 v; } U64_S;
#define PUT8(s, d) \
put_unaligned(get_unaligned((const u64 *) s), (u64 *) d)
-#define LZ4_WRITE_LITTLEENDIAN_16(p, v) \
- do { \
- put_unaligned(v, (u16 *)(p)); \
- p += 2; \
+#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
+ (d = s - get_unaligned_le16(p))
+
+#define LZ4_WRITE_LITTLEENDIAN_16(p, v) \
+ do { \
+ put_unaligned_le16(v, (u16 *)(p)); \
+ p += 2; \
} while (0)
#endif
@@ -140,9 +146,6 @@ typedef struct _U64_S { u64 v; } U64_S;
#endif
-#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
- (d = s - get_unaligned_le16(p))
-
#define LZ4_WILDCOPY(s, d, e) \
do { \
LZ4_COPYPACKET(s, d); \
--
2.7.4
These identifiers are bogus. Architectures that support efficient
unaligned access should define HAVE_EFFICIENT_UNALIGNED_ACCESS
themselves. If some arch fails to do so, that should be fixed in the
arch definition.
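As a rough illustration of why such a clause is dead code: defined()
on a macro that nothing ever defines is always false, so an "&&"
clause depending on it can never enable anything. All DEMO_* names
below are hypothetical and exist only for this sketch.

```c
#include <assert.h>

/*
 * Hypothetical setup: DEMO_ARM is defined, but DEMO_ARM_EFFICIENT is
 * never defined anywhere, mirroring an identifier that exists only in
 * the file that tests for it.
 */
#define DEMO_ARM 1

/* Old-style test: the extra clause depends on the undefined macro. */
#if defined(DEMO_CONFIG_EFFICIENT) || \
	(defined(DEMO_ARM) && defined(DEMO_ARM_EFFICIENT))
#define DEMO_OLD_TEST 1
#else
#define DEMO_OLD_TEST 0
#endif

/* Simplified test: only the config symbol decides. */
#if defined(DEMO_CONFIG_EFFICIENT)
#define DEMO_NEW_TEST 1
#else
#define DEMO_NEW_TEST 0
#endif

/* Both forms give the same answer, so the extra clause changes nothing. */
static_assert(DEMO_OLD_TEST == DEMO_NEW_TEST, "extra clause is dead code");

static int demo_old_test(void) { return DEMO_OLD_TEST; }
static int demo_new_test(void) { return DEMO_NEW_TEST; }
```

Building the same file with -DDEMO_CONFIG_EFFICIENT flips both results
to 1 together, which is the behaviour the simplified test keeps.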
Signed-off-by: Rui Salvaterra <[email protected]>
---
lib/lz4/lz4defs.h | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/lib/lz4/lz4defs.h b/lib/lz4/lz4defs.h
index 0710a62..c79d7ea 100644
--- a/lib/lz4/lz4defs.h
+++ b/lib/lz4/lz4defs.h
@@ -24,9 +24,7 @@
typedef struct _U16_S { u16 v; } U16_S;
typedef struct _U32_S { u32 v; } U32_S;
typedef struct _U64_S { u64 v; } U64_S;
-#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) \
- || defined(CONFIG_ARM) && __LINUX_ARM_ARCH__ >= 6 \
- && defined(ARM_EFFICIENT_UNALIGNED_ACCESS)
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
#define A16(x) (((U16_S *)(x))->v)
#define A32(x) (((U32_S *)(x))->v)
--
2.7.4
On (04/09/16 22:05), Rui Salvaterra wrote:
> These identifiers are bogus. The interested architectures should define
> HAVE_EFFICIENT_UNALIGNED_ACCESS whenever relevant to do so. If this
> isn't true for some arch, it should be fixed in the arch definition.
Yes. Besides, ARM_EFFICIENT_UNALIGNED_ACCESS exists only in
lib/lz4/lz4defs.h.
> Signed-off-by: Rui Salvaterra <[email protected]>
Reviewed-by: Sergey Senozhatsky <[email protected]>
-ss
> ---
> lib/lz4/lz4defs.h | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/lib/lz4/lz4defs.h b/lib/lz4/lz4defs.h
> index 0710a62..c79d7ea 100644
> --- a/lib/lz4/lz4defs.h
> +++ b/lib/lz4/lz4defs.h
> @@ -24,9 +24,7 @@
> typedef struct _U16_S { u16 v; } U16_S;
> typedef struct _U32_S { u32 v; } U32_S;
> typedef struct _U64_S { u64 v; } U64_S;
> -#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) \
> - || defined(CONFIG_ARM) && __LINUX_ARM_ARCH__ >= 6 \
> - && defined(ARM_EFFICIENT_UNALIGNED_ACCESS)
> +#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
>
> #define A16(x) (((U16_S *)(x))->v)
> #define A32(x) (((U32_S *)(x))->v)
> --
> 2.7.4
>
On (04/09/16 22:05), Rui Salvaterra wrote:
> Note that the 64-bit preprocessor test is not a cleanup, it's part of
> the fix, since those identifiers are bogus (for example, __ppc64__
> isn't defined anywhere else in the kernel, which means we'd fall into
> the 32-bit definitions on ppc64).
good find.
> Tested on ppc64 with no regression on x86_64.
>
> [1] http://marc.info/?l=linux-kernel&m=145994470805853&w=4
>
> Cc: [email protected]
> Suggested-by: Sergey Senozhatsky <[email protected]>
> Signed-off-by: Rui Salvaterra <[email protected]>
Reviewed-by: Sergey Senozhatsky <[email protected]>
-ss