2016-04-05 14:07:55

by Rui Salvaterra

Subject: [BUG] lib: zram lz4 compression/decompression still broken on big endian

Hi,


I apologise in advance if I've cc'ed too many/the wrong people/lists.

Whenever I try to use zram with lz4, on my Power Mac G5 (tested with
kernel 4.4.0-16-powerpc64-smp from Ubuntu 16.04 LTS), I get the
following on my dmesg:

[13150.675820] zram: Added device: zram0
[13150.704133] zram0: detected capacity change from 0 to 5131976704
[13150.715960] zram: Decompression failed! err=-1, page=0
[13150.716008] zram: Decompression failed! err=-1, page=0
[13150.716027] zram: Decompression failed! err=-1, page=0
[13150.716032] Buffer I/O error on dev zram0, logical block 0, async page read

I believe Eunbong Song wrote a patch [1] to fix this (or a very
similar) bug on MIPS, but it never got merged (maybe
incorrect/incomplete?). Is there any hope of seeing this bug fixed?


Thanks,

Rui Salvaterra


[1] http://comments.gmane.org/gmane.linux.kernel/1752745


2016-04-05 15:34:45

by Greg Kroah-Hartman

Subject: Re: [BUG] lib: zram lz4 compression/decompression still broken on big endian

On Tue, Apr 05, 2016 at 03:07:48PM +0100, Rui Salvaterra wrote:
> Hi,
>
>
> I apologise in advance if I've cc'ed too many/the wrong people/lists.
>
> Whenever I try to use zram with lz4, on my Power Mac G5 (tested with
> kernel 4.4.0-16-powerpc64-smp from Ubuntu 16.04 LTS), I get the
> following on my dmesg:
>
> [13150.675820] zram: Added device: zram0
> [13150.704133] zram0: detected capacity change from 0 to 5131976704
> [13150.715960] zram: Decompression failed! err=-1, page=0
> [13150.716008] zram: Decompression failed! err=-1, page=0
> [13150.716027] zram: Decompression failed! err=-1, page=0
> [13150.716032] Buffer I/O error on dev zram0, logical block 0, async page read
>
> I believe Eunbong Song wrote a patch [1] to fix this (or a very
> similar) bug on MIPS, but it never got merged (maybe
> incorrect/incomplete?). Is there any hope of seeing this bug fixed?
>
>
> Thanks,
>
> Rui Salvaterra
>
>
> [1] http://comments.gmane.org/gmane.linux.kernel/1752745

For some reason it never got merged, sorry, I don't remember why.

Have you tested this patch? If so, can you resend it with your
tested-by: line added to it?

thanks,

greg k-h

2016-04-05 16:02:24

by Rui Salvaterra

Subject: Re: [BUG] lib: zram lz4 compression/decompression still broken on big endian

2016-04-05 16:34 GMT+01:00 Greg KH:
> On Tue, Apr 05, 2016 at 03:07:48PM +0100, Rui Salvaterra wrote:
>> Hi,
>>
>>
>> I apologise in advance if I've cc'ed too many/the wrong people/lists.
>>
>> Whenever I try to use zram with lz4, on my Power Mac G5 (tested with
>> kernel 4.4.0-16-powerpc64-smp from Ubuntu 16.04 LTS), I get the
>> following on my dmesg:
>>
>> [13150.675820] zram: Added device: zram0
>> [13150.704133] zram0: detected capacity change from 0 to 5131976704
>> [13150.715960] zram: Decompression failed! err=-1, page=0
>> [13150.716008] zram: Decompression failed! err=-1, page=0
>> [13150.716027] zram: Decompression failed! err=-1, page=0
>> [13150.716032] Buffer I/O error on dev zram0, logical block 0, async page read
>>
>> I believe Eunbong Song wrote a patch [1] to fix this (or a very
>> similar) bug on MIPS, but it never got merged (maybe
>> incorrect/incomplete?). Is there any hope of seeing this bug fixed?
>>
>>
>> Thanks,
>>
>> Rui Salvaterra
>>
>>
>> [1] http://comments.gmane.org/gmane.linux.kernel/1752745
>
> For some reason it never got merged, sorry, I don't remember why.
>
> Have you tested this patch? If so, can you resend it with your
> tested-by: line added to it?
>
> thanks,
>
> greg k-h

Hi, Greg


No, I haven't tested the patch at all. I want to do so, and fix it if
necessary, but I still need to learn how to (meaning, I need to watch
your "first kernel patch" presentation again). I'd love to get
involved in kernel development, and this seems to be a good
opportunity, if none of the kernel gods beat me to it (I may need a
month, but then again nobody complained about this bug in almost two
years).


Thanks,

Rui

2016-04-06 05:34:10

by Sergey Senozhatsky

Subject: Re: [BUG] lib: zram lz4 compression/decompression still broken on big endian

On (04/05/16 17:02), Rui Salvaterra wrote:
[..]
> > For some reason it never got merged, sorry, I don't remember why.
> >
> > Have you tested this patch? If so, can you resend it with your
> > tested-by: line added to it?
> >
> > thanks,
> >
> > greg k-h
>
> Hi, Greg
>
>
> No, I haven't tested the patch at all. I want to do so, and fix it if
> necessary, but I still need to learn how to (meaning, I need to watch
> your "first kernel patch" presentation again). I'd love to get
> involved in kernel development, and this seems to be a good
> opportunity, if none of the kernel gods beat me to it (I may need a
> month, but then again nobody complained about this bug in almost two
> years).

Hello Rui,

may we please ask you to test the patch first? quite possible there
is nothing to fix there; I've no access to mips h/w but the patch
seems correct to me.

LZ4_READ_LITTLEENDIAN_16 does get_unaligned_le16(), so
LZ4_WRITE_LITTLEENDIAN_16 must do put_unaligned_le16() /* not put_unaligned() */
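
The symmetry argument above can be sketched in user space. This is only an illustration under stated assumptions: store_le16() and load_le16() are hypothetical stand-ins for put_unaligned_le16() and get_unaligned_le16(), written with byte shifts so they behave the same on little and big endian hosts. They are not the kernel implementations.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for put_unaligned_le16(): low byte goes first. */
void store_le16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)(v >> 8);
}

/* Hypothetical stand-in for get_unaligned_le16(): low byte comes first. */
uint16_t load_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* A value written as little endian must read back unchanged. */
int le16_roundtrip_ok(uint16_t v)
{
	uint8_t buf[2];

	store_le16(buf, v);
	return load_le16(buf) == v;
}

/* Small self-check; independent of the host byte order. */
void le16_selfcheck(void)
{
	assert(le16_roundtrip_ok(0x0001));
	assert(le16_roundtrip_ok(0xabcd));
}
```

Because both helpers fix the byte order explicitly, the round trip holds on any host, which is the property the reader and writer macros need to share.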

-ss

2016-04-06 09:39:59

by Rui Salvaterra

Subject: Re: [BUG] lib: zram lz4 compression/decompression still broken on big endian

2016-04-06 6:33 GMT+01:00 Sergey Senozhatsky:
> On (04/05/16 17:02), Rui Salvaterra wrote:
> [..]
>> > For some reason it never got merged, sorry, I don't remember why.
>> >
>> > Have you tested this patch? If so, can you resend it with your
>> > tested-by: line added to it?
>> >
>> > thanks,
>> >
>> > greg k-h
>>
>> Hi, Greg
>>
>>
>> No, I haven't tested the patch at all. I want to do so, and fix it if
>> necessary, but I still need to learn how to (meaning, I need to watch
>> your "first kernel patch" presentation again). I'd love to get
>> involved in kernel development, and this seems to be a good
>> opportunity, if none of the kernel gods beat me to it (I may need a
>> month, but then again nobody complained about this bug in almost two
>> years).
>
> Hello Rui,
>
> may we please ask you to test the patch first? quite possible there
> is nothing to fix there; I've no access to mips h/w but the patch
> seems correct to me.
>
> LZ4_READ_LITTLEENDIAN_16 does get_unaligned_le16(), so
> LZ4_WRITE_LITTLEENDIAN_16 must do put_unaligned_le16() /* not put_unaligned() */
>
> -ss

Hi, Sergey


Besides ppc64, I have ppc32, x86 and x86_64 hardware readily
available. The only mips (74kc, also big endian) hardware I have
access to is my router, running OpenWrt. I can try to test it there
too, but it will be more complicated. Still, after reading the
existing code [1] more thoroughly, I can't see how Eunbong Song's
patch [2] would fix the ppc case (please correct me if I'm wrong,
which is highly likely, since my C preprocessor knowledge ranges
from nonexistent to very superficial).

Now, LZ4_READ_LITTLEENDIAN_16 is unconditionally defined as:

#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
        (d = s - get_unaligned_le16(p))

As far as I can tell, and unlike ppc, mips doesn't define
HAVE_EFFICIENT_UNALIGNED_ACCESS, which means that, in the mips case,
LZ4_WRITE_LITTLEENDIAN_16 will be defined as:

#define LZ4_WRITE_LITTLEENDIAN_16(p, v) \
        do { \
                put_unaligned(v, (u16 *)(p)); \
                p += 2; \
        } while (0)

Whereas for ppc, which defines HAVE_EFFICIENT_UNALIGNED_ACCESS,
LZ4_WRITE_LITTLEENDIAN_16 will be defined as:

#define LZ4_WRITE_LITTLEENDIAN_16(p, v) \
        do { \
                A16(p) = v; \
                p += 2; \
        } while (0)

Consequently, while I believe the patch will fix the mips case, I'm
not so sure about ppc (or any other big endian architecture with
efficient unaligned accesses).
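
To make the concern concrete, here is a user-space sketch of what happens when a native 16-bit store (the A16(p) = v path) is paired with a little endian read. Everything here is an assumption made for illustration: store_native16() models the store with an explicit big_endian flag so the sketch runs on any machine, and load_le16() is a hypothetical stand-in for get_unaligned_le16().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Models "A16(p) = v": a plain 16-bit store, laid out in memory the way
 * a host of the given endianness would lay it out. The big_endian flag
 * is an illustration device, not something the kernel code has.
 */
void store_native16(uint8_t *p, uint16_t v, int big_endian)
{
	if (big_endian) {
		p[0] = (uint8_t)(v >> 8);
		p[1] = (uint8_t)(v & 0xff);
	} else {
		p[0] = (uint8_t)(v & 0xff);
		p[1] = (uint8_t)(v >> 8);
	}
}

/* Hypothetical stand-in for get_unaligned_le16(). */
uint16_t load_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* The match offset the decompressor sees after the compressor stored v. */
uint16_t offset_seen_by_reader(uint16_t v, int big_endian)
{
	uint8_t buf[2];

	store_native16(buf, v, big_endian);
	return load_le16(buf);
}

/* Little endian hosts round-trip; big endian hosts see swapped bytes. */
void native_store_selfcheck(void)
{
	assert(offset_seen_by_reader(0x0001, 0) == 0x0001);
	assert(offset_seen_by_reader(0x0001, 1) == 0x0100);
}
```

In this model an offset of 1 stored on a big endian host is read back as 256. A wrong match offset of that kind would plausibly account for the decompression failures reported at the top of the thread.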


Thanks,

Rui

[1] https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/lib/lz4/lz4defs.h?h=v4.4.6
[2] http://permalink.gmane.org/gmane.linux.kernel/1752745

2016-04-06 12:11:29

by Sergey Senozhatsky

Subject: Re: [BUG] lib: zram lz4 compression/decompression still broken on big endian

Cc Chanho Min, Kyungsik Lee


Hello,

On (04/06/16 10:39), Rui Salvaterra wrote:
> > may we please ask you to test the patch first? quite possible there
> > is nothing to fix there; I've no access to mips h/w but the patch
> > seems correct to me.
> >
> > LZ4_READ_LITTLEENDIAN_16 does get_unaligned_le16(), so
> > LZ4_WRITE_LITTLEENDIAN_16 must do put_unaligned_le16() /* not put_unaligned() */
> >
[..]
> Consequently, while I believe the patch will fix the mips case, I'm
> not so sure about ppc (or any other big endian architecture with
> efficient unaligned accesses).

frankly, yes, I took a quick look today (after I sent my initial
message, tho) ... and it is fishy, I agree. was going to followup
on my email but somehow got interrupted, sorry.

so we have, write:
((U16_S *)(p))->v = v OR put_unaligned(v, (u16 *)(p))

and only one read:
get_unaligned_le16(p)

I guess either the read part must also depend on
HAVE_EFFICIENT_UNALIGNED_ACCESS, or the write path
should stop doing so.

I ended up with two patches, NONE was tested (!!!). like at all.

1) provide CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS-dependent
LZ4_READ_LITTLEENDIAN_16

2) provide common LZ4_WRITE_LITTLEENDIAN_16 and LZ4_READ_LITTLEENDIAN_16
regardless of CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.


assuming that a common LZ4_WRITE_LITTLEENDIAN_16 will hurt
performance somewhat, I'd probably prefer option #1.
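
For comparison, option 2 might look roughly like the following user-space sketch. It is not the posted patch and has not been tested in the kernel. put_le16() and get_le16() are hypothetical stand-ins for put_unaligned_le16() and get_unaligned_le16(), and offset_roundtrip_ok() exists only to exercise the two macros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for put_unaligned_le16()/get_unaligned_le16(). */
static inline void put_le16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)(v >> 8);
}

static inline uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* One definition each, independent of any unaligned-access config. */
#define LZ4_WRITE_LITTLEENDIAN_16(p, v)		\
	do {					\
		put_le16((p), (uint16_t)(v));	\
		(p) += 2;			\
	} while (0)

#define LZ4_READ_LITTLEENDIAN_16(d, s, p)	\
	((d) = (s) - get_le16(p))

/* Write a match offset, then recover the match position from it. */
int offset_roundtrip_ok(uint16_t offset)
{
	static uint8_t out[1024];
	uint8_t token[2];
	uint8_t *wp = token;
	const uint8_t *cpy = out + 512;	/* pretend output position */
	const uint8_t *ref;

	if (offset > 512)
		return 0;	/* keep the pointer inside the buffer */

	LZ4_WRITE_LITTLEENDIAN_16(wp, offset);
	LZ4_READ_LITTLEENDIAN_16(ref, cpy, token);

	return wp == token + 2 && ref == cpy - offset;
}

void common_macros_selfcheck(void)
{
	assert(offset_roundtrip_ok(1));
	assert(offset_roundtrip_ok(0x0102));
}
```

The trade-off is the one named above: a single byte-order-explicit pair is harder to get wrong, but it gives up the plain 16-bit store on architectures where unaligned access is cheap.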

the patch is below. would be great if you can help testing it.

---

lib/lz4/lz4defs.h | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/lib/lz4/lz4defs.h b/lib/lz4/lz4defs.h
index abcecdc..a23e6c2 100644
--- a/lib/lz4/lz4defs.h
+++ b/lib/lz4/lz4defs.h
@@ -36,10 +36,14 @@ typedef struct _U64_S { u64 v; } U64_S;
#define PUT4(s, d) (A32(d) = A32(s))
#define PUT8(s, d) (A64(d) = A64(s))
#define LZ4_WRITE_LITTLEENDIAN_16(p, v) \
- do { \
- A16(p) = v; \
- p += 2; \
+ do { \
+ A16(p) = v; \
+ p += 2; \
} while (0)
+
+#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
+ (d = s - A16(p))
+
#else /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */

#define A64(x) get_unaligned((u64 *)&(((U16_S *)(x))->v))
@@ -52,10 +56,13 @@ typedef struct _U64_S { u64 v; } U64_S;
put_unaligned(get_unaligned((const u64 *) s), (u64 *) d)

#define LZ4_WRITE_LITTLEENDIAN_16(p, v) \
- do { \
- put_unaligned(v, (u16 *)(p)); \
- p += 2; \
+ do { \
+ put_unaligned_le16(v, (u16 *)(p)); \
+ p += 2; \
} while (0)
+
+#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
+ (d = s - get_unaligned_le16(p))
#endif

#define COPYLENGTH 8
@@ -140,9 +147,6 @@ typedef struct _U64_S { u64 v; } U64_S;

#endif

-#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
- (d = s - get_unaligned_le16(p))
-
#define LZ4_WILDCOPY(s, d, e) \
do { \
LZ4_COPYPACKET(s, d); \

2016-04-07 12:33:42

by Rui Salvaterra

Subject: Re: [BUG] lib: zram lz4 compression/decompression still broken on big endian

2016-04-06 14:09 GMT+01:00 Sergey Senozhatsky:
> Cc Chanho Min, Kyungsik Lee
>
>
> Hello,
>
> On (04/06/16 10:39), Rui Salvaterra wrote:
>> > may we please ask you to test the patch first? quite possible there
>> > is nothing to fix there; I've no access to mips h/w but the patch
>> > seems correct to me.
>> >
>> > LZ4_READ_LITTLEENDIAN_16 does get_unaligned_le16(), so
>> > LZ4_WRITE_LITTLEENDIAN_16 must do put_unaligned_le16() /* not put_unaligned() */
>> >
> [..]
>> Consequently, while I believe the patch will fix the mips case, I'm
>> not so sure about ppc (or any other big endian architecture with
>> efficient unaligned accesses).
>
> frankly, yes, I took a quick look today (after I sent my initial
> message, tho) ... and it is fishy, I agree. was going to followup
> on my email but somehow got interrupted, sorry.
>
> so we have, write:
> ((U16_S *)(p))->v = v OR put_unaligned(v, (u16 *)(p))
>
> and only one read:
> get_unaligned_le16(p)
>
> I guess it's either read part also must depend on
> HAVE_EFFICIENT_UNALIGNED_ACCESS, or write path
> should stop doing so.
>
> I ended up with two patches, NONE was tested (!!!). like at all.
>
> 1) provide CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS-dependent
> LZ4_READ_LITTLEENDIAN_16
>
> 2) provide common LZ4_WRITE_LITTLEENDIAN_16 and LZ4_READ_LITTLEENDIAN_16
> regardless CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.
>
>
> assuming that common LZ4_WRITE_LITTLEENDIAN_16 will somehow hit the
> performance, I'd probably prefer option #1.
>
> the patch is below. would be great if you can help testing it.
>
> ---
>
> lib/lz4/lz4defs.h | 22 +++++++++++++---------
> 1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/lib/lz4/lz4defs.h b/lib/lz4/lz4defs.h
> index abcecdc..a23e6c2 100644
> --- a/lib/lz4/lz4defs.h
> +++ b/lib/lz4/lz4defs.h
> @@ -36,10 +36,14 @@ typedef struct _U64_S { u64 v; } U64_S;
> #define PUT4(s, d) (A32(d) = A32(s))
> #define PUT8(s, d) (A64(d) = A64(s))
> #define LZ4_WRITE_LITTLEENDIAN_16(p, v) \
> - do { \
> - A16(p) = v; \
> - p += 2; \
> + do { \
> + A16(p) = v; \
> + p += 2; \
> } while (0)
> +
> +#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
> + (d = s - A16(p))
> +
> #else /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
>
> #define A64(x) get_unaligned((u64 *)&(((U16_S *)(x))->v))
> @@ -52,10 +56,13 @@ typedef struct _U64_S { u64 v; } U64_S;
> put_unaligned(get_unaligned((const u64 *) s), (u64 *) d)
>
> #define LZ4_WRITE_LITTLEENDIAN_16(p, v) \
> - do { \
> - put_unaligned(v, (u16 *)(p)); \
> - p += 2; \
> + do { \
> + put_unaligned_le16(v, (u16 *)(p)); \
> + p += 2; \
> } while (0)
> +
> +#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
> + (d = s - get_unaligned_le16(p))
> #endif
>
> #define COPYLENGTH 8
> @@ -140,9 +147,6 @@ typedef struct _U64_S { u64 v; } U64_S;
>
> #endif
>
> -#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \
> - (d = s - get_unaligned_le16(p))
> -
> #define LZ4_WILDCOPY(s, d, e) \
> do { \
> LZ4_COPYPACKET(s, d); \
>


Hi again, Sergey


Thanks for the patch, I'll test it as soon as possible. I agree with
your preference for option #1: one usually selects lz4 when speed
(especially decompression speed) is paramount, so it needs all the
help it can get.

Speaking of fishy, the 64-bit detection code also looks suspiciously
bogus. Some of the identifiers don't even exist anywhere in the kernel
(__ppc64__, for example, after grepping all .c and .h files).
Shouldn't we instead check for CONFIG_64BIT or BITS_PER_LONG == 64?


Thanks,

Rui

2016-04-07 13:09:19

by Sergey Senozhatsky

Subject: Re: [BUG] lib: zram lz4 compression/decompression still broken on big endian

On (04/07/16 13:33), Rui Salvaterra wrote:
[..]
> Hi again, Sergey

Hello,

> Thanks for the patch, I'll test it as soon as possible. I agree with
> your preference for option #1: one usually selects lz4 when speed
> (especially decompression speed) is paramount, so it needs all the
> help it can get.

thanks!

> Speaking of fishy, the 64-bit detection code also looks suspiciously
> bogus. Some of the identifiers don't even exist anywhere in the kernel
> (__ppc64__, for example, after grepping all .c and .h files).
> Shouldn't we instead check for CONFIG_64BIT or BITS_PER_LONG == 64?

definitely a good question. personally, I'd prefer to test for
CONFIG_64BIT only, looking at this hairy

/* Detects 64 bits mode */
#if (defined(__x86_64__) || defined(__x86_64) || defined(__amd64__) \
|| defined(__ppc64__) || defined(__LP64__))

and remove/rewrite a bunch of other stuff. but the thing with cleanups
is that they don't fix anything, while they can potentially introduce
bugs. it's more risky to touch the stable code. /* well, removing those
'ghost' identifiers is sort of OK to me */. but that's just my opinion,
I'll leave it to you and Greg.
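
As a rough illustration of the simpler check being discussed, the sketch below keys the 64-bit decision on CONFIG_64BIT alone. It is a user-space approximation under stated assumptions: CONFIG_64BIT is only defined inside a kernel build, so a pointer-width fallback based on UINTPTR_MAX is added here purely to let the sketch compile elsewhere, and the macro name LZ4_ARCH64 and the helper lz4_arch64() are assumed for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * In a kernel build, CONFIG_64BIT would be the only test needed.
 * The UINTPTR_MAX branch is a user-space fallback for this sketch.
 */
#if defined(CONFIG_64BIT)
#define LZ4_ARCH64 1
#elif UINTPTR_MAX > 0xffffffffu
#define LZ4_ARCH64 1
#else
#define LZ4_ARCH64 0
#endif

/* Returns 1 when the build targets a 64-bit address space, else 0. */
int lz4_arch64(void)
{
	return LZ4_ARCH64;
}

/* The preprocessor result should agree with the actual pointer size. */
void arch64_selfcheck(void)
{
	assert(lz4_arch64() == (sizeof(void *) == 8));
}
```

A single config-driven test avoids depending on compiler-specific identifiers, though, as noted above, whether such a cleanup is worth the risk in stable code is a separate question.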

-ss

2016-04-08 14:53:34

by Rui Salvaterra

Subject: Re: [BUG] lib: zram lz4 compression/decompression still broken on big endian

2016-04-07 15:07 GMT+01:00 Sergey Senozhatsky:
> On (04/07/16 13:33), Rui Salvaterra wrote:
> [..]
>> Hi again, Sergey
>
> Hello,
>
>> Thanks for the patch, I'll test it as soon as possible. I agree with
>> your preference for option #1: one usually selects lz4 when speed
>> (especially decompression speed) is paramount, so it needs all the
>> help it can get.
>
> thanks!
>
>> Speaking of fishy, the 64-bit detection code also looks suspiciously
>> bogus. Some of the identifiers don't even exist anywhere in the kernel
>> (__ppc64__, for example, after grepping all .c and .h files).
>> Shouldn't we instead check for CONFIG_64BIT or BITS_PER_LONG == 64?
>
> definitely a good question. personally, I'd prefer to test for
> CONFIG_64BIT only, looking at this hairy
>
> /* Detects 64 bits mode */
> #if (defined(__x86_64__) || defined(__x86_64) || defined(__amd64__) \
> || defined(__ppc64__) || defined(__LP64__))
>
> and remove/rewrite a bunch of other stuff. but the thing with cleanups
> is that they don't fix anything, while potentially can introduce bugs.
> it's more risky to touch the stable code. /* well, removing those 'ghost'
> identifiers is sort of OK to me */. but that's just my opinion, I'll
> leave it to you and Greg.
>
> -ss

Hi again, Sergey

I finally was able to test your patch but, as I suspected, it wasn't
enough. However, based on it, I was able to write a (hopefully)
correct one, which I'll send soon (tested on ppc64, with no
regressions on x86_64).

Thanks,

Rui