2020-06-06 13:32:52

by Vasily Averin

[permalink] [raw]
Subject: [PATCH] lib/lz4: smatch warning in LZ4_decompress_generic()

Found by smatch:
lib/lz4/lz4_decompress.c:150 LZ4_decompress_generic() warn: maybe use && instead of &
It was realy incorrectly copied from
https://github.com/lz4/lz4/commit/45f8603aae389d34c689d3ff7427b314071ccd2c
line 1431

Fixes: 2209fda323e2 ("lib/lz4: update LZ4 decompressor module")
Signed-off-by: Vasily Averin <[email protected]>
---
lib/lz4/lz4_decompress.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c
index 0c9d3ad..f7f7dca 100644
--- a/lib/lz4/lz4_decompress.c
+++ b/lib/lz4/lz4_decompress.c
@@ -147,7 +147,7 @@ static FORCE_INLINE int LZ4_decompress_generic(
* strictly "less than" on input, to re-enter
* the loop with at least one byte
*/
- && likely((endOnInput ? ip < shortiend : 1) &
+ && likely((endOnInput ? ip < shortiend : 1) &&
(op <= shortoend))) {
/* Copy the literals */
memcpy(op, ip, endOnInput ? 16 : 8);
--
1.8.3.1


2020-06-06 14:39:31

by Gao Xiang

[permalink] [raw]
Subject: Re: [PATCH] lib/lz4: smatch warning in LZ4_decompress_generic()

(I'm not sure whether my email was finally sent out since
I haven't seen it in lore yet. Resend with my another email...)

On Sat, Jun 06, 2020 at 04:28:02PM +0300, Vasily Averin wrote:
> Found by smatch:
> lib/lz4/lz4_decompress.c:150 LZ4_decompress_generic() warn: maybe use && instead of &
> It was realy incorrectly copied from
> https://github.com/lz4/lz4/commit/45f8603aae389d34c689d3ff7427b314071ccd2c
> line 1431

Simply no.

>
> Fixes: 2209fda323e2 ("lib/lz4: update LZ4 decompressor module")
> Signed-off-by: Vasily Averin <[email protected]>
> ---
> lib/lz4/lz4_decompress.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c
> index 0c9d3ad..f7f7dca 100644
> --- a/lib/lz4/lz4_decompress.c
> +++ b/lib/lz4/lz4_decompress.c
> @@ -147,7 +147,7 @@ static FORCE_INLINE int LZ4_decompress_generic(
> * strictly "less than" on input, to re-enter
> * the loop with at least one byte
> */
> - && likely((endOnInput ? ip < shortiend : 1) &
> + && likely((endOnInput ? ip < shortiend : 1) &&

I'd like to say, this is not my mistake (even not an issue).
If you notice the latest LZ4 upstream
https://github.com/lz4/lz4/blob/dev/lib/lz4.c#L1865

Or some related change, the lz4 author Cyan did it on purpose.
https://github.com/lz4/lz4/commit/1a191b3f8d26b50a7c1d41590b529ec308d768cd

I think we could follow the latest LZ4 upstream in order to
avoid further maintainence overhead. That's my own thought
anyway.

Thanks,
Gao Xiang

2020-06-06 15:19:12

by Vasily Averin

[permalink] [raw]
Subject: Re: [PATCH] lib/lz4: smatch warning in LZ4_decompress_generic()

Dear Yann,
could you please consult us about your lz4 pacth
https://github.com/lz4/lz4/commit/1a191b3f8d26b50a7c1d41590b529ec308d768cd

Please see details below.

Thank you,
Vasily Averin

On 6/6/20 5:36 PM, Gao Xiang wrote:
> On Sat, Jun 06, 2020 at 04:28:02PM +0300, Vasily Averin wrote:
>> Found by smatch:
>> lib/lz4/lz4_decompress.c:150 LZ4_decompress_generic() warn: maybe use && instead of &
>> It was realy incorrectly copied from
>> https://github.com/lz4/lz4/commit/45f8603aae389d34c689d3ff7427b314071ccd2c
>> line 1431
>
> Simply no.
>
>>
>> Fixes: 2209fda323e2 ("lib/lz4: update LZ4 decompressor module")
>> Signed-off-by: Vasily Averin <[email protected]>
>> ---
>> lib/lz4/lz4_decompress.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>commit 1a191b3f8d26b50a7c1d41590b529ec308d768cd
Author: Yann Collet <[email protected]>
Date: Wed May 2 10:33:12 2018 -0700

simplify shortcut

diff --git a/lib/lz4.c b/lib/lz4.c
index c6f0426..b46910f 100644
--- a/lib/lz4.c
+++ b/lib/lz4.c
@@ -1429,62 +1429,29 @@ LZ4_FORCE_INLINE int LZ4_decompress_generic(
*/
if ((endOnInput ? length != RUN_MASK : length <= 8) &&
/* strictly "less than" on input, to re-enter the loop with at least one byte */
- likely((endOnInput ? ip < shortiend : 1) && (op <= shortoend)))
+ likely((endOnInput ? ip < shortiend : 1) & (op <= shortoend)))
>
>> diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c
>> index 0c9d3ad..f7f7dca 100644
>> --- a/lib/lz4/lz4_decompress.c
>> +++ b/lib/lz4/lz4_decompress.c
>> @@ -147,7 +147,7 @@ static FORCE_INLINE int LZ4_decompress_generic(
>> * strictly "less than" on input, to re-enter
>> * the loop with at least one byte
>> */
>> - && likely((endOnInput ? ip < shortiend : 1) &
>> + && likely((endOnInput ? ip < shortiend : 1) &&
>
> I'd like to say, this is not my mistake (even not an issue).
> If you notice the latest LZ4 upstream
> https://github.com/lz4/lz4/blob/dev/lib/lz4.c#L1865
>
> Or some related change, the lz4 author Cyan did it on purpose.
> https://github.com/lz4/lz4/commit/1a191b3f8d26b50a7c1d41590b529ec308d768cd

For me it looks like typo in patch,
lets ask author of this commit.

> I think we could follow the latest LZ4 upstream in order to
> avoid further maintainence overhead. That's my own thought
> anyway.
>
> Thanks,
> Gao Xiang
>

2020-06-08 00:45:13

by Yann Collet

[permalink] [raw]
Subject: Re: [PATCH] lib/lz4: smatch warning in LZ4_decompress_generic()

Hi Vasily


If I do understand the discussion, the question is about usage of `&` instead of `&&`,
and the speculation that it might be an error.

It's not an error. Unfortunately, explaining the reasoning behind this decision is a bit long.

First, in term of outcome, I believe it should be clear that :
(endOnInput ? ip < shortiend : 1) && (op <= shortoend)
is completely equivalent to :
(endOnInput ? ip < shortiend : 1) & (op <= shortoend)

Note that one cannot just mindlessly replace && by & anywhere and expect it to work the same,
In the general case of int A,B; A && B is different from A & B.
But (A>0) && (B>0) is equivalent to (A>0) & (B>0), because each side of the test is now a strict 0/1 value.
This condition is respected above.

Now why selecting one rather than the other ?
It has something to do with C language sequential guarantees and the pipeline of modern cpus.

When spelling: if (f(A) && f(B))
the program must do f(A) first, and then, if it's true, move on to f(B).
This introduces a serial dependency and an additional intermediate branch.

Whereas in: if (f(A) & f(B))
f(A) and f(B) can be executed in parallel,
then combined with a simple AND operation,
and there is only one branch, at the end.

Once again, this should not be done mindlessly :
such construction is only valid if f(B) is truly completely independent from f(A).
For example, in : (op != NULL) && (*op=K), one must use &&, otherwise it's UB, because the second test is only valid after the first one is verified.

But for the example at stake :
(endOnInput ? ip < shortiend : 1) & (op <= shortoend)
one can see that each test is fully independent,
(op<=shortoend) doesn't depend on anything from the left side.


OK, now, does it make a difference ?
Well, as usual, it depends.
If it was a branch anywhere in the code base, it probably doesn't.
One needs a very hot code section for such a difference to become noticeable.
But it happens that in lz4, we do have such hot sections.

There is another issue at stake, almost invisible :
if you know the architecture of modern intel cpu,
you might know that they have a micro-op cache.
The way this micro-op cache is structured is a bit complex, and more importantly, poorly documented, and changes with each chip generation.
But well, among the notable elements, the concentration of branches matter :
if there are too many branches densely packed, the micro-op cache cannot do its job,
therefore, every time the instruction segment must be played, it has to be decoded again, costing significant performance.

Also, the branch predictor has a limited history, and a denser pack of branches ends up being more difficult to track hence to predict
(here also, details are not especially well documented, and change all the time. The only "generic rule" is that less branches tends to be easier than more branches).

So, as one can see, it's _generally_ better to replace a && branch by an & AND operation, when conditions are fulfilled,
but it only matters within very hot code segments.

To make matters even more difficult, sometimes, clever compilers can make the same choice automatically,
realizing that in a code like f(A) && f(B), there may actually be a way to execute both sides and replace the intermediate branch by a mask.
The problem is, that's true sometimes. Not always.
Other compilers will react differently and may miss this opportunity.
That's important for codes designed to be highly portable, as "it works fine on my machine" isn't enough to justify a modification that may affect other systems.


Anyway, I hope it answers your question :
- both expressions are equivalent, though that's only because some conditions are carefully respected.
- && is better from a style perspective, and is my first choice for "normal" code
- hot code segments value speed, in which case, & can be a more efficient alternative


Regards

Yann

On 6/6/20, 08:16, "Vasily Averin" <[email protected]> wrote:

Dear Yann,
could you please consult us about your lz4 pacth
https://github.com/lz4/lz4/commit/1a191b3f8d26b50a7c1d41590b529ec308d768cd

Please see details below.

Thank you,
Vasily Averin

On 6/6/20 5:36 PM, Gao Xiang wrote:
> On Sat, Jun 06, 2020 at 04:28:02PM +0300, Vasily Averin wrote:
>> Found by smatch:
>> lib/lz4/lz4_decompress.c:150 LZ4_decompress_generic() warn: maybe use && instead of &
>> It was realy incorrectly copied from
>> https://github.com/lz4/lz4/commit/45f8603aae389d34c689d3ff7427b314071ccd2c
>> line 1431
>
> Simply no.
>
>>
>> Fixes: 2209fda323e2 ("lib/lz4: update LZ4 decompressor module")
>> Signed-off-by: Vasily Averin <[email protected]>
>> ---
>> lib/lz4/lz4_decompress.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>commit 1a191b3f8d26b50a7c1d41590b529ec308d768cd
Author: Yann Collet <[email protected]>
Date: Wed May 2 10:33:12 2018 -0700

simplify shortcut

diff --git a/lib/lz4.c b/lib/lz4.c
index c6f0426..b46910f 100644
--- a/lib/lz4.c
+++ b/lib/lz4.c
@@ -1429,62 +1429,29 @@ LZ4_FORCE_INLINE int LZ4_decompress_generic(
*/
if ((endOnInput ? length != RUN_MASK : length <= 8) &&
/* strictly "less than" on input, to re-enter the loop with at least one byte */
- likely((endOnInput ? ip < shortiend : 1) && (op <= shortoend)))
+ likely((endOnInput ? ip < shortiend : 1) & (op <= shortoend)))
>
>> diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c
>> index 0c9d3ad..f7f7dca 100644
>> --- a/lib/lz4/lz4_decompress.c
>> +++ b/lib/lz4/lz4_decompress.c
>> @@ -147,7 +147,7 @@ static FORCE_INLINE int LZ4_decompress_generic(
>> * strictly "less than" on input, to re-enter
>> * the loop with at least one byte
>> */
>> - && likely((endOnInput ? ip < shortiend : 1) &
>> + && likely((endOnInput ? ip < shortiend : 1) &&
>
> I'd like to say, this is not my mistake (even not an issue).
> If you notice the latest LZ4 upstream
> https://github.com/lz4/lz4/blob/dev/lib/lz4.c#L1865
>
> Or some related change, the lz4 author Cyan did it on purpose.
> https://github.com/lz4/lz4/commit/1a191b3f8d26b50a7c1d41590b529ec308d768cd

For me it looks like typo in patch,
lets ask author of this commit.

> I think we could follow the latest LZ4 upstream in order to
> avoid further maintainence overhead. That's my own thought
> anyway.
>
> Thanks,
> Gao Xiang
>

2020-06-08 01:04:56

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH] lib/lz4: smatch warning in LZ4_decompress_generic()

On Mon, 2020-06-08 at 00:40 +0000, Yann Collet wrote:
> Hi Vasily
>
>
> If I do understand the discussion, the question is about usage of `&` instead of `&&`,
> and the speculation that it might be an error.
>
> It's not an error. Unfortunately, explaining the reasoning behind this decision is a bit long.

Likely better to add a comment around the use so that
another patch like this doesn't get submitted again.

Perhaps something like:
---
lib/lz4/lz4_decompress.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c
index 0c9d3ad17e0f..5371dab6b481 100644
--- a/lib/lz4/lz4_decompress.c
+++ b/lib/lz4/lz4_decompress.c
@@ -141,6 +141,9 @@ static FORCE_INLINE int LZ4_decompress_generic(
* space in the output for those 18 bytes earlier, upon
* entering the shortcut (in other words, there is a
* combined check for both stages).
+ *
+ * The & in the likely() below is intentionally not && so that
+ * some compilers can produce better parallelized runtime code
*/
if ((endOnInput ? length != RUN_MASK : length <= 8)
/*


2020-06-08 01:27:32

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] lib/lz4: smatch warning in LZ4_decompress_generic()

On Sun, 07 Jun 2020 18:02:44 -0700 Joe Perches <[email protected]> wrote:

> On Mon, 2020-06-08 at 00:40 +0000, Yann Collet wrote:
> > Hi Vasily
> >
> >
> > If I do understand the discussion, the question is about usage of `&` instead of `&&`,
> > and the speculation that it might be an error.
> >
> > It's not an error. Unfortunately, explaining the reasoning behind this decision is a bit long.
>
> Likely better to add a comment around the use so that
> another patch like this doesn't get submitted again.
>
> Perhaps something like:

Yup.


From: Joe Perches <[email protected]>
Subject: lib/lz4/lz4_decompress.c: document deliberate use of `&'

This operation was intentional, but tools such as smatch will warn that it
might not have been.

Link: http://lkml.kernel.org/r/[email protected]
Cc: Yann Collet <[email protected]>
Cc: Vasily Averin <[email protected]>
Cc: Gao Xiang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

lib/lz4/lz4_decompress.c | 3 +++
1 file changed, 3 insertions(+)

--- a/lib/lz4/lz4_decompress.c~lib-lz4-smatch-warning-in-lz4_decompress_generic
+++ a/lib/lz4/lz4_decompress.c
@@ -141,6 +141,9 @@ static FORCE_INLINE int LZ4_decompress_g
* space in the output for those 18 bytes earlier, upon
* entering the shortcut (in other words, there is a
* combined check for both stages).
+ *
+ * The & in the likely() below is intentionally not && so that
+ * some compilers can produce better parallelized runtime code
*/
if ((endOnInput ? length != RUN_MASK : length <= 8)
/*
_