2019-06-06 22:33:13

by Joe Korty

Subject: [BUG 4.4.178] x86_64 compat mode futexes broken

Starting with 4.4.178, the LTP test

pthread_cond_wait/2-3

when compiled on x86_64 with 'gcc -m32', started failing. It generates this log output:

[16:18:38]Implementation supports the MONOTONIC CLOCK but option is disabled in test.
[16:18:38]Test starting
[16:18:38] Process-shared primitive will be tested
[16:18:38] Alternative clock for cond will be tested
[16:18:38]Test 2-3.c FAILED: The child did not own the mutex inside the cleanup handler

A git bisection between 4.4.177..178 shows that this commit is the culprit:

Git-Commit: 79739ad2d0ac5787a15a1acf7caaf34cd95bbf3c
Author: Alistair Strachan <[email protected]>
Subject: [PATCH] x86: vdso: Use $LD instead of $CC to link

And, indeed, when I back this patch out of 4.4.178 proper, the above test
passes again.

Please consider backing this patch out of linux-4.4.y, and from master, and from
any other linux branch it has been backported to.

PS: In backing it out of 4.4.178, I first backed out

7c45b45fd6e928c9ce275c32f6fa98d317e6f5ee

This is a follow-on vdso patch which collides with the
patch we are interested in removing. As it claims to be
only removing redundant code, it probably should never
have been backported in the first place.

Signed-off-by: Joe Korty <[email protected]>


2019-06-07 01:07:24

by Joe Korty

Subject: Re: [BUG 4.4.178] x86_64 compat mode futexes broken

On Thu, Jun 06, 2019 at 04:11:30PM -0700, Nathan Chancellor wrote:
> On Thu, Jun 06, 2019 at 09:11:43PM +0000, Joe Korty wrote:
> > Starting with 4.4.178, the LTP test
> >
> > pthread_cond_wait/2-3
> >
> > when compiled on x86_64 with 'gcc -m32', started failing. It generates this log output:
> >
> > [16:18:38]Implementation supports the MONOTONIC CLOCK but option is disabled in test.
> > [16:18:38]Test starting
> > [16:18:38] Process-shared primitive will be tested
> > [16:18:38] Alternative clock for cond will be tested
> > [16:18:38]Test 2-3.c FAILED: The child did not own the mutex inside the cleanup handler
> >
>
> What is the exact build command + test case command? I'd like to
> reproduce this myself.
>
> > A git bisection between 4.4.177..178 shows that this commit is the culprit:
> >
> > Git-Commit: 79739ad2d0ac5787a15a1acf7caaf34cd95bbf3c
> > Author: Alistair Strachan <[email protected]>
> > Subject: [PATCH] x86: vdso: Use $LD instead of $CC to link
> >
>
> Have you tested 4.4.180? There were two subsequent fixes to this patch
> in 4.4:

Hi Nathan,
I started with 4.4.179-rt181 and worked backwards from there. Per your
suggestion, I tried 4.4.180 and it does work properly.

Thanks,
Joe

> 485d15db01ca ("kbuild: simplify ld-option implementation")
> 07d35512e494 ("x86/vdso: Pass --eh-frame-hdr to the linker")
>
> > And, indeed, when I back this patch out of 4.4.178 proper, the above test
> > passes again.
> >
> > Please consider backing this patch out of linux-4.4.y, and from master, and from
> > any other linux branch it has been backported to.
> >
>
> So this is broken in mainline too?
>
> > PS: In backing it out of 4.4.178, I first backed out
> >
> > 7c45b45fd6e928c9ce275c32f6fa98d317e6f5ee
> >
> > This is a follow-on vdso patch which collides with the
> > patch we are interested in removing. As it claims to be
> > only removing redundant code, it probably should never
> > have been backported in the first place.
>
> While it is redundant for ld.bfd, it causes a build failure with the
> release version of ld.lld:
>
> https://github.com/ClangBuiltLinux/linux/issues/31
>
> Cheers,
> Nathan

2019-06-07 02:07:22

by Nathan Chancellor

Subject: Re: [BUG 4.4.178] x86_64 compat mode futexes broken

On Thu, Jun 06, 2019 at 09:11:43PM +0000, Joe Korty wrote:
> Starting with 4.4.178, the LTP test
>
> pthread_cond_wait/2-3
>
> when compiled on x86_64 with 'gcc -m32', started failing. It generates this log output:
>
> [16:18:38]Implementation supports the MONOTONIC CLOCK but option is disabled in test.
> [16:18:38]Test starting
> [16:18:38] Process-shared primitive will be tested
> [16:18:38] Alternative clock for cond will be tested
> [16:18:38]Test 2-3.c FAILED: The child did not own the mutex inside the cleanup handler
>

What is the exact build command + test case command? I'd like to
reproduce this myself.

> A git bisection between 4.4.177..178 shows that this commit is the culprit:
>
> Git-Commit: 79739ad2d0ac5787a15a1acf7caaf34cd95bbf3c
> Author: Alistair Strachan <[email protected]>
> Subject: [PATCH] x86: vdso: Use $LD instead of $CC to link
>

Have you tested 4.4.180? There were two subsequent fixes to this patch
in 4.4:

485d15db01ca ("kbuild: simplify ld-option implementation")
07d35512e494 ("x86/vdso: Pass --eh-frame-hdr to the linker")

> And, indeed, when I back this patch out of 4.4.178 proper, the above test
> passes again.
>
> Please consider backing this patch out of linux-4.4.y, and from master, and from
> any other linux branch it has been backported to.
>

So this is broken in mainline too?

> PS: In backing it out of 4.4.178, I first backed out
>
> 7c45b45fd6e928c9ce275c32f6fa98d317e6f5ee
>
> This is a follow-on vdso patch which collides with the
> patch we are interested in removing. As it claims to be
> only removing redundant code, it probably should never
> have been backported in the first place.

While it is redundant for ld.bfd, it causes a build failure with the
release version of ld.lld:

https://github.com/ClangBuiltLinux/linux/issues/31

Cheers,
Nathan

2019-06-07 03:26:03

by Nathan Chancellor

Subject: Re: [BUG 4.4.178] x86_64 compat mode futexes broken

On Fri, Jun 07, 2019 at 01:01:36AM +0000, Joe Korty wrote:
> Hi Nathan,
> I started with 4.4.179-rt181 and worked backwards from there. Per your
> suggestion, I tried 4.4.180 and it does work properly.
>
> Thanks,
> Joe

Great, thank you for testing, and sorry for the breakage in the first
place. I missed those commits in my series :(

Cheers,
Nathan