2022-08-08 16:05:53

by Florian Weimer

Subject: tools/testing/selftests/kvm/rseq_test and glibc 2.35

It has come to my attention that the KVM rseq test apparently needs to
be ported to glibc 2.35. The background is that on aarch64, rseq is the
only way to get a practically useful sched_getcpu. (There's no hidden
per-task CPU state the vDSO could reveal as the CPU ID.)

The main rseq tests have already been adjusted via:

commit 233e667e1ae3e348686bd9dd0172e62a09d852e1
Author: Mathieu Desnoyers <[email protected]>
Date: Mon Jan 24 12:12:45 2022 -0500

selftests/rseq: Uplift rseq selftests for compatibility with glibc-2.35

glibc-2.35 (upcoming release date 2022-02-01) exposes the rseq per-thread
data in the TCB, accessible at an offset from the thread pointer, rather
than through an actual Thread-Local Storage (TLS) variable, as the
Linux kernel selftests initially expected.

The __rseq_abi TLS and glibc-2.35's ABI for per-thread data cannot
actively coexist in a process, because the kernel supports only a single
rseq registration per thread.

Here is the scheme introduced to ensure selftests can work both with an
older glibc and with glibc-2.35+:

- librseq exposes its own "rseq_offset, rseq_size, rseq_flags" ABI.

- librseq queries for glibc rseq ABI (__rseq_offset, __rseq_size,
__rseq_flags) using dlsym() in a librseq library constructor. If those
are found, copy their values into rseq_offset, rseq_size, and
rseq_flags.

- Else, if those glibc symbols are not found, handle rseq registration
from librseq and use its own IE-model TLS to implement the rseq ABI
per-thread storage.

Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

But I don't see a similar adjustment for
tools/testing/selftests/kvm/rseq_test.c. As an additional wrinkle,
you'd have to start calling getcpu (glibc function or system call)
because comparing rseq.cpu_id against sched_getcpu won't test anything
anymore once glibc implements sched_getcpu using rseq.
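
For illustration, here is a minimal sketch of reading the CPU ID straight
from the kernel, so that the result does not depend on how glibc implements
sched_getcpu. The helper name raw_getcpu is made up for this example and is
not taken from any posted patch.

```c
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel for the current CPU through the raw
 * getcpu system call. Going through syscall() bypasses any rseq-backed
 * fast path that glibc's sched_getcpu might use.
 * Returns the CPU number, or -1 with errno set on failure.
 */
static int raw_getcpu(void)
{
	unsigned int cpu = 0;

	if (syscall(SYS_getcpu, &cpu, NULL, NULL) != 0)
		return -1;
	return (int)cpu;
}
```

A test would compare this value against the cpu_id field read from the rseq
area, and retry if the thread may have migrated between the two reads.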

We noticed this because our downstream glibc version, while based on
2.34, enables rseq registration by default. To facilitate coordination
with rseq application usage, we also backported the __rseq_* ABI
symbols, so the selftests could use that even in our downstream version.
(We enable the glibc tunables downstream, but they are an optional
glibc feature, so it's probably better in the long run to fix the kernel
selftests rather than using the tunables as a workaround.)
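
The symbol lookup described in the commit message above can be sketched
roughly as follows. This is not the librseq code: the struct and function
names are hypothetical, and dlopen(NULL, ...) is used here only as one
portable way to search the global symbol scope.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for what a test needs to know about libc's rseq. */
struct libc_rseq_info {
	bool found;          /* all three __rseq_* symbols were resolved */
	ptrdiff_t offset;    /* offset of the rseq area from the thread pointer */
	unsigned int size;   /* 0 means libc did not register rseq */
	unsigned int flags;
};

static struct libc_rseq_info probe_libc_rseq(void)
{
	struct libc_rseq_info info = { .found = false };
	/* A NULL filename yields a handle that searches the global scope. */
	void *self = dlopen(NULL, RTLD_LAZY);
	const ptrdiff_t *offset_p;
	const unsigned int *size_p, *flags_p;

	if (!self)
		return info;

	offset_p = dlsym(self, "__rseq_offset");
	size_p = dlsym(self, "__rseq_size");
	flags_p = dlsym(self, "__rseq_flags");

	/* Only trust the libc ABI if every symbol is present. */
	if (offset_p && size_p && flags_p) {
		info.found = true;
		info.offset = *offset_p;
		info.size = *size_p;
		info.flags = *flags_p;
	}
	dlclose(self);
	return info;
}
```

If found is false, or size is 0, the test would have to perform its own
registration, as in the fallback branch of the scheme above.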

Thanks,
Florian


2022-08-08 23:53:09

by Gavin Shan

Subject: Re: tools/testing/selftests/kvm/rseq_test and glibc 2.35

Hi Florian,

On 8/9/22 2:01 AM, Florian Weimer wrote:
> It has come to my attention that the KVM rseq test apparently needs to
> be ported to glibc 2.35. The background is that on aarch64, rseq is the
> only way to get a practically useful sched_getcpu. (There's no hidden
> per-task CPU state the vDSO could reveal as the CPU ID.)
>

Yes, kvm/selftests/rseq needs to support glibc 2.35. The question is
whether this is about glibc 2.34 or 2.35, because kvm/selftests/rseq
already fails on glibc 2.34.

I would guess the upstream glibc-2.35 feature is enabled in the downstream
glibc-2.34?

# ./rseq_test
==== Test Assertion Failure ====
rseq_test.c:60: !r
pid=112043 tid=112043 errno=22 - Invalid argument
1 0x0000000000401973: main at rseq_test.c:226
2 0x0000ffff84b6c79b: ?? ??:0
3 0x0000ffff84b6c86b: ?? ??:0
4 0x0000000000401b6f: _start at ??:?
rseq failed, errno = 22 (Invalid argument)
# rpm -aq | grep glibc-2
glibc-2.34-39.el9.aarch64


> The main rseq tests have already been adjusted via:
>
> commit 233e667e1ae3e348686bd9dd0172e62a09d852e1
> Author: Mathieu Desnoyers <[email protected]>
> Date: Mon Jan 24 12:12:45 2022 -0500
>
> selftests/rseq: Uplift rseq selftests for compatibility with glibc-2.35
>
> glibc-2.35 (upcoming release date 2022-02-01) exposes the rseq per-thread
> data in the TCB, accessible at an offset from the thread pointer, rather
> than through an actual Thread-Local Storage (TLS) variable, as the
> Linux kernel selftests initially expected.
>
> The __rseq_abi TLS and glibc-2.35's ABI for per-thread data cannot
> actively coexist in a process, because the kernel supports only a single
> rseq registration per thread.
>
> Here is the scheme introduced to ensure selftests can work both with an
> older glibc and with glibc-2.35+:
>
> - librseq exposes its own "rseq_offset, rseq_size, rseq_flags" ABI.
>
> - librseq queries for glibc rseq ABI (__rseq_offset, __rseq_size,
> __rseq_flags) using dlsym() in a librseq library constructor. If those
> are found, copy their values into rseq_offset, rseq_size, and
> rseq_flags.
>
> - Else, if those glibc symbols are not found, handle rseq registration
> from librseq and use its own IE-model TLS to implement the rseq ABI
> per-thread storage.
>
> Signed-off-by: Mathieu Desnoyers <[email protected]>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> Link: https://lkml.kernel.org/r/[email protected]
>
> But I don't see a similar adjustment for
> tools/testing/selftests/kvm/rseq_test.c. As an additional wrinkle,
> you'd have to start calling getcpu (glibc function or system call)
> because comparing rseq.cpu_id against sched_getcpu won't test anything
> anymore once glibc implements sched_getcpu using rseq.
>
> We noticed this because our downstream glibc version, while based on
> 2.34, enables rseq registration by default. To facilitate coordination
> with rseq application usage, we also backported the __rseq_* ABI
> symbols, so the selftests could use that even in our downstream version.
> (We enable the glibc tunables downstream, but they are an optional
> glibc feature, so it's probably better in the long run to fix the kernel
> selftests rather than using the tunables as a workaround.)
>

Thanks for the pointer. It makes sense. So it means rseq registration has
been done by glibc TLS? In this case, kvm/selftests/rseq is unable to
register again.
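
The errno=22 in the failure above is consistent with a second registration
attempt. Below is a rough sketch of how the result of a registration attempt
can be interpreted. It assumes the behavior of the kernel's rseq system call
around the time of this thread and has not been checked against every kernel
version. The helper name is hypothetical, and EINVAL in particular has other
causes as well.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: map the errno of a failed rseq registration to a
 * short explanation. Pass 0 for a successful registration.
 */
static const char *explain_rseq_register_errno(int err)
{
	switch (err) {
	case 0:
		return "registered by this caller";
	case EBUSY:
		return "already registered with the same area and signature";
	case EINVAL:
		return "already registered with a different area or size, "
		       "or invalid arguments";
	case EPERM:
		return "already registered with a different signature";
	case ENOSYS:
		return "kernel has no rseq support";
	case EFAULT:
		return "rseq area address is not accessible";
	default:
		return "unexpected error";
	}
}
```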

I will come up with something similar for kvm/selftests/rseq.

Thanks,
Gavin

2022-08-09 01:07:10

by Mathieu Desnoyers

Subject: Re: tools/testing/selftests/kvm/rseq_test and glibc 2.35


----- Gavin Shan <[email protected]> wrote:
> Hi Florian,
>
> On 8/9/22 2:01 AM, Florian Weimer wrote:
> > It has come to my attention that the KVM rseq test apparently needs to
> > be ported to glibc 2.35. The background is that on aarch64, rseq is the
> > only way to get a practically useful sched_getcpu. (There's no hidden
> > per-task CPU state the vDSO could reveal as the CPU ID.)
> >
>
> Yes, kvm/selftests/rseq needs to support glibc 2.35. The question is
> about glibc 2.34 or 2.35 because kvm/selftest/rseq fails on glibc 2.34
>
> I would guess upstream-glibc-2.35 feature is enabled on downstream
> glibc-2.34?
>
> # ./rseq_test
> ==== Test Assertion Failure ====
> rseq_test.c:60: !r
> pid=112043 tid=112043 errno=22 - Invalid argument
> 1 0x0000000000401973: main at rseq_test.c:226
> 2 0x0000ffff84b6c79b: ?? ??:0
> 3 0x0000ffff84b6c86b: ?? ??:0
> 4 0x0000000000401b6f: _start at ??:?
> rseq failed, errno = 22 (Invalid argument)
> # rpm -aq | grep glibc-2
> glibc-2.34-39.el9.aarch64
>
>
> > The main rseq tests have already been adjusted via:
> >
> > commit 233e667e1ae3e348686bd9dd0172e62a09d852e1
> > Author: Mathieu Desnoyers <[email protected]>
> > Date: Mon Jan 24 12:12:45 2022 -0500
> >
> > selftests/rseq: Uplift rseq selftests for compatibility with glibc-2.35
> >
> > glibc-2.35 (upcoming release date 2022-02-01) exposes the rseq per-thread
> > data in the TCB, accessible at an offset from the thread pointer, rather
> > than through an actual Thread-Local Storage (TLS) variable, as the
> > Linux kernel selftests initially expected.
> >
> > The __rseq_abi TLS and glibc-2.35's ABI for per-thread data cannot
> > actively coexist in a process, because the kernel supports only a single
> > rseq registration per thread.
> >
> > Here is the scheme introduced to ensure selftests can work both with an
> > older glibc and with glibc-2.35+:
> >
> > - librseq exposes its own "rseq_offset, rseq_size, rseq_flags" ABI.
> >
> > - librseq queries for glibc rseq ABI (__rseq_offset, __rseq_size,
> > __rseq_flags) using dlsym() in a librseq library constructor. If those
> > are found, copy their values into rseq_offset, rseq_size, and
> > rseq_flags.
> >
> > - Else, if those glibc symbols are not found, handle rseq registration
> > from librseq and use its own IE-model TLS to implement the rseq ABI
> > per-thread storage.
> >
> > Signed-off-by: Mathieu Desnoyers <[email protected]>
> > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> > Link: https://lkml.kernel.org/r/[email protected]
> >
> > But I don't see a similar adjustment for
> > tools/testing/selftests/kvm/rseq_test.c. As an additional wrinkle,
> > you'd have to start calling getcpu (glibc function or system call)
> > because comparing rseq.cpu_id against sched_getcpu won't test anything
> > anymore once glibc implements sched_getcpu using rseq.
> >
> > We noticed this because our downstream glibc version, while based on
> > 2.34, enables rseq registration by default. To facilitate coordination
> > with rseq application usage, we also backported the __rseq_* ABI
> > symbols, so the selftests could use that even in our downstream version.
> > (We enable the glibc tunables downstream, but they are an optional
> > glibc feature, so it's probably better in the long run to fix the kernel
> > selftests rather than using the tunables as a workaround.)
> >
>
> Thanks for the pointer. It makes sense. So it means rseq registration has
> been done by glibc TLS? In this case, kvm/selftests/rseq is unable to
> register again.

The registration is done by glibc initialization and thread startup code.

>
> I will come up with something similar for kvm/selftests/rseq.

Make sure to check the rseq selftests fixes recently pulled in during the current merge window as well. One is relevant:

https://github.com/torvalds/linux/commit/d1a997ba4c1bf65497d956aea90de42a6398f73a

We may want to find a way to remove this duplicated rseq.c code eventually.

Thanks,

Mathieu

>
> Thanks,
> Gavin
>

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

2022-08-09 02:14:03

by Gavin Shan

Subject: Re: tools/testing/selftests/kvm/rseq_test and glibc 2.35

On 8/9/22 10:57 AM, Mathieu Desnoyers wrote:
>
> ----- Gavin Shan <[email protected]> wrote:
>> Hi Florian,
>>
>> On 8/9/22 2:01 AM, Florian Weimer wrote:
>>> It has come to my attention that the KVM rseq test apparently needs to
>>> be ported to glibc 2.35. The background is that on aarch64, rseq is the
>>> only way to get a practically useful sched_getcpu. (There's no hidden
>>> per-task CPU state the vDSO could reveal as the CPU ID.)
>>>
>>
>> Yes, kvm/selftests/rseq needs to support glibc 2.35. The question is
>> about glibc 2.34 or 2.35 because kvm/selftest/rseq fails on glibc 2.34
>>
>> I would guess upstream-glibc-2.35 feature is enabled on downstream
>> glibc-2.34?
>>
>> # ./rseq_test
>> ==== Test Assertion Failure ====
>> rseq_test.c:60: !r
>> pid=112043 tid=112043 errno=22 - Invalid argument
>> 1 0x0000000000401973: main at rseq_test.c:226
>> 2 0x0000ffff84b6c79b: ?? ??:0
>> 3 0x0000ffff84b6c86b: ?? ??:0
>> 4 0x0000000000401b6f: _start at ??:?
>> rseq failed, errno = 22 (Invalid argument)
>> # rpm -aq | grep glibc-2
>> glibc-2.34-39.el9.aarch64
>>
>>
>>> The main rseq tests have already been adjusted via:
>>>
>>> commit 233e667e1ae3e348686bd9dd0172e62a09d852e1
>>> Author: Mathieu Desnoyers <[email protected]>
>>> Date: Mon Jan 24 12:12:45 2022 -0500
>>>
>>> selftests/rseq: Uplift rseq selftests for compatibility with glibc-2.35
>>>
>>> glibc-2.35 (upcoming release date 2022-02-01) exposes the rseq per-thread
>>> data in the TCB, accessible at an offset from the thread pointer, rather
>>> than through an actual Thread-Local Storage (TLS) variable, as the
>>> Linux kernel selftests initially expected.
>>>
>>> The __rseq_abi TLS and glibc-2.35's ABI for per-thread data cannot
>>> actively coexist in a process, because the kernel supports only a single
>>> rseq registration per thread.
>>>
>>> Here is the scheme introduced to ensure selftests can work both with an
>>> older glibc and with glibc-2.35+:
>>>
>>> - librseq exposes its own "rseq_offset, rseq_size, rseq_flags" ABI.
>>>
>>> - librseq queries for glibc rseq ABI (__rseq_offset, __rseq_size,
>>> __rseq_flags) using dlsym() in a librseq library constructor. If those
>>> are found, copy their values into rseq_offset, rseq_size, and
>>> rseq_flags.
>>>
>>> - Else, if those glibc symbols are not found, handle rseq registration
>>> from librseq and use its own IE-model TLS to implement the rseq ABI
>>> per-thread storage.
>>>
>>> Signed-off-by: Mathieu Desnoyers <[email protected]>
>>> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
>>> Link: https://lkml.kernel.org/r/[email protected]
>>>
>>> But I don't see a similar adjustment for
>>> tools/testing/selftests/kvm/rseq_test.c. As an additional wrinkle,
>>> you'd have to start calling getcpu (glibc function or system call)
>>> because comparing rseq.cpu_id against sched_getcpu won't test anything
>>> anymore once glibc implements sched_getcpu using rseq.
>>>
>>> We noticed this because our downstream glibc version, while based on
>>> 2.34, enables rseq registration by default. To facilitate coordination
>>> with rseq application usage, we also backported the __rseq_* ABI
>>> symbols, so the selftests could use that even in our downstream version.
>>> (We enable the glibc tunables downstream, but they are an optional
>>> glibc feature, so it's probably better in the long run to fix the kernel
>>> selftests rather than using the tunables as a workaround.)
>>>
>>
>> Thanks for the pointer. It makes sense. So it means rseq registration has
>> been done by glibc TLS? In this case, kvm/selftests/rseq is unable to
>> register again.
>
> The registration is done by glibc initialization and thread startup code.
>
>>
>> I will come up with something similar for kvm/selftests/rseq.
>
> Make sure to check the rseq selftests fixes recently pulled in during the current merge window as well. One is relevant:
>
> https://github.com/torvalds/linux/commit/d1a997ba4c1bf65497d956aea90de42a6398f73a
>
> We may want to find a way to remove this duplicated rseq.c code eventually.
>

Thanks, Mathieu. The check for 'rseq-size' will be included as well. I almost
have something working. I will post the fixes after some tests.

Thanks,
Gavin

2022-08-09 04:57:55

by Gavin Shan

Subject: Re: tools/testing/selftests/kvm/rseq_test and glibc 2.35

On 8/9/22 1:58 PM, Gavin Shan wrote:
> On 8/9/22 10:57 AM, Mathieu Desnoyers wrote:
>>> On 8/9/22 2:01 AM, Florian Weimer wrote:
>>>> It has come to my attention that the KVM rseq test apparently needs to
>>>> be ported to glibc 2.35.  The background is that on aarch64, rseq is the
>>>> only way to get a practically useful sched_getcpu.  (There's no hidden
>>>> per-task CPU state the vDSO could reveal as the CPU ID.)
>>>>
>>>
>>> Yes, kvm/selftests/rseq needs to support glibc 2.35. The question is
>>> about glibc 2.34 or 2.35 because kvm/selftest/rseq fails on glibc 2.34
>>>
>>> I would guess upstream-glibc-2.35 feature is enabled on downstream
>>> glibc-2.34?
>>>
>>> # ./rseq_test
>>> ==== Test Assertion Failure ====
>>>     rseq_test.c:60: !r
>>>     pid=112043 tid=112043 errno=22 - Invalid argument
>>>        1    0x0000000000401973: main at rseq_test.c:226
>>>        2    0x0000ffff84b6c79b: ?? ??:0
>>>        3    0x0000ffff84b6c86b: ?? ??:0
>>>        4    0x0000000000401b6f: _start at ??:?
>>>     rseq failed, errno = 22 (Invalid argument)
>>> # rpm -aq | grep glibc-2
>>> glibc-2.34-39.el9.aarch64
>>>
>>>
>>>> The main rseq tests have already been adjusted via:
>>>>
>>>> commit 233e667e1ae3e348686bd9dd0172e62a09d852e1
>>>> Author: Mathieu Desnoyers <[email protected]>
>>>> Date:   Mon Jan 24 12:12:45 2022 -0500
>>>>
>>>>       selftests/rseq: Uplift rseq selftests for compatibility with glibc-2.35
>>>>       glibc-2.35 (upcoming release date 2022-02-01) exposes the rseq per-thread
>>>>       data in the TCB, accessible at an offset from the thread pointer, rather
>>>>       than through an actual Thread-Local Storage (TLS) variable, as the
>>>>       Linux kernel selftests initially expected.
>>>>       The __rseq_abi TLS and glibc-2.35's ABI for per-thread data cannot
>>>>       actively coexist in a process, because the kernel supports only a single
>>>>       rseq registration per thread.
>>>>       Here is the scheme introduced to ensure selftests can work both with an
>>>>       older glibc and with glibc-2.35+:
>>>>       - librseq exposes its own "rseq_offset, rseq_size, rseq_flags" ABI.
>>>>       - librseq queries for glibc rseq ABI (__rseq_offset, __rseq_size,
>>>>         __rseq_flags) using dlsym() in a librseq library constructor. If those
>>>>         are found, copy their values into rseq_offset, rseq_size, and
>>>>         rseq_flags.
>>>>       - Else, if those glibc symbols are not found, handle rseq registration
>>>>         from librseq and use its own IE-model TLS to implement the rseq ABI
>>>>         per-thread storage.
>>>>       Signed-off-by: Mathieu Desnoyers <[email protected]>
>>>>       Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
>>>>       Link: https://lkml.kernel.org/r/[email protected]
>>>>
>>>> But I don't see a similar adjustment for
>>>> tools/testing/selftests/kvm/rseq_test.c.  As an additional wrinkle,
>>>> you'd have to start calling getcpu (glibc function or system call)
>>>> because comparing rseq.cpu_id against sched_getcpu won't test anything
>>>> anymore once glibc implements sched_getcpu using rseq.
>>>>
>>>> We noticed this because our downstream glibc version, while based on
>>>> 2.34, enables rseq registration by default.  To facilitate coordination
>>>> with rseq application usage, we also backported the __rseq_* ABI
>>>> symbols, so the selftests could use that even in our downstream version.
>>>> (We enable the glibc tunables downstream, but they are an optional
>>>> glibc feature, so it's probably better in the long run to fix the kernel
>>>> selftests rather than using the tunables as a workaround.)
>>>>
>>>
>>> Thanks for the pointer. It makes sense. So it means rseq registration has
>>> been done by glibc TLS? In this case, kvm/selftests/rseq is unable to
>>> register again.
>>
>> The registration is done by glibc initialization and thread startup code.
>>
>>>
>>> I will come up with something similar for kvm/selftests/rseq.
>>
>> Make sure to check the rseq selftests fixes recently pulled in during the current merge window as well. One is relevant:
>>
>> https://github.com/torvalds/linux/commit/d1a997ba4c1bf65497d956aea90de42a6398f73a
>>
>> We may want to find a way to remove this duplicated rseq.c code eventually.
>>
>
> Thanks, Mathieu. The check for 'rseq-size' will be included as well. I almost
> have something working. I will post the fixes after some tests.
>

Mathieu and Florian, the fixes have been posted. It would be nice if you
could review them when you have free cycles :)

https://lore.kernel.org/kvmarm/[email protected]/T/#t

Thanks,
Gavin

2022-08-09 06:52:09

by Florian Weimer

Subject: Re: tools/testing/selftests/kvm/rseq_test and glibc 2.35

* Gavin Shan:

> Hi Florian,
>
> On 8/9/22 2:01 AM, Florian Weimer wrote:
>> It has come to my attention that the KVM rseq test apparently needs to
>> be ported to glibc 2.35. The background is that on aarch64, rseq is the
>> only way to get a practically useful sched_getcpu. (There's no hidden
>> per-task CPU state the vDSO could reveal as the CPU ID.)
>>
>
> Yes, kvm/selftests/rseq needs to support glibc 2.35. The question is
> about glibc 2.34 or 2.35 because kvm/selftest/rseq fails on glibc 2.34
>
> I would guess upstream-glibc-2.35 feature is enabled on downstream
> glibc-2.34?
>
> # ./rseq_test
> ==== Test Assertion Failure ====
> rseq_test.c:60: !r
> pid=112043 tid=112043 errno=22 - Invalid argument
> 1 0x0000000000401973: main at rseq_test.c:226
> 2 0x0000ffff84b6c79b: ?? ??:0
> 3 0x0000ffff84b6c86b: ?? ??:0
> 4 0x0000000000401b6f: _start at ??:?
> rseq failed, errno = 22 (Invalid argument)
> # rpm -aq | grep glibc-2
> glibc-2.34-39.el9.aarch64

Yes, we have enabled it downstream.

glibc: Restartable sequences interfaces and sched_getcpu accelerated
by default
<https://bugzilla.redhat.com/show_bug.cgi?id=2085529>

However,

GLIBC_TUNABLES=glibc.pthread.rseq=0 ./rseq_test

should still work (we added the ability to disable rseq registration
precisely to enable scenarios like this), but tunables are an optional
glibc feature, so the upstream kernel should probably still be fixed.
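
As a hedged illustration, a test harness could detect that this workaround
is in effect by looking for the entry in GLIBC_TUNABLES. The function names
below are hypothetical. The parser only handles the plain colon-separated
name=value form shown above and does not replicate glibc's own parsing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return true if the colon-separated tunables string
 * contains the exact entry "glibc.pthread.rseq=0". A NULL string counts
 * as "not disabled".
 */
static bool rseq_disabled_by_tunable(const char *tunables)
{
	static const char key[] = "glibc.pthread.rseq=0";
	const size_t key_len = sizeof(key) - 1;
	const char *p = tunables;

	while (p && *p) {
		const char *end = strchr(p, ':');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		if (len == key_len && strncmp(p, key, key_len) == 0)
			return true;
		/* Move to the next entry, or stop at the end of the string. */
		p = end ? end + 1 : NULL;
	}
	return false;
}

/* Convenience wrapper that inspects the current environment. */
static bool rseq_disabled_in_env(void)
{
	return rseq_disabled_by_tunable(getenv("GLIBC_TUNABLES"));
}
```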

> Mathieu and Florian, the fixes have been posted. It would be nice for you
> to review if you have free cycles :)
>
> https://lore.kernel.org/kvmarm/[email protected]/T/#t

Excellent. I'm going to have a look.

Thanks,
Florian