2019-12-23 14:32:50

by Christophe Leroy

[permalink] [raw]
Subject: [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time()

READ_ONCE() forces the read of the 64 bit value of
vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec allthough
only the lower part is needed.

This results in a suboptimal code:

00000af4 <__c_kernel_time>:
af4: 2c 03 00 00 cmpwi r3,0
af8: 81 44 00 20 lwz r10,32(r4)
afc: 81 64 00 24 lwz r11,36(r4)
b00: 41 82 00 08 beq b08 <__c_kernel_time+0x14>
b04: 91 63 00 00 stw r11,0(r3)
b08: 7d 63 5b 78 mr r3,r11
b0c: 4e 80 00 20 blr

By removing the READ_ONCE(), only the lower part is read from
memory, and the code is cleaner:

00000af4 <__c_kernel_time>:
af4: 7c 69 1b 79 mr. r9,r3
af8: 80 64 00 24 lwz r3,36(r4)
afc: 4d 82 00 20 beqlr
b00: 90 69 00 00 stw r3,0(r9)
b04: 4e 80 00 20 blr

Signed-off-by: Christophe Leroy <[email protected]>
---
lib/vdso/gettimeofday.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
index 17b4cff6e5f0..5a17a9d2e6cd 100644
--- a/lib/vdso/gettimeofday.c
+++ b/lib/vdso/gettimeofday.c
@@ -144,7 +144,7 @@ __cvdso_gettimeofday(const struct vdso_data *vd, struct __kernel_old_timeval *tv
static __maybe_unused __kernel_old_time_t
__cvdso_time(const struct vdso_data *vd, __kernel_old_time_t *time)
{
- __kernel_old_time_t t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
+ __kernel_old_time_t t = vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec;

if (time)
*time = t;
--
2.13.3


2019-12-24 01:59:29

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time()

On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
<[email protected]> wrote:
>
> READ_ONCE() forces the read of the 64 bit value of
> vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec allthough
> only the lower part is needed.

Seems reasonable and very unlikely to be harmful. That being said,
this function really ought to be considered deprecated -- 32-bit
time_t is insufficient.

Do you get even better code if you move the read into the if statement?

Reviewed-by: Andy Lutomirski <[email protected]>

--Andy

2019-12-24 11:13:24

by Christophe Leroy

[permalink] [raw]
Subject: Re: [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time()



Le 24/12/2019 à 02:58, Andy Lutomirski a écrit :
> On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
> <[email protected]> wrote:
>>
>> READ_ONCE() forces the read of the 64 bit value of
>> vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec allthough
>> only the lower part is needed.
>
> Seems reasonable and very unlikely to be harmful. That being said,
> this function really ought to be considered deprecated -- 32-bit
> time_t is insufficient.
>
> Do you get even better code if you move the read into the if statement?

Euh ...

How can you return t when time pointer is NULL if you read t only when
time pointer is not NULL ?

Christophe

2019-12-24 12:05:30

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time()


> On Dec 24, 2019, at 7:12 PM, christophe leroy <[email protected]> wrote:
>
> 
>
>> Le 24/12/2019 à 02:58, Andy Lutomirski a écrit :
>>> On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
>>> <[email protected]> wrote:
>>>
>>> READ_ONCE() forces the read of the 64 bit value of
>>> vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec allthough
>>> only the lower part is needed.
>> Seems reasonable and very unlikely to be harmful. That being said,
>> this function really ought to be considered deprecated -- 32-bit
>> time_t is insufficient.
>> Do you get even better code if you move the read into the if statement?
>
> Euh ...
>
> How can you return t when time pointer is NULL if you read t only when time pointer is not NULL ?
>
>

Duh, never mind.

But this means your patch may be buggy: you need to make sure the compiler returns the *same* value it stores. Maybe you’re saved by the potential aliasing between the data page and the passed parameter and the value you read, but that’sa bad thing to rely on.

Try barrier() after the read.

2020-01-10 21:13:21

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time()

Christophe Leroy <[email protected]> writes:
>
> diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
> index 17b4cff6e5f0..5a17a9d2e6cd 100644
> --- a/lib/vdso/gettimeofday.c
> +++ b/lib/vdso/gettimeofday.c
> @@ -144,7 +144,7 @@ __cvdso_gettimeofday(const struct vdso_data *vd, struct __kernel_old_timeval *tv
> static __maybe_unused __kernel_old_time_t
> __cvdso_time(const struct vdso_data *vd, __kernel_old_time_t *time)
> {
> - __kernel_old_time_t t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
> + __kernel_old_time_t t = vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec;
>
> if (time)
> *time = t;

Allows the compiler to load twice, i.e. the returned value might be different from the
stored value. So no.

Thanks,

tglx

2020-01-11 08:09:43

by Christophe Leroy

[permalink] [raw]
Subject: Re: [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time()



On 01/10/2020 09:12 PM, Thomas Gleixner wrote:
> Christophe Leroy <[email protected]> writes:
>>
>> diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
>> index 17b4cff6e5f0..5a17a9d2e6cd 100644
>> --- a/lib/vdso/gettimeofday.c
>> +++ b/lib/vdso/gettimeofday.c
>> @@ -144,7 +144,7 @@ __cvdso_gettimeofday(const struct vdso_data *vd, struct __kernel_old_timeval *tv
>> static __maybe_unused __kernel_old_time_t
>> __cvdso_time(const struct vdso_data *vd, __kernel_old_time_t *time)
>> {
>> - __kernel_old_time_t t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
>> + __kernel_old_time_t t = vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec;
>>
>> if (time)
>> *time = t;
>
> Allows the compiler to load twice, i.e. the returned value might be different from the
> stored value. So no.
>

With READ_ONCE() the 64 bits are being read:

00000ac8 <__c_kernel_time>:
ac8: 2c 03 00 00 cmpwi r3,0
acc: 81 44 00 20 lwz r10,32(r4)
ad0: 81 64 00 24 lwz r11,36(r4)
ad4: 41 82 00 08 beq adc <__c_kernel_time+0x14>
ad8: 91 63 00 00 stw r11,0(r3)
adc: 7d 63 5b 78 mr r3,r11
ae0: 4e 80 00 20 blr

Without the READ_ONCE() only 32 bits are read. That's the most optimal.

00000ac8 <__c_kernel_time>:
ac8: 7c 69 1b 79 mr. r9,r3
acc: 80 64 00 24 lwz r3,36(r4)
ad0: 4d 82 00 20 beqlr
ad4: 90 69 00 00 stw r3,0(r9)
ad8: 4e 80 00 20 blr

Without READ_ONCE() but with a barrier() after the read, we should get
the same result but GCC (GCC 8.1) does less good:

00000ac8 <__c_kernel_time>:
ac8: 81 24 00 24 lwz r9,36(r4)
acc: 2f 83 00 00 cmpwi cr7,r3,0
ad0: 41 9e 00 08 beq cr7,ad8 <__c_kernel_time+0x10>
ad4: 91 23 00 00 stw r9,0(r3)
ad8: 7d 23 4b 78 mr r3,r9
adc: 4e 80 00 20 blr

Assuming both part of the 64 bits data will fall into a single
cacheline, the second read is in the noise.

So agreed to drop this change.

Christophe

2020-01-11 11:08:57

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time()

Christophe Leroy <[email protected]> writes:
>
> With READ_ONCE() the 64 bits are being read:
>
> Without the READ_ONCE() only 32 bits are read. That's the most optimal.
>
> Without READ_ONCE() but with a barrier() after the read, we should get
> the same result but GCC (GCC 8.1) does less good:
>
> Assuming both part of the 64 bits data will fall into a single
> cacheline, the second read is in the noise.

They definitely are in the same cacheline.

> So agreed to drop this change.

We could be smart about this and force the compiler to issue a 32bit
read for 32bit builds. See below. Not sure whether it's worth it, but
OTOH it will take quite a while until the 32bit time interfaces die
completely.

Thanks,

tglx

8<------------
--- a/include/vdso/datapage.h
+++ b/include/vdso/datapage.h
@@ -21,6 +21,18 @@
#define CS_RAW 1
#define CS_BASES (CS_RAW + 1)

+#ifdef __LITTLE_ENDIAN
+struct sec_hl {
+ u32 sec_l;
+ u32 sec_h;
+};
+#else
+struct sec_hl {
+ u32 sec_h;
+ u32 sec_l;
+};
+#endif
+
/**
* struct vdso_timestamp - basetime per clock_id
* @sec: seconds
@@ -35,7 +47,10 @@
* vdso_data.cs[x].shift.
*/
struct vdso_timestamp {
- u64 sec;
+ union {
+ u64 sec;
+ struct sec_hl sec_hl;
+ };
u64 nsec;
};

--- a/lib/vdso/gettimeofday.c
+++ b/lib/vdso/gettimeofday.c
@@ -165,8 +165,13 @@ static __maybe_unused int
static __maybe_unused __kernel_old_time_t __cvdso_time(__kernel_old_time_t *time)
{
const struct vdso_data *vd = __arch_get_vdso_data();
- __kernel_old_time_t t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
+ __kernel_old_time_t t;

+#if BITS_PER_LONG == 32
+ t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec_hl.sec_l);
+#else
+ t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
+#endif
if (time)
*time = t;

2020-01-13 06:54:39

by Christophe Leroy

[permalink] [raw]
Subject: Re: [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time()



Le 11/01/2020 à 12:07, Thomas Gleixner a écrit :
> Christophe Leroy <[email protected]> writes:
>>
>> With READ_ONCE() the 64 bits are being read:
>>
>> Without the READ_ONCE() only 32 bits are read. That's the most optimal.
>>
>> Without READ_ONCE() but with a barrier() after the read, we should get
>> the same result but GCC (GCC 8.1) does less good:
>>
>> Assuming both part of the 64 bits data will fall into a single
>> cacheline, the second read is in the noise.
>
> They definitely are in the same cacheline.
>
>> So agreed to drop this change.
>
> We could be smart about this and force the compiler to issue a 32bit
> read for 32bit builds. See below. Not sure whether it's worth it, but
> OTOH it will take quite a while until the 32bit time interfaces die
> completely.

I don't think it is worth something so big to just save 1 or 2 cycles in
time() function. Lets keep it as it is.

Thanks,
Christophe

>
> Thanks,
>
> tglx
>
> 8<------------
> --- a/include/vdso/datapage.h
> +++ b/include/vdso/datapage.h
> @@ -21,6 +21,18 @@
> #define CS_RAW 1
> #define CS_BASES (CS_RAW + 1)
>
> +#ifdef __LITTLE_ENDIAN
> +struct sec_hl {
> + u32 sec_l;
> + u32 sec_h;
> +};
> +#else
> +struct sec_hl {
> + u32 sec_h;
> + u32 sec_l;
> +};
> +#endif
> +
> /**
> * struct vdso_timestamp - basetime per clock_id
> * @sec: seconds
> @@ -35,7 +47,10 @@
> * vdso_data.cs[x].shift.
> */
> struct vdso_timestamp {
> - u64 sec;
> + union {
> + u64 sec;
> + struct sec_hl sec_hl;
> + };
> u64 nsec;
> };
>
> --- a/lib/vdso/gettimeofday.c
> +++ b/lib/vdso/gettimeofday.c
> @@ -165,8 +165,13 @@ static __maybe_unused int
> static __maybe_unused __kernel_old_time_t __cvdso_time(__kernel_old_time_t *time)
> {
> const struct vdso_data *vd = __arch_get_vdso_data();
> - __kernel_old_time_t t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
> + __kernel_old_time_t t;
>
> +#if BITS_PER_LONG == 32
> + t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec_hl.sec_l);
> +#else
> + t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
> +#endif
> if (time)
> *time = t;
>
>