2020-01-23 15:36:04

by Will Deacon

[permalink] [raw]
Subject: [PATCH v2 00/10] Rework READ_ONCE() to improve codegen

Hi folks,

This is version two of the patches I previously posted as an RFC here:

https://lore.kernel.org/lkml/[email protected]

Changes since then include:

* Adopted a less vomit-inducing series of macros for __unqual_scalar_typeof

* Cast to 'const' in READ_ONCE() to prevent assignment to the resulting
expression

* Only warn once at build-time if GCC prior to 4.8 is detected...

* ... and then raise the minimum GCC version to 4.8, with an error for
older versions of the compiler

* Remove some dead gcov code so that the resulting diffstat can distract
from those less vomit-inducing macros I mentioned earlier on

Failing the build for older compilers is always a contentious topic, so
I've done that as a separate couple of patches on the end in case we end
up dropping or reverting them.

Cheers,

Will

Cc: Michael Ellerman <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Segher Boessenkool <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Luc Van Oostenryck <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Peter Oberparleiter <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>

--->8

Will Deacon (10):
compiler/gcc: Emit build-time warning for GCC prior to version 4.8
netfilter: Avoid assigning 'const' pointer to non-const pointer
fault_inject: Don't rely on "return value" from WRITE_ONCE()
READ_ONCE: Simplify implementations of {READ,WRITE}_ONCE()
READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses
READ_ONCE: Drop pointer qualifiers when reading from scalar types
locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros
arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release
macros
compiler/gcc: Raise minimum GCC version for kernel builds to 4.8
gcov: Remove old GCC 3.4 support

Documentation/process/changes.rst | 2 +-
arch/arm/crypto/Kconfig | 12 +-
arch/arm64/include/asm/barrier.h | 16 +-
crypto/Kconfig | 1 -
drivers/xen/time.c | 2 +-
include/asm-generic/barrier.h | 16 +-
include/linux/compiler-gcc.h | 5 +-
include/linux/compiler.h | 129 +++----
include/linux/compiler_types.h | 21 ++
init/Kconfig | 5 +-
kernel/gcov/Kconfig | 24 --
kernel/gcov/Makefile | 3 +-
kernel/gcov/gcc_3_4.c | 573 ------------------------------
lib/fault-inject.c | 4 +-
net/netfilter/core.c | 2 +-
net/xdp/xsk_queue.h | 2 +-
scripts/Kconfig.include | 3 +
scripts/gcc-plugins/Kconfig | 4 +-
18 files changed, 111 insertions(+), 713 deletions(-)
delete mode 100644 kernel/gcov/gcc_3_4.c

--
2.25.0.341.g760bfbb309-goog


2020-01-23 15:36:04

by Will Deacon

[permalink] [raw]
Subject: [PATCH v2 06/10] READ_ONCE: Drop pointer qualifiers when reading from scalar types

Passing a volatile-qualified pointer to READ_ONCE() is an absolute
trainwreck for code generation: the use of 'typeof()' to define a
temporary variable inside the macro means that the final evaluation in
macro scope ends up forcing a read back from the stack. When stack
protector is enabled (the default for arm64, at least), this causes
the compiler to vomit up all sorts of junk.

Unfortunately, dropping pointer qualifiers inside the macro poses quite
a challenge, especially since the pointed-to type is permitted to be an
aggregate, and this is relied upon by mm/ code accessing things like
'pmd_t'. Based on numerous hacks and discussions on the mailing list,
this is the best I've managed to come up with.

Introduce '__unqual_scalar_typeof()' which takes an expression and, if
the expression is an optionally qualified 8, 16, 32 or 64-bit scalar
type, evaluates to the unqualified type. Other input types, including
aggregates, remain unchanged. Hopefully READ_ONCE() on volatile aggregate
pointers isn't something we do on a fast-path.

Cc: Peter Zijlstra <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Suggested-by: Linus Torvalds <[email protected]>
Reported-by: Michael Ellerman <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
include/linux/compiler.h | 6 +++---
include/linux/compiler_types.h | 21 +++++++++++++++++++++
2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index a7b2195f2655..994c35638584 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -203,13 +203,13 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
* atomicity or dependency ordering guarantees. Note that this may result
* in tears!
*/
-#define __READ_ONCE(x) (*(const volatile typeof(x) *)&(x))
+#define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))

#define __READ_ONCE_SCALAR(x) \
({ \
- typeof(x) __x = __READ_ONCE(x); \
+ __unqual_scalar_typeof(x) __x = __READ_ONCE(x); \
smp_read_barrier_depends(); \
- __x; \
+ (typeof(x))__x; \
})

/*
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 72393a8c1a6c..58361f2d3b98 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -219,6 +219,27 @@ struct ftrace_likely_data {
/* Are two types/vars the same type (ignoring qualifiers)? */
#define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))

+/*
+ * __unqual_scalar_typeof(x) - Declare an unqualified scalar type, leaving
+ * non-scalar types unchanged.
+ *
+ * We build this out of a couple of helper macros in a vain attempt to
+ * help you keep your lunch down while reading it.
+ */
+#define __pick_scalar_type(x, type, otherwise) \
+ __builtin_choose_expr(__same_type(x, type), (type)0, otherwise)
+
+#define __pick_integer_type(x, type, otherwise) \
+ __pick_scalar_type(x, unsigned type, \
+ __pick_scalar_type(x, signed type, otherwise))
+
+#define __unqual_scalar_typeof(x) typeof( \
+ __pick_integer_type(x, char, \
+ __pick_integer_type(x, short, \
+ __pick_integer_type(x, int, \
+ __pick_integer_type(x, long, \
+ __pick_integer_type(x, long long, x))))))
+
/* Is this type a native word size -- useful for atomic operations */
#define __native_word(t) \
(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
--
2.25.0.341.g760bfbb309-goog

2020-01-23 15:36:15

by Will Deacon

[permalink] [raw]
Subject: [PATCH v2 05/10] READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses

{READ,WRITE}_ONCE() cannot guarantee atomicity for arbitrary data sizes.
This can be surprising to callers that might incorrectly be expecting
atomicity for accesses to aggregate structures, although there are other
callers where tearing is actually permissable (e.g. if they are using
something akin to sequence locking to protect the access).

Linus sayeth:

| We could also look at being stricter for the normal READ/WRITE_ONCE(),
| and require that they are
|
| (a) regular integer types
|
| (b) fit in an atomic word
|
| We actually did (b) for a while, until we noticed that we do it on
| loff_t's etc and relaxed the rules. But maybe we could have a
| "non-atomic" version of READ/WRITE_ONCE() that is used for the
| questionable cases?

The slight snag is that we also have to support 64-bit accesses on 32-bit
architectures, as these appear to be widespread and tend to work out ok
if either the architecture supports atomic 64-bit accesses (x86, armv7)
or if the variable being accesses represents a virtual address and
therefore only requires 32-bit atomicity in practice.

Take a step in that direction by introducing a variant of
'compiletime_assert_atomic_type()' and use it to check the pointer
argument to {READ,WRITE}_ONCE(). Expose __{READ,WRITE_ONCE}() variants
which are allowed to tear and convert the two broken callers over to the
new macros.

Suggested-by: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
drivers/xen/time.c | 2 +-
include/linux/compiler.h | 37 +++++++++++++++++++++++++++++++++----
net/xdp/xsk_queue.h | 2 +-
3 files changed, 35 insertions(+), 6 deletions(-)

diff --git a/drivers/xen/time.c b/drivers/xen/time.c
index 0968859c29d0..108edbcbc040 100644
--- a/drivers/xen/time.c
+++ b/drivers/xen/time.c
@@ -64,7 +64,7 @@ static void xen_get_runstate_snapshot_cpu_delta(
do {
state_time = get64(&state->state_entry_time);
rmb(); /* Hypervisor might update data. */
- *res = READ_ONCE(*state);
+ *res = __READ_ONCE(*state);
rmb(); /* Hypervisor might update data. */
} while (get64(&state->state_entry_time) != state_time ||
(state_time & XEN_RUNSTATE_UPDATE));
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 44974d658f30..a7b2195f2655 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -198,24 +198,43 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
#include <asm/barrier.h>
#include <linux/kasan-checks.h>

+/*
+ * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
+ * atomicity or dependency ordering guarantees. Note that this may result
+ * in tears!
+ */
+#define __READ_ONCE(x) (*(const volatile typeof(x) *)&(x))
+
+#define __READ_ONCE_SCALAR(x) \
+({ \
+ typeof(x) __x = __READ_ONCE(x); \
+ smp_read_barrier_depends(); \
+ __x; \
+})
+
/*
* Use READ_ONCE_NOCHECK() instead of READ_ONCE() if you need
* to hide memory access from KASAN.
*/
#define READ_ONCE_NOCHECK(x) \
({ \
- typeof(x) __x = *(volatile typeof(x) *)&(x); \
- smp_read_barrier_depends(); \
- __x; \
+ compiletime_assert_rwonce_type(x); \
+ __READ_ONCE_SCALAR(x); \
})

#define READ_ONCE(x) READ_ONCE_NOCHECK(x)

-#define WRITE_ONCE(x, val) \
+#define __WRITE_ONCE(x, val) \
do { \
*(volatile typeof(x) *)&(x) = (val); \
} while (0)

+#define WRITE_ONCE(x, val) \
+do { \
+ compiletime_assert_rwonce_type(x); \
+ __WRITE_ONCE(x, val); \
+} while (0)
+
#ifdef CONFIG_KASAN
/*
* We can't declare function 'inline' because __no_sanitize_address conflicts
@@ -299,6 +318,16 @@ static inline void *offset_to_ptr(const int *off)
compiletime_assert(__native_word(t), \
"Need native word sized stores/loads for atomicity.")

+/*
+ * Yes, this permits 64-bit accesses on 32-bit architectures. These will
+ * actually be atomic in many cases (namely x86), but for others we rely on
+ * the access being split into 2x32-bit accesses for a 32-bit quantity (e.g.
+ * a virtual address) and a strong prevailing wind.
+ */
+#define compiletime_assert_rwonce_type(t) \
+ compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
+ "Unsupported access size for {READ,WRITE}_ONCE().")
+
/* &a[0] degrades to a pointer: a different type from an array */
#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))

diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index eddae4688862..2b55c1c7b2b6 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -304,7 +304,7 @@ static inline struct xdp_desc *xskq_validate_desc(struct xsk_queue *q,
struct xdp_rxtx_ring *ring = (struct xdp_rxtx_ring *)q->ring;
unsigned int idx = q->cons_tail & q->ring_mask;

- *desc = READ_ONCE(ring->desc[idx]);
+ *desc = __READ_ONCE(ring->desc[idx]);
if (xskq_is_valid_desc(q, desc, umem))
return desc;

--
2.25.0.341.g760bfbb309-goog

2020-01-23 15:36:23

by Will Deacon

[permalink] [raw]
Subject: [PATCH v2 03/10] fault_inject: Don't rely on "return value" from WRITE_ONCE()

It's a bit weird that WRITE_ONCE() evaluates to the value it stores and
it's different to smp_store_release(), which can't be used this way.

In preparation for preventing this in WRITE_ONCE(), change the fault
injection code to use a local variable instead.

Cc: Akinobu Mita <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
lib/fault-inject.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/fault-inject.c b/lib/fault-inject.c
index 8186ca84910b..ce12621b4275 100644
--- a/lib/fault-inject.c
+++ b/lib/fault-inject.c
@@ -106,7 +106,9 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
unsigned int fail_nth = READ_ONCE(current->fail_nth);

if (fail_nth) {
- if (!WRITE_ONCE(current->fail_nth, fail_nth - 1))
+ fail_nth--;
+ WRITE_ONCE(current->fail_nth, fail_nth);
+ if (!fail_nth)
goto fail;

return false;
--
2.25.0.341.g760bfbb309-goog

2020-01-23 15:36:36

by Will Deacon

[permalink] [raw]
Subject: [PATCH v2 07/10] locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros

Passing volatile-qualified pointers to the asm-generic implementations of
the load-acquire macros results in a re-load from the stack due to the
temporary result variable inheriting the volatile semantics thanks to the
use of 'typeof()'.

Define these temporary variables using 'unqual_scalar_typeof' to drop
the volatile qualifier in the case that they are scalar types.

Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
include/asm-generic/barrier.h | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 85b28eb80b11..2eacaf7d62f6 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -128,10 +128,10 @@ do { \
#ifndef __smp_load_acquire
#define __smp_load_acquire(p) \
({ \
- typeof(*p) ___p1 = READ_ONCE(*p); \
+ __unqual_scalar_typeof(*p) ___p1 = READ_ONCE(*p); \
compiletime_assert_atomic_type(*p); \
__smp_mb(); \
- ___p1; \
+ (typeof(*p))___p1; \
})
#endif

@@ -183,10 +183,10 @@ do { \
#ifndef smp_load_acquire
#define smp_load_acquire(p) \
({ \
- typeof(*p) ___p1 = READ_ONCE(*p); \
+ __unqual_scalar_typeof(*p) ___p1 = READ_ONCE(*p); \
compiletime_assert_atomic_type(*p); \
barrier(); \
- ___p1; \
+ (typeof(*p))___p1; \
})
#endif

@@ -229,14 +229,14 @@ do { \
#ifndef smp_cond_load_relaxed
#define smp_cond_load_relaxed(ptr, cond_expr) ({ \
typeof(ptr) __PTR = (ptr); \
- typeof(*ptr) VAL; \
+ __unqual_scalar_typeof(*ptr) VAL; \
for (;;) { \
VAL = READ_ONCE(*__PTR); \
if (cond_expr) \
break; \
cpu_relax(); \
} \
- VAL; \
+ (typeof(*ptr))VAL; \
})
#endif

@@ -250,10 +250,10 @@ do { \
*/
#ifndef smp_cond_load_acquire
#define smp_cond_load_acquire(ptr, cond_expr) ({ \
- typeof(*ptr) _val; \
+ __unqual_scalar_typeof(*ptr) _val; \
_val = smp_cond_load_relaxed(ptr, cond_expr); \
smp_acquire__after_ctrl_dep(); \
- _val; \
+ (typeof(*ptr))_val; \
})
#endif

--
2.25.0.341.g760bfbb309-goog

2020-01-23 17:19:16

by David Laight

[permalink] [raw]
Subject: RE: [PATCH v2 00/10] Rework READ_ONCE() to improve codegen

From: Will Deacon
> Sent: 23 January 2020 15:34
...
> * Only warn once at build-time if GCC prior to 4.8 is detected...
>
> * ... and then raise the minimum GCC version to 4.8, with an error for
> older versions of the compiler

If the kernel compiled with gcc 4.7 is likely to be buggy, don't these
need to be in the other order?

Otherwise you need to keep the old versions for use with the old
compilers.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2020-01-23 17:20:33

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] Rework READ_ONCE() to improve codegen

On Thu, Jan 23, 2020 at 05:07:40PM +0000, David Laight wrote:
> From: Will Deacon
> > Sent: 23 January 2020 15:34
> ...
> > * Only warn once at build-time if GCC prior to 4.8 is detected...
> >
> > * ... and then raise the minimum GCC version to 4.8, with an error for
> > older versions of the compiler
>
> If the kernel compiled with gcc 4.7 is likely to be buggy, don't these
> need to be in the other order?
>
> Otherwise you need to keep the old versions for use with the old
> compilers.

I think it depends how much we care about those older compilers. My series
first moves it to "Good luck mate, you're on your own" and then follows up
with a "Let me take that off you it's sharp".

Will

2020-01-23 17:33:21

by David Laight

[permalink] [raw]
Subject: RE: [PATCH v2 00/10] Rework READ_ONCE() to improve codegen

From: Will Deacon
> Sent: 23 January 2020 17:17
>
> On Thu, Jan 23, 2020 at 05:07:40PM +0000, David Laight wrote:
> > From: Will Deacon
> > > Sent: 23 January 2020 15:34
> > ...
> > > * Only warn once at build-time if GCC prior to 4.8 is detected...
> > >
> > > * ... and then raise the minimum GCC version to 4.8, with an error for
> > > older versions of the compiler
> >
> > If the kernel compiled with gcc 4.7 is likely to be buggy, don't these
> > need to be in the other order?
> >
> > Otherwise you need to keep the old versions for use with the old
> > compilers.
>
> I think it depends how much we care about those older compilers. My series
> first moves it to "Good luck mate, you're on your own" and then follows up
> with a "Let me take that off you it's sharp".

Depends on how 'sharp' it is.

If the kernel suffers from the code example in the gcc bug itself
(where 'volatile' is lost and some code is moved out of a loop)
then things will really break somewhere odd.

OTOH if it might generate code that reads something twice
you'd have to be unlucky as well.

Oh - and I need to find a newer compiler :-(

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2020-01-23 18:00:57

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] Rework READ_ONCE() to improve codegen

On Thu, Jan 23, 2020 at 7:33 AM Will Deacon <[email protected]> wrote:
>
> This is version two of the patches I previously posted as an RFC here:

Looks fine to me, as far as I can tell,

Linus

2020-01-23 18:47:39

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] Rework READ_ONCE() to improve codegen

On Thu, Jan 23, 2020 at 9:32 AM David Laight <[email protected]> wrote:
>
> From: Will Deacon
> > Sent: 23 January 2020 17:17
> >
> > I think it depends how much we care about those older compilers. My series
> > first moves it to "Good luck mate, you're on your own" and then follows up

I wish the actual warning was worded that way. :P

> > with a "Let me take that off you it's sharp".

> Oh - and I need to find a newer compiler :-(

What distro are you using? Does it have a package for a newer
compiler? I'm honestly curious about what policies if any the kernel
has for supporting developer's toolchains from their distributions.
(ie. Arnd usually has pretty good stats what distro's use which
version of GCC and are still supported; Do we strive to not break
them? Is asking kernel devs to compile their own toolchain too much to
ask? Is it still if they're using really old distro's/toolchains that
we don't want to support? Do we survey kernel devs about what they're
using?). Apologies if this is already documented somewhere, but if
not I'd eventually like to brainstorm and write it down somewhere in
the tree. Documentation/process/changes.rst doesn't really answer the
above questions, I think.

--
Thanks,
~Nick Desaulniers

2020-01-23 19:04:27

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] Rework READ_ONCE() to improve codegen

On Thu, Jan 23, 2020 at 10:45:08AM -0800, Nick Desaulniers wrote:
> On Thu, Jan 23, 2020 at 9:32 AM David Laight <[email protected]> wrote:
> >
> > From: Will Deacon
> > > Sent: 23 January 2020 17:17
> > >
> > > I think it depends how much we care about those older compilers. My series
> > > first moves it to "Good luck mate, you're on your own" and then follows up
>
> I wish the actual warning was worded that way. :P
>
> > > with a "Let me take that off you it's sharp".
>
> > Oh - and I need to find a newer compiler :-(
>
> What distro are you using? Does it have a package for a newer
> compiler? I'm honestly curious about what policies if any the kernel
> has for supporting developer's toolchains from their distributions.
> (ie. Arnd usually has pretty good stats what distro's use which
> version of GCC and are still supported; Do we strive to not break
> them? Is asking kernel devs to compile their own toolchain too much to
> ask? Is it still if they're using really old distro's/toolchains that
> we don't want to support? Do we survey kernel devs about what they're
> using?). Apologies if this is already documented somewhere, but if
> not I'd eventually like to brainstorm and write it down somewhere in
> the tree. Documentation/process/changes.rst doesn't really answer the
> above questions, I think.
>
> --
> Thanks,
> ~Nick Desaulniers

Reposting Arnd's link
https://www.spinics.net/lists/linux-kbuild/msg23648.html

Seems like there is nothing still on versions between 4.6 and 4.8. It
might be useful to put a list of distro's that are running the current
minimum supported toolchain versions in process/changes.rst?

I think the issue is not just kernel devs. Users may need to compile a
custom kernel or at least build a module.

2020-01-24 08:45:00

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] Rework READ_ONCE() to improve codegen

On Thu, Jan 23, 2020 at 09:59:03AM -0800, Linus Torvalds wrote:
> On Thu, Jan 23, 2020 at 7:33 AM Will Deacon <[email protected]> wrote:
> >
> > This is version two of the patches I previously posted as an RFC here:
>
> Looks fine to me, as far as I can tell,

Awesome, I've picked them up with a target for tip/locking/core.

2020-01-24 10:13:26

by David Laight

[permalink] [raw]
Subject: RE: [PATCH v2 00/10] Rework READ_ONCE() to improve codegen

From: Arvind Sankar
> Sent: 23 January 2020 19:01
...
> > > Oh - and I need to find a newer compiler :-(
> >
> > What distro are you using? Does it have a package for a newer
> > compiler? I'm honestly curious about what policies if any the kernel
> > has for supporting developer's toolchains from their distributions.

Like many developers I get lazy with upgrades of some systems.
So my main Linux 'desktop' system is running Ubuntu 13.04 userspace
with a very recent kernel.
So it has gcc 4.7.3 as default and a gcc 4.8.1 installed (probably not mature
enough to be safe!).
(I've 2 other i7 systems running Ubuntu 18.04.0x with the distro kernels.)
...
> I think the issue is not just kernel devs. Users may need to compile a
> custom kernel or at least build a module.

If they are just building a module they'll be using the kernel headers
that go with the distro kernel - and the module sources will need to
match. So the distro's gcc should match.

Much more interesting is the gcc compiler we use to generate the
'binary blob' for our 'out of tree' drivers.
These get compiled (without any of the kernel headers) on the
same system as we build the userspace programs on.
To get libc and (especially) libc++ compatibility this has to be
a very old system - as old as the oldest system any of our
customers are likely to use.
In practise this (now) means something running a 2.6.32 era
kernel (or slightly earlier).
So is a debian gcc 4.3.2-1.1 compiler.
I think we've finally persuaded all our customers to upgrade from 2.6.18.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2020-01-24 11:10:26

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] Rework READ_ONCE() to improve codegen

On Fri, Jan 24, 2020 at 09:33:07AM +0100, Peter Zijlstra wrote:
> On Thu, Jan 23, 2020 at 09:59:03AM -0800, Linus Torvalds wrote:
> > On Thu, Jan 23, 2020 at 7:33 AM Will Deacon <[email protected]> wrote:
> > >
> > > This is version two of the patches I previously posted as an RFC here:
> >
> > Looks fine to me, as far as I can tell,
>
> Awesome, I've picked them up with a target for tip/locking/core.

FWIW, I have the following merge resolution against locking/kcsan.

---

diff --cc include/linux/compiler.h
index 8c0beb10c1dd,994c35638584..000000000000
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@@ -177,28 -177,69 +177,74 @@@ void ftrace_likely_update(struct ftrace
# define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __LINE__)
#endif

- #include <uapi/linux/types.h>
- #include <linux/kcsan-checks.h>
+ /*
+ * Prevent the compiler from merging or refetching reads or writes. The
+ * compiler is also forbidden from reordering successive instances of
+ * READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
+ * particular ordering. One way to make the compiler aware of ordering is to
+ * put the two invocations of READ_ONCE or WRITE_ONCE in different C
+ * statements.
+ *
+ * These two macros will also work on aggregate data types like structs or
+ * unions.
+ *
+ * Their two major use cases are: (1) Mediating communication between
+ * process-level code and irq/NMI handlers, all running on the same CPU,
+ * and (2) Ensuring that the compiler does not fold, spindle, or otherwise
+ * mutilate accesses that either do not require ordering or that interact
+ * with an explicit memory barrier or atomic instruction that provides the
+ * required ordering.
+ */
+ #include <asm/barrier.h>
+ #include <linux/kasan-checks.h>
+
+ /*
+ * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
+ * atomicity or dependency ordering guarantees. Note that this may result
+ * in tears!
+ */
+ #define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))
+
+ #define __READ_ONCE_SCALAR(x) \
+ ({ \
+ __unqual_scalar_typeof(x) __x = __READ_ONCE(x); \
+ smp_read_barrier_depends(); \
+ (typeof(x))__x; \
+ })

- #define __READ_ONCE_SIZE \
+ /*
+ * Use READ_ONCE_NOCHECK() instead of READ_ONCE() if you need
+ * to hide memory access from KASAN.
+ */
+ #define READ_ONCE_NOCHECK(x) \
({ \
- switch (size) { \
- case 1: *(__u8 *)res = *(volatile __u8 *)p; break; \
- case 2: *(__u16 *)res = *(volatile __u16 *)p; break; \
- case 4: *(__u32 *)res = *(volatile __u32 *)p; break; \
- case 8: *(__u64 *)res = *(volatile __u64 *)p; break; \
- default: \
- barrier(); \
- __builtin_memcpy((void *)res, (const void *)p, size); \
- barrier(); \
- } \
+ compiletime_assert_rwonce_type(x); \
+ __READ_ONCE_SCALAR(x); \
+ })
+
-#define READ_ONCE(x) READ_ONCE_NOCHECK(x)
++#define READ_ONCE(x) \
++({ \
++ kcsan_check_atomic_read(&(x), sizeof(x)); \
++ READ_ONCE_NOCHECK(x); \
+})

+ #define __WRITE_ONCE(x, val) \
+ do { \
+ *(volatile typeof(x) *)&(x) = (val); \
+ } while (0)
+
+ #define WRITE_ONCE(x, val) \
+ do { \
+ compiletime_assert_rwonce_type(x); \
++ kcsan_check_atomic_write(&(x), sizeof(x)); \
+ __WRITE_ONCE(x, val); \
+ } while (0)
+
#ifdef CONFIG_KASAN
/*
- * We can't declare function 'inline' because __no_sanitize_address conflicts
+ * We can't declare function 'inline' because __no_sanitize_address confilcts
* with inlining. Attempt to inline it may cause a build failure.
- * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
- * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
++ * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
* '__maybe_unused' allows us to avoid defined-but-not-used warnings.
*/
# define __no_kasan_or_inline __no_sanitize_address notrace __maybe_unused
@@@ -207,97 -247,6 +253,24 @@@
# define __no_kasan_or_inline __always_inline
#endif

+#define __no_kcsan __no_sanitize_thread
+#ifdef __SANITIZE_THREAD__
+/*
+ * Rely on __SANITIZE_THREAD__ instead of CONFIG_KCSAN, to avoid not inlining in
+ * compilation units where instrumentation is disabled. The attribute 'noinline'
+ * is required for older compilers, where implicit inlining of very small
+ * functions renders __no_sanitize_thread ineffective.
+ */
+# define __no_kcsan_or_inline __no_kcsan noinline notrace __maybe_unused
+# define __no_sanitize_or_inline __no_kcsan_or_inline
+#else
+# define __no_kcsan_or_inline __always_inline
+#endif
+
+#ifndef __no_sanitize_or_inline
+#define __no_sanitize_or_inline __always_inline
+#endif
+
- static __no_kcsan_or_inline
- void __read_once_size(const volatile void *p, void *res, int size)
- {
- kcsan_check_atomic_read(p, size);
- __READ_ONCE_SIZE;
- }
-
- static __no_sanitize_or_inline
- void __read_once_size_nocheck(const volatile void *p, void *res, int size)
- {
- __READ_ONCE_SIZE;
- }
-
- static __no_kcsan_or_inline
- void __write_once_size(volatile void *p, void *res, int size)
- {
- kcsan_check_atomic_write(p, size);
-
- switch (size) {
- case 1: *(volatile __u8 *)p = *(__u8 *)res; break;
- case 2: *(volatile __u16 *)p = *(__u16 *)res; break;
- case 4: *(volatile __u32 *)p = *(__u32 *)res; break;
- case 8: *(volatile __u64 *)p = *(__u64 *)res; break;
- default:
- barrier();
- __builtin_memcpy((void *)p, (const void *)res, size);
- barrier();
- }
- }
-
- /*
- * Prevent the compiler from merging or refetching reads or writes. The
- * compiler is also forbidden from reordering successive instances of
- * READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
- * particular ordering. One way to make the compiler aware of ordering is to
- * put the two invocations of READ_ONCE or WRITE_ONCE in different C
- * statements.
- *
- * These two macros will also work on aggregate data types like structs or
- * unions. If the size of the accessed data type exceeds the word size of
- * the machine (e.g., 32 bits or 64 bits) READ_ONCE() and WRITE_ONCE() will
- * fall back to memcpy(). There's at least two memcpy()s: one for the
- * __builtin_memcpy() and then one for the macro doing the copy of variable
- * - '__u' allocated on the stack.
- *
- * Their two major use cases are: (1) Mediating communication between
- * process-level code and irq/NMI handlers, all running on the same CPU,
- * and (2) Ensuring that the compiler does not fold, spindle, or otherwise
- * mutilate accesses that either do not require ordering or that interact
- * with an explicit memory barrier or atomic instruction that provides the
- * required ordering.
- */
- #include <asm/barrier.h>
- #include <linux/kasan-checks.h>
-
- #define __READ_ONCE(x, check) \
- ({ \
- union { typeof(x) __val; char __c[1]; } __u; \
- if (check) \
- __read_once_size(&(x), __u.__c, sizeof(x)); \
- else \
- __read_once_size_nocheck(&(x), __u.__c, sizeof(x)); \
- smp_read_barrier_depends(); /* Enforce dependency ordering from x */ \
- __u.__val; \
- })
- #define READ_ONCE(x) __READ_ONCE(x, 1)
-
- /*
- * Use READ_ONCE_NOCHECK() instead of READ_ONCE() if you need
- * to hide memory access from KASAN.
- */
- #define READ_ONCE_NOCHECK(x) __READ_ONCE(x, 0)
-
static __no_kasan_or_inline
unsigned long read_word_at_a_time(const void *addr)
{
@@@ -305,34 -254,6 +278,26 @@@
return *(unsigned long *)addr;
}

- #define WRITE_ONCE(x, val) \
- ({ \
- union { typeof(x) __val; char __c[1]; } __u = \
- { .__val = (__force typeof(x)) (val) }; \
- __write_once_size(&(x), __u.__c, sizeof(x)); \
- __u.__val; \
- })
-
+#include <linux/kcsan.h>
+
+/*
+ * data_race(): macro to document that accesses in an expression may conflict with
+ * other concurrent accesses resulting in data races, but the resulting
+ * behaviour is deemed safe regardless.
+ *
+ * This macro *does not* affect normal code generation, but is a hint to tooling
+ * that data races here should be ignored.
+ */
+#define data_race(expr) \
+ ({ \
+ typeof(({ expr; })) __val; \
+ kcsan_nestable_atomic_begin(); \
+ __val = ({ expr; }); \
+ kcsan_nestable_atomic_end(); \
+ __val; \
+ })
+#else
+
#endif /* __KERNEL__ */

/*

2020-01-25 08:31:07

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 05/10] READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses

On Thu, Jan 23, 2020 at 03:33:36PM +0000, Will Deacon wrote:
> {READ,WRITE}_ONCE() cannot guarantee atomicity for arbitrary data sizes.
> This can be surprising to callers that might incorrectly be expecting
> atomicity for accesses to aggregate structures, although there are other
> callers where tearing is actually permissable (e.g. if they are using
> something akin to sequence locking to protect the access).
>
> Linus sayeth:
>
> | We could also look at being stricter for the normal READ/WRITE_ONCE(),
> | and require that they are
> |
> | (a) regular integer types
> |
> | (b) fit in an atomic word
> |
> | We actually did (b) for a while, until we noticed that we do it on
> | loff_t's etc and relaxed the rules. But maybe we could have a
> | "non-atomic" version of READ/WRITE_ONCE() that is used for the
> | questionable cases?
>
> The slight snag is that we also have to support 64-bit accesses on 32-bit
> architectures, as these appear to be widespread and tend to work out ok
> if either the architecture supports atomic 64-bit accesses (x86, armv7)
> or if the variable being accesses represents a virtual address and
> therefore only requires 32-bit atomicity in practice.
>
> Take a step in that direction by introducing a variant of
> 'compiletime_assert_atomic_type()' and use it to check the pointer
> argument to {READ,WRITE}_ONCE(). Expose __{READ,WRITE_ONCE}() variants
> which are allowed to tear and convert the two broken callers over to the
> new macros.

The build robot is telling me we also need this for m68k; they have:

arch/m68k/include/asm/page.h:typedef struct { unsigned long pmd[16]; } pmd_t;

Commit 688272809fcce seems to suggest the below is actually wrong tho.

---
diff --git a/mm/gup.c b/mm/gup.c
index 7646bf993b25..62885dad5444 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -320,7 +320,7 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma,
* The READ_ONCE() will stabilize the pmdval in a register or
* on the stack so that it will stop changing under the code.
*/
- pmdval = READ_ONCE(*pmd);
+ pmdval = __READ_ONCE(*pmd);
if (pmd_none(pmdval))
return no_page_table(vma, flags);
if (pmd_huge(pmdval) && vma->vm_flags & VM_HUGETLB) {
@@ -345,7 +345,7 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma,
!is_pmd_migration_entry(pmdval));
if (is_pmd_migration_entry(pmdval))
pmd_migration_entry_wait(mm, pmd);
- pmdval = READ_ONCE(*pmd);
+ pmdval = __READ_ONCE(*pmd);
/*
* MADV_DONTNEED may convert the pmd to null because
* mmap_sem is held in read mode

2020-01-26 01:12:37

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] Rework READ_ONCE() to improve codegen

On 01/23/20 14:01, Arvind Sankar wrote:
> On Thu, Jan 23, 2020 at 10:45:08AM -0800, Nick Desaulniers wrote:
> > On Thu, Jan 23, 2020 at 9:32 AM David Laight <[email protected]> wrote:
> > >
> > > From: Will Deacon
> > > > Sent: 23 January 2020 17:17
> > > >
> > > > I think it depends how much we care about those older compilers. My series
> > > > first moves it to "Good luck mate, you're on your own" and then follows up
> >
> > I wish the actual warning was worded that way. :P
> >
> > > > with a "Let me take that off you it's sharp".
> >
> > > Oh - and I need to find a newer compiler :-(
> >
> > What distro are you using? Does it have a package for a newer
> > compiler? I'm honestly curious about what policies if any the kernel
> > has for supporting developer's toolchains from their distributions.
> > (ie. Arnd usually has pretty good stats what distro's use which
> > version of GCC and are still supported; Do we strive to not break
> > them? Is asking kernel devs to compile their own toolchain too much to
> > ask? Is it still if they're using really old distro's/toolchains that
> > we don't want to support? Do we survey kernel devs about what they're
> > using?). Apologies if this is already documented somewhere, but if
> > not I'd eventually like to brainstorm and write it down somewhere in
> > the tree. Documentation/process/changes.rst doesn't really answer the
> > above questions, I think.
> >
> > --
> > Thanks,
> > ~Nick Desaulniers
>
> Reposting Arnd's link
> https://www.spinics.net/lists/linux-kbuild/msg23648.html

This list seems to be x86 centric? I remember when the switch to GCC 4.6
happened a couple or more archs had to be dropped because they lacked a newer
compiler.

So popular archs would probably have moved quickly, but 'niche' ones might
still be catching up at a slower pace.

--
Qais Yousef

2020-01-27 07:27:28

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] Rework READ_ONCE() to improve codegen

On Sun, Jan 26, 2020 at 2:10 AM Qais Yousef <[email protected]> wrote:
> On 01/23/20 14:01, Arvind Sankar wrote:
> > On Thu, Jan 23, 2020 at 10:45:08AM -0800, Nick Desaulniers wrote:
> > > On Thu, Jan 23, 2020 at 9:32 AM David Laight <[email protected]> wrote:
> >
> > Reposting Arnd's link
> > https://www.spinics.net/lists/linux-kbuild/msg23648.html
>
> This list seems to be x86 centric? I remember when the switch to GCC 4.6
> happened a couple or more archs had to be dropped because they lacked a newer
> compiler.

There are two architectures that already had problems last time:

- unicore32 never had any compiler that shipped with sources, only an ancient
set of gcc binaries that already had problems building the kernel during the
move to gcc-4.6. The maintainer said he'd work on providing support for
modern gcc or clang, but I don't think anything came out of that.

- hexagon had an unmaintained gcc-4.5 port, but internally Qualcomm were
already using clang to build their kernels, which should now work with the
upstream version. I don't think there are any plans to have a more modern
gcc.

Everything else works with mainline gcc now, openrisc and csky were the
last to get added in gcc-9.

Some of the older sub-targets (armv3, s390-g6, powerpcspe) are removed
in gcc-9, but these have a few more years before we need to worry about
them.

Arnd

2020-01-29 10:51:36

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 05/10] READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses

On Sat, Jan 25, 2020 at 09:27:46AM +0100, Peter Zijlstra wrote:
> On Thu, Jan 23, 2020 at 03:33:36PM +0000, Will Deacon wrote:
> > {READ,WRITE}_ONCE() cannot guarantee atomicity for arbitrary data sizes.
> > This can be surprising to callers that might incorrectly be expecting
> > atomicity for accesses to aggregate structures, although there are other
> > callers where tearing is actually permissable (e.g. if they are using
> > something akin to sequence locking to protect the access).
> >
> > Linus sayeth:
> >
> > | We could also look at being stricter for the normal READ/WRITE_ONCE(),
> > | and require that they are
> > |
> > | (a) regular integer types
> > |
> > | (b) fit in an atomic word
> > |
> > | We actually did (b) for a while, until we noticed that we do it on
> > | loff_t's etc and relaxed the rules. But maybe we could have a
> > | "non-atomic" version of READ/WRITE_ONCE() that is used for the
> > | questionable cases?
> >
> > The slight snag is that we also have to support 64-bit accesses on 32-bit
> > architectures, as these appear to be widespread and tend to work out ok
> > if either the architecture supports atomic 64-bit accesses (x86, armv7)
> > or if the variable being accesses represents a virtual address and
> > therefore only requires 32-bit atomicity in practice.
> >
> > Take a step in that direction by introducing a variant of
> > 'compiletime_assert_atomic_type()' and use it to check the pointer
> > argument to {READ,WRITE}_ONCE(). Expose __{READ,WRITE_ONCE}() variants
> > which are allowed to tear and convert the two broken callers over to the
> > new macros.
>
> The build robot is telling me we also need this for m68k; they have:
>
> arch/m68k/include/asm/page.h:typedef struct { unsigned long pmd[16]; } pmd_t;

Fixed that with these patches:

https://lkml.kernel.org/r/[email protected]

2020-01-31 10:21:47

by David Howells

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] Rework READ_ONCE() to improve codegen

Of course, if you're feeling adventurous and willing to at least entertain the
mere speculation of switching the kernel source to C++, we could then use
inline template functions:

template <typename T>
static inline T __READ_ONCE(T &var)
{
T val = *(const volatile T *)&var;
return val;
}

template <typename T>
static inline T READ_ONCE(T &var)
{
T val;
compiletime_assert_rwonce_type(var);
val = __READ_ONCE(var);
smp_read_barrier_depends();
return val;
}

Of course, that would mean using C++...

David

2020-02-10 09:52:03

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] Rework READ_ONCE() to improve codegen

On Fri, Jan 24, 2020 at 5:33 PM Peter Zijlstra <[email protected]> wrote:
>
> On Thu, Jan 23, 2020 at 09:59:03AM -0800, Linus Torvalds wrote:
> > On Thu, Jan 23, 2020 at 7:33 AM Will Deacon <[email protected]> wrote:
> > >
> > > This is version two of the patches I previously posted as an RFC here:
> >
> > Looks fine to me, as far as I can tell,
>
> Awesome, I've picked them up with a target for tip/locking/core.

Were they really picked up?

The MW is closed now, but I do not them in Linus' tree.
I do not see them even in linux-next.




--
Best Regards

Masahiro Yamada

2020-02-10 10:01:37

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] Rework READ_ONCE() to improve codegen

On Mon, Feb 10, 2020 at 06:50:53PM +0900, Masahiro Yamada wrote:
> On Fri, Jan 24, 2020 at 5:33 PM Peter Zijlstra <[email protected]> wrote:
> >
> > On Thu, Jan 23, 2020 at 09:59:03AM -0800, Linus Torvalds wrote:
> > > On Thu, Jan 23, 2020 at 7:33 AM Will Deacon <[email protected]> wrote:
> > > >
> > > > This is version two of the patches I previously posted as an RFC here:
> > >
> > > Looks fine to me, as far as I can tell,
> >
> > Awesome, I've picked them up with a target for tip/locking/core.
>
> Were they really picked up?
>
> The MW is closed now, but I do not them in Linus' tree.
> I do not see them even in linux-next.

We ended up running into build issues with m68k which took quite a bit of
effort to fix, so that meant we missed the merge window. It also seems that
we've now run into similar looking issues for sparc32 :(

Will