Hi all,
This is a follow-up RFC to the discussions we had on the mailing list at
the end of last year:
https://lore.kernel.org/lkml/[email protected]
Unfortunately, we didn't get a "silver bullet" solution out of that
long thread, but I've tried to piece together some of the bits and
pieces we discussed and I've ended up with this series, which does at
least solve the pressing problem with the bitops for arm64.
The rough summary of the series is:
* Drop the GCC 4.8 workarounds, so that READ_ONCE() is a
straightforward dereference of a cast-to-volatile pointer.
* Require that the access is either 1, 2, 4 or 8 bytes in size
(even 32-bit architectures tend to use 8-byte accesses here).
* Introduce __READ_ONCE() for tearing operations with no size
restriction.
* Drop pointer qualifiers from scalar types, so that volatile scalars
don't generate horrible stack-spilling mess. This is pretty ugly,
but it's also mechanical and wrapped up in a macro.
* Convert acquire/release accessors to perform the same qualifier
stripping.
I gave up trying to prevent READ_ONCE() on aggregates because it is
pervasive, particularly within the mm/ layer on things like pmd_t.
Thankfully, these don't tend to be volatile.
I have more patches in this area because I'm trying to move all the
read_barrier_depends() magic into arch/alpha/, but I'm holding off until
we agree on this part first.
Cheers,
Will
Cc: Michael Ellerman <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Segher Boessenkool <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Luc Van Oostenryck <[email protected]>
Cc: Arnd Bergmann <[email protected]>
--->8
Will Deacon (8):
compiler/gcc: Emit build-time warning for GCC prior to version 4.8
netfilter: Avoid assigning 'const' pointer to non-const pointer
fault_inject: Don't rely on "return value" from WRITE_ONCE()
READ_ONCE: Simplify implementations of {READ,WRITE}_ONCE()
READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses
READ_ONCE: Drop pointer qualifiers when reading from scalar types
locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros
arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release
macros
arch/arm64/include/asm/barrier.h | 16 ++--
drivers/xen/time.c | 2 +-
include/asm-generic/barrier.h | 16 ++--
include/linux/compiler-gcc.h | 4 +
include/linux/compiler.h | 129 +++++++++++++------------------
include/linux/compiler_types.h | 15 ++++
lib/fault-inject.c | 4 +-
net/netfilter/core.c | 2 +-
net/xdp/xsk_queue.h | 2 +-
9 files changed, 93 insertions(+), 97 deletions(-)
--
2.25.0.rc1.283.g88dfdc4193-goog
Prior to version 4.8, GCC may miscompile READ_ONCE() by erroneously
discarding the 'volatile' qualifier:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145
We've been working around this using some nasty hacks which make
READ_ONCE() both horribly complicated and also prevent us from enforcing
that it is only used on scalar types. Since GCC 4.8 is pretty old for
kernel builds now, emit a warning if we detect it during the build.
Suggested-by: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
include/linux/compiler-gcc.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index d7ee4c6bad48..62afe874073e 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -14,6 +14,10 @@
# error Sorry, your compiler is too old - please upgrade it.
#endif
+#if GCC_VERSION < 40800
+# warning Your compiler is old and may miscompile the kernel due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145 - please upgrade it.
+#endif
+
/* Optimization barrier */
/* The "volatile" is due to gcc bugs */
--
2.25.0.rc1.283.g88dfdc4193-goog
It's a bit weird that WRITE_ONCE() evaluates to the value it stores and
it's different to smp_store_release(), which can't be used this way.
In preparation for preventing this in WRITE_ONCE(), change the fault
injection code to use a local variable instead.
Cc: Akinobu Mita <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
lib/fault-inject.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/lib/fault-inject.c b/lib/fault-inject.c
index 8186ca84910b..ce12621b4275 100644
--- a/lib/fault-inject.c
+++ b/lib/fault-inject.c
@@ -106,7 +106,9 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
unsigned int fail_nth = READ_ONCE(current->fail_nth);
if (fail_nth) {
- if (!WRITE_ONCE(current->fail_nth, fail_nth - 1))
+ fail_nth--;
+ WRITE_ONCE(current->fail_nth, fail_nth);
+ if (!fail_nth)
goto fail;
return false;
--
2.25.0.rc1.283.g88dfdc4193-goog
Passing a volatile-qualified pointer to READ_ONCE() is an absolute
trainwreck for code generation: the use of 'typeof()' to define a
temporary variable inside the macro means that the final evaluation in
macro scope ends up forcing a read back from the stack. When stack
protector is enabled (the default for arm64, at least), this causes
the compiler to vomit up all sorts of junk.
Unfortunately, dropping pointer qualifiers inside the macro poses quite
a challenge, especially since the pointed-to type is permitted to be an
aggregate, and this is relied upon by mm/ code accessing things like
'pmd_t'. Based on numerous hacks and discussions on the mailing list,
this is the best I've managed to come up with.
Introduce '__unqual_scalar_typeof()' which takes an expression and, if
the expression is an optionally qualified 8, 16, 32 or 64-bit scalar
type, evaluates to the unqualified type. Other input types, including
aggregates, remain unchanged. Hopefully READ_ONCE() on volatile aggregate
pointers isn't something we do on a fast-path.
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Reported-by: Michael Ellerman <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
include/linux/compiler.h | 6 +++---
include/linux/compiler_types.h | 15 +++++++++++++++
2 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 863180641336..d3491fd44c19 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -203,13 +203,13 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
* atomicity or dependency ordering guarantees. Note that this may result
* in tears!
*/
-#define __READ_ONCE(x) (*(volatile typeof(x) *)&(x))
+#define __READ_ONCE(x) (*(volatile __unqual_scalar_typeof(x) *)&(x))
#define __READ_ONCE_SCALAR(x) \
({ \
- typeof(x) __x = __READ_ONCE(x); \
+ __unqual_scalar_typeof(x) __x = __READ_ONCE(x); \
smp_read_barrier_depends(); \
- __x; \
+ (typeof(x))__x; \
})
/*
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 72393a8c1a6c..21e7859a356f 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -219,6 +219,21 @@ struct ftrace_likely_data {
/* Are two types/vars the same type (ignoring qualifiers)? */
#define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
+/* Declare an unqualified scalar type. Leaves non-scalar types unchanged. */
+#define __unqual_scalar_typeof(x) typeof( \
+ __builtin_choose_expr(__same_type(x, (__s8)0), (__s8)0, \
+ __builtin_choose_expr(__same_type(x, (__u8)0), (__u8)0, \
+ __builtin_choose_expr(__same_type(x, (__s16)0), (__s16)0, \
+ __builtin_choose_expr(__same_type(x, (__u16)0), (__u16)0, \
+ __builtin_choose_expr(__same_type(x, (__s32)0), (__s32)0, \
+ __builtin_choose_expr(__same_type(x, (__u32)0), (__u32)0, \
+ __builtin_choose_expr(__same_type(x, 0L), 0L, \
+ __builtin_choose_expr(__same_type(x, 0UL), 0UL, \
+ __builtin_choose_expr(__same_type(x, 0LL), 0LL, \
+ __builtin_choose_expr(__same_type(x, 0ULL), 0ULL, \
+ (x) \
+)))))))))))
+
/* Is this type a native word size -- useful for atomic operations */
#define __native_word(t) \
(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
--
2.25.0.rc1.283.g88dfdc4193-goog
{READ,WRITE}_ONCE() cannot guarantee atomicity for arbitrary data sizes.
This can be surprising to callers that might incorrectly be expecting
atomicity for accesses to aggregate structures, although there are other
callers where tearing is actually permissable (e.g. if they are using
something akin to sequence locking to protect the access).
Linus sayeth:
| We could also look at being stricter for the normal READ/WRITE_ONCE(),
| and require that they are
|
| (a) regular integer types
|
| (b) fit in an atomic word
|
| We actually did (b) for a while, until we noticed that we do it on
| loff_t's etc and relaxed the rules. But maybe we could have a
| "non-atomic" version of READ/WRITE_ONCE() that is used for the
| questionable cases?
The slight snag is that we also have to support 64-bit accesses on 32-bit
architectures, as these appear to be widespread and tend to work out ok
if either the architecture supports atomic 64-bit accesses (x86, armv7)
or if the variable being accesses represents a virtual address and
therefore only requires 32-bit atomicity in practice.
Take a step in that direction by introducing a variant of
'compiletime_assert_atomic_type()' and use it to check the pointer
argument to {READ,WRITE}_ONCE(). Expose __{READ,WRITE_ONCE}() variants
which are allowed to tear and convert the two broken callers over to the
new macros.
Suggested-by: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
drivers/xen/time.c | 2 +-
include/linux/compiler.h | 37 +++++++++++++++++++++++++++++++++----
net/xdp/xsk_queue.h | 2 +-
3 files changed, 35 insertions(+), 6 deletions(-)
diff --git a/drivers/xen/time.c b/drivers/xen/time.c
index 0968859c29d0..108edbcbc040 100644
--- a/drivers/xen/time.c
+++ b/drivers/xen/time.c
@@ -64,7 +64,7 @@ static void xen_get_runstate_snapshot_cpu_delta(
do {
state_time = get64(&state->state_entry_time);
rmb(); /* Hypervisor might update data. */
- *res = READ_ONCE(*state);
+ *res = __READ_ONCE(*state);
rmb(); /* Hypervisor might update data. */
} while (get64(&state->state_entry_time) != state_time ||
(state_time & XEN_RUNSTATE_UPDATE));
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 44974d658f30..863180641336 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -198,24 +198,43 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
#include <asm/barrier.h>
#include <linux/kasan-checks.h>
+/*
+ * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
+ * atomicity or dependency ordering guarantees. Note that this may result
+ * in tears!
+ */
+#define __READ_ONCE(x) (*(volatile typeof(x) *)&(x))
+
+#define __READ_ONCE_SCALAR(x) \
+({ \
+ typeof(x) __x = __READ_ONCE(x); \
+ smp_read_barrier_depends(); \
+ __x; \
+})
+
/*
* Use READ_ONCE_NOCHECK() instead of READ_ONCE() if you need
* to hide memory access from KASAN.
*/
#define READ_ONCE_NOCHECK(x) \
({ \
- typeof(x) __x = *(volatile typeof(x) *)&(x); \
- smp_read_barrier_depends(); \
- __x; \
+ compiletime_assert_rwonce_type(x); \
+ __READ_ONCE_SCALAR(x); \
})
#define READ_ONCE(x) READ_ONCE_NOCHECK(x)
-#define WRITE_ONCE(x, val) \
+#define __WRITE_ONCE(x, val) \
do { \
*(volatile typeof(x) *)&(x) = (val); \
} while (0)
+#define WRITE_ONCE(x, val) \
+do { \
+ compiletime_assert_rwonce_type(x); \
+ __WRITE_ONCE(x, val); \
+} while (0)
+
#ifdef CONFIG_KASAN
/*
* We can't declare function 'inline' because __no_sanitize_address conflicts
@@ -299,6 +318,16 @@ static inline void *offset_to_ptr(const int *off)
compiletime_assert(__native_word(t), \
"Need native word sized stores/loads for atomicity.")
+/*
+ * Yes, this permits 64-bit accesses on 32-bit architectures. These will
+ * actually be atomic in many cases (namely x86), but for others we rely on
+ * the access being split into 2x32-bit accesses for a 32-bit quantity (e.g.
+ * a virtual address) and a strong prevailing wind.
+ */
+#define compiletime_assert_rwonce_type(t) \
+ compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
+ "Unsupported access size for {READ,WRITE}_ONCE().")
+
/* &a[0] degrades to a pointer: a different type from an array */
#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index eddae4688862..2b55c1c7b2b6 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -304,7 +304,7 @@ static inline struct xdp_desc *xskq_validate_desc(struct xsk_queue *q,
struct xdp_rxtx_ring *ring = (struct xdp_rxtx_ring *)q->ring;
unsigned int idx = q->cons_tail & q->ring_mask;
- *desc = READ_ONCE(ring->desc[idx]);
+ *desc = __READ_ONCE(ring->desc[idx]);
if (xskq_is_valid_desc(q, desc, umem))
return desc;
--
2.25.0.rc1.283.g88dfdc4193-goog
Passing volatile-qualified pointers to the arm64 implementations of the
load-acquire/store-release macros results in a re-load from the stack
and a bunch of associated stack-protector churn due to the temporary
result variable inheriting the volatile semantics thanks to the use of
'typeof()'.
Define these temporary variables using 'unqual_scalar_typeof' to drop
the volatile qualifier in the case that they are scalar types.
Cc: Mark Rutland <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/include/asm/barrier.h | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index 7d9cc5ec4971..fb4c27506ef4 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -76,8 +76,8 @@ static inline unsigned long array_index_mask_nospec(unsigned long idx,
#define __smp_store_release(p, v) \
do { \
typeof(p) __p = (p); \
- union { typeof(*p) __val; char __c[1]; } __u = \
- { .__val = (__force typeof(*p)) (v) }; \
+ union { __unqual_scalar_typeof(*p) __val; char __c[1]; } __u = \
+ { .__val = (__force __unqual_scalar_typeof(*p)) (v) }; \
compiletime_assert_atomic_type(*p); \
kasan_check_write(__p, sizeof(*p)); \
switch (sizeof(*p)) { \
@@ -110,7 +110,7 @@ do { \
#define __smp_load_acquire(p) \
({ \
- union { typeof(*p) __val; char __c[1]; } __u; \
+ union { __unqual_scalar_typeof(*p) __val; char __c[1]; } __u; \
typeof(p) __p = (p); \
compiletime_assert_atomic_type(*p); \
kasan_check_read(__p, sizeof(*p)); \
@@ -136,33 +136,33 @@ do { \
: "Q" (*__p) : "memory"); \
break; \
} \
- __u.__val; \
+ (typeof(*p))__u.__val; \
})
#define smp_cond_load_relaxed(ptr, cond_expr) \
({ \
typeof(ptr) __PTR = (ptr); \
- typeof(*ptr) VAL; \
+ __unqual_scalar_typeof(*ptr) VAL; \
for (;;) { \
VAL = READ_ONCE(*__PTR); \
if (cond_expr) \
break; \
__cmpwait_relaxed(__PTR, VAL); \
} \
- VAL; \
+ (typeof(*ptr))VAL; \
})
#define smp_cond_load_acquire(ptr, cond_expr) \
({ \
typeof(ptr) __PTR = (ptr); \
- typeof(*ptr) VAL; \
+ __unqual_scalar_typeof(*ptr) VAL; \
for (;;) { \
VAL = smp_load_acquire(__PTR); \
if (cond_expr) \
break; \
__cmpwait_relaxed(__PTR, VAL); \
} \
- VAL; \
+ (typeof(*ptr))VAL; \
})
#include <asm-generic/barrier.h>
--
2.25.0.rc1.283.g88dfdc4193-goog
Passing volatile-qualified pointers to the asm-generic implementations of
the load-acquire macros results in a re-load from the stack due to the
temporary result variable inheriting the volatile semantics thanks to the
use of 'typeof()'.
Define these temporary variables using 'unqual_scalar_typeof' to drop
the volatile qualifier in the case that they are scalar types.
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
include/asm-generic/barrier.h | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 85b28eb80b11..2eacaf7d62f6 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -128,10 +128,10 @@ do { \
#ifndef __smp_load_acquire
#define __smp_load_acquire(p) \
({ \
- typeof(*p) ___p1 = READ_ONCE(*p); \
+ __unqual_scalar_typeof(*p) ___p1 = READ_ONCE(*p); \
compiletime_assert_atomic_type(*p); \
__smp_mb(); \
- ___p1; \
+ (typeof(*p))___p1; \
})
#endif
@@ -183,10 +183,10 @@ do { \
#ifndef smp_load_acquire
#define smp_load_acquire(p) \
({ \
- typeof(*p) ___p1 = READ_ONCE(*p); \
+ __unqual_scalar_typeof(*p) ___p1 = READ_ONCE(*p); \
compiletime_assert_atomic_type(*p); \
barrier(); \
- ___p1; \
+ (typeof(*p))___p1; \
})
#endif
@@ -229,14 +229,14 @@ do { \
#ifndef smp_cond_load_relaxed
#define smp_cond_load_relaxed(ptr, cond_expr) ({ \
typeof(ptr) __PTR = (ptr); \
- typeof(*ptr) VAL; \
+ __unqual_scalar_typeof(*ptr) VAL; \
for (;;) { \
VAL = READ_ONCE(*__PTR); \
if (cond_expr) \
break; \
cpu_relax(); \
} \
- VAL; \
+ (typeof(*ptr))VAL; \
})
#endif
@@ -250,10 +250,10 @@ do { \
*/
#ifndef smp_cond_load_acquire
#define smp_cond_load_acquire(ptr, cond_expr) ({ \
- typeof(*ptr) _val; \
+ __unqual_scalar_typeof(*ptr) _val; \
_val = smp_cond_load_relaxed(ptr, cond_expr); \
smp_acquire__after_ctrl_dep(); \
- _val; \
+ (typeof(*ptr))_val; \
})
#endif
--
2.25.0.rc1.283.g88dfdc4193-goog
The implementations of {READ,WRITE}_ONCE() suffer from a significant
amount of indirection and complexity due to a historic GCC bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145
which was originally worked around by 230fa253df63 ("kernel: Provide
READ_ONCE and ASSIGN_ONCE").
Since GCC 4.8 is fairly vintage at this point and we emit a warning if
we detect it during the build, return {READ,WRITE}_ONCE() to their former
glory with an implementation that is easier to understand and, crucially,
more amenable to optimisation. A side effect of this simplification is
that WRITE_ONCE() no longer returns a value, but nobody seems to be
relying on that and the new behaviour is aligned with smp_store_release().
Suggested-by: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
include/linux/compiler.h | 104 ++++++++++-----------------------------
1 file changed, 25 insertions(+), 79 deletions(-)
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 5e88e7e33abe..44974d658f30 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -177,60 +177,6 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
# define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __LINE__)
#endif
-#include <uapi/linux/types.h>
-
-#define __READ_ONCE_SIZE \
-({ \
- switch (size) { \
- case 1: *(__u8 *)res = *(volatile __u8 *)p; break; \
- case 2: *(__u16 *)res = *(volatile __u16 *)p; break; \
- case 4: *(__u32 *)res = *(volatile __u32 *)p; break; \
- case 8: *(__u64 *)res = *(volatile __u64 *)p; break; \
- default: \
- barrier(); \
- __builtin_memcpy((void *)res, (const void *)p, size); \
- barrier(); \
- } \
-})
-
-static __always_inline
-void __read_once_size(const volatile void *p, void *res, int size)
-{
- __READ_ONCE_SIZE;
-}
-
-#ifdef CONFIG_KASAN
-/*
- * We can't declare function 'inline' because __no_sanitize_address confilcts
- * with inlining. Attempt to inline it may cause a build failure.
- * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
- * '__maybe_unused' allows us to avoid defined-but-not-used warnings.
- */
-# define __no_kasan_or_inline __no_sanitize_address notrace __maybe_unused
-#else
-# define __no_kasan_or_inline __always_inline
-#endif
-
-static __no_kasan_or_inline
-void __read_once_size_nocheck(const volatile void *p, void *res, int size)
-{
- __READ_ONCE_SIZE;
-}
-
-static __always_inline void __write_once_size(volatile void *p, void *res, int size)
-{
- switch (size) {
- case 1: *(volatile __u8 *)p = *(__u8 *)res; break;
- case 2: *(volatile __u16 *)p = *(__u16 *)res; break;
- case 4: *(volatile __u32 *)p = *(__u32 *)res; break;
- case 8: *(volatile __u64 *)p = *(__u64 *)res; break;
- default:
- barrier();
- __builtin_memcpy((void *)p, (const void *)res, size);
- barrier();
- }
-}
-
/*
* Prevent the compiler from merging or refetching reads or writes. The
* compiler is also forbidden from reordering successive instances of
@@ -240,11 +186,7 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
* statements.
*
* These two macros will also work on aggregate data types like structs or
- * unions. If the size of the accessed data type exceeds the word size of
- * the machine (e.g., 32 bits or 64 bits) READ_ONCE() and WRITE_ONCE() will
- * fall back to memcpy(). There's at least two memcpy()s: one for the
- * __builtin_memcpy() and then one for the macro doing the copy of variable
- * - '__u' allocated on the stack.
+ * unions.
*
* Their two major use cases are: (1) Mediating communication between
* process-level code and irq/NMI handlers, all running on the same CPU,
@@ -256,23 +198,35 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
#include <asm/barrier.h>
#include <linux/kasan-checks.h>
-#define __READ_ONCE(x, check) \
+/*
+ * Use READ_ONCE_NOCHECK() instead of READ_ONCE() if you need
+ * to hide memory access from KASAN.
+ */
+#define READ_ONCE_NOCHECK(x) \
({ \
- union { typeof(x) __val; char __c[1]; } __u; \
- if (check) \
- __read_once_size(&(x), __u.__c, sizeof(x)); \
- else \
- __read_once_size_nocheck(&(x), __u.__c, sizeof(x)); \
- smp_read_barrier_depends(); /* Enforce dependency ordering from x */ \
- __u.__val; \
+ typeof(x) __x = *(volatile typeof(x) *)&(x); \
+ smp_read_barrier_depends(); \
+ __x; \
})
-#define READ_ONCE(x) __READ_ONCE(x, 1)
+#define READ_ONCE(x) READ_ONCE_NOCHECK(x)
+
+#define WRITE_ONCE(x, val) \
+do { \
+ *(volatile typeof(x) *)&(x) = (val); \
+} while (0)
+
+#ifdef CONFIG_KASAN
/*
- * Use READ_ONCE_NOCHECK() instead of READ_ONCE() if you need
- * to hide memory access from KASAN.
+ * We can't declare function 'inline' because __no_sanitize_address conflicts
+ * with inlining. Attempt to inline it may cause a build failure.
+ * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
+ * '__maybe_unused' allows us to avoid defined-but-not-used warnings.
*/
-#define READ_ONCE_NOCHECK(x) __READ_ONCE(x, 0)
+# define __no_kasan_or_inline __no_sanitize_address notrace __maybe_unused
+#else
+# define __no_kasan_or_inline __always_inline
+#endif
static __no_kasan_or_inline
unsigned long read_word_at_a_time(const void *addr)
@@ -281,14 +235,6 @@ unsigned long read_word_at_a_time(const void *addr)
return *(unsigned long *)addr;
}
-#define WRITE_ONCE(x, val) \
-({ \
- union { typeof(x) __val; char __c[1]; } __u = \
- { .__val = (__force typeof(x)) (val) }; \
- __write_once_size(&(x), __u.__c, sizeof(x)); \
- __u.__val; \
-})
-
#endif /* __KERNEL__ */
/*
--
2.25.0.rc1.283.g88dfdc4193-goog
nf_remove_net_hook() uses WRITE_ONCE() to assign a 'const pointer to a
'non-const' pointer. Cleanups to the implementation of WRITE_ONCE() mean
that this will give rise to a compiler warning, just like a plain old
assignment would do:
| In file included from ./include/linux/export.h:43,
| from ./include/linux/linkage.h:7,
| from ./include/linux/kernel.h:8,
| from net/netfilter/core.c:9:
| net/netfilter/core.c: In function ‘nf_remove_net_hook’:
| ./include/linux/compiler.h:216:30: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
| *(volatile typeof(x) *)&(x) = (val); \
| ^
| net/netfilter/core.c:379:3: note: in expansion of macro ‘WRITE_ONCE’
| WRITE_ONCE(orig_ops[i], &dummy_ops);
| ^~~~~~~~~~
Follow the pattern used elsewhere in this file and add a cast to 'void *'
to squash the warning.
Cc: Pablo Neira Ayuso <[email protected]>
Cc: Jozsef Kadlecsik <[email protected]>
Cc: Florian Westphal <[email protected]>
Cc: "David S. Miller" <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
net/netfilter/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 78f046ec506f..3ac7c8c1548d 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -376,7 +376,7 @@ static bool nf_remove_net_hook(struct nf_hook_entries *old,
if (orig_ops[i] != unreg)
continue;
WRITE_ONCE(old->hooks[i].hook, accept_all);
- WRITE_ONCE(orig_ops[i], &dummy_ops);
+ WRITE_ONCE(orig_ops[i], (void *)&dummy_ops);
return true;
}
--
2.25.0.rc1.283.g88dfdc4193-goog
On Fri, Jan 10, 2020 at 5:56 PM Will Deacon <[email protected]> wrote:
>
> Prior to version 4.8, GCC may miscompile READ_ONCE() by erroneously
> discarding the 'volatile' qualifier:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145
>
> We've been working around this using some nasty hacks which make
> READ_ONCE() both horribly complicated and also prevent us from enforcing
> that it is only used on scalar types. Since GCC 4.8 is pretty old for
> kernel builds now, emit a warning if we detect it during the build.
No objection to recommending gcc-4.8, but I think this should either
just warn once during the kernel build instead of for every file, or
it should become a hard requirement.
Arnd
On Fri, Jan 10, 2020 at 04:56:36PM +0000, Will Deacon wrote:
> Passing volatile-qualified pointers to the arm64 implementations of the
> load-acquire/store-release macros results in a re-load from the stack
> and a bunch of associated stack-protector churn due to the temporary
> result variable inheriting the volatile semantics thanks to the use of
> 'typeof()'.
>
> Define these temporary variables using 'unqual_scalar_typeof' to drop
> the volatile qualifier in the case that they are scalar types.
>
> Cc: Mark Rutland <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Arnd Bergmann <[email protected]>
> Signed-off-by: Will Deacon <[email protected]>
Based on my understanding of __unqual_scalar_typeof(), these changes
look sound to me:
Acked-by: Mark Rutland <[email protected]>
Mark.
> ---
> arch/arm64/include/asm/barrier.h | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> index 7d9cc5ec4971..fb4c27506ef4 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h
> @@ -76,8 +76,8 @@ static inline unsigned long array_index_mask_nospec(unsigned long idx,
> #define __smp_store_release(p, v) \
> do { \
> typeof(p) __p = (p); \
> - union { typeof(*p) __val; char __c[1]; } __u = \
> - { .__val = (__force typeof(*p)) (v) }; \
> + union { __unqual_scalar_typeof(*p) __val; char __c[1]; } __u = \
> + { .__val = (__force __unqual_scalar_typeof(*p)) (v) }; \
> compiletime_assert_atomic_type(*p); \
> kasan_check_write(__p, sizeof(*p)); \
> switch (sizeof(*p)) { \
> @@ -110,7 +110,7 @@ do { \
>
> #define __smp_load_acquire(p) \
> ({ \
> - union { typeof(*p) __val; char __c[1]; } __u; \
> + union { __unqual_scalar_typeof(*p) __val; char __c[1]; } __u; \
> typeof(p) __p = (p); \
> compiletime_assert_atomic_type(*p); \
> kasan_check_read(__p, sizeof(*p)); \
> @@ -136,33 +136,33 @@ do { \
> : "Q" (*__p) : "memory"); \
> break; \
> } \
> - __u.__val; \
> + (typeof(*p))__u.__val; \
> })
>
> #define smp_cond_load_relaxed(ptr, cond_expr) \
> ({ \
> typeof(ptr) __PTR = (ptr); \
> - typeof(*ptr) VAL; \
> + __unqual_scalar_typeof(*ptr) VAL; \
> for (;;) { \
> VAL = READ_ONCE(*__PTR); \
> if (cond_expr) \
> break; \
> __cmpwait_relaxed(__PTR, VAL); \
> } \
> - VAL; \
> + (typeof(*ptr))VAL; \
> })
>
> #define smp_cond_load_acquire(ptr, cond_expr) \
> ({ \
> typeof(ptr) __PTR = (ptr); \
> - typeof(*ptr) VAL; \
> + __unqual_scalar_typeof(*ptr) VAL; \
> for (;;) { \
> VAL = smp_load_acquire(__PTR); \
> if (cond_expr) \
> break; \
> __cmpwait_relaxed(__PTR, VAL); \
> } \
> - VAL; \
> + (typeof(*ptr))VAL; \
> })
>
> #include <asm-generic/barrier.h>
> --
> 2.25.0.rc1.283.g88dfdc4193-goog
>
On Fri, 2020-01-10 at 18:35 +0100, Arnd Bergmann wrote:
> On Fri, Jan 10, 2020 at 5:56 PM Will Deacon <[email protected]> wrote:
> > Prior to version 4.8, GCC may miscompile READ_ONCE() by erroneously
> > discarding the 'volatile' qualifier:
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145
> >
> > We've been working around this using some nasty hacks which make
> > READ_ONCE() both horribly complicated and also prevent us from enforcing
> > that it is only used on scalar types. Since GCC 4.8 is pretty old for
> > kernel builds now, emit a warning if we detect it during the build.
>
> No objection to recommending gcc-4.8, but I think this should either
> just warn once during the kernel build instead of for every file, or
> it should become a hard requirement.
It might as well be a hard requirement as
gcc 4.8.0 is already nearly 7 years old.
gcc 4.6.0 released 2011-03-25
gcc 4.8.0 released 2013-03-22
Perhaps there are exceedingly few to zero new
instances using gcc compiler versions < 4.8
On Fri, Jan 10, 2020 at 5:56 PM Will Deacon <[email protected]> wrote:
>
> Hi all,
>
> This is a follow-up RFC to the discussions we had on the mailing list at
> the end of last year:
>
> https://lore.kernel.org/lkml/[email protected]
>
> Unfortunately, we didn't get a "silver bullet" solution out of that
> long thread, but I've tried to piece together some of the bits and
> pieces we discussed and I've ended up with this series, which does at
> least solve the pressing problem with the bitops for arm64.
>
> The rough summary of the series is:
>
> * Drop the GCC 4.8 workarounds, so that READ_ONCE() is a
> straightforward dereference of a cast-to-volatile pointer.
>
> * Require that the access is either 1, 2, 4 or 8 bytes in size
> (even 32-bit architectures tend to use 8-byte accesses here).
>
> * Introduce __READ_ONCE() for tearing operations with no size
> restriction.
>
> * Drop pointer qualifiers from scalar types, so that volatile scalars
> don't generate horrible stack-spilling mess. This is pretty ugly,
> but it's also mechanical and wrapped up in a macro.
>
> * Convert acquire/release accessors to perform the same qualifier
> stripping.
>
> I gave up trying to prevent READ_ONCE() on aggregates because it is
> pervasive, particularly within the mm/ layer on things like pmd_t.
> Thankfully, these don't tend to be volatile.
>
> I have more patches in this area because I'm trying to move all the
> read_barrier_depends() magic into arch/alpha/, but I'm holding off until
> we agree on this part first.
Looks very nice overall, thanks for working on this.
I've added a the series into my randconfig build setup to see
if I run into build-time regressions. Unfortunately there are some
conflicts with the kcsan patches in linux-next that I have to work
around first.
Arnd
On Fri, Jan 10, 2020 at 8:56 AM Will Deacon <[email protected]> wrote:
>
> +/* Declare an unqualified scalar type. Leaves non-scalar types unchanged. */
> +#define __unqual_scalar_typeof(x) typeof( \
Ugh. My eyes. That's horrendous.
I can't see any better alternatives, but it does make me go "Eww".
Well, I do see one possible alternative: just re-write the bitop
implementations in terms of "atomic_long_t", and just avoid the issue
entirely.
IOW, do something like the attached (but fleshed out and tested - this
patch has not seen a compiler, much less any thought at all).
Linus
On Fri, Jan 10, 2020 at 5:56 PM Will Deacon <[email protected]> wrote:
> +/*
> + * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
> + * atomicity or dependency ordering guarantees. Note that this may result
> + * in tears!
> + */
> +#define __READ_ONCE(x) (*(volatile typeof(x) *)&(x))
> +
This probably allows writing
extern int i;
__READ_ONCE(i) = 1;
and not get a warning for it. How about also casting to 'const'?
Arnd
On Fri, Jan 10, 2020 at 5:57 PM Will Deacon <[email protected]> wrote:
> @@ -128,10 +128,10 @@ do { \
> #ifndef __smp_load_acquire
> #define __smp_load_acquire(p) \
> ({ \
> - typeof(*p) ___p1 = READ_ONCE(*p); \
> + __unqual_scalar_typeof(*p) ___p1 = READ_ONCE(*p); \
> compiletime_assert_atomic_type(*p); \
> __smp_mb(); \
> - ___p1; \
> + (typeof(*p))___p1; \
> })
Doesn't that last (typeof(*p))___p1 mean you put the potential
'volatile' back on the assignment after you went through the
effort of taking it out?
Arnd
On Fri, Jan 10, 2020 at 5:56 PM Will Deacon <[email protected]> wrote:
>
> I have more patches in this area because I'm trying to move all the
> read_barrier_depends() magic into arch/alpha/, but I'm holding off until
> we agree on this part first.
Isn't the read_barrier_depends() the only reason for actually needing
the temporary local variable that must not be volatile?
If you make alpha provide its own READ_ONCE() as the first
step, it would seem that the rest of the series gets much easier
as the others can go back to the simple statement from your
#define __READ_ONCE(x) (*(volatile __unqual_scalar_typeof(x) *)&(x))
Arnd
On Fri, Jan 10, 2020 at 11:47 AM Arnd Bergmann <[email protected]> wrote:
>
> Isn't the read_barrier_depends() the only reason for actually needing
> the temporary local variable that must not be volatile?
>
> If you make alpha provide its own READ_ONCE() as the first
> step, it would seem that the rest of the series gets much easier
> as the others can go back to the simple statement from your
Hmm.. The union still would cause that "take the address of a volatile
thing on the stack" problem, wouldn't it? And that was what caused
most of the issues.
I think the _real_ issue is how KASAN forces that odd pair of inline
functions in order to have the annotations on the accesses.
Linus
From: Will Deacon
> Sent: 10 January 2020 16:56
...
> The rough summary of the series is:
>
> * Drop the GCC 4.8 workarounds, so that READ_ONCE() is a
> straightforward dereference of a cast-to-volatile pointer.
What is the effect of using the changed code on older compilers (esp. x86-64)?
Requiring gcc > 4.8 is a significant bump.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On 10.01.20 17:56, Will Deacon wrote:
> Hi all,
>
> This is a follow-up RFC to the discussions we had on the mailing list at
> the end of last year:
>
> https://lore.kernel.org/lkml/[email protected]
>
> Unfortunately, we didn't get a "silver bullet" solution out of that
> long thread, but I've tried to piece together some of the bits and
> pieces we discussed and I've ended up with this series, which does at
> least solve the pressing problem with the bitops for arm64.
>
> The rough summary of the series is:
>
> * Drop the GCC 4.8 workarounds, so that READ_ONCE() is a
> straightforward dereference of a cast-to-volatile pointer.
>
> * Require that the access is either 1, 2, 4 or 8 bytes in size
> (even 32-bit architectures tend to use 8-byte accesses here).
>
> * Introduce __READ_ONCE() for tearing operations with no size
> restriction.
>
> * Drop pointer qualifiers from scalar types, so that volatile scalars
> don't generate horrible stack-spilling mess. This is pretty ugly,
> but it's also mechanical and wrapped up in a macro.
>
> * Convert acquire/release accessors to perform the same qualifier
> stripping.
>
> I gave up trying to prevent READ_ONCE() on aggregates because it is
> pervasive, particularly within the mm/ layer on things like pmd_t.
> Thankfully, these don't tend to be volatile.
>
> I have more patches in this area because I'm trying to move all the
> read_barrier_depends() magic into arch/alpha/, but I'm holding off until
> we agree on this part first.
>
> Cheers,
>
> Will
>
> Cc: Michael Ellerman <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Cc: Segher Boessenkool <[email protected]>
> Cc: Christian Borntraeger <[email protected]>
> Cc: Luc Van Oostenryck <[email protected]>
> Cc: Arnd Bergmann <[email protected]>
Looks sane on s390. I also checked that the problematic sequence in
arch/s390/kvm/gaccess.c is not miscompiled (the binary code for the
ipte_lock function is almost the same, just different addresses due
to a different start address.)
The kernel seems to get slighly larger though.
Mostly due to different inlining decisions it seems.
Total: Before=14133361, After=14135643, chg +0.02%
On Fri, Jan 10, 2020 at 9:15 PM Linus Torvalds
<[email protected]> wrote:
>
> On Fri, Jan 10, 2020 at 11:47 AM Arnd Bergmann <[email protected]> wrote:
> >
> > Isn't the read_barrier_depends() the only reason for actually needing
> > the temporary local variable that must not be volatile?
> >
> > If you make alpha provide its own READ_ONCE() as the first
> > step, it would seem that the rest of the series gets much easier
> > as the others can go back to the simple statement from your
>
> Hmm.. The union still would cause that "take the address of a volatile
> thing on the stack" problem, wouldn't it? And that was what caused
> most of the issues.
Ah, I was missing that there is still the union in smp_load_acquire(),
I only saw that the one in READ_ONCE() is needed only on alpha.
The number of files using smp_load_acquire() is fairly small though,
so we could consider changing it to pass both source and destination
as macro arguments and use typeof(dest) instad of typeof(source)
to avoid the volatile pointer access.
> I think the _real_ issue is how KASAN forces that odd pair of inline
> functions in order to have the annotations on the accesses.
But the inline functions (I assume you mean __write_once_size
and __read_once_size_nocheck?) are completely removed after
Will's series, so those no longer cause harm, right?
Arnd
[+Masahiro]
On Fri, Jan 10, 2020 at 06:35:02PM +0100, Arnd Bergmann wrote:
> On Fri, Jan 10, 2020 at 5:56 PM Will Deacon <[email protected]> wrote:
> >
> > Prior to version 4.8, GCC may miscompile READ_ONCE() by erroneously
> > discarding the 'volatile' qualifier:
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145
> >
> > We've been working around this using some nasty hacks which make
> > READ_ONCE() both horribly complicated and also prevent us from enforcing
> > that it is only used on scalar types. Since GCC 4.8 is pretty old for
> > kernel builds now, emit a warning if we detect it during the build.
>
> No objection to recommending gcc-4.8, but I think this should either
> just warn once during the kernel build instead of for every file, or
> it should become a hard requirement.
Oops, yes, good point. I blindly followed the '< 4.6' check, but that's
actually a '#error' so it doesn't have the issue you mention. I've had a
crack at moving the check into kbuild -- see below.
Will
--->8
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index 62afe874073e..d7ee4c6bad48 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -14,10 +14,6 @@
# error Sorry, your compiler is too old - please upgrade it.
#endif
-#if GCC_VERSION < 40800
-# warning Your compiler is old and may miscompile the kernel due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145 - please upgrade it.
-#endif
-
/* Optimization barrier */
/* The "volatile" is due to gcc bugs */
diff --git a/init/Kconfig b/init/Kconfig
index a34064a031a5..bdc2f1b1667b 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -10,11 +10,11 @@ config DEFCONFIG_LIST
default "arch/$(ARCH)/defconfig"
config CC_IS_GCC
- def_bool $(success,$(CC) --version | head -n 1 | grep -q gcc)
+ def_bool $(cc-is-gcc)
config GCC_VERSION
int
- default $(shell,$(srctree)/scripts/gcc-version.sh $(CC)) if CC_IS_GCC
+ default $(gcc-version) if CC_IS_GCC
default 0
config CC_IS_CLANG
diff --git a/scripts/Kconfig.include b/scripts/Kconfig.include
index d4adfbe42690..4e645a798b56 100644
--- a/scripts/Kconfig.include
+++ b/scripts/Kconfig.include
@@ -40,3 +40,9 @@ $(error-if,$(success, $(LD) -v | grep -q gold), gold linker '$(LD)' not supporte
# gcc version including patch level
gcc-version := $(shell,$(srctree)/scripts/gcc-version.sh $(CC))
+
+# Return y if the compiler is GCC, n otherwise
+cc-is-gcc := $(success,$(CC) --version | head -n 1 | grep -q gcc)
+
+# Warn if the compiler is GCC prior to 4.8
+$(warning-if,$(if-success,[ $(gcc-version) -lt 40800 ],$(cc-is-gcc),n),"Your compiler is old and may miscompile the kernel due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145 - please upgrade it.")
On Fri, Jan 10, 2020 at 6:54 PM Joe Perches <[email protected]> wrote:
>
> On Fri, 2020-01-10 at 18:35 +0100, Arnd Bergmann wrote:
> > On Fri, Jan 10, 2020 at 5:56 PM Will Deacon <[email protected]> wrote:
> > > Prior to version 4.8, GCC may miscompile READ_ONCE() by erroneously
> > > discarding the 'volatile' qualifier:
> > >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145
> > >
> > > We've been working around this using some nasty hacks which make
> > > READ_ONCE() both horribly complicated and also prevent us from enforcing
> > > that it is only used on scalar types. Since GCC 4.8 is pretty old for
> > > kernel builds now, emit a warning if we detect it during the build.
> >
> > No objection to recommending gcc-4.8, but I think this should either
> > just warn once during the kernel build instead of for every file, or
> > it should become a hard requirement.
>
> It might as well be a hard requirement as
> gcc 4.8.0 is already nearly 7 years old.
>
> gcc 4.6.0 released 2011-03-25
> gcc 4.8.0 released 2013-03-22
>
> Perhaps there are exceedingly few to zero new
> instances using gcc compiler versions < 4.8
The last time we had this discussion, the result was that gcc-4.8 is
probably ok as a minimum version, but moving to 5.1+ (from 2015)
was not an obvious choice:
https://www.spinics.net/lists/linux-kbuild/msg23648.html
If nobody complains about the move to 4.8, we can try moving to
gcc-5.1 and GNU99/GNU11 next year ;-)
Arnd
On Fri, Jan 10, 2020 at 10:54:27AM -0800, Linus Torvalds wrote:
> On Fri, Jan 10, 2020 at 8:56 AM Will Deacon <[email protected]> wrote:
> >
> > +/* Declare an unqualified scalar type. Leaves non-scalar types unchanged. */
> > +#define __unqual_scalar_typeof(x) typeof( \
>
> Ugh. My eyes. That's horrendous.
>
> I can't see any better alternatives, but it does make me go "Eww".
I can't disagree with that, but the only option we've come up with so far
that solves this in the READ_ONCE() macro itself is the thing from PeterZ:
// Insert big fat comment here
#define unqual_typeof(x) typeof(({_Atomic typeof(x) ___x __maybe_unused; ___x; }))
That apparently *requires* GCC 4.8, but I think the question is more about
whether it's easier to stomach the funny use of _Atomic or the nested
__builtin_choose_expr() I have here. I'm also worried about how reliable
the _Atomic thing is, or whether it's just an artifact of how GCC happens
to work today.
> Well, I do see one possible alternative: just re-write the bitop
> implementations in terms of "atomic_long_t", and just avoid the issue
> entirely.
>
> IOW, do something like the attached (but fleshed out and tested - this
> patch has not seen a compiler, much less any thought at all).
The big downside of this approach in preference to the proposal here is that
as long as we've got volatile-qualified pointer arguments describing shared
memory, I fear that we'll be playing a constant game of whack-a-mole adding
non-volatile casts as you do below. The same problem manifests for the
acquire/release accessors, which is why having something like
__unqual_typeof() would be beneficial and at least the awfulness is
contained in one place.
So I suppose my question is: how ill does this code really make you feel?
The disassembly is really nice!
Will
> include/asm-generic/bitops/lock.h | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h
> index 3ae021368f48..071d8bfd86e5 100644
> --- a/include/asm-generic/bitops/lock.h
> +++ b/include/asm-generic/bitops/lock.h
> @@ -6,6 +6,12 @@
> #include <linux/compiler.h>
> #include <asm/barrier.h>
>
> +/* Drop the volatile, we will be doing READ_ONCE by hand */
> +static inline atomic_long_t *atomic_long_bit_word(unsigned int nr, volatile unsigned long *p)
> +{
> + return BIT_WORD(nr) + (atomic_long_t *)p;
> +}
> +
> /**
> * test_and_set_bit_lock - Set a bit and return its old value, for lock
> * @nr: Bit to set
> @@ -20,12 +26,12 @@ static inline int test_and_set_bit_lock(unsigned int nr,
> {
> long old;
> unsigned long mask = BIT_MASK(nr);
> + atomic_long_t *loc = atomic_long_bit_word(nr, p);
>
> - p += BIT_WORD(nr);
> - if (READ_ONCE(*p) & mask)
> + if (atomic_read(loc) & mask)
> return 1;
>
> - old = atomic_long_fetch_or_acquire(mask, (atomic_long_t *)p);
> + old = atomic_long_fetch_or_acquire(mask, loc);
> return !!(old & mask);
> }
>
On Fri, Jan 10, 2020 at 08:42:37PM +0100, Arnd Bergmann wrote:
> On Fri, Jan 10, 2020 at 5:57 PM Will Deacon <[email protected]> wrote:
>
> > @@ -128,10 +128,10 @@ do { \
> > #ifndef __smp_load_acquire
> > #define __smp_load_acquire(p) \
> > ({ \
> > - typeof(*p) ___p1 = READ_ONCE(*p); \
> > + __unqual_scalar_typeof(*p) ___p1 = READ_ONCE(*p); \
> > compiletime_assert_atomic_type(*p); \
> > __smp_mb(); \
> > - ___p1; \
> > + (typeof(*p))___p1; \
> > })
>
> Doesn't that last (typeof(*p))___p1 mean you put the potential
> 'volatile' back on the assignment after you went through the
> effort of taking it out?
Yes, but that's ok wrt codegen since the local variable isn't volatile,
and I definitely ran into issues with 'const' if I returned the unqualified
type here.
Will
On Mon, Jan 13, 2020 at 11:40 PM Arnd Bergmann <[email protected]> wrote:
>
> On Fri, Jan 10, 2020 at 6:54 PM Joe Perches <[email protected]> wrote:
> >
> > On Fri, 2020-01-10 at 18:35 +0100, Arnd Bergmann wrote:
> > > On Fri, Jan 10, 2020 at 5:56 PM Will Deacon <[email protected]> wrote:
> > > > Prior to version 4.8, GCC may miscompile READ_ONCE() by erroneously
> > > > discarding the 'volatile' qualifier:
> > > >
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145
> > > >
> > > > We've been working around this using some nasty hacks which make
> > > > READ_ONCE() both horribly complicated and also prevent us from enforcing
> > > > that it is only used on scalar types. Since GCC 4.8 is pretty old for
> > > > kernel builds now, emit a warning if we detect it during the build.
> > >
> > > No objection to recommending gcc-4.8, but I think this should either
> > > just warn once during the kernel build instead of for every file, or
> > > it should become a hard requirement.
> >
> > It might as well be a hard requirement as
> > gcc 4.8.0 is already nearly 7 years old.
> >
> > gcc 4.6.0 released 2011-03-25
> > gcc 4.8.0 released 2013-03-22
> >
> > Perhaps there are exceedingly few to zero new
> > instances using gcc compiler versions < 4.8
>
> The last time we had this discussion, the result was that gcc-4.8 is
> probably ok as a minimum version, but moving to 5.1+ (from 2015)
> was not an obvious choice:
>
> https://www.spinics.net/lists/linux-kbuild/msg23648.html
>
> If nobody complains about the move to 4.8, we can try moving to
> gcc-5.1 and GNU99/GNU11 next year ;-)
>
> Arnd
I agree.
"Your compiler is old and may miscompile the kernel due to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145 - please upgrade it."
sounds like an error rather than a warning.
--
Best Regards
Masahiro Yamada
On Fri, Jan 10, 2020 at 08:24:00PM +0100, Arnd Bergmann wrote:
> On Fri, Jan 10, 2020 at 5:56 PM Will Deacon <[email protected]> wrote:
>
> > +/*
> > + * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
> > + * atomicity or dependency ordering guarantees. Note that this may result
> > + * in tears!
> > + */
> > +#define __READ_ONCE(x) (*(volatile typeof(x) *)&(x))
> > +
>
> This probably allows writing
>
> extern int i;
> __READ_ONCE(i) = 1;
>
> and not get a warning for it. How about also casting to 'const'?
Well spotted! I'll fold that in.
Will
On Mon, Jan 13, 2020 at 02:59:54PM +0000, Will Deacon wrote:
> // Insert big fat comment here
> #define unqual_typeof(x) typeof(({_Atomic typeof(x) ___x __maybe_unused; ___x; }))
>
> That apparently *requires* GCC 4.8, but I think the question is more about
> whether it's easier to stomach the funny use of _Atomic or the nested
> __builtin_choose_expr() I have here. I'm also worried about how reliable
> the _Atomic thing is, or whether it's just an artifact of how GCC happens
> to work today.
As far as I understand it, it's an artifact of how GCC works today (it
was added to support the type-generic macros in <tgmath.h>).
I also think it's also quite fragile, for example, the unqualified type
is returned if typeof's argument is an expression but not if it's a
'typename'. IOW:
typeof(_Atomic typeof(const int))
returns 'const int', while
typeof(({_Atomic typeof(const int) x; x; }))
returns 'int'.
-- Luc
On Mon, Jan 13, 2020 at 7:00 AM Will Deacon <[email protected]> wrote:
>
> I can't disagree with that, but the only option we've come up with so far
> that solves this in the READ_ONCE() macro itself is the thing from PeterZ:
>
> // Insert big fat comment here
> #define unqual_typeof(x) typeof(({_Atomic typeof(x) ___x __maybe_unused; ___x; }))
I'm with Luc on this - that not only looks gcc-specific, it looks
fragile too, in that it's not obvious that "_Atomic typeof(x)" really
is guaranteed to do what we want.
> So I suppose my question is: how ill does this code really make you feel?
I wish the code was more obvious.
One way to do that might be to do your approach, but just write it as
a series of macros that makes it a bit more understandable what it
does.
Maybe it's just because of a "pee in the snow" effect, but I think
this is easier to explain:
#define __pick_scalar_type(x,type,otherwise) \
__builtin_choose_expr(__same_type(x,type), (type)0, otherwise)
#define __pick_integer_type(x, type, otherwise) \
__pick_scalar_type(x, unsigned type, \
__pick_scalar_type(x, signed type, otherwise))
#define __unqual_scalar_typeof(x) typeof( \
__pick_integer_type(x, char, \
__pick_integer_type(x, short, \
__pick_integer_type(x, int, \
__pick_integer_type(x, long, \
__pick_integer_type(x, long long, x))))))
just because you there's less repeated noise, and the repetition there
is is simpler.
So still "Eww", but maybe not quite _as_ "Eww".
Linus
On Fri, Jan 10, 2020 at 06:35:02PM +0100, Arnd Bergmann wrote:
> On Fri, Jan 10, 2020 at 5:56 PM Will Deacon <[email protected]> wrote:
> >
> > Prior to version 4.8, GCC may miscompile READ_ONCE() by erroneously
> > discarding the 'volatile' qualifier:
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145
> >
> > We've been working around this using some nasty hacks which make
> > READ_ONCE() both horribly complicated and also prevent us from enforcing
> > that it is only used on scalar types. Since GCC 4.8 is pretty old for
> > kernel builds now, emit a warning if we detect it during the build.
>
> No objection to recommending gcc-4.8, but I think this should either
> just warn once during the kernel build instead of for every file, or
> it should become a hard requirement.
Yeah, hard requirement sounds good to me. Arnd, do you have stats on which
distros have which versions of GCC (IIRC, you had some stats for the GCC 4.6
upgrade)? This allows us to clean up more cruft in the kernel (grep for
GCC_VERSION).
Will, Documentation/process/changes.rst should also be modified.
Android is still using GCC 4.9 (which is more like GCC 4.8 plus patches), but
I've been actively moving them (to Clang) over the past 2 years. I'll check
with our other internal distro's and give them a heads up.
From: Nick Desaulniers
> Sent: 14 January 2020 21:39
...
> Yeah, hard requirement sounds good to me. Arnd, do you have stats on which
> distros have which versions of GCC (IIRC, you had some stats for the GCC 4.6
> upgrade)? This allows us to clean up more cruft in the kernel (grep for
> GCC_VERSION).
FWIW My Ubuntu 13.04 system has gcc 4.7.3.
(Actually, looking in /usr/bin there is a copy of 4.8.1 from the original install.)
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On Tue, Jan 14, 2020 at 10:39 PM Nick Desaulniers
<[email protected]> wrote:
>
> On Fri, Jan 10, 2020 at 06:35:02PM +0100, Arnd Bergmann wrote:
> > On Fri, Jan 10, 2020 at 5:56 PM Will Deacon <[email protected]> wrote:
> > >
> > > Prior to version 4.8, GCC may miscompile READ_ONCE() by erroneously
> > > discarding the 'volatile' qualifier:
> > >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145
> > >
> > > We've been working around this using some nasty hacks which make
> > > READ_ONCE() both horribly complicated and also prevent us from enforcing
> > > that it is only used on scalar types. Since GCC 4.8 is pretty old for
> > > kernel builds now, emit a warning if we detect it during the build.
> >
> > No objection to recommending gcc-4.8, but I think this should either
> > just warn once during the kernel build instead of for every file, or
> > it should become a hard requirement.
>
> Yeah, hard requirement sounds good to me. Arnd, do you have stats on which
> distros have which versions of GCC (IIRC, you had some stats for the GCC 4.6
> upgrade)? This allows us to clean up more cruft in the kernel (grep for
> GCC_VERSION).
This is the list that Kirill and I came up with in the thread I linked to
> - Debian 8.0 Jessie has 4.9.2, EOL 2020-05
> - Ubuntu 14.04 LTS Trusty has 4.8.2, EOL 2019-04;
> - Fedora 21 has 4.9.2, EOL 2015-12;
> - OpenSUSE 42.3 has 4.8.5, EOL 2019-06;
> - RHEL 7.7 has 4.8.5, EOL 2024-06;
> - RHEL 6.9 has 4.4.7, EOL 2020-11;
> - SUSE 12-SP4 has 4.8.6, EOL 2024, extended support 2027
> - Oracle 7.6 has 4.8.5, EOL ?;
> - Slackware 14.1 (no EOL announced): gcc-4.8
In an older thread about moving to gcc-4.3 or 4.6 as a minimum,
this were the even older distros:
> - RHEL 5.x has 4.1; extended support EOL 2020
> - SLES11 had gcc-4.3; extended support EOL 2022
> - Debian Wheezy (oldoldstable) had gcc-4.6, EOL 2018.
> - OpenWRT 12.07 Attitude Adjustment had gcc-4.6
> and is still used with devices that have only 4MB flash / 32 MB RAM
Arnd