LinuxLists.cc - [PATCH v5 00/18] Rework READ

[permalink] [raw]

Subject: [PATCH v5 18/18] linux/compiler.h: Remove redundant '#else'

The '#else' clause after checking '#ifdef __KERNEL__' is empty in
linux/compiler.h, so remove it.

Signed-off-by: Will Deacon <[email protected]>
---
include/linux/compiler.h | 2 --
1 file changed, 2 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index d21a823e73c6..741c93c62ecf 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -318,8 +318,6 @@ unsigned long read_word_at_a_time(const void *addr)
return *(unsigned long *)addr;
}

-#else
-
#endif /* __KERNEL__ */

/*
--
2.26.2.645.ge9eca65c58-goog

2020-05-11 20:47:10

[permalink] [raw]

Subject: [PATCH v5 09/18] arm64: csum: Disable KASAN for do_csum()

do_csum() over-reads the source buffer and therefore abuses
READ_ONCE_NOCHECK() on a 128-bit type to avoid tripping up KASAN. In
preparation for READ_ONCE_NOCHECK() requiring an atomic access, and
therefore failing to build when fed a '__uint128_t', annotate do_csum()
explicitly with '__no_sanitize_address' and fall back to normal loads.

Acked-by: Robin Murphy <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/lib/csum.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/lib/csum.c b/arch/arm64/lib/csum.c
index 60eccae2abad..78b87a64ca0a 100644
--- a/arch/arm64/lib/csum.c
+++ b/arch/arm64/lib/csum.c
@@ -14,7 +14,11 @@ static u64 accumulate(u64 sum, u64 data)
return tmp + (tmp >> 64);
}

-unsigned int do_csum(const unsigned char *buff, int len)
+/*
+ * We over-read the buffer and this makes KASAN unhappy. Instead, disable
+ * instrumentation and call kasan explicitly.
+ */
+unsigned int __no_sanitize_address do_csum(const unsigned char *buff, int len)
{
unsigned int offset, shift, sum;
const u64 *ptr;
@@ -42,7 +46,7 @@ unsigned int do_csum(const unsigned char *buff, int len)
* odd/even alignment, and means we can ignore it until the very end.
*/
shift = offset * 8;
- data = READ_ONCE_NOCHECK(*ptr++);
+ data = *ptr++;
#ifdef __LITTLE_ENDIAN
data = (data >> shift) << shift;
#else
@@ -58,10 +62,10 @@ unsigned int do_csum(const unsigned char *buff, int len)
while (unlikely(len > 64)) {
__uint128_t tmp1, tmp2, tmp3, tmp4;

- tmp1 = READ_ONCE_NOCHECK(*(__uint128_t *)ptr);
- tmp2 = READ_ONCE_NOCHECK(*(__uint128_t *)(ptr + 2));
- tmp3 = READ_ONCE_NOCHECK(*(__uint128_t *)(ptr + 4));
- tmp4 = READ_ONCE_NOCHECK(*(__uint128_t *)(ptr + 6));
+ tmp1 = *(__uint128_t *)ptr;
+ tmp2 = *(__uint128_t *)(ptr + 2);
+ tmp3 = *(__uint128_t *)(ptr + 4);
+ tmp4 = *(__uint128_t *)(ptr + 6);

len -= 64;
ptr += 8;
@@ -85,7 +89,7 @@ unsigned int do_csum(const unsigned char *buff, int len)
__uint128_t tmp;

sum64 = accumulate(sum64, data);
- tmp = READ_ONCE_NOCHECK(*(__uint128_t *)ptr);
+ tmp = *(__uint128_t *)ptr;

len -= 16;
ptr += 2;
@@ -100,7 +104,7 @@ unsigned int do_csum(const unsigned char *buff, int len)
}
if (len > 0) {
sum64 = accumulate(sum64, data);
- data = READ_ONCE_NOCHECK(*ptr);
+ data = *ptr;
len -= 8;
}
/*
--
2.26.2.645.ge9eca65c58-goog

2020-05-11 20:48:00

[permalink] [raw]

Subject: [PATCH v5 12/18] READ_ONCE: Drop pointer qualifiers when reading from scalar types

Passing a volatile-qualified pointer to READ_ONCE() is an absolute
trainwreck for code generation: the use of 'typeof()' to define a
temporary variable inside the macro means that the final evaluation in
macro scope ends up forcing a read back from the stack. When stack
protector is enabled (the default for arm64, at least), this causes
the compiler to vomit up all sorts of junk.

Unfortunately, dropping pointer qualifiers inside the macro poses quite
a challenge, especially since the pointed-to type is permitted to be an
aggregate, and this is relied upon by mm/ code accessing things like
'pmd_t'. Based on numerous hacks and discussions on the mailing list,
this is the best I've managed to come up with.

Introduce '__unqual_scalar_typeof()' which takes an expression and, if
the expression is an optionally qualified 8, 16, 32 or 64-bit scalar
type, evaluates to the unqualified type. Other input types, including
aggregates, remain unchanged. Hopefully READ_ONCE() on volatile aggregate
pointers isn't something we do on a fast-path.

Cc: Peter Zijlstra <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Suggested-by: Linus Torvalds <[email protected]>
Reported-by: Michael Ellerman <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
include/linux/compiler.h | 6 +++---
include/linux/compiler_types.h | 26 ++++++++++++++++++++++++++
2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index e1b839e42563..0caced170a8a 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -204,7 +204,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
* atomicity or dependency ordering guarantees. Note that this may result
* in tears!
*/
-#define __READ_ONCE(x) (*(const volatile typeof(x) *)&(x))
+#define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))

#define __READ_ONCE_SCALAR(x) \
({ \
@@ -212,10 +212,10 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
kcsan_check_atomic_read(__xp, sizeof(*__xp)); \
__kcsan_disable_current(); \
({ \
- typeof(x) __x = __READ_ONCE(*__xp); \
+ __unqual_scalar_typeof(x) __x = __READ_ONCE(*__xp); \
__kcsan_enable_current(); \
smp_read_barrier_depends(); \
- __x; \
+ (typeof(x))__x; \
}); \
})

diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index e970f97a7fcb..6ed0612bc143 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -210,6 +210,32 @@ struct ftrace_likely_data {
/* Are two types/vars the same type (ignoring qualifiers)? */
#define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))

+/*
+ * __unqual_scalar_typeof(x) - Declare an unqualified scalar type, leaving
+ * non-scalar types unchanged.
+ *
+ * We build this out of a couple of helper macros in a vain attempt to
+ * help you keep your lunch down while reading it.
+ */
+#define __pick_scalar_type(x, type, otherwise) \
+ __builtin_choose_expr(__same_type(x, type), (type)0, otherwise)
+
+/*
+ * 'char' is not type-compatible with either 'signed char' or 'unsigned char',
+ * so we include the naked type here as well as the signed/unsigned variants.
+ */
+#define __pick_integer_type(x, type, otherwise) \
+ __pick_scalar_type(x, type, \
+ __pick_scalar_type(x, unsigned type, \
+ __pick_scalar_type(x, signed type, otherwise)))
+
+#define __unqual_scalar_typeof(x) typeof( \
+ __pick_integer_type(x, char, \
+ __pick_integer_type(x, short, \
+ __pick_integer_type(x, int, \
+ __pick_integer_type(x, long, \
+ __pick_integer_type(x, long long, x))))))
+
/* Is this type a native word size -- useful for atomic operations */
#define __native_word(t) \
(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
--
2.26.2.645.ge9eca65c58-goog

2020-05-11 20:48:20

[permalink] [raw]

Subject: [PATCH v5 06/18] netfilter: Avoid assigning 'const' pointer to non-const pointer

nf_remove_net_hook() uses WRITE_ONCE() to assign a 'const' pointer to a
'non-const' pointer. Cleanups to the implementation of WRITE_ONCE() mean
that this will give rise to a compiler warning, just like a plain old
assignment would do:

| In file included from ./include/linux/export.h:43,
| from ./include/linux/linkage.h:7,
| from ./include/linux/kernel.h:8,
| from net/netfilter/core.c:9:
| net/netfilter/core.c: In function ‘nf_remove_net_hook’:
| ./include/linux/compiler.h:216:30: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
| *(volatile typeof(x) *)&(x) = (val); \
| ^
| net/netfilter/core.c:379:3: note: in expansion of macro ‘WRITE_ONCE’
| WRITE_ONCE(orig_ops[i], &dummy_ops);
| ^~~~~~~~~~

Follow the pattern used elsewhere in this file and add a cast to 'void *'
to squash the warning.

Cc: Pablo Neira Ayuso <[email protected]>
Cc: Jozsef Kadlecsik <[email protected]>
Cc: Florian Westphal <[email protected]>
Cc: "David S. Miller" <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
net/netfilter/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 78f046ec506f..3ac7c8c1548d 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -376,7 +376,7 @@ static bool nf_remove_net_hook(struct nf_hook_entries *old,
if (orig_ops[i] != unreg)
continue;
WRITE_ONCE(old->hooks[i].hook, accept_all);
- WRITE_ONCE(orig_ops[i], &dummy_ops);
+ WRITE_ONCE(orig_ops[i], (void *)&dummy_ops);
return true;
}

--
2.26.2.645.ge9eca65c58-goog

2020-05-12 08:23:15

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Mon, May 11, 2020 at 09:41:32PM +0100, Will Deacon wrote:
> Hi folks,
>
> (trimmed CC list since v4 since this is largely just a rebase)
>
> This is version five of the READ_ONCE() codegen improvement series that
> I've previously posted here:
>
> RFC: https://lore.kernel.org/lkml/[email protected]
> v2: https://lore.kernel.org/lkml/[email protected]
> v3: https://lore.kernel.org/lkml/[email protected]
> v4: https://lore.kernel.org/lkml/[email protected]
>
> The main change since v4 is that this is now based on top of the KCSAN
> changes queued in -tip (locking/kcsan) and therefore contains the patches
> necessary to avoid breaking sparc32 as well as some cleanups to
> consolidate {READ,WRITE}_ONCE() and data_race().
>
> Other changes include:
>
> * Treat 'char' as distinct from 'signed char' and 'unsigned char' for
> __builtin_types_compatible_p()
>
> * Add a compile-time assertion that the argument to READ_ONCE_NOCHECK()
> points at something the same size as 'unsigned long'
>
> I'm happy for all of this to go via -tip, or I can take it via arm64.

Looks good to me; Thanks!

Acked-by: Peter Zijlstra (Intel) <[email protected]>

2020-05-12 14:38:57

[permalink] [raw]

Subject: [tip: locking/kcsan] READ_ONCE: Drop pointer qualifiers when reading from scalar types

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID: 7b364f0949ae2dd205d5e9afa4b82ee17030d928
Gitweb: https://git.kernel.org/tip/7b364f0949ae2dd205d5e9afa4b82ee17030d928
Author: Will Deacon <[email protected]>
AuthorDate: Mon, 11 May 2020 21:41:44 +01:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Tue, 12 May 2020 11:04:14 +02:00

READ_ONCE: Drop pointer qualifiers when reading from scalar types

Passing a volatile-qualified pointer to READ_ONCE() is an absolute
trainwreck for code generation: the use of 'typeof()' to define a
temporary variable inside the macro means that the final evaluation in
macro scope ends up forcing a read back from the stack. When stack
protector is enabled (the default for arm64, at least), this causes
the compiler to vomit up all sorts of junk.

Unfortunately, dropping pointer qualifiers inside the macro poses quite
a challenge, especially since the pointed-to type is permitted to be an
aggregate, and this is relied upon by mm/ code accessing things like
'pmd_t'. Based on numerous hacks and discussions on the mailing list,
this is the best I've managed to come up with.

Introduce '__unqual_scalar_typeof()' which takes an expression and, if
the expression is an optionally qualified 8, 16, 32 or 64-bit scalar
type, evaluates to the unqualified type. Other input types, including
aggregates, remain unchanged. Hopefully READ_ONCE() on volatile aggregate
pointers isn't something we do on a fast-path.

Reported-by: Michael Ellerman <[email protected]>
Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

---
include/linux/compiler.h | 6 +++---
include/linux/compiler_types.h | 26 ++++++++++++++++++++++++++
2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 733605f..548294e 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -204,7 +204,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
* atomicity or dependency ordering guarantees. Note that this may result
* in tears!
*/
-#define __READ_ONCE(x) (*(const volatile typeof(x) *)&(x))
+#define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))

#define __READ_ONCE_SCALAR(x) \
({ \
@@ -212,10 +212,10 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
kcsan_check_atomic_read(__xp, sizeof(*__xp)); \
__kcsan_disable_current(); \
({ \
- typeof(x) __x = __READ_ONCE(*__xp); \
+ __unqual_scalar_typeof(x) __x = __READ_ONCE(*__xp); \
__kcsan_enable_current(); \
smp_read_barrier_depends(); \
- __x; \
+ (typeof(x))__x; \
}); \
})

diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index e970f97..6ed0612 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -210,6 +210,32 @@ struct ftrace_likely_data {
/* Are two types/vars the same type (ignoring qualifiers)? */
#define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))

+/*
+ * __unqual_scalar_typeof(x) - Declare an unqualified scalar type, leaving
+ * non-scalar types unchanged.
+ *
+ * We build this out of a couple of helper macros in a vain attempt to
+ * help you keep your lunch down while reading it.
+ */
+#define __pick_scalar_type(x, type, otherwise) \
+ __builtin_choose_expr(__same_type(x, type), (type)0, otherwise)
+
+/*
+ * 'char' is not type-compatible with either 'signed char' or 'unsigned char',
+ * so we include the naked type here as well as the signed/unsigned variants.
+ */
+#define __pick_integer_type(x, type, otherwise) \
+ __pick_scalar_type(x, type, \
+ __pick_scalar_type(x, unsigned type, \
+ __pick_scalar_type(x, signed type, otherwise)))
+
+#define __unqual_scalar_typeof(x) typeof( \
+ __pick_integer_type(x, char, \
+ __pick_integer_type(x, short, \
+ __pick_integer_type(x, int, \
+ __pick_integer_type(x, long, \
+ __pick_integer_type(x, long long, x))))))
+
/* Is this type a native word size -- useful for atomic operations */
#define __native_word(t) \
(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \

2020-05-12 14:39:31

[permalink] [raw]

Subject: [tip: locking/kcsan] netfilter: Avoid assigning 'const' pointer to non-const pointer

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID: f64554152014597a40403ea1a291c80785a2dfe9
Gitweb: https://git.kernel.org/tip/f64554152014597a40403ea1a291c80785a2dfe9
Author: Will Deacon <[email protected]>
AuthorDate: Mon, 11 May 2020 21:41:38 +01:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Tue, 12 May 2020 11:04:11 +02:00

netfilter: Avoid assigning 'const' pointer to non-const pointer

nf_remove_net_hook() uses WRITE_ONCE() to assign a 'const' pointer to a
'non-const' pointer. Cleanups to the implementation of WRITE_ONCE() mean
that this will give rise to a compiler warning, just like a plain old
assignment would do:

| In file included from ./include/linux/export.h:43,
| from ./include/linux/linkage.h:7,
| from ./include/linux/kernel.h:8,
| from net/netfilter/core.c:9:
| net/netfilter/core.c: In function ‘nf_remove_net_hook’:
| ./include/linux/compiler.h:216:30: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
| *(volatile typeof(x) *)&(x) = (val); \
| ^
| net/netfilter/core.c:379:3: note: in expansion of macro ‘WRITE_ONCE’
| WRITE_ONCE(orig_ops[i], &dummy_ops);
| ^~~~~~~~~~

Follow the pattern used elsewhere in this file and add a cast to 'void *'
to squash the warning.

Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Pablo Neira Ayuso <[email protected]>
Cc: Jozsef Kadlecsik <[email protected]>
Cc: Florian Westphal <[email protected]>
Cc: "David S. Miller" <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

---
net/netfilter/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 78f046e..3ac7c8c 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -376,7 +376,7 @@ static bool nf_remove_net_hook(struct nf_hook_entries *old,
if (orig_ops[i] != unreg)
continue;
WRITE_ONCE(old->hooks[i].hook, accept_all);
- WRITE_ONCE(orig_ops[i], &dummy_ops);
+ WRITE_ONCE(orig_ops[i], (void *)&dummy_ops);
return true;
}

2020-05-12 14:39:41

[permalink] [raw]

Subject: [tip: locking/kcsan] linux/compiler.h: Remove redundant '#else'

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID: 8367aadcd83d2570fd4ce4af40ae7aec7c2bfcb7
Gitweb: https://git.kernel.org/tip/8367aadcd83d2570fd4ce4af40ae7aec7c2bfcb7
Author: Will Deacon <[email protected]>
AuthorDate: Mon, 11 May 2020 21:41:50 +01:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Tue, 12 May 2020 11:04:11 +02:00

linux/compiler.h: Remove redundant '#else'

The '#else' clause after checking '#ifdef __KERNEL__' is empty in
linux/compiler.h, so remove it.

Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

---
include/linux/compiler.h | 1 -
1 file changed, 1 deletion(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index cce2c92..9bd0f76 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -331,7 +331,6 @@ unsigned long read_word_at_a_time(const void *addr)
kcsan_enable_current(); \
__val; \
})
-#else

#endif /* __KERNEL__ */

2020-05-12 14:42:11

[permalink] [raw]

Subject: [tip: locking/kcsan] arm64: csum: Disable KASAN for do_csum()

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID: 5a7d7f5d57f61d650619b89c1b7d4adcf4fdecfe
Gitweb: https://git.kernel.org/tip/5a7d7f5d57f61d650619b89c1b7d4adcf4fdecfe
Author: Will Deacon <[email protected]>
AuthorDate: Mon, 11 May 2020 21:41:41 +01:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Tue, 12 May 2020 11:04:13 +02:00

arm64: csum: Disable KASAN for do_csum()

do_csum() over-reads the source buffer and therefore abuses
READ_ONCE_NOCHECK() on a 128-bit type to avoid tripping up KASAN. In
preparation for READ_ONCE_NOCHECK() requiring an atomic access, and
therefore failing to build when fed a '__uint128_t', annotate do_csum()
explicitly with '__no_sanitize_address' and fall back to normal loads.

Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Robin Murphy <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

---
arch/arm64/lib/csum.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/lib/csum.c b/arch/arm64/lib/csum.c
index 60eccae..78b87a6 100644
--- a/arch/arm64/lib/csum.c
+++ b/arch/arm64/lib/csum.c
@@ -14,7 +14,11 @@ static u64 accumulate(u64 sum, u64 data)
return tmp + (tmp >> 64);
}

-unsigned int do_csum(const unsigned char *buff, int len)
+/*
+ * We over-read the buffer and this makes KASAN unhappy. Instead, disable
+ * instrumentation and call kasan explicitly.
+ */
+unsigned int __no_sanitize_address do_csum(const unsigned char *buff, int len)
{
unsigned int offset, shift, sum;
const u64 *ptr;
@@ -42,7 +46,7 @@ unsigned int do_csum(const unsigned char *buff, int len)
* odd/even alignment, and means we can ignore it until the very end.
*/
shift = offset * 8;
- data = READ_ONCE_NOCHECK(*ptr++);
+ data = *ptr++;
#ifdef __LITTLE_ENDIAN
data = (data >> shift) << shift;
#else
@@ -58,10 +62,10 @@ unsigned int do_csum(const unsigned char *buff, int len)
while (unlikely(len > 64)) {
__uint128_t tmp1, tmp2, tmp3, tmp4;

- tmp1 = READ_ONCE_NOCHECK(*(__uint128_t *)ptr);
- tmp2 = READ_ONCE_NOCHECK(*(__uint128_t *)(ptr + 2));
- tmp3 = READ_ONCE_NOCHECK(*(__uint128_t *)(ptr + 4));
- tmp4 = READ_ONCE_NOCHECK(*(__uint128_t *)(ptr + 6));
+ tmp1 = *(__uint128_t *)ptr;
+ tmp2 = *(__uint128_t *)(ptr + 2);
+ tmp3 = *(__uint128_t *)(ptr + 4);
+ tmp4 = *(__uint128_t *)(ptr + 6);

len -= 64;
ptr += 8;
@@ -85,7 +89,7 @@ unsigned int do_csum(const unsigned char *buff, int len)
__uint128_t tmp;

sum64 = accumulate(sum64, data);
- tmp = READ_ONCE_NOCHECK(*(__uint128_t *)ptr);
+ tmp = *(__uint128_t *)ptr;

len -= 16;
ptr += 2;
@@ -100,7 +104,7 @@ unsigned int do_csum(const unsigned char *buff, int len)
}
if (len > 0) {
sum64 = accumulate(sum64, data);
- data = READ_ONCE_NOCHECK(*ptr);
+ data = *ptr;
len -= 8;
}
/*

2020-05-12 17:57:11

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Tue, 12 May 2020 at 10:18, Peter Zijlstra <[email protected]> wrote:
>
> On Mon, May 11, 2020 at 09:41:32PM +0100, Will Deacon wrote:
> > Hi folks,
> >
> > (trimmed CC list since v4 since this is largely just a rebase)
> >
> > This is version five of the READ_ONCE() codegen improvement series that
> > I've previously posted here:
> >
> > RFC: https://lore.kernel.org/lkml/[email protected]
> > v2: https://lore.kernel.org/lkml/[email protected]
> > v3: https://lore.kernel.org/lkml/[email protected]
> > v4: https://lore.kernel.org/lkml/[email protected]
> >
> > The main change since v4 is that this is now based on top of the KCSAN
> > changes queued in -tip (locking/kcsan) and therefore contains the patches
> > necessary to avoid breaking sparc32 as well as some cleanups to
> > consolidate {READ,WRITE}_ONCE() and data_race().
> >
> > Other changes include:
> >
> > * Treat 'char' as distinct from 'signed char' and 'unsigned char' for
> > __builtin_types_compatible_p()
> >
> > * Add a compile-time assertion that the argument to READ_ONCE_NOCHECK()
> > points at something the same size as 'unsigned long'
> >
> > I'm happy for all of this to go via -tip, or I can take it via arm64.
>
> Looks good to me; Thanks!
>
> Acked-by: Peter Zijlstra (Intel) <[email protected]>

I just ran a bunch of KCSAN tests. While this series alone would have
passed the tests, there appears to be a problem with
__READ_ONCE/__WRITE_ONCE. I think they should already be using
'data_race()', as otherwise we will get lots of false positives in
future.

I noticed this when testing -tip/locking/kcsan, which breaks
unfortunately, because I see a bunch of spurious data races with
arch_atomic_{read,set} because "locking/atomics: Flip fallbacks and
instrumentation" changed them to use __READ_ONCE()/__WRITE_ONCE().
From what I see, the intent was to not double-instrument,
unfortunately they are still double-instrumented because
__READ_ONCE/__WRITE_ONCE doesn't hide the access from KCSAN (nor KASAN
actually). I don't think we can use __no_sanitize_or_inline for the
arch_ functions, because we really want them to be __always_inline
(also to avoid calls to these functions in uaccess regions, which
objtool would notice).

I think the easiest way to resolve this is to wrap the accesses in
__*_ONCE with data_race().

Thanks,
-- Marco

2020-05-12 18:58:46

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Tue, 12 May 2020 at 19:53, Marco Elver <[email protected]> wrote:
>
> On Tue, 12 May 2020 at 10:18, Peter Zijlstra <[email protected]> wrote:
> >
> > On Mon, May 11, 2020 at 09:41:32PM +0100, Will Deacon wrote:
> > > Hi folks,
> > >
> > > (trimmed CC list since v4 since this is largely just a rebase)
> > >
> > > This is version five of the READ_ONCE() codegen improvement series that
> > > I've previously posted here:
> > >
> > > RFC: https://lore.kernel.org/lkml/[email protected]
> > > v2: https://lore.kernel.org/lkml/[email protected]
> > > v3: https://lore.kernel.org/lkml/[email protected]
> > > v4: https://lore.kernel.org/lkml/[email protected]
> > >
> > > The main change since v4 is that this is now based on top of the KCSAN
> > > changes queued in -tip (locking/kcsan) and therefore contains the patches
> > > necessary to avoid breaking sparc32 as well as some cleanups to
> > > consolidate {READ,WRITE}_ONCE() and data_race().
> > >
> > > Other changes include:
> > >
> > > * Treat 'char' as distinct from 'signed char' and 'unsigned char' for
> > > __builtin_types_compatible_p()
> > >
> > > * Add a compile-time assertion that the argument to READ_ONCE_NOCHECK()
> > > points at something the same size as 'unsigned long'
> > >
> > > I'm happy for all of this to go via -tip, or I can take it via arm64.
> >
> > Looks good to me; Thanks!
> >
> > Acked-by: Peter Zijlstra (Intel) <[email protected]>
>
> I just ran a bunch of KCSAN tests. While this series alone would have
> passed the tests, there appears to be a problem with
> __READ_ONCE/__WRITE_ONCE. I think they should already be using
> 'data_race()', as otherwise we will get lots of false positives in
> future.
>
> I noticed this when testing -tip/locking/kcsan, which breaks
> unfortunately, because I see a bunch of spurious data races with
> arch_atomic_{read,set} because "locking/atomics: Flip fallbacks and
> instrumentation" changed them to use __READ_ONCE()/__WRITE_ONCE().
> From what I see, the intent was to not double-instrument,
> unfortunately they are still double-instrumented because
> __READ_ONCE/__WRITE_ONCE doesn't hide the access from KCSAN (nor KASAN
> actually). I don't think we can use __no_sanitize_or_inline for the
> arch_ functions, because we really want them to be __always_inline
> (also to avoid calls to these functions in uaccess regions, which
> objtool would notice).
>
> I think the easiest way to resolve this is to wrap the accesses in
> __*_ONCE with data_race().

I just sent https://lkml.kernel.org/r/[email protected]
-- note that, using __*_ONCE in arch_atomic_{read,set} will once again
double-instrument with this. Overall there are 2 options:
1. provide __READ_ONCE/__WRITE_ONCE wrapped purely in data_race(), or
2. make __READ_ONCE/__WRITE_ONCE perform an atomic check so we may
still catch races with plain accesses.
The patch I sent does (2). It is inevitable that these will be used in
places that we did not expect, purely to get around the type check,
which is why I thought it might be the more conservative approach.

Thoughts?

Thanks,
-- Marco

2020-05-12 19:10:12

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Tue, May 12, 2020 at 07:53:00PM +0200, Marco Elver wrote:
> I just ran a bunch of KCSAN tests. While this series alone would have
> passed the tests, there appears to be a problem with
> __READ_ONCE/__WRITE_ONCE. I think they should already be using
> 'data_race()', as otherwise we will get lots of false positives in
> future.
>
> I noticed this when testing -tip/locking/kcsan, which breaks
> unfortunately, because I see a bunch of spurious data races with
> arch_atomic_{read,set} because "locking/atomics: Flip fallbacks and
> instrumentation" changed them to use __READ_ONCE()/__WRITE_ONCE().
> From what I see, the intent was to not double-instrument,
> unfortunately they are still double-instrumented because
> __READ_ONCE/__WRITE_ONCE doesn't hide the access from KCSAN (nor KASAN
> actually). I don't think we can use __no_sanitize_or_inline for the
> arch_ functions, because we really want them to be __always_inline
> (also to avoid calls to these functions in uaccess regions, which
> objtool would notice).
>
> I think the easiest way to resolve this is to wrap the accesses in
> __*_ONCE with data_race().

But we can't... because I need arch_atomic_*() and __READ_ONCE() to not
call out to _ANYTHING_.

Sadly, because the compilers are 'broken' that whole __no_sanitize thing
didn't work, but I'll be moving a whole bunch of code into .c files with
all the sanitizers killed dead. And we'll be validating it'll not be
calling out to anything.

data_race() will include active calls to kcsan_{dis,en}able_current(),
and this must not happen.

2020-05-12 20:35:23

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Tue, 12 May 2020 at 21:08, Peter Zijlstra <[email protected]> wrote:
>
> On Tue, May 12, 2020 at 07:53:00PM +0200, Marco Elver wrote:
> > I just ran a bunch of KCSAN tests. While this series alone would have
> > passed the tests, there appears to be a problem with
> > __READ_ONCE/__WRITE_ONCE. I think they should already be using
> > 'data_race()', as otherwise we will get lots of false positives in
> > future.
> >
> > I noticed this when testing -tip/locking/kcsan, which breaks
> > unfortunately, because I see a bunch of spurious data races with
> > arch_atomic_{read,set} because "locking/atomics: Flip fallbacks and
> > instrumentation" changed them to use __READ_ONCE()/__WRITE_ONCE().
> > From what I see, the intent was to not double-instrument,
> > unfortunately they are still double-instrumented because
> > __READ_ONCE/__WRITE_ONCE doesn't hide the access from KCSAN (nor KASAN
> > actually). I don't think we can use __no_sanitize_or_inline for the
> > arch_ functions, because we really want them to be __always_inline
> > (also to avoid calls to these functions in uaccess regions, which
> > objtool would notice).
> >
> > I think the easiest way to resolve this is to wrap the accesses in
> > __*_ONCE with data_race().
>
> But we can't... because I need arch_atomic_*() and __READ_ONCE() to not
> call out to _ANYTHING_.
>
> Sadly, because the compilers are 'broken' that whole __no_sanitize thing
> didn't work, but I'll be moving a whole bunch of code into .c files with
> all the sanitizers killed dead. And we'll be validating it'll not be
> calling out to anything.
>
> data_race() will include active calls to kcsan_{dis,en}able_current(),
> and this must not happen.

Only if instrumentation is enabled for the compilation unit. If you
have KCSAN_SANITIZE_foo.c := n, no calls are emitted not even to
kcsan_{dis,en}able_current(). Does that help?

By default, right now __READ_ONCE() will still generate a call due to
instrumentation (call to __tsan_readX).

Thanks,
-- Marco

2020-05-12 21:16:48

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Tue, May 12, 2020 at 09:07:55PM +0200, Peter Zijlstra wrote:
> On Tue, May 12, 2020 at 07:53:00PM +0200, Marco Elver wrote:
> > I just ran a bunch of KCSAN tests. While this series alone would have
> > passed the tests, there appears to be a problem with
> > __READ_ONCE/__WRITE_ONCE. I think they should already be using
> > 'data_race()', as otherwise we will get lots of false positives in
> > future.
> >
> > I noticed this when testing -tip/locking/kcsan, which breaks
> > unfortunately, because I see a bunch of spurious data races with
> > arch_atomic_{read,set} because "locking/atomics: Flip fallbacks and
> > instrumentation" changed them to use __READ_ONCE()/__WRITE_ONCE().
> > From what I see, the intent was to not double-instrument,
> > unfortunately they are still double-instrumented because
> > __READ_ONCE/__WRITE_ONCE doesn't hide the access from KCSAN (nor KASAN
> > actually). I don't think we can use __no_sanitize_or_inline for the
> > arch_ functions, because we really want them to be __always_inline
> > (also to avoid calls to these functions in uaccess regions, which
> > objtool would notice).
> >
> > I think the easiest way to resolve this is to wrap the accesses in
> > __*_ONCE with data_race().
>
> But we can't... because I need arch_atomic_*() and __READ_ONCE() to not
> call out to _ANYTHING_.
>
> Sadly, because the compilers are 'broken' that whole __no_sanitize thing
> didn't work, but I'll be moving a whole bunch of code into .c files with
> all the sanitizers killed dead. And we'll be validating it'll not be
> calling out to anything.

Hmm, I may have just run into this problem too. I'm using clang 11.0.1,
but even if I do something like:

unsigned long __no_sanitize_or_inline foo(unsigned long *p)
{
return READ_ONCE_NOCHECK(*p);
}

then I /still/ get calls to __tcsan_func_{entry,exit} emitted by the
compiler. Marco -- how do you turn this thing off?!

I'm also not particularly fond of treating __{READ,WRITE}ONCE() as "atomic",
since they're allowed to tear and I think callers should probably either be
using data_race() explicitly or disabling instrumentation (assuming that's
possible).

Will

2020-05-12 22:03:28

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Tue, 12 May 2020 at 23:15, Will Deacon <[email protected]> wrote:
>
> On Tue, May 12, 2020 at 09:07:55PM +0200, Peter Zijlstra wrote:
> > On Tue, May 12, 2020 at 07:53:00PM +0200, Marco Elver wrote:
> > > I just ran a bunch of KCSAN tests. While this series alone would have
> > > passed the tests, there appears to be a problem with
> > > __READ_ONCE/__WRITE_ONCE. I think they should already be using
> > > 'data_race()', as otherwise we will get lots of false positives in
> > > future.
> > >
> > > I noticed this when testing -tip/locking/kcsan, which breaks
> > > unfortunately, because I see a bunch of spurious data races with
> > > arch_atomic_{read,set} because "locking/atomics: Flip fallbacks and
> > > instrumentation" changed them to use __READ_ONCE()/__WRITE_ONCE().
> > > From what I see, the intent was to not double-instrument,
> > > unfortunately they are still double-instrumented because
> > > __READ_ONCE/__WRITE_ONCE doesn't hide the access from KCSAN (nor KASAN
> > > actually). I don't think we can use __no_sanitize_or_inline for the
> > > arch_ functions, because we really want them to be __always_inline
> > > (also to avoid calls to these functions in uaccess regions, which
> > > objtool would notice).
> > >
> > > I think the easiest way to resolve this is to wrap the accesses in
> > > __*_ONCE with data_race().
> >
> > But we can't... because I need arch_atomic_*() and __READ_ONCE() to not
> > call out to _ANYTHING_.
> >
> > Sadly, because the compilers are 'broken' that whole __no_sanitize thing
> > didn't work, but I'll be moving a whole bunch of code into .c files with
> > all the sanitizers killed dead. And we'll be validating it'll not be
> > calling out to anything.
>
> Hmm, I may have just run into this problem too. I'm using clang 11.0.1,
> but even if I do something like:
>
> unsigned long __no_sanitize_or_inline foo(unsigned long *p)
> {
> return READ_ONCE_NOCHECK(*p);
> }
>
> then I /still/ get calls to __tcsan_func_{entry,exit} emitted by the
> compiler. Marco -- how do you turn this thing off?!

For Clang we have an option ("-mllvm
-tsan-instrument-func-entry-exit=0"), for GCC, I don't think we have
the option.

I had hoped we could keep these compiler changes optional for now, to
not require a very recent compiler. I'll send a patch to enable the
option, but keep it optional for now. Or do you think we require the
compiler to support this? Because then we'll only support Clang.

> I'm also not particularly fond of treating __{READ,WRITE}ONCE() as "atomic",
> since they're allowed to tear and I think callers should probably either be
> using data_race() explicitly or disabling instrumentation (assuming that's
> possible).

That point is fair enough. But how do we fix arch_atomic_{read,set} then?

Thanks,
-- Marco

2020-05-13 11:53:08

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Wed, 13 May 2020 at 13:11, Peter Zijlstra <[email protected]> wrote:
>
> On Tue, May 12, 2020 at 10:31:44PM +0200, Marco Elver wrote:
> > On Tue, 12 May 2020 at 21:08, Peter Zijlstra <[email protected]> wrote:
>
> > > data_race() will include active calls to kcsan_{dis,en}able_current(),
> > > and this must not happen.
> >
> > Only if instrumentation is enabled for the compilation unit. If you
> > have KCSAN_SANITIZE_foo.c := n, no calls are emitted not even to
> > kcsan_{dis,en}able_current(). Does that help?
> >
> > By default, right now __READ_ONCE() will still generate a call due to
> > instrumentation (call to __tsan_readX).
>
> Ah, so looking at:
>
> #define data_race(expr) \
> ({ \
> __kcsan_disable_current(); \
> ({ \
> __unqual_scalar_typeof(({ expr; })) __v = ({ expr; }); \
> __kcsan_enable_current(); \
> __v; \
> }); \
> })
>
> had me confused, but then you've got this squirreled away in another
> header:
>
> #ifdef __SANITIZE_THREAD__
> /*
> * Only calls into the runtime when the particular compilation unit has KCSAN
> * instrumentation enabled. May be used in header files.
> */
> #define kcsan_check_access __kcsan_check_access
>
> /*
> * Only use these to disable KCSAN for accesses in the current compilation unit;
> * calls into libraries may still perform KCSAN checks.
> */
> #define __kcsan_disable_current kcsan_disable_current
> #define __kcsan_enable_current kcsan_enable_current_nowarn
> #else
> static inline void kcsan_check_access(const volatile void *ptr, size_t size,
> int type) { }
> static inline void __kcsan_enable_current(void) { }
> static inline void __kcsan_disable_current(void) { }
> #endif
>
> And I suppose KCSAN_SANITIZE := n, results in __SANITIZE_THREAD__ not
> being defined.
>
> I really hate the function attribute situation, that is some ill
> considered trainwreck.
>
> Looking at this more, I found you already have:
>
> arch/x86/kernel/Makefile:KCSAN_SANITIZE := n
> arch/x86/kernel/Makefile:KCOV_INSTRUMENT := n
> arch/x86/mm/Makefile:KCSAN_SANITIZE := n
>
> So how about I complete that and kill everhthing for all arch/x86/ that
> has DEFINE_IDTENTRY*() in.
>
> That avoids me having to do a lot of work to split up the tricky bits.
> You didn't think it was important, so why should I bother.
>
> So then I end up with something like the below, and I've validated that
> does not generate instrumentation... HOWEVER, I now need ~10g of memory
> and many seconds to compile each file in arch/x86/kernel/.
>
> That is, when I do 'make arch/x86/kernel/ -j8', it is slow enough that I
> can run top and grab:
>
> 31249 root 20 0 6128580 4.1g 13092 R 100.0 13.1 0:16.29 cc1
> 31278 root 20 0 6259456 4.4g 12932 R 100.0 13.9 0:16.27 cc1
> 31286 root 20 0 7243160 4.9g 13028 R 100.0 15.5 0:16.26 cc1
> 31289 root 20 0 5933824 4.0g 12936 R 100.0 12.8 0:16.26 cc1
> 31331 root 20 0 4250924 2.9g 13016 R 100.0 9.3 0:09.54 cc1
> 31346 root 20 0 1939552 1.3g 13028 R 100.0 4.1 0:07.01 cc1
> 31238 root 20 0 6293524 4.1g 13008 R 100.0 13.0 0:16.29 cc1
> 31259 root 20 0 6817076 4.7g 12956 R 100.0 14.9 0:16.27 cc1
>
> and it then triggers OOMs, while previously I could build kernels with
> -j80 on that machine:
>
> 31289 root 20 0 10.8g 6.2g 884 R 100.0 19.7 1:01.56 cc1
> 31249 root 20 0 10.2g 6.1g 484 R 100.0 19.3 1:00.10 cc1
> 31331 root 20 0 10.3g 7.2g 496 R 100.0 23.1 0:53.95 cc1
>
> Only 3 left, because the others OOM'ed.
>
> This is gcc-8.3, the situation with gcc-10 seems marginally better, but
> still atrocious.
>
> ---
> diff --git a/arch/x86/entry/Makefile b/arch/x86/entry/Makefile
> index b7a5790d8d63..ff959f0209e7 100644
> --- a/arch/x86/entry/Makefile
> +++ b/arch/x86/entry/Makefile
> @@ -6,6 +6,7 @@
> KASAN_SANITIZE := n
> UBSAN_SANITIZE := n
> KCOV_INSTRUMENT := n
> +KCSAN_INSTRUMENT := n
>
> CFLAGS_REMOVE_common.o = $(CC_FLAGS_FTRACE) -fstack-protector -fstack-protector-strong
> CFLAGS_REMOVE_syscall_32.o = $(CC_FLAGS_FTRACE) -fstack-protector -fstack-protector-strong
> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> index d6d61c4455fa..f2a46a87026e 100644
> --- a/arch/x86/kernel/Makefile
> +++ b/arch/x86/kernel/Makefile
> @@ -22,15 +22,18 @@ CFLAGS_REMOVE_early_printk.o = -pg
> CFLAGS_REMOVE_head64.o = -pg
> endif
>
> -KASAN_SANITIZE_head$(BITS).o := n
> -KASAN_SANITIZE_dumpstack.o := n
> -KASAN_SANITIZE_dumpstack_$(BITS).o := n
> -KASAN_SANITIZE_stacktrace.o := n
> -KASAN_SANITIZE_paravirt.o := n
> -
> -# With some compiler versions the generated code results in boot hangs, caused
> -# by several compilation units. To be safe, disable all instrumentation.
> -KCSAN_SANITIZE := n
> +#
> +# You cannot instrument entry code, that results in definite problems.
> +# In particular, anything with DEFINE_IDTENTRY*() in must not have
> +# instrumentation on.
> +#
> +# If only function attributes and inlining would work properly, without
> +# that untangling this is a giant trainwreck, don't attempt.
> +#
> +KASAN_SANITIZE := n
> +UBSAN_SANITIZE := n
> +KCOV_INSTRUMENT := n
> +KCSAN_INSTRUMENT := n
>
> OBJECT_FILES_NON_STANDARD_test_nx.o := y
> OBJECT_FILES_NON_STANDARD_paravirt_patch.o := y
> @@ -39,11 +42,6 @@ ifdef CONFIG_FRAME_POINTER
> OBJECT_FILES_NON_STANDARD_ftrace_$(BITS).o := y
> endif
>
> -# If instrumentation of this dir is enabled, boot hangs during first second.
> -# Probably could be more selective here, but note that files related to irqs,
> -# boot, dumpstack/stacktrace, etc are either non-interesting or can lead to
> -# non-deterministic coverage.
> -KCOV_INSTRUMENT := n
>
> CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace
>
> diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
> index f7fd0e868c9c..f8d7e7432847 100644
> --- a/arch/x86/mm/Makefile
> +++ b/arch/x86/mm/Makefile
> @@ -1,15 +1,17 @@
> # SPDX-License-Identifier: GPL-2.0
> -# Kernel does not boot with instrumentation of tlb.c and mem_encrypt*.c
> -KCOV_INSTRUMENT_tlb.o := n
> -KCOV_INSTRUMENT_mem_encrypt.o := n
> -KCOV_INSTRUMENT_mem_encrypt_identity.o := n
>
> -KASAN_SANITIZE_mem_encrypt.o := n
> -KASAN_SANITIZE_mem_encrypt_identity.o := n
> -
> -# Disable KCSAN entirely, because otherwise we get warnings that some functions
> -# reference __initdata sections.
> -KCSAN_SANITIZE := n
> +#
> +# You cannot instrument entry code, that results in definite problems.
> +# In particular, anything with DEFINE_IDTENTRY*() in must not have
> +# instrumentation on.
> +#
> +# If only function attributes and inlining would work properly, without
> +# that untangling this is a giant trainwreck, don't attempt.
> +#
> +KASAN_SANITIZE := n
> +UBSAN_SANITIZE := n
> +KCOV_INSTRUMENT := n
> +KCSAN_INSTRUMENT := n
>
> ifdef CONFIG_FUNCTION_TRACER
> CFLAGS_REMOVE_mem_encrypt.o = -pg
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 3bb962959d8b..48f85d1d2db6 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -241,7 +241,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> * atomicity or dependency ordering guarantees. Note that this may result
> * in tears!
> */
> -#define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))
> +#define __READ_ONCE(x) data_race((*(const volatile __unqual_scalar_typeof(x) *)&(x)))
>
> #define __READ_ONCE_SCALAR(x) \
> ({ \
> @@ -260,7 +260,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
>
> #define __WRITE_ONCE(x, val) \
> do { \
> - *(volatile typeof(x) *)&(x) = (val); \
> + data_race(*(volatile typeof(x) *)&(x) = (val)); \
> } while (0)
>
> #define __WRITE_ONCE_SCALAR(x, val) \
>

Disabling most instrumentation for arch/x86 is reasonable. Also fine
with the __READ_ONCE/__WRITE_ONCE changes (your improved
compiler-friendlier version).

We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
(as Will suggested) *and* avoid double-instrumentation in arch_atomic.
If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
now.

Thanks,
-- Marco

2020-05-13 12:34:59

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:

> Disabling most instrumentation for arch/x86 is reasonable. Also fine
> with the __READ_ONCE/__WRITE_ONCE changes (your improved
> compiler-friendlier version).
>
> We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
> (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
> If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
> data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
> now.

Right, if/when people want sanitize crud enabled for x86 I need
something that:

- can mark a function 'no_sanitize' and all code that gets inlined into
that function must automagically also not get sanitized. ie. make
inline work like macros (again).

And optionally:

- can mark a function explicitly 'sanitize', and only when an explicit
sanitize and no_sanitize mix in inlining give the current
incompatible attribute splat.

That way we can have the noinstr function attribute imply no_sanitize
and frob the DEFINE_IDTENTRY*() macros to use (a new) sanitize_or_inline
helper instead of __always_inline for __##func().

2020-05-13 12:44:49

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Wed, May 13, 2020 at 02:32:43PM +0200, Peter Zijlstra wrote:
> On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:
>
> > Disabling most instrumentation for arch/x86 is reasonable. Also fine
> > with the __READ_ONCE/__WRITE_ONCE changes (your improved
> > compiler-friendlier version).
> >
> > We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
> > (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
> > If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
> > data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
> > now.

I agree that Peter's patch is the right thing to do for now. I was hoping we
could instrument __{READ,WRITE}_ONCE(), but that we before I realised that
__no_sanitize_or_inline doesn't seem to do anything.

> Right, if/when people want sanitize crud enabled for x86 I need
> something that:
>
> - can mark a function 'no_sanitize' and all code that gets inlined into
> that function must automagically also not get sanitized. ie. make
> inline work like macros (again).
>
> And optionally:
>
> - can mark a function explicitly 'sanitize', and only when an explicit
> sanitize and no_sanitize mix in inlining give the current
> incompatible attribute splat.
>
> That way we can have the noinstr function attribute imply no_sanitize
> and frob the DEFINE_IDTENTRY*() macros to use (a new) sanitize_or_inline
> helper instead of __always_inline for __##func().

Sounds like a good plan to me, assuming the compiler folks are onboard.
In the meantime, can we kill __no_sanitize_or_inline and put it back to
the old __no_kasan_or_inline, which I think simplifies compiler.h and
doesn't mislead people into using the function annotation to avoid KCSAN?

READ_ONCE_NOCHECK should also probably be READ_ONCE_NOKASAN, but I
appreciate that's a noisier change.

Will

2020-05-13 13:23:12

by David Laight

[permalink] [raw]

Subject: RE: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

From: Will Deacon
> Sent: 13 May 2020 13:40
> On Wed, May 13, 2020 at 02:32:43PM +0200, Peter Zijlstra wrote:
> > On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:
> >
> > > Disabling most instrumentation for arch/x86 is reasonable. Also fine
> > > with the __READ_ONCE/__WRITE_ONCE changes (your improved
> > > compiler-friendlier version).
> > >
> > > We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
> > > (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
> > > If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
> > > data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
> > > now.
>
> I agree that Peter's patch is the right thing to do for now. I was hoping we
> could instrument __{READ,WRITE}_ONCE(), but that we before I realised that
> __no_sanitize_or_inline doesn't seem to do anything.

Could something be done that put the addresses of the instructions
into a separate segment and have KASAN check that table before
reporting an actual error?

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2020-05-13 14:01:52

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Wed, 13 May 2020 at 15:24, Peter Zijlstra <[email protected]> wrote:
>
> On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> > So far so good, except: both __no_sanitize_or_inline and
> > __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> > just doesn't avoid explicit kcsan_check calls, like those in
> > READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> > just because macros won't be redefined just for __no_sanitize
> > functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> > access is unchecked.
> >
> > This will have the expected result:
> > __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
> >
> > This will not work as expected:
> > __no_sanitize_or_inline void foo(void) { READ_ONCE(x); } // data
> > races are reported
> >
> > All this could be fixed if GCC devs would finally take my patch to
> > make -fsanitize=thread distinguish volatile [1], but then we have to
> > wait ~years for the new compilers to reach us. So please don't hold
> > your breath for this one any time soon.
> > [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544452.html
>
> Right, but that does not address the much larger issue of the attribute
> vs inline tranwreck :/

Could you check if Clang is equally broken for you? I think GCC and
Clang have differing behaviour on this. No idea what it takes to fix
GCC though.

> Also, could not this compiler instrumentation live as a kernel specific
> GCC-plugin instead of being part of GCC proper? Because in that case,
> we'd have much better control over it.

I'd like it if we could make it a GCC-plugin for GCC, but how? I don't
see a way to affect TSAN instrumentation. FWIW Clang already has
distinguish-volatile support (unreleased Clang 11).

Thanks,
-- Marco

2020-05-13 16:34:40

by Thomas Gleixner

[permalink] [raw]

Subject: RE: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

David Laight <[email protected]> writes:
> From: Will Deacon
>> Sent: 13 May 2020 13:40
>> On Wed, May 13, 2020 at 02:32:43PM +0200, Peter Zijlstra wrote:
>> > On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:
>> >
>> > > Disabling most instrumentation for arch/x86 is reasonable. Also fine
>> > > with the __READ_ONCE/__WRITE_ONCE changes (your improved
>> > > compiler-friendlier version).
>> > >
>> > > We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
>> > > (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
>> > > If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
>> > > data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
>> > > now.
>>
>> I agree that Peter's patch is the right thing to do for now. I was hoping we
>> could instrument __{READ,WRITE}_ONCE(), but that we before I realised that
>> __no_sanitize_or_inline doesn't seem to do anything.
>
> Could something be done that put the addresses of the instructions
> into a separate segment and have KASAN check that table before
> reporting an actual error?

That still does not solve the problem that the compiler adds calls into
k*san which we need to prevent in the first place.

Thanks,

tglx

2020-05-13 16:52:28

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> On Wed, 13 May 2020 at 14:40, Will Deacon <[email protected]> wrote:
> >
> > On Wed, May 13, 2020 at 02:32:43PM +0200, Peter Zijlstra wrote:
> > > On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:
> > >
> > > > Disabling most instrumentation for arch/x86 is reasonable. Also fine
> > > > with the __READ_ONCE/__WRITE_ONCE changes (your improved
> > > > compiler-friendlier version).
> > > >
> > > > We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
> > > > (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
> > > > If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
> > > > data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
> > > > now.
> >
> > I agree that Peter's patch is the right thing to do for now. I was hoping we
> > could instrument __{READ,WRITE}_ONCE(), but that we before I realised that
> > __no_sanitize_or_inline doesn't seem to do anything.
> >
> > > Right, if/when people want sanitize crud enabled for x86 I need
> > > something that:
> > >
> > > - can mark a function 'no_sanitize' and all code that gets inlined into
> > > that function must automagically also not get sanitized. ie. make
> > > inline work like macros (again).
> > >
> > > And optionally:
> > >
> > > - can mark a function explicitly 'sanitize', and only when an explicit
> > > sanitize and no_sanitize mix in inlining give the current
> > > incompatible attribute splat.
> > >
> > > That way we can have the noinstr function attribute imply no_sanitize
> > > and frob the DEFINE_IDTENTRY*() macros to use (a new) sanitize_or_inline
> > > helper instead of __always_inline for __##func().
> >
> > Sounds like a good plan to me, assuming the compiler folks are onboard.
> > In the meantime, can we kill __no_sanitize_or_inline and put it back to
> > the old __no_kasan_or_inline, which I think simplifies compiler.h and
> > doesn't mislead people into using the function annotation to avoid KCSAN?
> >
> > READ_ONCE_NOCHECK should also probably be READ_ONCE_NOKASAN, but I
> > appreciate that's a noisier change.
>
> So far so good, except: both __no_sanitize_or_inline and
> __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> just doesn't avoid explicit kcsan_check calls, like those in
> READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> just because macros won't be redefined just for __no_sanitize
> functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> access is unchecked.
>
> This will have the expected result:
> __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
>
> This will not work as expected:
> __no_sanitize_or_inline void foo(void) { READ_ONCE(x); } // data
> races are reported

But the problem is that *this* does not work as expected:

unsigned long __no_sanitize_or_inline foo(unsigned long *ptr)
{
return READ_ONCE_NOCHECK(*ptr);
}

which I think means that the function annotation is practically useless.

Will

2020-05-13 17:38:04

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Wed, 13 May 2020 at 18:50, Will Deacon <[email protected]> wrote:
>
> On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> > On Wed, 13 May 2020 at 14:40, Will Deacon <[email protected]> wrote:
> > >
> > > On Wed, May 13, 2020 at 02:32:43PM +0200, Peter Zijlstra wrote:
> > > > On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:
> > > >
> > > > > Disabling most instrumentation for arch/x86 is reasonable. Also fine
> > > > > with the __READ_ONCE/__WRITE_ONCE changes (your improved
> > > > > compiler-friendlier version).
> > > > >
> > > > > We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
> > > > > (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
> > > > > If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
> > > > > data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
> > > > > now.
> > >
> > > I agree that Peter's patch is the right thing to do for now. I was hoping we
> > > could instrument __{READ,WRITE}_ONCE(), but that we before I realised that
> > > __no_sanitize_or_inline doesn't seem to do anything.
> > >
> > > > Right, if/when people want sanitize crud enabled for x86 I need
> > > > something that:
> > > >
> > > > - can mark a function 'no_sanitize' and all code that gets inlined into
> > > > that function must automagically also not get sanitized. ie. make
> > > > inline work like macros (again).
> > > >
> > > > And optionally:
> > > >
> > > > - can mark a function explicitly 'sanitize', and only when an explicit
> > > > sanitize and no_sanitize mix in inlining give the current
> > > > incompatible attribute splat.
> > > >
> > > > That way we can have the noinstr function attribute imply no_sanitize
> > > > and frob the DEFINE_IDTENTRY*() macros to use (a new) sanitize_or_inline
> > > > helper instead of __always_inline for __##func().
> > >
> > > Sounds like a good plan to me, assuming the compiler folks are onboard.
> > > In the meantime, can we kill __no_sanitize_or_inline and put it back to
> > > the old __no_kasan_or_inline, which I think simplifies compiler.h and
> > > doesn't mislead people into using the function annotation to avoid KCSAN?
> > >
> > > READ_ONCE_NOCHECK should also probably be READ_ONCE_NOKASAN, but I
> > > appreciate that's a noisier change.
> >
> > So far so good, except: both __no_sanitize_or_inline and
> > __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> > just doesn't avoid explicit kcsan_check calls, like those in
> > READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> > just because macros won't be redefined just for __no_sanitize
> > functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> > access is unchecked.
> >
> > This will have the expected result:
> > __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
> >
> > This will not work as expected:
> > __no_sanitize_or_inline void foo(void) { READ_ONCE(x); } // data
> > races are reported
>
> But the problem is that *this* does not work as expected:
>
> unsigned long __no_sanitize_or_inline foo(unsigned long *ptr)
> {
> return READ_ONCE_NOCHECK(*ptr);
> }
>
> which I think means that the function annotation is practically useless.

Let me understand the problem better:

- We do not want __tsan_func_entry/exit (looking at the disassembly,
these aren't always generated).
- We do not want kcsan_disable/enable calls (with the new __READ_ONCE version).
- We do *not* want the call to __read_once_word_nocheck if we have
__no_sanitize_or_inline. AFAIK that's the main problem -- this applies
to both KASAN and KCSAN.

From what I gather, we want to just compile the function as if the
sanitizer was never enabled. One reason for why this doesn't quite
work is because of the preprocessor.

Note that the sanitizers won't complain about these accesses, which
unfortunately is all these attributes ever were documented to do. So
the attributes aren't completely useless. Why doesn't
K[AC]SAN_SANITIZE := n work?

Thanks,
-- Marco

2020-05-13 17:50:32

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Wed, May 13, 2020 at 07:32:58PM +0200, Marco Elver wrote:
> On Wed, 13 May 2020 at 18:50, Will Deacon <[email protected]> wrote:
> >
> > On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> > > On Wed, 13 May 2020 at 14:40, Will Deacon <[email protected]> wrote:
> > > >
> > > > On Wed, May 13, 2020 at 02:32:43PM +0200, Peter Zijlstra wrote:
> > > > > On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:
> > > > >
> > > > > > Disabling most instrumentation for arch/x86 is reasonable. Also fine
> > > > > > with the __READ_ONCE/__WRITE_ONCE changes (your improved
> > > > > > compiler-friendlier version).
> > > > > >
> > > > > > We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
> > > > > > (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
> > > > > > If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
> > > > > > data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
> > > > > > now.
> > > >
> > > > I agree that Peter's patch is the right thing to do for now. I was hoping we
> > > > could instrument __{READ,WRITE}_ONCE(), but that we before I realised that
> > > > __no_sanitize_or_inline doesn't seem to do anything.
> > > >
> > > > > Right, if/when people want sanitize crud enabled for x86 I need
> > > > > something that:
> > > > >
> > > > > - can mark a function 'no_sanitize' and all code that gets inlined into
> > > > > that function must automagically also not get sanitized. ie. make
> > > > > inline work like macros (again).
> > > > >
> > > > > And optionally:
> > > > >
> > > > > - can mark a function explicitly 'sanitize', and only when an explicit
> > > > > sanitize and no_sanitize mix in inlining give the current
> > > > > incompatible attribute splat.
> > > > >
> > > > > That way we can have the noinstr function attribute imply no_sanitize
> > > > > and frob the DEFINE_IDTENTRY*() macros to use (a new) sanitize_or_inline
> > > > > helper instead of __always_inline for __##func().
> > > >
> > > > Sounds like a good plan to me, assuming the compiler folks are onboard.
> > > > In the meantime, can we kill __no_sanitize_or_inline and put it back to
> > > > the old __no_kasan_or_inline, which I think simplifies compiler.h and
> > > > doesn't mislead people into using the function annotation to avoid KCSAN?
> > > >
> > > > READ_ONCE_NOCHECK should also probably be READ_ONCE_NOKASAN, but I
> > > > appreciate that's a noisier change.
> > >
> > > So far so good, except: both __no_sanitize_or_inline and
> > > __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> > > just doesn't avoid explicit kcsan_check calls, like those in
> > > READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> > > just because macros won't be redefined just for __no_sanitize
> > > functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> > > access is unchecked.
> > >
> > > This will have the expected result:
> > > __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
> > >
> > > This will not work as expected:
> > > __no_sanitize_or_inline void foo(void) { READ_ONCE(x); } // data
> > > races are reported
> >
> > But the problem is that *this* does not work as expected:
> >
> > unsigned long __no_sanitize_or_inline foo(unsigned long *ptr)
> > {
> > return READ_ONCE_NOCHECK(*ptr);
> > }
> >
> > which I think means that the function annotation is practically useless.
>
> Let me understand the problem better:
>
> - We do not want __tsan_func_entry/exit (looking at the disassembly,
> these aren't always generated).
> - We do not want kcsan_disable/enable calls (with the new __READ_ONCE version).
> - We do *not* want the call to __read_once_word_nocheck if we have
> __no_sanitize_or_inline. AFAIK that's the main problem -- this applies
> to both KASAN and KCSAN.

Sorry, I should've been more explicit. The code above, with KASAN enabled,
compiles to:

ffffffff810a2d50 <foo>:
ffffffff810a2d50: 48 8b 07 mov (%rdi),%rax
ffffffff810a2d53: c3 retq

but with KCSAN enabled, compiles to:

ffffffff8109ecd0 <foo>:
ffffffff8109ecd0: 53 push %rbx
ffffffff8109ecd1: 48 89 fb mov %rdi,%rbx
ffffffff8109ecd4: 48 8b 7c 24 08 mov 0x8(%rsp),%rdi
ffffffff8109ecd9: e8 52 9c 1a 00 callq ffffffff81248930 <__tsan_func_entry>
ffffffff8109ecde: 48 89 df mov %rbx,%rdi
ffffffff8109ece1: e8 1a 00 00 00 callq ffffffff8109ed00 <__read_once_word_nocheck>
ffffffff8109ece6: 48 89 c3 mov %rax,%rbx
ffffffff8109ece9: e8 52 9c 1a 00 callq ffffffff81248940 <__tsan_func_exit>
ffffffff8109ecee: 48 89 d8 mov %rbx,%rax
ffffffff8109ecf1: 5b pop %rbx
ffffffff8109ecf2: c3 retq

Is that expected? There don't appear to be any more annotations to throw
at it.

> From what I gather, we want to just compile the function as if the
> sanitizer was never enabled. One reason for why this doesn't quite
> work is because of the preprocessor.
>
> Note that the sanitizers won't complain about these accesses, which
> unfortunately is all these attributes ever were documented to do. So
> the attributes aren't completely useless. Why doesn't
> K[AC]SAN_SANITIZE := n work?

I just don't get the point in having a function annotation if you then have to
pass flags at the per-object level. That also then necessitates either weird
refactoring and grouping of code into "noinstrument.c" type files, or blanket
disabling of instrumentation for things like arch/x86/

Will

2020-05-13 20:36:27

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Wed, May 13, 2020 at 01:10:57PM +0200, Peter Zijlstra wrote:

> So then I end up with something like the below, and I've validated that
> does not generate instrumentation... HOWEVER, I now need ~10g of memory
> and many seconds to compile each file in arch/x86/kernel/.

> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 3bb962959d8b..48f85d1d2db6 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -241,7 +241,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> * atomicity or dependency ordering guarantees. Note that this may result
> * in tears!
> */
> -#define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))
> +#define __READ_ONCE(x) data_race((*(const volatile __unqual_scalar_typeof(x) *)&(x)))
>
> #define __READ_ONCE_SCALAR(x) \
> ({ \
> @@ -260,7 +260,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
>
> #define __WRITE_ONCE(x, val) \
> do { \
> - *(volatile typeof(x) *)&(x) = (val); \
> + data_race(*(volatile typeof(x) *)&(x) = (val)); \
> } while (0)
>
> #define __WRITE_ONCE_SCALAR(x, val) \

The above is responsible for that, the below variant is _MUCH_ better
again. It really doesn't like nested data_race(), as in _REALLY_ doesn't
like.

---

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 3bb962959d8b..2ea532b19e75 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -241,12 +241,12 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
* atomicity or dependency ordering guarantees. Note that this may result
* in tears!
*/
-#define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))
+#define __READ_ONCE(x) data_race((*(const volatile __unqual_scalar_typeof(x) *)&(x)))

#define __READ_ONCE_SCALAR(x) \
({ \
typeof(x) *__xp = &(x); \
- __unqual_scalar_typeof(x) __x = data_race(__READ_ONCE(*__xp)); \
+ __unqual_scalar_typeof(x) __x = __READ_ONCE(*__xp); \
kcsan_check_atomic_read(__xp, sizeof(*__xp)); \
smp_read_barrier_depends(); \
(typeof(x))__x; \
@@ -260,14 +260,14 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,

#define __WRITE_ONCE(x, val) \
do { \
- *(volatile typeof(x) *)&(x) = (val); \
+ data_race(*(volatile typeof(x) *)&(x) = (val)); \
} while (0)

#define __WRITE_ONCE_SCALAR(x, val) \
do { \
typeof(x) *__xp = &(x); \
kcsan_check_atomic_write(__xp, sizeof(*__xp)); \
- data_race(({ __WRITE_ONCE(*__xp, val); 0; })); \
+ __WRITE_ONCE(*__xp, val); \
} while (0)

#define WRITE_ONCE(x, val) \

2020-05-13 20:37:16

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Tue, May 12, 2020 at 10:31:44PM +0200, Marco Elver wrote:
> On Tue, 12 May 2020 at 21:08, Peter Zijlstra <[email protected]> wrote:

> > data_race() will include active calls to kcsan_{dis,en}able_current(),
> > and this must not happen.
>
> Only if instrumentation is enabled for the compilation unit. If you
> have KCSAN_SANITIZE_foo.c := n, no calls are emitted not even to
> kcsan_{dis,en}able_current(). Does that help?
>
> By default, right now __READ_ONCE() will still generate a call due to
> instrumentation (call to __tsan_readX).

Ah, so looking at:

#define data_race(expr) \
({ \
__kcsan_disable_current(); \
({ \
__unqual_scalar_typeof(({ expr; })) __v = ({ expr; }); \
__kcsan_enable_current(); \
__v; \
}); \
})

had me confused, but then you've got this squirreled away in another
header:

#ifdef __SANITIZE_THREAD__
/*
* Only calls into the runtime when the particular compilation unit has KCSAN
* instrumentation enabled. May be used in header files.
*/
#define kcsan_check_access __kcsan_check_access

/*
* Only use these to disable KCSAN for accesses in the current compilation unit;
* calls into libraries may still perform KCSAN checks.
*/
#define __kcsan_disable_current kcsan_disable_current
#define __kcsan_enable_current kcsan_enable_current_nowarn
#else
static inline void kcsan_check_access(const volatile void *ptr, size_t size,
int type) { }
static inline void __kcsan_enable_current(void) { }
static inline void __kcsan_disable_current(void) { }
#endif

And I suppose KCSAN_SANITIZE := n, results in __SANITIZE_THREAD__ not
being defined.

I really hate the function attribute situation, that is some ill
considered trainwreck.

Looking at this more, I found you already have:

arch/x86/kernel/Makefile:KCSAN_SANITIZE := n
arch/x86/kernel/Makefile:KCOV_INSTRUMENT := n
arch/x86/mm/Makefile:KCSAN_SANITIZE := n

So how about I complete that and kill everhthing for all arch/x86/ that
has DEFINE_IDTENTRY*() in.

That avoids me having to do a lot of work to split up the tricky bits.
You didn't think it was important, so why should I bother.

So then I end up with something like the below, and I've validated that
does not generate instrumentation... HOWEVER, I now need ~10g of memory
and many seconds to compile each file in arch/x86/kernel/.

That is, when I do 'make arch/x86/kernel/ -j8', it is slow enough that I
can run top and grab:

31249 root 20 0 6128580 4.1g 13092 R 100.0 13.1 0:16.29 cc1
31278 root 20 0 6259456 4.4g 12932 R 100.0 13.9 0:16.27 cc1
31286 root 20 0 7243160 4.9g 13028 R 100.0 15.5 0:16.26 cc1
31289 root 20 0 5933824 4.0g 12936 R 100.0 12.8 0:16.26 cc1
31331 root 20 0 4250924 2.9g 13016 R 100.0 9.3 0:09.54 cc1
31346 root 20 0 1939552 1.3g 13028 R 100.0 4.1 0:07.01 cc1
31238 root 20 0 6293524 4.1g 13008 R 100.0 13.0 0:16.29 cc1
31259 root 20 0 6817076 4.7g 12956 R 100.0 14.9 0:16.27 cc1

and it then triggers OOMs, while previously I could build kernels with
-j80 on that machine:

31289 root 20 0 10.8g 6.2g 884 R 100.0 19.7 1:01.56 cc1
31249 root 20 0 10.2g 6.1g 484 R 100.0 19.3 1:00.10 cc1
31331 root 20 0 10.3g 7.2g 496 R 100.0 23.1 0:53.95 cc1

Only 3 left, because the others OOM'ed.

This is gcc-8.3, the situation with gcc-10 seems marginally better, but
still atrocious.

---
diff --git a/arch/x86/entry/Makefile b/arch/x86/entry/Makefile
index b7a5790d8d63..ff959f0209e7 100644
--- a/arch/x86/entry/Makefile
+++ b/arch/x86/entry/Makefile
@@ -6,6 +6,7 @@
KASAN_SANITIZE := n
UBSAN_SANITIZE := n
KCOV_INSTRUMENT := n
+KCSAN_INSTRUMENT := n

CFLAGS_REMOVE_common.o = $(CC_FLAGS_FTRACE) -fstack-protector -fstack-protector-strong
CFLAGS_REMOVE_syscall_32.o = $(CC_FLAGS_FTRACE) -fstack-protector -fstack-protector-strong
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index d6d61c4455fa..f2a46a87026e 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -22,15 +22,18 @@ CFLAGS_REMOVE_early_printk.o = -pg
CFLAGS_REMOVE_head64.o = -pg
endif

-KASAN_SANITIZE_head$(BITS).o := n
-KASAN_SANITIZE_dumpstack.o := n
-KASAN_SANITIZE_dumpstack_$(BITS).o := n
-KASAN_SANITIZE_stacktrace.o := n
-KASAN_SANITIZE_paravirt.o := n
-
-# With some compiler versions the generated code results in boot hangs, caused
-# by several compilation units. To be safe, disable all instrumentation.
-KCSAN_SANITIZE := n
+#
+# You cannot instrument entry code, that results in definite problems.
+# In particular, anything with DEFINE_IDTENTRY*() in must not have
+# instrumentation on.
+#
+# If only function attributes and inlining would work properly, without
+# that untangling this is a giant trainwreck, don't attempt.
+#
+KASAN_SANITIZE := n
+UBSAN_SANITIZE := n
+KCOV_INSTRUMENT := n
+KCSAN_INSTRUMENT := n

OBJECT_FILES_NON_STANDARD_test_nx.o := y
OBJECT_FILES_NON_STANDARD_paravirt_patch.o := y
@@ -39,11 +42,6 @@ ifdef CONFIG_FRAME_POINTER
OBJECT_FILES_NON_STANDARD_ftrace_$(BITS).o := y
endif

-# If instrumentation of this dir is enabled, boot hangs during first second.
-# Probably could be more selective here, but note that files related to irqs,
-# boot, dumpstack/stacktrace, etc are either non-interesting or can lead to
-# non-deterministic coverage.
-KCOV_INSTRUMENT := n

CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace

diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index f7fd0e868c9c..f8d7e7432847 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -1,15 +1,17 @@
# SPDX-License-Identifier: GPL-2.0
-# Kernel does not boot with instrumentation of tlb.c and mem_encrypt*.c
-KCOV_INSTRUMENT_tlb.o := n
-KCOV_INSTRUMENT_mem_encrypt.o := n
-KCOV_INSTRUMENT_mem_encrypt_identity.o := n

-KASAN_SANITIZE_mem_encrypt.o := n
-KASAN_SANITIZE_mem_encrypt_identity.o := n
-
-# Disable KCSAN entirely, because otherwise we get warnings that some functions
-# reference __initdata sections.
-KCSAN_SANITIZE := n
+#
+# You cannot instrument entry code, that results in definite problems.
+# In particular, anything with DEFINE_IDTENTRY*() in must not have
+# instrumentation on.
+#
+# If only function attributes and inlining would work properly, without
+# that untangling this is a giant trainwreck, don't attempt.
+#
+KASAN_SANITIZE := n
+UBSAN_SANITIZE := n
+KCOV_INSTRUMENT := n
+KCSAN_INSTRUMENT := n

ifdef CONFIG_FUNCTION_TRACER
CFLAGS_REMOVE_mem_encrypt.o = -pg
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 3bb962959d8b..48f85d1d2db6 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -241,7 +241,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
* atomicity or dependency ordering guarantees. Note that this may result
* in tears!
*/
-#define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))
+#define __READ_ONCE(x) data_race((*(const volatile __unqual_scalar_typeof(x) *)&(x)))

#define __READ_ONCE_SCALAR(x) \
({ \
@@ -260,7 +260,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,

#define __WRITE_ONCE(x, val) \
do { \
- *(volatile typeof(x) *)&(x) = (val); \
+ data_race(*(volatile typeof(x) *)&(x) = (val)); \
} while (0)

#define __WRITE_ONCE_SCALAR(x, val) \

2020-05-13 20:42:42

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Wed, 13 May 2020 at 14:40, Will Deacon <[email protected]> wrote:
>
> On Wed, May 13, 2020 at 02:32:43PM +0200, Peter Zijlstra wrote:
> > On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:
> >
> > > Disabling most instrumentation for arch/x86 is reasonable. Also fine
> > > with the __READ_ONCE/__WRITE_ONCE changes (your improved
> > > compiler-friendlier version).
> > >
> > > We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
> > > (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
> > > If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
> > > data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
> > > now.
>
> I agree that Peter's patch is the right thing to do for now. I was hoping we
> could instrument __{READ,WRITE}_ONCE(), but that we before I realised that
> __no_sanitize_or_inline doesn't seem to do anything.
>
> > Right, if/when people want sanitize crud enabled for x86 I need
> > something that:
> >
> > - can mark a function 'no_sanitize' and all code that gets inlined into
> > that function must automagically also not get sanitized. ie. make
> > inline work like macros (again).
> >
> > And optionally:
> >
> > - can mark a function explicitly 'sanitize', and only when an explicit
> > sanitize and no_sanitize mix in inlining give the current
> > incompatible attribute splat.
> >
> > That way we can have the noinstr function attribute imply no_sanitize
> > and frob the DEFINE_IDTENTRY*() macros to use (a new) sanitize_or_inline
> > helper instead of __always_inline for __##func().
>
> Sounds like a good plan to me, assuming the compiler folks are onboard.
> In the meantime, can we kill __no_sanitize_or_inline and put it back to
> the old __no_kasan_or_inline, which I think simplifies compiler.h and
> doesn't mislead people into using the function annotation to avoid KCSAN?
>
> READ_ONCE_NOCHECK should also probably be READ_ONCE_NOKASAN, but I
> appreciate that's a noisier change.

So far so good, except: both __no_sanitize_or_inline and
__no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
just doesn't avoid explicit kcsan_check calls, like those in
READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
just because macros won't be redefined just for __no_sanitize
functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
access is unchecked.

This will have the expected result:
__no_sanitize_or_inline void foo(void) { x++; } // no data races reported

This will not work as expected:
__no_sanitize_or_inline void foo(void) { READ_ONCE(x); } // data
races are reported

All this could be fixed if GCC devs would finally take my patch to
make -fsanitize=thread distinguish volatile [1], but then we have to
wait ~years for the new compilers to reach us. So please don't hold
your breath for this one any time soon.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544452.html

Thanks,
-- Marco

2020-05-13 20:44:47

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> So far so good, except: both __no_sanitize_or_inline and
> __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> just doesn't avoid explicit kcsan_check calls, like those in
> READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> just because macros won't be redefined just for __no_sanitize
> functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> access is unchecked.
>
> This will have the expected result:
> __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
>
> This will not work as expected:
> __no_sanitize_or_inline void foo(void) { READ_ONCE(x); } // data
> races are reported
>
> All this could be fixed if GCC devs would finally take my patch to
> make -fsanitize=thread distinguish volatile [1], but then we have to
> wait ~years for the new compilers to reach us. So please don't hold
> your breath for this one any time soon.
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544452.html

Right, but that does not address the much larger issue of the attribute
vs inline tranwreck :/

Also, could not this compiler instrumentation live as a kernel specific
GCC-plugin instead of being part of GCC proper? Because in that case,
we'd have much better control over it.

2020-05-13 20:53:32

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Wed, 13 May 2020 at 19:47, Will Deacon <[email protected]> wrote:
>
> On Wed, May 13, 2020 at 07:32:58PM +0200, Marco Elver wrote:
> > On Wed, 13 May 2020 at 18:50, Will Deacon <[email protected]> wrote:
> > >
> > > On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> > > > On Wed, 13 May 2020 at 14:40, Will Deacon <[email protected]> wrote:
> > > > >
> > > > > On Wed, May 13, 2020 at 02:32:43PM +0200, Peter Zijlstra wrote:
> > > > > > On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:
> > > > > >
> > > > > > > Disabling most instrumentation for arch/x86 is reasonable. Also fine
> > > > > > > with the __READ_ONCE/__WRITE_ONCE changes (your improved
> > > > > > > compiler-friendlier version).
> > > > > > >
> > > > > > > We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
> > > > > > > (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
> > > > > > > If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
> > > > > > > data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
> > > > > > > now.
> > > > >
> > > > > I agree that Peter's patch is the right thing to do for now. I was hoping we
> > > > > could instrument __{READ,WRITE}_ONCE(), but that we before I realised that
> > > > > __no_sanitize_or_inline doesn't seem to do anything.
> > > > >
> > > > > > Right, if/when people want sanitize crud enabled for x86 I need
> > > > > > something that:
> > > > > >
> > > > > > - can mark a function 'no_sanitize' and all code that gets inlined into
> > > > > > that function must automagically also not get sanitized. ie. make
> > > > > > inline work like macros (again).
> > > > > >
> > > > > > And optionally:
> > > > > >
> > > > > > - can mark a function explicitly 'sanitize', and only when an explicit
> > > > > > sanitize and no_sanitize mix in inlining give the current
> > > > > > incompatible attribute splat.
> > > > > >
> > > > > > That way we can have the noinstr function attribute imply no_sanitize
> > > > > > and frob the DEFINE_IDTENTRY*() macros to use (a new) sanitize_or_inline
> > > > > > helper instead of __always_inline for __##func().
> > > > >
> > > > > Sounds like a good plan to me, assuming the compiler folks are onboard.
> > > > > In the meantime, can we kill __no_sanitize_or_inline and put it back to
> > > > > the old __no_kasan_or_inline, which I think simplifies compiler.h and
> > > > > doesn't mislead people into using the function annotation to avoid KCSAN?
> > > > >
> > > > > READ_ONCE_NOCHECK should also probably be READ_ONCE_NOKASAN, but I
> > > > > appreciate that's a noisier change.
> > > >
> > > > So far so good, except: both __no_sanitize_or_inline and
> > > > __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> > > > just doesn't avoid explicit kcsan_check calls, like those in
> > > > READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> > > > just because macros won't be redefined just for __no_sanitize
> > > > functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> > > > access is unchecked.
> > > >
> > > > This will have the expected result:
> > > > __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
> > > >
> > > > This will not work as expected:
> > > > __no_sanitize_or_inline void foo(void) { READ_ONCE(x); } // data
> > > > races are reported
> > >
> > > But the problem is that *this* does not work as expected:
> > >
> > > unsigned long __no_sanitize_or_inline foo(unsigned long *ptr)
> > > {
> > > return READ_ONCE_NOCHECK(*ptr);
> > > }
> > >
> > > which I think means that the function annotation is practically useless.
> >
> > Let me understand the problem better:
> >
> > - We do not want __tsan_func_entry/exit (looking at the disassembly,
> > these aren't always generated).
> > - We do not want kcsan_disable/enable calls (with the new __READ_ONCE version).
> > - We do *not* want the call to __read_once_word_nocheck if we have
> > __no_sanitize_or_inline. AFAIK that's the main problem -- this applies
> > to both KASAN and KCSAN.
>
> Sorry, I should've been more explicit. The code above, with KASAN enabled,
> compiles to:
>
> ffffffff810a2d50 <foo>:
> ffffffff810a2d50: 48 8b 07 mov (%rdi),%rax
> ffffffff810a2d53: c3 retq
>
> but with KCSAN enabled, compiles to:
>
> ffffffff8109ecd0 <foo>:
> ffffffff8109ecd0: 53 push %rbx
> ffffffff8109ecd1: 48 89 fb mov %rdi,%rbx
> ffffffff8109ecd4: 48 8b 7c 24 08 mov 0x8(%rsp),%rdi
> ffffffff8109ecd9: e8 52 9c 1a 00 callq ffffffff81248930 <__tsan_func_entry>
> ffffffff8109ecde: 48 89 df mov %rbx,%rdi
> ffffffff8109ece1: e8 1a 00 00 00 callq ffffffff8109ed00 <__read_once_word_nocheck>
> ffffffff8109ece6: 48 89 c3 mov %rax,%rbx
> ffffffff8109ece9: e8 52 9c 1a 00 callq ffffffff81248940 <__tsan_func_exit>
> ffffffff8109ecee: 48 89 d8 mov %rbx,%rax
> ffffffff8109ecf1: 5b pop %rbx
> ffffffff8109ecf2: c3 retq
>
> Is that expected? There don't appear to be any more annotations to throw
> at it.

Right, so this is expected. We can definitely make
__tsan_func_entry/exit disappear with Clang, with GCC it's going to be
a while if we want to fix it.

If we remove 'noinline' from __no_kcsan_or_inline, we no longer get
the call to __read_once_word_nocheck above! But...

For KCSAN we force 'noinline' because older compilers still inline and
then instrument small functions even if we just have the no_sanitize
attribute (without inline mentioned). The same is actually true for
KASAN, so KASAN's READ_ONCE_NOCHECK might be broken in a few places,
but nobody seems to have noticed [1]. KASAN's __no_kasan_or_inline
should also have a 'noinline' I think. I just tested
__no_kcsan_or_inline without 'noinline', and yes, GCC 9 still decided
to inline a small function and then instrument the accesses.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59600

The good news is that Clang does the right thing when removing
'noinline' from __no_kcsan_or_inline:
1. doesn't inline into functions that are instrumented, and
2. your above example doesn't do the call to __read_once_word_nocheck.

The obvious solution to this is: restrict which compiler we want to support?

> > From what I gather, we want to just compile the function as if the
> > sanitizer was never enabled. One reason for why this doesn't quite
> > work is because of the preprocessor.
> >
> > Note that the sanitizers won't complain about these accesses, which
> > unfortunately is all these attributes ever were documented to do. So
> > the attributes aren't completely useless. Why doesn't
> > K[AC]SAN_SANITIZE := n work?
>
> I just don't get the point in having a function annotation if you then have to
> pass flags at the per-object level. That also then necessitates either weird
> refactoring and grouping of code into "noinstrument.c" type files, or blanket
> disabling of instrumentation for things like arch/x86/

If you want a solution now, here is one way to get us closer to where
we want to be:

1. Peter's patch to add data_race around __READ_ONCE/__WRITE_ONCE.
2. Patch to make __tsan_func_entry/exit disappear with Clang.
3. Remove 'noinline' from __no_kcsan_or_inline.
4. Patch to warn users that KCSAN may have problems with GCC and
should use Clang >= 7.

But this is probably only half a solution.

If you *also* want to fix __READ_ONCE etc not adding calls to
__no_sanitize functions:

5. Remove any mention of data_race and kcsan_check calls from
__{READ,WRITE}_ONCE, {READ,WRITE}_ONCE. [Won't need #1 above.]
6. I'll send a patch to make KCSAN distinguish volatile accesses, and
we will require Clang 11.

That is *if* you insist on __no_sanitize to behave like you suggest.
Note that, at that point, I really don't know how to salvage GCC,
mainly because of fixing __no_sanitize with GCC looking hopeless.

Let me know what you prefer.

Thanks,
-- Marco

2020-05-13 21:28:54

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Wed, May 13, 2020 at 08:54:03PM +0200, Marco Elver wrote:
> On Wed, 13 May 2020 at 19:47, Will Deacon <[email protected]> wrote:
> > On Wed, May 13, 2020 at 07:32:58PM +0200, Marco Elver wrote:
> > > - We do *not* want the call to __read_once_word_nocheck if we have
> > > __no_sanitize_or_inline. AFAIK that's the main problem -- this applies
> > > to both KASAN and KCSAN.
> >
> > Sorry, I should've been more explicit. The code above, with KASAN enabled,
> > compiles to:
> >
> > ffffffff810a2d50 <foo>:
> > ffffffff810a2d50: 48 8b 07 mov (%rdi),%rax
> > ffffffff810a2d53: c3 retq
> >
> > but with KCSAN enabled, compiles to:
> >
> > ffffffff8109ecd0 <foo>:
> > ffffffff8109ecd0: 53 push %rbx
> > ffffffff8109ecd1: 48 89 fb mov %rdi,%rbx
> > ffffffff8109ecd4: 48 8b 7c 24 08 mov 0x8(%rsp),%rdi
> > ffffffff8109ecd9: e8 52 9c 1a 00 callq ffffffff81248930 <__tsan_func_entry>
> > ffffffff8109ecde: 48 89 df mov %rbx,%rdi
> > ffffffff8109ece1: e8 1a 00 00 00 callq ffffffff8109ed00 <__read_once_word_nocheck>
> > ffffffff8109ece6: 48 89 c3 mov %rax,%rbx
> > ffffffff8109ece9: e8 52 9c 1a 00 callq ffffffff81248940 <__tsan_func_exit>
> > ffffffff8109ecee: 48 89 d8 mov %rbx,%rax
> > ffffffff8109ecf1: 5b pop %rbx
> > ffffffff8109ecf2: c3 retq
> >
> > Is that expected? There don't appear to be any more annotations to throw
> > at it.
>
> Right, so this is expected.

Fair enough, I just found it weird since it's different to the usual
"disable instrumentation/trace" function annotations.

> We can definitely make __tsan_func_entry/exit disappear with Clang, with
> GCC it's going to be a while if we want to fix it.
>
> If we remove 'noinline' from __no_kcsan_or_inline, we no longer get
> the call to __read_once_word_nocheck above! But...
>
> For KCSAN we force 'noinline' because older compilers still inline and
> then instrument small functions even if we just have the no_sanitize
> attribute (without inline mentioned). The same is actually true for
> KASAN, so KASAN's READ_ONCE_NOCHECK might be broken in a few places,
> but nobody seems to have noticed [1]. KASAN's __no_kasan_or_inline
> should also have a 'noinline' I think. I just tested
> __no_kcsan_or_inline without 'noinline', and yes, GCC 9 still decided
> to inline a small function and then instrument the accesses.
>
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59600
>
> The good news is that Clang does the right thing when removing
> 'noinline' from __no_kcsan_or_inline:
> 1. doesn't inline into functions that are instrumented, and
> 2. your above example doesn't do the call to __read_once_word_nocheck.
>
> The obvious solution to this is: restrict which compiler we want to support?

I would be in favour of that, but I defer to the x86 folks since this
affects them much more than it does me. On the arm64 side, we've got patches
queued for 5.8 that require GCC 10.0.1 or later, and that thing is only a
week old. I think it's reasonable to require a recent toolchain for optional
features like this that inherently rely on compiler support.

> > > From what I gather, we want to just compile the function as if the
> > > sanitizer was never enabled. One reason for why this doesn't quite
> > > work is because of the preprocessor.
> > >
> > > Note that the sanitizers won't complain about these accesses, which
> > > unfortunately is all these attributes ever were documented to do. So
> > > the attributes aren't completely useless. Why doesn't
> > > K[AC]SAN_SANITIZE := n work?
> >
> > I just don't get the point in having a function annotation if you then have to
> > pass flags at the per-object level. That also then necessitates either weird
> > refactoring and grouping of code into "noinstrument.c" type files, or blanket
> > disabling of instrumentation for things like arch/x86/
>
> If you want a solution now, here is one way to get us closer to where
> we want to be:
>
> 1. Peter's patch to add data_race around __READ_ONCE/__WRITE_ONCE.
> 2. Patch to make __tsan_func_entry/exit disappear with Clang.
> 3. Remove 'noinline' from __no_kcsan_or_inline.
> 4. Patch to warn users that KCSAN may have problems with GCC and
> should use Clang >= 7.
>
> But this is probably only half a solution.

At this point, I think that if READ_ONCE_NOCHECK() works as expected, and
calling __{READ,WRITE}_ONCE from functions tagged with __no_sanitize doesn't
result in instrumentation, then we're good.

Will

2020-05-14 07:33:47

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Wed, 13 May 2020 at 23:25, Will Deacon <[email protected]> wrote:
>
> On Wed, May 13, 2020 at 08:54:03PM +0200, Marco Elver wrote:
> > On Wed, 13 May 2020 at 19:47, Will Deacon <[email protected]> wrote:
> > > On Wed, May 13, 2020 at 07:32:58PM +0200, Marco Elver wrote:
> > > > - We do *not* want the call to __read_once_word_nocheck if we have
> > > > __no_sanitize_or_inline. AFAIK that's the main problem -- this applies
> > > > to both KASAN and KCSAN.
> > >
> > > Sorry, I should've been more explicit. The code above, with KASAN enabled,
> > > compiles to:
> > >
> > > ffffffff810a2d50 <foo>:
> > > ffffffff810a2d50: 48 8b 07 mov (%rdi),%rax
> > > ffffffff810a2d53: c3 retq
> > >
> > > but with KCSAN enabled, compiles to:
> > >
> > > ffffffff8109ecd0 <foo>:
> > > ffffffff8109ecd0: 53 push %rbx
> > > ffffffff8109ecd1: 48 89 fb mov %rdi,%rbx
> > > ffffffff8109ecd4: 48 8b 7c 24 08 mov 0x8(%rsp),%rdi
> > > ffffffff8109ecd9: e8 52 9c 1a 00 callq ffffffff81248930 <__tsan_func_entry>
> > > ffffffff8109ecde: 48 89 df mov %rbx,%rdi
> > > ffffffff8109ece1: e8 1a 00 00 00 callq ffffffff8109ed00 <__read_once_word_nocheck>
> > > ffffffff8109ece6: 48 89 c3 mov %rax,%rbx
> > > ffffffff8109ece9: e8 52 9c 1a 00 callq ffffffff81248940 <__tsan_func_exit>
> > > ffffffff8109ecee: 48 89 d8 mov %rbx,%rax
> > > ffffffff8109ecf1: 5b pop %rbx
> > > ffffffff8109ecf2: c3 retq
> > >
> > > Is that expected? There don't appear to be any more annotations to throw
> > > at it.
> >
> > Right, so this is expected.
>
> Fair enough, I just found it weird since it's different to the usual
> "disable instrumentation/trace" function annotations.
>
> > We can definitely make __tsan_func_entry/exit disappear with Clang, with
> > GCC it's going to be a while if we want to fix it.
> >
> > If we remove 'noinline' from __no_kcsan_or_inline, we no longer get
> > the call to __read_once_word_nocheck above! But...
> >
> > For KCSAN we force 'noinline' because older compilers still inline and
> > then instrument small functions even if we just have the no_sanitize
> > attribute (without inline mentioned). The same is actually true for
> > KASAN, so KASAN's READ_ONCE_NOCHECK might be broken in a few places,
> > but nobody seems to have noticed [1]. KASAN's __no_kasan_or_inline
> > should also have a 'noinline' I think. I just tested
> > __no_kcsan_or_inline without 'noinline', and yes, GCC 9 still decided
> > to inline a small function and then instrument the accesses.
> >
> > [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59600
> >
> > The good news is that Clang does the right thing when removing
> > 'noinline' from __no_kcsan_or_inline:
> > 1. doesn't inline into functions that are instrumented, and
> > 2. your above example doesn't do the call to __read_once_word_nocheck.
> >
> > The obvious solution to this is: restrict which compiler we want to support?
>
> I would be in favour of that, but I defer to the x86 folks since this
> affects them much more than it does me. On the arm64 side, we've got patches
> queued for 5.8 that require GCC 10.0.1 or later, and that thing is only a
> week old. I think it's reasonable to require a recent toolchain for optional
> features like this that inherently rely on compiler support.
>
> > > > From what I gather, we want to just compile the function as if the
> > > > sanitizer was never enabled. One reason for why this doesn't quite
> > > > work is because of the preprocessor.
> > > >
> > > > Note that the sanitizers won't complain about these accesses, which
> > > > unfortunately is all these attributes ever were documented to do. So
> > > > the attributes aren't completely useless. Why doesn't
> > > > K[AC]SAN_SANITIZE := n work?
> > >
> > > I just don't get the point in having a function annotation if you then have to
> > > pass flags at the per-object level. That also then necessitates either weird
> > > refactoring and grouping of code into "noinstrument.c" type files, or blanket
> > > disabling of instrumentation for things like arch/x86/
> >
> > If you want a solution now, here is one way to get us closer to where
> > we want to be:
> >
> > 1. Peter's patch to add data_race around __READ_ONCE/__WRITE_ONCE.
> > 2. Patch to make __tsan_func_entry/exit disappear with Clang.
> > 3. Remove 'noinline' from __no_kcsan_or_inline.
> > 4. Patch to warn users that KCSAN may have problems with GCC and
> > should use Clang >= 7.
> >
> > But this is probably only half a solution.
>
> At this point, I think that if READ_ONCE_NOCHECK() works as expected, and
> calling __{READ,WRITE}_ONCE from functions tagged with __no_sanitize doesn't
> result in instrumentation, then we're good.

Ouch. With the __{READ,WRITE}_ONCE requirement, we're going to need
Clang 11 though.

Because without the data_race() around __*_ONCE,
arch_atomic_{read,set} will be broken for KCSAN, but we can't have
data_race() because it would still add
kcsan_{enable,disable}_current() calls to __no_sanitize functions (if
compilation unit is instrumented). We can't make arch_atomic functions
__no_sanitize_or_inline, because even in code that we want to
sanitize, they should remain __always_inline (so they work properly in
__no_sanitize functions). Therefore, Clang 11 with support for
distinguishing volatiles will be the compiler that will satisfy all
the constraints.

If this is what we want, let me prepare a series on top of
-tip/locking/kcsan with all the things I think we need.

Thanks,
-- Marco

2020-05-14 11:08:10

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

Hi Marco,

On Thu, May 14, 2020 at 09:31:49AM +0200, Marco Elver wrote:
> Ouch. With the __{READ,WRITE}_ONCE requirement, we're going to need
> Clang 11 though.
>
> Because without the data_race() around __*_ONCE,
> arch_atomic_{read,set} will be broken for KCSAN, but we can't have
> data_race() because it would still add
> kcsan_{enable,disable}_current() calls to __no_sanitize functions (if
> compilation unit is instrumented). We can't make arch_atomic functions
> __no_sanitize_or_inline, because even in code that we want to
> sanitize, they should remain __always_inline (so they work properly in
> __no_sanitize functions). Therefore, Clang 11 with support for
> distinguishing volatiles will be the compiler that will satisfy all
> the constraints.
>
> If this is what we want, let me prepare a series on top of
> -tip/locking/kcsan with all the things I think we need.

Stepping back a second, the locking/kcsan branch is at least functional at
the moment by virtue of KCSAN_SANITIZE := n being used liberally in
arch/x86/. However, I still think we want to do better than that because (a)
it would be good to get more x86 coverage and (b) enabling this for arm64,
where objtool is not yet available, will be fragile if we have to whitelist
object files. There's also a fair bit of arm64 low-level code spread around
drivers/, so it feels like we'd end up with a really bad case of whack-a-mole.

Talking off-list, Clang >= 7 is pretty reasonable wrt inlining decisions
and the behaviour for __always_inline is:

* An __always_inline function inlined into a __no_sanitize function is
not instrumented
* An __always_inline function inlined into an instrumented function is
instrumented
* You can't mark a function as both __always_inline __no_sanitize, because
__no_sanitize functions are never inlined

GCC, on the other hand, may still inline __no_sanitize functions and then
subsequently instrument them.

So if were willing to make KCSAN depend on Clang >= 7, then we could:

- Remove the data_race() from __{READ,WRITE}_ONCE()
- Wrap arch_atomic*() in data_race() when called from the instrumented
atomic wrappers

At which point, I *think* everything works as expected. READ_ONCE_NOCHECK()
won't generate any surprises, and Peter can happily use arch_atomic()
from non-instrumented code.

Thoughts? I don't see the need to support buggy compilers when enabling
a new debug feature.

Will

2020-05-14 11:23:55

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Wed, May 13, 2020 at 03:58:30PM +0200, Marco Elver wrote:
> On Wed, 13 May 2020 at 15:24, Peter Zijlstra <[email protected]> wrote:
> >
> > On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> > > So far so good, except: both __no_sanitize_or_inline and
> > > __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> > > just doesn't avoid explicit kcsan_check calls, like those in
> > > READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> > > just because macros won't be redefined just for __no_sanitize
> > > functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> > > access is unchecked.
> > >
> > > This will have the expected result:
> > > __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
> > >
> > > This will not work as expected:
> > > __no_sanitize_or_inline void foo(void) { READ_ONCE(x); } // data
> > > races are reported
> > >
> > > All this could be fixed if GCC devs would finally take my patch to
> > > make -fsanitize=thread distinguish volatile [1], but then we have to
> > > wait ~years for the new compilers to reach us. So please don't hold
> > > your breath for this one any time soon.
> > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544452.html
> >
> > Right, but that does not address the much larger issue of the attribute
> > vs inline tranwreck :/
>
> Could you check if Clang is equally broken for you? I think GCC and
> Clang have differing behaviour on this. No idea what it takes to fix
> GCC though.

So I have some good and some maybe not so good news.

Given the patch below (on top of tglx's entry-v5-the-rest tag); I did
find that I could actually build alternative.o for gcc-{8,9,10} and
indeed clang-10. Any earlier gcc (I tried, 5,6,7) does not build:

../arch/x86/include/asm/ptrace.h:126:28: error: inlining failed in call to always_inline ‘user_mode’: function attribute mismatch

I dumped the poke_int3_handler output using objdump, find the attached
files.

It looks like clang-10 doesn't want to turn UBSAN off :/ The GCC files
look OK, no funny calls in those.

(the config has KASAN/UBSAN on, it looks like KCSAN and KASAN are
mutually exclusive)

---

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 77c83833d91e..06d8db612efc 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -990,7 +990,7 @@ static __always_inline int patch_cmp(const void *key, const void *elt)
return 0;
}

-int noinstr poke_int3_handler(struct pt_regs *regs)
+int noinstr __no_kcsan __no_sanitize_address __no_sanitize_undefined poke_int3_handler(struct pt_regs *regs)
{
struct bp_patching_desc *desc;
struct text_poke_loc *tp;
diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index 2cb42d8bdedc..5e83aada6553 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -15,6 +15,13 @@
/* all clang versions usable with the kernel support KASAN ABI version 5 */
#define KASAN_ABI_VERSION 5

+#if __has_feature(undefined_sanitizer)
+#define __no_sanitize_undefined \
+ __attribute__((no_sanitize("undefined")))
+#else
+#define __no_sanitize_undefined
+#endif
+
#if __has_feature(address_sanitizer) || __has_feature(hwaddress_sanitizer)
/* Emulate GCC's __SANITIZE_ADDRESS__ flag */
#define __SANITIZE_ADDRESS__
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index 7dd4e0349ef3..8196a121a78e 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -138,6 +138,12 @@
#define KASAN_ABI_VERSION 3
#endif

+#if __has_attribute(__no_sanitize_undefined__)
+#define __no_sanitize_undefined __attribute__((no_sanitize_undefined))
+#else
+#define __no_sanitize_undefined
+#endif
+
#if __has_attribute(__no_sanitize_address__)
#define __no_sanitize_address __attribute__((no_sanitize_address))
#else

Attachments:

(No filename) (3.94 kB)
poke_int3_handler-gcc8.asm (5.07 kB)
poke_int3_handler-gcc9.asm (4.71 kB)
poke_int3_handler-gcc10.asm (4.71 kB)
poke_int3_handler-clang10.asm (7.75 kB)
Download all attachments

2020-05-14 11:26:54

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Thu, May 14, 2020 at 01:21:42PM +0200, Peter Zijlstra wrote:
> On Wed, May 13, 2020 at 03:58:30PM +0200, Marco Elver wrote:
> > On Wed, 13 May 2020 at 15:24, Peter Zijlstra <[email protected]> wrote:
> > >
> > > On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> > > > So far so good, except: both __no_sanitize_or_inline and
> > > > __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> > > > just doesn't avoid explicit kcsan_check calls, like those in
> > > > READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> > > > just because macros won't be redefined just for __no_sanitize
> > > > functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> > > > access is unchecked.
> > > >
> > > > This will have the expected result:
> > > > __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
> > > >
> > > > This will not work as expected:
> > > > __no_sanitize_or_inline void foo(void) { READ_ONCE(x); } // data
> > > > races are reported
> > > >
> > > > All this could be fixed if GCC devs would finally take my patch to
> > > > make -fsanitize=thread distinguish volatile [1], but then we have to
> > > > wait ~years for the new compilers to reach us. So please don't hold
> > > > your breath for this one any time soon.
> > > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544452.html
> > >
> > > Right, but that does not address the much larger issue of the attribute
> > > vs inline tranwreck :/
> >
> > Could you check if Clang is equally broken for you? I think GCC and
> > Clang have differing behaviour on this. No idea what it takes to fix
> > GCC though.
>
> So I have some good and some maybe not so good news.
>
> Given the patch below (on top of tglx's entry-v5-the-rest tag); I did
> find that I could actually build alternative.o for gcc-{8,9,10} and
> indeed clang-10. Any earlier gcc (I tried, 5,6,7) does not build:
>
> ../arch/x86/include/asm/ptrace.h:126:28: error: inlining failed in call to always_inline ‘user_mode’: function attribute mismatch
>
> I dumped the poke_int3_handler output using objdump, find the attached
> files.
>
> It looks like clang-10 doesn't want to turn UBSAN off :/ The GCC files
> look OK, no funny calls in those.
>
> (the config has KASAN/UBSAN on, it looks like KCSAN and KASAN are
> mutually exclusive)

I just swapped them and rebuild with gcc-10 and that still looks ok.

0000 0000000000000000 <poke_int3_handler>:
0000 0: f6 87 88 00 00 00 03 testb $0x3,0x88(%rdi)
0007 7: 75 4d jne 56 <poke_int3_handler+0x56>
0009 9: 48 8b 15 00 00 00 00 mov 0x0(%rip),%rdx # 10 <poke_int3_handler+0x10>
000c c: R_X86_64_PC32 .bss+0x101c
0010 10: 48 85 d2 test %rdx,%rdx
0013 13: 74 41 je 56 <poke_int3_handler+0x56>
0015 15: 8b 42 0c mov 0xc(%rdx),%eax
0018 18: 48 8d 4a 0c lea 0xc(%rdx),%rcx
001c 1c: 85 c0 test %eax,%eax
001e 1e: 74 36 je 56 <poke_int3_handler+0x56>
0020 20: 8d 70 01 lea 0x1(%rax),%esi
0023 23: f0 0f b1 31 lock cmpxchg %esi,(%rcx)
0027 27: 75 f3 jne 1c <poke_int3_handler+0x1c>
0029 29: 4c 8b 8f 80 00 00 00 mov 0x80(%rdi),%r9
0030 30: 48 63 42 08 movslq 0x8(%rdx),%rax
0034 34: 48 8b 32 mov (%rdx),%rsi
0037 37: 49 8d 49 ff lea -0x1(%r9),%rcx
003b 3b: 83 f8 01 cmp $0x1,%eax
003e 3e: 7f 19 jg 59 <poke_int3_handler+0x59>
0040 40: 4c 63 06 movslq (%rsi),%r8
0043 43: 31 c0 xor %eax,%eax
0045 45: 49 81 c0 00 00 00 00 add $0x0,%r8
0048 48: R_X86_64_32S _stext
004c 4c: 4c 39 c1 cmp %r8,%rcx
004f 4f: 74 39 je 8a <poke_int3_handler+0x8a>
0051 51: f0 ff 4a 0c lock decl 0xc(%rdx)
0055 55: c3 retq
0056 56: 31 c0 xor %eax,%eax
0058 58: c3 retq
0059 59: 49 89 f3 mov %rsi,%r11
005c 5c: 49 89 c2 mov %rax,%r10
005f 5f: 49 d1 ea shr %r10
0062 62: 4c 89 d6 mov %r10,%rsi
0065 65: 48 c1 e6 04 shl $0x4,%rsi
0069 69: 4c 01 de add %r11,%rsi
006c 6c: 4c 63 06 movslq (%rsi),%r8
006f 6f: 49 81 c0 00 00 00 00 add $0x0,%r8
0072 72: R_X86_64_32S _stext
0076 76: 4c 39 c1 cmp %r8,%rcx
0079 79: 0f 82 a2 00 00 00 jb 121 <poke_int3_handler+0x121>
007f 7f: 0f 87 83 00 00 00 ja 108 <poke_int3_handler+0x108>
0085 85: 48 85 f6 test %rsi,%rsi
0088 88: 74 45 je cf <poke_int3_handler+0xcf>
008a 8a: 0f b6 46 08 movzbl 0x8(%rsi),%eax
008e 8e: 44 8d 40 34 lea 0x34(%rax),%r8d
0092 92: 41 80 f8 1f cmp $0x1f,%r8b
0096 96: 76 02 jbe 9a <poke_int3_handler+0x9a>
0098 98: 0f 0b ud2
009a 9a: 45 0f b6 c0 movzbl %r8b,%r8d
009e 9e: 4d 0f be 80 00 00 00 movsbq 0x0(%r8),%r8
00a5 a5: 00
00a2 a2: R_X86_64_32S .rodata
00a6 a6: 4c 01 c1 add %r8,%rcx
00a9 a9: 3c e8 cmp $0xe8,%al
00ab ab: 74 29 je d6 <poke_int3_handler+0xd6>
00ad ad: 76 1c jbe cb <poke_int3_handler+0xcb>
00af af: 83 e0 fd and $0xfffffffd,%eax
00b2 b2: 3c e9 cmp $0xe9,%al
00b4 b4: 75 e2 jne 98 <poke_int3_handler+0x98>
00b6 b6: 48 63 46 04 movslq 0x4(%rsi),%rax
00ba ba: 48 01 c1 add %rax,%rcx
00bd bd: b8 01 00 00 00 mov $0x1,%eax
00c2 c2: 48 89 8f 80 00 00 00 mov %rcx,0x80(%rdi)
00c9 c9: eb 86 jmp 51 <poke_int3_handler+0x51>
00cb cb: 3c cc cmp $0xcc,%al
00cd cd: 75 c9 jne 98 <poke_int3_handler+0x98>
00cf cf: 31 c0 xor %eax,%eax
00d1 d1: e9 7b ff ff ff jmpq 51 <poke_int3_handler+0x51>
00d6 d6: 48 63 46 04 movslq 0x4(%rsi),%rax
00da da: 49 83 c1 04 add $0x4,%r9
00de de: 48 01 c1 add %rax,%rcx
00e1 e1: 48 8b 87 98 00 00 00 mov 0x98(%rdi),%rax
00e8 e8: 48 8d 70 f8 lea -0x8(%rax),%rsi
00ec ec: 48 89 b7 98 00 00 00 mov %rsi,0x98(%rdi)
00f3 f3: 4c 89 48 f8 mov %r9,-0x8(%rax)
00f7 f7: b8 01 00 00 00 mov $0x1,%eax
00fc fc: 48 89 8f 80 00 00 00 mov %rcx,0x80(%rdi)
0103 103: e9 49 ff ff ff jmpq 51 <poke_int3_handler+0x51>
0108 108: 48 83 e8 01 sub $0x1,%rax
010c 10c: 4c 8d 5e 10 lea 0x10(%rsi),%r11
0110 110: 48 d1 e8 shr %rax
0113 113: 48 85 c0 test %rax,%rax
0116 116: 0f 85 40 ff ff ff jne 5c <poke_int3_handler+0x5c>
011c 11c: e9 30 ff ff ff jmpq 51 <poke_int3_handler+0x51>
0121 121: 4c 89 d0 mov %r10,%rax
0124 124: eb ed jmp 113 <poke_int3_handler+0x113>

2020-05-14 11:40:13

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Thu, May 14, 2020 at 01:21:42PM +0200, Peter Zijlstra wrote:
> So I have some good and some maybe not so good news.
>
> Given the patch below (on top of tglx's entry-v5-the-rest tag); I did
> find that I could actually build alternative.o for gcc-{8,9,10} and
> indeed clang-10.

And, for completeness, here's the vomit from gcc-10 without the patch
for a KASAN+UBSAN build:

0000 0000000000000000 <poke_int3_handler>:
0000 0: 41 57 push %r15
0002 2: 41 56 push %r14
0004 4: 41 55 push %r13
0006 6: 41 54 push %r12
0008 8: 55 push %rbp
0009 9: 53 push %rbx
000a a: 48 89 fb mov %rdi,%rbx
000d d: 48 81 c7 88 00 00 00 add $0x88,%rdi
0014 14: 48 83 ec 10 sub $0x10,%rsp
0018 18: e8 00 00 00 00 callq 1d <poke_int3_handler+0x1d>
0019 19: R_X86_64_PLT32 __asan_load8_noabort-0x4
001d 1d: f6 83 88 00 00 00 03 testb $0x3,0x88(%rbx)
0024 24: 0f 85 99 00 00 00 jne c3 <poke_int3_handler+0xc3>
002a 2a: 48 8b 2d 00 00 00 00 mov 0x0(%rip),%rbp # 31 <poke_int3_handler+0x31>
002d 2d: R_X86_64_PC32 .bss+0x105c
0031 31: 48 85 ed test %rbp,%rbp
0034 34: 0f 84 89 00 00 00 je c3 <poke_int3_handler+0xc3>
003a 3a: 4c 8d 65 0c lea 0xc(%rbp),%r12
003e 3e: 4c 89 e7 mov %r12,%rdi
0041 41: e8 00 00 00 00 callq 46 <poke_int3_handler+0x46>
0042 42: R_X86_64_PLT32 __asan_load4_noabort-0x4
0046 46: 8b 45 0c mov 0xc(%rbp),%eax
0049 49: 85 c0 test %eax,%eax
004b 4b: 74 76 je c3 <poke_int3_handler+0xc3>
004d 4d: 8d 50 01 lea 0x1(%rax),%edx
0050 50: f0 41 0f b1 14 24 lock cmpxchg %edx,(%r12)
0056 56: 75 f1 jne 49 <poke_int3_handler+0x49>
0058 58: 48 8d bb 80 00 00 00 lea 0x80(%rbx),%rdi
005f 5f: e8 00 00 00 00 callq 64 <poke_int3_handler+0x64>
0060 60: R_X86_64_PLT32 __asan_load8_noabort-0x4
0064 64: 48 8b 83 80 00 00 00 mov 0x80(%rbx),%rax
006b 6b: 48 8d 7d 08 lea 0x8(%rbp),%rdi
006f 6f: 48 89 04 24 mov %rax,(%rsp)
0073 73: 4c 8d 60 ff lea -0x1(%rax),%r12
0077 77: e8 00 00 00 00 callq 7c <poke_int3_handler+0x7c>
0078 78: R_X86_64_PLT32 __asan_load4_noabort-0x4
007c 7c: 4c 63 7d 08 movslq 0x8(%rbp),%r15
0080 80: 48 89 ef mov %rbp,%rdi
0083 83: e8 00 00 00 00 callq 88 <poke_int3_handler+0x88>
0084 84: R_X86_64_PLT32 __asan_load8_noabort-0x4
0088 88: 4c 8b 6d 00 mov 0x0(%rbp),%r13
008c 8c: 41 83 ff 01 cmp $0x1,%r15d
0090 90: 7f 36 jg c8 <poke_int3_handler+0xc8>
0092 92: 4c 89 ef mov %r13,%rdi
0095 95: e8 00 00 00 00 callq 9a <poke_int3_handler+0x9a>
0096 96: R_X86_64_PLT32 __asan_load4_noabort-0x4
009a 9a: 49 63 55 00 movslq 0x0(%r13),%rdx
009e 9e: 45 31 c0 xor %r8d,%r8d
00a1 a1: 48 81 c2 00 00 00 00 add $0x0,%rdx
00a4 a4: R_X86_64_32S _stext
00a8 a8: 49 39 d4 cmp %rdx,%r12
00ab ab: 74 5d je 10a <poke_int3_handler+0x10a>
00ad ad: f0 ff 4d 0c lock decl 0xc(%rbp)
00b1 b1: 48 83 c4 10 add $0x10,%rsp
00b5 b5: 44 89 c0 mov %r8d,%eax
00b8 b8: 5b pop %rbx
00b9 b9: 5d pop %rbp
00ba ba: 41 5c pop %r12
00bc bc: 41 5d pop %r13
00be be: 41 5e pop %r14
00c0 c0: 41 5f pop %r15
00c2 c2: c3 retq
00c3 c3: 45 31 c0 xor %r8d,%r8d
00c6 c6: eb e9 jmp b1 <poke_int3_handler+0xb1>
00c8 c8: 4c 89 6c 24 08 mov %r13,0x8(%rsp)
00cd cd: 4d 89 fe mov %r15,%r14
00d0 d0: 48 8b 74 24 08 mov 0x8(%rsp),%rsi
00d5 d5: 49 d1 ee shr %r14
00d8 d8: 4c 89 f0 mov %r14,%rax
00db db: 48 c1 e0 04 shl $0x4,%rax
00df df: 4c 8d 2c 06 lea (%rsi,%rax,1),%r13
00e3 e3: 4c 89 ef mov %r13,%rdi
00e6 e6: e8 00 00 00 00 callq eb <poke_int3_handler+0xeb>
00e7 e7: R_X86_64_PLT32 __asan_load4_noabort-0x4
00eb eb: 49 63 4d 00 movslq 0x0(%r13),%rcx
00ef ef: 48 81 c1 00 00 00 00 add $0x0,%rcx
00f2 f2: R_X86_64_32S _stext
00f6 f6: 49 39 cc cmp %rcx,%r12
00f9 f9: 0f 82 fd 00 00 00 jb 1fc <poke_int3_handler+0x1fc>
00ff ff: 0f 87 d9 00 00 00 ja 1de <poke_int3_handler+0x1de>
0105 105: 4d 85 ed test %r13,%r13
0108 108: 74 7b je 185 <poke_int3_handler+0x185>
010a 10a: 49 8d 7d 08 lea 0x8(%r13),%rdi
010e 10e: e8 00 00 00 00 callq 113 <poke_int3_handler+0x113>
010f 10f: R_X86_64_PLT32 __asan_load1_noabort-0x4
0113 113: 45 0f b6 75 08 movzbl 0x8(%r13),%r14d
0118 118: 45 8d 7e 34 lea 0x34(%r14),%r15d
011c 11c: 41 80 ff 1f cmp $0x1f,%r15b
0120 120: 76 0e jbe 130 <poke_int3_handler+0x130>
0122 122: 0f 0b ud2
0124 124: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
0127 127: R_X86_64_32S .data+0xc0
012b 12b: e8 00 00 00 00 callq 130 <poke_int3_handler+0x130>
012c 12c: R_X86_64_PLT32 __ubsan_handle_builtin_unreachable-0x4
0130 130: 45 0f b6 ff movzbl %r15b,%r15d
0134 134: 49 8d bf 00 00 00 00 lea 0x0(%r15),%rdi
0137 137: R_X86_64_32S .rodata+0x620
013b 13b: e8 00 00 00 00 callq 140 <poke_int3_handler+0x140>
013c 13c: R_X86_64_PLT32 __asan_load1_noabort-0x4
0140 140: 49 0f be 97 00 00 00 movsbq 0x0(%r15),%rdx
0147 147: 00
0144 144: R_X86_64_32S .rodata+0x620
0148 148: 49 01 d4 add %rdx,%r12
014b 14b: 41 80 fe e8 cmp $0xe8,%r14b
014f 14f: 74 3c je 18d <poke_int3_handler+0x18d>
0151 151: 76 2c jbe 17f <poke_int3_handler+0x17f>
0153 153: 41 83 e6 fd and $0xfffffffd,%r14d
0157 157: 41 80 fe e9 cmp $0xe9,%r14b
015b 15b: 75 c5 jne 122 <poke_int3_handler+0x122>
015d 15d: 49 8d 7d 04 lea 0x4(%r13),%rdi
0161 161: e8 00 00 00 00 callq 166 <poke_int3_handler+0x166>
0162 162: R_X86_64_PLT32 __asan_load4_noabort-0x4
0166 166: 49 63 45 04 movslq 0x4(%r13),%rax
016a 16a: 41 b8 01 00 00 00 mov $0x1,%r8d
0170 170: 49 01 c4 add %rax,%r12
0173 173: 4c 89 a3 80 00 00 00 mov %r12,0x80(%rbx)
017a 17a: e9 2e ff ff ff jmpq ad <poke_int3_handler+0xad>
017f 17f: 41 80 fe cc cmp $0xcc,%r14b
0183 183: 75 9d jne 122 <poke_int3_handler+0x122>
0185 185: 45 31 c0 xor %r8d,%r8d
0188 188: e9 20 ff ff ff jmpq ad <poke_int3_handler+0xad>
018d 18d: 49 8d 7d 04 lea 0x4(%r13),%rdi
0191 191: e8 00 00 00 00 callq 196 <poke_int3_handler+0x196>
0192 192: R_X86_64_PLT32 __asan_load4_noabort-0x4
0196 196: 49 63 45 04 movslq 0x4(%r13),%rax
019a 19a: 48 8d bb 98 00 00 00 lea 0x98(%rbx),%rdi
01a1 1a1: 49 01 c4 add %rax,%r12
01a4 1a4: e8 00 00 00 00 callq 1a9 <poke_int3_handler+0x1a9>
01a5 1a5: R_X86_64_PLT32 __asan_load8_noabort-0x4
01a9 1a9: 4c 8b b3 98 00 00 00 mov 0x98(%rbx),%r14
01b0 1b0: 49 8d 7e f8 lea -0x8(%r14),%rdi
01b4 1b4: 48 89 bb 98 00 00 00 mov %rdi,0x98(%rbx)
01bb 1bb: e8 00 00 00 00 callq 1c0 <poke_int3_handler+0x1c0>
01bc 1bc: R_X86_64_PLT32 __asan_store8_noabort-0x4
01c0 1c0: 4c 8b 2c 24 mov (%rsp),%r13
01c4 1c4: 41 b8 01 00 00 00 mov $0x1,%r8d
01ca 1ca: 49 83 c5 04 add $0x4,%r13
01ce 1ce: 4d 89 6e f8 mov %r13,-0x8(%r14)
01d2 1d2: 4c 89 a3 80 00 00 00 mov %r12,0x80(%rbx)
01d9 1d9: e9 cf fe ff ff jmpq ad <poke_int3_handler+0xad>
01de 1de: 49 8d 45 10 lea 0x10(%r13),%rax
01e2 1e2: 4d 8d 77 ff lea -0x1(%r15),%r14
01e6 1e6: 48 89 44 24 08 mov %rax,0x8(%rsp)
01eb 1eb: 49 d1 ee shr %r14
01ee 1ee: 4d 89 f7 mov %r14,%r15
01f1 1f1: 4d 85 ff test %r15,%r15
01f4 1f4: 0f 85 d3 fe ff ff jne cd <poke_int3_handler+0xcd>
01fa 1fa: eb 89 jmp 185 <poke_int3_handler+0x185>
01fc 1fc: 4d 89 f7 mov %r14,%r15
01ff 1ff: eb f0 jmp 1f1 <poke_int3_handler+0x1f1>

2020-05-14 12:05:51

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

2020-05-14 12:22:43

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Thu, May 14, 2020 at 01:21:42PM +0200, Peter Zijlstra wrote:
> Given the patch below (on top of tglx's entry-v5-the-rest tag); I did
> find that I could actually build alternative.o for gcc-{8,9,10} and
> indeed clang-10. Any earlier gcc (I tried, 5,6,7) does not build:

Damn!, I forgot the patch from https://lkml.kernel.org/r/[email protected]

With that included, on a GCC-10 KCSAN+UBSAN build, I now get this, and
that is very much not okay. This is the thing Will complained about as
well I think.

Hohumm :-(

---
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index d6d61c4455fa..ba89cabe5fcf 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -28,10 +28,6 @@ KASAN_SANITIZE_dumpstack_$(BITS).o := n
KASAN_SANITIZE_stacktrace.o := n
KASAN_SANITIZE_paravirt.o := n

-# With some compiler versions the generated code results in boot hangs, caused
-# by several compilation units. To be safe, disable all instrumentation.
-KCSAN_SANITIZE := n
-
OBJECT_FILES_NON_STANDARD_test_nx.o := y
OBJECT_FILES_NON_STANDARD_paravirt_patch.o := y

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 77c83833d91e..06d8db612efc 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -990,7 +990,7 @@ static __always_inline int patch_cmp(const void *key, const void *elt)
return 0;
}

-int noinstr poke_int3_handler(struct pt_regs *regs)
+int noinstr __no_kcsan __no_sanitize_address __no_sanitize_undefined poke_int3_handler(struct pt_regs *regs)
{
struct bp_patching_desc *desc;
struct text_poke_loc *tp;
diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index 2cb42d8bdedc..c728ae9dcf96 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -15,6 +15,9 @@
/* all clang versions usable with the kernel support KASAN ABI version 5 */
#define KASAN_ABI_VERSION 5

+#define __no_sanitize_undefined \
+ __attribute__((no_sanitize("undefined")))
+
#if __has_feature(address_sanitizer) || __has_feature(hwaddress_sanitizer)
/* Emulate GCC's __SANITIZE_ADDRESS__ flag */
#define __SANITIZE_ADDRESS__
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index 7dd4e0349ef3..8196a121a78e 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -138,6 +138,12 @@
#define KASAN_ABI_VERSION 3
#endif

+#if __has_attribute(__no_sanitize_undefined__)
+#define __no_sanitize_undefined __attribute__((no_sanitize_undefined))
+#else
+#define __no_sanitize_undefined
+#endif
+
#if __has_attribute(__no_sanitize_address__)
#define __no_sanitize_address __attribute__((no_sanitize_address))
#else
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 3bb962959d8b..2ea532b19e75 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -241,12 +241,12 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
* atomicity or dependency ordering guarantees. Note that this may result
* in tears!
*/
-#define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))
+#define __READ_ONCE(x) data_race((*(const volatile __unqual_scalar_typeof(x) *)&(x)))

#define __READ_ONCE_SCALAR(x) \
({ \
typeof(x) *__xp = &(x); \
- __unqual_scalar_typeof(x) __x = data_race(__READ_ONCE(*__xp)); \
+ __unqual_scalar_typeof(x) __x = __READ_ONCE(*__xp); \
kcsan_check_atomic_read(__xp, sizeof(*__xp)); \
smp_read_barrier_depends(); \
(typeof(x))__x; \
@@ -260,14 +260,14 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,

#define __WRITE_ONCE(x, val) \
do { \
- *(volatile typeof(x) *)&(x) = (val); \
+ data_race(*(volatile typeof(x) *)&(x) = (val)); \
} while (0)

#define __WRITE_ONCE_SCALAR(x, val) \
do { \
typeof(x) *__xp = &(x); \
kcsan_check_atomic_write(__xp, sizeof(*__xp)); \
- data_race(({ __WRITE_ONCE(*__xp, val); 0; })); \
+ __WRITE_ONCE(*__xp, val); \
} while (0)

#define WRITE_ONCE(x, val) \

---
0000 0000000000000000 <poke_int3_handler>:
0000 0: 41 55 push %r13
0002 2: 41 54 push %r12
0004 4: 55 push %rbp
0005 5: 53 push %rbx
0006 6: 48 83 ec 10 sub $0x10,%rsp
000a a: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
0011 11: 00 00
0013 13: 48 89 44 24 08 mov %rax,0x8(%rsp)
0018 18: 31 c0 xor %eax,%eax
001a 1a: f6 87 88 00 00 00 03 testb $0x3,0x88(%rdi)
0021 21: 74 21 je 44 <poke_int3_handler+0x44>
0023 23: 31 c0 xor %eax,%eax
0025 25: 48 8b 4c 24 08 mov 0x8(%rsp),%rcx
002a 2a: 65 48 2b 0c 25 28 00 sub %gs:0x28,%rcx
0031 31: 00 00
0033 33: 0f 85 79 01 00 00 jne 1b2 <poke_int3_handler+0x1b2>
0039 39: 48 83 c4 10 add $0x10,%rsp
003d 3d: 5b pop %rbx
003e 3e: 5d pop %rbp
003f 3f: 41 5c pop %r12
0041 41: 41 5d pop %r13
0043 43: c3 retq
0044 44: 48 89 fb mov %rdi,%rbx
0047 47: e8 00 00 00 00 callq 4c <poke_int3_handler+0x4c>
0048 48: R_X86_64_PLT32 kcsan_disable_current-0x4
004c 4c: 4c 8b 2d 00 00 00 00 mov 0x0(%rip),%r13 # 53 <poke_int3_handler+0x53>
004f 4f: R_X86_64_PC32 .bss+0x101c
0053 53: 48 8d 6c 24 08 lea 0x8(%rsp),%rbp
0058 58: 49 89 e4 mov %rsp,%r12
005b 5b: 48 89 e8 mov %rbp,%rax
005e 5e: 4c 29 e0 sub %r12,%rax
0061 61: 48 83 f8 08 cmp $0x8,%rax
0065 65: 0f 87 4c 01 00 00 ja 1b7 <poke_int3_handler+0x1b7>
006b 6b: 4c 29 e5 sub %r12,%rbp
006e 6e: 4c 89 2c 24 mov %r13,(%rsp)
0072 72: e8 00 00 00 00 callq 77 <poke_int3_handler+0x77>
0073 73: R_X86_64_PLT32 kcsan_enable_current_nowarn-0x4
0077 77: 48 83 fd 08 cmp $0x8,%rbp
007b 7b: 0f 87 53 01 00 00 ja 1d4 <poke_int3_handler+0x1d4>
0081 81: 4c 8b 24 24 mov (%rsp),%r12
0085 85: 4d 85 e4 test %r12,%r12
0088 88: 74 99 je 23 <poke_int3_handler+0x23>
008a 8a: e8 00 00 00 00 callq 8f <poke_int3_handler+0x8f>
008b 8b: R_X86_64_PLT32 kcsan_disable_current-0x4
008f 8f: 4d 8d 6c 24 0c lea 0xc(%r12),%r13
0094 94: 41 8b 6c 24 0c mov 0xc(%r12),%ebp
0099 99: e8 00 00 00 00 callq 9e <poke_int3_handler+0x9e>
009a 9a: R_X86_64_PLT32 kcsan_enable_current_nowarn-0x4
009e 9e: 85 ed test %ebp,%ebp
00a0 a0: 74 81 je 23 <poke_int3_handler+0x23>
00a2 a2: 8d 55 01 lea 0x1(%rbp),%edx
00a5 a5: 89 e8 mov %ebp,%eax
00a7 a7: f0 41 0f b1 55 00 lock cmpxchg %edx,0x0(%r13)
00ad ad: 89 c5 mov %eax,%ebp
00af af: 75 ed jne 9e <poke_int3_handler+0x9e>
00b1 b1: 48 8b bb 80 00 00 00 mov 0x80(%rbx),%rdi
00b8 b8: 49 63 44 24 08 movslq 0x8(%r12),%rax
00bd bd: 49 8b 0c 24 mov (%r12),%rcx
00c1 c1: 48 8d 57 ff lea -0x1(%rdi),%rdx
00c5 c5: 83 f8 01 cmp $0x1,%eax
00c8 c8: 7f 1c jg e6 <poke_int3_handler+0xe6>
00ca ca: 48 63 31 movslq (%rcx),%rsi
00cd cd: 31 c0 xor %eax,%eax
00cf cf: 48 81 c6 00 00 00 00 add $0x0,%rsi
00d2 d2: R_X86_64_32S _stext
00d6 d6: 48 39 f2 cmp %rsi,%rdx
00d9 d9: 74 3c je 117 <poke_int3_handler+0x117>
00db db: f0 41 ff 4c 24 0c lock decl 0xc(%r12)
00e1 e1: e9 3f ff ff ff jmpq 25 <poke_int3_handler+0x25>
00e6 e6: 49 89 c9 mov %rcx,%r9
00e9 e9: 49 89 c0 mov %rax,%r8
00ec ec: 49 d1 e8 shr %r8
00ef ef: 4c 89 c1 mov %r8,%rcx
00f2 f2: 48 c1 e1 04 shl $0x4,%rcx
00f6 f6: 4c 01 c9 add %r9,%rcx
00f9 f9: 48 63 31 movslq (%rcx),%rsi
00fc fc: 48 81 c6 00 00 00 00 add $0x0,%rsi
00ff ff: R_X86_64_32S _stext
0103 103: 48 39 f2 cmp %rsi,%rdx
0106 106: 0f 82 88 00 00 00 jb 194 <poke_int3_handler+0x194>
010c 10c: 0f 87 93 00 00 00 ja 1a5 <poke_int3_handler+0x1a5>
0112 112: 48 85 c9 test %rcx,%rcx
0115 115: 74 28 je 13f <poke_int3_handler+0x13f>
0117 117: 0f b6 41 08 movzbl 0x8(%rcx),%eax
011b 11b: 8d 70 34 lea 0x34(%rax),%esi
011e 11e: 40 80 fe 1f cmp $0x1f,%sil
0122 122: 76 02 jbe 126 <poke_int3_handler+0x126>
0124 124: 0f 0b ud2
0126 126: 40 0f b6 f6 movzbl %sil,%esi
012a 12a: 48 0f be b6 00 00 00 movsbq 0x0(%rsi),%rsi
0131 131: 00
012e 12e: R_X86_64_32S .rodata
0132 132: 48 01 f2 add %rsi,%rdx
0135 135: 3c e8 cmp $0xe8,%al
0137 137: 74 29 je 162 <poke_int3_handler+0x162>
0139 139: 77 08 ja 143 <poke_int3_handler+0x143>
013b 13b: 3c cc cmp $0xcc,%al
013d 13d: 75 e5 jne 124 <poke_int3_handler+0x124>
013f 13f: 31 c0 xor %eax,%eax
0141 141: eb 98 jmp db <poke_int3_handler+0xdb>
0143 143: 83 e0 fd and $0xfffffffd,%eax
0146 146: 3c e9 cmp $0xe9,%al
0148 148: 75 da jne 124 <poke_int3_handler+0x124>
014a 14a: 48 63 41 04 movslq 0x4(%rcx),%rax
014e 14e: 48 01 c2 add %rax,%rdx
0151 151: b8 01 00 00 00 mov $0x1,%eax
0156 156: 48 89 93 80 00 00 00 mov %rdx,0x80(%rbx)
015d 15d: e9 79 ff ff ff jmpq db <poke_int3_handler+0xdb>
0162 162: 48 63 41 04 movslq 0x4(%rcx),%rax
0166 166: 48 83 c7 04 add $0x4,%rdi
016a 16a: 48 01 c2 add %rax,%rdx
016d 16d: 48 8b 83 98 00 00 00 mov 0x98(%rbx),%rax
0174 174: 48 8d 48 f8 lea -0x8(%rax),%rcx
0178 178: 48 89 8b 98 00 00 00 mov %rcx,0x98(%rbx)
017f 17f: 48 89 78 f8 mov %rdi,-0x8(%rax)
0183 183: b8 01 00 00 00 mov $0x1,%eax
0188 188: 48 89 93 80 00 00 00 mov %rdx,0x80(%rbx)
018f 18f: e9 47 ff ff ff jmpq db <poke_int3_handler+0xdb>
0194 194: 4c 89 c0 mov %r8,%rax
0197 197: 48 85 c0 test %rax,%rax
019a 19a: 0f 85 49 ff ff ff jne e9 <poke_int3_handler+0xe9>
01a0 1a0: e9 36 ff ff ff jmpq db <poke_int3_handler+0xdb>
01a5 1a5: 48 83 e8 01 sub $0x1,%rax
01a9 1a9: 4c 8d 49 10 lea 0x10(%rcx),%r9
01ad 1ad: 48 d1 e8 shr %rax
01b0 1b0: eb e5 jmp 197 <poke_int3_handler+0x197>
01b2 1b2: e8 00 00 00 00 callq 1b7 <poke_int3_handler+0x1b7>
01b3 1b3: R_X86_64_PLT32 __stack_chk_fail-0x4
01b7 1b7: 4c 01 e0 add %r12,%rax
01ba 1ba: 0f 82 ab fe ff ff jb 6b <poke_int3_handler+0x6b>
01c0 1c0: 4c 89 e6 mov %r12,%rsi
01c3 1c3: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
01c6 1c6: R_X86_64_32S .data+0x80
01ca 1ca: e8 00 00 00 00 callq 1cf <poke_int3_handler+0x1cf>
01cb 1cb: R_X86_64_PLT32 __ubsan_handle_type_mismatch_v1-0x4
01cf 1cf: e9 97 fe ff ff jmpq 6b <poke_int3_handler+0x6b>
01d4 1d4: 4c 01 e5 add %r12,%rbp
01d7 1d7: 0f 82 a4 fe ff ff jb 81 <poke_int3_handler+0x81>
01dd 1dd: 4c 89 e6 mov %r12,%rsi
01e0 1e0: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
01e3 1e3: R_X86_64_32S .data+0x60
01e7 1e7: e8 00 00 00 00 callq 1ec <poke_int3_handler+0x1ec>
01e8 1e8: R_X86_64_PLT32 __ubsan_handle_type_mismatch_v1-0x4
01ec 1ec: e9 90 fe ff ff jmpq 81 <poke_int3_handler+0x81>

2020-05-14 12:31:27

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Thu, May 14, 2020 at 01:01:04PM +0100, Will Deacon wrote:

> > +#if __has_feature(undefined_sanitizer)
>
> Hmm, this might want to be __has_feature(undefined_behavior_sanitizer)
> (and damn is that hard for a Brit to type out!)

(I know right, it should be behaviour, dammit!)

I tried without the condition, eg.:

+#define __no_sanitize_undefined \
+ __attribute__((no_sanitize("undefined")))

and it still generated UBSAN gunk.

2020-05-14 13:11:38

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Thu, 14 May 2020 at 14:27, Peter Zijlstra <[email protected]> wrote:
>
> On Thu, May 14, 2020 at 01:01:04PM +0100, Will Deacon wrote:
>
> > > +#if __has_feature(undefined_sanitizer)
> >
> > Hmm, this might want to be __has_feature(undefined_behavior_sanitizer)
> > (and damn is that hard for a Brit to type out!)
>
> (I know right, it should be behaviour, dammit!)
>
> I tried without the condition, eg.:
>
> +#define __no_sanitize_undefined \
> + __attribute__((no_sanitize("undefined")))
>
> and it still generated UBSAN gunk.

Which ubsan calls are left? I'm trying to reproduce.

Thanks,
-- Marco

2020-05-14 13:16:45

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Thu, May 14, 2020 at 03:07:08PM +0200, Marco Elver wrote:
> On Thu, 14 May 2020 at 14:27, Peter Zijlstra <[email protected]> wrote:
> >
> > On Thu, May 14, 2020 at 01:01:04PM +0100, Will Deacon wrote:
> >
> > > > +#if __has_feature(undefined_sanitizer)
> > >
> > > Hmm, this might want to be __has_feature(undefined_behavior_sanitizer)
> > > (and damn is that hard for a Brit to type out!)
> >
> > (I know right, it should be behaviour, dammit!)
> >
> > I tried without the condition, eg.:
> >
> > +#define __no_sanitize_undefined \
> > + __attribute__((no_sanitize("undefined")))
> >
> > and it still generated UBSAN gunk.
>
> Which ubsan calls are left? I'm trying to reproduce.

To be more precise, the patches were on top of:

git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git entry-v5-the-rest

$ grep ubsan poke_int3_handler-clang10.asm
0074 74: R_X86_64_PLT32 __ubsan_handle_load_invalid_value-0x4
0083 83: R_X86_64_PLT32 __ubsan_handle_load_invalid_value-0x4
01c5 1c5: R_X86_64_PLT32 __ubsan_handle_builtin_unreachable-0x4

I think the config was defconfig_x86-64 inspired with KASAN+UBSAN
enabled.

So I build with:

touch arch/x86/kernel/alterantive.c;
make CC=clang-10 V=1 O=defconfig-build/ arch/x86/kernel/alterantive.o

And then dump the output with:

./objdump-func.sh defconfig-build/arch/x86/kernel/alterantive.o poke_int3_handler

$ # cat objdump-func.sh
#!/bin/bash
objdump -dr $1 | awk "/^\$/ { P=0; } /$2[^>]*>:\$/ { P=1; O=strtonum(\"0x\" \$1); } { if (P) { o=strtonum(\"0x\" \$1); printf(\"%04x \", o-O); print \$0; } }"

Hope that is enough to reproduce.

2020-05-14 13:38:42

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Thu, 14 May 2020 at 13:05, Will Deacon <[email protected]> wrote:
>
> Hi Marco,
>
> On Thu, May 14, 2020 at 09:31:49AM +0200, Marco Elver wrote:
> > Ouch. With the __{READ,WRITE}_ONCE requirement, we're going to need
> > Clang 11 though.
> >
> > Because without the data_race() around __*_ONCE,
> > arch_atomic_{read,set} will be broken for KCSAN, but we can't have
> > data_race() because it would still add
> > kcsan_{enable,disable}_current() calls to __no_sanitize functions (if
> > compilation unit is instrumented). We can't make arch_atomic functions
> > __no_sanitize_or_inline, because even in code that we want to
> > sanitize, they should remain __always_inline (so they work properly in
> > __no_sanitize functions). Therefore, Clang 11 with support for
> > distinguishing volatiles will be the compiler that will satisfy all
> > the constraints.
> >
> > If this is what we want, let me prepare a series on top of
> > -tip/locking/kcsan with all the things I think we need.
>
> Stepping back a second, the locking/kcsan branch is at least functional at
> the moment by virtue of KCSAN_SANITIZE := n being used liberally in
> arch/x86/. However, I still think we want to do better than that because (a)
> it would be good to get more x86 coverage and (b) enabling this for arm64,
> where objtool is not yet available, will be fragile if we have to whitelist
> object files. There's also a fair bit of arm64 low-level code spread around
> drivers/, so it feels like we'd end up with a really bad case of whack-a-mole.
>
> Talking off-list, Clang >= 7 is pretty reasonable wrt inlining decisions
> and the behaviour for __always_inline is:
>
> * An __always_inline function inlined into a __no_sanitize function is
> not instrumented
> * An __always_inline function inlined into an instrumented function is
> instrumented
> * You can't mark a function as both __always_inline __no_sanitize, because
> __no_sanitize functions are never inlined
>
> GCC, on the other hand, may still inline __no_sanitize functions and then
> subsequently instrument them.
>
> So if were willing to make KCSAN depend on Clang >= 7, then we could:
>
> - Remove the data_race() from __{READ,WRITE}_ONCE()
> - Wrap arch_atomic*() in data_race() when called from the instrumented
> atomic wrappers
>
> At which point, I *think* everything works as expected. READ_ONCE_NOCHECK()
> won't generate any surprises, and Peter can happily use arch_atomic()
> from non-instrumented code.
>
> Thoughts? I don't see the need to support buggy compilers when enabling
> a new debug feature.

This is also a reply to
https://lkml.kernel.org/r/[email protected]
-- the problem with __READ_ONCE would be solved with what Will
proposed above.

Let me try to spell out the requirements I see so far (this is for
KCSAN only though -- other sanitizers might be similar):

1. __no_kcsan functions should not call anything, not even
kcsan_{enable,disable}_current(), when using __{READ,WRITE}_ONCE.
[Requires leaving data_race() off of these.]

2. __always_inline functions inlined into __no_sanitize function is
not instrumented. [Has always been satisfied by GCC and Clang.]

3. __always_inline functions inlined into instrumented function is
instrumented. [Has always been satisfied by GCC and Clang.]

4. __no_kcsan functions should never be spuriously inlined into
instrumented functions, causing the accesses of the __no_kcsan
function to be instrumented. [Satisfied by Clang >= 7. All GCC
versions are broken.]

5. we should not break atomic_{read,set} for KCSAN. [Because of #1,
we'd need to add data_race() around the arch-calls in
atomic_{read,set}; or rely on Clang 11's -tsan-distinguish-volatile
support (GCC 11 might get this as well).]

6. never emit __tsan_func_{entry,exit}. [Clang supports disabling
this, GCC doesn't.]

7. kernel is supported by compiler. [Clang >= 9 seems to build -tip
for me, anything below complains about lack of asm goto. GCC trivial.]

So, because of #4 & #6 & #7 we're down to Clang >= 9. Because of #5
we'll have to make a choice between Clang >= 9 or Clang >= 11
(released in ~June). In an ideal world we might even fix GCC in
future.

That's not even considering the problems around UBSan and KASAN. But
maybe one step at a time?

Any preferences?

Thanks,
-- Marco

2020-05-14 13:49:24

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Thu, May 14, 2020 at 03:35:58PM +0200, Marco Elver wrote:
> 2. __always_inline functions inlined into __no_sanitize function is
> not instrumented. [Has always been satisfied by GCC and Clang.]

GCC <= 7 fails to compile in this case.

2020-05-14 13:54:58

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Thu, May 14, 2020 at 03:35:58PM +0200, Marco Elver wrote:
> 4. __no_kcsan functions should never be spuriously inlined into
> instrumented functions, causing the accesses of the __no_kcsan
> function to be instrumented. [Satisfied by Clang >= 7. All GCC
> versions are broken.]

The current noinstr annotation implies noinline, for a similar issue, we
need the function to be emitted in a specific section. So while yuck,
this is not an immediate issue for us.

2020-05-14 13:58:37

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Thu, May 14, 2020 at 03:35:58PM +0200, Marco Elver wrote:

> 5. we should not break atomic_{read,set} for KCSAN. [Because of #1,
> we'd need to add data_race() around the arch-calls in
> atomic_{read,set}; or rely on Clang 11's -tsan-distinguish-volatile
> support (GCC 11 might get this as well).]

Putting the data_race() in atomic_{read,set} would 'break' any sanitized
user of arch_atomic_{read,set}(). Now it so happens there aren't any
such just now, but we need to be aware of that.

I'm thinking the volatile thing is the nicest solution, but yes, that'll
make us depend on 11 everything.

2020-05-14 14:15:37

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Wed, May 13, 2020 at 03:58:30PM +0200, Marco Elver wrote:
> On Wed, 13 May 2020 at 15:24, Peter Zijlstra <[email protected]> wrote:

> > Also, could not this compiler instrumentation live as a kernel specific
> > GCC-plugin instead of being part of GCC proper? Because in that case,
> > we'd have much better control over it.
>
> I'd like it if we could make it a GCC-plugin for GCC, but how? I don't
> see a way to affect TSAN instrumentation. FWIW Clang already has
> distinguish-volatile support (unreleased Clang 11).

Ah, I figured not use the built-in TSAN at all, do a complete
replacement of the instrumentation with a plugin. AFAIU plugins are able
to emit instrumentation, but this isn't something I know a lot about.

2020-05-14 14:22:54

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Thu, 14 May 2020 at 16:13, Peter Zijlstra <[email protected]> wrote:
>
> On Wed, May 13, 2020 at 03:58:30PM +0200, Marco Elver wrote:
> > On Wed, 13 May 2020 at 15:24, Peter Zijlstra <[email protected]> wrote:
>
> > > Also, could not this compiler instrumentation live as a kernel specific
> > > GCC-plugin instead of being part of GCC proper? Because in that case,
> > > we'd have much better control over it.
> >
> > I'd like it if we could make it a GCC-plugin for GCC, but how? I don't
> > see a way to affect TSAN instrumentation. FWIW Clang already has
> > distinguish-volatile support (unreleased Clang 11).
>
> Ah, I figured not use the built-in TSAN at all, do a complete
> replacement of the instrumentation with a plugin. AFAIU plugins are able
> to emit instrumentation, but this isn't something I know a lot about.

Interesting option. But it will likely not solve the no_sanitize and
inlining problem, because those are deeply tied to the optimization
pipelines.

2020-05-14 14:27:24

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Thu, May 14, 2020 at 03:35:58PM +0200, Marco Elver wrote:

> Let me try to spell out the requirements I see so far (this is for
> KCSAN only though -- other sanitizers might be similar):
>
> 1. __no_kcsan functions should not call anything, not even
> kcsan_{enable,disable}_current(), when using __{READ,WRITE}_ONCE.
> [Requires leaving data_race() off of these.]
>
> 2. __always_inline functions inlined into __no_sanitize function is
> not instrumented. [Has always been satisfied by GCC and Clang.]
>
> 3. __always_inline functions inlined into instrumented function is
> instrumented. [Has always been satisfied by GCC and Clang.]
>
> 4. __no_kcsan functions should never be spuriously inlined into
> instrumented functions, causing the accesses of the __no_kcsan
> function to be instrumented. [Satisfied by Clang >= 7. All GCC
> versions are broken.]
>
> 5. we should not break atomic_{read,set} for KCSAN. [Because of #1,
> we'd need to add data_race() around the arch-calls in
> atomic_{read,set}; or rely on Clang 11's -tsan-distinguish-volatile
> support (GCC 11 might get this as well).]
>
> 6. never emit __tsan_func_{entry,exit}. [Clang supports disabling
> this, GCC doesn't.]
>
> 7. kernel is supported by compiler. [Clang >= 9 seems to build -tip
> for me, anything below complains about lack of asm goto. GCC trivial.]
>
> So, because of #4 & #6 & #7 we're down to Clang >= 9. Because of #5
> we'll have to make a choice between Clang >= 9 or Clang >= 11
> (released in ~June). In an ideal world we might even fix GCC in
> future.
>
> That's not even considering the problems around UBSan and KASAN. But
> maybe one step at a time?

Exact same requirements, KASAN even has the data_race() problem through
READ_ONCE_NOCHECK(), UBSAN doesn't and might be simpler because of it.

> Any preferences?

I suppose DTRT, if we then write the Makefile rule like:

KCSAN_SANITIZE := KCSAN_FUNCTION_ATTRIBUTES

and set that to either 'y'/'n' depending on the compiler at hand
supporting enough magic to make it all work.

I suppose all the sanitize stuff is most important for developers and
we tend to have the latest compiler versions anyway, right?

2020-05-14 15:12:29

by Thomas Gleixner

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

Peter Zijlstra <[email protected]> writes:
> On Thu, May 14, 2020 at 03:35:58PM +0200, Marco Elver wrote:
>> Any preferences?
>
> I suppose DTRT, if we then write the Makefile rule like:
>
> KCSAN_SANITIZE := KCSAN_FUNCTION_ATTRIBUTES
>
> and set that to either 'y'/'n' depending on the compiler at hand
> supporting enough magic to make it all work.
>
> I suppose all the sanitize stuff is most important for developers and
> we tend to have the latest compiler versions anyway, right?

Developers and CI/testing stuff. Yes we really should require a sane
compiler instead of introducing boatloads of horrible workarounds all
over the place which then break when the code changes slightly.

Thanks,

tglx

2020-05-14 15:31:47

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Thu, 14 May 2020 at 17:09, Thomas Gleixner <[email protected]> wrote:
>
> Peter Zijlstra <[email protected]> writes:
> > On Thu, May 14, 2020 at 03:35:58PM +0200, Marco Elver wrote:
> >> Any preferences?
> >
> > I suppose DTRT, if we then write the Makefile rule like:
> >
> > KCSAN_SANITIZE := KCSAN_FUNCTION_ATTRIBUTES
> >
> > and set that to either 'y'/'n' depending on the compiler at hand
> > supporting enough magic to make it all work.
> >
> > I suppose all the sanitize stuff is most important for developers and
> > we tend to have the latest compiler versions anyway, right?
>
> Developers and CI/testing stuff. Yes we really should require a sane
> compiler instead of introducing boatloads of horrible workarounds all
> over the place which then break when the code changes slightly.

In which case, let me prepare a series on top of -tip for switching at
least KCSAN to Clang 11. If that's what we'll need, I don't see a
better option right now.

Thanks,
-- Marco

2020-05-14 15:42:02

by Paul E. McKenney

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Thu, May 14, 2020 at 03:35:58PM +0200, Marco Elver wrote:
> On Thu, 14 May 2020 at 13:05, Will Deacon <[email protected]> wrote:
> >
> > Hi Marco,
> >
> > On Thu, May 14, 2020 at 09:31:49AM +0200, Marco Elver wrote:
> > > Ouch. With the __{READ,WRITE}_ONCE requirement, we're going to need
> > > Clang 11 though.
> > >
> > > Because without the data_race() around __*_ONCE,
> > > arch_atomic_{read,set} will be broken for KCSAN, but we can't have
> > > data_race() because it would still add
> > > kcsan_{enable,disable}_current() calls to __no_sanitize functions (if
> > > compilation unit is instrumented). We can't make arch_atomic functions
> > > __no_sanitize_or_inline, because even in code that we want to
> > > sanitize, they should remain __always_inline (so they work properly in
> > > __no_sanitize functions). Therefore, Clang 11 with support for
> > > distinguishing volatiles will be the compiler that will satisfy all
> > > the constraints.
> > >
> > > If this is what we want, let me prepare a series on top of
> > > -tip/locking/kcsan with all the things I think we need.
> >
> > Stepping back a second, the locking/kcsan branch is at least functional at
> > the moment by virtue of KCSAN_SANITIZE := n being used liberally in
> > arch/x86/. However, I still think we want to do better than that because (a)
> > it would be good to get more x86 coverage and (b) enabling this for arm64,
> > where objtool is not yet available, will be fragile if we have to whitelist
> > object files. There's also a fair bit of arm64 low-level code spread around
> > drivers/, so it feels like we'd end up with a really bad case of whack-a-mole.
> >
> > Talking off-list, Clang >= 7 is pretty reasonable wrt inlining decisions
> > and the behaviour for __always_inline is:
> >
> > * An __always_inline function inlined into a __no_sanitize function is
> > not instrumented
> > * An __always_inline function inlined into an instrumented function is
> > instrumented
> > * You can't mark a function as both __always_inline __no_sanitize, because
> > __no_sanitize functions are never inlined
> >
> > GCC, on the other hand, may still inline __no_sanitize functions and then
> > subsequently instrument them.
> >
> > So if were willing to make KCSAN depend on Clang >= 7, then we could:
> >
> > - Remove the data_race() from __{READ,WRITE}_ONCE()
> > - Wrap arch_atomic*() in data_race() when called from the instrumented
> > atomic wrappers
> >
> > At which point, I *think* everything works as expected. READ_ONCE_NOCHECK()
> > won't generate any surprises, and Peter can happily use arch_atomic()
> > from non-instrumented code.
> >
> > Thoughts? I don't see the need to support buggy compilers when enabling
> > a new debug feature.
>
> This is also a reply to
> https://lkml.kernel.org/r/[email protected]
> -- the problem with __READ_ONCE would be solved with what Will
> proposed above.
>
> Let me try to spell out the requirements I see so far (this is for
> KCSAN only though -- other sanitizers might be similar):
>
> 1. __no_kcsan functions should not call anything, not even
> kcsan_{enable,disable}_current(), when using __{READ,WRITE}_ONCE.
> [Requires leaving data_race() off of these.]
>
> 2. __always_inline functions inlined into __no_sanitize function is
> not instrumented. [Has always been satisfied by GCC and Clang.]
>
> 3. __always_inline functions inlined into instrumented function is
> instrumented. [Has always been satisfied by GCC and Clang.]
>
> 4. __no_kcsan functions should never be spuriously inlined into
> instrumented functions, causing the accesses of the __no_kcsan
> function to be instrumented. [Satisfied by Clang >= 7. All GCC
> versions are broken.]
>
> 5. we should not break atomic_{read,set} for KCSAN. [Because of #1,
> we'd need to add data_race() around the arch-calls in
> atomic_{read,set}; or rely on Clang 11's -tsan-distinguish-volatile
> support (GCC 11 might get this as well).]
>
> 6. never emit __tsan_func_{entry,exit}. [Clang supports disabling
> this, GCC doesn't.]
>
> 7. kernel is supported by compiler. [Clang >= 9 seems to build -tip
> for me, anything below complains about lack of asm goto. GCC trivial.]
>
> So, because of #4 & #6 & #7 we're down to Clang >= 9. Because of #5
> we'll have to make a choice between Clang >= 9 or Clang >= 11
> (released in ~June). In an ideal world we might even fix GCC in
> future.
>
> That's not even considering the problems around UBSan and KASAN. But
> maybe one step at a time?
>
> Any preferences?

I am already having to choose where I run KCSAN based on what compiler
is available, so I cannot argue too hard against a dependency on a
specific compiler. I reserve the right to ask for help installing it,
if need be though. ;-)

Thanx, Paul

2020-05-14 20:09:22

by Thomas Gleixner

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

Marco Elver <[email protected]> writes:
> On Thu, 14 May 2020 at 17:09, Thomas Gleixner <[email protected]> wrote:
>>
>> Peter Zijlstra <[email protected]> writes:
>> > On Thu, May 14, 2020 at 03:35:58PM +0200, Marco Elver wrote:
>> >> Any preferences?
>> >
>> > I suppose DTRT, if we then write the Makefile rule like:
>> >
>> > KCSAN_SANITIZE := KCSAN_FUNCTION_ATTRIBUTES
>> >
>> > and set that to either 'y'/'n' depending on the compiler at hand
>> > supporting enough magic to make it all work.
>> >
>> > I suppose all the sanitize stuff is most important for developers and
>> > we tend to have the latest compiler versions anyway, right?
>>
>> Developers and CI/testing stuff. Yes we really should require a sane
>> compiler instead of introducing boatloads of horrible workarounds all
>> over the place which then break when the code changes slightly.
>
> In which case, let me prepare a series on top of -tip for switching at
> least KCSAN to Clang 11. If that's what we'll need, I don't see a
> better option right now.

And for a change that might make this time GCC people look at their open
bugs. :)

/me mumbles jumplabels and goes back to juggle patches

2020-05-15 09:22:19

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Thu, May 14, 2020 at 04:20:42PM +0200, Marco Elver wrote:
> On Thu, 14 May 2020 at 16:13, Peter Zijlstra <[email protected]> wrote:
> >
> > On Wed, May 13, 2020 at 03:58:30PM +0200, Marco Elver wrote:
> > > On Wed, 13 May 2020 at 15:24, Peter Zijlstra <[email protected]> wrote:
> >
> > > > Also, could not this compiler instrumentation live as a kernel specific
> > > > GCC-plugin instead of being part of GCC proper? Because in that case,
> > > > we'd have much better control over it.
> > >
> > > I'd like it if we could make it a GCC-plugin for GCC, but how? I don't
> > > see a way to affect TSAN instrumentation. FWIW Clang already has
> > > distinguish-volatile support (unreleased Clang 11).
> >
> > Ah, I figured not use the built-in TSAN at all, do a complete
> > replacement of the instrumentation with a plugin. AFAIU plugins are able
> > to emit instrumentation, but this isn't something I know a lot about.
>
> Interesting option. But it will likely not solve the no_sanitize and
> inlining problem, because those are deeply tied to the optimization
> pipelines.

So I'm imagining adding the instrumentation is done at a very late pass,
after all, all we want is to add instrumentation to any memops. I
imagine this is done right before doing register allocation and emitting
asm.

At this point we can look if the current function has a no_sanitize
attribute, no?

That is, this is done after all the optimization and inlining stages
anyway; why would we care about that?

Maybe I'm too naive of compiler internals; this really isn't my area :-)

2020-05-15 13:59:51

by David Laight

[permalink] [raw]

Subject: RE: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

From: Peter Zijlstra
> Sent: 14 May 2020 15:25
..
> Exact same requirements, KASAN even has the data_race() problem through
> READ_ONCE_NOCHECK(), UBSAN doesn't and might be simpler because of it.

What happens if you implement READ_ONCE_NOCHECK() with an
asm() statement containing a memory load?

Is that enough to kill all the instrumentation?

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2020-05-15 14:06:34

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Fri, 15 May 2020 at 15:55, David Laight <[email protected]> wrote:
>
> From: Peter Zijlstra
> > Sent: 14 May 2020 15:25
> ..
> > Exact same requirements, KASAN even has the data_race() problem through
> > READ_ONCE_NOCHECK(), UBSAN doesn't and might be simpler because of it.
>
> What happens if you implement READ_ONCE_NOCHECK() with an
> asm() statement containing a memory load?
>
> Is that enough to kill all the instrumentation?

Yes, it is.

However, READ_ONCE_NOCHECK() for KASAN can be fixed if the problem is
randomly uninlined READ_ONCE_NOCHECK() in KASAN_SANITIZE := n
compilation units. KASAN's __no_kasan_or_inline is still conditionally
defined based on CONFIG_KASAN and not __SANITIZE_ADDRESS__. I'm about
to send a patch that does that for KASAN, since for KCSAN we've been
doing it for a while. However, if that was the exact problem Peter
observed I can't tell.

Thanks,
-- Marco

2020-05-15 14:09:55

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Fri, May 15, 2020 at 01:55:43PM +0000, David Laight wrote:
> From: Peter Zijlstra
> > Sent: 14 May 2020 15:25
> ..
> > Exact same requirements, KASAN even has the data_race() problem through
> > READ_ONCE_NOCHECK(), UBSAN doesn't and might be simpler because of it.
>
> What happens if you implement READ_ONCE_NOCHECK() with an
> asm() statement containing a memory load?
>
> Is that enough to kill all the instrumentation?

You'll have to implement it for all archs, but yes, I think that ought
to work.

2020-05-22 16:10:59

[permalink] [raw]

Subject: [tip: locking/kcsan] kcsan: Restrict supported compilers

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID: 0d473b1d6e5c240f8ffed02715c718024802d0fa
Gitweb: https://git.kernel.org/tip/0d473b1d6e5c240f8ffed02715c718024802d0fa
Author: Marco Elver <[email protected]>
AuthorDate: Thu, 21 May 2020 16:20:42 +02:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Fri, 22 May 2020 14:46:02 +02:00

kcsan: Restrict supported compilers

The first version of Clang that supports -tsan-distinguish-volatile will
be able to support KCSAN. The first Clang release to do so, will be
Clang 11. This is due to satisfying all the following requirements:

1. Never emit calls to __tsan_func_{entry,exit}.

2. __no_kcsan functions should not call anything, not even
kcsan_{enable,disable}_current(), when using __{READ,WRITE}_ONCE => Requires
leaving them plain!

3. Support atomic_{read,set}*() with KCSAN, which rely on
arch_atomic_{read,set}*() using __{READ,WRITE}_ONCE() => Because of
#2, rely on Clang 11's -tsan-distinguish-volatile support. We will
double-instrument atomic_{read,set}*(), but that's reasonable given
it's still lower cost than the data_race() variant due to avoiding 2
extra calls (kcsan_{en,dis}able_current() calls).

4. __always_inline functions inlined into __no_kcsan functions are never
instrumented.

5. __always_inline functions inlined into instrumented functions are
instrumented.

6. __no_kcsan_or_inline functions may be inlined into __no_kcsan functions =>
Implies leaving 'noinline' off of __no_kcsan_or_inline.

7. Because of #6, __no_kcsan and __no_kcsan_or_inline functions should never be
spuriously inlined into instrumented functions, causing the accesses of the
__no_kcsan function to be instrumented.

Older versions of Clang do not satisfy #3. The latest GCC currently
doesn't support at least #1, #3, and #7.

Signed-off-by: Marco Elver <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/CANpmjNMTsY_8241bS7=XAfqvZHFLrVEkv_uM4aDUWE_kh3Rvbw@mail.gmail.com
Link: https://lkml.kernel.org/r/[email protected]
---
lib/Kconfig.kcsan | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/lib/Kconfig.kcsan b/lib/Kconfig.kcsan
index b5d88ac..5ee88e5 100644
--- a/lib/Kconfig.kcsan
+++ b/lib/Kconfig.kcsan
@@ -3,6 +3,12 @@
config HAVE_ARCH_KCSAN
bool

+config HAVE_KCSAN_COMPILER
+ def_bool CC_IS_CLANG && $(cc-option,-fsanitize=thread -mllvm -tsan-distinguish-volatile=1)
+ help
+ For the list of compilers that support KCSAN, please see
+ <file:Documentation/dev-tools/kcsan.rst>.
+
config KCSAN_KCOV_BROKEN
def_bool KCOV && CC_HAS_SANCOV_TRACE_PC
depends on CC_IS_CLANG
@@ -15,7 +21,8 @@ config KCSAN_KCOV_BROKEN

menuconfig KCSAN
bool "KCSAN: dynamic data race detector"
- depends on HAVE_ARCH_KCSAN && DEBUG_KERNEL && !KASAN
+ depends on HAVE_ARCH_KCSAN && HAVE_KCSAN_COMPILER
+ depends on DEBUG_KERNEL && !KASAN
depends on !KCSAN_KCOV_BROKEN
select STACKTRACE
help

2020-05-22 16:11:58

[permalink] [raw]

Subject: [tip: locking/kcsan] kcsan: Remove 'noinline' from __no_kcsan_or_inline

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID: f487a549ea30ee894055d8d20e81c1996a6e10a0
Gitweb: https://git.kernel.org/tip/f487a549ea30ee894055d8d20e81c1996a6e10a0
Author: Marco Elver <[email protected]>
AuthorDate: Thu, 21 May 2020 16:20:41 +02:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Fri, 22 May 2020 15:12:39 +02:00

kcsan: Remove 'noinline' from __no_kcsan_or_inline

Some compilers incorrectly inline small __no_kcsan functions, which then
results in instrumenting the accesses. For this reason, the 'noinline'
attribute was added to __no_kcsan_or_inline. All known versions of GCC
are affected by this. Supported versions of Clang are unaffected, and
never inline a no_sanitize function.

However, the attribute 'noinline' in __no_kcsan_or_inline causes
unexpected code generation in functions that are __no_kcsan and call a
__no_kcsan_or_inline function.

In certain situations it is expected that the __no_kcsan_or_inline
function is actually inlined by the __no_kcsan function, and *no* calls
are emitted. By removing the 'noinline' attribute, give the compiler
the ability to inline and generate the expected code in __no_kcsan
functions.

Signed-off-by: Marco Elver <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/CANpmjNNOpJk0tprXKB_deiNAv_UmmORf1-2uajLhnLWQQ1hvoA@mail.gmail.com
Link: https://lkml.kernel.org/r/[email protected]
---
include/linux/compiler.h | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index e24cc3a..17c98b2 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -276,11 +276,9 @@ do { \
#ifdef __SANITIZE_THREAD__
/*
* Rely on __SANITIZE_THREAD__ instead of CONFIG_KCSAN, to avoid not inlining in
- * compilation units where instrumentation is disabled. The attribute 'noinline'
- * is required for older compilers, where implicit inlining of very small
- * functions renders __no_sanitize_thread ineffective.
+ * compilation units where instrumentation is disabled.
*/
-# define __no_kcsan_or_inline __no_kcsan noinline notrace __maybe_unused
+# define __no_kcsan_or_inline __no_kcsan notrace __maybe_unused
# define __no_sanitize_or_inline __no_kcsan_or_inline
#else
# define __no_kcsan_or_inline __always_inline

2020-06-03 18:55:06

by Borislav Petkov

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Thu, May 14, 2020 at 12:05:38PM +0100, Will Deacon wrote:
> Talking off-list, Clang >= 7 is pretty reasonable wrt inlining decisions
> and the behaviour for __always_inline is:
>
> * An __always_inline function inlined into a __no_sanitize function is
> not instrumented
> * An __always_inline function inlined into an instrumented function is
> instrumented
> * You can't mark a function as both __always_inline __no_sanitize, because
> __no_sanitize functions are never inlined
>
> GCC, on the other hand, may still inline __no_sanitize functions and then
> subsequently instrument them.

Yeah, about that: I've been looking for a way to trigger this so that
I can show preprocessed source to gcc people. So do you guys have a
.config or somesuch I can try?

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-06-03 19:26:20

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Wed, 03 Jun 2020, Borislav Petkov wrote:

> On Thu, May 14, 2020 at 12:05:38PM +0100, Will Deacon wrote:
> > Talking off-list, Clang >= 7 is pretty reasonable wrt inlining decisions
> > and the behaviour for __always_inline is:
> >
> > * An __always_inline function inlined into a __no_sanitize function is
> > not instrumented
> > * An __always_inline function inlined into an instrumented function is
> > instrumented
> > * You can't mark a function as both __always_inline __no_sanitize, because
> > __no_sanitize functions are never inlined
> >
> > GCC, on the other hand, may still inline __no_sanitize functions and then
> > subsequently instrument them.
>
> Yeah, about that: I've been looking for a way to trigger this so that
> I can show preprocessed source to gcc people. So do you guys have a
> .config or somesuch I can try?

For example take this:

int x;

static inline __attribute__((no_sanitize_thread)) void do_not_sanitize(void) {
x++;
}

void sanitize_this(void) {
do_not_sanitize();
}

Then

gcc-10 -O3 -fsanitize=thread -o example.o -c example.c
objdump -D example.o

will show that do_not_sanitize() was inlined into sanitize_this() and is
instrumented. (With Clang this doesn't happen.)

Hope this is enough.

Thanks,
-- Marco

2020-06-03 22:07:51

by Borislav Petkov

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Wed, Jun 03, 2020 at 09:23:53PM +0200, Marco Elver wrote:
> Hope this is enough.

Thanks - it is. :-)

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-06-08 17:53:54

by Martin Liška

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On 6/3/20 9:23 PM, Marco Elver wrote:
>
>
> On Wed, 03 Jun 2020, Borislav Petkov wrote:
>
>> On Thu, May 14, 2020 at 12:05:38PM +0100, Will Deacon wrote:
>>> Talking off-list, Clang >= 7 is pretty reasonable wrt inlining decisions
>>> and the behaviour for __always_inline is:
>>>
>>> * An __always_inline function inlined into a __no_sanitize function is
>>> not instrumented
>>> * An __always_inline function inlined into an instrumented function is
>>> instrumented
>>> * You can't mark a function as both __always_inline __no_sanitize, because
>>> __no_sanitize functions are never inlined
>>>
>>> GCC, on the other hand, may still inline __no_sanitize functions and then
>>> subsequently instrument them.
>>
>> Yeah, about that: I've been looking for a way to trigger this so that
>> I can show preprocessed source to gcc people. So do you guys have a
>> .config or somesuch I can try?
>
> For example take this:
>
> int x;
>
> static inline __attribute__((no_sanitize_thread)) void do_not_sanitize(void) {
> x++;
> }
>
> void sanitize_this(void) {
> do_not_sanitize();
> }
>
> Then
>
> gcc-10 -O3 -fsanitize=thread -o example.o -c example.c
> objdump -D example.o

Hello.

Thank you for the example. It seems to me that Clang does not inline a no_sanitize_* function
into one which is instrumented. Is it a documented behavior ([1] doesn't mention that)?
If so, we can do the same in GCC.

Thanks,
Martin

[1] https://clang.llvm.org/docs/AttributeReference.html#no-sanitize

>
> will show that do_not_sanitize() was inlined into sanitize_this() and is
> instrumented. (With Clang this doesn't happen.)
>
> Hope this is enough.
>
> Thanks,
> -- Marco
>

2020-06-08 19:59:54

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On Mon, 8 Jun 2020 at 19:32, Martin Liška <[email protected]> wrote:
>
> On 6/3/20 9:23 PM, Marco Elver wrote:
> >
> >
> > On Wed, 03 Jun 2020, Borislav Petkov wrote:
> >
> >> On Thu, May 14, 2020 at 12:05:38PM +0100, Will Deacon wrote:
> >>> Talking off-list, Clang >= 7 is pretty reasonable wrt inlining decisions
> >>> and the behaviour for __always_inline is:
> >>>
> >>> * An __always_inline function inlined into a __no_sanitize function is
> >>> not instrumented
> >>> * An __always_inline function inlined into an instrumented function is
> >>> instrumented
> >>> * You can't mark a function as both __always_inline __no_sanitize, because
> >>> __no_sanitize functions are never inlined
> >>>
> >>> GCC, on the other hand, may still inline __no_sanitize functions and then
> >>> subsequently instrument them.
> >>
> >> Yeah, about that: I've been looking for a way to trigger this so that
> >> I can show preprocessed source to gcc people. So do you guys have a
> >> .config or somesuch I can try?
> >
> > For example take this:
> >
> > int x;
> >
> > static inline __attribute__((no_sanitize_thread)) void do_not_sanitize(void) {
> > x++;
> > }
> >
> > void sanitize_this(void) {
> > do_not_sanitize();
> > }
> >
> > Then
> >
> > gcc-10 -O3 -fsanitize=thread -o example.o -c example.c
> > objdump -D example.o
>
> Hello.
>
> Thank you for the example. It seems to me that Clang does not inline a no_sanitize_* function
> into one which is instrumented. Is it a documented behavior ([1] doesn't mention that)?
> If so, we can do the same in GCC.

It is not explicitly mentioned in [1]. But the contract of
"no_sanitize" is "that a particular instrumentation or set of
instrumentations should not be applied". That contract is broken if a
function is instrumented, however that may happen. It sadly does
happen with GCC when a function is inlined. Presumably because the
sanitizer passes for TSAN/ASAN/MSAN run after the optimizer -- this
definitely can't change. Also because it currently gives us the
property that __always_inline functions are instrumented according to
the function they are inlined into (a property we want).

The easy fix to no_sanitize seems to be to do what Clang does, and
never inline no_sanitize functions (with or without "inline"
attribute). always_inline functions should remain unchanged
(specifying no_sanitize on an always_inline function is an error).

Note this applies to all sanitizers (TSAN/ASAN/MSAN) and their
no_sanitize attribute that GCC has.

The list of requirements were also summarized in more detail here:
https://lore.kernel.org/lkml/CANpmjNMTsY_8241bS7=XAfqvZHFLrVEkv_uM4aDUWE_kh3Rvbw@mail.gmail.com/

Hope that makes sense. (I also need to send a v2 for param
tsan-distinguish-volatile, but haven't gotten around to it yet --
hopefully soon. And then we also need a param
tsan-instrument-func-entry-exit, which LLVM has for TSAN. One step at
a time though.)

Thanks,
-- Marco

> Thanks,
> Martin
>
> [1] https://clang.llvm.org/docs/AttributeReference.html#no-sanitize
>
> >
> > will show that do_not_sanitize() was inlined into sanitize_this() and is
> > instrumented. (With Clang this doesn't happen.)
> >
> > Hope this is enough.
> >
> > Thanks,
> > -- Marco
> >
>

2020-06-09 11:59:36

by Martin Liška

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On 6/8/20 9:56 PM, Marco Elver wrote:
> On Mon, 8 Jun 2020 at 19:32, Martin Liška <[email protected]> wrote:
>>
>> On 6/3/20 9:23 PM, Marco Elver wrote:
>>>
>>>
>>> On Wed, 03 Jun 2020, Borislav Petkov wrote:
>>>
>>>> On Thu, May 14, 2020 at 12:05:38PM +0100, Will Deacon wrote:
>>>>> Talking off-list, Clang >= 7 is pretty reasonable wrt inlining decisions
>>>>> and the behaviour for __always_inline is:
>>>>>
>>>>> * An __always_inline function inlined into a __no_sanitize function is
>>>>> not instrumented
>>>>> * An __always_inline function inlined into an instrumented function is
>>>>> instrumented
>>>>> * You can't mark a function as both __always_inline __no_sanitize, because
>>>>> __no_sanitize functions are never inlined
>>>>>
>>>>> GCC, on the other hand, may still inline __no_sanitize functions and then
>>>>> subsequently instrument them.
>>>>
>>>> Yeah, about that: I've been looking for a way to trigger this so that
>>>> I can show preprocessed source to gcc people. So do you guys have a
>>>> .config or somesuch I can try?
>>>
>>> For example take this:
>>>
>>> int x;
>>>
>>> static inline __attribute__((no_sanitize_thread)) void do_not_sanitize(void) {
>>> x++;
>>> }
>>>
>>> void sanitize_this(void) {
>>> do_not_sanitize();
>>> }
>>>
>>> Then
>>>
>>> gcc-10 -O3 -fsanitize=thread -o example.o -c example.c
>>> objdump -D example.o
>>
>> Hello.
>>
>> Thank you for the example. It seems to me that Clang does not inline a no_sanitize_* function
>> into one which is instrumented. Is it a documented behavior ([1] doesn't mention that)?
>> If so, we can do the same in GCC.
>
> It is not explicitly mentioned in [1]. But the contract of
> "no_sanitize" is "that a particular instrumentation or set of
> instrumentations should not be applied". That contract is broken if a
> function is instrumented, however that may happen. It sadly does
> happen with GCC when a function is inlined. Presumably because the
> sanitizer passes for TSAN/ASAN/MSAN run after the optimizer -- this
> definitely can't change. Also because it currently gives us the
> property that __always_inline functions are instrumented according to
> the function they are inlined into (a property we want).
>
> The easy fix to no_sanitize seems to be to do what Clang does, and
> never inline no_sanitize functions (with or without "inline"
> attribute). always_inline functions should remain unchanged
> (specifying no_sanitize on an always_inline function is an error).

Hello.

Works for me and I've just sent patch for that:
https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547618.html

>
> Note this applies to all sanitizers (TSAN/ASAN/MSAN) and their
> no_sanitize attribute that GCC has.

Sure.

>
> The list of requirements were also summarized in more detail here:
> https://lore.kernel.org/lkml/CANpmjNMTsY_8241bS7=XAfqvZHFLrVEkv_uM4aDUWE_kh3Rvbw@mail.gmail.com/
>
> Hope that makes sense. (I also need to send a v2 for param
> tsan-distinguish-volatile, but haven't gotten around to it yet --
> hopefully soon.

The patch is approved now.

And then we also need a param
> tsan-instrument-func-entry-exit, which LLVM has for TSAN. One step at
> a time though.)

Yes, please send a patch for it.

Martin

>
> Thanks,
> -- Marco
>
>
>> Thanks,
>> Martin
>>
>> [1] https://clang.llvm.org/docs/AttributeReference.html#no-sanitize
>>
>>>
>>> will show that do_not_sanitize() was inlined into sanitize_this() and is
>>> instrumented. (With Clang this doesn't happen.)
>>>
>>> Hope this is enough.
>>>
>>> Thanks,
>>> -- Marco
>>>
>>

2020-06-09 18:52:20

by Martin Liška

[permalink] [raw]

Subject: Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

On 6/9/20 1:55 PM, Martin Liška wrote:
> Works for me and I've just sent patch for that:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547618.html

The patch has landed into master.

Martin

2020-06-09 19:01:59