2018-04-15 04:29:11

by Yonghong Song

[permalink] [raw]
Subject: [PATCH v2] x86/cpufeature: guard asm_volatile_goto usage with NO_BPF_WORKAROUND

Commit d0266046ad54 ("x86: Remove FAST_FEATURE_TESTS")
removed X86_FAST_FEATURE_TESTS and make macro static_cpu_has() always
use __always_inline function _static_cpu_has() funciton.
The static_cpu_has() uses gcc feature asm_volatile_goto construct,
which is not supported by clang.

Currently, for BPF programs written in C, the only widely-supported
compiler is clang. Because of clang lacking support
of asm_volatile_goto construct, if you try to compile
bpf programs under samples/bpf/,
$ make -j20 && make headers_install && make samples/bpf/
you will see a lot of failures like below:

=========================
clang -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include \
-I/home/yhs/work/bpf-next/arch/x86/include \
-I./arch/x86/include/generated -I/home/yhs/work/bpf-next/include \
-I./include -I/home/yhs/work/bpf-next/arch/x86/include/uapi \
-I./arch/x86/include/generated/uapi \
-I/home/yhs/work/bpf-next/include/uapi -I./include/generated/uapi \
-include /home/yhs/work/bpf-next/include/linux/kconfig.h
-Isamples/bpf \
-I/home/yhs/work/bpf-next/tools/testing/selftests/bpf/ \
-D__KERNEL__ -Wno-unused-value -Wno-pointer-sign \
-D__TARGET_ARCH_x86 -Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
-Wno-address-of-packed-member -Wno-tautological-compare \
-Wno-unknown-warning-option \
-O2 -emit-llvm -c /home/yhs/work/bpf-next/samples/bpf/xdp_redirect_cpu_kern.c \
-o -| llc -march=bpf -filetype=obj -o samples/bpf/xdp_redirect_cpu_kern.o
In file included from /home/yhs/work/bpf-next/samples/bpf/xdp_redirect_cpu_kern.c:10:
In file included from /home/yhs/work/bpf-next/include/uapi/linux/in.h:24:
In file included from /home/yhs/work/bpf-next/include/linux/socket.h:8:
In file included from /home/yhs/work/bpf-next/include/linux/uio.h:13:
In file included from /home/yhs/work/bpf-next/include/linux/thread_info.h:38:
In file included from /home/yhs/work/bpf-next/arch/x86/include/asm/thread_info.h:53:
/home/yhs/work/bpf-next/arch/x86/include/asm/cpufeature.h:150:2: error: 'asm goto' constructs are not supported yet
asm_volatile_goto("1: jmp 6f\n"
^
/home/yhs/work/bpf-next/include/linux/compiler-gcc.h:296:42: note: expanded from macro 'asm_volatile_goto'
#define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
^
1 error generated.
=========================
...
In file included from /home/yhs/work/bpf-next/samples/bpf/tracex4_kern.c:7:
In file included from /home/yhs/work/bpf-next/include/linux/ptrace.h:6:
In file included from /home/yhs/work/bpf-next/include/linux/sched.h:14:
In file included from /home/yhs/work/bpf-next/include/linux/pid.h:5:
In file included from /home/yhs/work/bpf-next/include/linux/rculist.h:11:
In file included from /home/yhs/work/bpf-next/include/linux/rcupdate.h:40:
In file included from /home/yhs/work/bpf-next/include/linux/preempt.h:81:
In file included from /home/yhs/work/bpf-next/arch/x86/include/asm/preempt.h:7:
In file included from /home/yhs/work/bpf-next/include/linux/thread_info.h:38:
In file included from /home/yhs/work/bpf-next/arch/x86/include/asm/thread_info.h:53:
/home/yhs/work/bpf-next/arch/x86/include/asm/cpufeature.h:150:2: error: 'asm goto' constructs are not supported yet
...
=========================

This patch adds a preprocessor guard NO_BPF_WORKAROUND around the
asm_volatile_goto based static_cpu_has(). NO_BPF_WORKAROUND is set
at toplevel Makefile when compiler supports asm-goto. That is,
if the compiler supports asm-goto, the kernel build will use
asm-goto version of static_cpu_has().

For clang compilation for bpf programs where only header files
are accessed, NO_BPF_WORKAROUND is not defined and asm-goto version
of static_cpu_has() will not be accessed, hence the above compilation
error will disappear.

If later clang compiler starts to support asm_volatile_goto,
the change in this patch can be reverted.

Fixes: d0266046ad54 ("x86: Remove FAST_FEATURE_TESTS")
Suggested-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Yonghong Song <[email protected]>
---
Makefile | 1 +
arch/x86/include/asm/cpufeature.h | 5 +++++
2 files changed, 6 insertions(+)

Changelog:
v1 -> v2:
Use NO_BPF_WORKAROUND macro instead of CC_HAVE_ASM_GOTO
to make it explicit that this is a workaround.

diff --git a/Makefile b/Makefile
index c1a608a..7b81f1f 100644
--- a/Makefile
+++ b/Makefile
@@ -504,6 +504,7 @@ export RETPOLINE_CFLAGS
ifeq ($(call shell-cached,$(CONFIG_SHELL) $(srctree)/scripts/gcc-goto.sh $(CC) $(KBUILD_CFLAGS)), y)
CC_HAVE_ASM_GOTO := 1
KBUILD_CFLAGS += -DCC_HAVE_ASM_GOTO
+ KBUILD_CFLAGS += -DNO_BPF_WORKAROUND
KBUILD_AFLAGS += -DCC_HAVE_ASM_GOTO
endif

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index b27da96..638417b 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -140,6 +140,8 @@ extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);

#define setup_force_cpu_bug(bit) setup_force_cpu_cap(bit)

+/* this macro is a temporary hack for bpf until clang gains asm-goto support */
+#ifdef NO_BPF_WORKAROUND
/*
* Static testing of CPU features. Used the same as boot_cpu_has().
* These will statically patch the target code for additional
@@ -195,6 +197,9 @@ static __always_inline __pure bool _static_cpu_has(u16 bit)
boot_cpu_has(bit) : \
_static_cpu_has(bit) \
)
+#else
+#define static_cpu_has(bit) boot_cpu_has(bit)
+#endif

#define cpu_has_bug(c, bit) cpu_has(c, (bit))
#define set_cpu_bug(c, bit) set_cpu_cap(c, (bit))
--
2.9.5



2018-04-19 17:02:49

by Yonghong Song

[permalink] [raw]
Subject: Re: [PATCH v2] x86/cpufeature: guard asm_volatile_goto usage with NO_BPF_WORKAROUND


Hi, Peter,

Just pinging. Did you get chances to look at this?

[ cc netdev as well so folks are aware of the issue. ]

Thanks!

Yonghong

On 4/14/18 9:27 PM, Yonghong Song wrote:
> Commit d0266046ad54 ("x86: Remove FAST_FEATURE_TESTS")
> removed X86_FAST_FEATURE_TESTS and make macro static_cpu_has() always
> use __always_inline function _static_cpu_has() funciton.
> The static_cpu_has() uses gcc feature asm_volatile_goto construct,
> which is not supported by clang.
>
> Currently, for BPF programs written in C, the only widely-supported
> compiler is clang. Because of clang lacking support
> of asm_volatile_goto construct, if you try to compile
> bpf programs under samples/bpf/,
> $ make -j20 && make headers_install && make samples/bpf/
> you will see a lot of failures like below:
>
> =========================
> clang -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include \
> -I/home/yhs/work/bpf-next/arch/x86/include \
> -I./arch/x86/include/generated -I/home/yhs/work/bpf-next/include \
> -I./include -I/home/yhs/work/bpf-next/arch/x86/include/uapi \
> -I./arch/x86/include/generated/uapi \
> -I/home/yhs/work/bpf-next/include/uapi -I./include/generated/uapi \
> -include /home/yhs/work/bpf-next/include/linux/kconfig.h
> -Isamples/bpf \
> -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf/ \
> -D__KERNEL__ -Wno-unused-value -Wno-pointer-sign \
> -D__TARGET_ARCH_x86 -Wno-compare-distinct-pointer-types \
> -Wno-gnu-variable-sized-type-not-at-end \
> -Wno-address-of-packed-member -Wno-tautological-compare \
> -Wno-unknown-warning-option \
> -O2 -emit-llvm -c /home/yhs/work/bpf-next/samples/bpf/xdp_redirect_cpu_kern.c \
> -o -| llc -march=bpf -filetype=obj -o samples/bpf/xdp_redirect_cpu_kern.o
> In file included from /home/yhs/work/bpf-next/samples/bpf/xdp_redirect_cpu_kern.c:10:
> In file included from /home/yhs/work/bpf-next/include/uapi/linux/in.h:24:
> In file included from /home/yhs/work/bpf-next/include/linux/socket.h:8:
> In file included from /home/yhs/work/bpf-next/include/linux/uio.h:13:
> In file included from /home/yhs/work/bpf-next/include/linux/thread_info.h:38:
> In file included from /home/yhs/work/bpf-next/arch/x86/include/asm/thread_info.h:53:
> /home/yhs/work/bpf-next/arch/x86/include/asm/cpufeature.h:150:2: error: 'asm goto' constructs are not supported yet
> asm_volatile_goto("1: jmp 6f\n"
> ^
> /home/yhs/work/bpf-next/include/linux/compiler-gcc.h:296:42: note: expanded from macro 'asm_volatile_goto'
> #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
> ^
> 1 error generated.
> =========================
> ...
> In file included from /home/yhs/work/bpf-next/samples/bpf/tracex4_kern.c:7:
> In file included from /home/yhs/work/bpf-next/include/linux/ptrace.h:6:
> In file included from /home/yhs/work/bpf-next/include/linux/sched.h:14:
> In file included from /home/yhs/work/bpf-next/include/linux/pid.h:5:
> In file included from /home/yhs/work/bpf-next/include/linux/rculist.h:11:
> In file included from /home/yhs/work/bpf-next/include/linux/rcupdate.h:40:
> In file included from /home/yhs/work/bpf-next/include/linux/preempt.h:81:
> In file included from /home/yhs/work/bpf-next/arch/x86/include/asm/preempt.h:7:
> In file included from /home/yhs/work/bpf-next/include/linux/thread_info.h:38:
> In file included from /home/yhs/work/bpf-next/arch/x86/include/asm/thread_info.h:53:
> /home/yhs/work/bpf-next/arch/x86/include/asm/cpufeature.h:150:2: error: 'asm goto' constructs are not supported yet
> ...
> =========================
>
> This patch adds a preprocessor guard NO_BPF_WORKAROUND around the
> asm_volatile_goto based static_cpu_has(). NO_BPF_WORKAROUND is set
> at toplevel Makefile when compiler supports asm-goto. That is,
> if the compiler supports asm-goto, the kernel build will use
> asm-goto version of static_cpu_has().
>
> For clang compilation for bpf programs where only header files
> are accessed, NO_BPF_WORKAROUND is not defined and asm-goto version
> of static_cpu_has() will not be accessed, hence the above compilation
> error will disappear.
>
> If later clang compiler starts to support asm_volatile_goto,
> the change in this patch can be reverted.
>
> Fixes: d0266046ad54 ("x86: Remove FAST_FEATURE_TESTS")
> Suggested-by: Alexei Starovoitov <[email protected]>
> Signed-off-by: Yonghong Song <[email protected]>
> ---
> Makefile | 1 +
> arch/x86/include/asm/cpufeature.h | 5 +++++
> 2 files changed, 6 insertions(+)
>
> Changelog:
> v1 -> v2:
> Use NO_BPF_WORKAROUND macro instead of CC_HAVE_ASM_GOTO
> to make it explicit that this is a workaround.
>
> diff --git a/Makefile b/Makefile
> index c1a608a..7b81f1f 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -504,6 +504,7 @@ export RETPOLINE_CFLAGS
> ifeq ($(call shell-cached,$(CONFIG_SHELL) $(srctree)/scripts/gcc-goto.sh $(CC) $(KBUILD_CFLAGS)), y)
> CC_HAVE_ASM_GOTO := 1
> KBUILD_CFLAGS += -DCC_HAVE_ASM_GOTO
> + KBUILD_CFLAGS += -DNO_BPF_WORKAROUND
> KBUILD_AFLAGS += -DCC_HAVE_ASM_GOTO
> endif
>
> diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> index b27da96..638417b 100644
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -140,6 +140,8 @@ extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);
>
> #define setup_force_cpu_bug(bit) setup_force_cpu_cap(bit)
>
> +/* this macro is a temporary hack for bpf until clang gains asm-goto support */
> +#ifdef NO_BPF_WORKAROUND
> /*
> * Static testing of CPU features. Used the same as boot_cpu_has().
> * These will statically patch the target code for additional
> @@ -195,6 +197,9 @@ static __always_inline __pure bool _static_cpu_has(u16 bit)
> boot_cpu_has(bit) : \
> _static_cpu_has(bit) \
> )
> +#else
> +#define static_cpu_has(bit) boot_cpu_has(bit)
> +#endif
>
> #define cpu_has_bug(c, bit) cpu_has(c, (bit))
> #define set_cpu_bug(c, bit) set_cpu_cap(c, (bit))
>

2018-04-20 08:21:02

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2] x86/cpufeature: guard asm_volatile_goto usage with NO_BPF_WORKAROUND

On Sat, Apr 14, 2018 at 09:27:38PM -0700, Yonghong Song wrote:

> This patch adds a preprocessor guard NO_BPF_WORKAROUND around the
> asm_volatile_goto based static_cpu_has(). NO_BPF_WORKAROUND is set
> at toplevel Makefile when compiler supports asm-goto. That is,
> if the compiler supports asm-goto, the kernel build will use
> asm-goto version of static_cpu_has().

Hurm, so adding __BPF__ for BPF compiles isn't an option? It seems to me
having a CPP flag to identify BPF compile context might be useful in
general.



2018-04-20 18:08:22

by Yonghong Song

[permalink] [raw]
Subject: Re: [PATCH v2] x86/cpufeature: guard asm_volatile_goto usage with NO_BPF_WORKAROUND



On 4/20/18 1:19 AM, Peter Zijlstra wrote:
> On Sat, Apr 14, 2018 at 09:27:38PM -0700, Yonghong Song wrote:
>
>> This patch adds a preprocessor guard NO_BPF_WORKAROUND around the
>> asm_volatile_goto based static_cpu_has(). NO_BPF_WORKAROUND is set
>> at toplevel Makefile when compiler supports asm-goto. That is,
>> if the compiler supports asm-goto, the kernel build will use
>> asm-goto version of static_cpu_has().
>
> Hurm, so adding __BPF__ for BPF compiles isn't an option? It seems to me
> having a CPP flag to identify BPF compile context might be useful in
> general.

With "clang -target bpf", we already have __BPF__ defined.
For tracing, esp. ptrace.h is included, "clang -target <native_arch>"
where "-target <native_arch>" can be omitted, is typically used.
The reason is the native architecture header files typically
include a lot of various asm related stuff where "-target bpf" cannot
really handle. We relay on native clang to flush out all these
asm constructs and only bpf program needed stuff survives
reach to backend compiler.

The backend compiler, llc, will have option "-march=bpf" to do
right thing to generate bpf byte codes.

So the patch is really a workaround for "clang -target x86_64" with
intention of using "llc -march=bpf" later on.

2018-04-23 10:54:21

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2] x86/cpufeature: guard asm_volatile_goto usage with NO_BPF_WORKAROUND

On Fri, Apr 20, 2018 at 11:06:03AM -0700, Yonghong Song wrote:
> On 4/20/18 1:19 AM, Peter Zijlstra wrote:

> > Hurm, so adding __BPF__ for BPF compiles isn't an option? It seems to me
> > having a CPP flag to identify BPF compile context might be useful in
> > general.
>
> With "clang -target bpf", we already have __BPF__ defined.
> For tracing, esp. ptrace.h is included, "clang -target <native_arch>" where
> "-target <native_arch>" can be omitted, is typically used.

> The reason is the native architecture header files typically
> include a lot of various asm related stuff where "-target bpf" cannot
> really handle. We relay on native clang to flush out all these
> asm constructs and only bpf program needed stuff survives
> reach to backend compiler.

So because 'clang -target bpf' is 'broken', you do a work-around using
'clang -target <native_arch>'. But because that doesn't set __BPF__ you
want to add NO_BPF_WORKAROUND to the kernel instead of adding __BPF__ to
your build rules to better mimick -target bpf, which you should be
using.

How is that sane? Why not use 'clang -target <native_arch> -D__BPF__'

2018-04-23 16:53:25

by Yonghong Song

[permalink] [raw]
Subject: Re: [PATCH v2] x86/cpufeature: guard asm_volatile_goto usage with NO_BPF_WORKAROUND

Hi, Peter,

Please see comments below.

On 4/23/18 3:52 AM, Peter Zijlstra wrote:
> On Fri, Apr 20, 2018 at 11:06:03AM -0700, Yonghong Song wrote:
>> On 4/20/18 1:19 AM, Peter Zijlstra wrote:
>
>>> Hurm, so adding __BPF__ for BPF compiles isn't an option? It seems to me
>>> having a CPP flag to identify BPF compile context might be useful in
>>> general.
>>
>> With "clang -target bpf", we already have __BPF__ defined.
>> For tracing, esp. ptrace.h is included, "clang -target <native_arch>" where
>> "-target <native_arch>" can be omitted, is typically used.
>
>> The reason is the native architecture header files typically
>> include a lot of various asm related stuff where "-target bpf" cannot
>> really handle. We relay on native clang to flush out all these
>> asm constructs and only bpf program needed stuff survives
>> reach to backend compiler.
>
> So because 'clang -target bpf' is 'broken', you do a work-around using

'clang -target bpf' is 'broken' in this case because the x86 arch has
a lot of inline asm's in the header file where bpf target cannot handle.
For most networking related bpf programs where `asm` is rarely involved,
`clang -target bpf` works fine most of time.

> 'clang -target <native_arch>'. But because that doesn't set __BPF__ you

`clang -target <native_arch>` should work, regardless of whether __BPF__
macro is setup or not. The reason it doesn't work now is due to its
lacking asm-goto support. So to use `clang -target <native_arch>` is not
really a workaround for `target bpf`. It by itself should work.

> want to add NO_BPF_WORKAROUND to the kernel instead of adding __BPF__ to
> your build rules to better mimick -target bpf, which you should be
> using.
>
> How is that sane? Why not use 'clang -target <native_arch> -D__BPF__'

To workaround the asm-goto issue, the suggested macro __BPF__ can be
added to user space and kernel. But note that `clang -target
<native_arch>` will not define the macro __BPF__, so this requires
user space change.

Also, to make sure people understand that this is a WORKAROUND for
asm-goto issue and is not a lasting thing we want to support. I have
the following change for cpufeature.h:

diff --git a/arch/x86/include/asm/cpufeature.h
b/arch/x86/include/asm/cpufeature.h
index b27da9602a6d..c832118defa1 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -140,6 +140,7 @@ extern void clear_cpu_cap(struct cpuinfo_x86 *c,
unsigned int bit);

#define setup_force_cpu_bug(bit) setup_force_cpu_cap(bit)

+#ifndef __BPF_WORKAROUND__
/*
* Static testing of CPU features. Used the same as boot_cpu_has().
* These will statically patch the target code for additional
@@ -195,6 +196,9 @@ static __always_inline __pure bool
_static_cpu_has(u16 bit)
boot_cpu_has(bit) : \
_static_cpu_has(bit) \
)
+#else
+#define static_cpu_has(bit) boot_cpu_has(bit)
+#endif

#define cpu_has_bug(c, bit) cpu_has(c, (bit))
#define set_cpu_bug(c, bit) set_cpu_cap(c, (bit))
diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 4d6a6edd4bf6..b229e5090e4a 100644

As mentioned above, user space needs to add this new macro definition.
Specifically for kernel/samples/bpf:
diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 4d6a6edd4bf6..b229e5090e4a 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -255,7 +255,7 @@ $(obj)/tracex5_kern.o: $(obj)/syscall_nrs.h
$(obj)/%.o: $(src)/%.c
$(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS)
-I$(obj) \
-I$(srctree)/tools/testing/selftests/bpf/ \
- -D__KERNEL__ -Wno-unused-value -Wno-pointer-sign \
+ -D__KERNEL__ -D__BPF_WORKAROUND__ -Wno-unused-value
-Wno-pointer-sign \
-D__TARGET_ARCH_$(ARCH)
-Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
-Wno-address-of-packed-member -Wno-tautological-compare \

Please let me know whether this approach is okay to you or not,
whether the name __BPF_WORKAROUND__ is better than __BPF__ or not, or we
could use the earlier approach which does not require user space change.

Thanks!

2018-04-27 16:36:27

by Yonghong Song

[permalink] [raw]
Subject: Re: [PATCH v2] x86/cpufeature: guard asm_volatile_goto usage with NO_BPF_WORKAROUND

Hi, Peter,

Just wanted to ping again so that you did not miss the email below.
Please let me know your opinion.

Thanks!

Yonghong


On 4/23/18 9:50 AM, Yonghong Song wrote:
> Hi, Peter,
>
> Please see comments below.
>
> On 4/23/18 3:52 AM, Peter Zijlstra wrote:
>> On Fri, Apr 20, 2018 at 11:06:03AM -0700, Yonghong Song wrote:
>>> On 4/20/18 1:19 AM, Peter Zijlstra wrote:
>>
>>>> Hurm, so adding __BPF__ for BPF compiles isn't an option? It seems
>>>> to me
>>>> having a CPP flag to identify BPF compile context might be useful in
>>>> general.
>>>
>>> With "clang -target bpf", we already have __BPF__ defined.
>>> For tracing, esp. ptrace.h is included, "clang -target <native_arch>"
>>> where
>>> "-target <native_arch>" can be omitted, is typically used.
>>
>>> The reason is the native architecture header files typically
>>> include a lot of various asm related stuff where "-target bpf" cannot
>>> really handle. We relay on native clang to flush out all these
>>> asm constructs and only bpf program needed stuff survives
>>> reach to backend compiler.
>>
>> So because 'clang -target bpf' is 'broken', you do a work-around using
>
> 'clang -target bpf' is 'broken' in this case because the x86 arch has
> a lot of inline asm's in the header file where bpf target cannot handle.
> For most networking related bpf programs where `asm` is rarely involved,
> `clang -target bpf` works fine most of time.
>
>> 'clang -target <native_arch>'. But because that doesn't set __BPF__ you
>
> `clang -target <native_arch>` should work, regardless of whether __BPF__
> macro is setup or not. The reason it doesn't work now is due to its
> lacking asm-goto support. So to use `clang -target <native_arch>` is not
> really a workaround for `target bpf`. It by itself should work.
>
>> want to add NO_BPF_WORKAROUND to the kernel instead of adding __BPF__ to
>> your build rules to better mimick -target bpf, which you should be
>> using.
>>
>> How is that sane? Why not use 'clang -target <native_arch> -D__BPF__'
>
> To workaround the asm-goto issue, the suggested macro __BPF__ can be
> added to user space and kernel. But note that `clang -target
> <native_arch>` will not define the macro __BPF__, so this requires
> user space change.
>
> Also, to make sure people understand that this is a WORKAROUND for
> asm-goto issue and is not a lasting thing we want to support. I have
> the following change for cpufeature.h:
>
> diff --git a/arch/x86/include/asm/cpufeature.h
> b/arch/x86/include/asm/cpufeature.h
> index b27da9602a6d..c832118defa1 100644
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -140,6 +140,7 @@ extern void clear_cpu_cap(struct cpuinfo_x86 *c,
> unsigned int bit);
>
>  #define setup_force_cpu_bug(bit) setup_force_cpu_cap(bit)
>
> +#ifndef __BPF_WORKAROUND__
>  /*
>   * Static testing of CPU features.  Used the same as boot_cpu_has().
>   * These will statically patch the target code for additional
> @@ -195,6 +196,9 @@ static __always_inline __pure bool
> _static_cpu_has(u16 bit)
>                 boot_cpu_has(bit) :                             \
>                 _static_cpu_has(bit)                            \
>  )
> +#else
> +#define static_cpu_has(bit)            boot_cpu_has(bit)
> +#endif
>
>  #define cpu_has_bug(c, bit)            cpu_has(c, (bit))
>  #define set_cpu_bug(c, bit)            set_cpu_cap(c, (bit))
> diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
> index 4d6a6edd4bf6..b229e5090e4a 100644
>
> As mentioned above, user space needs to add this new macro definition.
> Specifically for kernel/samples/bpf:
> diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
> index 4d6a6edd4bf6..b229e5090e4a 100644
> --- a/samples/bpf/Makefile
> +++ b/samples/bpf/Makefile
> @@ -255,7 +255,7 @@ $(obj)/tracex5_kern.o: $(obj)/syscall_nrs.h
>  $(obj)/%.o: $(src)/%.c
>         $(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS)
> -I$(obj) \
>                 -I$(srctree)/tools/testing/selftests/bpf/ \
> -               -D__KERNEL__ -Wno-unused-value -Wno-pointer-sign \
> +               -D__KERNEL__ -D__BPF_WORKAROUND__ -Wno-unused-value
> -Wno-pointer-sign \
>                 -D__TARGET_ARCH_$(ARCH)
> -Wno-compare-distinct-pointer-types \
>                 -Wno-gnu-variable-sized-type-not-at-end \
>                 -Wno-address-of-packed-member -Wno-tautological-compare \
>
> Please let me know whether this approach is okay to you or not,
> whether the name __BPF_WORKAROUND__ is better than __BPF__ or not, or we
> could use the earlier approach which does not require user space change.
>
> Thanks!

2018-05-02 14:03:13

by Yonghong Song

[permalink] [raw]
Subject: Re: [PATCH v2] x86/cpufeature: guard asm_volatile_goto usage with NO_BPF_WORKAROUND


Hi, Peter,

Ping again. Did you get chances to think about this issue again?

Thanks!

Yonghong

On 4/27/18 9:34 AM, Yonghong Song wrote:
> Hi, Peter,
>
> Just wanted to ping again so that you did not miss the email below.
> Please let me know your opinion.
>
> Thanks!
>
> Yonghong
>
>
> On 4/23/18 9:50 AM, Yonghong Song wrote:
>> Hi, Peter,
>>
>> Please see comments below.
>>
>> On 4/23/18 3:52 AM, Peter Zijlstra wrote:
>>> On Fri, Apr 20, 2018 at 11:06:03AM -0700, Yonghong Song wrote:
>>>> On 4/20/18 1:19 AM, Peter Zijlstra wrote:
>>>
>>>>> Hurm, so adding __BPF__ for BPF compiles isn't an option? It seems
>>>>> to me
>>>>> having a CPP flag to identify BPF compile context might be useful in
>>>>> general.
>>>>
>>>> With "clang -target bpf", we already have __BPF__ defined.
>>>> For tracing, esp. ptrace.h is included, "clang -target
>>>> <native_arch>" where
>>>> "-target <native_arch>" can be omitted, is typically used.
>>>
>>>> The reason is the native architecture header files typically
>>>> include a lot of various asm related stuff where "-target bpf" cannot
>>>> really handle. We relay on native clang to flush out all these
>>>> asm constructs and only bpf program needed stuff survives
>>>> reach to backend compiler.
>>>
>>> So because 'clang -target bpf' is 'broken', you do a work-around using
>>
>> 'clang -target bpf' is 'broken' in this case because the x86 arch has
>> a lot of inline asm's in the header file where bpf target cannot handle.
>> For most networking related bpf programs where `asm` is rarely involved,
>> `clang -target bpf` works fine most of time.
>>
>>> 'clang -target <native_arch>'. But because that doesn't set __BPF__ you
>>
>> `clang -target <native_arch>` should work, regardless of whether __BPF__
>> macro is setup or not. The reason it doesn't work now is due to its
>> lacking asm-goto support. So to use `clang -target <native_arch>` is not
>> really a workaround for `target bpf`. It by itself should work.
>>
>>> want to add NO_BPF_WORKAROUND to the kernel instead of adding __BPF__ to
>>> your build rules to better mimick -target bpf, which you should be
>>> using.
>>>
>>> How is that sane? Why not use 'clang -target <native_arch> -D__BPF__'
>>
>> To workaround the asm-goto issue, the suggested macro __BPF__ can be
>> added to user space and kernel. But note that `clang -target
>> <native_arch>` will not define the macro __BPF__, so this requires
>> user space change.
>>
>> Also, to make sure people understand that this is a WORKAROUND for
>> asm-goto issue and is not a lasting thing we want to support. I have
>> the following change for cpufeature.h:
>>
>> diff --git a/arch/x86/include/asm/cpufeature.h
>> b/arch/x86/include/asm/cpufeature.h
>> index b27da9602a6d..c832118defa1 100644
>> --- a/arch/x86/include/asm/cpufeature.h
>> +++ b/arch/x86/include/asm/cpufeature.h
>> @@ -140,6 +140,7 @@ extern void clear_cpu_cap(struct cpuinfo_x86 *c,
>> unsigned int bit);
>>
>>   #define setup_force_cpu_bug(bit) setup_force_cpu_cap(bit)
>>
>> +#ifndef __BPF_WORKAROUND__
>>   /*
>>    * Static testing of CPU features.  Used the same as boot_cpu_has().
>>    * These will statically patch the target code for additional
>> @@ -195,6 +196,9 @@ static __always_inline __pure bool
>> _static_cpu_has(u16 bit)
>>                  boot_cpu_has(bit) :                             \
>>                  _static_cpu_has(bit)                            \
>>   )
>> +#else
>> +#define static_cpu_has(bit)            boot_cpu_has(bit)
>> +#endif
>>
>>   #define cpu_has_bug(c, bit)            cpu_has(c, (bit))
>>   #define set_cpu_bug(c, bit)            set_cpu_cap(c, (bit))
>> diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
>> index 4d6a6edd4bf6..b229e5090e4a 100644
>>
>> As mentioned above, user space needs to add this new macro definition.
>> Specifically for kernel/samples/bpf:
>> diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
>> index 4d6a6edd4bf6..b229e5090e4a 100644
>> --- a/samples/bpf/Makefile
>> +++ b/samples/bpf/Makefile
>> @@ -255,7 +255,7 @@ $(obj)/tracex5_kern.o: $(obj)/syscall_nrs.h
>>   $(obj)/%.o: $(src)/%.c
>>          $(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS)
>> -I$(obj) \
>>                  -I$(srctree)/tools/testing/selftests/bpf/ \
>> -               -D__KERNEL__ -Wno-unused-value -Wno-pointer-sign \
>> +               -D__KERNEL__ -D__BPF_WORKAROUND__ -Wno-unused-value
>> -Wno-pointer-sign \
>>                  -D__TARGET_ARCH_$(ARCH)
>> -Wno-compare-distinct-pointer-types \
>>                  -Wno-gnu-variable-sized-type-not-at-end \
>>                  -Wno-address-of-packed-member
>> -Wno-tautological-compare \
>>
>> Please let me know whether this approach is okay to you or not,
>> whether the name __BPF_WORKAROUND__ is better than __BPF__ or not, or
>> we could use the earlier approach which does not require user space
>> change.
>>
>> Thanks!