2021-06-04 17:05:41

by Jarmo Tiitto

[permalink] [raw]
Subject: [PATCH v3 1/1] pgo: Fix allocate_node() v2

When clang instrumentation eventually calls allocate_node()
the struct llvm_prf_data *p argument tells us from what section
we should reserve the vnode: It either points into vmlinux's
core __llvm_prf_data section or some loaded module's
__llvm_prf_data section.

But since we don't have access to corresponding
__llvm_prf_vnds section(s) for any module, the function
should return just NULL and ignore any profiling attempts
from modules for now.

Signed-off-by: Jarmo Tiitto <[email protected]>
---
Based on Kees and others feedback here is v3 patch
that clarifies why the current checks in allocate_node()
are flawed. I did fair amount of KGDB time on it.

The commit is based on kees/for-next/clang/features tree,
hopefully this is ok. Should I have based it on linux-next
instead?

I grep -R'd where the memory_contains() can be found and it is only
found in #include <asm-generic/sections.h>

I cross my fingers and await if this is my first accepted patch. :-)
---
kernel/pgo/instrument.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/kernel/pgo/instrument.c b/kernel/pgo/instrument.c
index 0e07ee1b17d9..c69b4f7ebaad 100644
--- a/kernel/pgo/instrument.c
+++ b/kernel/pgo/instrument.c
@@ -18,6 +18,7 @@

#define pr_fmt(fmt) "pgo: " fmt

+#include <asm-generic/sections.h>
#include <linux/bitops.h>
#include <linux/kernel.h>
#include <linux/export.h>
@@ -55,17 +56,21 @@ void prf_unlock(unsigned long flags)
static struct llvm_prf_value_node *allocate_node(struct llvm_prf_data *p,
u32 index, u64 value)
{
- if (&__llvm_prf_vnds_start[current_node + 1] >= __llvm_prf_vnds_end)
- return NULL; /* Out of nodes */
-
- current_node++;
-
- /* Make sure the node is entirely within the section */
- if (&__llvm_prf_vnds_start[current_node] >= __llvm_prf_vnds_end ||
- &__llvm_prf_vnds_start[current_node + 1] > __llvm_prf_vnds_end)
+ const int max_vnds = prf_vnds_count();
+
+ /*
+ * Check that p is within vmlinux __llvm_prf_data section.
+ * If not, don't allocate since we can't handle modules yet.
+ */
+ if (!memory_contains(__llvm_prf_data_start,
+ __llvm_prf_data_end, p, sizeof(*p)))
return NULL;

- return &__llvm_prf_vnds_start[current_node];
+ if (WARN_ON_ONCE(current_node >= max_vnds))
+ return NULL; /* Out of nodes */
+
+ /* reserve vnode for vmlinux */
+ return &__llvm_prf_vnds_start[current_node++];
}

/*

base-commit: 5d0cda65918279ada060417c5fecb7e86ccb3def
--
2.31.1


2021-06-04 17:06:50

by Nathan Chancellor

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] pgo: Fix allocate_node() v2

On 6/4/2021 9:58 AM, Jarmo Tiitto wrote:
> When clang instrumentation eventually calls allocate_node()
> the struct llvm_prf_data *p argument tells us from what section
> we should reserve the vnode: It either points into vmlinux's
> core __llvm_prf_data section or some loaded module's
> __llvm_prf_data section.
>
> But since we don't have access to corresponding
> __llvm_prf_vnds section(s) for any module, the function
> should return just NULL and ignore any profiling attempts
> from modules for now.
>
> Signed-off-by: Jarmo Tiitto <[email protected]>

Reviewed-by: Nathan Chancellor <[email protected]>

> ---
> Based on Kees and others feedback here is v3 patch
> that clarifies why the current checks in allocate_node()
> are flawed. I did fair amount of KGDB time on it.
>
> The commit is based on kees/for-next/clang/features tree,
> hopefully this is ok. Should I have based it on linux-next
> instead?
>
> I grep -R'd where the memory_contains() can be found and it is only
> found in #include <asm-generic/sections.h>
>
> I cross my fingers and await if this is my first accepted patch. :-)
> ---
> kernel/pgo/instrument.c | 23 ++++++++++++++---------
> 1 file changed, 14 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/pgo/instrument.c b/kernel/pgo/instrument.c
> index 0e07ee1b17d9..c69b4f7ebaad 100644
> --- a/kernel/pgo/instrument.c
> +++ b/kernel/pgo/instrument.c
> @@ -18,6 +18,7 @@
>
> #define pr_fmt(fmt) "pgo: " fmt
>
> +#include <asm-generic/sections.h>
> #include <linux/bitops.h>
> #include <linux/kernel.h>
> #include <linux/export.h>
> @@ -55,17 +56,21 @@ void prf_unlock(unsigned long flags)
> static struct llvm_prf_value_node *allocate_node(struct llvm_prf_data *p,
> u32 index, u64 value)
> {
> - if (&__llvm_prf_vnds_start[current_node + 1] >= __llvm_prf_vnds_end)
> - return NULL; /* Out of nodes */
> -
> - current_node++;
> -
> - /* Make sure the node is entirely within the section */
> - if (&__llvm_prf_vnds_start[current_node] >= __llvm_prf_vnds_end ||
> - &__llvm_prf_vnds_start[current_node + 1] > __llvm_prf_vnds_end)
> + const int max_vnds = prf_vnds_count();
> +
> + /*
> + * Check that p is within vmlinux __llvm_prf_data section.
> + * If not, don't allocate since we can't handle modules yet.
> + */
> + if (!memory_contains(__llvm_prf_data_start,
> + __llvm_prf_data_end, p, sizeof(*p)))
> return NULL;
>
> - return &__llvm_prf_vnds_start[current_node];
> + if (WARN_ON_ONCE(current_node >= max_vnds))
> + return NULL; /* Out of nodes */
> +
> + /* reserve vnode for vmlinux */
> + return &__llvm_prf_vnds_start[current_node++];
> }
>
> /*
>
> base-commit: 5d0cda65918279ada060417c5fecb7e86ccb3def
>

2021-06-04 17:33:18

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] pgo: Fix allocate_node() v2

On Fri, Jun 4, 2021 at 10:03 AM Jarmo Tiitto <[email protected]> wrote:
>
> When clang instrumentation eventually calls allocate_node()
> the struct llvm_prf_data *p argument tells us from what section
> we should reserve the vnode: It either points into vmlinux's
> core __llvm_prf_data section or some loaded module's
> __llvm_prf_data section.
>
> But since we don't have access to corresponding
> __llvm_prf_vnds section(s) for any module, the function
> should return just NULL and ignore any profiling attempts
> from modules for now.
>
> Signed-off-by: Jarmo Tiitto <[email protected]>
> ---
> Based on Kees and others feedback here is v3 patch
> that clarifies why the current checks in allocate_node()
> are flawed. I did fair amount of KGDB time on it.
>
> The commit is based on kees/for-next/clang/features tree,
> hopefully this is ok. Should I have based it on linux-next
> instead?
>
> I grep -R'd where the memory_contains() can be found and it is only
> found in #include <asm-generic/sections.h>
>
> I cross my fingers and await if this is my first accepted patch. :-)
> ---
> kernel/pgo/instrument.c | 23 ++++++++++++++---------
> 1 file changed, 14 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/pgo/instrument.c b/kernel/pgo/instrument.c
> index 0e07ee1b17d9..c69b4f7ebaad 100644
> --- a/kernel/pgo/instrument.c
> +++ b/kernel/pgo/instrument.c
> @@ -18,6 +18,7 @@
>
> #define pr_fmt(fmt) "pgo: " fmt
>
> +#include <asm-generic/sections.h>

^ What about Nathan's feedback on this being just `<asm/sections.h>`?

> #include <linux/bitops.h>
> #include <linux/kernel.h>
> #include <linux/export.h>
> @@ -55,17 +56,21 @@ void prf_unlock(unsigned long flags)
> static struct llvm_prf_value_node *allocate_node(struct llvm_prf_data *p,
> u32 index, u64 value)
> {
> - if (&__llvm_prf_vnds_start[current_node + 1] >= __llvm_prf_vnds_end)
> - return NULL; /* Out of nodes */
> -
> - current_node++;
> -
> - /* Make sure the node is entirely within the section */
> - if (&__llvm_prf_vnds_start[current_node] >= __llvm_prf_vnds_end ||
> - &__llvm_prf_vnds_start[current_node + 1] > __llvm_prf_vnds_end)
> + const int max_vnds = prf_vnds_count();
> +
> + /*
> + * Check that p is within vmlinux __llvm_prf_data section.
> + * If not, don't allocate since we can't handle modules yet.
> + */
> + if (!memory_contains(__llvm_prf_data_start,
> + __llvm_prf_data_end, p, sizeof(*p)))
> return NULL;
>
> - return &__llvm_prf_vnds_start[current_node];
> + if (WARN_ON_ONCE(current_node >= max_vnds))
> + return NULL; /* Out of nodes */
> +
> + /* reserve vnode for vmlinux */
> + return &__llvm_prf_vnds_start[current_node++];
> }
>
> /*
>
> base-commit: 5d0cda65918279ada060417c5fecb7e86ccb3def
> --
> 2.31.1
>


--
Thanks,
~Nick Desaulniers

2021-06-04 18:04:14

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] pgo: Fix allocate_node() v2

On Fri, 4 Jun 2021 19:58:20 +0300, Jarmo Tiitto wrote:
> When clang instrumentation eventually calls allocate_node()
> the struct llvm_prf_data *p argument tells us from what section
> we should reserve the vnode: It either points into vmlinux's
> core __llvm_prf_data section or some loaded module's
> __llvm_prf_data section.
>
> But since we don't have access to corresponding
> __llvm_prf_vnds section(s) for any module, the function
> should return just NULL and ignore any profiling attempts
> from modules for now.

Thanks for working on this! I tweaked the commit title, reflowed the
commit log to 80 columns, and adjusted asm-generic to asm.

Applied to for-next/clang/features, thanks!

[1/1] pgo: Limit allocate_node() to vmlinux sections
https://git.kernel.org/kees/c/46773f32ddf1

--
Kees Cook

2021-06-04 18:09:18

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] pgo: Fix allocate_node() v2

On Fri, Jun 04, 2021 at 07:58:20PM +0300, Jarmo Tiitto wrote:
> When clang instrumentation eventually calls allocate_node()
> the struct llvm_prf_data *p argument tells us from what section
> we should reserve the vnode: It either points into vmlinux's
> core __llvm_prf_data section or some loaded module's
> __llvm_prf_data section.
>
> But since we don't have access to corresponding
> __llvm_prf_vnds section(s) for any module, the function
> should return just NULL and ignore any profiling attempts
> from modules for now.
>
> Signed-off-by: Jarmo Tiitto <[email protected]>
> ---
> Based on Kees and others feedback here is v3 patch
> that clarifies why the current checks in allocate_node()
> are flawed. I did fair amount of KGDB time on it.
>
> The commit is based on kees/for-next/clang/features tree,
> hopefully this is ok. Should I have based it on linux-next
> instead?
>
> I grep -R'd where the memory_contains() can be found and it is only
> found in #include <asm-generic/sections.h>

That's true, but the way to use "asm-generic" is to include the
top-level "asm" file, so that architectures can override things as
needed.

> I cross my fingers and await if this is my first accepted patch. :-)

I tweaked it a bit and applied it (see the separate email).

Thank you!

-Kees

--
Kees Cook

2021-06-05 17:16:50

by Jarmo Tiitto

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] pgo: Fix allocate_node() v2

> Kees Cook wrote perjantaina 4. kesäkuuta 2021 21.06.37 EEST:
> >
> > I grep -R'd where the memory_contains() can be found and it is only
> > found in #include <asm-generic/sections.h>
>
> That's true, but the way to use "asm-generic" is to include the
> top-level "asm" file, so that architectures can override things as
> needed.
>
Thanks, I didn't know that.

> > I cross my fingers and await if this is my first accepted patch. :-)
>
> I tweaked it a bit and applied it (see the separate email).
>
> Thank you!
>
> -Kees
>
> --
> Kees Cook
>

Whoa!
Thanks, I'm glad it worked out. :-)

Btw. I have almost forgotten that I once wrote code
(that I didn't send) for the GCC gcov subsystem that also enabled
-fprofile-generate/use for the kernel.
However the Clang PGO looks much more approachable and
easier to hack on since the profile data format is simpler.

So starting to work on this felt just natural to me. :-)

-Jarmo



2021-06-05 23:57:08

by Fangrui Song

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] pgo: Fix allocate_node() v2

On 2021-06-05, Jarmo Tiitto wrote:
>> Kees Cook wrote perjantaina 4. kes?kuuta 2021 21.06.37 EEST:
>> >
>> > I grep -R'd where the memory_contains() can be found and it is only
>> > found in #include <asm-generic/sections.h>
>>
>> That's true, but the way to use "asm-generic" is to include the
>> top-level "asm" file, so that architectures can override things as
>> needed.
>>
>Thanks, I didn't know that.
>
>> > I cross my fingers and await if this is my first accepted patch. :-)
>>
>> I tweaked it a bit and applied it (see the separate email).
>>
>> Thank you!
>>
>> -Kees
>>
>> --
>> Kees Cook
>>
>
>Whoa!
>Thanks, I'm glad it worked out. :-)
>
>Btw. I have almost forgotten that I once wrote code
>(that I didn't send) for the GCC gcov subsystem that also enabled
>-fprofile-generate/use for the kernel.
>However the Clang PGO looks much more approachable and
>easier to hack on since the profile data format is simpler.
>
>So starting to work on this felt just natural to me. :-)
>
>-Jarmo

Right, __llvm_prf_vnodes reserves space for static allocation.
There is no relocation referencing __llvm_prf_vnodes so there is
no straightforward way using the space for the kernel.

In userspace shared objects, the scheme works by linking
libclang_rt.profile-*.a into every shared object. The
__start___llvm_prf_vnodes/__stop___llvm_prf_vnodes symbols are
delieberately hidden in compiler-rt InstrProfilingPlatformLinux.c.

The GCC gcov format is definitely simpler than the LLVM format:)