2021-03-12 09:59:04

by Johannes Berg

[permalink] [raw]
Subject: [PATCH 0/6] um: fix up CONFIG_GCOV support

CONFIG_GCOV is fairly useful for ARCH=um (e.g. with kunit, though
my main use case is a bit different) since it writes coverage data
directly out like a normal userspace binary. Theoretically, that
is.

Unfortunately, it's broken in multiple ways today:

1) it doesn't like, due to 'mangle_path' in seq_file, and the only
solution to that seems to be to rename our symbol, but that's
not so bad, and "mangle_path" sounds very generic anyway, which
it isn't quite

2) gcov requires exit handlers to write out the data, and those are
never called for modules, config CONSTRUCTORS exists for init
handlers, so add CONFIG_MODULE_DESTRUCTORS here that we can then
select in ARCH=um

3) As mentioned above, gcov requires init/exit handlers, but they
aren't linked into binary properly, that's easy to fix.

4) gcda files are then written, so .gitignore them

5) it's not always useful to create coverage data for the *entire*
kernel, so I've split off CONFIG_GCOV_BASE from CONFIG_GCOV to
allow option in only in some places, which of course requires
adding the necessary "subdir-cflags" or "CFLAGS_obj" changes in
the places where it's desired, as local patches.


None of these changes (hopefully) seem too controversional, biggest
are the module changes but obviously they compile to nothing if the
architecture doesn't WANT_MODULE_DESTRUCTORS.

Any thoughts on how to merge this? The seq_file/.gitignore changes
are independent at least code-wise, though of course it only works
with the seq_file changes (.gitignore doesn't matter, of course),
while the module changes are a requirement for the later ARCH=um
patches since the Kconfig symbol has to exist.

Perhaps I can just get ACKs on all the patches and then they can go
through the UML tree?

Thanks,
johannes



2021-03-12 09:59:09

by Johannes Berg

[permalink] [raw]
Subject: [PATCH 3/6] .gitignore: also ignore gcda files

From: Johannes Berg <[email protected]>

We already ignore gcno files that are created by the compiler
at build time for -ftest-coverage. However, with ARCH=um it's
possible to select CONFIG_GCOV which actually has the kernel
binary write out gcda files (rather than have them in debugfs
like CONFIG_GCOV_KERNEL does), so an in-tree build can create
them. Ignore them so the tree doesn't look dirty for that.

Signed-off-by: Johannes Berg <[email protected]>
---
.gitignore | 1 +
1 file changed, 1 insertion(+)

diff --git a/.gitignore b/.gitignore
index 3af66272d6f1..91e46190d418 100644
--- a/.gitignore
+++ b/.gitignore
@@ -23,6 +23,7 @@
*.dwo
*.elf
*.gcno
+*.gcda
*.gz
*.i
*.ko
--
2.29.2

2021-03-12 09:59:15

by Johannes Berg

[permalink] [raw]
Subject: [PATCH 2/6] module: add support for CONFIG_MODULE_DESTRUCTORS

From: Johannes Berg <[email protected]>

At least in ARCH=um with CONFIG_GCOV (which writes all the
coverage data directly out from the userspace binary rather
than presenting it in debugfs) it's necessary to run all
the atexit handlers (dtors/fini_array) so that gcov actually
does write out the data.

Add a new config option CONFIG_MODULE_DESTRUCTORS that can
be selected via CONFIG_WANT_MODULE_DESTRUCTORS that the arch
selects (this indirection exists so the architecture doesn't
have to worry about whether or not CONFIG_MODULES is on).
Additionally, the architecture must then (when it exits and
no more module code can run) call run_all_module_destructors
to run the code for all modules that are still loaded. When
modules are unloaded, the handlers are called as well.

Signed-off-by: Johannes Berg <[email protected]>
---
include/linux/module.h | 14 ++++++++++++++
init/Kconfig | 17 +++++++++++++++++
kernel/module.c | 39 +++++++++++++++++++++++++++++++++++++++
3 files changed, 70 insertions(+)

diff --git a/include/linux/module.h b/include/linux/module.h
index 59f094fa6f74..8574d76a884d 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -517,6 +517,12 @@ struct module {
unsigned int num_ctors;
#endif

+#ifdef CONFIG_MODULE_DESTRUCTORS
+ /* Destructor functions. */
+ ctor_fn_t *dtors;
+ unsigned int num_dtors;
+#endif
+
#ifdef CONFIG_FUNCTION_ERROR_INJECTION
struct error_injection_entry *ei_funcs;
unsigned int num_ei_funcs;
@@ -853,4 +859,12 @@ int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
struct module *, unsigned long),
void *data);

+#ifdef CONFIG_MODULE_DESTRUCTORS
+void run_all_module_destructors(void);
+#else
+static inline void run_all_module_destructors(void)
+{
+}
+#endif
+
#endif /* _LINUX_MODULE_H */
diff --git a/init/Kconfig b/init/Kconfig
index 22946fe5ded9..b0f0f51f9d2c 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2295,6 +2295,23 @@ config UNUSED_KSYMS_WHITELIST

endif # MODULES

+config WANT_MODULE_DESTRUCTORS
+ bool
+ help
+ Architectures may select this if they need atexit functions (such as
+ generated by the compiler for -ftest-coverage/gcov) to run in modules.
+ They're then responsible for calling run_all_module_destructors() at
+ shutdown so that module destructors are called for all still loaded
+ modules as well.
+
+ Note that CONFIG_GCOV_KERNEL does *not* require this since it keeps
+ all the coverage data in the kernel, notably CONFIG_GCOV in ARCH=um
+ requires this.
+
+config MODULE_DESTRUCTORS
+ def_bool y
+ depends on WANT_MODULE_DESTRUCTORS && MODULES
+
config MODULES_TREE_LOOKUP
def_bool y
depends on PERF_EVENTS || TRACING
diff --git a/kernel/module.c b/kernel/module.c
index 30479355ab85..3023b5f054d4 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -904,6 +904,27 @@ EXPORT_SYMBOL(module_refcount);
/* This exists whether we can unload or not */
static void free_module(struct module *mod);

+#ifdef CONFIG_MODULE_DESTRUCTORS
+static void do_mod_dtors(struct module *mod)
+{
+ unsigned long i;
+
+ for (i = 0; i < mod->num_dtors; i++)
+ mod->dtors[i]();
+}
+
+void run_all_module_destructors(void)
+{
+ struct module *mod;
+
+ /* we no longer need to care about locking at this point */
+ list_for_each_entry(mod, &modules, list)
+ do_mod_dtors(mod);
+}
+#else
+static inline void do_mod_dtors(struct module *mod) {}
+#endif
+
SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
unsigned int, flags)
{
@@ -966,6 +987,7 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
MODULE_STATE_GOING, mod);
klp_module_going(mod);
ftrace_release_mod(mod);
+ do_mod_dtors(mod);

async_synchronize_full();

@@ -3263,6 +3285,23 @@ static int find_module_sections(struct module *mod, struct load_info *info)
}
#endif

+#ifdef CONFIG_MODULE_DESTRUCTORS
+ mod->dtors = section_objs(info, ".dtors",
+ sizeof(*mod->dtors), &mod->num_dtors);
+ if (!mod->dtors)
+ mod->dtors = section_objs(info, ".fini_array",
+ sizeof(*mod->dtors), &mod->num_dtors);
+ else if (find_sec(info, ".fini_array")) {
+ /*
+ * This shouldn't happen with same compiler and binutils
+ * building all parts of the module.
+ */
+ pr_warn("%s: has both .dtors and .fini_array.\n",
+ mod->name);
+ return -EINVAL;
+ }
+#endif
+
mod->noinstr_text_start = section_objs(info, ".noinstr.text", 1,
&mod->noinstr_text_size);

--
2.29.2

2021-03-12 10:01:15

by Johannes Berg

[permalink] [raw]
Subject: [PATCH 4/6] um: split up CONFIG_GCOV

From: Johannes Berg <[email protected]>

It's not always desirable to collect coverage data for the
entire kernel, so split off CONFIG_GCOV_BASE. This option
only enables linking with coverage options, and compiles a
single file (reboot.c) with them as well to force gcov to
be linked into the kernel binary. That way, modules also
work.

To use this new option properly, one needs to manually add
'-fprofile-arcs -ftest-coverage' to the compiler options
of some object(s) or subdir(s) to collect coverage data at
the desired places.

Signed-off-by: Johannes Berg <[email protected]>
---
arch/um/Kconfig.debug | 38 ++++++++++++++++++++++++++++++--------
arch/um/Makefile-skas | 2 +-
arch/um/kernel/Makefile | 11 ++++++++++-
3 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/arch/um/Kconfig.debug b/arch/um/Kconfig.debug
index 315d368e63ad..ca040b4e86e5 100644
--- a/arch/um/Kconfig.debug
+++ b/arch/um/Kconfig.debug
@@ -13,19 +13,41 @@ config GPROF
If you're involved in UML kernel development and want to use gprof,
say Y. If you're unsure, say N.

-config GCOV
- bool "Enable gcov support"
+config GCOV_BASE
+ bool "Enable gcov support (selectively)"
depends on DEBUG_INFO
- depends on !KCOV
+ depends on !KCOV && !GCOV_KERNEL
help
This option allows developers to retrieve coverage data from a UML
- session.
+ session, stored to disk just like with a regular userspace binary,
+ use the same tools (gcov, lcov, ...) to collect and process the
+ data.

- See <http://user-mode-linux.sourceforge.net/old/gprof.html> for more
- details.
+ See also KCOV and GCOV_KERNEL as alternatives.

- If you're involved in UML kernel development and want to use gcov,
- say Y. If you're unsure, say N.
+ This option (almost) only links with the needed support code, but
+ doesn't enable coverage data collection for any code (other than a
+ dummy file to get everything linked properly). See also the GCOV
+ option which enables coverage collection for the entire kernel and
+ all modules.
+
+ If you're using UML to test something and want to manually instruct
+ the compiler to instrument only parts of the code by adding the
+ relevant options for the objects you care about, say Y and do that
+ to get coverage collection only for the parts you need.
+
+ If you're unsure, say N.
+
+config GCOV
+ bool "Enable gcov support (whole kernel)"
+ depends on DEBUG_INFO
+ depends on !KCOV && !GCOV_KERNEL
+ select GCOV_BASE
+ help
+ This enables coverage data collection for the entire kernel and
+ all modules, see the GCOV_BASE option for more information.
+
+ If you're unsure, say N.

config EARLY_PRINTK
bool "Early printk"
diff --git a/arch/um/Makefile-skas b/arch/um/Makefile-skas
index ac35de5316a6..b5be5f55ac11 100644
--- a/arch/um/Makefile-skas
+++ b/arch/um/Makefile-skas
@@ -8,5 +8,5 @@ GCOV_OPT += -fprofile-arcs -ftest-coverage

CFLAGS-$(CONFIG_GCOV) += $(GCOV_OPT)
CFLAGS-$(CONFIG_GPROF) += $(GPROF_OPT)
-LINK-$(CONFIG_GCOV) += $(GCOV_OPT)
+LINK-$(CONFIG_GCOV_BASE) += $(GCOV_OPT)
LINK-$(CONFIG_GPROF) += $(GPROF_OPT)
diff --git a/arch/um/kernel/Makefile b/arch/um/kernel/Makefile
index c1205f9ec17e..0403e329f931 100644
--- a/arch/um/kernel/Makefile
+++ b/arch/um/kernel/Makefile
@@ -21,7 +21,7 @@ obj-y = config.o exec.o exitcode.o irq.o ksyms.o mem.o \

obj-$(CONFIG_BLK_DEV_INITRD) += initrd.o
obj-$(CONFIG_GPROF) += gprof_syms.o
-obj-$(CONFIG_GCOV) += gmon_syms.o
+obj-$(CONFIG_GCOV_BASE) += gmon_syms.o
obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
obj-$(CONFIG_STACKTRACE) += stacktrace.o
obj-$(CONFIG_GENERIC_PCI_IOMAP) += ioport.o
@@ -32,6 +32,15 @@ include arch/um/scripts/Makefile.rules

targets := config.c config.tmp

+# This just causes _something_ to be compiled with coverage
+# collection so that gcov is linked into the binary, in case
+# the only thing that has it enabled is a module, when only
+# CONFIG_GCOV_BASE is selected. Yes, we thus always get some
+# coverage data for this file, but it's not hit often ...
+ifeq ($(CONFIG_GCOV_BASE),y)
+CFLAGS_reboot.o += -fprofile-arcs -ftest-coverage
+endif
+
# Be careful with the below Sed code - sed is pitfall-rich!
# We use sed to lower build requirements, for "embedded" builders for instance.

--
2.29.2

2021-03-12 10:29:52

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH 2/6] module: add support for CONFIG_MODULE_DESTRUCTORS

On Fri, 2021-03-12 at 10:55 +0100, Johannes Berg wrote:
> From: Johannes Berg <[email protected]>
>
> At least in ARCH=um with CONFIG_GCOV (which writes all the
> coverage data directly out from the userspace binary rather
> than presenting it in debugfs) it's necessary to run all
> the atexit handlers (dtors/fini_array) so that gcov actually
> does write out the data.
>
> Add a new config option CONFIG_MODULE_DESTRUCTORS that can
> be selected via CONFIG_WANT_MODULE_DESTRUCTORS that the arch
> selects (this indirection exists so the architecture doesn't
> have to worry about whether or not CONFIG_MODULES is on).
> Additionally, the architecture must then (when it exits and
> no more module code can run) call run_all_module_destructors
> to run the code for all modules that are still loaded. When
> modules are unloaded, the handlers are called as well.

Oops, I forgot to add this bit to the patch:

--- a/scripts/module.lds.S
+++ b/scripts/module.lds.S
@@ -16,6 +16,8 @@ SECTIONS {

.init_array 0 : ALIGN(8) { *(SORT(.init_array.*)) *(.init_array) }

+ .fini_array 0 : ALIGN(8) { *(SORT(.fini_array.*)) *(.fini_array) }
+
__jump_table 0 : ALIGN(8) { KEEP(*(__jump_table)) }

__patchable_function_entries : { *(__patchable_function_entries) }


Should that be under the ifdef? .init_array isn't, even though it's only
relevant for CONFIG_CONSTRUCTORS.

johannes

2021-03-12 11:49:08

by Johannes Berg

[permalink] [raw]
Subject: [PATCH 6/6] um: fix CONFIG_GCOV for modules

From: Johannes Berg <[email protected]>

At least with current toolchain versions, gcov (as enabled
by CONFIG_GCOV) requires init and exit handlers to run. For
modules, this wasn't done properly, so use the new support
for CONFIG_MODULE_DESTRUCTORS as well as CONFIG_CONSTRUCTORS
to have gcov init and exit called appropriately.

Signed-off-by: Johannes Berg <[email protected]>
---
arch/um/Kconfig.debug | 2 ++
arch/um/kernel/process.c | 2 ++
2 files changed, 4 insertions(+)

diff --git a/arch/um/Kconfig.debug b/arch/um/Kconfig.debug
index ca040b4e86e5..74b27b11cb44 100644
--- a/arch/um/Kconfig.debug
+++ b/arch/um/Kconfig.debug
@@ -17,6 +17,8 @@ config GCOV_BASE
bool "Enable gcov support (selectively)"
depends on DEBUG_INFO
depends on !KCOV && !GCOV_KERNEL
+ select CONSTRUCTORS
+ select WANT_MODULE_DESTRUCTORS
help
This option allows developers to retrieve coverage data from a UML
session, stored to disk just like with a regular userspace binary,
diff --git a/arch/um/kernel/process.c b/arch/um/kernel/process.c
index c5011064b5dd..33f895a95ff8 100644
--- a/arch/um/kernel/process.c
+++ b/arch/um/kernel/process.c
@@ -240,6 +240,8 @@ void do_uml_exitcalls(void)
call = &__uml_exitcall_end;
while (--call >= &__uml_exitcall_begin)
(*call)();
+
+ run_all_module_destructors();
}

char *uml_strdup(const char *string)
--
2.29.2

2021-03-12 14:37:12

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH 0/6] um: fix up CONFIG_GCOV support

On Fri, 2021-03-12 at 10:55 +0100, Johannes Berg wrote:
> CONFIG_GCOV is fairly useful for ARCH=um (e.g. with kunit, though
> my main use case is a bit different) since it writes coverage data
> directly out like a normal userspace binary. Theoretically, that
> is.
>
> Unfortunately, it's broken in multiple ways today:
>
>  1) it doesn't like, due to 'mangle_path' in seq_file, and the only
>     solution to that seems to be to rename our symbol, but that's
>     not so bad, and "mangle_path" sounds very generic anyway, which
>     it isn't quite
>
>  2) gcov requires exit handlers to write out the data, and those are
>     never called for modules, config CONSTRUCTORS exists for init
>     handlers, so add CONFIG_MODULE_DESTRUCTORS here that we can then
>     select in ARCH=um

Yeah, I wish.

None of this can really work. Thing is, __gcov_init(), called from the
constructors, will add the local data structure for the object file into
the global list (__gcov_root). So far, so good.

However, __gcov_exit(), which is called from the destructors (fini_array
which I added support for here) never gets passed the local data
structure pointer. It dumps __gcov_root, and that's it.

That basically means each executable/shared object should have its own
instance of __gcov_root and the functions. But the code in UML was set
up to export __gcov_exit(), which obviously then dumps the kernel's gcov
data.

So to make this really work we should treat modules like shared objects,
and link libgcov.a into each one of them. That might even work, but we
get

ERROR: modpost: "free" [module.ko] undefined!
ERROR: modpost: "vfprintf" [module.ko] undefined!
ERROR: modpost: "fcntl" [module.ko] undefined!
ERROR: modpost: "setbuf" [module.ko] undefined!
ERROR: modpost: "exit" [module.ko] undefined!
ERROR: modpost: "fwrite" [module.ko] undefined!
ERROR: modpost: "stderr" [module.ko] undefined!
ERROR: modpost: "fclose" [module.ko] undefined!
ERROR: modpost: "ftell" [module.ko] undefined!
ERROR: modpost: "fopen" [module.ko] undefined!
ERROR: modpost: "fread" [module.ko] undefined!
ERROR: modpost: "fdopen" [module.ko] undefined!
ERROR: modpost: "fseek" [module.ko] undefined!
ERROR: modpost: "fprintf" [module.ko] undefined!
ERROR: modpost: "strtol" [module.ko] undefined!
ERROR: modpost: "malloc" [module.ko] undefined!
ERROR: modpost: "getpid" [module.ko] undefined!
ERROR: modpost: "getenv" [module.ko] undefined!

We could of course export those, but that makes me nervous, e.g.
printf() is known to use a LOT of stack, far more than we have in the
kernel.

Also, we see:

WARNING: modpost: "__gcov_var" [module] is COMMON symbol
WARNING: modpost: "__gcov_root" [module] is COMMON symbol

which means the module cannot be loaded.

I think I'll just make CONFIG_GCOV depend on !MODULE instead, and for my
use case use CONFIG_GCOV_KERNEL.

Or maybe just kill CONFIG_GCOV entirely, since obviously nobody has ever
tried to use it with modules or with recent toolchains (gcc 9 or newer,
the mangle_path conflict).

johannes

2021-03-18 21:31:52

by Brendan Higgins

[permalink] [raw]
Subject: Re: [PATCH 4/6] um: split up CONFIG_GCOV

On Fri, Mar 12, 2021 at 1:56 AM Johannes Berg <[email protected]> wrote:
>
> From: Johannes Berg <[email protected]>
>
> It's not always desirable to collect coverage data for the
> entire kernel, so split off CONFIG_GCOV_BASE. This option
> only enables linking with coverage options, and compiles a
> single file (reboot.c) with them as well to force gcov to
> be linked into the kernel binary. That way, modules also
> work.
>
> To use this new option properly, one needs to manually add
> '-fprofile-arcs -ftest-coverage' to the compiler options
> of some object(s) or subdir(s) to collect coverage data at
> the desired places.
>
> Signed-off-by: Johannes Berg <[email protected]>

Hey, thanks for doing this! I was looking into this a few weeks ago
and root caused part of the issue in GCC and in the kernel, but I did
not have a fix put together.

Anyway, most of the patches make sense to me, but I am not able to
apply this patch on torvalds/master. Do you mind sending a rebase so I
can test it?

Thanks!

2021-03-18 21:35:17

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH 4/6] um: split up CONFIG_GCOV

Hi Brendan,

> Hey, thanks for doing this! I was looking into this a few weeks ago
> and root caused part of the issue in GCC and in the kernel, but I did
> not have a fix put together.
>
> Anyway, most of the patches make sense to me, but I am not able to
> apply this patch on torvalds/master. Do you mind sending a rebase so I
> can test it?

Well, if you see my other replies in the thread, I gave up for various
reasons, see

https://lore.kernel.org/r/[email protected]

Personally, I ended up switching to CONFIG_GCOV_KERNEL instead because
it actually works for modules, but then it was _really_ slow (think 30s
to copy data for a few modules), but I root-caused this and ultimately
sent these patches instead:

https://patchwork.ozlabs.org/project/linux-um/patch/20210315233804.d3e52f6a3422.I9672eef7dfa7ce6c3de1ccf7ab8d9aad1fa7f3a6@changeid/
https://patchwork.ozlabs.org/project/linux-um/patch/20210315234731.2e03184a344b.I04f1816296f04c5aa7d7d88b33bd4a14dd458da8@changeid/


johannes