2006-02-04 22:50:40

by Sam Ravnborg

[permalink] [raw]
Subject: Check for references to discarded sections during build time

Hi Keith.

While doing some other modpost.c changes I thought about the
possibility to do the reference_init check during the modpost stage - so
it is done early and author can catch warning when he made the error.
Attached is first cut.

It does a much more lousy job than reference_init because it identifies
the module and not the .o file. I hope to later identify the function
where the illegal reference hapens.

I have only run it on a subset of the kernel and it found a few
warnings in ide-core.o + one warning in net/drivers/drgs.o.
I have not investigated if this is false positives yet.

make allmodconfig running in background - will check the result when I
wake up again.

Sam

modpost.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
modpost.h | 6 ++++
2 files changed, 97 insertions(+), 1 deletion(-)

diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index d901095..b41ae4f 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -451,6 +451,96 @@ static char *get_modinfo(void *modinfo,
return NULL;
}

+/**
+ * Is this section a true init section?
+ **/
+static int is_init_section(const char *name)
+{
+ if (strcmp(name, ".init") == 0)
+ return 1;
+ if (strncmp(name, ".init.", strlen(".init.")) == 0)
+ return 1;
+ return 0;
+}
+
+/**
+ * Check name - identify sections which is discarded by vmlinux
+ * after module is loaded. Here we are fine referencing __init.
+ **/
+static int is_ref_init_ok(const char *name)
+{
+ const char **s;
+ /* Absolute section names */
+ const char *namelist1[] = {
+ ".init",
+ ".stab",
+ ".rodata",
+ ".text.lock",
+ ".pci_fixup_header",
+ ".pci_fixup_final",
+ ".pdr",
+ "__param",
+ NULL
+ };
+ /* Start of section names */
+ const char *namelist2[] = {
+ ".init.",
+ ".altinstructions",
+ ".eh_frame",
+ ".debug",
+ NULL
+ };
+
+ for (s = namelist1; *s; s++)
+ if (strcmp(*s, name) == 0)
+ return 1;
+ for (s = namelist2; *s; s++)
+ if (strncmp(*s, name, strlen(*s)) == 0)
+ return 1;
+ return 0;
+}
+
+/**
+ * Walk through all sections.
+ * All sections with references to a section identified as "init"
+ * needs to be int too - otherwise we have a reference to code marked
+ * __init and which is discarded by vmlinux
+ **/
+static void handle_checkinitref(struct module *mod, const char *modname, struct elf_info *elf)
+{
+ int i;
+ Elf_Sym *sym;
+ Elf_Ehdr *hdr = elf->hdr;
+ Elf_Shdr *sechdrs = elf->sechdrs;
+ const char *secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
+
+ /* Walk through all sections */
+ for (i = 0; i < hdr->e_shnum; i++) {
+ const char *name = secstrings + sechdrs[i].sh_name + strlen(".rela");
+ /* We want to process only relocation sections and not .init */
+ if (is_ref_init_ok(name) || (sechdrs[i].sh_type != SHT_RELA))
+ continue;
+ Elf_Rela *rela;
+ Elf_Rela *start = (void *)hdr + sechdrs[i].sh_offset;
+ Elf_Rela *stop = (void*)start + sechdrs[i].sh_size;
+ for (rela = start; rela < stop; rela++) {
+ Elf_Rela r;
+ const char *symname;
+ r.r_offset = TO_NATIVE(rela->r_offset);
+ r.r_info = TO_NATIVE(rela->r_info);
+ sym = elf->symtab_start + ELF_R_SYM(r.r_info);
+ symname = secstrings + sechdrs[sym->st_shndx].sh_name;
+
+ if (is_init_section(symname)) {
+ warn("%s: Reference to %s section from "
+ "offset 0x%lx within section %s\n",
+ modname, secstrings + sechdrs[sym->st_shndx].sh_name,
+ (long)r.r_offset, name);
+ }
+ }
+ }
+}
+
static void read_symbols(char *modname)
{
const char *symname;
@@ -476,6 +566,7 @@ static void read_symbols(char *modname)
handle_modversions(mod, &info, sym, symname);
handle_moddevtable(mod, &info, sym, symname);
}
+ handle_checkinitref(mod, modname, &info);

version = get_modinfo(info.modinfo, info.modinfo_len, "version");
if (version)
diff --git a/scripts/mod/modpost.h b/scripts/mod/modpost.h
index c0de7b9..f7126c1 100644
--- a/scripts/mod/modpost.h
+++ b/scripts/mod/modpost.h
@@ -19,6 +19,9 @@
#define ELF_ST_BIND ELF32_ST_BIND
#define ELF_ST_TYPE ELF32_ST_TYPE

+#define Elf_Rela Elf32_Rela
+#define ELF_R_SYM ELF32_R_SYM
+#define ELF_R_TYPE ELF32_R_TYPE
#else

#define Elf_Ehdr Elf64_Ehdr
@@ -27,6 +30,9 @@
#define ELF_ST_BIND ELF64_ST_BIND
#define ELF_ST_TYPE ELF64_ST_TYPE

+#define Elf_Rela Elf64_Rela
+#define ELF_R_SYM ELF64_R_SYM
+#define ELF_R_TYPE ELF64_R_TYPE
#endif

#if KERNEL_ELFDATA != HOST_ELFDATA


2006-02-05 00:20:25

by Sam Ravnborg

[permalink] [raw]
Subject: Re: Check for references to discarded sections during build time

On Sat, Feb 04, 2006 at 11:50:25PM +0100, Sam Ravnborg wrote:
> Hi Keith.
>
> While doing some other modpost.c changes I thought about the
> possibility to do the reference_init check during the modpost stage - so
> it is done early and author can catch warning when he made the error.
> Attached is first cut.
>
> It does a much more lousy job than reference_init because it identifies
> the module and not the .o file. I hope to later identify the function
> where the illegal reference hapens.
>
> I have only run it on a subset of the kernel and it found a few
> warnings in ide-core.o + one warning in net/drivers/drgs.o.
> I have not investigated if this is false positives yet.
>
> make allmodconfig running in background - will check the result when I
> wake up again.

That spew out several screenfull of stuff...
So far by looking only positives..

Heading for a business trip and vacation the next 2 weeks so do not
expect much response...

Sam

Corrected patch below:

diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index d901095..01b482d 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -451,6 +451,101 @@ static char *get_modinfo(void *modinfo,
return NULL;
}

+/**
+ * Is this section a true init section?
+ **/
+static int is_init_section(const char *name)
+{
+ if (strcmp(name, ".init") == 0)
+ return 1;
+ if (strncmp(name, ".init.", strlen(".init.")) == 0)
+ return 1;
+ return 0;
+}
+
+/**
+ * Check name - identify sections which is discarded by vmlinux
+ * after module is loaded. Here we are fine referencing __init.
+ **/
+static int is_ref_init_ok(const char *name)
+{
+ const char **s;
+ /* Absolute section names */
+ const char *namelist1[] = {
+ ".init",
+ ".stab",
+ ".rodata",
+ ".text.lock",
+ ".pci_fixup_header",
+ ".pci_fixup_final",
+ ".pdr",
+ "__param",
+ NULL
+ };
+ /* Start of section names */
+ const char *namelist2[] = {
+ ".init.",
+ ".altinstructions",
+ ".eh_frame",
+ ".debug",
+ NULL
+ };
+
+ for (s = namelist1; *s; s++)
+ if (strcmp(*s, name) == 0)
+ return 1;
+ for (s = namelist2; *s; s++)
+ if (strncmp(*s, name, strlen(*s)) == 0)
+ return 1;
+ return 0;
+}
+
+/**
+ * Walk through all sections.
+ * All sections with references to a section identified as "init"
+ * needs to be int too - otherwise we have a reference to code marked
+ * __init and which is discarded by vmlinux
+ **/
+static void handle_checkinitref(struct module *mod, const char *modname, struct elf_info *elf)
+{
+ int i;
+ Elf_Sym *sym;
+ Elf_Ehdr *hdr = elf->hdr;
+ Elf_Shdr *sechdrs = elf->sechdrs;
+ const char *secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
+
+ /* Walk through all sections */
+ for (i = 0; i < hdr->e_shnum; i++) {
+ const char *name = secstrings + sechdrs[i].sh_name + strlen(".rela");
+ /* We want to process only relocation sections and not .init */
+ if (is_ref_init_ok(name) || (sechdrs[i].sh_type != SHT_RELA))
+ continue;
+ Elf_Rela *rela;
+ Elf_Rela *start = (void *)hdr + sechdrs[i].sh_offset;
+ Elf_Rela *stop = (void*)start + sechdrs[i].sh_size;
+
+ for (rela = start; rela < stop; rela++) {
+ Elf_Rela r;
+ const char *symname;
+ r.r_offset = TO_NATIVE(rela->r_offset);
+ r.r_info = TO_NATIVE(rela->r_info);
+ sym = elf->symtab_start + ELF_R_SYM(r.r_info);
+ /* Skip special sections */
+ if (sym->st_shndx >= SHN_LORESERVE)
+ continue;
+ symname = secstrings + sechdrs[sym->st_shndx].sh_name;
+
+ if (is_init_section(symname)) {
+ warn("%s: Reference to %s section from "
+ "offset 0x%lx within section %s\n",
+ modname, symname,
+ (long)r.r_offset,
+ name);
+ }
+ }
+ }
+}
+
static void read_symbols(char *modname)
{
const char *symname;
@@ -476,6 +571,7 @@ static void read_symbols(char *modname)
handle_modversions(mod, &info, sym, symname);
handle_moddevtable(mod, &info, sym, symname);
}
+ handle_checkinitref(mod, modname, &info);

version = get_modinfo(info.modinfo, info.modinfo_len, "version");
if (version)
diff --git a/scripts/mod/modpost.h b/scripts/mod/modpost.h
index c0de7b9..f7126c1 100644
--- a/scripts/mod/modpost.h
+++ b/scripts/mod/modpost.h
@@ -19,6 +19,9 @@
#define ELF_ST_BIND ELF32_ST_BIND
#define ELF_ST_TYPE ELF32_ST_TYPE

+#define Elf_Rela Elf32_Rela
+#define ELF_R_SYM ELF32_R_SYM
+#define ELF_R_TYPE ELF32_R_TYPE
#else

#define Elf_Ehdr Elf64_Ehdr
@@ -27,6 +30,9 @@
#define ELF_ST_BIND ELF64_ST_BIND
#define ELF_ST_TYPE ELF64_ST_TYPE

+#define Elf_Rela Elf64_Rela
+#define ELF_R_SYM ELF64_R_SYM
+#define ELF_R_TYPE ELF64_R_TYPE
#endif

#if KERNEL_ELFDATA != HOST_ELFDATA

2006-02-06 14:31:14

by Keith Owens

[permalink] [raw]
Subject: Re: Check for references to discarded sections during build time

On Sat, Feb 04, 2006 at 11:50:25PM +0100, Sam Ravnborg wrote:
> Hi Keith.
>
> While doing some other modpost.c changes I thought about the
> possibility to do the reference_init check during the modpost stage - so
> it is done early and author can catch warning when he made the error.
> Attached is first cut.
>
> It does a much more lousy job than reference_init because it identifies
> the module and not the .o file. I hope to later identify the function
> where the illegal reference hapens.

My main concern is that we cannot get the .o file with this approach, I
am particulary concerned about this approach when processing vmlinux.
Static function names are duplicated in the kernel. Reporting a
dangling reference to init or discarded data by function name rather
than by object will lead to confusion if the reference is from one of
the duplicate function names.

# nm vmlinux | fgrep ' t ' | awk '{print $3}' | sort | uniq -dc

produces this horrible list of duplicate function names (IA64):

2 autofs_get_sb
2 base_probe
3 c_next
2 count
2 create_elf_tables
2 c_show
3 c_start
3 c_stop
3 default_handler
3 destroy_inodecache
2 dev_ifsioc
3 disable_slot
2 do_vfs_lock
4 dst_output
2 dump_seek
2 dump_write
2 elf_core_dump
3 enable_slot
2 error
3 exact_lock
3 exact_match
2 fill_note
2 fill_prstatus
2 flush_window
2 get_power_status
2 getreg
2 gzip_release
2 huft_build
2 huft_free
2 inflate_codes
2 inflate_dynamic
2 inflate_fixed
4 init
2 __initcall_init
14 init_once
2 klist_devices_get
2 klist_devices_put
2 load_elf_binary
2 load_elf_library
5 machvec_noop
4 machvec_noop_mm
2 mark_clean
2 maydump
2 message.1
3 m_next
2 modalias_show
3 m_start
3 m_stop
2 next_device
3 notesize
2 padzero
2 parse_options
2 proc_calc_metrics
2 raw_ioctl
2 read_block_bitmap
2 read_inode_bitmap
2 seq_next
2 seq_show
2 seq_start
2 seq_stop
2 set_brk
2 setkey
2 __setup_netdev_boot_setup
2 __setup_str_netdev_boot_setup
2 s_next
2 s_show
2 s_start
2 s_stop
2 state_show
2 state_store
2 store_uevent
2 try_to_fill_dentry
2 writenote
2 xdr_decode_fattr

It will also be extremely difficult to track down entries from compiler
generated anonymous data areas. They are hard enough to isolate when
looking at a single object. When all the anonymous data has been
merged together in vmlinux, it will be beyond most people.

2006-02-06 19:20:27

by Sam Ravnborg

[permalink] [raw]
Subject: Re: Check for references to discarded sections during build time

On Tue, Feb 07, 2006 at 01:31:11AM +1100, Keith Owens wrote:
> On Sat, Feb 04, 2006 at 11:50:25PM +0100, Sam Ravnborg wrote:
> > Hi Keith.
> >
> > While doing some other modpost.c changes I thought about the
> > possibility to do the reference_init check during the modpost stage - so
> > it is done early and author can catch warning when he made the error.
> > Attached is first cut.
> >
> > It does a much more lousy job than reference_init because it identifies
> > the module and not the .o file. I hope to later identify the function
> > where the illegal reference hapens.
>
> My main concern is that we cannot get the .o file with this approach, I
> am particulary concerned about this approach when processing vmlinux.
> Static function names are duplicated in the kernel. Reporting a
> dangling reference to init or discarded data by function name rather
> than by object will lead to confusion if the reference is from one of
> the duplicate function names.
This is an issue. But compared to todays situation where we
only do the checks occasionaly and we miss single object modules
we can live with having to deal with duplicate symbols occasionally.

>
> # nm vmlinux | fgrep ' t ' | awk '{print $3}' | sort | uniq -dc
>
> produces this horrible list of duplicate function names (IA64):
>
> 2 autofs_get_sb
... deleted ~70 lines
> 2 writenote
> 2 xdr_decode_fattr

Looks ugly. Maybe something to check in modpost later.

>
> It will also be extremely difficult to track down entries from compiler
> generated anonymous data areas. They are hard enough to isolate when
> looking at a single object. When all the anonymous data has been
> merged together in vmlinux, it will be beyond most people.

Agreed. Maybe we shall let reference_init.pl report anonymous data and
skip that from the runtime part (if I find a way to detact it).

I will give it a spin later to see where I end up.

Sam