2013-10-28 15:49:26

by Ming Lei

Subject: [PATCH] scripts/kallsyms: filter symbols not in kernel address space

This patch uses CONFIG_PAGE_OFFSET to filter symbols which
are not in kernel address space because these symbols are
generally for generating code purpose and can't be run at
kernel mode, so we needn't keep them in /proc/kallsyms.

For example, on ARM there are some symbols which are
linked in relocatable code section, then perf can't parse
symbols any more from /proc/kallsyms, and this patch fixes
the problem.

Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: Michal Marek <[email protected]>
Acked-by: Rusty Russell <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
---
scripts/kallsyms.c | 12 +++++++++++-
scripts/link-vmlinux.sh | 2 ++
2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 487ac6f..9a11f9f 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -55,6 +55,7 @@ static struct sym_entry *table;
 static unsigned int table_size, table_cnt;
 static int all_symbols = 0;
 static char symbol_prefix_char = '\0';
+static unsigned long long kernel_start_addr = 0;

 int token_profit[0x10000];

@@ -65,7 +66,10 @@ unsigned char best_table_len[256];

 static void usage(void)
 {
-	fprintf(stderr, "Usage: kallsyms [--all-symbols] [--symbol-prefix=<prefix char>] < in.map > out.S\n");
+	fprintf(stderr, "Usage: kallsyms [--all-symbols] "
+			"[--symbol-prefix=<prefix char>] "
+			"[--page-offset=<CONFIG_PAGE_OFFSET>] "
+			"< in.map > out.S\n");
 	exit(1);
 }

@@ -194,6 +198,9 @@ static int symbol_valid(struct sym_entry *s)
 	int i;
 	int offset = 1;

+	if (s->addr < kernel_start_addr)
+		return 0;
+
 	/* skip prefix char */
 	if (symbol_prefix_char && *(s->sym + 1) == symbol_prefix_char)
 		offset++;
@@ -646,6 +653,9 @@ int main(int argc, char **argv)
 				if ((*p == '"' && *(p+2) == '"') || (*p == '\'' && *(p+2) == '\''))
 					p++;
 				symbol_prefix_char = *p;
+			} else if (strncmp(argv[i], "--page-offset=", 14) == 0) {
+				const char *p = &argv[i][14];
+				kernel_start_addr = strtoull(p, NULL, 16);
 			} else
 				usage();
 		}
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 0149949..32b10f5 100644
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -82,6 +82,8 @@ kallsyms()
 		kallsymopt="${kallsymopt} --all-symbols"
 	fi

+	kallsymopt="${kallsymopt} --page-offset=$CONFIG_PAGE_OFFSET"
+
 	local aflags="${KBUILD_AFLAGS} ${KBUILD_AFLAGS_KERNEL} \
 		      ${NOSTDINC_FLAGS} ${LINUXINCLUDE} ${KBUILD_CPPFLAGS}"

--
1.7.9.5
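
For readers skimming the diff, the filter reduces to two steps: parse the hex
value passed via --page-offset, then reject any symbol whose address falls
below it. The following is a minimal standalone sketch of that logic, not the
kallsyms.c code itself; `struct sym`, `parse_page_offset()` and
`sym_in_kernel_space()` are hypothetical names introduced only for
illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for kallsyms' struct sym_entry. */
struct sym {
	unsigned long long addr;
	const char *name;
};

/*
 * Parse "--page-offset=<hex>" the way the patch does: match the
 * 14-character prefix, then read the remainder as base 16.
 * Returns 0 for any other argument; a threshold of 0 filters nothing.
 */
static unsigned long long parse_page_offset(const char *arg)
{
	if (strncmp(arg, "--page-offset=", 14) != 0)
		return 0;
	return strtoull(arg + 14, NULL, 16);
}

/* Same comparison as the new check in symbol_valid(), with the sense inverted. */
static int sym_in_kernel_space(const struct sym *s,
			       unsigned long long kernel_start_addr)
{
	return s->addr >= kernel_start_addr;
}
```

Because the threshold defaults to 0, running the tool without --page-offset
keeps every symbol, which matches the behaviour before the patch.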


2013-10-31 22:43:14

by Andrew Morton

Subject: Re: [PATCH] scripts/kallsyms: filter symbols not in kernel address space

On Mon, 28 Oct 2013 23:48:59 +0800 Ming Lei <[email protected]> wrote:

> This patch uses CONFIG_PAGE_OFFSET to filter symbols which
> are not in kernel address space because these symbols are
> generally for generating code purpose and can't be run at
> kernel mode, so we needn't keep them in /proc/kallsyms.
>
> For example, on ARM there are some symbols which are
> linked in relocatable code section, then perf can't parse
> symbols any more from /proc/kallsyms, and this patch fixes
> the problem.

This is a non-back-compatible change and I'd like to see a much
stronger assurance that it is safe to merge and will not break any
existing application on the planet, please.

For a start, please describe with great precision what these excluded
symbols are (examples would help) and explain why no application will
conceivably have had any use for them.

2013-10-31 22:50:34

by Russell King - ARM Linux

Subject: Re: [PATCH] scripts/kallsyms: filter symbols not in kernel address space

On Thu, Oct 31, 2013 at 03:43:11PM -0700, Andrew Morton wrote:
> On Mon, 28 Oct 2013 23:48:59 +0800 Ming Lei <[email protected]> wrote:
>
> > This patch uses CONFIG_PAGE_OFFSET to filter symbols which
> > are not in kernel address space because these symbols are
> > generally for generating code purpose and can't be run at
> > kernel mode, so we needn't keep them in /proc/kallsyms.
> >
> > For example, on ARM there are some symbols which are
> > linked in relocatable code section, then perf can't parse
> > symbols any more from /proc/kallsyms, and this patch fixes
> > the problem.
>
> This is a non-back-compatible change and I'd like to see a much
> stronger assurance that it is safe to merge and will not break any
> existing application on the planet, please.
>
> For a start, please describe with great precision what these excluded
> symbols are (examples would help) and explain why no application will
> conceivably have had any use for them.

These symbols are used to build what is relocatable code; the code which
ends up being placed in the machine vectors and the following page.

Rather than have to manually calculate them, I merged a patch which used
the tools we have, namely the assembler and linker, to do the job for us.
Unfortunately, these symbols have ended up in kallsyms, which various
programs read, and having symbols down at the lower 8k is not what they
expect.

What it means is we don't have to play these kinds of games in the
assembler:

- .equ stubs_offset, __vectors_start + 0x1000 - __stubs_start
__vectors_start:
- W(b) vector_rst + stubs_offset
- W(b) vector_und + stubs_offset
- W(ldr) pc, .LCvswi + stubs_offset
- W(b) vector_pabt + stubs_offset
- W(b) vector_dabt + stubs_offset
- W(b) vector_addrexcptn + stubs_offset
- W(b) vector_irq + stubs_offset
- W(b) vector_fiq + stubs_offset

where each vector_* symbol is located at an address greater than
__stubs_start. Here's the obvious question: can you understand what's
going on with all that?
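
As a rough illustration of the arithmetic the removed `.equ` performed: the
vectors are copied to one page and the stubs to the page that follows, so each
PC-relative branch has to be assembled with the displacement it will need
after the copy, not the one it has at link time. The sketch below checks that
adding `stubs_offset` to the link-time target yields exactly that displacement.
The function names and all addresses used with them are made up for
illustration, and the fixed pipeline offset that ARM adds to the PC is ignored
because it is the same in both cases.

```c
#include <stdint.h>

/* The offset the old code added to every branch target (the removed .equ). */
static uint32_t stubs_offset(uint32_t vectors_start, uint32_t stubs_start)
{
	return vectors_start + 0x1000u - stubs_start;
}

/* Displacement the assembler encodes, computed from link-time addresses. */
static uint32_t link_time_disp(uint32_t vectors_start, uint32_t stubs_start,
			       uint32_t entry_off, uint32_t stub_addr)
{
	uint32_t target = stub_addr + stubs_offset(vectors_start, stubs_start);

	return target - (vectors_start + entry_off);
}

/*
 * Displacement actually needed at run time, once the vectors live at
 * vec_base and the stubs have been copied to the following page.
 */
static uint32_t run_time_disp(uint32_t vec_base, uint32_t stubs_start,
			      uint32_t entry_off, uint32_t stub_addr)
{
	uint32_t target = vec_base + 0x1000u + (stub_addr - stubs_start);

	return target - (vec_base + entry_off);
}
```

The two functions agree for any choice of addresses, since `vectors_start` and
`vec_base` cancel out of their respective expressions; that cancellation is
the whole point of `stubs_offset`.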

2013-10-31 22:58:33

by Andrew Morton

Subject: Re: [PATCH] scripts/kallsyms: filter symbols not in kernel address space

On Thu, 31 Oct 2013 22:50:22 +0000 Russell King - ARM Linux <[email protected]> wrote:

> On Thu, Oct 31, 2013 at 03:43:11PM -0700, Andrew Morton wrote:
> > On Mon, 28 Oct 2013 23:48:59 +0800 Ming Lei <[email protected]> wrote:
> >
> > > This patch uses CONFIG_PAGE_OFFSET to filter symbols which
> > > are not in kernel address space because these symbols are
> > > generally for generating code purpose and can't be run at
> > > kernel mode, so we needn't keep them in /proc/kallsyms.
> > >
> > > For example, on ARM there are some symbols which are
> > > linked in relocatable code section, then perf can't parse
> > > symbols any more from /proc/kallsyms, and this patch fixes
> > > the problem.
> >
> > This is a non-back-compatible change and I'd like to see a much
> > stronger assurance that it is safe to merge and will not break any
> > existing application on the planet, please.
> >
> > For a start, please describe with great precision what these excluded
> > symbols are (examples would help) and explain why no application will
> > conceivably have had any use for them.
>
> These symbols are used to build what is relocatable code; the code which
> ends up being placed in the machine vectors and the following page.
>
> Rather than have to manually calculate them, I merged a patch which used
> the tools we have, namely the assembler and linker, to do the job for us.

OK. Do you recall which patch that was? And is it the case that this
patch excludes only the symbols which that patch accidentally added?

> Unfortunately, these symbols have ended up in kallsyms, which various
> programs read, and having symbols down at the lower 8k is not what they
> expect.

> What it means is we don't have to play these kinds of games in the
> assembler:
>
> - .equ stubs_offset, __vectors_start + 0x1000 - __stubs_start
> __vectors_start:
> - W(b) vector_rst + stubs_offset
> - W(b) vector_und + stubs_offset
> - W(ldr) pc, .LCvswi + stubs_offset
> - W(b) vector_pabt + stubs_offset
> - W(b) vector_dabt + stubs_offset
> - W(b) vector_addrexcptn + stubs_offset
> - W(b) vector_irq + stubs_offset
> - W(b) vector_fiq + stubs_offset
>
> where each vector_* symbol is located at an address greater than
> __stubs_start. Here's the obvious question: can you understand what's
> going on with all that?

Nope ;)

2013-10-31 23:53:43

by Russell King - ARM Linux

Subject: Re: [PATCH] scripts/kallsyms: filter symbols not in kernel address space

On Thu, Oct 31, 2013 at 03:58:31PM -0700, Andrew Morton wrote:
> On Thu, 31 Oct 2013 22:50:22 +0000 Russell King - ARM Linux <[email protected]> wrote:
>
> > On Thu, Oct 31, 2013 at 03:43:11PM -0700, Andrew Morton wrote:
> > > On Mon, 28 Oct 2013 23:48:59 +0800 Ming Lei <[email protected]> wrote:
> > >
> > > > This patch uses CONFIG_PAGE_OFFSET to filter symbols which
> > > > are not in kernel address space because these symbols are
> > > > generally for generating code purpose and can't be run at
> > > > kernel mode, so we needn't keep them in /proc/kallsyms.
> > > >
> > > > For example, on ARM there are some symbols which are
> > > > linked in relocatable code section, then perf can't parse
> > > > symbols any more from /proc/kallsyms, and this patch fixes
> > > > the problem.
> > >
> > > This is a non-back-compatible change and I'd like to see a much
> > > stronger assurance that it is safe to merge and will not break any
> > > existing application on the planet, please.
> > >
> > > For a start, please describe with great precision what these excluded
> > > symbols are (examples would help) and explain why no application will
> > > conceivably have had any use for them.
> >
> > These symbols are used to build what is relocatable code; the code which
> > ends up being placed in the machine vectors and the following page.
> >
> > Rather than have to manually calculate them, I merged a patch which used
> > the tools we have, namely the assembler and linker, to do the job for us.
>
> OK. Do you recall which patch that was? And is it the case that this
> patch excludes only the symbols which that patch accidentally added?

It's not a case of "accidentally added" - as we include all symbols in
kallsyms which are marked as text/data etc, irrespective of whether they
are local or global.

b9b32bf70f2fb made the change, backported to stable trees. e39e3f3ebfef
depended on that change so that it could get at the correct offset to
copy FIQ code.

Both commits are part of closing an information leak from the kernel
as part of a security report back in July.

2013-11-01 02:10:40

by Ming Lei

Subject: Re: [PATCH] scripts/kallsyms: filter symbols not in kernel address space

Hi Andrew,

Thanks for your comment.

On Fri, Nov 1, 2013 at 6:43 AM, Andrew Morton <[email protected]> wrote:
>
> For a start, please describe with great precision what these excluded
> symbols are (examples would help) and explain why no application will
> conceivably have had any use for them.

It looks like Russell has already given example symbols. All of these
symbols exist only for generating code, and the kernel never runs them,
since it can't run code that isn't in the kernel virtual address space,
so there is no reason for an application to use them.

Actually, there is already a report on the problem; see the link below:

http://www.gossamer-threads.com/lists/linux/kernel/1808193?page=last

> OK. Do you recall which patch that was? And is it the case that this
> patch excludes only the symbols which that patch accidentally added?

This patch excludes all kernel symbols (non-module symbols) which
aren't in the kernel virtual address space; I think that is reasonable and correct.


Thanks,
--
Ming Lei

2013-11-01 02:36:58

by Rusty Russell

Subject: [PATCH] scripts/kallsyms: filter symbols not in kernel address space

From: Ming Lei <[email protected]>

This patch uses CONFIG_PAGE_OFFSET to filter symbols which
are not in kernel address space because these symbols are
generally for generating code purpose and can't be run at
kernel mode, so we needn't keep them in /proc/kallsyms.

For example, on ARM there are some symbols which may be
linked in relocatable code section, then perf can't parse
symbols any more from /proc/kallsyms; this patch fixes the problem
(introduced by b9b32bf70f2fb710b07c94e13afbc729afe221da).

Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: Michal Marek <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
Cc: [email protected]
---
Stephen, please run this on today's linux-next, so I can push to Linus
asap. Thanks...

scripts/kallsyms.c | 12 +++++++++++-
scripts/link-vmlinux.sh | 2 ++
2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 487ac6f..9a11f9f 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -55,6 +55,7 @@ static struct sym_entry *table;
 static unsigned int table_size, table_cnt;
 static int all_symbols = 0;
 static char symbol_prefix_char = '\0';
+static unsigned long long kernel_start_addr = 0;

 int token_profit[0x10000];

@@ -65,7 +66,10 @@ unsigned char best_table_len[256];

 static void usage(void)
 {
-	fprintf(stderr, "Usage: kallsyms [--all-symbols] [--symbol-prefix=<prefix char>] < in.map > out.S\n");
+	fprintf(stderr, "Usage: kallsyms [--all-symbols] "
+			"[--symbol-prefix=<prefix char>] "
+			"[--page-offset=<CONFIG_PAGE_OFFSET>] "
+			"< in.map > out.S\n");
 	exit(1);
 }

@@ -194,6 +198,9 @@ static int symbol_valid(struct sym_entry *s)
 	int i;
 	int offset = 1;

+	if (s->addr < kernel_start_addr)
+		return 0;
+
 	/* skip prefix char */
 	if (symbol_prefix_char && *(s->sym + 1) == symbol_prefix_char)
 		offset++;
@@ -646,6 +653,9 @@ int main(int argc, char **argv)
 				if ((*p == '"' && *(p+2) == '"') || (*p == '\'' && *(p+2) == '\''))
 					p++;
 				symbol_prefix_char = *p;
+			} else if (strncmp(argv[i], "--page-offset=", 14) == 0) {
+				const char *p = &argv[i][14];
+				kernel_start_addr = strtoull(p, NULL, 16);
 			} else
 				usage();
 		}
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 0149949..32b10f5 100644
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -82,6 +82,8 @@ kallsyms()
 		kallsymopt="${kallsymopt} --all-symbols"
 	fi

+	kallsymopt="${kallsymopt} --page-offset=$CONFIG_PAGE_OFFSET"
+
 	local aflags="${KBUILD_AFLAGS} ${KBUILD_AFLAGS_KERNEL} \
 		      ${NOSTDINC_FLAGS} ${LINUXINCLUDE} ${KBUILD_CPPFLAGS}"

2014-01-07 14:13:20

by Arnd Bergmann

Subject: Re: [PATCH] scripts/kallsyms: filter symbols not in kernel address space

On Monday 28 October 2013, Ming Lei wrote:
> This patch uses CONFIG_PAGE_OFFSET to filter symbols which
> are not in kernel address space because these symbols are
> generally for generating code purpose and can't be run at
> kernel mode, so we needn't keep them in /proc/kallsyms.
>
> For example, on ARM there are some symbols which are
> linked in relocatable code section, then perf can't parse
> symbols any more from /proc/kallsyms, and this patch fixes
> the problem.
>
> Cc: Russell King <[email protected]>
> Cc: [email protected]
> Cc: Michal Marek <[email protected]>
> Acked-by: Rusty Russell <[email protected]>
> Signed-off-by: Ming Lei <[email protected]>

Sorry for the late report, but I seem to have encountered a problem with this
patch, now that it has made it into all stable kernels.

When linking an ARM nommu kernel, I get the output "No valid symbol." twice,
from scripts/kallsyms. The problem evidently is that PAGE_OFFSET is still
set to 0xC0000000 on ARM NOMMU builds but the kernel is linked to start at
PLAT_PHYS_OFFSET instead, which may be elsewhere. For most platforms,
this is defined in Kconfig these days, so we could get away with

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index d1e4098..c477a7c 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1592,6 +1592,7 @@ endchoice

config PAGE_OFFSET
hex
+ default PHYS_OFFSET if !MMU
default 0x40000000 if VMSPLIT_1G
default 0x80000000 if VMSPLIT_2G
default 0xC0000000

but there are still a few ARM platforms that define their own PLAT_PHYS_OFFSET
in memory.h, and it wouldn't help on non-ARM systems that might have the same
problem.

Arnd
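
The failure mode reported above can be reproduced in miniature: when every
symbol address lies below the configured threshold, the filter rejects all of
them and nothing is left to emit. This is a minimal sketch, assuming a
hypothetical nommu kernel whose symbols sit around 0x20000000; that address is
made up for illustration (the real PLAT_PHYS_OFFSET varies by platform), and
`count_valid()` is not a function in kallsyms.c.

```c
#include <stddef.h>

/* Count the symbols that survive the "addr >= kernel_start_addr" filter. */
static size_t count_valid(const unsigned long long *addrs, size_t n,
			  unsigned long long kernel_start_addr)
{
	size_t i, kept = 0;

	for (i = 0; i < n; i++)
		if (addrs[i] >= kernel_start_addr)
			kept++;
	return kept;
}
```

With a threshold of 0xC0000000 and symbols near 0x20000000 the count is zero,
which corresponds to the "No valid symbol." output; with the threshold set to
the address the kernel is actually linked at, all of them are kept.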

2014-01-07 14:33:46

by Ming Lei

Subject: Re: [PATCH] scripts/kallsyms: filter symbols not in kernel address space

Hi Arnd,

On Tue, Jan 7, 2014 at 10:12 PM, Arnd Bergmann <[email protected]> wrote:
> On Monday 28 October 2013, Ming Lei wrote:
>> This patch uses CONFIG_PAGE_OFFSET to filter symbols which
>> are not in kernel address space because these symbols are
>> generally for generating code purpose and can't be run at
>> kernel mode, so we needn't keep them in /proc/kallsyms.
>>
>> For example, on ARM there are some symbols which are
>> linked in relocatable code section, then perf can't parse
>> symbols any more from /proc/kallsyms, and this patch fixes
>> the problem.
>>
>> Cc: Russell King <[email protected]>
>> Cc: [email protected]
>> Cc: Michal Marek <[email protected]>
>> Acked-by: Rusty Russell <[email protected]>
>> Signed-off-by: Ming Lei <[email protected]>
>
> Sorry for the late report, but I seem to have encountered a problem with this
> patch, now that it has made it into all stable kernels.
>
> When linking an ARM nommu kernel, I get the output "No valid symbol." twice,
> from scripts/kallsyms. The problem evidently is that PAGE_OFFSET is still
> set to 0xC0000000 on ARM NOMMU builds but the kernel is linked to start at
> PLAT_PHYS_OFFSET instead, which may be elsewhere. For most platforms,
> this is defined in Kconfig these days, so we could get away with
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index d1e4098..c477a7c 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1592,6 +1592,7 @@ endchoice
>
> config PAGE_OFFSET
> hex
> + default PHYS_OFFSET if !MMU
> default 0x40000000 if VMSPLIT_1G
> default 0x80000000 if VMSPLIT_2G
> default 0xC0000000
>
> but there are still a few ARM platforms that define their own PLAT_PHYS_OFFSET
> in memory.h, and it wouldn't help on non-ARM systems that might have the same
> problem.

We have posted two patches to address the problem; see the link below:

http://lists.scusting.com/index.php?t=msg&goto=1726509&S=Google


Thanks,
--
Ming Lei

2014-01-07 15:06:24

by Arnd Bergmann

Subject: Re: [PATCH] scripts/kallsyms: filter symbols not in kernel address space

On Tuesday 07 January 2014 22:33:24 Ming Lei wrote:
> On Tue, Jan 7, 2014 at 10:12 PM, Arnd Bergmann <[email protected]> wrote:
> > On Monday 28 October 2013, Ming Lei wrote:
>
> We have posted two patches to address the problem; see the link below:
>
> http://lists.scusting.com/index.php?t=msg&goto=1726509&S=Google

The patch "scripts/link-vmlinux.sh: only filter kernel symbols for arm"
has made it in now, which is good for all other architectures, but it
makes no difference for nommu-arm, because CONFIG_PAGE_OFFSET is
still set.

The second patch from Jonathan Austin has not been applied yet, as of
today's linux-next, and it's exactly what I suggested. However it
won't work on ARM platforms that define their own PLAT_PHYS_OFFSET:
ebsa110, ep93xx, exynos, footbridge, integrator, iop13xx, ks8695,
omap1, realview, rpc, s5pv210 and sa1100. Fortunately these all have
MMUs, so in practice it won't hurt, but it doesn't seem correct.

Arnd