Hello,
Recently I've spent some time with kgdb and found a interesting thing.
The code I've used is Jason Wessel's sample module which can be downloaded
at following URL.
http://kernel.org/pub/linux/kernel/people/jwessel/dbg_webinar/crash_mod.tar.bz2
It build a kernel module "test_panic" and after the module is loaded it
makes some files in /proc/test_panic directory. Most of them are supposed
to kernel panic when they're written. A simple usage is like below:
# echo 1 > /proc/test_panic/panic
[ 18.345571] Starting panic
[ 18.345730] Kernel panic - not syncing: test_panic running!
[ 18.345752]
[ 18.345856] Pid: 36, comm: sh Not tainted 2.6.38-rc8+ #83
[ 18.345923] Call Trace:
[ 18.346001] [<ffffffff812a8502>] panic+0x8c/0x18d
[ 18.346257] [<ffffffffa000012a>] deep01+0x0/0x38 [test_panic] <--- ???
[ 18.346347] [<ffffffff81104666>] proc_file_write+0x73/0x8d
[ 18.346432] [<ffffffff811000b3>] proc_reg_write+0x8d/0xac
[ 18.346516] [<ffffffff810c7d32>] vfs_write+0xa1/0xc5
[ 18.346603] [<ffffffff810c7e0f>] sys_write+0x45/0x6c
[ 18.346801] [<ffffffff8f02943b>] system_call_fastpath+0x16/0x1b
In this case, offending function is "panic_write()" but the trace showed
me a different function "deep01()". I was a bit confused at first. The
panic_write() function is a very simple code which calls panic() at the
end of the function.
static int panic_write(struct file *file, const char *buffer,
unsigned long count, void *data)
{
trace_printk("Starting panic\n");
printk(KERN_INFO "Starting panic\n");
panic("test_panic running!\n");
return count;
}
But where did the "deep01()" come from?
This was due to a gcc's optimization. Because the panic() was known not to
return to the caller (it has "noreturn" attribute), the gcc optimized out
remaining code after calling to the panic(). But PC was incremented after
the call as usual, so return address saved in stack pointed a different
(very next) function.
I ran objdump on test_panic.ko and got following result.
... (snipped) ...
000000000000010a <panic_write>:
10a: 55 push %rbp
10b: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
112: 31 c0 xor %eax,%eax
114: 48 89 e5 mov %rsp,%rbp
117: e8 00 00 00 00 callq 11c <panic_write+0x12>
11c: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
123: 31 c0 xor %eax,%eax
125: e8 00 00 00 00 callq 12a <deep01> <--- call panic() !!!
000000000000012a <deep01>:
12a: 55 push %rbp
12b: 31 c0 xor %eax,%eax
12d: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
... (snipped) ...
To this end, I added sprint_trace() and %pT specifier for printk(). They'll
do the same symbol lookup but with the offset subtracted by 1 to ensure it
would be in the same function - and increment it again before printing.
It will not affect normal trace case and the only problematic case will be
the case where return address is really the start of a function (offset 0).
But I think this cannot happen during stack trace because calling a function
always increases PC and then saves. Thus even if there's a call at the
beginning in a function, its return address will not have the 0-offset. But
other symbol lookups could still use 0-offset for some purpose, I added new
helper to separate them.
I've only checked this on x86_64 on qemu, but other architectures may want
this too :) It seems work well for me.
[ 22.224483] Call Trace:
[ 22.224569] [<ffffffff812bce69>] panic+0x8c/0x18d
[ 22.224848] [<ffffffffa000012a>] panic_write+0x20/0x20 [test_panic] <--- !!!
[ 22.224979] [<ffffffff81115fab>] proc_file_write+0x73/0x8d
[ 22.225089] [<ffffffff81111a5f>] proc_reg_write+0x8d/0xac
[ 22.225199] [<ffffffff810d90ee>] vfs_write+0xa1/0xc5
[ 22.225304] [<ffffffff810d91cb>] sys_write+0x45/0x6c
[ 22.225408] [<ffffffff812c07fb>] system_call_fastpath+0x16/0x1b
Any comments are welcome.
Thanks.
---
Namhyung Kim (2):
vsprintf: introduce %pT format specifier
x86, dumpstack: use %pT format specifier for stack trace
arch/x86/kernel/dumpstack.c | 2 +-
include/linux/kallsyms.h | 7 +++++++
kernel/kallsyms.c | 26 ++++++++++++++++++++++++++
lib/vsprintf.c | 7 ++++++-
4 files changed, 40 insertions(+), 2 deletions(-)
--
1.7.4
The %pT format specifier is for stack backtrace. Its handler
sprint_trace() does symbol lookup using (offset-1) to ensure
the offset will not point outside of the function.
If there is a tail-call to the function marked "noreturn",
gcc optimized out the code after the call then causes saved
return address points outside of the function (i.e. the start
of the next function), so messes up a call trace entry.
This patch will fix it.
Signed-off-by: Namhyung Kim <[email protected]>
---
include/linux/kallsyms.h | 7 +++++++
kernel/kallsyms.c | 26 ++++++++++++++++++++++++++
lib/vsprintf.c | 7 ++++++-
3 files changed, 39 insertions(+), 1 deletions(-)
diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
index d8e9b3d1c23c..4964ed2021cc 100644
--- a/include/linux/kallsyms.h
+++ b/include/linux/kallsyms.h
@@ -36,6 +36,7 @@ const char *kallsyms_lookup(unsigned long addr,
/* Look up a kernel symbol and return it in a text buffer. */
extern int sprint_symbol(char *buffer, unsigned long address);
+extern int sprint_trace(char *buffer, unsigned long address);
/* Look up a kernel symbol and print it to the kernel messages. */
extern void __print_symbol(const char *fmt, unsigned long address);
@@ -79,6 +80,12 @@ static inline int sprint_symbol(char *buffer, unsigned long addr)
return 0;
}
+static inline int sprint_trace(char *buffer, unsigned long addr)
+{
+ *buffer = '\0';
+ return 0;
+}
+
static inline int lookup_symbol_name(unsigned long addr, char *symname)
{
return -ERANGE;
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 6f6d091b5757..2e43200cdc79 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -368,6 +368,32 @@ int sprint_symbol(char *buffer, unsigned long address)
}
EXPORT_SYMBOL_GPL(sprint_symbol);
+int sprint_trace(char *buffer, unsigned long address)
+{
+ char *modname;
+ const char *name;
+ unsigned long offset, size;
+ int len;
+
+ name = kallsyms_lookup(address-1, &size, &offset, &modname, buffer);
+ if (!name)
+ return sprintf(buffer, "0x%lx", address);
+
+ if (name != buffer)
+ strcpy(buffer, name);
+ len = strlen(buffer);
+ buffer += len;
+ offset++;
+
+ if (modname)
+ len += sprintf(buffer, "+%#lx/%#lx [%s]",
+ offset, size, modname);
+ else
+ len += sprintf(buffer, "+%#lx/%#lx", offset, size);
+
+ return len;
+}
+
/* Look up a kernel symbol and print it to the kernel messages. */
void __print_symbol(const char *fmt, unsigned long address)
{
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index d3023df8477f..c9c56a54cc42 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -574,7 +574,9 @@ char *symbol_string(char *buf, char *end, void *ptr,
unsigned long value = (unsigned long) ptr;
#ifdef CONFIG_KALLSYMS
char sym[KSYM_SYMBOL_LEN];
- if (ext != 'f' && ext != 's')
+ if (ext == 'T')
+ sprint_trace(sym, value);
+ else if (ext != 'f' && ext != 's')
sprint_symbol(sym, value);
else
kallsyms_lookup(value, NULL, NULL, NULL, sym);
@@ -949,6 +951,7 @@ int kptr_restrict = 1;
* - 'f' For simple symbolic function names without offset
* - 'S' For symbolic direct pointers with offset
* - 's' For symbolic direct pointers without offset
+ * - 'T' For backtraced symbolic direct pointers with offset
* - 'R' For decoded struct resource, e.g., [mem 0x0-0x1f 64bit pref]
* - 'r' For raw struct resource, e.g., [mem 0x0-0x1f flags 0x201]
* - 'M' For a 6-byte MAC address, it prints the address in the
@@ -1008,6 +1011,7 @@ char *pointer(const char *fmt, char *buf, char *end, void *ptr,
/* Fallthrough */
case 'S':
case 's':
+ case 'T':
return symbol_string(buf, end, ptr, spec, *fmt);
case 'R':
case 'r':
@@ -1279,6 +1283,7 @@ qualifier:
* %ps output the name of a text symbol without offset
* %pF output the name of a function pointer with its offset
* %pf output the name of a function pointer without its offset
+ * %pT output the name of a trace symbol with its offset
* %pR output the address range in a struct resource with decoded flags
* %pr output the address range in a struct resource with raw flags
* %pM output a 6-byte MAC address with colons
--
1.7.4
Signed-off-by: Namhyung Kim <[email protected]>
---
arch/x86/kernel/dumpstack.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 914065620f2c..809d027de28f 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -27,7 +27,7 @@ static int die_counter;
void printk_address(unsigned long address, int reliable)
{
- printk(" [<%p>] %s%pS\n", (void *) address,
+ printk(" [<%p>] %s%pT\n", (void *) address,
reliable ? "" : "? ", (void *) address);
}
--
1.7.4
On Fri, Mar 18, 2011 at 08:21:17PM +0900, Namhyung Kim wrote:
> The %pT format specifier is for stack backtrace. Its handler
> sprint_trace() does symbol lookup using (offset-1) to ensure
> the offset will not point outside of the function.
>
> If there is a tail-call to the function marked "noreturn",
> gcc optimized out the code after the call then causes saved
> return address points outside of the function (i.e. the start
> of the next function), so messes up a call trace entry.
> This patch will fix it.
This looks uselful.
> kernel/kallsyms.c | 26 ++++++++++++++++++++++++++
> lib/vsprintf.c | 7 ++++++-
> 3 files changed, 39 insertions(+), 1 deletions(-)
>
> diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
> index d8e9b3d1c23c..4964ed2021cc 100644
> --- a/include/linux/kallsyms.h
> +++ b/include/linux/kallsyms.h
> @@ -36,6 +36,7 @@ const char *kallsyms_lookup(unsigned long addr,
>
> /* Look up a kernel symbol and return it in a text buffer. */
> extern int sprint_symbol(char *buffer, unsigned long address);
> +extern int sprint_trace(char *buffer, unsigned long address);
>
> /* Look up a kernel symbol and print it to the kernel messages. */
> extern void __print_symbol(const char *fmt, unsigned long address);
> @@ -79,6 +80,12 @@ static inline int sprint_symbol(char *buffer, unsigned long addr)
> return 0;
> }
>
> +static inline int sprint_trace(char *buffer, unsigned long addr)
> +{
> + *buffer = '\0';
> + return 0;
> +}
> +
> static inline int lookup_symbol_name(unsigned long addr, char *symname)
> {
> return -ERANGE;
> diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> index 6f6d091b5757..2e43200cdc79 100644
> --- a/kernel/kallsyms.c
> +++ b/kernel/kallsyms.c
> @@ -368,6 +368,32 @@ int sprint_symbol(char *buffer, unsigned long address)
> }
> EXPORT_SYMBOL_GPL(sprint_symbol);
>
This desperately needs a comment. Please add a kernel doc style header
to this. (ie.:)
/**
* sprint_trace - one line description
* @buffer: what is the buffer
* @address: what is the address
*
* describe why this function exists and how to use it.
*
* Returns: what are the return values.
*/
-- Steve
> +int sprint_trace(char *buffer, unsigned long address)
> +{
> + char *modname;
> + const char *name;
> + unsigned long offset, size;
> + int len;
> +
> + name = kallsyms_lookup(address-1, &size, &offset, &modname, buffer);
> + if (!name)
> + return sprintf(buffer, "0x%lx", address);
> +
> + if (name != buffer)
> + strcpy(buffer, name);
> + len = strlen(buffer);
> + buffer += len;
> + offset++;
> +
> + if (modname)
> + len += sprintf(buffer, "+%#lx/%#lx [%s]",
> + offset, size, modname);
> + else
> + len += sprintf(buffer, "+%#lx/%#lx", offset, size);
> +
> + return len;
> +}
> +
> /* Look up a kernel symbol and print it to the kernel messages. */
> void __print_symbol(const char *fmt, unsigned long address)
> {
> diff --git a/lib/vsprintf.c b/lib/vsprintf.c
> index d3023df8477f..c9c56a54cc42 100644
> --- a/lib/vsprintf.c
> +++ b/lib/vsprintf.c
> @@ -574,7 +574,9 @@ char *symbol_string(char *buf, char *end, void *ptr,
> unsigned long value = (unsigned long) ptr;
> #ifdef CONFIG_KALLSYMS
> char sym[KSYM_SYMBOL_LEN];
> - if (ext != 'f' && ext != 's')
> + if (ext == 'T')
> + sprint_trace(sym, value);
> + else if (ext != 'f' && ext != 's')
> sprint_symbol(sym, value);
> else
> kallsyms_lookup(value, NULL, NULL, NULL, sym);
> @@ -949,6 +951,7 @@ int kptr_restrict = 1;
> * - 'f' For simple symbolic function names without offset
> * - 'S' For symbolic direct pointers with offset
> * - 's' For symbolic direct pointers without offset
> + * - 'T' For backtraced symbolic direct pointers with offset
> * - 'R' For decoded struct resource, e.g., [mem 0x0-0x1f 64bit pref]
> * - 'r' For raw struct resource, e.g., [mem 0x0-0x1f flags 0x201]
> * - 'M' For a 6-byte MAC address, it prints the address in the
> @@ -1008,6 +1011,7 @@ char *pointer(const char *fmt, char *buf, char *end, void *ptr,
> /* Fallthrough */
> case 'S':
> case 's':
> + case 'T':
> return symbol_string(buf, end, ptr, spec, *fmt);
> case 'R':
> case 'r':
> @@ -1279,6 +1283,7 @@ qualifier:
> * %ps output the name of a text symbol without offset
> * %pF output the name of a function pointer with its offset
> * %pf output the name of a function pointer without its offset
> + * %pT output the name of a trace symbol with its offset
> * %pR output the address range in a struct resource with decoded flags
> * %pr output the address range in a struct resource with raw flags
> * %pM output a 6-byte MAC address with colons
> --
> 1.7.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
You needed to add the x86 maintainers to your Cc list.
-- Steve
On Fri, Mar 18, 2011 at 08:21:18PM +0900, Namhyung Kim wrote:
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> arch/x86/kernel/dumpstack.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
> index 914065620f2c..809d027de28f 100644
> --- a/arch/x86/kernel/dumpstack.c
> +++ b/arch/x86/kernel/dumpstack.c
> @@ -27,7 +27,7 @@ static int die_counter;
>
> void printk_address(unsigned long address, int reliable)
> {
> - printk(" [<%p>] %s%pS\n", (void *) address,
> + printk(" [<%p>] %s%pT\n", (void *) address,
> reliable ? "" : "? ", (void *) address);
> }
>
> --
> 1.7.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/