Running 2.6.29-rc1-next-20090116 on vmware server x86_64 I see this at
boot up. Doesn't appear to cause any problem, no idea what kind of
thermal problem this is causing me on my VM :)
The last linux-next kernel I ran was 2.6.29-rc1-next-20090114 and I do
not remember seeing this. But me not remembering doesn't mean it wasn't
there.....
Any information I can collect or what I can do just let me know.
-Eric
[ 0.036825] ACPI: Core revision 20081204
[ 0.098568] ftrace: converting mcount calls to 0f 1f 44 00 00
[ 0.099101] ftrace: allocating 21076 entries in 166 pages
[ 0.105983] ------------[ cut here ]------------
[ 0.105983] WARNING: at kernel/trace/ftrace.c:434 ftrace_bug+0xdb/0x2b1()
[ 0.105983] Hardware name: VMware Virtual Platform
[ 0.105983] Modules linked in:
[ 0.105983] Pid: 0, comm: swapper Not tainted 2.6.29-rc1-next-20090116 #252
[ 0.105983] Call Trace:
[ 0.105983] [<ffffffff810515bc>] warn_slowpath+0xd8/0xf7
[ 0.105983] [<ffffffff8101562d>] ? dump_trace+0x26b/0x27a
[ 0.105983] [<ffffffff810763c7>] ? find_usage_backwards+0x2c/0xf2
[ 0.105983] [<ffffffff810777e0>] ? check_usage_backwards+0x2e/0x52
[ 0.105983] [<ffffffff810762d0>] ? graph_unlock+0x6b/0x77
[ 0.105983] [<ffffffff81077d33>] ? mark_lock+0x34b/0x35b
[ 0.105983] [<ffffffff81077d95>] ? mark_held_locks+0x52/0x72
[ 0.105983] [<ffffffff813ba274>] ? __mutex_lock_common+0x41a/0x43f
[ 0.105983] [<ffffffff81018e63>] ? sched_clock+0x9/0xc
[ 0.105983] [<ffffffff810bc8ab>] ? pagefault_enable+0x9/0x1a
[ 0.105983] [<ffffffff810bc90a>] ? probe_kernel_write+0x4e/0x74
[ 0.105983] [<ffffffff810bc8ab>] ? pagefault_enable+0x9/0x1a
[ 0.105983] [<ffffffff810bc97e>] ? probe_kernel_read+0x4e/0x78
[ 0.105983] [<ffffffff8122346e>] ? acpi_thermal_notify+0x7/0xc1
[ 0.105983] [<ffffffff81029a9f>] ? ftrace_modify_code+0x46/0x75
[ 0.105983] [<ffffffff8122346e>] ? acpi_thermal_notify+0x7/0xc1
[ 0.105983] [<ffffffff810aaecb>] ftrace_bug+0xdb/0x2b1
[ 0.105983] [<ffffffff8122346e>] ? acpi_thermal_notify+0x7/0xc1
[ 0.105983] [<ffffffff810acc10>] ftrace_convert_nops+0x23d/0x2da
[ 0.105983] [<ffffffff81713fd3>] ftrace_init+0x116/0x121
[ 0.105983] [<ffffffff81213a84>] ? acpi_enable_subsystem+0x3c/0x90
[ 0.105983] [<ffffffff816f5d99>] start_kernel+0x40e/0x41e
[ 0.105983] [<ffffffff816f5140>] ? early_idt_handler+0x0/0x71
[ 0.105983] [<ffffffff816f52c3>] x86_64_start_reservations+0xae/0xb2
[ 0.105983] [<ffffffff816f53d3>] x86_64_start_kernel+0xe9/0xf4
[ 0.105983] ---[ end trace a7919e7f17c0a725 ]---
[ 0.105983] ftrace failed to modify [<ffffffff8122346e>] acpi_thermal_notify+0x7/0xc1
[ 0.105983] actual: 0f:1f:44:00:00
On Fri, 16 Jan 2009, Eric Paris wrote:
> Running 2.6.29-rc1-next-20090116 on vmware server x86_64 I see this at
> boot up. Doesn't appear to cause any problem, no idea what kind of
> thermal problem this is causing me on my VM :)
>
> The last linux-next kernel I ran was 2.6.29-rc1-next-20090114 and I do
> not remember seeing this. But me not remembering doesn't mean it wasn't
> there.....
>
> Any information I can collect or what I can do just let me know.
>
> -Eric
>
> [ 0.036825] ACPI: Core revision 20081204
> [ 0.098568] ftrace: converting mcount calls to 0f 1f 44 00 00
> [ 0.099101] ftrace: allocating 21076 entries in 166 pages
> [ 0.105983] ------------[ cut here ]------------
> [ 0.105983] WARNING: at kernel/trace/ftrace.c:434 ftrace_bug+0xdb/0x2b1()
> [ 0.105983] Hardware name: VMware Virtual Platform
> [ 0.105983] Modules linked in:
> [ 0.105983] Pid: 0, comm: swapper Not tainted 2.6.29-rc1-next-20090116 #252
> [ 0.105983] Call Trace:
> [ 0.105983] [<ffffffff810515bc>] warn_slowpath+0xd8/0xf7
> [ 0.105983] [<ffffffff8101562d>] ? dump_trace+0x26b/0x27a
> [ 0.105983] [<ffffffff810763c7>] ? find_usage_backwards+0x2c/0xf2
> [ 0.105983] [<ffffffff810777e0>] ? check_usage_backwards+0x2e/0x52
> [ 0.105983] [<ffffffff810762d0>] ? graph_unlock+0x6b/0x77
> [ 0.105983] [<ffffffff81077d33>] ? mark_lock+0x34b/0x35b
> [ 0.105983] [<ffffffff81077d95>] ? mark_held_locks+0x52/0x72
> [ 0.105983] [<ffffffff813ba274>] ? __mutex_lock_common+0x41a/0x43f
> [ 0.105983] [<ffffffff81018e63>] ? sched_clock+0x9/0xc
> [ 0.105983] [<ffffffff810bc8ab>] ? pagefault_enable+0x9/0x1a
> [ 0.105983] [<ffffffff810bc90a>] ? probe_kernel_write+0x4e/0x74
> [ 0.105983] [<ffffffff810bc8ab>] ? pagefault_enable+0x9/0x1a
> [ 0.105983] [<ffffffff810bc97e>] ? probe_kernel_read+0x4e/0x78
> [ 0.105983] [<ffffffff8122346e>] ? acpi_thermal_notify+0x7/0xc1
> [ 0.105983] [<ffffffff81029a9f>] ? ftrace_modify_code+0x46/0x75
> [ 0.105983] [<ffffffff8122346e>] ? acpi_thermal_notify+0x7/0xc1
> [ 0.105983] [<ffffffff810aaecb>] ftrace_bug+0xdb/0x2b1
> [ 0.105983] [<ffffffff8122346e>] ? acpi_thermal_notify+0x7/0xc1
> [ 0.105983] [<ffffffff810acc10>] ftrace_convert_nops+0x23d/0x2da
> [ 0.105983] [<ffffffff81713fd3>] ftrace_init+0x116/0x121
> [ 0.105983] [<ffffffff81213a84>] ? acpi_enable_subsystem+0x3c/0x90
> [ 0.105983] [<ffffffff816f5d99>] start_kernel+0x40e/0x41e
> [ 0.105983] [<ffffffff816f5140>] ? early_idt_handler+0x0/0x71
> [ 0.105983] [<ffffffff816f52c3>] x86_64_start_reservations+0xae/0xb2
> [ 0.105983] [<ffffffff816f53d3>] x86_64_start_kernel+0xe9/0xf4
> [ 0.105983] ---[ end trace a7919e7f17c0a725 ]---
> [ 0.105983] ftrace failed to modify [<ffffffff8122346e>] acpi_thermal_notify+0x7/0xc1
> [ 0.105983] actual: 0f:1f:44:00:00
Hmm, someone reported something similar before too. You don't need to
worry, but this is one of the function tracer self tests failing. In which
case, you will loose the ability to use the function tracer.
What happen is that on boot up, the dynamic function tracer will convert
all calls to mcount into a nop. When it does this, it does several self
checks to make sure it is not destroying anything, (remember a e1000e
bug?) If it detects anything abnormal, it aborts and this is one of those
detections triggering.
When it went to modify the location to the call of mcount in
acpi_thermal_notify into a nop, it checked to see if that loctation was
indeed a call to mcount. But it was not. Instead it was already a nop!
This means that it somehow converted the acpi_thermal_notify twice, and
the second time it failed because it was not what it expected.
If you still see this on boot up, could you cat out:
# mount -t debugfs none /sys/kernel/debug
# cat /sys/kernel/debug/tracing/dyn_ftrace_total_info
And perhaps, if possible, could I get a copy of the vmlinux that was
built, and a CONFIG to go with it.
Thanks,
-- Steve
On Fri, 2009-01-16 at 19:47 -0500, Steven Rostedt wrote:
> # mount -t debugfs none /sys/kernel/debug
> # cat /sys/kernel/debug/tracing/dyn_ftrace_total_info
[root@unused ~]# cat /sys/kernel/debug/tracing/dyn_ftrace_total_info
0 0 0
> And perhaps, if possible, could I get a copy of the vmlinux that was
> built, and a CONFIG to go with it.
check out http://people.redhat.com/~eparis/ftrace
You should find my config, the vmlinux, and my serial capture on that
page. I'll build again right now without all of my filesystem
notification patches to make sure they aren't somehow screwing things up
(but I pretty seriously doubt it.)
-Eric
On Fri, 16 Jan 2009, Eric Paris wrote:
> On Fri, 2009-01-16 at 19:47 -0500, Steven Rostedt wrote:
> > # mount -t debugfs none /sys/kernel/debug
> > # cat /sys/kernel/debug/tracing/dyn_ftrace_total_info
>
> [root@unused ~]# cat /sys/kernel/debug/tracing/dyn_ftrace_total_info
> 0 0 0
>
> > And perhaps, if possible, could I get a copy of the vmlinux that was
> > built, and a CONFIG to go with it.
>
> check out http://people.redhat.com/~eparis/ftrace
>
> You should find my config, the vmlinux, and my serial capture on that
> page. I'll build again right now without all of my filesystem
> notification patches to make sure they aren't somehow screwing things up
> (but I pretty seriously doubt it.)
Thanks,
Something messed up in the creating of the mcount section. The secton
that is used to find all the places to modify.
Here's the output of your dump:
[ 0.125980] ------------[ cut here ]------------
[ 0.125980] WARNING: at kernel/trace/ftrace.c:434
ftrace_bug+0xdb/0x2b1()
[ 0.125980] Hardware name: VMware Virtual Platform
[ 0.125980] Modules linked in:
[ 0.125980] Pid: 0, comm: swapper Not tainted 2.6.29-rc1-next-20090116
#256
[ 0.125980] Call Trace:
[ 0.125980] [<ffffffff810515bc>] warn_slowpath+0xd8/0xf7
[...]
[ 0.125980] ---[ end trace a7919e7f17c0a725 ]---
[ 0.125980] ftrace failed to modify [<ffffffff812234de>]
acpi_thermal_notify+0x7/0xc1
[ 0.125980] actual: 0f:1f:44:00:00
$ nm vmlinux |grep mcount_loc
ffffffff815bbb40 R __start_mcount_loc
ffffffff815e4df0 R __stop_mcount_loc
$ objdump -h vmlinux | grep __init_rodata
11 __init_rodata 000292b0 ffffffff815bbb40 00000000015bbb40 007bbb40 2**3
The __mcount_loc section, which is where the data is stored, is the first
section in the __init_rodata section. By doing:
$ readelf -x __init_rodata vmlinux |less
I get to see what is in that section:
Hex dump of section '__init_rodata':
0x815bbb40 11a00081 ffffffff 98a10081 ffffffff ................
0x815bbb50 4ba20081 ffffffff 60a20081 ffffffff K.......`.......
0x815bbb60 85a20081 ffffffff 98a30081 ffffffff ................
[...]
0x815d4710 df342281 ffffffff 51152281 ffffffff .4".....Q.".....
[...]
0x815d4830 b6332281 ffffffff df342281 ffffffff .3"......4".....
Notice the df342281 ffffffff in the 0x815d4710 line.
Also notice that you have that in the 0x815d4830 line as well. This proves
that we have two copies of the same function in the data section.
[ 0.125980] ftrace failed to modify [<ffffffff812234de>]
acpi_thermal_notify+0x7/0xc1
[ 0.125980] actual: 0f:1f:44:00:00
acpi_thermal_notify+0x7 is at ffffffff812234de, and unfortunately the dump
is in the wrong endian, but we do see that ffffffff812234df is listed
twice. The recorded pointer is to the mcount relocation address, and the
printed one ffffffff812234de is to the call site, which is one less than
the relocation pointer.
Now the question is, how did that get listed twice?
Could you do a:
$ objdump -Dr -j __mcount_loc drivers/acpi/thermal.o
and perhaps we can find a duplicate there?
Thanks,
-- Steve
On Fri, 16 Jan 2009, Steven Rostedt wrote:
>
>
> Now the question is, how did that get listed twice?
>
> Could you do a:
>
> $ objdump -Dr -j __mcount_loc drivers/acpi/thermal.o
>
> and perhaps we can find a duplicate there?
>
Actually, if you can post that file, that might help too.
Thanks,
-- Steve
On Fri, 16 Jan 2009, Steven Rostedt wrote:
> >
> > $ objdump -Dr -j __mcount_loc drivers/acpi/thermal.o
> >
> > and perhaps we can find a duplicate there?
> >
>
> Actually, if you can post that file, that might help too.
The drivers/acpi/thermal.o file, just in case I was too ambiguous ;-)
-- Steve
On Fri, 16 Jan 2009, Eric Paris wrote:
> On Fri, 2009-01-16 at 19:47 -0500, Steven Rostedt wrote:
>
> > And perhaps, if possible, could I get a copy of the vmlinux that was
> > built, and a CONFIG to go with it.
>
> So I didn't see it when I rebuilt without any of my patch stack, I'm
> trying to build them back in to see if it was one of mine....
I'm looking at the dump of the vmlinux, and here's something that is
interesting:
0x815d45e0 19132281 ffffffff f9132281 ffffffff ..".......".....
0x815d45f0 51152281 ffffffff 7e152281 ffffffff Q.".....~.".....
0x815d4600 b0152281 ffffffff e2152281 ffffffff ..".......".....
0x815d4610 0b162281 ffffffff 42162281 ffffffff ..".....B.".....
0x815d4620 d8162281 ffffffff 33172281 ffffffff ..".....3.".....
0x815d4630 26182281 ffffffff 21192281 ffffffff &.".....!.".....
0x815d4640 091d2281 ffffffff 3f1d2281 ffffffff ..".....?.".....
0x815d4650 94232281 ffffffff 1d242281 ffffffff .#"......$".....
0x815d4660 c1242281 ffffffff 0d252281 ffffffff .$"......%".....
0x815d4670 56252281 ffffffff 1d262281 ffffffff V%"......&".....
0x815d4680 9f262281 ffffffff ef272281 ffffffff .&"......'".....
0x815d4690 06282281 ffffffff 2a282281 ffffffff .(".....*(".....
0x815d46a0 d82c2281 ffffffff fd2c2281 ffffffff .,"......,".....
0x815d46b0 342d2281 ffffffff 942d2281 ffffffff 4-"......-".....
0x815d46c0 f32f2281 ffffffff 5c302281 ffffffff ./".....\0".....
0x815d46d0 12312281 ffffffff 9f312281 ffffffff .1"......1".....
0x815d46e0 3d322281 ffffffff 62322281 ffffffff =2".....b2".....
0x815d46f0 87322281 ffffffff ac322281 ffffffff .2"......2".....
0x815d4700 de322281 ffffffff b6332281 ffffffff .2"......3".....
0x815d4710 df342281 ffffffff 51152281 ffffffff .4".....Q.".....
0x815d4720 7e152281 ffffffff b0152281 ffffffff ~.".......".....
0x815d4730 e2152281 ffffffff 0b162281 ffffffff ..".......".....
0x815d4740 42162281 ffffffff d8162281 ffffffff B.".......".....
0x815d4750 33172281 ffffffff 26182281 ffffffff 3.".....&.".....
0x815d4760 21192281 ffffffff 091d2281 ffffffff !.".......".....
0x815d4770 3f1d2281 ffffffff 94232281 ffffffff ?."......#".....
0x815d4780 1d242281 ffffffff c1242281 ffffffff .$"......$".....
0x815d4790 0d252281 ffffffff 56252281 ffffffff .%".....V%".....
0x815d47a0 1d262281 ffffffff 9f262281 ffffffff .&"......&".....
0x815d47b0 ef272281 ffffffff 06282281 ffffffff .'"......(".....
0x815d47c0 2a282281 ffffffff d82c2281 ffffffff *("......,".....
0x815d47d0 fd2c2281 ffffffff 342d2281 ffffffff .,".....4-".....
0x815d47e0 942d2281 ffffffff f32f2281 ffffffff .-"....../".....
0x815d47f0 5c302281 ffffffff 12312281 ffffffff \0"......1".....
0x815d4800 9f312281 ffffffff 3d322281 ffffffff .1".....=2".....
0x815d4810 62322281 ffffffff 87322281 ffffffff b2"......2".....
0x815d4820 ac322281 ffffffff de322281 ffffffff .2"......2".....
0x815d4830 b6332281 ffffffff df342281 ffffffff .3"......4".....
0x815d4840 9d352281 ffffffff c1352281 ffffffff .5"......5".....
The entries from 0x815d45f0 to 0x815d4710 are identical to the entries
from 0x815d4711 to 0x815d4831. Due to the history of the code, the entries
are added first into a link list, and then the addresses are modifed in
reverse order. Since b6332281 ffffffff was last (executed first) it was
the first one to see the change.
I'm betting that you had an old version of the thermal.o that had the
mcount_loc already in, and somehow had the recordmcount.o executed again
on it, creating a duplicate copy. This would cause such an error. By
rebuilding the whole tree, you fixed things up.
-- Steve
Em Fri, Jan 16, 2009 at 09:18:36PM -0500, Eric Paris escreveu:
> On Fri, 2009-01-16 at 19:47 -0500, Steven Rostedt wrote:
> > # mount -t debugfs none /sys/kernel/debug
> > # cat /sys/kernel/debug/tracing/dyn_ftrace_total_info
>
> [root@unused ~]# cat /sys/kernel/debug/tracing/dyn_ftrace_total_info
> 0 0 0
Rostedt, see?
/me now using /d/tracing/
;-)
- Arnaldo