Hello,
I'm encountering unexpected behaviour with OProfile when the profiled
system is under heavy load: "BUG: unable to handle kernel paging request
at 0000000000004cc3" (full console message is attached).
I'm using Linux kernel 2.6.32 with both OProfile 0.9.6 from the Debian
repositories and OProfile 0.9.7 from CVS, on an Intel Xeon E5440
(Harpertown).
It happens when I profile a webserver under heavy load, with some cores
disabled, and with the following event:
--event=CPU_CLK_UNHALTED:1600000000000
I also use --separate=all and --callgraph=3, and my kernel has debug
symbols and frame pointers enabled.
Does anyone have any idea what is happening?
Thanks,
Regards,
Sylvain

"Sylvain GENEVES" <[email protected]> writes:
> [...]
> I'm encountering unexpected behaviour with OProfile when the profiled
> system is under heavy load: "BUG: unable to handle kernel paging request
> at 0000000000004cc3" (full console message is attached).
> [...]
> Does anyone have any idea what is happening?
Just glancing at that oops & my local random kernel build, it appears
as though this part of arch/x86/kernel/time.c:profile_pc is failing:
unsigned long profile_pc(struct pt_regs *regs)
{
        unsigned long pc = instruction_pointer(regs);
        if (!user_mode_vm(regs) && in_lock_functions(pc)) {
#ifdef CONFIG_FRAME_POINTER
                return *(unsigned long *)(regs->bp + sizeof(long));
                       ^^^^^^^^^^^^^^^^^^
#else
[...]
regs->bp must have been 0x4cbb, which this code turns into an
unchecked dereference at 0x4cbb + 8 = 0x4cc3. I don't have a theory
as to why regs->bp should have that value in it, but the kernel
should probably use probe_kernel_read() or some such to validate the
value before dereferencing it.
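To make the arithmetic and the suggested check concrete, here is a minimal
user-space sketch. It is not the kernel fix itself. Everything named
regs_sketch, fake_stack, checked_read, return_slot and profile_pc_checked
is hypothetical: checked_read only imitates the idea behind
probe_kernel_read() (return an error instead of faulting), and falling
back to the sampled pc on failure is an assumption about what a checked
profile_pc might do, not something the kernel is known to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct pt_regs; only bp is modelled. */
struct regs_sketch {
    uintptr_t bp;
};

/* Simulated stack: the only memory this sketch treats as readable. */
unsigned long fake_stack[16];

/*
 * Hypothetical analogue of probe_kernel_read(): copy size bytes from src
 * if the whole range lies inside fake_stack, otherwise return -1 rather
 * than touching the address.
 */
int checked_read(void *dst, uintptr_t src, size_t size)
{
    uintptr_t lo = (uintptr_t)fake_stack;
    uintptr_t hi = lo + sizeof(fake_stack);

    assert(size <= sizeof(fake_stack));
    if (src < lo || src > hi - size)
        return -1;
    memcpy(dst, (const void *)src, size);
    return 0;
}

/* Address the CONFIG_FRAME_POINTER branch reads: bp + sizeof(long). */
uintptr_t return_slot(const struct regs_sketch *regs)
{
    return regs->bp + sizeof(long);
}

/*
 * Checked variant of the dereference. Assumption: when the slot is not
 * readable, report the sampled pc instead of following bp.
 */
unsigned long profile_pc_checked(const struct regs_sketch *regs,
                                 unsigned long pc)
{
    unsigned long ret;

    if (checked_read(&ret, return_slot(regs), sizeof(ret)) != 0)
        return pc;
    return ret;
}
```

With bp = 0x4cbb and an 8-byte long, return_slot() yields 0x4cc3, the
address from the oops, and profile_pc_checked() returns the pc it was
given instead of dereferencing that address.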
- FChE

On Fri, 2010-11-12 at 22:11 +0100, Sylvain GENEVES wrote:
> Does anyone have any idea what is happening?
Yeah, the oprofile code is terminally broken, it uses
__copy_from_user_inatomic() from NMI context.