Hi,
These days I was studying "perf code" and try to dump some debug
info in event_sched_out(): kernel/perf_event.c by printk, but found
system can hang easily.
After digging into related code for several days, I find the hang is
caused by spinlock deadlock when acquiring rq->lock in task_rq_lock(),
which is called in the path below:
try_to_wake_up<-wake_up_process<-up<-console_unlock<-printk.
I wonder if there are some usage conventions about printk, which says
explicitly that printk can not by used in some places of kernel. Or does the
above show that it is a bug of kernel?
If the above is really a limitation of printk, have we other dump functions
which may be used safely in some places of kernel source code where
printk is not allowed.
thanks,
--
Lei Ming
On Fri, 2011-03-25 at 20:15 +0800, Ming Lei wrote:
> Hi,
>
> These days I was studying "perf code" and try to dump some debug
> info in event_sched_out(): kernel/perf_event.c by printk, but found
> system can hang easily.
>
> After digging into related code for several days, I find the hang is
> caused by spinlock deadlock when acquiring rq->lock in task_rq_lock(),
> which is called in the path below:
>
> try_to_wake_up<-wake_up_process<-up<-console_unlock<-printk.
>
> I wonder if there are some usage conventions about printk, which says
> explicitly that printk can not by used in some places of kernel. Or does the
> above show that it is a bug of kernel?
>
> If the above is really a limitation of printk, have we other dump functions
> which may be used safely in some places of kernel source code where
> printk is not allowed.
Don't use printk from NMI context, that's never going to work well.
One thing you could to is use trace_printk() and use ftrace to debug
perf.
On Fri, 2011-03-25 at 13:21 +0100, Peter Zijlstra wrote:
> On Fri, 2011-03-25 at 20:15 +0800, Ming Lei wrote:
> > Hi,
> >
> > These days I was studying "perf code" and try to dump some debug
> > info in event_sched_out(): kernel/perf_event.c by printk, but found
> > system can hang easily.
> >
> > After digging into related code for several days, I find the hang is
> > caused by spinlock deadlock when acquiring rq->lock in task_rq_lock(),
> > which is called in the path below:
> >
> > try_to_wake_up<-wake_up_process<-up<-console_unlock<-printk.
> >
> > I wonder if there are some usage conventions about printk, which says
> > explicitly that printk can not by used in some places of kernel. Or does the
> > above show that it is a bug of kernel?
> >
> > If the above is really a limitation of printk, have we other dump functions
> > which may be used safely in some places of kernel source code where
> > printk is not allowed.
>
> Don't use printk from NMI context, that's never going to work well.
OK, so I probably should shut down the computer and go back to being
ill, the thing you mentioned isn't in fact NMI context, but yeah using
it from inside the scheduler isn't going to be fun either.
> One thing you could to is use trace_printk() and use ftrace to debug
> perf.
That should still work.
Hi,
2011/3/25 Peter Zijlstra <[email protected]>:
>> One thing you could to is use trace_printk() and use ftrace to debug
>> perf.
>
> That should still work.
I will try it, thanks.
But seems ftrace can't dump out the messages if system hangs, so could
we introduce a smart and simple "printk" which can be used in any
contexts?
thanks,
--
Lei Ming
On Fri, 2011-03-25 at 22:30 +0800, Ming Lei wrote:
>
> But seems ftrace can't dump out the messages if system hangs,
True.
> so could
> we introduce a smart and simple "printk" which can be used in any
> contexts?
I think you could use early_printk() for that, it doesn't bother with
all the syslog muck.
On Fri, Mar 25, 2011 at 01:26:00PM +0100, Peter Zijlstra wrote:
> On Fri, 2011-03-25 at 13:21 +0100, Peter Zijlstra wrote:
> > On Fri, 2011-03-25 at 20:15 +0800, Ming Lei wrote:
> > > Hi,
> > >
> > > These days I was studying "perf code" and try to dump some debug
> > > info in event_sched_out(): kernel/perf_event.c by printk, but found
> > > system can hang easily.
> > >
> > > After digging into related code for several days, I find the hang is
> > > caused by spinlock deadlock when acquiring rq->lock in task_rq_lock(),
> > > which is called in the path below:
> > >
> > > try_to_wake_up<-wake_up_process<-up<-console_unlock<-printk.
I've brought this up before, and basically was told "don't do that" ;)
> > >
> > > I wonder if there are some usage conventions about printk, which says
> > > explicitly that printk can not by used in some places of kernel. Or does the
> > > above show that it is a bug of kernel?
Just don't use it when the rq lock his held. Which you probably don't
want to anyway since most paths that require rq lock can flood the
console with printk messages.
>
> > One thing you could to is use trace_printk() and use ftrace to debug
> > perf.
>
> That should still work.
Yep, trace_printk() is the perfect thing to use to debug anything. I
tried hard to make it work in all contexts.
You can also set ftrace_dump_on_opps on the command line, or echo 1 into
/proc/sys/kernel/ftrace_dump_on_oops at run time if you want the ftrace
buffer to write to the console if the box crashes.
And finally, early_printk() works in these contexts too, as it doesn't
grab any locks. But because it does not, it can easily get corrupted
buffers.
-- Steve
On Fri, Mar 25, 2011 at 03:50:27PM +0100, Peter Zijlstra wrote:
> On Fri, 2011-03-25 at 22:30 +0800, Ming Lei wrote:
> >
> > But seems ftrace can't dump out the messages if system hangs,
>
> True.
Well, matters how the system hangs. Add in a nmi watchdog or other
watchdogs which can also trigger the ftrace_dump_on_oops.
>
> > so could
> > we introduce a smart and simple "printk" which can be used in any
> > contexts?
>
> I think you could use early_printk() for that, it doesn't bother with
> all the syslog muck.
>
Heh, I should have read the thread before replying to the first post ;)
-- Steve