With nmi_watchdog enabled, perf_event_nmi_handler always returns
NOTIFY_STOP (since active_events > 0), so the notifier call chain is
not walked any further.
If it was not a perf NMI, doesn't the perf nmi handler then prevent the
real NMI handler from being called, because NOTIFY_STOP is returned?
static int __kprobes
perf_event_nmi_handler(struct notifier_block *self,
			 unsigned long cmd, void *__args)
{
	struct die_args *args = __args;
	struct pt_regs *regs;

	if (!atomic_read(&active_events))  ===> With nmi_watchdog enabled, active_events > 0
		return NOTIFY_DONE;

	switch (cmd) {
	case DIE_NMI:
	case DIE_NMI_IPI:
		break;

	default:
		return NOTIFY_DONE;
	}

	regs = args->regs;

	apic_write(APIC_LVTPC, APIC_DM_NMI);
	/*
	 * Can't rely on the handled return value to say it was our NMI, two
	 * events could trigger 'simultaneously' raising two back-to-back NMIs.
	 *
	 * If the first NMI handles both, the latter will be empty and daze
	 * the CPU.
	 */
	x86_pmu.handle_irq(regs);

	return NOTIFY_STOP;
}
Thanks,
Lin Ming
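For reference, the behaviour being asked about comes from the notifier core:
the die chain walk stops as soon as one callback returns a value with
NOTIFY_STOP_MASK set, so every handler registered after perf's never gets
called for that NMI. A simplified sketch of that walk (not the exact
kernel/notifier.c code, which also deals with RCU and nr_to_call):

#include <linux/notifier.h>

static int notifier_chain_walk_sketch(struct notifier_block *nb,
				      unsigned long val, void *v)
{
	int ret = NOTIFY_DONE;

	while (nb) {
		ret = nb->notifier_call(nb, val, v);
		/* NOTIFY_STOP and NOTIFY_BAD both carry NOTIFY_STOP_MASK */
		if (ret & NOTIFY_STOP_MASK)
			break;		/* the rest of the chain never runs */
		nb = nb->next;
	}
	return ret;
}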
On Wed, 2010-08-04 at 17:21 +0800, Lin Ming wrote:
> With nmi_watchdog enabled, perf_event_nmi_handler always returns
> NOTIFY_STOP (since active_events > 0), so the notifier call chain is
> not walked any further.
>
> If it was not a perf NMI, doesn't the perf nmi handler then prevent the
> real NMI handler from being called, because NOTIFY_STOP is returned?
>
> static int __kprobes
> perf_event_nmi_handler(struct notifier_block *self,
> unsigned long cmd, void *__args)
> {
> struct die_args *args = __args;
> struct pt_regs *regs;
>
> if (!atomic_read(&active_events)) ===> With nmi_watchdog enabled, active_events > 0
> return NOTIFY_DONE;
>
> switch (cmd) {
> case DIE_NMI:
> case DIE_NMI_IPI:
> break;
>
> default:
> return NOTIFY_DONE;
> }
>
> regs = args->regs;
>
> apic_write(APIC_LVTPC, APIC_DM_NMI);
> /*
> * Can't rely on the handled return value to say it was our NMI, two
> * events could trigger 'simultaneously' raising two back-to-back NMIs.
> *
> * If the first NMI handles both, the latter will be empty and daze
> * the CPU.
> */
> x86_pmu.handle_irq(regs);
>
> return NOTIFY_STOP;
> }
Urgh.. right, so what is the alternative? We don't seem to have a
reliable way of telling where the NMI originated from.
As that comment says, the PMU can raise the NMI and raise the pending
NMI latch for a second over-run, at which point the first NMI will
likely see the overflow status for both, clear both, and the second NMI
will see a 0 overflow status and report that it wasn't the PMU; but since
the PMU did raise it, nobody else will claim it, and we get these silly
dazed-and-confused thingies.
What NMI source are you concerned about and can it reliably tell if it
raised the NMI or not?
On 04.08.10 05:21:10, Lin Ming wrote:
> With nmi_watchdog enabled, perf_event_nmi_handler always returns
> NOTIFY_STOP (since active_events > 0), so the notifier call chain is
> not walked any further.
>
> If it was not a perf NMI, doesn't the perf nmi handler then prevent the
> real NMI handler from being called, because NOTIFY_STOP is returned?
There is no general mechanism for recording the NMI source (except if
it was external triggered, e.g. by the southbridge). Also, all nmis
are mapped to NMI vector 2 and therefore there is no way to find out
the reason by using apic mask registers.
Now, if multiple perfctrs trigger an nmi, it may happen that a handler
has nothing to do because the counter was already handled by the
previous one. Thus, it is valid to have unhandled nmis caused by
perfctrs.
So, with counters enabled we always have to return stop for *all* nmis
as we cannot detect that it was an perfctr nmi. Otherwise we could
trigger an unhandled nmi. To ensure that all other nmi handlers are
called, the perfctr's nmi handler must have the lowest priority. Then,
the handler will be the last in the chain.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
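For reference, the ordering Robert describes is controlled by the .priority
field of the notifier_block passed to register_die_notifier(): entries are
kept sorted with higher priorities first, so "lowest priority" means
registering with a very low value. A sketch of what that would look like
(the _sketch names and the INT_MIN value are illustrative, not what
perf_event.c actually uses):

#include <linux/kdebug.h>
#include <linux/notifier.h>

static struct notifier_block perf_event_nmi_notifier_sketch = {
	.notifier_call	= perf_event_nmi_handler,
	.priority	= INT_MIN,	/* illustrative: run after every other NMI user */
};

static void __init perf_event_register_nmi_sketch(void)
{
	register_die_notifier(&perf_event_nmi_notifier_sketch);
}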
On Wed, 2010-08-04 at 12:01 +0200, Robert Richter wrote:
> To ensure that all other nmi handlers are
> called, the perfctr's nmi handler must have the lowest priority. Then,
> the handler will be the last in the chain.
Well, unless another NMI handler has the exact same issue and also
starts eating all NMIs, just in case.
On 04.08.10 06:24:18, Peter Zijlstra wrote:
> On Wed, 2010-08-04 at 12:01 +0200, Robert Richter wrote:
> > To ensure that all other nmi handlers are
> > called, the perfctr's nmi handler must have the lowest priority. Then,
> > the handler will be the last in the chain.
>
> Well, unless another NMI handler has the exact same issue and also
> starts eating all NMIs, just in case.
In this case we will have to change the implementation for unhandled
nmis. But I don't know of other sources with this issue.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Wed, Aug 04, 2010 at 05:21:10PM +0800, Lin Ming wrote:
> With nmi_watchdog enabled, perf_event_nmi_handler always returns
> NOTIFY_STOP (since active_events > 0), so the notifier call chain is
> not walked any further.
>
> If it was not a perf NMI, doesn't the perf nmi handler then prevent the
> real NMI handler from being called, because NOTIFY_STOP is returned?
Yes I sent a cheap and dirty patch to address this a couple of weeks ago
http://lkml.indiana.edu/hypermail//linux/kernel/1007.2/02590.html
Unfortunately, no one responded. :-( Of course, it could have been so gross
no one wanted to comment on it. :-)
Cheers,
Don
On Wed, Aug 04, 2010 at 12:01:16PM +0200, Robert Richter wrote:
> There is no general mechanism for recording the NMI source (except if
> it was external triggered, e.g. by the southbridge). Also, all nmis
> are mapped to NMI vector 2 and therefore there is no way to find out
> the reason by using apic mask registers.
This is no different than a shared interrupt, no? All the nmi handlers
need to check their own sources to see if they triggered it. You can't
expect the generic nmi handler to determine this.
>
> Now, if multiple perfctrs trigger an nmi, it may happen that a handler
> has nothing to do because the counter was already handled by the
> previous one. Thus, it is valid to have unhandled nmis caused by
> perfctrs.
>
> So, with counters enabled we always have to return stop for *all* nmis
> as we cannot detect that it was an perfctr nmi. Otherwise we could
> trigger an unhandled nmi. To ensure that all other nmi handlers are
> called, the perfctr's nmi handler must have the lowest priority. Then,
> the handler will be the last in the chain.
But the cases this breaks are external NMI buttons, broken firmware that
causes SERRs on the PCI bus, and any other general hardware failures.
So what the perf handler does is really unacceptable. The only reason we
are noticing this now is because I put the nmi_watchdog on top of the perf
subsystem, so it always has a user and will trigger NOTIFY_STOP. Before,
it never had a registered user so instead returned NOTIFY_DONE and
everything worked great.
Cheers,
Don
On Wed, 2010-08-04 at 10:00 -0400, Don Zickus wrote:
> On Wed, Aug 04, 2010 at 12:01:16PM +0200, Robert Richter wrote:
> > There is no general mechanism for recording the NMI source (except if
> > it was external triggered, e.g. by the southbridge). Also, all nmis
> > are mapped to NMI vector 2 and therefore there is no way to find out
> > the reason by using apic mask registers.
>
> This is no different than a shared interrupt, no? All the nmi handlers
> need to check their own sources to see if they triggered it. You can't
> expect the generic nmi handler to determine this.
Sure, but the problem is that the PMU can't reliably do that.
> > Now, if multiple perfctrs trigger an nmi, it may happen that a handler
> > has nothing to do because the counter was already handled by the
> > previous one. Thus, it is valid to have unhandled nmis caused by
> > perfctrs.
> >
> > So, with counters enabled we always have to return stop for *all* nmis
> > as we cannot detect that it was an perfctr nmi. Otherwise we could
> > trigger an unhandled nmi. To ensure that all other nmi handlers are
> > called, the perfctr's nmi handler must have the lowest priority. Then,
> > the handler will be the last in the chain.
>
> But the cases this breaks are external NMI buttons, broken firmware that
> causes SERRs on the PCI bus, and any other general hardware failures.
It breaks broken firmware? :-) and you care?
> So what the perf handler does is really unacceptable. The only reason we
> are noticing this now is because I put the nmi_watchdog on top of the perf
> subsystem, so it always has a user and will trigger NOTIFY_STOP. Before,
> it never had a registered user so instead returned NOTIFY_DONE and
> everything worked great.
Right so I looked up your thing and while that limits the damage in that
at some point it will let NMIs pass, it will still consume too many.
Meaning that Yinghai will have to potentially press his NMI button
several times before it registers.
On Wed, Aug 04, 2010 at 04:11:33PM +0200, Peter Zijlstra wrote:
> On Wed, 2010-08-04 at 10:00 -0400, Don Zickus wrote:
> > On Wed, Aug 04, 2010 at 12:01:16PM +0200, Robert Richter wrote:
> > > There is no general mechanism for recording the NMI source (except if
> > > it was external triggered, e.g. by the southbridge). Also, all nmis
> > > are mapped to NMI vector 2 and therefore there is no way to find out
> > > the reason by using apic mask registers.
> >
> > This is no different than a shared interrupt, no? All the nmi handlers
> > need to check their own sources to see if they triggered it. You can't
> > expect the generic nmi handler to determine this.
>
> Sure, but the problem is that the PMU can't reliably do that.
Right, but that is because there is no bit that says the PMU generated the
nmi. But for the most part checking to see if the PMU is >0 is good
enough, no?
>
> > > Now, if multiple perfctrs trigger an nmi, it may happen that a handler
> > > has nothing to do because the counter was already handled by the
> > > previous one. Thus, it is valid to have unhandled nmis caused by
> > > perfctrs.
> > >
> > > So, with counters enabled we always have to return stop for *all* nmis
> > > as we cannot detect that it was an perfctr nmi. Otherwise we could
> > > trigger an unhandled nmi. To ensure that all other nmi handlers are
> > > called, the perfctr's nmi handler must have the lowest priority. Then,
> > > the handler will be the last in the chain.
> >
> > But the cases this breaks are external NMI buttons, broken firmware that
> > causes SERRs on the PCI bus, and any other general hardware failures.
>
> It breaks broken firmware? :-) and you care?
Absolutely. When a customer complains they upgraded their RHEL kernel and
the box suddenly hangs on boot trying to access the storage device, yes I
care. Because a flood of NMIs would indicate something is fishy with
the firmware (in this case it was a network card though it hung on storage
access). Swallowing the NMIs would just cause everyone to waste weeks of
their time trying to figure it out (you don't want to know how many weeks
were wasted in RHEL-6 across multiple machines only to find out it was
broken firmware on a card that no one suspected as being the culprit).
As much as I hate broken firmware, it is becoming commonplace, and the
faster the kernel can point it out through unknown nmis, the faster we can
get the vendor involved to fix it.
>
> > So what the perf handler does is really unacceptable. The only reason we
> > are noticing this now is because I put the nmi_watchdog on top of the perf
> > subsystem, so it always has a user and will trigger NOTIFY_STOP. Before,
> > it never had a registered user so instead returned NOTIFY_DONE and
> > everything worked great.
>
> Right so I looked up your thing and while that limits the damage in that
> at some point it will let NMIs pass, it will still consume too many.
> Meaning that Yinghai will have to potentially press his NMI button
> several times before it registers.
Ok. Thanks for reviewing. How does it consume too many? I probably don't
understand how perf is being used in the non-simple scenarios.
Cheers,
Don
On Wed, 2010-08-04 at 10:52 -0400, Don Zickus wrote:
> > Right so I looked up your thing and while that limits the damage in that
> > at some point it will let NMIs pass, it will still consume too many.
> > Meaning that Yinghai will have to potentially press his NMI button
> > several times before it registers.
>
> Ok. Thanks for reviewing. How does it consume too many? I probably don't
> understand how perf is being used in the non-simple scenarios.
Suppose you have 4 counters (AMD, intel-nhm+). When more than 2 overflow,
the first will raise the PMI; if the other 2+ overflow before we disable
the PMU, it will try to raise 2+ more PMIs, but because the hardware only
has a single interrupt pending bit it will at most cause a single extra
interrupt after we finish servicing the first one.
So then the first interrupt will see 3+ overflows, return 3+, and will
thus eat 2+ NMIs, only one of which will be the pending interrupt,
leaving 1+ NMIs from other sources to be consumed unhandled.
In which case Yinghai will have to press his NMI button 2+ times before
it registers.
That said, that might be a better situation than always consuming
unknown NMIs..
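Put differently: however many counters were serviced in one NMI, the APIC
can have latched at most one extra PMI, so skipping "handled - 1" later
NMIs over-consumes. A tiny helper illustrating that accounting (hypothetical
name, for illustration only, not existing kernel code):

/*
 * Hypothetical helper: no matter how many counters overflowed while
 * the first NMI was being serviced, at most one follow-up NMI can be
 * pending in the hardware.
 */
static inline unsigned int extra_nmis_to_expect(unsigned int handled)
{
	return handled > 1 ? 1 : 0;
}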
On Wed, Aug 04, 2010 at 05:02:41PM +0200, Peter Zijlstra wrote:
> On Wed, 2010-08-04 at 10:52 -0400, Don Zickus wrote:
> > > Right so I looked up your thing and while that limits the damage in that
> > > at some point it will let NMIs pass, it will still consume too many.
> > > Meaning that Yinghai will have to potentially press his NMI button
> > > several times before it registers.
> >
> > Ok. Thanks for reviewing. How does it consume too many? I probably don't
> > understand how perf is being used in the non-simple scenarios.
>
> Suppose you have 4 counters (AMD, intel-nhm+), when more than 2 overflow
> the first will raise the PMI, if the other 2+ overflow before we disable
> the PMU it will try to raise 2+ more PMIs, but because hardware only has
> a single interrupt pending bit it will at most cause a single extra
> interrupt after we finish servicing the first one.
>
> So then the first interrupt will see 3+ overflows, return 3+, and will
> thus eat 2+ NMIs, only one of which will be the pending interrupt,
> leaving 1+ NMIs from other sources to consume unhandled.
>
> In which case Yinghai will have to press his NMI button 2+ times before
> it registers.
>
> That said, that might be a better situation than always consuming
> unknown NMIs..
>
Well, first I guess having Yinghai CC'ed is a bonus ;)
The second thing is that I don't get why the perf handler can't be the _last_
call in default_do_nmi; if there were any nmi with a reason (serr or parity)
I think they should be called first, which of course doesn't eliminate
the former issue but somewhat weakens it.
-- Cyrill
On Wed, Aug 04, 2010 at 05:02:41PM +0200, Peter Zijlstra wrote:
> So then the first interrupt will see 3+ overflows, return 3+, and will
> thus eat 2+ NMIs, only one of which will be the pending interrupt,
> leaving 1+ NMIs from other sources to consume unhandled.
>
> In which case Yinghai will have to press his NMI button 2+ times before
> it registers.
>
> That said, that might be a better situation than always consuming
> unknown NMIs..
Well, if the worst case scenario is only one extra NMI, then I can change
the logic in my patch to eat a max of 1 possible NMI instead of 2 as in the
example you gave above.
It still won't be 100% accurate but how often are people running perf
where they need 4 counters, have to hit an external nmi button or run into
broken firmware all at the same time? :-)
Cheers,
Don
On Wed, Aug 04, 2010 at 07:18:58PM +0400, Cyrill Gorcunov wrote:
> On Wed, Aug 04, 2010 at 05:02:41PM +0200, Peter Zijlstra wrote:
> > On Wed, 2010-08-04 at 10:52 -0400, Don Zickus wrote:
> > > > Right so I looked up your thing and while that limits the damage in that
> > > > at some point it will let NMIs pass, it will still consume too many.
> > > > Meaning that Yinghai will have to potentially press his NMI button
> > > > several times before it registers.
> > >
> > > Ok. Thanks for reviewing. How does it consume too many? I probably don't
> > > understand how perf is being used in the non-simple scenarios.
> >
> > Suppose you have 4 counters (AMD, intel-nhm+), when more than 2 overflow
> > the first will raise the PMI, if the other 2+ overflow before we disable
> > the PMU it will try to raise 2+ more PMIs, but because hardware only has
> > a single interrupt pending bit it will at most cause a single extra
> > interrupt after we finish servicing the first one.
> >
> > So then the first interrupt will see 3+ overflows, return 3+, and will
> > thus eat 2+ NMIs, only one of which will be the pending interrupt,
> > leaving 1+ NMIs from other sources to consume unhandled.
> >
> > In which case Yinghai will have to press his NMI button 2+ times before
> > it registers.
> >
> > That said, that might be a better situation than always consuming
> > unknown NMIs..
> >
>
> Well, first I guess having Yinghai CC'ed is a bonus ;)
> The second thing is that I don't get why perf handler can't be _last_
> call in default_do_nmi, if there were any nmi with reason (serr or parity)
> I think they should be calling first which of course don't eliminate
> the former issue but somewhat make it weaken.
Because the reason registers are never set. If they were, then the code
wouldn't have to walk the notify_chain. :-)
Unknown nmis are unknown nmis, nobody is claiming them. Even worse, there
are customers that want to register their nmi handler below the perf
handler to claim all the unknown nmis, so they can be logged before the
system is rebooted.
Cheers,
Don
On Wed, Aug 04, 2010 at 11:50:02AM -0400, Don Zickus wrote:
...
> >
> > Well, first I guess having Yinghai CC'ed is a bonus ;)
> > The second thing is that I don't get why perf handler can't be _last_
> > call in default_do_nmi, if there were any nmi with reason (serr or parity)
> > I think they should be calling first which of course don't eliminate
> > the former issue but somewhat make it weaken.
>
> Because the reason registers are never set. If they were, then the code
> wouldn't have to walk the notify_chain. :-)
>
Maybe we're talking about different things. I meant that if there is an NMI
with a reason (from port 0x61), the handling of such an NMI should be done
before notify_die, I think (unless I'm missing something).
> Unknown nmis are unknown nmis, nobody is claiming them. Even worse, there
> are customers that want to register their nmi handler below the perf
> handler to claim all the unknown nmis, so they can be logged on the system
> before being rebooted.
Well, perhaps we need some kind of perf_chain in the notifier code, called
after the die_nmi chain, so the customers you mention can add their own
handlers to be called last.
>
> Cheers,
> Don
>
-- Cyrill
On Wed, Aug 04, 2010 at 08:10:46PM +0400, Cyrill Gorcunov wrote:
> On Wed, Aug 04, 2010 at 11:50:02AM -0400, Don Zickus wrote:
> ...
> > >
> > > Well, first I guess having Yinghai CC'ed is a bonus ;)
> > > The second thing is that I don't get why perf handler can't be _last_
> > > call in default_do_nmi, if there were any nmi with reason (serr or parity)
> > > I think they should be calling first which of course don't eliminate
> > > the former issue but somewhat make it weaken.
> >
> > Because the reason registers are never set. If they were, then the code
> > wouldn't have to walk the notify_chain. :-)
> >
>
> maybe we're talking about different things. i meant that if there is nmi
> with a reason (from 0x61) the handling of such nmi should be done before
> notify_die I think (if only I not miss something behind).
No, we are talking about the same thing. :-) And that code is already
there. The problem is that the bits in register 0x61 are not always set
correctly in the case of SERRs (well, at least in all the cases I have
dealt with). So you can easily get a flood of unknown nmis from an SERR
while register 0x61 has the PERR/SERR bits set to 0. Fun, huh?
Cheers,
Don
On Wed, Aug 04, 2010 at 12:20:26PM -0400, Don Zickus wrote:
...
> > >
> > > Because the reason registers are never set. If they were, then the code
> > > wouldn't have to walk the notify_chain. :-)
> > >
> >
> > maybe we're talking about different things. i meant that if there is nmi
> > with a reason (from 0x61) the handling of such nmi should be done before
> > notify_die I think (if only I not miss something behind).
>
> No we are talking about the same thing. :-) And that code is already
Seems not, actually ;)
> there. The problem is the bits in register 0x61 are not always set
> correctly in the case of SERRs (well at least in all the cases I have
> dealt with). So you can easily can a flood of unknown nmis from an SERR
> and register 0x61 would have the PERR/SERR bits set to 0. Fun, huh?
If there is nothing in nmi_sc, the code flows into another branch, and
it hits the problem of perf events eating all NMIs, giving no chance to
the others. So we take the if (!(reason & 0xc0)) case and hit DIE_NMI_IPI
(/me scratching my head why it's not under CONFIG_X86_LOCAL_APIC) and
drop out of the rest of the code, which is unpleasant.
>
> Cheers,
> Don
>
-- Cyrill
(cc'ing Andi)
On 04.08.10 12:39:30, Cyrill Gorcunov wrote:
> On Wed, Aug 04, 2010 at 12:20:26PM -0400, Don Zickus wrote:
> > there. The problem is the bits in register 0x61 are not always set
> > correctly in the case of SERRs (well at least in all the cases I have
> > dealt with). So you can easily can a flood of unknown nmis from an SERR
> > and register 0x61 would have the PERR/SERR bits set to 0. Fun, huh?
>
> if there is nothing in nmi_sc the code flows into another branch. And
> it hits the problem of perf events eating all nmi giving no chance the
> others. So we take if (!(reason & 0xc0)) case and hit DIE_NMI_IPI
> (/me scratching the head why it's not under CONFIG_X86_LOCAL_APIC) and
> drop all code, unpleasant.
Only the upper 2 bits in io_61h indicate the nmi reason, so in case of
(!(reason & 0xc0)) the source simply can not be determined and all nmi
handlers in the chain must be called (DIE_NMI/DIE_NMI_IPI). The
perfctr handler then stops it.
So you can decide to either get an unrecovered nmi panic triggered by
a perfctr or losing unknown nmis from other sources. Maybe this can be
fixed by implementing handlers for those sources.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
> Only the upper 2 bits in io_61h indicate the nmi reason, so in case of
> (!(reason & 0xc0)) the source simply can not be determined and all nmi
> handlers in the chain must be called (DIE_NMI/DIE_NMI_IPI). The
> perfctr handler then stops it.
>
> So you can decide to either get an unrecovered nmi panic triggered by
> a perfctr or losing unknown nmis from other sources. Maybe this can be
> fixed by implementing handlers for those sources.
This is a tricky area. Ying and I have been looking at this recently.
Hardware traditionally signals an NMI when it has an uncontained error and
really expects the OS to shut down to prevent data corruption from
spreading. Unfortunately, especially for some older hardware,
there can be cases where this is not expressed in port 61.
But the default behaviour of Linux for this today is quite wrong.
Some cases can also be determined with the help of APEI, which
can give you more information about the error (and tell you
if shutdown is needed).
But of course we can still have performance counter and other NMI
users.
So the right flow might be something like this (a rough sketch in code follows the list):
- check software events (like crash dump or reboot)
- check perfctrs
- check APEI
- check port 61 for known events (it's probably a good idea
to check perfctrs first because accessing io ports is quite slow.
But the perfctr handler has to make sure it doesn't eat unknown
events, otherwise error handling would be impacted)
- check other event sources
- shutdown (depending on the chipset likely)
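Roughly, in code (all helper names below are invented, standing in for the
checks listed above, not existing kernel functions):

static void default_do_nmi_sketch(struct pt_regs *regs)
{
	if (nmi_handled_by_software_events(regs))	/* crash dump, reboot */
		return;
	if (nmi_handled_by_perfctrs(regs))		/* must not over-claim */
		return;
	if (nmi_handled_by_apei(regs))			/* platform error records */
		return;
	if (nmi_handled_by_port61(regs))		/* SERR/IOCHK reasons */
		return;
	if (nmi_handled_by_other_sources(regs))
		return;
	handle_unknown_nmi(regs);			/* likely shutdown, chipset dependent */
}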
This means the NMI users who cannot determine themselves if an event
happened and eat everything (like oprofile today) would need to be fixed.
-Andi
--
[email protected] -- Speaking for myself only.
On Wed, Aug 04, 2010 at 08:48:06PM +0200, Robert Richter wrote:
> (cc'ing Andi)
>
> On 04.08.10 12:39:30, Cyrill Gorcunov wrote:
> > On Wed, Aug 04, 2010 at 12:20:26PM -0400, Don Zickus wrote:
>
> > > there. The problem is the bits in register 0x61 are not always set
> > > correctly in the case of SERRs (well at least in all the cases I have
> > > dealt with). So you can easily can a flood of unknown nmis from an SERR
> > > and register 0x61 would have the PERR/SERR bits set to 0. Fun, huh?
> >
> > if there is nothing in nmi_sc the code flows into another branch. And
> > it hits the problem of perf events eating all nmi giving no chance the
> > others. So we take if (!(reason & 0xc0)) case and hit DIE_NMI_IPI
> > (/me scratching the head why it's not under CONFIG_X86_LOCAL_APIC) and
> > drop all code, unpleasant.
>
> Only the upper 2 bits in io_61h indicate the nmi reason, so in case of
> (!(reason & 0xc0)) the source simply can not be determined and all nmi
> handlers in the chain must be called (DIE_NMI/DIE_NMI_IPI). The
> perfctr handler then stops it.
Yes, that is what I meant by the nmi_sc register. I think we need to
restructure the current default_do_nmi handler, but what to do about perf
I don't know at the moment: if a perf counter gets overflowed (ie already
has a pending nmi) but we handle it in an earlier nmi cycle, this would
lead to strange results. Need to think.
>
> So you can decide to either get an unrecovered nmi panic triggered by
> a perfctr or losing unknown nmis from other sources. Maybe this can be
> fixed by implementing handlers for those sources.
>
> -Robert
>
> --
> Advanced Micro Devices, Inc.
> Operating System Research Center
>
-- Cyrill
On 04.08.10 15:26:34, Cyrill Gorcunov wrote:
> yes, that is what I meant by nmi_sc register. I think we need to restucturize
> current default_do_nmi handler but how to be with perfs I don't know at moment
> if perf register gets overflowed (ie already has pedning nmi) but we handle
> it in early nmi cycle this would lead to strange results. Need to think.
>
> >
> > So you can decide to either get an unrecovered nmi panic triggered by
> > a perfctr or losing unknown nmis from other sources. Maybe this can be
> > fixed by implementing handlers for those sources.
I was playing around with it yesterday trying to fix this. My idea is
to skip an unknown nmi if the previous nmi was a *handled* perfctr
nmi. I will probably post an rfc patch early next week.
Another problem I encountered is that unknown nmis from the chipset
are not reenabled, thus when hitting the nmi button I only see one
unknown nmi message per boot; if I reenable it, I get an nmi
storm firing the nmi_watchdog. Uhh....
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Fri, Aug 06, 2010 at 08:52:03AM +0200, Robert Richter wrote:
> On 04.08.10 15:26:34, Cyrill Gorcunov wrote:
>
> > yes, that is what I meant by nmi_sc register. I think we need to restucturize
> > current default_do_nmi handler but how to be with perfs I don't know at moment
> > if perf register gets overflowed (ie already has pedning nmi) but we handle
> > it in early nmi cycle this would lead to strange results. Need to think.
> >
> > >
> > > So you can decide to either get an unrecovered nmi panic triggered by
> > > a perfctr or losing unknown nmis from other sources. Maybe this can be
> > > fixed by implementing handlers for those sources.
>
> I was playing around with it yesterday trying to fix this. My idea is
> to skip an unkown nmi if the privious nmi was a *handled* perfctr
You might want to add a little more logic that says *handled* _and_ had
more than one perfctr trigger. Most of the time only one perfctr is
probably triggering, so you might be eating unknown_nmi's needlessly.
Just a thought.
> nmi. I will probably post an rfc patch early next week.
>
> Another problem I encountered is that unknown nmis from the chipset
> are not reenabled, thus when hitting the nmi button I only see one
> unknown nmi message per boot, if I reenable it, I get an nmi
> storm firing nmi_watchdog. Uhh....
Interesting.
Cheers,
Don
Don Zickus <[email protected]> writes:
> On Wed, Aug 04, 2010 at 08:10:46PM +0400, Cyrill Gorcunov wrote:
>> On Wed, Aug 04, 2010 at 11:50:02AM -0400, Don Zickus wrote:
>> ...
>> > >
>> > > Well, first I guess having Yinghai CC'ed is a bonus ;)
>> > > The second thing is that I don't get why perf handler can't be _last_
>> > > call in default_do_nmi, if there were any nmi with reason (serr or parity)
>> > > I think they should be calling first which of course don't eliminate
>> > > the former issue but somewhat make it weaken.
>> >
>> > Because the reason registers are never set. If they were, then the code
>> > wouldn't have to walk the notify_chain. :-)
>> >
>>
>> maybe we're talking about different things. i meant that if there is nmi
>> with a reason (from 0x61) the handling of such nmi should be done before
>> notify_die I think (if only I not miss something behind).
>
> No we are talking about the same thing. :-) And that code is already
> there. The problem is the bits in register 0x61 are not always set
> correctly in the case of SERRs (well at least in all the cases I have
> dealt with). So you can easily can a flood of unknown nmis from an SERR
> and register 0x61 would have the PERR/SERR bits set to 0. Fun, huh?
Some of this can be handled by APEI on newer systems (if the platform
supports that).
But not all unfortunately if you consider older systems.
-Andi
--
[email protected] -- Speaking for myself only.
Peter Zijlstra <[email protected]> writes:
>
> Suppose you have 4 counters (AMD, intel-nhm+), when more than 2 overflow
> the first will raise the PMI, if the other 2+ overflow before we disable
> the PMU it will try to raise 2+ more PMIs, but because hardware only has
> a single interrupt pending bit it will at most cause a single extra
> interrupt after we finish servicing the first one.
>
> So then the first interrupt will see 3+ overflows, return 3+, and will
> thus eat 2+ NMIs, only one of which will be the pending interrupt,
> leaving 1+ NMIs from other sources to consume unhandled.
>
> In which case Yinghai will have to press his NMI button 2+ times before
> it registers.
>
> That said, that might be a better situation than always consuming
> unknown NMIs..
One alternative would be to stop using NMIs for perf counters in common cases.
If you have PEBS and your events support PEBS then PEBS can give you a
lot of information inside the irq off region. That works for common
events at least.
Also, traditionally interrupt-off regions are shrinking in Linux,
so is it really still worth all the trouble just to profile inside them?
E.g. one could make nmi profiling an option that defaults to off.
-Andi
--
[email protected] -- Speaking for myself only.
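For reference, on the user-space side "use PEBS" maps to the precise_ip
bits in perf_event_attr; a minimal sketch of requesting a precise (PEBS)
event, assuming the chosen event supports it (illustrative, not part of
the discussion above):

#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

static int open_precise_event(pid_t pid, int cpu)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_INSTRUCTIONS;	/* typically PEBS-capable on Intel */
	attr.sample_period = 100000;
	attr.precise_ip = 2;	/* ask for PEBS-assisted, zero-skid samples */

	return syscall(__NR_perf_event_open, &attr, pid, cpu, -1, 0);
}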
On 06.08.10 10:21:31, Don Zickus wrote:
> On Fri, Aug 06, 2010 at 08:52:03AM +0200, Robert Richter wrote:
> > I was playing around with it yesterday trying to fix this. My idea is
> > to skip an unkown nmi if the privious nmi was a *handled* perfctr
>
> You might want to add a little more logic that says *handled* _and_ had
> more than one perfctr trigger. Most of the time only one perfctr is
> probably triggering, so you might be eating unknown_nmi's needlessly.
>
> Just a thought.
Yes, that's true. It could be implemented on top of the patch below.
>
> > nmi. I will probably post an rfc patch early next week.
Here it comes:
From d2739578199d881ae6a9537c1b96a0efd1cdea43 Mon Sep 17 00:00:00 2001
From: Robert Richter <[email protected]>
Date: Thu, 5 Aug 2010 16:19:59 +0200
Subject: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs
When perfctrs are running it is valid to have unhandled nmis, two
events could trigger 'simultaneously' raising two back-to-back
NMIs. If the first NMI handles both, the latter will be empty and daze
the CPU.
The solution to avoid an 'unknown nmi' message in this case was simply
to stop the nmi handler chain when perfctrs are running by stating
the nmi was handled. This has the drawback that a) we cannot detect
unknown nmis anymore, and b) subsequent nmi handlers are not called.
This patch addresses this. Now, we drop this unknown NMI only if the
previous NMI was handling a perfctr. Otherwise we pass it and let the
kernel handle the unknown nmi. The check runs only if no nmi handler
could handle the nmi (DIE_NMIUNKNOWN case).
We could improve this further by checking if perf was handling more
than one counter. Otherwise we may pass the unknown nmi too.
Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/kernel/cpu/perf_event.c | 39 +++++++++++++++++++++++++++++--------
1 files changed, 30 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index f2da20f..c3cd159 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1200,12 +1200,16 @@ void perf_events_lapic_init(void)
apic_write(APIC_LVTPC, APIC_DM_NMI);
}
+static DEFINE_PER_CPU(unsigned int, perfctr_handled);
+
static int __kprobes
perf_event_nmi_handler(struct notifier_block *self,
unsigned long cmd, void *__args)
{
struct die_args *args = __args;
struct pt_regs *regs;
+ unsigned int this_nmi;
+ unsigned int prev_nmi;
if (!atomic_read(&active_events))
return NOTIFY_DONE;
@@ -1214,7 +1218,26 @@ perf_event_nmi_handler(struct notifier_block *self,
case DIE_NMI:
case DIE_NMI_IPI:
break;
-
+ case DIE_NMIUNKNOWN:
+ /*
+ * This one could be our NMI, two events could trigger
+ * 'simultaneously' raising two back-to-back NMIs. If
+ * the first NMI handles both, the latter will be
+ * empty and daze the CPU.
+ *
+ * So, we drop this unknown NMI if the previous NMI
+ * was handling a perfctr. Otherwise we pass it and
+ * let the kernel handle the unknown nmi.
+ *
+ * Note: this could be improved if we drop unknown
+ * NMIs only if we handled more than one perfctr in
+ * the previous NMI.
+ */
+ this_nmi = percpu_read(irq_stat.__nmi_count);
+ prev_nmi = __get_cpu_var(perfctr_handled);
+ if (this_nmi == prev_nmi + 1)
+ return NOTIFY_STOP;
+ return NOTIFY_DONE;
default:
return NOTIFY_DONE;
}
@@ -1222,14 +1245,12 @@ perf_event_nmi_handler(struct notifier_block *self,
regs = args->regs;
apic_write(APIC_LVTPC, APIC_DM_NMI);
- /*
- * Can't rely on the handled return value to say it was our NMI, two
- * events could trigger 'simultaneously' raising two back-to-back NMIs.
- *
- * If the first NMI handles both, the latter will be empty and daze
- * the CPU.
- */
- x86_pmu.handle_irq(regs);
+
+ if (!x86_pmu.handle_irq(regs))
+ return NOTIFY_DONE;
+
+ /* handled */
+ __get_cpu_var(perfctr_handled) = percpu_read(irq_stat.__nmi_count);
return NOTIFY_STOP;
}
--
1.7.1.1
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Mon, Aug 09, 2010 at 09:48:29PM +0200, Robert Richter wrote:
> On 06.08.10 10:21:31, Don Zickus wrote:
> > On Fri, Aug 06, 2010 at 08:52:03AM +0200, Robert Richter wrote:
>
> > > I was playing around with it yesterday trying to fix this. My idea is
> > > to skip an unkown nmi if the privious nmi was a *handled* perfctr
> >
> > You might want to add a little more logic that says *handled* _and_ had
> > more than one perfctr trigger. Most of the time only one perfctr is
> > probably triggering, so you might be eating unknown_nmi's needlessly.
> >
> > Just a thought.
>
> Yes, that's true. It could be implemented on top of the patch below.
>
> >
> > > nmi. I will probably post an rfc patch early next week.
>
> Here it comes:
>
Thanks Robert! Looks good to me, one nit below.
> From d2739578199d881ae6a9537c1b96a0efd1cdea43 Mon Sep 17 00:00:00 2001
> From: Robert Richter <[email protected]>
> Date: Thu, 5 Aug 2010 16:19:59 +0200
> Subject: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs
>
...
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index f2da20f..c3cd159 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -1200,12 +1200,16 @@ void perf_events_lapic_init(void)
> apic_write(APIC_LVTPC, APIC_DM_NMI);
> }
>
> +static DEFINE_PER_CPU(unsigned int, perfctr_handled);
> +
> static int __kprobes
> perf_event_nmi_handler(struct notifier_block *self,
> unsigned long cmd, void *__args)
> {
> struct die_args *args = __args;
> struct pt_regs *regs;
> + unsigned int this_nmi;
> + unsigned int prev_nmi;
>
> if (!atomic_read(&active_events))
> return NOTIFY_DONE;
> @@ -1214,7 +1218,26 @@ perf_event_nmi_handler(struct notifier_block *self,
> case DIE_NMI:
> case DIE_NMI_IPI:
> break;
> -
> + case DIE_NMIUNKNOWN:
> + /*
> + * This one could be our NMI, two events could trigger
> + * 'simultaneously' raising two back-to-back NMIs. If
> + * the first NMI handles both, the latter will be
> + * empty and daze the CPU.
> + *
> + * So, we drop this unknown NMI if the previous NMI
> + * was handling a perfctr. Otherwise we pass it and
> + * let the kernel handle the unknown nmi.
> + *
> + * Note: this could be improved if we drop unknown
> + * NMIs only if we handled more than one perfctr in
> + * the previous NMI.
> + */
> + this_nmi = percpu_read(irq_stat.__nmi_count);
> + prev_nmi = __get_cpu_var(perfctr_handled);
> + if (this_nmi == prev_nmi + 1)
> + return NOTIFY_STOP;
> + return NOTIFY_DONE;
> default:
> return NOTIFY_DONE;
> }
> @@ -1222,14 +1245,12 @@ perf_event_nmi_handler(struct notifier_block *self,
> regs = args->regs;
>
> apic_write(APIC_LVTPC, APIC_DM_NMI);
Unless I'm missing something, this apic_write should move up to the
"case DIE_NMIUNKNOWN" site, no?
-- Cyrill
On 09.08.10 16:02:45, Cyrill Gorcunov wrote:
> > @@ -1222,14 +1245,12 @@ perf_event_nmi_handler(struct notifier_block *self,
> > regs = args->regs;
> >
> > apic_write(APIC_LVTPC, APIC_DM_NMI);
>
> If only I'm not missing something this apic_write should go up to
> "case DIE_NMIUNKNOWN" site, no?
This seems to be code from the non-nmi implementation and can be
removed entirely, which should be a separate patch. The vector is
already set up.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Tue, Aug 10, 2010 at 09:42:00AM +0200, Robert Richter wrote:
> On 09.08.10 16:02:45, Cyrill Gorcunov wrote:
>
> > > @@ -1222,14 +1245,12 @@ perf_event_nmi_handler(struct notifier_block *self,
> > > regs = args->regs;
> > >
> > > apic_write(APIC_LVTPC, APIC_DM_NMI);
> >
> > If only I'm not missing something this apic_write should go up to
> > "case DIE_NMIUNKNOWN" site, no?
>
> This seems to be code from the non-nmi implementation and can be
> removed at all, which should be a separate patch. The vector is
> already set up.
>
> -Robert
>
No, this is just a short way to unmask LVTPC (which is required for
some cpus). Actually, looking into this snippet I found that in the p4 pmu
I've made one redundant unmasking operation. Will update once
this area settles down.
-- Cyrill
On 10.08.10 12:16:27, Cyrill Gorcunov wrote:
> On Tue, Aug 10, 2010 at 09:42:00AM +0200, Robert Richter wrote:
> > On 09.08.10 16:02:45, Cyrill Gorcunov wrote:
> >
> > > > @@ -1222,14 +1245,12 @@ perf_event_nmi_handler(struct notifier_block *self,
> > > > regs = args->regs;
> > > >
> > > > apic_write(APIC_LVTPC, APIC_DM_NMI);
> > >
> > > If only I'm not missing something this apic_write should go up to
> > > "case DIE_NMIUNKNOWN" site, no?
> >
> > This seems to be code from the non-nmi implementation and can be
> > removed at all, which should be a separate patch. The vector is
> > already set up.
> >
> > -Robert
> >
>
> No, this is just a short way to unmask LVTPC (which is required for
> cpus). Actually lookig into this snippet I found that in p4 pmu
> I've made one redundant unmaksing operation. will update as only
> this area settle down.
The vector is set up in hw_perf_enable() and then never masked. The
perfctr nmi is always enabled from then on. I still see no reason for
unmasking it again with every nmi. Once you handle the nmi it is also
enabled.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Tue, Aug 10, 2010 at 06:41:24PM +0200, Robert Richter wrote:
> On 10.08.10 12:16:27, Cyrill Gorcunov wrote:
> > On Tue, Aug 10, 2010 at 09:42:00AM +0200, Robert Richter wrote:
> > > On 09.08.10 16:02:45, Cyrill Gorcunov wrote:
> > >
> > > > > @@ -1222,14 +1245,12 @@ perf_event_nmi_handler(struct notifier_block *self,
> > > > > regs = args->regs;
> > > > >
> > > > > apic_write(APIC_LVTPC, APIC_DM_NMI);
> > > >
> > > > If only I'm not missing something this apic_write should go up to
> > > > "case DIE_NMIUNKNOWN" site, no?
> > >
> > > This seems to be code from the non-nmi implementation and can be
> > > removed at all, which should be a separate patch. The vector is
> > > already set up.
> > >
> > > -Robert
> > >
> >
> > No, this is just a short way to unmask LVTPC (which is required for
> > cpus). Actually lookig into this snippet I found that in p4 pmu
> > I've made one redundant unmaksing operation. will update as only
> > this area settle down.
>
> The vector is setup in hw_perf_enable() and then never masked. The
> perfctrs nmi is alwayes enabled since then. I still see no reason for
> unmasking it again with every nmi. Once you handle the nmi it is also
> enabled.
>
It gets masked on NMI arrival, at least for some models (Core Duo, P4,
P6 M, and I suspect more than that; that was the reason oprofile has
it, and there is also a note in SDM V3a page 643).
> -Robert
>
> --
> Advanced Micro Devices, Inc.
> Operating System Research Center
>
-- Cyrill
On 10.08.10 13:24:51, Cyrill Gorcunov wrote:
> It gets masked on NMI arrival, at least for some models (Core Duo, P4,
> P6 M and I suspect more theh that, that was the reason oprofile has
> it, also there is a note in SDM V3a page 643).
Yes, that's right, I never noticed that. Maybe it is better to
implement the apic write in model-specific code then.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Tue, Aug 10, 2010 at 09:05:41PM +0200, Robert Richter wrote:
> On 10.08.10 13:24:51, Cyrill Gorcunov wrote:
>
> > It gets masked on NMI arrival, at least for some models (Core Duo, P4,
> > P6 M and I suspect more theh that, that was the reason oprofile has
> > it, also there is a note in SDM V3a page 643).
>
> Yes, that's right, I never noticed that. Maybe it is better to
> implement the apic write it in model specific code then.
>
Perhaps we can make it simpler, I think, ie like it was before -- we just
move it under your new DIE_NMIUNKNOWN, in a separate patch of course. Though
I'm fine with either way.
(Actually it's interesting to know whether we would leave the LVT masked when
we hit the 'second delayed nmi has arrived' situation; I guess we haven't
hit it for real yet :-)
> -Robert
>
> --
> Advanced Micro Devices, Inc.
> Operating System Research Center
>
-- Cyrill
On Mon, Aug 09, 2010 at 09:48:29PM +0200, Robert Richter wrote:
> On 06.08.10 10:21:31, Don Zickus wrote:
> > On Fri, Aug 06, 2010 at 08:52:03AM +0200, Robert Richter wrote:
>
> > > I was playing around with it yesterday trying to fix this. My idea is
> > > to skip an unkown nmi if the privious nmi was a *handled* perfctr
> >
> > You might want to add a little more logic that says *handled* _and_ had
> > more than one perfctr trigger. Most of the time only one perfctr is
> > probably triggering, so you might be eating unknown_nmi's needlessly.
> >
> > Just a thought.
>
> Yes, that's true. It could be implemented on top of the patch below.
I did, but the changes basically revert the bulk of your patch.
>
> >
> > > nmi. I will probably post an rfc patch early next week.
>
> Here it comes:
>
> From d2739578199d881ae6a9537c1b96a0efd1cdea43 Mon Sep 17 00:00:00 2001
> From: Robert Richter <[email protected]>
> Date: Thu, 5 Aug 2010 16:19:59 +0200
> Subject: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs
On top of Robert's patch:
(compile tested only because I don't have a fancy button to trigger
unknown nmis)
From 548cf5148f47618854a0eff22b1d55db71b6f8fc Mon Sep 17 00:00:00 2001
From: Don Zickus <[email protected]>
Date: Tue, 10 Aug 2010 16:40:03 -0400
Subject: [PATCH] perf, x86: only skip NMIs when multiple perfctrs trigger
A small optimization on top of Robert's patch that limits the
skipping of NMI's to cases where we detect multiple perfctr events
have happened.
Signed-off-by: Don Zickus <[email protected]>
---
arch/x86/kernel/cpu/perf_event.c | 34 ++++++++++++++++++++--------------
1 files changed, 20 insertions(+), 14 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index c3cd159..066046d 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1154,7 +1154,7 @@ static int x86_pmu_handle_irq(struct pt_regs *regs)
/*
* event overflow
*/
- handled = 1;
+ handled += 1;
data.period = event->hw.last_period;
if (!x86_perf_event_set_period(event))
@@ -1200,7 +1200,7 @@ void perf_events_lapic_init(void)
apic_write(APIC_LVTPC, APIC_DM_NMI);
}
-static DEFINE_PER_CPU(unsigned int, perfctr_handled);
+static DEFINE_PER_CPU(unsigned int, perfctr_skip);
static int __kprobes
perf_event_nmi_handler(struct notifier_block *self,
@@ -1208,8 +1208,7 @@ perf_event_nmi_handler(struct notifier_block *self,
{
struct die_args *args = __args;
struct pt_regs *regs;
- unsigned int this_nmi;
- unsigned int prev_nmi;
+ int handled = 0;
if (!atomic_read(&active_events))
return NOTIFY_DONE;
@@ -1229,14 +1228,11 @@ perf_event_nmi_handler(struct notifier_block *self,
* was handling a perfctr. Otherwise we pass it and
* let the kernel handle the unknown nmi.
*
- * Note: this could be improved if we drop unknown
- * NMIs only if we handled more than one perfctr in
- * the previous NMI.
*/
- this_nmi = percpu_read(irq_stat.__nmi_count);
- prev_nmi = __get_cpu_var(perfctr_handled);
- if (this_nmi == prev_nmi + 1)
+ if (__get_cpu_var(perfctr_skip)){
+ __get_cpu_var(perfctr_skip) -=1;
return NOTIFY_STOP;
+ }
return NOTIFY_DONE;
default:
return NOTIFY_DONE;
@@ -1246,11 +1242,21 @@ perf_event_nmi_handler(struct notifier_block *self,
apic_write(APIC_LVTPC, APIC_DM_NMI);
- if (!x86_pmu.handle_irq(regs))
+ handled = x86_pmu.handle_irq(regs);
+ if (!handled)
+ /* not our NMI */
return NOTIFY_DONE;
-
- /* handled */
- __get_cpu_var(perfctr_handled) = percpu_read(irq_stat.__nmi_count);
+ else if (handled > 1)
+ /*
+ * More than one perfctr triggered. This could have
+ * caused a second NMI that we must now skip because
+ * we have already handled it. Remember it.
+ *
+ * NOTE: We have no way of knowing if a second NMI was
+ * actually triggered, so we may accidentally skip a valid
+ * unknown nmi later.
+ */
+ __get_cpu_var(perfctr_skip) +=1;
return NOTIFY_STOP;
}
--
1.7.2
On Tue, Aug 10, 2010 at 04:48:56PM -0400, Don Zickus wrote:
> On Mon, Aug 09, 2010 at 09:48:29PM +0200, Robert Richter wrote:
> > On 06.08.10 10:21:31, Don Zickus wrote:
> > > On Fri, Aug 06, 2010 at 08:52:03AM +0200, Robert Richter wrote:
> >
> > > > I was playing around with it yesterday trying to fix this. My idea is
> > > > to skip an unkown nmi if the privious nmi was a *handled* perfctr
> > >
> > > You might want to add a little more logic that says *handled* _and_ had
> > > more than one perfctr trigger. Most of the time only one perfctr is
> > > probably triggering, so you might be eating unknown_nmi's needlessly.
> > >
> > > Just a thought.
> >
> > Yes, that's true. It could be implemented on top of the patch below.
>
> I did, but the changes basically revert the bulk of your patch.
>
> >
> > >
> > > > nmi. I will probably post an rfc patch early next week.
> >
> > Here it comes:
> >
> > From d2739578199d881ae6a9537c1b96a0efd1cdea43 Mon Sep 17 00:00:00 2001
> > From: Robert Richter <[email protected]>
> > Date: Thu, 5 Aug 2010 16:19:59 +0200
> > Subject: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs
>
> On top of Robert's patch:
> (compiled tested only because I don't have a fancy button to trigger
> unknown nmis)
>
> From 548cf5148f47618854a0eff22b1d55db71b6f8fc Mon Sep 17 00:00:00 2001
> From: Don Zickus <[email protected]>
> Date: Tue, 10 Aug 2010 16:40:03 -0400
> Subject: [PATCH] perf, x86: only skip NMIs when multiple perfctrs trigger
>
> A small optimization on top of Robert's patch that limits the
> skipping of NMI's to cases where we detect multiple perfctr events
> have happened.
Yeah, I think that's more reasonable. This further lowers the chances of
losing important hardware errors.
One comment though:
>
> Signed-off-by: Don Zickus <[email protected]>
>
> ---
> arch/x86/kernel/cpu/perf_event.c | 34 ++++++++++++++++++++--------------
> 1 files changed, 20 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index c3cd159..066046d 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -1154,7 +1154,7 @@ static int x86_pmu_handle_irq(struct pt_regs *regs)
> /*
> * event overflow
> */
> - handled = 1;
> + handled += 1;
> data.period = event->hw.last_period;
>
> if (!x86_perf_event_set_period(event))
> @@ -1200,7 +1200,7 @@ void perf_events_lapic_init(void)
> apic_write(APIC_LVTPC, APIC_DM_NMI);
> }
>
> -static DEFINE_PER_CPU(unsigned int, perfctr_handled);
> +static DEFINE_PER_CPU(unsigned int, perfctr_skip);
>
> static int __kprobes
> perf_event_nmi_handler(struct notifier_block *self,
> @@ -1208,8 +1208,7 @@ perf_event_nmi_handler(struct notifier_block *self,
> {
> struct die_args *args = __args;
> struct pt_regs *regs;
> - unsigned int this_nmi;
> - unsigned int prev_nmi;
> + int handled = 0;
>
> if (!atomic_read(&active_events))
> return NOTIFY_DONE;
> @@ -1229,14 +1228,11 @@ perf_event_nmi_handler(struct notifier_block *self,
> * was handling a perfctr. Otherwise we pass it and
> * let the kernel handle the unknown nmi.
> *
> - * Note: this could be improved if we drop unknown
> - * NMIs only if we handled more than one perfctr in
> - * the previous NMI.
> */
> - this_nmi = percpu_read(irq_stat.__nmi_count);
> - prev_nmi = __get_cpu_var(perfctr_handled);
> - if (this_nmi == prev_nmi + 1)
> + if (__get_cpu_var(perfctr_skip)){
> + __get_cpu_var(perfctr_skip) -=1;
> return NOTIFY_STOP;
> + }
> return NOTIFY_DONE;
> default:
> return NOTIFY_DONE;
> @@ -1246,11 +1242,21 @@ perf_event_nmi_handler(struct notifier_block *self,
>
> apic_write(APIC_LVTPC, APIC_DM_NMI);
>
> - if (!x86_pmu.handle_irq(regs))
> + handled = x86_pmu.handle_irq(regs);
> + if (!handled)
> + /* not our NMI */
> return NOTIFY_DONE;
> -
> - /* handled */
> - __get_cpu_var(perfctr_handled) = percpu_read(irq_stat.__nmi_count);
> + else if (handled > 1)
> + /*
> + * More than one perfctr triggered. This could have
> + * caused a second NMI that we must now skip because
> + * we have already handled it. Remember it.
> + *
> + * NOTE: We have no way of knowing if a second NMI was
> + * actually triggered, so we may accidentally skip a valid
> + * unknown nmi later.
> + */
> + __get_cpu_var(perfctr_skip) +=1;
Maybe make it just a pending bit. I mean, not something that can
go beyond 1, because you can't have more than 1 pending anyway. I don't
know how you could get an accidental perfctr_skip > 1, maybe expected
pending NMIs that somehow don't happen, but better to be paranoid with
that, as it's about trying not to miss hardware errors.
Thanks.
On Wed, 2010-08-11 at 04:48 +0800, Don Zickus wrote:
> > From d2739578199d881ae6a9537c1b96a0efd1cdea43 Mon Sep 17 00:00:00 2001
> > From: Robert Richter <[email protected]>
> > Date: Thu, 5 Aug 2010 16:19:59 +0200
> > Subject: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs
>
> On top of Robert's patch:
> (compiled tested only because I don't have a fancy button to trigger
> unknown nmis)
You can trigger unknown NMIs via apic->send_IPI_mask(cpu_mask,
NMI_VECTOR).
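For example, a throwaway debug hook along these lines could be used to
self-inject one (a sketch, not existing kernel code; NMI_VECTOR and
apic->send_IPI_mask() are the real kernel symbols, the function around
them is made up):

#include <linux/smp.h>
#include <asm/apic.h>
#include <asm/irq_vectors.h>

/*
 * Debug-only sketch: self-inject an NMI IPI so the unknown-NMI path
 * can be exercised without an external NMI button.
 */
static void trigger_unknown_nmi_on_self(void)
{
	preempt_disable();
	apic->send_IPI_mask(cpumask_of(smp_processor_id()), NMI_VECTOR);
	preempt_enable();
}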
How about an algorithm as follows:
int perf_event_nmi_handler()
{
	...
	switch (cmd) {
	case DIE_NMIUNKNOWN:
		if (per_cpu(perfctr_prev_handled) > 1
		    && rdtsc() - per_cpu(perfctr_handled_timestamp) < 1000)
			return NOTIFY_STOP;
		else
			return NOTIFY_DONE;
	}
	...
	handled = x86_pmu.handle_irq(regs);
	per_cpu(perfctr_prev_handled) = per_cpu(perfctr_handled);
	per_cpu(perfctr_handled) = handled;
	if (handled) {
		per_cpu(perfctr_handled_timestamp) = rdtsc();
		return NOTIFY_STOP;
	} else
		return NOTIFY_DONE;
}
Best Regards,
Huang Ying
On 10.08.10 22:44:55, Frederic Weisbecker wrote:
> On Tue, Aug 10, 2010 at 04:48:56PM -0400, Don Zickus wrote:
> > @@ -1200,7 +1200,7 @@ void perf_events_lapic_init(void)
> > apic_write(APIC_LVTPC, APIC_DM_NMI);
> > }
> >
> > -static DEFINE_PER_CPU(unsigned int, perfctr_handled);
> > +static DEFINE_PER_CPU(unsigned int, perfctr_skip);
Yes, using perfctr_skip is easier to understand ...
> > @@ -1229,14 +1228,11 @@ perf_event_nmi_handler(struct notifier_block *self,
> > * was handling a perfctr. Otherwise we pass it and
> > * let the kernel handle the unknown nmi.
> > *
> > - * Note: this could be improved if we drop unknown
> > - * NMIs only if we handled more than one perfctr in
> > - * the previous NMI.
> > */
> > - this_nmi = percpu_read(irq_stat.__nmi_count);
> > - prev_nmi = __get_cpu_var(perfctr_handled);
> > - if (this_nmi == prev_nmi + 1)
> > + if (__get_cpu_var(perfctr_skip)){
> > + __get_cpu_var(perfctr_skip) -=1;
> > return NOTIFY_STOP;
> > + }
> > return NOTIFY_DONE;
> > default:
> > return NOTIFY_DONE;
> > @@ -1246,11 +1242,21 @@ perf_event_nmi_handler(struct notifier_block *self,
> >
> > apic_write(APIC_LVTPC, APIC_DM_NMI);
> >
> > - if (!x86_pmu.handle_irq(regs))
> > + handled = x86_pmu.handle_irq(regs);
> > + if (!handled)
> > + /* not our NMI */
> > return NOTIFY_DONE;
> > -
> > - /* handled */
> > - __get_cpu_var(perfctr_handled) = percpu_read(irq_stat.__nmi_count);
> > + else if (handled > 1)
> > + /*
> > + * More than one perfctr triggered. This could have
> > + * caused a second NMI that we must now skip because
> > + * we have already handled it. Remember it.
> > + *
> > + * NOTE: We have no way of knowing if a second NMI was
> > + * actually triggered, so we may accidentally skip a valid
> > + * unknown nmi later.
> > + */
> > + __get_cpu_var(perfctr_skip) +=1;
... but this will not work. You have to mark the *absolute* nmi number
here. If you only raise a flag, the next unknown nmi will be dropped,
whenever it happens. Because in between there could have been other nmis
that stopped the chain and thus the 'unknown' path is not executed. The
trick in my patch is that you *know* which nmi you want to skip.
I will send an updated version of my patch.
-Robert
>
>
>
> May be make it just a pending bit. I mean not something that can
> go further 1, because you can't have more than 1 pending anyway. I don't
> know how that could happen you get accidental perctr_skip > 1, may be
> expected pending NMIs that don't happen somehow, but better be paranoid with
> that, as it's about trying not to miss hardware errors.
>
> Thanks.
>
>
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Wed, Aug 11, 2010 at 11:19:39AM +0800, Huang Ying wrote:
> On Wed, 2010-08-11 at 04:48 +0800, Don Zickus wrote:
> > > From d2739578199d881ae6a9537c1b96a0efd1cdea43 Mon Sep 17 00:00:00 2001
> > > From: Robert Richter <[email protected]>
> > > Date: Thu, 5 Aug 2010 16:19:59 +0200
> > > Subject: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs
> >
> > On top of Robert's patch:
> > (compiled tested only because I don't have a fancy button to trigger
> > unknown nmis)
>
> You can trigger unknown NMIs via apic->send_IPI_mask(cpu_mask,
> NMI_VECTOR).
>
> How about the algorithm as follow:
Heh, I thought about the following too, just couldn't figure out an easy
way to timestamp. I forgot about the rdtsc(). :-)
The only thing that might screw this up would be an SMI which takes longer
than 1000 cycles, but that should be rare and would probably be related to the
unknown NMI anyway. Also under virt that would probably break (due to
time jumping) but until they emulate the perfctr, it won't matter. :-)
Cheers,
Don
On Wed, Aug 11, 2010 at 04:44:55AM +0200, Frederic Weisbecker wrote:
> May be make it just a pending bit. I mean not something that can
> go further 1, because you can't have more than 1 pending anyway. I don't
> know how that could happen you get accidental perctr_skip > 1, may be
> expected pending NMIs that don't happen somehow, but better be paranoid with
> that, as it's about trying not to miss hardware errors.
I guess I was thinking about the SMI case where it drains the perfctr(s)
and then retriggers them but I guess even in that case the most you can
have is one extra NMI. So yeah, you are probably right, I should have
used a flag instead of incrementing.
Cheers,
Don
On Wed, Aug 11, 2010 at 01:10:46PM +0200, Robert Richter wrote:
> > > + *
> > > + * NOTE: We have no way of knowing if a second NMI was
> > > + * actually triggered, so we may accidentally skip a valid
> > > + * unknown nmi later.
> > > + */
> > > + __get_cpu_var(perfctr_skip) +=1;
>
> ... but this will not work. You have to mark the *absolute* nmi number
> here. If you only raise a flag, the next unknown nmi will be dropped,
> whenever it happens. Because, in between there could have been other
> nmis that stopped the chain and thus the 'unknown' path is not
> executed. The trick in my patch is that you *know* which nmi you want
> to skip.
I guess I am confused. The way I read your patch was that you assumed the
next NMI would be the one you skip and if there was another NMI in between
the handled one and the one to skip, you would not skip it (nmi count !=
prev + 1) and it would produce an accidental unknown nmi.
I tried to change that with my patch by setting the skip flag which would
be drained on the next unknown nmi, independent of where it is in the NMI
backlog of NMIs.
Did I misread something?
Cheers,
Don
On 11.08.10 08:44:43, Don Zickus wrote:
> On Wed, Aug 11, 2010 at 01:10:46PM +0200, Robert Richter wrote:
> > > > + *
> > > > + * NOTE: We have no way of knowing if a second NMI was
> > > > + * actually triggered, so we may accidentally skip a valid
> > > > + * unknown nmi later.
> > > > + */
> > > > + __get_cpu_var(perfctr_skip) +=1;
> >
> > ... but this will not work. You have to mark the *absolute* nmi number
> > here. If you only raise a flag, the next unknown nmi will be dropped,
> > whenever it happens. Because, in between there could have been other
> > nmis that stopped the chain and thus the 'unknown' path is not
> > executed. The trick in my patch is that you *know* which nmi you want
> > to skip.
>
> I guess I am confused. The way I read your patch was that you assumed the
> next NMI would be the one you skip and if there was another NMI in between
> the handled one and the one to skip, you would not skip it (nmi count !=
> prev + 1) and it would produce an accidental unknown nmi.
That's how it works.
> I tried to change that with my patch by setting the skip flag which would
> be drained on the next unknown nmi, independent of where it is in the NMI
> backlog of NMIs.
"setting the skip flag which would be drained on the next unknown nmi"
That's what is wrong: it drops the next unknown nmi no matter when it
is detected. In between there could be thousands of valid other nmis
handled. You could even have returned from nmi mode. But still, the
next unknown nmi will be dropped. Your patch could also accumulate the
number of unknown nmis to skip, and then, if 'real' unknown nmis
happen, all of them will be dropped.
-Robert
>
> Did I misread something?
>
> Cheers,
> Don
>
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Wed, Aug 11, 2010 at 04:03:07PM +0200, Robert Richter wrote:
> On 11.08.10 08:44:43, Don Zickus wrote:
> > On Wed, Aug 11, 2010 at 01:10:46PM +0200, Robert Richter wrote:
> > > > > + *
> > > > > + * NOTE: We have no way of knowing if a second NMI was
> > > > > + * actually triggered, so we may accidentally skip a valid
> > > > > + * unknown nmi later.
> > > > > + */
> > > > > + __get_cpu_var(perfctr_skip) +=1;
> > >
> > > ... but this will not work. You have to mark the *absolute* nmi number
> > > here. If you only raise a flag, the next unknown nmi will be dropped,
> > > whenever it happens. Because, in between there could have been other
> > > nmis that stopped the chain and thus the 'unknown' path is not
> > > executed. The trick in my patch is that you *know* which nmi you want
> > > to skip.
> >
> > I guess I am confused. The way I read your patch was that you assumed the
> > next NMI would be the one you skip and if there was another NMI in between
> > the handled one and the one to skip, you would not skip it (nmi count !=
> > prev + 1) and it would produce an accidental unknown nmi.
>
> That's how it works.
>
> > I tried to change that with my patch by setting the skip flag which would
> > be drained on the next unknown nmi, independent of where it is in the NMI
> > backlog of NMIs.
>
> "setting the skip flag which would be drained on the next unknown nmi"
>
> That's what is wrong, it drops every unknown nmi, no matter when it is
Well, as Frederic pointed out, the skip variable will never go past one, so
it will only drop at most one unknown nmi.
> detected. In between there could be 1000's of valid other nmis
> handled. You even could have been returned from nmi mode. But still,
> the next unknown nmi will be dropped. Your patch could accumulate also
That was the intent. Can we guarantee that in the rare cases where the
perfctr is generating two nmis, they will be back-to-back?
I think Huang tried to cap my approach even further by creating a time
window in which the two nmis had to happen. That gives us the flexibility
to handle nmis that are not back-to-back, while still dealing with the case
where two perfctrs fired but we are unsure whether they generated a second
nmi and we falsely set the skip flag.
Cheers,
Don
I was debugging this a little more, see version 2 below.
-Robert
--
>From 8bb831af56d118b85fc38e0ddc2e516f7504b9fb Mon Sep 17 00:00:00 2001
From: Robert Richter <[email protected]>
Date: Thu, 5 Aug 2010 16:19:59 +0200
Subject: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs
When perfctrs are running it is valid to have unhandled nmis, two
events could trigger 'simultaneously' raising two back-to-back
NMIs. If the first NMI handles both, the latter will be empty and daze
the CPU.
The solution to avoid an 'unknown nmi' message in this case was simply
to stop the nmi handler chain when perfctrs are running by stating
the nmi was handled. This has the drawback that a) we can not detect
unknown nmis anymore, and b) subsequent nmi handlers are not called.
This patch addresses this. Now, we check whether this unknown NMI could
be a perfctr back-to-back NMI. Otherwise we pass it and let the kernel
handle the unknown nmi.
This is a debug log:
cpu #6, nmi #32333, skip_nmi #32330, handled = 1, time = 1934364430
cpu #6, nmi #32334, skip_nmi #32330, handled = 1, time = 1934704616
cpu #6, nmi #32335, skip_nmi #32336, handled = 2, time = 1936032320
cpu #6, nmi #32336, skip_nmi #32336, handled = 0, time = 1936034139
cpu #6, nmi #32337, skip_nmi #32336, handled = 1, time = 1936120100
cpu #6, nmi #32338, skip_nmi #32336, handled = 1, time = 1936404607
cpu #6, nmi #32339, skip_nmi #32336, handled = 1, time = 1937983416
cpu #6, nmi #32340, skip_nmi #32341, handled = 2, time = 1938201032
cpu #6, nmi #32341, skip_nmi #32341, handled = 0, time = 1938202830
cpu #6, nmi #32342, skip_nmi #32341, handled = 1, time = 1938443743
cpu #6, nmi #32343, skip_nmi #32341, handled = 1, time = 1939956552
cpu #6, nmi #32344, skip_nmi #32341, handled = 1, time = 1940073224
cpu #6, nmi #32345, skip_nmi #32341, handled = 1, time = 1940485677
cpu #6, nmi #32346, skip_nmi #32347, handled = 2, time = 1941947772
cpu #6, nmi #32347, skip_nmi #32347, handled = 1, time = 1941949818
cpu #6, nmi #32348, skip_nmi #32347, handled = 0, time = 1941951591
Uhhuh. NMI received for unknown reason 00 on CPU 6.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue
Deltas:
nmi #32334 340186
nmi #32335 1327704
nmi #32336 1819 <<<< back-to-back nmi [1]
nmi #32337 85961
nmi #32338 284507
nmi #32339 1578809
nmi #32340 217616
nmi #32341 1798 <<<< back-to-back nmi [2]
nmi #32342 240913
nmi #32343 1512809
nmi #32344 116672
nmi #32345 412453
nmi #32346 1462095 <<<< 1st nmi (standard) handling 2 counters
nmi #32347 2046 <<<< 2nd nmi (back-to-back) handling one counter
nmi #32348 1773 <<<< 3rd nmi (back-to-back) handling no counter! [3]
For back-to-back nmi detection there are the following rules:
The perfctr nmi handler was handling more than one counter and no
counter was handled in the subsequent nmi (see [1] and [2] above).
There is another case where there are two subsequent back-to-back nmis
[3]. In this case we measure the time between the first and the
2nd. The 2nd is detected as back-to-back because the first handled
more than one counter. The time between the 1st and the 2nd is used to
calculate a range for which we assume a back-to-back nmi. Now, when the
3rd nmi triggers, we measure the time delta again and compare it with
the first delta, which we know came from a back-to-back nmi. If the 3rd
nmi is within the range, it is also a back-to-back nmi and we drop it.
Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/kernel/cpu/perf_event.c | 96 +++++++++++++++++++++++++++++++++----
1 files changed, 85 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index f2da20f..b79a235 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1154,7 +1154,7 @@ static int x86_pmu_handle_irq(struct pt_regs *regs)
/*
* event overflow
*/
- handled = 1;
+ handled++;
data.period = event->hw.last_period;
if (!x86_perf_event_set_period(event))
@@ -1200,12 +1200,24 @@ void perf_events_lapic_init(void)
apic_write(APIC_LVTPC, APIC_DM_NMI);
}
+struct perfctr_nmi {
+ u64 timestamp;
+ u64 range;
+ unsigned int marked;
+ int handled;
+};
+
+static DEFINE_PER_CPU(struct perfctr_nmi, nmi);
+
static int __kprobes
perf_event_nmi_handler(struct notifier_block *self,
unsigned long cmd, void *__args)
{
struct die_args *args = __args;
- struct pt_regs *regs;
+ unsigned int this_nmi;
+ int handled;
+ u64 timestamp;
+ u64 delta;
if (!atomic_read(&active_events))
return NOTIFY_DONE;
@@ -1214,22 +1226,84 @@ perf_event_nmi_handler(struct notifier_block *self,
case DIE_NMI:
case DIE_NMI_IPI:
break;
+ case DIE_NMIUNKNOWN:
+ this_nmi = percpu_read(irq_stat.__nmi_count);
+ if (this_nmi != __get_cpu_var(nmi).marked)
+ return NOTIFY_DONE;
+
+ /*
+ * This one could be our NMI, two events could trigger
+ * 'simultaneously' raising two back-to-back NMIs. If
+ * the first NMI handles both, the latter will be
+ * empty and daze the CPU.
+ *
+ * So, we check this unknown NMI and drop it if it is
+ * a perfctr back-to-back nmi. Otherwise we pass it
+ * and let the kernel handle the unknown nmi.
+ */
+
+ if (!cpu_has_tsc)
+ /*
+ * no timestamps available, we cannot detect
+ * back-to-back nmis, drop it
+ */
+ return NOTIFY_STOP;
+
+ if (__get_cpu_var(nmi).handled > 1)
+ /*
+ * we have handled more than one counter,
+ * this must be a back-to-back nmi
+ */
+ return NOTIFY_STOP;
+
+ rdtscll(delta);
+ delta -= __get_cpu_var(nmi).timestamp;
+ if (delta < __get_cpu_var(nmi).range)
+ /* the delta is small, it is a back-to-back nmi */
+ return NOTIFY_STOP;
+
+ /* not a perfctr back-to-back nmi, let it pass */
+ return NOTIFY_DONE;
default:
return NOTIFY_DONE;
}
- regs = args->regs;
+ this_nmi = percpu_read(irq_stat.__nmi_count);
apic_write(APIC_LVTPC, APIC_DM_NMI);
- /*
- * Can't rely on the handled return value to say it was our NMI, two
- * events could trigger 'simultaneously' raising two back-to-back NMIs.
- *
- * If the first NMI handles both, the latter will be empty and daze
- * the CPU.
- */
- x86_pmu.handle_irq(regs);
+
+ handled = x86_pmu.handle_irq(args->regs);
+
+ if (!handled)
+ return NOTIFY_DONE;
+
+ if ((handled == 1) && (__get_cpu_var(nmi).marked != this_nmi))
+ /* this may not trigger back-to-back nmis */
+ return NOTIFY_STOP;
+
+ if (!cpu_has_tsc)
+ goto mark_next_nmi;
+
+ rdtscll(timestamp);
+ if (__get_cpu_var(nmi).marked == this_nmi) {
+ /*
+ * this was a back-to-back nmi, calculate back-to-back
+ * time delta and define the back-to-back range that
+ * is twice the delta
+ */
+ delta = timestamp;
+ delta -= __get_cpu_var(nmi).timestamp;
+
+ __get_cpu_var(nmi).range = delta << 1;
+ }
+
+ __get_cpu_var(nmi).timestamp = timestamp;
+ __get_cpu_var(nmi).handled = handled;
+
+ mark_next_nmi:
+ /* the next nmi could be a back-to-back nmi */
+ __get_cpu_var(nmi).marked = this_nmi + 1;
return NOTIFY_STOP;
}
--
1.7.1.1
--
Advanced Micro Devices, Inc.
Operating System Research Center
On 12.08.10 00:00:58, Robert Richter wrote:
> I was debugging this a little more, see version 2 below.
I was testing the patch further; it properly filters perfctr
back-to-back nmis. I was able to reliably detect unknown nmis
triggered by the nmi button during high-load perf sessions with
multiple counters, with no false positives.
>
> -Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
On 10.08.10 15:24:28, Cyrill Gorcunov wrote:
> On Tue, Aug 10, 2010 at 09:05:41PM +0200, Robert Richter wrote:
> > On 10.08.10 13:24:51, Cyrill Gorcunov wrote:
> >
> > > It gets masked on NMI arrival, at least for some models (Core Duo, P4,
> > > P6 M and I suspect more than that, that was the reason oprofile has
> > > it, also there is a note in SDM V3a page 643).
> >
> > Yes, that's right, I never noticed that. Maybe it is better to
> > implement the apic write in model-specific code then.
> >
>
> Perhaps we can make it simpler I think, i.e. like it was before -- we just
> move it under your new DIE_NMIUNKNOWN, in separate patch of course. Though
> I'm fine with either way.
I do not understand why you want to put this in the 'unknown'
path. Isn't it necessary to unmask the vector with every call of the
nmi handler?
-Robert
>
> (actually it's interesting to know wouldn't we leave lvt masked when
> we hit 'second delayed nmi has arrived' situation, I guess we didn't
> hit it before in real yet :-)
>
> > -Robert
> >
> > --
> > Advanced Micro Devices, Inc.
> > Operating System Research Center
> >
> -- Cyrill
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Thu, Aug 12, 2010 at 12:00:58AM +0200, Robert Richter wrote:
> I was debugging this a little more, see version 2 below.
>
> -Robert
>
> --
>
> From 8bb831af56d118b85fc38e0ddc2e516f7504b9fb Mon Sep 17 00:00:00 2001
> From: Robert Richter <[email protected]>
> Date: Thu, 5 Aug 2010 16:19:59 +0200
> Subject: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs
>
> When perfctrs are running it is valid to have unhandled nmis, two
> events could trigger 'simultaneously' raising two back-to-back
> NMIs. If the first NMI handles both, the latter will be empty and daze
> the CPU.
>
> The solution to avoid an 'unknown nmi' message in this case was simply
> to stop the nmi handler chain when perfctrs are running by stating
> the nmi was handled. This has the drawback that a) we can not detect
> unknown nmis anymore, and b) subsequent nmi handlers are not called.
>
> This patch addresses this. Now, we check whether this unknown NMI could
> be a perfctr back-to-back NMI. Otherwise we pass it and let the kernel
> handle the unknown nmi.
Seems to cover my concerns. Great work!
ACK
Cheers,
Don
On Thu, Aug 12, 2010 at 03:24:19PM +0200, Robert Richter wrote:
> On 10.08.10 15:24:28, Cyrill Gorcunov wrote:
> > On Tue, Aug 10, 2010 at 09:05:41PM +0200, Robert Richter wrote:
> > > On 10.08.10 13:24:51, Cyrill Gorcunov wrote:
> > >
> > > > It gets masked on NMI arrival, at least for some models (Core Duo, P4,
> > > > P6 M and I suspect more than that, that was the reason oprofile has
> > > > it, also there is a note in SDM V3a page 643).
> > >
> > > Yes, that's right, I never noticed that. Maybe it is better to
> > > implement the apic write in model-specific code then.
> > >
> >
> > Perhaps we can make it simpler I think, i.e. like it was before -- we just
> > move it under your new DIE_NMIUNKNOWN, in separate patch of course. Though
> > I'm fine with either way.
>
> I do not understand why you want to put this in the 'unknown'
> path. Isn't it necessary to unmask the vector with every call of the
> nmi handler?
>
> -Robert
>
Heh, it's simple - I'm screwed. Robert, you're right, of course it should NOT
be under every 'unknown' nmi. I thought about a small optimization here, but
I think all this should be done only _after_ your patch is merged.
Sorry for the confusion ;)
-- Cyrill
On Thu, Aug 12, 2010 at 03:10:07PM +0200, Robert Richter wrote:
> On 12.08.10 00:00:58, Robert Richter wrote:
> > I was debugging this a little more, see version 2 below.
>
> I was testing the patch further, it properly filters perfctr
> back-to-back nmis. I was able to reliably detect unknown nmis
> triggered by the nmi button during high load perf sessions with
> multiple counters, no false positives.
For my own curiosity, what type of high-load perf sessions are you using
to test this? I don't know perf well enough to have it generate events
across multiple perfctrs.
Thanks,
Don
On Thu, Aug 12, 2010 at 12:00:58AM +0200, Robert Richter wrote:
> I was debugging this a little more, see version 2 below.
>
> -Robert
>
> --
>
> From 8bb831af56d118b85fc38e0ddc2e516f7504b9fb Mon Sep 17 00:00:00 2001
> From: Robert Richter <[email protected]>
> Date: Thu, 5 Aug 2010 16:19:59 +0200
> Subject: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs
>
> When perfctrs are running it is valid to have unhandled nmis, two
> events could trigger 'simultaneously' raising two back-to-back
> NMIs. If the first NMI handles both, the latter will be empty and daze
> the CPU.
>
> The solution to avoid an 'unknown nmi' message in this case was simply
> to stop the nmi handler chain when perfctrs are running by stating
> the nmi was handled. This has the drawback that a) we can not detect
> unknown nmis anymore, and b) subsequent nmi handlers are not called.
>
> This patch addresses this. Now, we check whether this unknown NMI could
> be a perfctr back-to-back NMI. Otherwise we pass it and let the kernel
> handle the unknown nmi.
>
> This is a debug log:
>
> cpu #6, nmi #32333, skip_nmi #32330, handled = 1, time = 1934364430
> cpu #6, nmi #32334, skip_nmi #32330, handled = 1, time = 1934704616
> cpu #6, nmi #32335, skip_nmi #32336, handled = 2, time = 1936032320
> cpu #6, nmi #32336, skip_nmi #32336, handled = 0, time = 1936034139
> cpu #6, nmi #32337, skip_nmi #32336, handled = 1, time = 1936120100
> cpu #6, nmi #32338, skip_nmi #32336, handled = 1, time = 1936404607
> cpu #6, nmi #32339, skip_nmi #32336, handled = 1, time = 1937983416
> cpu #6, nmi #32340, skip_nmi #32341, handled = 2, time = 1938201032
> cpu #6, nmi #32341, skip_nmi #32341, handled = 0, time = 1938202830
> cpu #6, nmi #32342, skip_nmi #32341, handled = 1, time = 1938443743
> cpu #6, nmi #32343, skip_nmi #32341, handled = 1, time = 1939956552
> cpu #6, nmi #32344, skip_nmi #32341, handled = 1, time = 1940073224
> cpu #6, nmi #32345, skip_nmi #32341, handled = 1, time = 1940485677
> cpu #6, nmi #32346, skip_nmi #32347, handled = 2, time = 1941947772
> cpu #6, nmi #32347, skip_nmi #32347, handled = 1, time = 1941949818
> cpu #6, nmi #32348, skip_nmi #32347, handled = 0, time = 1941951591
> Uhhuh. NMI received for unknown reason 00 on CPU 6.
> Do you have a strange power saving mode enabled?
> Dazed and confused, but trying to continue
>
> Deltas:
>
> nmi #32334 340186
> nmi #32335 1327704
> nmi #32336 1819 <<<< back-to-back nmi [1]
> nmi #32337 85961
> nmi #32338 284507
> nmi #32339 1578809
> nmi #32340 217616
> nmi #32341 1798 <<<< back-to-back nmi [2]
> nmi #32342 240913
> nmi #32343 1512809
> nmi #32344 116672
> nmi #32345 412453
> nmi #32346 1462095 <<<< 1st nmi (standard) handling 2 counters
> nmi #32347 2046 <<<< 2nd nmi (back-to-back) handling one counter
> nmi #32348 1773 <<<< 3rd nmi (back-to-back) handling no counter! [3]
>
> For back-to-back nmi detection there are the following rules:
>
> The perfctr nmi handler was handling more than one counter and no
> counter was handled in the subsequent nmi (see [1] and [2] above).
>
> There is another case if there are two subsequent back-to-back nmis
> [3]. In this case we measure the time between the first and the
> 2nd. The 2nd is detected as back-to-back because the first handled
> more than one counter. The time between the 1st and the 2nd is used to
> calculate a range for which we assume a back-to-back nmi. Now, the 3rd
> nmi triggers, we measure again the time delta and compare it with the
> first delta from which we know it was a back-to-back nmi. If the 3rd
> nmi is within the range, it is also a back-to-back nmi and we drop it.
>
> Signed-off-by: Robert Richter <[email protected]>
> ---
That time based thing looks a bit complicated.
I'm still not sure why you don't want to use a simple flag:
After handling a perf NMI:
if (handled more than one counter)
__get_cpu_var(skip_unknown) = 1;
While handling an unknown NMI:
if (__get_cpu_var(skip_unknown)) {
__get_cpu_var(skip_unknown) = 0;
return NOTIFY_STOP;
}
On Wed, Aug 11, 2010 at 01:10:46PM +0200, Robert Richter wrote:
> On 10.08.10 22:44:55, Frederic Weisbecker wrote:
> > On Tue, Aug 10, 2010 at 04:48:56PM -0400, Don Zickus wrote:
> > > @@ -1200,7 +1200,7 @@ void perf_events_lapic_init(void)
> > > apic_write(APIC_LVTPC, APIC_DM_NMI);
> > > }
> > >
> > > -static DEFINE_PER_CPU(unsigned int, perfctr_handled);
> > > +static DEFINE_PER_CPU(unsigned int, perfctr_skip);
>
> Yes, using perfctr_skip is better to understand ...
>
> > > @@ -1229,14 +1228,11 @@ perf_event_nmi_handler(struct notifier_block *self,
> > > * was handling a perfctr. Otherwise we pass it and
> > > * let the kernel handle the unknown nmi.
> > > *
> > > - * Note: this could be improved if we drop unknown
> > > - * NMIs only if we handled more than one perfctr in
> > > - * the previous NMI.
> > > */
> > > - this_nmi = percpu_read(irq_stat.__nmi_count);
> > > - prev_nmi = __get_cpu_var(perfctr_handled);
> > > - if (this_nmi == prev_nmi + 1)
> > > + if (__get_cpu_var(perfctr_skip)){
> > > + __get_cpu_var(perfctr_skip) -=1;
> > > return NOTIFY_STOP;
> > > + }
> > > return NOTIFY_DONE;
> > > default:
> > > return NOTIFY_DONE;
> > > @@ -1246,11 +1242,21 @@ perf_event_nmi_handler(struct notifier_block *self,
> > >
> > > apic_write(APIC_LVTPC, APIC_DM_NMI);
> > >
> > > - if (!x86_pmu.handle_irq(regs))
> > > + handled = x86_pmu.handle_irq(regs);
> > > + if (!handled)
> > > + /* not our NMI */
> > > return NOTIFY_DONE;
> > > -
> > > - /* handled */
> > > - __get_cpu_var(perfctr_handled) = percpu_read(irq_stat.__nmi_count);
> > > + else if (handled > 1)
> > > + /*
> > > + * More than one perfctr triggered. This could have
> > > + * caused a second NMI that we must now skip because
> > > + * we have already handled it. Remember it.
> > > + *
> > > + * NOTE: We have no way of knowing if a second NMI was
> > > + * actually triggered, so we may accidentally skip a valid
> > > + * unknown nmi later.
> > > + */
> > > + __get_cpu_var(perfctr_skip) +=1;
>
> ... but this will not work. You have to mark the *absolute* nmi number
> here. If you only raise a flag, the next unknown nmi will be dropped,
> whenever it happens.
Isn't that what we want? Only the next unknown nmi gets dropped.
> Because, in between there could have been other nmis that
> stopped the chain and thus the 'unknown' path is not executed.
I'm not sure what you mean here. Are you thinking about a third
NMI source that triggers while we are still handling the first
NMI in the back to back sequence?
> The trick in my patch is that you *know*, which nmi you want to skip.
Well with the flag you also know which nmi you want to skip.
On 13.08.10 00:37:48, Frederic Weisbecker wrote:
> >
> > ... but this will not work. You have to mark the *absolute* nmi number
> > here. If you only raise a flag, the next unknown nmi will be dropped,
> > whenever it happens.
>
>
>
> Isn't it what we want? Only the next unknown nmi gets dropped.
>
>
>
>
> > Because, in between there could have been other nmis that
> > stopped the chain and thus the 'unknown' path is not executed.
>
>
>
> I'm not sure what you mean here. Are you thinking about a third
> NMI source that triggers while we are still handling the first
> NMI in the back to back sequence?
>
>
>
> > The trick in my patch is that you *know*, which nmi you want to skip.
>
>
> Well with the flag you also know which nmi you want to skip.
We cannot assume that all cpus have the same behavior here. Imagine a
cpu that handles 2 counters and *does not* trigger a back-to-back
nmi. With only a flag implemented, the next unknown nmi will be dropped
anyway, no matter when it fires. We have to check the nmi sequence.
The next thing you have to be aware of is that a registered nmi handler
is not called with every nmi. If there was another nmi source and a
handler with higher priority returned a stop, then all other
subsequent handlers are not called. Esp. 'unknown nmi' is called only
in rare cases. So, a handler might not get notified of an nmi. This
means, if a handler gets called a 2nd time, it is not necessarily
the next (2nd) nmi.
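A rough sketch of the chain semantics this relies on (simplified, not
the kernel's actual notifier code):

/* handlers run in priority order; a NOTIFY_STOP result ends the walk */
static int call_die_chain(struct notifier_block *nb, unsigned long val,
			  void *data)
{
	int ret = NOTIFY_DONE;

	for (; nb; nb = nb->next) {
		ret = nb->notifier_call(nb, val, data);
		if (ret & NOTIFY_STOP_MASK)
			break;	/* lower-priority handlers never see this nmi */
	}
	return ret;
}

So a low-priority handler only sees the nmis that nobody above it claimed.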
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Fri, Aug 13, 2010 at 10:22:30AM +0200, Robert Richter wrote:
> On 13.08.10 00:37:48, Frederic Weisbecker wrote:
>
> > >
> > > ... but this will not work. You have to mark the *absolute* nmi number
> > > here. If you only raise a flag, the next unknown nmi will be dropped,
> > > whenever it happens.
> >
> >
> >
> > Isn't it what we want? Only the next unknown nmi gets dropped.
> >
> >
> >
> >
> > > Because, in between there could have been other nmis that
> > > stopped the chain and thus the 'unknown' path is not executed.
> >
> >
> >
> > I'm not sure what you mean here. Are you thinking about a third
> > NMI source that triggers while we are still handling the first
> > NMI in the back to back sequence?
> >
> >
> >
> > > The trick in my patch is that you *know*, which nmi you want to skip.
> >
> >
> > Well with the flag you also know which nmi you want to skip.
>
> We cannot assume that all cpus have the same behavior here. Imagine a
> cpu that handles 2 counters and *does not* trigger a back-to-back
> nmi. With flags only implemented, the next unknown nmi will be dropped
> anyway, no matter when it fires. We have to check the nmi sequence.
I'd expect it to be an ABI. NMIs can't nest, but if one triggers while
servicing another, it should trigger right after (once we "iret", which
reenables NMIs).
But I haven't checked intel or amd documentation about that.
> The next thing you have to be aware of is that a registered nmi handler
> is not called with every nmi. If there was another nmi source and a
> handler with higher priority returned a stop, then all other
> subsequent handlers are not called. Esp. 'unknown nmi' is called only
> in rare cases. So, a handler might not get notified of an nmi. This
> means, if a handler gets called a 2nd time, it is not necessarily
> the next (2nd) nmi.
Yeah, in this case we can just clear the __get_cpu_var(next_nmi_skip)
when another handler services the next NMI.
On 13.08.10 21:28:07, Frederic Weisbecker wrote:
> > We cannot assume that all cpus have the same behavior here. Imagine a
> > cpu that handles 2 counters and *does not* trigger a back-to-back
> > nmi. With flags only implemented, the next unknown nmi will be dropped
> > anyway, no matter when it fires. We have to check the nmi sequence.
>
>
>
> I'd expect it to be an ABI. NMIs can't nest, but if one triggers while
> servicing another, it should trigger right after (once we "iret", which
> reenables NMIs).
>
> But I haven't checked intel or amd documentation about that.
Yes, nmis are nested and if there is another source firing it will be
retriggered. But the question is, if multiple counters trigger, does
this mean multiple nmis are fired, esp. if all counters were served in
the first run? This very much depends on the cpu implementation and it
will be hard to find documentation about this detail.
> > The next thing you have to be aware of is that a registered nmi handler
> > is not called with every nmi. If there was another nmi source and a
> > handler with higher priority returned a stop, then all other
> > subsequent handlers are not called. Esp. 'unknown nmi' is called only
> > in rare cases. So, a handler might not get notified of an nmi. This
> > means, if a handler gets called a 2nd time, it is not necessarily
> > the next (2nd) nmi.
>
>
> Yeah, in this case we can just clear the __get_cpu_var(next_nmi_skip)
> when another handler services the next NMI.
Yes, this might work too. But then you end up with the same complexity.
Even worse, you have to check and unset the flag with each perf nmi.
If you store the nmi number, it will only be read in the 'unknown'
nmi path again. And you can't unset the flag for nmis from other
sources that do not pass through the perf nmi handler.
So, overall, I don't see advantages in using a flag. The
implementation of storing the nmi number is simple and straightforward
with little or no impact on performance or memory usage.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
On 12.08.10 14:21:26, Don Zickus wrote:
> > I was testing the patch further, it properly filters perfctr
> > back-to-back nmis. I was able to reliably detect unknown nmis
> > triggered by the nmi button during high load perf sessions with
> > multiple counters, no false positives.
>
> For my own curiosity, what type of high-load perf sessions are you using
> to test this. I don't know perf well enough to have it generate events
> across multiple perfctrs.
You put load on all cpus and then start something like the following:
perf record -e cycles -e instructions -e cache-references \
-e cache-misses -e branch-misses -a -- sleep 10
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Wed, 2010-08-11 at 08:36 -0400, Don Zickus wrote:
> The only thing that might screw this up would be an SMI which takes longer
> than 1000 but that should be rare and would probably be related to the
> unknown NMI anyway.
Long-running SMIs aren't nearly as rare as you'd want them to be.
Hitting one in exactly the right spot will be, but given the numbers it's
going to happen and make us scratch our heads..
On Thu, 2010-08-12 at 00:00 +0200, Robert Richter wrote:
> From 8bb831af56d118b85fc38e0ddc2e516f7504b9fb Mon Sep 17 00:00:00 2001
> From: Robert Richter <[email protected]>
> Date: Thu, 5 Aug 2010 16:19:59 +0200
> Subject: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs
>
> When perfctrs are running it is valid to have unhandled nmis, two
> events could trigger 'simultaneously' raising two back-to-back
> NMIs. If the first NMI handles both, the latter will be empty and daze
> the CPU.
>
> The solution to avoid an 'unknown nmi' message in this case was simply
> to stop the nmi handler chain when perfctrs are running by stating
> the nmi was handled. This has the drawback that a) we can not detect
> unknown nmis anymore, and b) subsequent nmi handlers are not called.
>
> This patch addresses this. Now, we check whether this unknown NMI could
> be a perfctr back-to-back NMI. Otherwise we pass it and let the kernel
> handle the unknown nmi.
>
> This is a debug log:
>
> Deltas:
>
> nmi #32346 1462095 <<<< 1st nmi (standard) handling 2 counters
> nmi #32347 2046 <<<< 2nd nmi (back-to-back) handling one counter
> nmi #32348 1773 <<<< 3rd nmi (back-to-back) handling no counter! [3]
>
> For back-to-back nmi detection there are the following rules:
>
> The perfctr nmi handler was handling more than one counter and no
> counter was handled in the subsequent nmi (see [1] and [2] above).
>
> There is another case if there are two subsequent back-to-back nmis
> [3]. In this case we measure the time between the first and the
> 2nd. The 2nd is detected as back-to-back because the first handled
> more than one counter. The time between the 1st and the 2nd is used to
> calculate a range for which we assume a back-to-back nmi. Now, the 3rd
> nmi triggers, we measure again the time delta and compare it with the
> first delta from which we know it was a back-to-back nmi. If the 3rd
> nmi is within the range, it is also a back-to-back nmi and we drop it.
I liked the one without the funny timestamps in it better; the whole
timestamps thing just feels too fragile.
Relying on handled > 1 to arm the back-to-back filter seems doable.
(Also, you didn't deal with the TSC going backwards..)
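For reference, arming purely on handled > 1 (no timestamps), reusing
the per-cpu state from the patch, would look roughly like this inside
perf_event_nmi_handler (a sketch, not the final code):

	handled = x86_pmu.handle_irq(args->regs);
	if (!handled)
		return NOTIFY_DONE;

	this_nmi = percpu_read(irq_stat.__nmi_count);
	if (handled > 1) {
		/* a back-to-back nmi may already be latched; mark the next one */
		__get_cpu_var(nmi).marked  = this_nmi + 1;
		__get_cpu_var(nmi).handled = handled;
	}
	return NOTIFY_STOP;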
On Mon, Aug 16, 2010 at 04:48:36PM +0200, Peter Zijlstra wrote:
...
> > There is another case if there are two subsequent back-to-back nmis
> > [3]. In this case we measure the time between the first and the
> > 2nd. The 2nd is detected as back-to-back because the first handled
> > more than one counter. The time between the 1st and the 2nd is used to
> > calculate a range for which we assume a back-to-back nmi. Now, the 3rd
> > nmi triggers, we measure again the time delta and compare it with the
> > first delta from which we know it was a back-to-back nmi. If the 3rd
> > nmi is within the range, it is also a back-to-back nmi and we drop it.
>
> I liked the one without funny timestamps in better, the whole timestamps
> thing just feels too fragile.
>
Me too, Robert's former patch (if I'm not missing something) looks good
to me.
>
> Relying on handled > 1 to arm the back-to-back filter seems doable.
>
It's doable _but_ I think there is nothing more we can do; there is no
way (at least that I know of) to check if there is a latched nmi from
the perf counters. We can only assume that if multiple counters
overflowed, most probably the next unknown nmi has the same nature,
i.e. it came from perf. Yes, we can lose a real unknown nmi in this
case, but I think this is a justified trade-off. If a user needs
precise counting of unknown nmis he should not arm perf events
at all, and if there is a user with an nmi button (guys, where did you
get these magic buttons? I need one ;) he'd better not arm perf events
either, otherwise he might have to click twice
(and of course we should keep in mind Andi's proposal but it
is a next step I think).
> (Also, you didn't deal with the TSC going backwards..)
>
-- Cyrill
On 16.08.10 12:27:06, Cyrill Gorcunov wrote:
> On Mon, Aug 16, 2010 at 04:48:36PM +0200, Peter Zijlstra wrote:
> > I liked the one without funny timestamps in better, the whole timestamps
> > thing just feels too fragile.
> >
>
> Me too, the former Roberts patch (if I'm not missing something) looks good
> to me.
>
> >
> > Relying on handled > 1 to arm the back-to-back filter seems doable.
Peter, I will rip out the timestamp code from the -v2 patch. My first
patch does not deal with a 2-1-0 sequence, so it has false positives.
We do not necessarily need the timestamps if back-to-back nmis are
rare. Without timestamps, the ratio of falsely dropped unknown nmis
will statistically be the same as the ratio of back-to-back nmis; with
timestamps we could catch almost every unknown nmi. So if we encounter
problems we could still implement the timestamp code on top.
> It's doable _but_ I think there is nothing more we can do; there is no
> way (at least that I know of) to check if there is a latched nmi from
> the perf counters. We can only assume that if multiple counters
> overflowed, most probably the next unknown nmi has the same nature,
> i.e. it came from perf.
As said, I think with timestamps we would be able to detect 100% of
the unknown nmis. I guess we now get more than 90% with multiple
counters, and 100% with a single counter running. So, this is already
more than a simple improvement.
> Yes, we can lose a real unknown nmi in this
> case, but I think this is a justified trade-off. If a user needs
> precise counting of unknown nmis he should not arm perf events
> at all, and if there is a user with an nmi button (guys, where did
> you get these magic buttons? I need one ;) he'd better not arm perf
> events either, otherwise he might have to click twice
>
> (and of course we should keep in mind Andi's proposal but it
> is a next step I think).
Yes, this patch is the first step, now we can change the nmi handler
priority. The perf handler must not have the lowest priority anymore.
> > (Also, you didn't deal with the TSC going backwards..)
Does this also happen in the case of a back-to-back nmi? I don't know
the conditions for a backward running TSC. Maybe, if an nmi is
retriggered the TSC wont be adjusted by a negative offset, I don't
know...
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Mon, Aug 16, 2010 at 07:16:10PM +0200, Robert Richter wrote:
> On 16.08.10 12:27:06, Cyrill Gorcunov wrote:
> > On Mon, Aug 16, 2010 at 04:48:36PM +0200, Peter Zijlstra wrote:
> > > I liked the one without funny timestamps in better, the whole timestamps
> > > thing just feels too fragile.
> > >
> >
> > Me too, the former Roberts patch (if I'm not missing something) looks good
> > to me.
> >
> > >
> > > Relying on handled > 1 to arm the back-to-back filter seems doable.
>
I suspect Peter was supposed to be in the To: field ;)
> Peter, I will rip out the timestamp code from the -v2 patch. My first
> patch does not deal with a 2-1-0 sequence, so it has false positives.
> We do not necessarily need the timestamps if back-to-back nmis are
> rare. Without using timestamps the statistically lost ratio for
> unknown nmis will be as the ratio for back-to-back nmis, with
> timestamps we could catch almost every unknown nmi. So if we encounter
> problems we could still implement timestamp code on top.
>
> > It's doable _but_ I think there is nothing we can do, there is no
> > way (at least I known of) to check if there is latched nmi from
> > perf counters. We only can assume that if there multiple counters
> > overflowed most probably the next unknown nmi has the same nature,
> > ie it came from perf.
>
> As said, I think with timestamps we could be able to detect 100% of
> the unknown nmis. I guess we now get more than 90% with multiple
> counters, and 100% with a single counter running. So, this is already
> more than a simple improvement.
Robert, I think we may still miss an unknown nmi: consider the case when
an unknown nmi is latched while you handle an nmi from perf and, what is
more interesting, several counters have overflowed. Then you set the
delta small enough and the second (unknown) nmi will be in range and
treated as a perf back-to-back nmi, or am I missing something in the patch?
>
> > Yes, we can loose real unknown nmi in this
> > case but I think this is justified trade off. If an user need
> > a precise counting of unknown nmis he should not arm perf events
> > at all, if there an user with nmi button (guys where did you get this
> > magic buttuns? i need one ;) he better to not arm perf events too
> > otherwise he might have to click twice
> >
> > (and of course we should keep in mind Andi's proposal but it
> > is a next step I think).
>
> Yes, this patch is the first step, now we can change the nmi handler
> priority. The perf handler must not have the lowest priority anymore.
>
> > > (Also, you didn't deal with the TSC going backwards..)
>
> Does this also happen in the case of a back-to-back nmi? I don't know
> the conditions for a backward running TSC. Maybe, if an nmi is
> retriggered the TSC wont be adjusted by a negative offset, I don't
> know...
I never heard of backward running tsc, though tsc is a strange beast.
>
> -Robert
>
> --
> Advanced Micro Devices, Inc.
> Operating System Research Center
>
-- Cyrill
On Mon, 2010-08-16 at 23:06 +0400, Cyrill Gorcunov wrote:
> I never heard of backward running tsc, though tsc is a strange beast.
>
It's not supposed to happen, but then there are BIOS failure modes that frob
the TSC from SMIs, fun TSC artifacts around CPU frequency changes, and
people resetting the TSC in S-states, etc.
In short, never trust the TSC to be even remotely sane.
On Mon, Aug 16, 2010 at 09:13:16PM +0200, Peter Zijlstra wrote:
> On Mon, 2010-08-16 at 23:06 +0400, Cyrill Gorcunov wrote:
> > I never heard of backward running tsc, though tsc is a strange beast.
> >
> It's not supposed to happen, but then there are BIOS failure modes that frob
> the TSC from SMIs, fun TSC artifacts around CPU frequency changes, and
> people resetting the TSC in S-states, etc.
grr :/
>
> In short, never trust the TSC to be even remotely sane.
>
ok, good to know, thanks!
-- Cyrill
On 16.08.10 15:06:59, Cyrill Gorcunov wrote:
> I suspect Peter was supposed to be in To: field ;)
It was an attempt to answer 2 mails with one. :)
> Robert, I think we still may miss unknown irq, consider the case when
> unknown nmi is latched while you handle nmi from perf and what is
> more interesting several counters may be overflowed. So you set
> delta small enough and second (unknown nmi) will be in range and
> treated as being perf back-to-back, or I miss something from patch?
Yes, that's true, but first you would have to enable your Infinite
Improbability Drive.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
Peter,
this is version 3 without timestamp code. Compared to -v1 it
implements the handling for 2-1-0 nmi sequences.
-Robert
>From 71a206394d8a3536b033247ec14ad037fef77236 Mon Sep 17 00:00:00 2001
From: Robert Richter <[email protected]>
Date: Tue, 17 Aug 2010 16:42:03 +0200
Subject: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs
When perfctrs are running it is valid to have unhandled nmis, two
events could trigger 'simultaneously' raising two back-to-back
NMIs. If the first NMI handles both, the latter will be empty and daze
the CPU.
The solution to avoid an 'unknown nmi' message in this case was simply
to stop the nmi handler chain when perfctrs are running by stating
the nmi was handled. This has the drawback that a) we can not detect
unknown nmis anymore, and b) subsequent nmi handlers are not called.
This patch addresses this. Now, we check whether this unknown NMI could
be a perfctr back-to-back NMI. Otherwise we pass it and let the kernel
handle the unknown nmi.
This is a debug log:
cpu #6, nmi #32333, skip_nmi #32330, handled = 1, time = 1934364430
cpu #6, nmi #32334, skip_nmi #32330, handled = 1, time = 1934704616
cpu #6, nmi #32335, skip_nmi #32336, handled = 2, time = 1936032320
cpu #6, nmi #32336, skip_nmi #32336, handled = 0, time = 1936034139
cpu #6, nmi #32337, skip_nmi #32336, handled = 1, time = 1936120100
cpu #6, nmi #32338, skip_nmi #32336, handled = 1, time = 1936404607
cpu #6, nmi #32339, skip_nmi #32336, handled = 1, time = 1937983416
cpu #6, nmi #32340, skip_nmi #32341, handled = 2, time = 1938201032
cpu #6, nmi #32341, skip_nmi #32341, handled = 0, time = 1938202830
cpu #6, nmi #32342, skip_nmi #32341, handled = 1, time = 1938443743
cpu #6, nmi #32343, skip_nmi #32341, handled = 1, time = 1939956552
cpu #6, nmi #32344, skip_nmi #32341, handled = 1, time = 1940073224
cpu #6, nmi #32345, skip_nmi #32341, handled = 1, time = 1940485677
cpu #6, nmi #32346, skip_nmi #32347, handled = 2, time = 1941947772
cpu #6, nmi #32347, skip_nmi #32347, handled = 1, time = 1941949818
cpu #6, nmi #32348, skip_nmi #32347, handled = 0, time = 1941951591
Uhhuh. NMI received for unknown reason 00 on CPU 6.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue
Deltas:
nmi #32334 340186
nmi #32335 1327704
nmi #32336 1819 <<<< back-to-back nmi [1]
nmi #32337 85961
nmi #32338 284507
nmi #32339 1578809
nmi #32340 217616
nmi #32341 1798 <<<< back-to-back nmi [2]
nmi #32342 240913
nmi #32343 1512809
nmi #32344 116672
nmi #32345 412453
nmi #32346 1462095 <<<< 1st nmi (standard) handling 2 counters
nmi #32347 2046 <<<< 2nd nmi (back-to-back) handling one counter
nmi #32348 1773 <<<< 3rd nmi (back-to-back) handling no counter! [3]
For back-to-back nmi detection there are the following rules:
The perfctr nmi handler was handling more than one counter and no
counter was handled in the subsequent nmi (see [1] and [2] above).
There is another case where there are two subsequent back-to-back nmis
[3]. The 2nd is detected as back-to-back because the first handled
more than one counter. If the second handles one counter and the 3rd
handles nothing, we drop the 3rd nmi because it could be a
back-to-back nmi.
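As a concrete walk-through (hypothetical nmi numbers): nmi N handles 2
counters, so nmi N+1 is marked. Nmi N+1 handles 1 counter and, because
it is marked and the previous nmi handled more than one counter, nmi
N+2 is marked in turn. Nmi N+2 then handles no counter, falls through
to the DIE_NMIUNKNOWN path, matches the mark and is dropped.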
Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/kernel/cpu/perf_event.c | 65 ++++++++++++++++++++++++++++++-------
1 files changed, 52 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index f2da20f..bbcc89e 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1154,7 +1154,7 @@ static int x86_pmu_handle_irq(struct pt_regs *regs)
/*
* event overflow
*/
- handled = 1;
+ handled++;
data.period = event->hw.last_period;
if (!x86_perf_event_set_period(event))
@@ -1200,12 +1200,20 @@ void perf_events_lapic_init(void)
apic_write(APIC_LVTPC, APIC_DM_NMI);
}
+struct perfctr_nmi {
+ unsigned int marked;
+ int handled;
+};
+
+static DEFINE_PER_CPU(struct perfctr_nmi, nmi);
+
static int __kprobes
perf_event_nmi_handler(struct notifier_block *self,
unsigned long cmd, void *__args)
{
struct die_args *args = __args;
- struct pt_regs *regs;
+ unsigned int this_nmi;
+ int handled;
if (!atomic_read(&active_events))
return NOTIFY_DONE;
@@ -1214,22 +1222,53 @@ perf_event_nmi_handler(struct notifier_block *self,
case DIE_NMI:
case DIE_NMI_IPI:
break;
-
+ case DIE_NMIUNKNOWN:
+ this_nmi = percpu_read(irq_stat.__nmi_count);
+ if (this_nmi != __get_cpu_var(nmi).marked)
+ /* let the kernel handle the unknown nmi */
+ return NOTIFY_DONE;
+ /*
+ * This one is a perfctr back-to-back nmi. Two events
+ * trigger 'simultaneously' raising two back-to-back
+ * NMIs. If the first NMI handles both, the latter
+ * will be empty and daze the CPU. So, we drop it to
+ * avoid false-positive 'unknown nmi' messages.
+ */
+ return NOTIFY_STOP;
default:
return NOTIFY_DONE;
}
- regs = args->regs;
-
apic_write(APIC_LVTPC, APIC_DM_NMI);
- /*
- * Can't rely on the handled return value to say it was our NMI, two
- * events could trigger 'simultaneously' raising two back-to-back NMIs.
- *
- * If the first NMI handles both, the latter will be empty and daze
- * the CPU.
- */
- x86_pmu.handle_irq(regs);
+
+ handled = x86_pmu.handle_irq(args->regs);
+ if (!handled)
+ return NOTIFY_DONE;
+
+ this_nmi = percpu_read(irq_stat.__nmi_count);
+ if (handled > 1)
+ goto mark_nmi;
+ if ((__get_cpu_var(nmi).marked == this_nmi)
+ && (__get_cpu_var(nmi).handled > 1))
+ /*
+ * We could have two subsequent back-to-back nmis: The
+ * first handles more than one counter, the 2nd
+ * handles only one counter and the 3rd handles no
+ * counter.
+ *
+ * This is the 2nd nmi because the previous was
+ * handling more than one counter. We will mark the
+ * next (3rd) and then drop it if unhandled.
+ */
+ goto mark_nmi;
+
+ /* this may not trigger back-to-back nmis */
+ return NOTIFY_STOP;
+
+ mark_nmi:
+ /* the next nmi could be a back-to-back nmi */
+ __get_cpu_var(nmi).marked = this_nmi + 1;
+ __get_cpu_var(nmi).handled = handled;
return NOTIFY_STOP;
}
--
1.7.1.1
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Tue, Aug 17, 2010 at 12:55:25AM +0200, Robert Richter wrote:
> On 16.08.10 15:06:59, Cyrill Gorcunov wrote:
> > I suspect Peter was supposed to be in To: field ;)
>
> It was the attempt to answer 2 mails with one. :)
>
ah, I see ;)
> > Robert, I think we still may miss unknown irq, consider the case when
> > unknown nmi is latched while you handle nmi from perf and what is
> > more interesting several counters may be overflowed. So you set
> > delta small enough and second (unknown nmi) will be in range and
> > treated as being perf back-to-back, or I miss something from patch?
>
> Yes, that's true, but before you have to enable your Infinite
> Improbability Drive.
>
not at all, it's absolutely legit to happen ;)
> -Robert
>
> --
> Advanced Micro Devices, Inc.
> Operating System Research Center
>
-- Cyrill
On Tue, Aug 17, 2010 at 05:22:25PM +0200, Robert Richter wrote:
> Peter,
>
> this is version 3 without timestamp code. Compared to -v1 it
> implements the handling for 2-1-0 nmi sequences.
>
> -Robert
>
> From 71a206394d8a3536b033247ec14ad037fef77236 Mon Sep 17 00:00:00 2001
> From: Robert Richter <[email protected]>
> Date: Tue, 17 Aug 2010 16:42:03 +0200
> Subject: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs
>
...
> Signed-off-by: Robert Richter <[email protected]>
> ---
...
Looks good to me, thanks a lot Robert!
-- Cyrill
On Tue, 2010-08-17 at 17:22 +0200, Robert Richter wrote:
> + this_nmi = percpu_read(irq_stat.__nmi_count);
> + if (handled > 1)
> + goto mark_nmi;
> + if ((__get_cpu_var(nmi).marked == this_nmi)
> + && (__get_cpu_var(nmi).handled > 1))
> + /*
> + * We could have two subsequent back-to-back nmis: The
> + * first handles more than one counter, the 2nd
> + * handles only one counter and the 3rd handles no
> + * counter.
> + *
> + * This is the 2nd nmi because the previous was
> + * handling more than one counter. We will mark the
> + * next (3rd) and then drop it if unhandled.
> + */
> + goto mark_nmi;
> +
> + /* this may not trigger back-to-back nmis */
> + return NOTIFY_STOP;
> +
> + mark_nmi:
> + /* the next nmi could be a back-to-back nmi */
> + __get_cpu_var(nmi).marked = this_nmi + 1;
> + __get_cpu_var(nmi).handled = handled;
>
> return NOTIFY_STOP;
> }
I queued it with that part changed to:
+ this_nmi = percpu_read(irq_stat.__nmi_count);
+ if ((handled > 1) ||
+ /* the next nmi could be a back-to-back nmi */
+ ((__get_cpu_var(nmi).marked == this_nmi) &&
+ (__get_cpu_var(nmi).handled > 1))) {
+ /*
+ * We could have two subsequent back-to-back nmis: The
+ * first handles more than one counter, the 2nd
+ * handles only one counter and the 3rd handles no
+ * counter.
+ *
+ * This is the 2nd nmi because the previous was
+ * handling more than one counter. We will mark the
+ * next (3rd) and then drop it if unhandled.
+ */
+ __get_cpu_var(nmi).marked = this_nmi + 1;
+ __get_cpu_var(nmi).handled = handled;
+ }
return NOTIFY_STOP;
}
On 19.08.10 06:45:53, Peter Zijlstra wrote:
> I queued it with that part changed to:
>
> + this_nmi = percpu_read(irq_stat.__nmi_count);
> + if ((handled > 1) ||
> + /* the next nmi could be a back-to-back nmi */
> + ((__get_cpu_var(nmi).marked == this_nmi) &&
> + (__get_cpu_var(nmi).handled > 1))) {
> + /*
> + * We could have two subsequent back-to-back nmis: The
> + * first handles more than one counter, the 2nd
> + * handles only one counter and the 3rd handles no
> + * counter.
> + *
> + * This is the 2nd nmi because the previous was
> + * handling more than one counter. We will mark the
> + * next (3rd) and then drop it if unhandled.
> + */
> + __get_cpu_var(nmi).marked = this_nmi + 1;
> + __get_cpu_var(nmi).handled = handled;
> + }
>
> return NOTIFY_STOP;
> }
I am fine with this. Thanks Peter.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Thu, Aug 19, 2010 at 12:45:53PM +0200, Peter Zijlstra wrote:
>
> I queued it with that part changed to:
I realized the other day this change doesn't cover the nehalem, core and p4
cases which use
intel_pmu_handle_irq
p4_pmu_handle_irq
as their handlers. Though that patch can go on top of Robert's.
Cheers,
Don
On Thu, 2010-08-19 at 10:12 -0400, Don Zickus wrote:
> On Thu, Aug 19, 2010 at 12:45:53PM +0200, Peter Zijlstra wrote:
> >
> > I queued it with that part changed to:
>
> I realized the other day this change doesn't cover the nehalem, core and p4
> cases which use
>
> intel_pmu_handle_irq
> p4_pmu_handle_irq
>
> as their handlers. Though that patch can go on top of Robert's.
Something like this?
---
Index: linux-2.6/arch/x86/kernel/cpu/perf_event_intel.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_intel.c
+++ linux-2.6/arch/x86/kernel/cpu/perf_event_intel.c
@@ -713,6 +713,7 @@ static int intel_pmu_handle_irq(struct p
struct cpu_hw_events *cpuc;
int bit, loops;
u64 ack, status;
+ int handled = 0;
perf_sample_data_init(&data, 0);
@@ -743,12 +744,16 @@ again:
/*
* PEBS overflow sets bit 62 in the global status register
*/
- if (__test_and_clear_bit(62, (unsigned long *)&status))
+ if (__test_and_clear_bit(62, (unsigned long *)&status)) {
+ handled++;
x86_pmu.drain_pebs(regs);
+ }
for_each_set_bit(bit, (unsigned long *)&status, X86_PMC_IDX_MAX) {
struct perf_event *event = cpuc->events[bit];
+ handled++;
+
if (!test_bit(bit, cpuc->active_mask))
continue;
@@ -772,7 +777,7 @@ again:
done:
intel_pmu_enable_all(0);
- return 1;
+ return handled;
}
static struct event_constraint *
Index: linux-2.6/arch/x86/kernel/cpu/perf_event_p4.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_p4.c
+++ linux-2.6/arch/x86/kernel/cpu/perf_event_p4.c
@@ -673,7 +673,7 @@ static int p4_pmu_handle_irq(struct pt_r
if (!overflow && (val & (1ULL << (x86_pmu.cntval_bits - 1))))
continue;
- handled += overflow;
+ handled += !!overflow;
/* event overflow for sure */
data.period = event->hw.last_period;
@@ -690,7 +690,7 @@ static int p4_pmu_handle_irq(struct pt_r
inc_irq_stat(apic_perf_irqs);
}
- return handled > 0;
+ return handled;
}
/*
On Thu, Aug 19, 2010 at 04:27:13PM +0200, Peter Zijlstra wrote:
> On Thu, 2010-08-19 at 10:12 -0400, Don Zickus wrote:
> > On Thu, Aug 19, 2010 at 12:45:53PM +0200, Peter Zijlstra wrote:
> > >
> > > I queued it with that part changed to:
> >
> > I realized the other day this change doesn't cover the nehalem, core and p4
> > cases which use
> >
> > intel_pmu_handle_irq
> > p4_pmu_handle_irq
> >
> > as their handlers. Though that patch can go on top of Robert's.
>
> Something like this?
Looks correct. I'll try to test the perf_event_intel.c path though I
haven't had much luck getting multiple events on the nehalem box I have
(the amd box was easy).
Cheers,
Don
On Thu, Aug 19, 2010 at 04:27:13PM +0200, Peter Zijlstra wrote:
...
> Index: linux-2.6/arch/x86/kernel/cpu/perf_event_p4.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_p4.c
> +++ linux-2.6/arch/x86/kernel/cpu/perf_event_p4.c
> @@ -673,7 +673,7 @@ static int p4_pmu_handle_irq(struct pt_r
> if (!overflow && (val & (1ULL << (x86_pmu.cntval_bits - 1))))
> continue;
>
> - handled += overflow;
> + handled += !!overflow;
No need for !! here, overflow already returns [0;1]; though it's a small nit
and could be updated later :)
>
> /* event overflow for sure */
> data.period = event->hw.last_period;
> @@ -690,7 +690,7 @@ static int p4_pmu_handle_irq(struct pt_r
> inc_irq_stat(apic_perf_irqs);
> }
>
> - return handled > 0;
> + return handled;
> }
>
> /*
>
-- Cyrill
On Thu, 2010-08-19 at 21:43 +0400, Cyrill Gorcunov wrote:
> > @@ -673,7 +673,7 @@ static int p4_pmu_handle_irq(struct pt_r
> > if (!overflow && (val & (1ULL << (x86_pmu.cntval_bits
> - 1))))
> > continue;
> >
> > - handled += overflow;
> > + handled += !!overflow;
>
> No need to !! here, overflowed returns [0;1] though a small nit
> and could be updated later :)
done.
On Thu, Aug 19, 2010 at 04:27:13PM +0200, Peter Zijlstra wrote:
> x86_pmu.drain_pebs(regs);
> + }
>
> for_each_set_bit(bit, (unsigned long *)&status, X86_PMC_IDX_MAX) {
> struct perf_event *event = cpuc->events[bit];
>
> + handled++;
> +
> if (!test_bit(bit, cpuc->active_mask))
> continue;
Sorry I didn't notice this earlier, but I think you want that handled++
after the if(..) continue pieces. Otherwise you will always have a number
>1. :-)
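A sketch of the placement being suggested (counting only the bits that
are actually serviced; not necessarily what was finally merged):

	for_each_set_bit(bit, (unsigned long *)&status, X86_PMC_IDX_MAX) {
		struct perf_event *event = cpuc->events[bit];

		if (!test_bit(bit, cpuc->active_mask))
			continue;

		handled++;	/* count only overflows we really service */

		/* ... existing overflow handling for 'event' ... */
	}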
Cheers,
Don
On Thu, Aug 19, 2010 at 04:27:13PM +0200, Peter Zijlstra wrote:
> On Thu, 2010-08-19 at 10:12 -0400, Don Zickus wrote:
> > On Thu, Aug 19, 2010 at 12:45:53PM +0200, Peter Zijlstra wrote:
> > >
> > > I queued it with that part changed to:
> >
> > I realized the other day this change doesn't cover the nehalem, core and p4
> > cases which use
> >
> > intel_pmu_handle_irq
> > p4_pmu_handle_irq
> >
> > as their handlers. Though that patch can go on top of Robert's.
>
> Something like this?
I tested this patch and Robert's on an AMD box and Nehalem box. Both
worked as intended. However I did notice that whenever the AMD box
detected handled >1, it was shortly followed by an unknown_nmi that was
properly eaten with Robert's logic. Whereas on the Nehalem box I saw a
lot of 'handled > 1' messages but very very few of them were followed by
an unknown_nmi message (and those messages that did come were properly
eaten).
Maybe that is just the differences in the cpu designs.
Of course I had to make the one change I mentioned previously for the
perf_event_intel.c file (moving the handled++ logic down a few lines).
I didn't run the test on a P4 box.
Looks great, thanks guys!
Cheers,
Don
>
> ---
> Index: linux-2.6/arch/x86/kernel/cpu/perf_event_intel.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_intel.c
> +++ linux-2.6/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -713,6 +713,7 @@ static int intel_pmu_handle_irq(struct p
> struct cpu_hw_events *cpuc;
> int bit, loops;
> u64 ack, status;
> + int handled = 0;
>
> perf_sample_data_init(&data, 0);
>
> @@ -743,12 +744,16 @@ again:
> /*
> * PEBS overflow sets bit 62 in the global status register
> */
> - if (__test_and_clear_bit(62, (unsigned long *)&status))
> + if (__test_and_clear_bit(62, (unsigned long *)&status)) {
> + handled++;
> x86_pmu.drain_pebs(regs);
> + }
>
> for_each_set_bit(bit, (unsigned long *)&status, X86_PMC_IDX_MAX) {
> struct perf_event *event = cpuc->events[bit];
>
> + handled++;
> +
> if (!test_bit(bit, cpuc->active_mask))
> continue;
>
> @@ -772,7 +777,7 @@ again:
>
> done:
> intel_pmu_enable_all(0);
> - return 1;
> + return handled;
> }
>
> static struct event_constraint *
> Index: linux-2.6/arch/x86/kernel/cpu/perf_event_p4.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_p4.c
> +++ linux-2.6/arch/x86/kernel/cpu/perf_event_p4.c
> @@ -673,7 +673,7 @@ static int p4_pmu_handle_irq(struct pt_r
> if (!overflow && (val & (1ULL << (x86_pmu.cntval_bits - 1))))
> continue;
>
> - handled += overflow;
> + handled += !!overflow;
>
> /* event overflow for sure */
> data.period = event->hw.last_period;
> @@ -690,7 +690,7 @@ static int p4_pmu_handle_irq(struct pt_r
> inc_irq_stat(apic_perf_irqs);
> }
>
> - return handled > 0;
> + return handled;
> }
>
> /*
>
* Don Zickus <[email protected]> wrote:
> On Thu, Aug 19, 2010 at 04:27:13PM +0200, Peter Zijlstra wrote:
> > On Thu, 2010-08-19 at 10:12 -0400, Don Zickus wrote:
> > > On Thu, Aug 19, 2010 at 12:45:53PM +0200, Peter Zijlstra wrote:
> > > >
> > > > I queued it with that part changed to:
> > >
> > > I realized the other day this change doesn't cover the nehalem, core and p4
> > > cases which use
> > >
> > > intel_pmu_handle_irq
> > > p4_pmu_handle_irq
> > >
> > > as their handlers. Though that patch can go on top of Robert's.
> >
> > Something like this?
>
> I tested this patch and Robert's on an AMD box and Nehalem box. Both
> worked as intended. However I did notice that whenever the AMD box
> detected handled >1, it was shortly followed by an unknown_nmi that was
> properly eaten with Robert's logic. Whereas on the Nehalem box I saw a
> lot of 'handled > 1' messages but very very few of them were followed by
> an unknown_nmi message (and those messages that did come were properly
> eaten).
>
> Maybe that is just the differences in the cpu designs.
>
> Of course I had to make the one change I mentioned previously for the
> perf_event_intel.c file (moving the handled++ logic down a few lines).
>
> I didn't run the test on a P4 box.
>
> Looks great, thanks guys!
Please someone send the final version with a changelog, with all the acks and
tested-by's added, so that i can send it Linuswards.
Thanks,
Ingo
On 19.08.10 21:50:17, Don Zickus wrote:
> I tested this patch and Robert's on an AMD box and Nehalem box. Both
> worked as intended. However I did notice that whenever the AMD box
> detected handled >1, it was shortly followed by an unknown_nmi that was
> properly eaten with Robert's logic. Whereas on the Nehalem box I saw a
> lot of 'handled > 1' messages but very very few of them were followed by
> an unknown_nmi message (and those messages that did come were properly
> eaten).
Don, thanks for testing this.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Thu, 2010-08-19 at 17:58 -0400, Don Zickus wrote:
> On Thu, Aug 19, 2010 at 04:27:13PM +0200, Peter Zijlstra wrote:
> > x86_pmu.drain_pebs(regs);
> > + }
> >
> > for_each_set_bit(bit, (unsigned long *)&status, X86_PMC_IDX_MAX) {
> > struct perf_event *event = cpuc->events[bit];
> >
> > + handled++;
> > +
> > if (!test_bit(bit, cpuc->active_mask))
> > continue;
>
> Sorry I didn't notice this earlier, but I think you want that handled++
> after the if(..) continue pieces. Otherwise you will always have a number
> >1. :-)
Only if there's any remaining set bits in the overflow status reg,
right?
But if we want to be paranoid, we should also check if the event that
generated the overflow actually had the INT flag enabled, I guess ;-)
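A small standalone model of the two points in this exchange (the status, active and PMI-enable masks are invented for illustration; this is not the queued diff): handled is bumped only for status bits that survive the active_mask check, and, for the paranoid variant Peter mentions, only when the counter was actually configured to raise a PMI:
#include <stdio.h>

/*
 * status      - overflow status bits as read from the global status register
 * active      - bits of counters we consider ours (the active_mask check)
 * pmi_enabled - bits of counters whose config had the interrupt flag set
 */
static int count_handled(unsigned long long status, unsigned long long active,
			 unsigned long long pmi_enabled)
{
	int handled = 0;
	int bit;

	for (bit = 0; bit < 64; bit++) {
		unsigned long long mask = 1ULL << bit;

		if (!(status & mask))
			continue;
		if (!(active & mask))
			continue;	/* not ours, don't count it */
		if (!(pmi_enabled & mask))
			continue;	/* no PMI expected from this one */
		handled++;
	}
	return handled;
}

int main(void)
{
	/* bits 0 and 3 overflowed, but only bit 0 is active and PMI-enabled */
	printf("handled = %d\n", count_handled(0x9, 0x1, 0x1));	/* 1 */
	return 0;
}
Counting before the active_mask check would have reported 2 here, which is exactly the inflation Don points out.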
On Fri, 2010-08-20 at 10:16 +0200, Ingo Molnar wrote:
>
>
> Please someone send the final version with a changelog, with all the acks and
> tested-by's added, so that i can send it Linuswards.
I've got two patches queued, one from Robert (with a slight
modification, which removes a goto) and one from myself converting the
intel and p4 irq handler.
They're available at:
programming.kicks-ass.net/sekrit/patches.tar.bz2
I did add Tested-by tags, although I suspect its not the exact same bits
Don tested..
On 8/20/10, Peter Zijlstra <[email protected]> wrote:
> On Fri, 2010-08-20 at 10:16 +0200, Ingo Molnar wrote:
>>
>>
>> Please someone send the final version with a changelog, with all the acks
>> and
>> tested-by's added, so that i can send it Linuswards.
>
> I've got two patches queued, one from Robert (with a slight
> modification, which removes a goto) and one from myself converting the
> intel and p4 irq handler.
>
> They're available at:
>
> programming.kicks-ass.net/sekrit/patches.tar.bz2
>
> I did add Tested-by tags, although I suspect its not the exact same bits
> Don tested..
>
My Ack if needed
On Fri, Aug 20, 2010 at 12:04:11PM +0200, Peter Zijlstra wrote:
> On Fri, 2010-08-20 at 10:16 +0200, Ingo Molnar wrote:
> >
> >
> > Please someone send the final version with a changelog, with all the acks and
> > tested-by's added, so that i can send it Linuswards.
>
> I've got two patches queued, one from Robert (with a slight
> modification, which removes a goto) and one from myself converting the
> intel and p4 irq handler.
>
> They're available at:
>
> programming.kicks-ass.net/sekrit/patches.tar.bz2
>
> I did add Tested-by tags, although I suspect its not the exact same bits
> Don tested..
I can re-test those exact patches later today if needed.
Cheers,
Don
* Don Zickus <[email protected]> wrote:
> On Fri, Aug 20, 2010 at 12:04:11PM +0200, Peter Zijlstra wrote:
> > On Fri, 2010-08-20 at 10:16 +0200, Ingo Molnar wrote:
> > >
> > >
> > > Please someone send the final version with a changelog, with all the acks and
> > > tested-by's added, so that i can send it Linuswards.
> >
> > I've got two patches queued, one from Robert (with a slight
> > modification, which removes a goto) and one from myself converting the
> > intel and p4 irq handler.
> >
> > They're available at:
> >
> > programming.kicks-ass.net/sekrit/patches.tar.bz2
> >
> > I did add Tested-by tags, although I suspect its not the exact same bits
> > Don tested..
>
> I can re-test those exact patches later today if needed.
Please pick up -tip in an hour or two and holler if something looks wrong.
Thanks,
Ingo
On Fri, Aug 20, 2010 at 03:27:37PM +0200, Ingo Molnar wrote:
>
> * Don Zickus <[email protected]> wrote:
>
> > On Fri, Aug 20, 2010 at 12:04:11PM +0200, Peter Zijlstra wrote:
> > > On Fri, 2010-08-20 at 10:16 +0200, Ingo Molnar wrote:
> > > >
> > > >
> > > > Please someone send the final version with a changelog, with all the acks and
> > > > tested-by's added, so that i can send it Linuswards.
> > >
> > > I've got two patches queued, one from Robert (with a slight
> > > modification, which removes a goto) and one from myself converting the
> > > intel and p4 irq handler.
> > >
> > > They're available at:
> > >
> > > programming.kicks-ass.net/sekrit/patches.tar.bz2
> > >
> > > I did add Tested-by tags, although I suspect its not the exact same bits
> > > Don tested..
> >
> > I can re-test those exact patches later today if needed.
>
> Please pick up -tip in an hour or two and holler if something looks wrong.
Easy enough.
Cheers,
Don
it's not working so well, i'm getting:
Uhhuh. NMI received for unknown reason 00 on CPU 9.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue
on a nehalem box, after a perf top and perf stat run.
Thanks,
Ingo
Commit-ID: 8e3e42b4f88602bb6e1f0b74afd80ff4703f79ce
Gitweb: http://git.kernel.org/tip/8e3e42b4f88602bb6e1f0b74afd80ff4703f79ce
Author: Robert Richter <[email protected]>
AuthorDate: Tue, 17 Aug 2010 16:42:03 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 20 Aug 2010 15:00:05 +0200
perf, x86: Try to handle unknown nmis with an enabled PMU
When the PMU is enabled it is valid to have unhandled nmis, two
events could trigger 'simultaneously' raising two back-to-back
NMIs. If the first NMI handles both, the latter will be empty and daze
the CPU.
The solution to avoid an 'unknown nmi' message in this case was simply
to stop the nmi handler chain when the PMU is enabled by stating
the nmi was handled. This has the drawback that a) we cannot detect
unknown nmis anymore, and b) subsequent nmi handlers are not called.
This patch addresses this. Now, we check whether this unknown NMI could
be a PMU back-to-back NMI. Otherwise we pass it on and let the kernel
handle the unknown nmi.
This is a debug log:
cpu #6, nmi #32333, skip_nmi #32330, handled = 1, time = 1934364430
cpu #6, nmi #32334, skip_nmi #32330, handled = 1, time = 1934704616
cpu #6, nmi #32335, skip_nmi #32336, handled = 2, time = 1936032320
cpu #6, nmi #32336, skip_nmi #32336, handled = 0, time = 1936034139
cpu #6, nmi #32337, skip_nmi #32336, handled = 1, time = 1936120100
cpu #6, nmi #32338, skip_nmi #32336, handled = 1, time = 1936404607
cpu #6, nmi #32339, skip_nmi #32336, handled = 1, time = 1937983416
cpu #6, nmi #32340, skip_nmi #32341, handled = 2, time = 1938201032
cpu #6, nmi #32341, skip_nmi #32341, handled = 0, time = 1938202830
cpu #6, nmi #32342, skip_nmi #32341, handled = 1, time = 1938443743
cpu #6, nmi #32343, skip_nmi #32341, handled = 1, time = 1939956552
cpu #6, nmi #32344, skip_nmi #32341, handled = 1, time = 1940073224
cpu #6, nmi #32345, skip_nmi #32341, handled = 1, time = 1940485677
cpu #6, nmi #32346, skip_nmi #32347, handled = 2, time = 1941947772
cpu #6, nmi #32347, skip_nmi #32347, handled = 1, time = 1941949818
cpu #6, nmi #32348, skip_nmi #32347, handled = 0, time = 1941951591
Uhhuh. NMI received for unknown reason 00 on CPU 6.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue
Deltas:
nmi #32334 340186
nmi #32335 1327704
nmi #32336 1819 <<<< back-to-back nmi [1]
nmi #32337 85961
nmi #32338 284507
nmi #32339 1578809
nmi #32340 217616
nmi #32341 1798 <<<< back-to-back nmi [2]
nmi #32342 240913
nmi #32343 1512809
nmi #32344 116672
nmi #32345 412453
nmi #32346 1462095 <<<< 1st nmi (standard) handling 2 counters
nmi #32347 2046 <<<< 2nd nmi (back-to-back) handling one counter
nmi #32348 1773 <<<< 3rd nmi (back-to-back) handling no counter! [3]
For back-to-back nmi detection there are the following rules:
The PMU nmi handler was handling more than one counter and no
counter was handled in the subsequent nmi (see [1] and [2] above).
There is another case if there are two subsequent back-to-back nmis
[3]. The 2nd is detected as back-to-back because the first handled
more than one counter. If the second handles one counter and the 3rd
handles nothing, we drop the 3rd nmi because it could be a
back-to-back nmi.
Signed-off-by: Robert Richter <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Tested-by: Don Zickus <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/cpu/perf_event.c | 59 +++++++++++++++++++++++++++++--------
1 files changed, 46 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index f2da20f..dd2fceb 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1154,7 +1154,7 @@ static int x86_pmu_handle_irq(struct pt_regs *regs)
/*
* event overflow
*/
- handled = 1;
+ handled++;
data.period = event->hw.last_period;
if (!x86_perf_event_set_period(event))
@@ -1200,12 +1200,20 @@ void perf_events_lapic_init(void)
apic_write(APIC_LVTPC, APIC_DM_NMI);
}
+struct pmu_nmi_state {
+ unsigned int marked;
+ int handled;
+};
+
+static DEFINE_PER_CPU(struct pmu_nmi_state, nmi);
+
static int __kprobes
perf_event_nmi_handler(struct notifier_block *self,
unsigned long cmd, void *__args)
{
struct die_args *args = __args;
- struct pt_regs *regs;
+ unsigned int this_nmi;
+ int handled;
if (!atomic_read(&active_events))
return NOTIFY_DONE;
@@ -1214,22 +1222,47 @@ perf_event_nmi_handler(struct notifier_block *self,
case DIE_NMI:
case DIE_NMI_IPI:
break;
-
+ case DIE_NMIUNKNOWN:
+ this_nmi = percpu_read(irq_stat.__nmi_count);
+ if (this_nmi != __get_cpu_var(nmi).marked)
+ /* let the kernel handle the unknown nmi */
+ return NOTIFY_DONE;
+ /*
+ * This one is a PMU back-to-back nmi. Two events
+ * trigger 'simultaneously' raising two back-to-back
+ * NMIs. If the first NMI handles both, the latter
+ * will be empty and daze the CPU. So, we drop it to
+ * avoid false-positive 'unknown nmi' messages.
+ */
+ return NOTIFY_STOP;
default:
return NOTIFY_DONE;
}
- regs = args->regs;
-
apic_write(APIC_LVTPC, APIC_DM_NMI);
- /*
- * Can't rely on the handled return value to say it was our NMI, two
- * events could trigger 'simultaneously' raising two back-to-back NMIs.
- *
- * If the first NMI handles both, the latter will be empty and daze
- * the CPU.
- */
- x86_pmu.handle_irq(regs);
+
+ handled = x86_pmu.handle_irq(args->regs);
+ if (!handled)
+ return NOTIFY_DONE;
+
+ this_nmi = percpu_read(irq_stat.__nmi_count);
+ if ((handled > 1) ||
+ /* the next nmi could be a back-to-back nmi */
+ ((__get_cpu_var(nmi).marked == this_nmi) &&
+ (__get_cpu_var(nmi).handled > 1))) {
+ /*
+ * We could have two subsequent back-to-back nmis: The
+ * first handles more than one counter, the 2nd
+ * handles only one counter and the 3rd handles no
+ * counter.
+ *
+ * This is the 2nd nmi because the previous was
+ * handling more than one counter. We will mark the
+ * next (3rd) and then drop it if unhandled.
+ */
+ __get_cpu_var(nmi).marked = this_nmi + 1;
+ __get_cpu_var(nmi).handled = handled;
+ }
return NOTIFY_STOP;
}
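To make the changelog's three-NMI scenario concrete, here is a small standalone model of the marked/handled bookkeeping in the patch above; the nmi numbers and handled counts are invented, and the kernel's per-cpu state is reduced to two static variables:
#include <stdio.h>

static unsigned int marked;	/* stands in for the per-cpu nmi.marked  */
static int prev_handled;	/* stands in for the per-cpu nmi.handled */

/* PMU NMI path: 'handled' is what x86_pmu.handle_irq() returned */
static void pmu_nmi(unsigned int this_nmi, int handled)
{
	if (handled > 1 ||
	    (marked == this_nmi && prev_handled > 1)) {
		/* the next nmi could be a back-to-back nmi */
		marked = this_nmi + 1;
		prev_handled = handled;
	}
}

/* DIE_NMIUNKNOWN path: returns 1 if the unknown nmi gets swallowed */
static int unknown_nmi(unsigned int this_nmi)
{
	return this_nmi == marked;
}

int main(void)
{
	pmu_nmi(100, 2);	/* 1st nmi handled two counters            */
	pmu_nmi(101, 1);	/* 2nd (back-to-back) nmi handled only one */

	/* 3rd nmi handles nothing and surfaces as unknown: dropped */
	printf("nmi 102 swallowed: %d\n", unknown_nmi(102));	/* 1 */

	/* a later unrelated unknown nmi is passed on to the kernel */
	printf("nmi 110 swallowed: %d\n", unknown_nmi(110));	/* 0 */
	return 0;
}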
Commit-ID: 4a31bebe71ab2e4f6ecd6e5f9f2ac9f0ff38ff76
Gitweb: http://git.kernel.org/tip/4a31bebe71ab2e4f6ecd6e5f9f2ac9f0ff38ff76
Author: Peter Zijlstra <[email protected]>
AuthorDate: Thu, 19 Aug 2010 16:28:00 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 20 Aug 2010 15:00:06 +0200
perf, x86: Fix handle_irq return values
Now that we rely on the number of handled overflows, ensure all handle_irq
implementations actually return the right number.
Signed-off-by: Peter Zijlstra <[email protected]>
Tested-by: Don Zickus <[email protected]>
LKML-Reference: <1282228033.2605.204.camel@laptop>
---
arch/x86/kernel/cpu/perf_event_intel.c | 9 +++++++--
arch/x86/kernel/cpu/perf_event_p4.c | 2 +-
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index d8d86d0..4539b4b 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -713,6 +713,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
struct cpu_hw_events *cpuc;
int bit, loops;
u64 ack, status;
+ int handled = 0;
perf_sample_data_init(&data, 0);
@@ -743,12 +744,16 @@ again:
/*
* PEBS overflow sets bit 62 in the global status register
*/
- if (__test_and_clear_bit(62, (unsigned long *)&status))
+ if (__test_and_clear_bit(62, (unsigned long *)&status)) {
+ handled++;
x86_pmu.drain_pebs(regs);
+ }
for_each_set_bit(bit, (unsigned long *)&status, X86_PMC_IDX_MAX) {
struct perf_event *event = cpuc->events[bit];
+ handled++;
+
if (!test_bit(bit, cpuc->active_mask))
continue;
@@ -772,7 +777,7 @@ again:
done:
intel_pmu_enable_all(0);
- return 1;
+ return handled;
}
static struct event_constraint *
diff --git a/arch/x86/kernel/cpu/perf_event_p4.c b/arch/x86/kernel/cpu/perf_event_p4.c
index febb12c..d470c91 100644
--- a/arch/x86/kernel/cpu/perf_event_p4.c
+++ b/arch/x86/kernel/cpu/perf_event_p4.c
@@ -690,7 +690,7 @@ static int p4_pmu_handle_irq(struct pt_regs *regs)
inc_irq_stat(apic_perf_irqs);
}
- return handled > 0;
+ return handled;
}
/*
I'll test tip later today to see if I can reproduce it.
Cheers,
Don
Ingo Molnar <[email protected]> wrote:
>
>it's not working so well, i'm getting:
>
> Uhhuh. NMI received for unknown reason 00 on CPU 9.
> Do you have a strange power saving mode enabled?
> Dazed and confused, but trying to continue
>
>on a nehalem box, after a perf top and perf stat run.
>
>Thanks,
>
> Ingo
* Don Zickus <[email protected]> wrote:
> I'll test tip later today to see if I can reproduce it.
>
> Cheers,
> Don
>
> Ingo Molnar <[email protected]> wrote:
>
> >
> >it's not working so well, i'm getting:
> >
> > Uhhuh. NMI received for unknown reason 00 on CPU 9.
> > Do you have a strange power saving mode enabled?
> > Dazed and confused, but trying to continue
> >
> >on a nehalem box, after a perf top and perf stat run.
FYI, it does not trigger on an AMD box.
Thanks,
Ingo
On Fri, Aug 20, 2010 at 04:17:03PM +0200, Ingo Molnar wrote:
>
> it's not working so well, i'm getting:
>
> Uhhuh. NMI received for unknown reason 00 on CPU 9.
> Do you have a strange power saving mode enabled?
> Dazed and confused, but trying to continue
>
> on a nehalem box, after a perf top and perf stat run.
>
> Thanks,
>
> Ingo
>
Might this not be a case of the AAN93 erratum, which says the following:
| The processor can be configured to issue a PMI (performance monitor interrupt)
| upon overflow of the IA32_FIXED_CTR0 MSR (309H). A single PMI should be observed on
| overflow of IA32_FIXED_CTR0, however multiple PMIs are observed when this
| erratum occurs.
Just a thought.
-- Cyrill
On Fri, Aug 20, 2010 at 11:05:42AM -0400, Don Zickus wrote:
> I'll test tip later today to see if I can reproduce it.
>
> Cheers,
> Don
Sad to say, that won't happen. Both my amd box and nehalem box have too
many issues with your master branch.
The amd box BUGs in perf_event_nmi_handler on the new code trying to run
'perf top'
arch/x86/kernel/cpu/perf_event.c::perf_event_nmi_handler:1250
((__get_cpu_var(nmi).marked == this_nmi) &&
The BUG is attached below. I can't figure out why.
And my Nehalem box won't even boot with the that kernel, not even to
console for some reason. Then bisecting revealed that in 2.6.35 something
with LVM changed such that the kernel can't mount my RHEL-6 lvm
partitions. So even if I did get that kernel booting it won't mount
disks.
I'll take this as a sign to quit for now.. and try again on Monday. :-)
Cheers,
Don
-----
amd-ma78gm-01.rhts.eng.bos.redhat.com login: BUG: unable to handle kernel
paging request at ffff87ff838a5200
IP: [<ffffffff814a6370>] perf_event_nmi_handler+0xd0/0xe0
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/online
CPU 0
Modules linked in: autofs4 sunrpc cpufreq_ondemand powernow_k8 freq_table
mperf ipv6 dm_mirror dm_region_hash dm_log ppdev parport_pc parport wmi
snd_hda_codec_atihdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec
snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore
snd_page_alloc pcspkr serio_raw edac_core edac_mce_amd sg i2c_piix4 r8169
mii ahci libahci shpchp ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif
firewire_ohci firewire_core crc_itu_t ata_generic pata_acpi pata_atiixp
floppy radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core dm_mod
[last unloaded: scsi_wait_scan]
Pid: 1865, comm: perf Not tainted 2.6.36-rc1tipperf-tip+ #28
GA-MA78GM-S2H/GA-MA78GM-S2H
RIP: 0010:[<ffffffff814a6370>] [<ffffffff814a6370>]
perf_event_nmi_handler+0xd0/0xe0
RSP: 0018:ffff880002407e88 EFLAGS: 00010046
RAX: 0000000000000001 RBX: 000000000000000c RCX: ffffffff814a5200
RDX: ffffffff814a5200 RSI: 0000000000000001 RDI: ffff880002400000
RBP: ffff880002407e98 R08: 0000000000000001 R09: ffff880002407d48
R10: 0000000000000002 R11: 0000000000000000 R12: ffff880002407ef8
R13: 00000000fffffffc R14: 0000000000000000 R15: ffffffff81c1df80
FS: 00007f0220ac9700(0000) GS:ffff880002400000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff87ff838a5200 CR3: 0000000222c31000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process perf (pid: 1865, threadinfo ffff880220a78000, task
ffff88021c64a100)
Stack:
0000000000000000 ffff880002407ef8 ffff880002407ed8 ffffffff814a8505
<0> 0000000000000000 ffff880002407f58 000000000000003d 000000000000003d
<0> ffff88000240ccc0 0000000000000001 ffff880002407ee8 ffffffff814a856a
Call Trace:
<NMI>
[<ffffffff814a8505>] notifier_call_chain+0x55/0x80
[<ffffffff814a856a>] atomic_notifier_call_chain+0x1a/0x20
[<ffffffff814a859e>] notify_die+0x2e/0x30
[<ffffffff814a5963>] do_nmi+0x173/0x2b0
[<ffffffff814a5220>] nmi+0x20/0x30
[<ffffffff810340fa>] ? native_write_msr_safe+0xa/0x10
<<EOE>>
[<ffffffff8101a4b0>] x86_pmu_enable_all+0x60/0x80
[<ffffffff8101b72c>] hw_perf_enable+0xfc/0x230
[<ffffffff810eb1dd>] perf_enable+0x2d/0x40
[<ffffffff810ed76d>] __perf_install_in_context+0xcd/0x190
[<ffffffff810ed6a0>] ? __perf_install_in_context+0x0/0x190
[<ffffffff810953bc>] smp_call_function_single+0x8c/0x160
[<ffffffff810f23a8>] ? find_get_context+0x98/0x2b0
[<ffffffff810edbba>] perf_install_in_context+0x9a/0xa0
[<ffffffff810f3141>] sys_perf_event_open+0x361/0x4f0
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Code: 53 01 00 48 c7 c0 00 52 4a 81 65 48 8b 14 25 38 e3 00 00 3b 0c 02 0f
85 67 ff ff ff b8 01 80 00 00 c9 c3 0f 1f 84 00 00 00 00 00 <83> 3c 0f 01
74 a6 eb e9 00 00 00 00 00 00 00 00 55 48 89 e5 48
RIP [<ffffffff814a6370>] perf_event_nmi_handler+0xd0/0xe0
RSP <ffff880002407e88>
CR2: ffff87ff838a5200
---[ end trace 3ddcb8e2da2c4430 ]---
>
>
> Ingo Molnar <[email protected]> wrote:
>
>
> it's not working so well, i'm getting:
>
> Uhhuh. NMI received for unknown reason 00 on CPU 9.
> Do you have a strange power saving mode enabled?
> Dazed and confused, but trying to continue
>
> on a nehalem box, after a perf top and perf stat run.
>
> Thanks,
>
> Ingo
* Ingo Molnar <[email protected]> wrote:
>
> * Don Zickus <[email protected]> wrote:
>
> > I'll test tip later today to see if I can reproduce it.
> >
> > Cheers,
> > Don
> >
> > Ingo Molnar <[email protected]> wrote:
> >
> > >
> > >it's not working so well, i'm getting:
> > >
> > > Uhhuh. NMI received for unknown reason 00 on CPU 9.
> > > Do you have a strange power saving mode enabled?
> > > Dazed and confused, but trying to continue
> > >
> > >on a nehalem box, after a perf top and perf stat run.
>
> FYI, it does not trigger on an AMD box.
Ok, to not hold up the perf/urgent flow i zapped these two commits for
the time being:
4a31beb: perf, x86: Fix handle_irq return values
8e3e42b: perf, x86: Try to handle unknown nmis with an enabled PMU
We can apply them if they take a form that doesn't introduce a different
kind of (and more visible) regression.
Thanks,
Ingo
On Mon, Aug 23, 2010 at 10:53:39AM +0200, Ingo Molnar wrote:
>
> * Ingo Molnar <[email protected]> wrote:
>
> >
> > * Don Zickus <[email protected]> wrote:
> >
> > > I'll test tip later today to see if I can reproduce it.
> > >
> > > Cheers,
> > > Don
> > >
> > > Ingo Molnar <[email protected]> wrote:
> > >
> > > >
> > > >it's not working so well, i'm getting:
> > > >
> > > > Uhhuh. NMI received for unknown reason 00 on CPU 9.
> > > > Do you have a strange power saving mode enabled?
> > > > Dazed and confused, but trying to continue
> > > >
> > > >on a nehalem box, after a perf top and perf stat run.
> >
> > FYI, it does not trigger on an AMD box.
>
> Ok, to not hold up the perf/urgent flow i zapped these two commits for
> the time being:
>
> 4a31beb: perf, x86: Fix handle_irq return values
> 8e3e42b: perf, x86: Try to handle unknown nmis with an enabled PMU
>
> We can apply them if they take a form that doesn't introduce a different
> kind of (and more visible) regression.
>
> Thanks,
>
> Ingo
>
Btw, guys, I fail to see how the new nmi_watchdog works. We have, in
default_do_nmi:
if (!(reason & 0xc0)) {
if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT) == NOTIFY_STOP)
return
if (nmi_watchdog_tick(regs, reason))
return
but perf_event_nmi_handler returns NOTIFY_STOP when the watchdog is a perf event
and nmi_watchdog_tick is _never_ called, or (most probably) I'm missing something?
-- Cyrill
On 24.08.10 12:22:52, Cyrill Gorcunov wrote:
> Btw, guys, I fail to see how the new nmi_watchdog works. We have, in
> default_do_nmi:
> if (!(reason & 0xc0)) {
> if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT) == NOTIFY_STOP)
> return
> if (nmi_watchdog_tick(regs, reason))
> return
>
> but perf_event_nmi_handler returns NOTIFY_STOP when the watchdog is a perf event
> and nmi_watchdog_tick is _never_ called, or (most probably) I'm missing something?
The watchdog is disabled during profiling (perf and oprofile) by
calling disable_lapic_nmi_watchdog().
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Tue, Aug 24, 2010 at 07:09:36PM +0200, Robert Richter wrote:
> On 24.08.10 12:22:52, Cyrill Gorcunov wrote:
> > Btw, guys, I fail to see how the new nmi_watchdog works. We have, in
> > default_do_nmi:
> > if (!(reason & 0xc0)) {
> > if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT) == NOTIFY_STOP)
> > return
> > if (nmi_watchdog_tick(regs, reason))
> > return
> >
> > but perf_event_nmi_handler returns NOTIFY_STOP when the watchdog is a perf event
> > and nmi_watchdog_tick is _never_ called, or (most probably) I'm missing something?
>
> The watchdog is disabled during profiling (perf and oprofile) by
> calling disable_lapic_nmi_watchdog().
>
> -Robert
>
Huh? iirc Don has switched the nmi watchdog to the native perf subsystem, ie the watchdog
uses the PERF_COUNT_HW_CPU_CYCLES event, let me check...
-- Cyrill
On Tue, Aug 24, 2010 at 09:20:22PM +0400, Cyrill Gorcunov wrote:
> On Tue, Aug 24, 2010 at 07:09:36PM +0200, Robert Richter wrote:
> > On 24.08.10 12:22:52, Cyrill Gorcunov wrote:
> > > Btw, guys, I fail to see how the new nmi_watchdog works. We have, in
> > > default_do_nmi:
> > > if (!(reason & 0xc0)) {
> > > if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT) == NOTIFY_STOP)
> > > return
> > > if (nmi_watchdog_tick(regs, reason))
> > > return
> > >
> > > but perf_event_nmi_handler returns NOTIFY_STOP when the watchdog is a perf event
> > > and nmi_watchdog_tick is _never_ called, or (most probably) I'm missing something?
> >
> > The watchdog is disabled during profiling (perf and oprofile) by
> > calling disable_lapic_nmi_watchdog().
> >
> > -Robert
> >
>
> Huh? iirc Don has switched the nmi watchdog to the native perf subsystem, ie the watchdog
> uses the PERF_COUNT_HW_CPU_CYCLES event, let me check...
>
> -- Cyrill
False alarm, perf watchdog uses own handler. Sorry for noise ;)
-- Cyrill
On 23.08.10 04:53:39, Ingo Molnar wrote:
> Ok, to not hold up the perf/urgent flow i zapped these two commits for
> the time being:
>
> 4a31beb: perf, x86: Fix handle_irq return values
> 8e3e42b: perf, x86: Try to handle unknown nmis with an enabled PMU
Ingo, are those commits in some branch to fetch them for testing? I
did not find them in tip/master.
Thanks,
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Tue, Aug 24, 2010 at 07:15:05PM +0200, Robert Richter wrote:
> On 23.08.10 04:53:39, Ingo Molnar wrote:
> > Ok, to not hold up the perf/urgent flow i zapped these two commits for
> > the time being:
> >
> > 4a31beb: perf, x86: Fix handle_irq return values
> > 8e3e42b: perf, x86: Try to handle unknown nmis with an enabled PMU
>
> Ingo, are those commits in some branch to fetch them for testing? I
> did not find them in tip/master.
>
> Thanks,
>
> -Robert
>
> --
> Advanced Micro Devices, Inc.
> Operating System Research Center
>
Robert, they should be in -tip tree, I have
| commit 4a31bebe71ab2e4f6ecd6e5f9f2ac9f0ff38ff76
| Author: Peter Zijlstra <[email protected]>
| Date: Thu Aug 19 16:28:00 2010 +0200
|
| perf, x86: Fix handle_irq return values
-- Cyrill
On Tue, Aug 24, 2010 at 09:28:06PM +0400, Cyrill Gorcunov wrote:
> On Tue, Aug 24, 2010 at 07:15:05PM +0200, Robert Richter wrote:
> > On 23.08.10 04:53:39, Ingo Molnar wrote:
> > > Ok, to not hold up the perf/urgent flow i zapped these two commits for
> > > the time being:
> > >
> > > 4a31beb: perf, x86: Fix handle_irq return values
> > > 8e3e42b: perf, x86: Try to handle unknown nmis with an enabled PMU
> >
> > Ingo, are those commits in some branch to fetch them for testing? I
> > did not find them in tip/master.
> >
> > Thanks,
> >
> > -Robert
> >
> > --
> > Advanced Micro Devices, Inc.
> > Operating System Research Center
> >
>
> Robert, they should be in -tip tree, I have
>
> | commit 4a31bebe71ab2e4f6ecd6e5f9f2ac9f0ff38ff76
> | Author: Peter Zijlstra <[email protected]>
> | Date: Thu Aug 19 16:28:00 2010 +0200
> |
> | perf, x86: Fix handle_irq return values
Has anyone else seen kernel panics when enabling the LOCKUP_DETECTOR or
apply Robert's patch? I keep getting a panic with unable to handle paging
request everytime either Robert's patch or the lockup detector uses
__get_cpu_var(). And only on 2.6.36 code.
Cheers,
Don
On Tue, Aug 24, 2010 at 02:46:58PM -0400, Don Zickus wrote:
> On Tue, Aug 24, 2010 at 09:28:06PM +0400, Cyrill Gorcunov wrote:
> > On Tue, Aug 24, 2010 at 07:15:05PM +0200, Robert Richter wrote:
> > > On 23.08.10 04:53:39, Ingo Molnar wrote:
> > > > Ok, to not hold up the perf/urgent flow i zapped these two commits for
> > > > the time being:
> > > >
> > > > 4a31beb: perf, x86: Fix handle_irq return values
> > > > 8e3e42b: perf, x86: Try to handle unknown nmis with an enabled PMU
> > >
> > > Ingo, are those commits in some branch to fetch them for testing? I
> > > did not find them in tip/master.
> > >
> > > Thanks,
> > >
> > > -Robert
> > >
> > > --
> > > Advanced Micro Devices, Inc.
> > > Operating System Research Center
> > >
> >
> > Robert, they should be in -tip tree, I have
> >
> > | commit 4a31bebe71ab2e4f6ecd6e5f9f2ac9f0ff38ff76
> > | Author: Peter Zijlstra <[email protected]>
> > | Date: Thu Aug 19 16:28:00 2010 +0200
> > |
> > | perf, x86: Fix handle_irq return values
>
> Has anyone else seen kernel panics when enabling the LOCKUP_DETECTOR or
> apply Robert's patch? I keep getting a panic with unable to handle paging
> request everytime either Robert's patch or the lockup detector uses
> __get_cpu_var(). And only on 2.6.36 code.
>
> Cheers,
> Don
>
I use 2.6.34 atm. Letme try 2.6.36 (which might require some time to recompile).
-- Cyrill
On Tue, Aug 24, 2010 at 10:54:58PM +0400, Cyrill Gorcunov wrote:
...
> > > | commit 4a31bebe71ab2e4f6ecd6e5f9f2ac9f0ff38ff76
> > > | Author: Peter Zijlstra <[email protected]>
> > > | Date: Thu Aug 19 16:28:00 2010 +0200
> > > |
> > > | perf, x86: Fix handle_irq return values
> >
> > Has anyone else seen kernel panics when enabling the LOCKUP_DETECTOR or
> > apply Robert's patch? I keep getting a panic with unable to handle paging
> > request everytime either Robert's patch or the lockup detector uses
> > __get_cpu_var(). And only on 2.6.36 code.
> >
> > Cheers,
> > Don
> >
>
> I use 2.6.34 atm. Letme try 2.6.36 (which might require some time to recompile).
>
> -- Cyrill
Don, for me it fails with a seemingly unrelated page handling fault - in
reiserfs_evict_inode O_o. A failure in __get_cpu_var, I suspect, might mean
a problem either in the per-cpu allocator itself or that we screw up a pointer somehow.
Weird.
-- Cyrill
On Tue, Aug 24, 2010 at 11:52:40PM +0400, Cyrill Gorcunov wrote:
> > I use 2.6.34 atm. Letme try 2.6.36 (which might require some time to recompile).
> >
> > -- Cyrill
>
> Don, for me it fails with a seemingly unrelated page handling fault - in
> reiserfs_evict_inode O_o. A failure in __get_cpu_var, I suspect, might mean
> a problem either in the per-cpu allocator itself or that we screw up a pointer somehow.
> Weird.
I just found out (with the help of the crash utility and Dave A.) that
Robert's percpu struct nmi clashes with the exception entry point .nmi.
I only see this problem in 2.6.36, so I am not sure what changed with
regards to compiler flags to confuse variables with text segments.
But renaming the percpu struct nmi to nmidon fixed the problem for me (I
am open to other suggestions :-) ).
Regarding your reiserfs, what was the variable's name?
Cheers,
Don
On Tue, Aug 24, 2010 at 04:27:18PM -0400, Don Zickus wrote:
> On Tue, Aug 24, 2010 at 11:52:40PM +0400, Cyrill Gorcunov wrote:
> > > I use 2.6.34 atm. Letme try 2.6.36 (which might require some time to recompile).
> > >
> > > -- Cyrill
> >
> > Don, for me it fails with a seemingly unrelated page handling fault - in
> > reiserfs_evict_inode O_o. A failure in __get_cpu_var, I suspect, might mean
> > a problem either in the per-cpu allocator itself or that we screw up a pointer somehow.
> > Weird.
>
> I just found out (with the help of the crash utility and Dave A.) that
> Robert's percpu struct nmi clashes with the exception entry point .nmi.
> I only see this problem in 2.6.36, so I am not sure what changed with
> regards to compiler flags to confuse variables with text segments.
>
yeah, I suspected a name clash here, but then I did grep over the per-cpu variables
in the whole kernel and didn't find a match, so I thought the assumption was wrong;
eventually it turned out to be true, just via another way :) good to know, thanks!
> But renaming the percpu struct nmi to nmidon fixed the problem for me (I
> am open to other suggestions :-) ).
nmi_don_zickus ;) well, I think nmi_pmu or something like that
might be a bit modest ;)
>
> Regarding your reiserfs, what was the variable's name?
>
It seems to be different; it's a pity that I had only 80x25 vga
mode and was unable to snap the whole log. But actually I didn't
even check precisely all the .config options I had set, since I was
more interested in the early stage where per-cpu access should already
happen rather than a fully init'ed environment. But I think I'll be
moving completely to .36 this week, so we will see how it goes.
> Cheers,
> Don
>
-- Cyrill
On Fri, Aug 20, 2010 at 04:17:03PM +0200, Ingo Molnar wrote:
>
> it's not working so well, i'm getting:
>
> Uhhuh. NMI received for unknown reason 00 on CPU 9.
> Do you have a strange power saving mode enabled?
> Dazed and confused, but trying to continue
>
> on a nehalem box, after a perf top and perf stat run.
After applying the patch below, I ran the following commands on my nehalem
box without reproducing what you are seeing. Anything else I can run
that might trigger it? (I also ran them on an amd phenom quad-core box).
I used 2.6.32-rc2 plus Robert's and 2 of PeterZ's patches.
perf top
perf stat -a -e cycles -e instructions -e cache-references -e cache-misses -e branch-misses -- sleep 5
perf record -f -a -e cycles -e instructions -e cache-references -e cache-misses -e branch-misses -- sleep 5
Cheers,
Don
From 198be1044fa603bc9582a5c19134fdf9a433fff0 Mon Sep 17 00:00:00 2001
From: Don Zickus <[email protected]>
Date: Tue, 24 Aug 2010 17:43:17 -0400
Subject: [PATCH] [x86] perf: rename nmi variable to avoid clash with entry point
There is already an entry point named .nmi in entry.S and that seems to clash
with the per_cpu variable nmi defined in commit f3a860d8. Renaming this
variable avoids the namespace collision.
Signed-off-by: Don Zickus <[email protected]>
---
arch/x86/kernel/cpu/perf_event.c | 12 ++++++------
1 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index dd2fceb..2a05ea4 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1205,7 +1205,7 @@ struct pmu_nmi_state {
int handled;
};
-static DEFINE_PER_CPU(struct pmu_nmi_state, nmi);
+static DEFINE_PER_CPU_PAGE_ALIGNED(struct pmu_nmi_state, pmu_nmi);
static int __kprobes
perf_event_nmi_handler(struct notifier_block *self,
@@ -1224,7 +1224,7 @@ perf_event_nmi_handler(struct notifier_block *self,
break;
case DIE_NMIUNKNOWN:
this_nmi = percpu_read(irq_stat.__nmi_count);
- if (this_nmi != __get_cpu_var(nmi).marked)
+ if (this_nmi != __get_cpu_var(pmu_nmi).marked)
/* let the kernel handle the unknown nmi */
return NOTIFY_DONE;
/*
@@ -1248,8 +1248,8 @@ perf_event_nmi_handler(struct notifier_block *self,
this_nmi = percpu_read(irq_stat.__nmi_count);
if ((handled > 1) ||
/* the next nmi could be a back-to-back nmi */
- ((__get_cpu_var(nmi).marked == this_nmi) &&
- (__get_cpu_var(nmi).handled > 1))) {
+ ((__get_cpu_var(pmu_nmi).marked == this_nmi) &&
+ (__get_cpu_var(pmu_nmi).handled > 1))) {
/*
* We could have two subsequent back-to-back nmis: The
* first handles more than one counter, the 2nd
@@ -1260,8 +1260,8 @@ perf_event_nmi_handler(struct notifier_block *self,
* handling more than one counter. We will mark the
* next (3rd) and then drop it if unhandled.
*/
- __get_cpu_var(nmi).marked = this_nmi + 1;
- __get_cpu_var(nmi).handled = handled;
+ __get_cpu_var(pmu_nmi).marked = this_nmi + 1;
+ __get_cpu_var(pmu_nmi).handled = handled;
}
return NOTIFY_STOP;
--
1.7.2.1
On 20.08.10 11:25:10, Ingo Molnar wrote:
> > Ingo Molnar <[email protected]> wrote:
> >
> > >
> > >it's not working so well, i'm getting:
> > >
> > > Uhhuh. NMI received for unknown reason 00 on CPU 9.
> > > Do you have a strange power saving mode enabled?
> > > Dazed and confused, but trying to continue
> > >
> > >on a nehalem box, after a perf top and perf stat run.
>
> FYI, it does not trigger on an AMD box.
Ingo,
do you mean it does not trigger false positives on AMD? Both patches
applied on top of current tip/perf/urgent (c6db67c) are working on the
systems I have.
You might use the debug patch below for diagnostics.
-Robert
--
From 1bbb5aa64e96360529c34a593a072e1a84114f04 Mon Sep 17 00:00:00 2001
From: Robert Richter <[email protected]>
Date: Wed, 11 Aug 2010 18:14:00 +0200
Subject: [PATCH] debug
Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/kernel/cpu/perf_event.c | 54 ++++++++++++++++++++++++++++++++++++-
1 files changed, 52 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index dd2fceb..059ef09 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1203,10 +1203,43 @@ void perf_events_lapic_init(void)
struct pmu_nmi_state {
unsigned int marked;
int handled;
+ u64 timestamp;
};
static DEFINE_PER_CPU(struct pmu_nmi_state, nmi);
+struct nmi_debug {
+ int cpu;
+ unsigned int this_nmi;
+ unsigned int marked;
+ int handled;
+ u64 timestamp;
+ u64 delta;
+};
+
+static DEFINE_PER_CPU(struct nmi_debug[16], nmi_debug);
+
+static void nmi_handler_debug(void)
+{
+ struct nmi_debug *debug;
+ int i;
+
+ if (!printk_ratelimit())
+ return;
+
+ for (i = 0; i < 16; i++) {
+ debug = &__get_cpu_var(nmi_debug)[i];
+ printk(KERN_EMERG
+ "cpu #%d, nmi #%d, marked #%d, handled = %d, time = %llu, delta = %llu\n",
+ debug->cpu,
+ debug->this_nmi,
+ debug->marked,
+ debug->handled,
+ debug->timestamp,
+ debug->delta);
+ }
+}
+
static int __kprobes
perf_event_nmi_handler(struct notifier_block *self,
unsigned long cmd, void *__args)
@@ -1214,6 +1247,8 @@ perf_event_nmi_handler(struct notifier_block *self,
struct die_args *args = __args;
unsigned int this_nmi;
int handled;
+ struct nmi_debug *debug;
+ u64 timestamp;
if (!atomic_read(&active_events))
return NOTIFY_DONE;
@@ -1224,9 +1259,11 @@ perf_event_nmi_handler(struct notifier_block *self,
break;
case DIE_NMIUNKNOWN:
this_nmi = percpu_read(irq_stat.__nmi_count);
- if (this_nmi != __get_cpu_var(nmi).marked)
+ if (this_nmi != __get_cpu_var(nmi).marked) {
+ nmi_handler_debug();
/* let the kernel handle the unknown nmi */
return NOTIFY_DONE;
+ }
/*
* This one is a PMU back-to-back nmi. Two events
* trigger 'simultaneously' raising two back-to-back
@@ -1242,10 +1279,21 @@ perf_event_nmi_handler(struct notifier_block *self,
apic_write(APIC_LVTPC, APIC_DM_NMI);
handled = x86_pmu.handle_irq(args->regs);
+ this_nmi = percpu_read(irq_stat.__nmi_count);
+
+ debug = &__get_cpu_var(nmi_debug)[0xf & this_nmi];
+ debug->cpu = smp_processor_id();
+ debug->this_nmi = this_nmi;
+ debug->marked = __get_cpu_var(nmi).marked;
+ debug->handled = handled;
+ rdtscll(timestamp);
+ debug->delta = timestamp - __get_cpu_var(nmi).timestamp;
+ __get_cpu_var(nmi).timestamp = timestamp;
+ debug->timestamp = timestamp;
+
if (!handled)
return NOTIFY_DONE;
- this_nmi = percpu_read(irq_stat.__nmi_count);
if ((handled > 1) ||
/* the next nmi could be a back-to-back nmi */
((__get_cpu_var(nmi).marked == this_nmi) &&
@@ -1262,6 +1310,8 @@ perf_event_nmi_handler(struct notifier_block *self,
*/
__get_cpu_var(nmi).marked = this_nmi + 1;
__get_cpu_var(nmi).handled = handled;
+ debug->marked = __get_cpu_var(nmi).marked;
+ debug->handled = handled;
}
return NOTIFY_STOP;
--
1.7.1.1
--
Advanced Micro Devices, Inc.
Operating System Research Center
On 24.08.10 16:27:18, Don Zickus wrote:
> On Tue, Aug 24, 2010 at 11:52:40PM +0400, Cyrill Gorcunov wrote:
> > > I use 2.6.34 atm. Letme try 2.6.36 (which might require some time to recompile).
> > >
> > > -- Cyrill
> >
> > Don, for me it fails with a seemingly unrelated page handling fault - in
> > reiserfs_evict_inode O_o. A failure in __get_cpu_var, I suspect, might mean
> > a problem either in the per-cpu allocator itself or that we screw up a pointer somehow.
> > Weird.
>
> I just found out (with the help of the crash utility and Dave A.) that
> Robert's percpu struct nmi clashes with the exception entry point .nmi.
> I only see this problem in 2.6.36, so I am not sure what changed with
> regards to compiler flags to confuse variables with text segments.
I was testing the patches also with CONFIG_LOCKUP_DETECTOR=y without
crashes.
>
> But renaming the percpu struct nmi to nmidon fixed the problem for me (I
> am open to other suggestions :-) ).
Also I was looking at the differences between .35 and tip/perf/urgent
and could not find a namespace collision.
Hmm...
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
* Robert Richter <[email protected]> wrote:
> On 20.08.10 11:25:10, Ingo Molnar wrote:
>
> > > Ingo Molnar <[email protected]> wrote:
> > >
> > > >
> > > >it's not working so well, i'm getting:
> > > >
> > > > Uhhuh. NMI received for unknown reason 00 on CPU 9.
> > > > Do you have a strange power saving mode enabled?
> > > > Dazed and confused, but trying to continue
> > > >
> > > >on a nehalem box, after a perf top and perf stat run.
> >
> > FYI, it does not trigger on an AMD box.
>
> Ingo,
>
> do you mean it does not trigger false positives on AMD? [...]
Correct, on AMD boxes i do not get this message during the first 'perf
top' run:
> > > > Uhhuh. NMI received for unknown reason 00 on CPU 9.
> > > > Do you have a strange power saving mode enabled?
> > > > Dazed and confused, but trying to continue
Before the two patches i did not get these messages at all, on any of
the systems.
> You might use the debug patch below for diagnostics.
Thanks, will try that and report back.
Ingo
* Ingo Molnar <[email protected]> wrote:
> > You might use the debug patch below for diagnostics.
>
> Thanks, will try that and report back.
Here's a more detailed description of the regression introduced by:
4a31beb: perf, x86: Fix handle_irq return values
8e3e42b: perf, x86: Try to handle unknown nmis with an enabled PMU
Booting into the debug kernel the system boots up fine - no NMI
messages, as expected.
Then when i start 'perf top' for the first time i get the NMI message
with this debug output:
cpu #15, nmi #160, marked #0, handled = 1, time = 333392635730, delta = 11238255
cpu #15, nmi #161, marked #0, handled = 1, time = 333403779380, delta = 11143650
cpu #15, nmi #162, marked #0, handled = 1, time = 333415418497, delta = 11639117
cpu #15, nmi #163, marked #0, handled = 1, time = 333415467084, delta = 48587
cpu #15, nmi #164, marked #0, handled = 1, time = 333415501531, delta = 34447
cpu #15, nmi #165, marked #0, handled = 1, time = 333459918106, delta = 44416575
cpu #15, nmi #166, marked #0, handled = 0, time = 333459923167, delta = 1666
cpu #15, nmi #151, marked #0, handled = 1, time = 332978597882, delta = 11447002
cpu #15, nmi #152, marked #0, handled = 1, time = 332978657151, delta = 59269
cpu #15, nmi #153, marked #0, handled = 1, time = 332978667847, delta = 10696
cpu #15, nmi #154, marked #0, handled = 1, time = 333023125757, delta = 44457910
cpu #15, nmi #155, marked #0, handled = 1, time = 333291980833, delta = 268855076
cpu #15, nmi #156, marked #0, handled = 1, time = 333325663125, delta = 33682292
cpu #15, nmi #157, marked #0, handled = 1, time = 333348216481, delta = 22553356
cpu #15, nmi #158, marked #0, handled = 1, time = 333370168887, delta = 21952406
cpu #15, nmi #159, marked #0, handled = 1, time = 333381397475, delta = 11228588
Uhhuh. NMI received for unknown reason 00 on CPU 15.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue
When i start perf top for a second time, no messages are printed at all.
The reason is that on one of the CPUs NMIs are 'stuck':
NMI: 78164 67099 6342 [*] 65677 66119 63796 65395 63995 65012 64151 65082
63483 64948 62926 65608 62630
CPU#2 is stuck at 6342.
The NMIs work fine on other CPUs and perf top works (sans the missing
samples from CPU#2), and the NMIs keep ticking.
The CPU is:
processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X55600 @ 2.80GHz
stepping : 5
cpu MHz : 2794.000
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 1
cpu cores : 4
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips : 5599.98
clflush size : 64
cache_alignment: 64
address sizes : 40 bits physical, 48 bits virtual
power management:
The PMU init is:
Performance Events: PEBS fmt1+, Nehalem events, Intel PMU driver.
... version: 3
... bit width: 48
... generic registers: 4
... value mask: 0000ffffffffffff
... max period: 000000007fffffff
... fixed-purpose events: 3
... event mask: 000000070000000f
I've attached the config as well.
Thanks,
Ingo
On 25.08.10 06:41:30, Ingo Molnar wrote:
> > > > > Uhhuh. NMI received for unknown reason 00 on CPU 9.
> > > > > Do you have a strange power saving mode enabled?
> > > > > Dazed and confused, but trying to continue
>
> Before the two patches i did not get these messages at all, on any of
> the systems.
Yes, because all were eaten by NOTIFY_STOP, and that's why we also
lose NMIs from other sources, like the nmi button. The patches try to
fix this.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
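Robert's point about losing NMIs from other sources can be seen with a small standalone model of the notifier-chain semantics; the handlers and the chain itself are stand-ins, not the kernel's actual die chain:
#include <stdio.h>

#define NOTIFY_DONE 0	/* not mine, keep walking the chain          */
#define NOTIFY_STOP 1	/* claimed, do not call any further handlers */

static int perf_handler(void)
{
	/* old behaviour: claim every NMI while counters are active */
	return NOTIFY_STOP;
}

static int nmi_button_handler(void)
{
	printf("NMI button event seen\n");
	return NOTIFY_STOP;
}

int main(void)
{
	int (*chain[])(void) = { perf_handler, nmi_button_handler };
	unsigned int i;

	for (i = 0; i < sizeof(chain) / sizeof(chain[0]); i++) {
		if (chain[i]() == NOTIFY_STOP)
			break;	/* the button handler is never reached */
	}
	return 0;
}
With the patches, the perf handler returns NOTIFY_DONE when it handled nothing, so later handlers on the chain get their turn.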
* Robert Richter <[email protected]> wrote:
> On 25.08.10 06:41:30, Ingo Molnar wrote:
>
> > > > > > Uhhuh. NMI received for unknown reason 00 on CPU 9.
> > > > > > Do you have a strange power saving mode enabled?
> > > > > > Dazed and confused, but trying to continue
> >
> > Before the two patches i did not get these messages at all, on any
> > of the systems.
>
> Yes, because all were eaten by NOTIFY_STOP, and that's why we also
> lose NMIs from other sources, like the nmi button. The patches try to
> fix this.
I understand that, but obviously the patches need to fix the lack of
messages without introducing new regressions, such as messages not seen
before and CPUs with stuck NMIs.
Thanks,
Ingo
On Wed, Aug 25, 2010 at 01:00:06PM +0200, Ingo Molnar wrote:
>
> * Ingo Molnar <[email protected]> wrote:
>
> > > You might use the debug patch below for diagnostics.
> >
> > Thanks, will try that and report back.
>
> Here's a more detailed description of the regression introduced by:
>
> 4a31beb: perf, x86: Fix handle_irq return values
> 8e3e42b: perf, x86: Try to handle unknown nmis with an enabled PMU
>
> Booting into the debug kernel the system boots up fine - no NMI
> messages, as expected.
>
> Then when i start 'perf top' for the first time i get the NMI message
> with this debug output:
>
> cpu #15, nmi #160, marked #0, handled = 1, time = 333392635730, delta = 11238255
> cpu #15, nmi #161, marked #0, handled = 1, time = 333403779380, delta = 11143650
> cpu #15, nmi #162, marked #0, handled = 1, time = 333415418497, delta = 11639117
> cpu #15, nmi #163, marked #0, handled = 1, time = 333415467084, delta = 48587
> cpu #15, nmi #164, marked #0, handled = 1, time = 333415501531, delta = 34447
> cpu #15, nmi #165, marked #0, handled = 1, time = 333459918106, delta = 44416575
> cpu #15, nmi #166, marked #0, handled = 0, time = 333459923167, delta = 1666
> cpu #15, nmi #151, marked #0, handled = 1, time = 332978597882, delta = 11447002
> cpu #15, nmi #152, marked #0, handled = 1, time = 332978657151, delta = 59269
> cpu #15, nmi #153, marked #0, handled = 1, time = 332978667847, delta = 10696
> cpu #15, nmi #154, marked #0, handled = 1, time = 333023125757, delta = 44457910
> cpu #15, nmi #155, marked #0, handled = 1, time = 333291980833, delta = 268855076
> cpu #15, nmi #156, marked #0, handled = 1, time = 333325663125, delta = 33682292
> cpu #15, nmi #157, marked #0, handled = 1, time = 333348216481, delta = 22553356
> cpu #15, nmi #158, marked #0, handled = 1, time = 333370168887, delta = 21952406
> cpu #15, nmi #159, marked #0, handled = 1, time = 333381397475, delta = 11228588
> Uhhuh. NMI received for unknown reason 00 on CPU 15.
> Do you have a strange power saving mode enabled?
> Dazed and confused, but trying to continue
So I found a Nehalem box that can reliably reproduce Ingo's problem using
something as simple as 'perf top'. But like above, I am noticing the
same thing, an extra NMI (PMI?) that comes out of nowhere.
Looking at the data above, the delta between nmis is very small compared to
the other nmis. It almost suggests that this is an extra PMI.
Considering there are already two cpu errata discussing extra PMIs under
certain configurations, I wouldn't be surprised if this was a third.
Cheers,
Don
>
> When i start perf top for a second time, no messages are printed at all.
> The reason is that on one of the CPUs NMIs are 'stuck':
>
> NMI: 78164 67099 6342 [*] 65677 66119 63796 65395 63995 65012 64151 65082
> 63483 64948 62926 65608 62630
>
> CPU#2 is stuck at 6342.
>
> The NMIs work fine on other CPUs and perf top works (sans the missing
> samples from CPU#2), and the NMIs keep ticking.
>
> The CPU is:
>
> processor : 2
> vendor_id : GenuineIntel
> cpu family : 6
> model : 26
> model name : Intel(R) Xeon(R) CPU X55600 @ 2.80GHz
> stepping : 5
> cpu MHz : 2794.000
> cache size : 8192 KB
> physical id : 0
> siblings : 8
> core id : 1
> cpu cores : 4
> apicid : 2
> initial apicid : 2
> fpu : yes
> fpu_exception : yes
> cpuid level : 11
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
> bogomips : 5599.98
> clflush size : 64
> cache_alignment: 64
> address sizes : 40 bits physical, 48 bits virtual
> power management:
>
> The PMU init is:
>
> Performance Events: PEBS fmt1+, Nehalem events, Intel PMU driver.
> ... version: 3
> ... bit width: 48
> ... generic registers: 4
> ... value mask: 0000ffffffffffff
> ... max period: 000000007fffffff
> ... fixed-purpose events: 3
> ... event mask: 000000070000000f
>
> I've attached the config as well.
>
> Thanks,
>
> Ingo
> #
> # Automatically generated make config: don't edit
> # Linux kernel version: 2.6.36-rc2
> # Thu Aug 26 07:41:31 2010
> #
> CONFIG_64BIT=y
> # CONFIG_X86_32 is not set
> CONFIG_X86_64=y
> CONFIG_X86=y
> CONFIG_INSTRUCTION_DECODER=y
> CONFIG_OUTPUT_FORMAT="elf64-x86-64"
> CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
> CONFIG_GENERIC_CMOS_UPDATE=y
> CONFIG_CLOCKSOURCE_WATCHDOG=y
> CONFIG_GENERIC_CLOCKEVENTS=y
> CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
> CONFIG_LOCKDEP_SUPPORT=y
> CONFIG_STACKTRACE_SUPPORT=y
> CONFIG_HAVE_LATENCYTOP_SUPPORT=y
> CONFIG_MMU=y
> CONFIG_ZONE_DMA=y
> CONFIG_NEED_DMA_MAP_STATE=y
> CONFIG_NEED_SG_DMA_LENGTH=y
> CONFIG_GENERIC_ISA_DMA=y
> CONFIG_GENERIC_IOMAP=y
> CONFIG_GENERIC_BUG=y
> CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
> CONFIG_GENERIC_HWEIGHT=y
> CONFIG_ARCH_MAY_HAVE_PC_FDC=y
> # CONFIG_RWSEM_GENERIC_SPINLOCK is not set
> CONFIG_RWSEM_XCHGADD_ALGORITHM=y
> CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
> CONFIG_GENERIC_CALIBRATE_DELAY=y
> CONFIG_GENERIC_TIME_VSYSCALL=y
> CONFIG_ARCH_HAS_CPU_RELAX=y
> CONFIG_ARCH_HAS_DEFAULT_IDLE=y
> CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
> CONFIG_HAVE_SETUP_PER_CPU_AREA=y
> CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
> CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
> CONFIG_HAVE_CPUMASK_OF_CPU_MAP=y
> CONFIG_ARCH_HIBERNATION_POSSIBLE=y
> CONFIG_ARCH_SUSPEND_POSSIBLE=y
> CONFIG_ZONE_DMA32=y
> CONFIG_ARCH_POPULATES_NODE_MAP=y
> CONFIG_AUDIT_ARCH=y
> CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
> CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
> CONFIG_HAVE_EARLY_RES=y
> CONFIG_GENERIC_HARDIRQS=y
> CONFIG_GENERIC_HARDIRQS_NO__DO_IRQ=y
> CONFIG_GENERIC_IRQ_PROBE=y
> CONFIG_GENERIC_PENDING_IRQ=y
> CONFIG_USE_GENERIC_SMP_HELPERS=y
> CONFIG_X86_64_SMP=y
> CONFIG_X86_HT=y
> CONFIG_X86_TRAMPOLINE=y
> CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11"
> # CONFIG_KTIME_SCALAR is not set
> CONFIG_ARCH_CPU_PROBE_RELEASE=y
> CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
> CONFIG_CONSTRUCTORS=y
>
> #
> # General setup
> #
> CONFIG_EXPERIMENTAL=y
> CONFIG_LOCK_KERNEL=y
> CONFIG_INIT_ENV_ARG_LIMIT=32
> CONFIG_CROSS_COMPILE=""
> CONFIG_LOCALVERSION=""
> CONFIG_LOCALVERSION_AUTO=y
> CONFIG_HAVE_KERNEL_GZIP=y
> CONFIG_HAVE_KERNEL_BZIP2=y
> CONFIG_HAVE_KERNEL_LZMA=y
> CONFIG_HAVE_KERNEL_LZO=y
> CONFIG_KERNEL_GZIP=y
> # CONFIG_KERNEL_BZIP2 is not set
> # CONFIG_KERNEL_LZMA is not set
> # CONFIG_KERNEL_LZO is not set
> CONFIG_SWAP=y
> CONFIG_SYSVIPC=y
> CONFIG_SYSVIPC_SYSCTL=y
> CONFIG_POSIX_MQUEUE=y
> CONFIG_POSIX_MQUEUE_SYSCTL=y
> CONFIG_BSD_PROCESS_ACCT=y
> # CONFIG_BSD_PROCESS_ACCT_V3 is not set
> CONFIG_TASKSTATS=y
> CONFIG_TASK_DELAY_ACCT=y
> # CONFIG_TASK_XACCT is not set
> CONFIG_AUDIT=y
> CONFIG_AUDITSYSCALL=y
> CONFIG_AUDIT_WATCH=y
> CONFIG_AUDIT_TREE=y
>
> #
> # RCU Subsystem
> #
> CONFIG_TREE_RCU=y
> # CONFIG_PREEMPT_RCU is not set
> # CONFIG_RCU_TRACE is not set
> CONFIG_RCU_FANOUT=64
> # CONFIG_RCU_FANOUT_EXACT is not set
> # CONFIG_TREE_RCU_TRACE is not set
> # CONFIG_IKCONFIG is not set
> CONFIG_LOG_BUF_SHIFT=20
> CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
> # CONFIG_CGROUPS is not set
> CONFIG_SYSFS_DEPRECATED=y
> CONFIG_SYSFS_DEPRECATED_V2=y
> CONFIG_RELAY=y
> CONFIG_NAMESPACES=y
> # CONFIG_UTS_NS is not set
> # CONFIG_IPC_NS is not set
> # CONFIG_USER_NS is not set
> # CONFIG_PID_NS is not set
> # CONFIG_NET_NS is not set
> CONFIG_BLK_DEV_INITRD=y
> CONFIG_INITRAMFS_SOURCE=""
> CONFIG_RD_GZIP=y
> CONFIG_RD_BZIP2=y
> CONFIG_RD_LZMA=y
> CONFIG_RD_LZO=y
> CONFIG_CC_OPTIMIZE_FOR_SIZE=y
> CONFIG_SYSCTL=y
> CONFIG_ANON_INODES=y
> # CONFIG_EMBEDDED is not set
> CONFIG_UID16=y
> CONFIG_SYSCTL_SYSCALL=y
> CONFIG_KALLSYMS=y
> CONFIG_KALLSYMS_ALL=y
> CONFIG_KALLSYMS_EXTRA_PASS=y
> CONFIG_HOTPLUG=y
> CONFIG_PRINTK=y
> CONFIG_BUG=y
> CONFIG_ELF_CORE=y
> CONFIG_PCSPKR_PLATFORM=y
> CONFIG_BASE_FULL=y
> CONFIG_FUTEX=y
> CONFIG_EPOLL=y
> CONFIG_SIGNALFD=y
> CONFIG_TIMERFD=y
> CONFIG_EVENTFD=y
> CONFIG_SHMEM=y
> CONFIG_AIO=y
> CONFIG_HAVE_PERF_EVENTS=y
>
> #
> # Kernel Performance Events And Counters
> #
> CONFIG_PERF_EVENTS=y
> CONFIG_PERF_COUNTERS=y
> # CONFIG_DEBUG_PERF_USE_VMALLOC is not set
> CONFIG_VM_EVENT_COUNTERS=y
> CONFIG_PCI_QUIRKS=y
> CONFIG_COMPAT_BRK=y
> CONFIG_SLAB=y
> # CONFIG_SLUB is not set
> CONFIG_PROFILING=y
> CONFIG_TRACEPOINTS=y
> CONFIG_OPROFILE=m
> # CONFIG_OPROFILE_EVENT_MULTIPLEX is not set
> CONFIG_HAVE_OPROFILE=y
> CONFIG_KPROBES=y
> CONFIG_OPTPROBES=y
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
> CONFIG_KRETPROBES=y
> CONFIG_USER_RETURN_NOTIFIER=y
> CONFIG_HAVE_IOREMAP_PROT=y
> CONFIG_HAVE_KPROBES=y
> CONFIG_HAVE_KRETPROBES=y
> CONFIG_HAVE_OPTPROBES=y
> CONFIG_HAVE_ARCH_TRACEHOOK=y
> CONFIG_HAVE_DMA_ATTRS=y
> CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
> CONFIG_HAVE_DMA_API_DEBUG=y
> CONFIG_HAVE_HW_BREAKPOINT=y
> CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
> CONFIG_HAVE_USER_RETURN_NOTIFIER=y
> CONFIG_HAVE_PERF_EVENTS_NMI=y
>
> #
> # GCOV-based kernel profiling
> #
> # CONFIG_HAVE_GENERIC_DMA_COHERENT is not set
> CONFIG_SLABINFO=y
> CONFIG_RT_MUTEXES=y
> CONFIG_BASE_SMALL=0
> CONFIG_MODULES=y
> # CONFIG_MODULE_FORCE_LOAD is not set
> CONFIG_MODULE_UNLOAD=y
> # CONFIG_MODULE_FORCE_UNLOAD is not set
> CONFIG_MODVERSIONS=y
> CONFIG_MODULE_SRCVERSION_ALL=y
> CONFIG_STOP_MACHINE=y
> CONFIG_BLOCK=y
> # CONFIG_BLK_DEV_BSG is not set
> # CONFIG_BLK_DEV_INTEGRITY is not set
> CONFIG_BLOCK_COMPAT=y
>
> #
> # IO Schedulers
> #
> CONFIG_IOSCHED_NOOP=y
> CONFIG_IOSCHED_DEADLINE=y
> CONFIG_IOSCHED_CFQ=y
> # CONFIG_DEFAULT_DEADLINE is not set
> CONFIG_DEFAULT_CFQ=y
> # CONFIG_DEFAULT_NOOP is not set
> CONFIG_DEFAULT_IOSCHED="cfq"
> CONFIG_PREEMPT_NOTIFIERS=y
> # CONFIG_INLINE_SPIN_TRYLOCK is not set
> # CONFIG_INLINE_SPIN_TRYLOCK_BH is not set
> # CONFIG_INLINE_SPIN_LOCK is not set
> # CONFIG_INLINE_SPIN_LOCK_BH is not set
> # CONFIG_INLINE_SPIN_LOCK_IRQ is not set
> # CONFIG_INLINE_SPIN_LOCK_IRQSAVE is not set
> CONFIG_INLINE_SPIN_UNLOCK=y
> # CONFIG_INLINE_SPIN_UNLOCK_BH is not set
> CONFIG_INLINE_SPIN_UNLOCK_IRQ=y
> # CONFIG_INLINE_SPIN_UNLOCK_IRQRESTORE is not set
> # CONFIG_INLINE_READ_TRYLOCK is not set
> # CONFIG_INLINE_READ_LOCK is not set
> # CONFIG_INLINE_READ_LOCK_BH is not set
> # CONFIG_INLINE_READ_LOCK_IRQ is not set
> # CONFIG_INLINE_READ_LOCK_IRQSAVE is not set
> CONFIG_INLINE_READ_UNLOCK=y
> # CONFIG_INLINE_READ_UNLOCK_BH is not set
> CONFIG_INLINE_READ_UNLOCK_IRQ=y
> # CONFIG_INLINE_READ_UNLOCK_IRQRESTORE is not set
> # CONFIG_INLINE_WRITE_TRYLOCK is not set
> # CONFIG_INLINE_WRITE_LOCK is not set
> # CONFIG_INLINE_WRITE_LOCK_BH is not set
> # CONFIG_INLINE_WRITE_LOCK_IRQ is not set
> # CONFIG_INLINE_WRITE_LOCK_IRQSAVE is not set
> CONFIG_INLINE_WRITE_UNLOCK=y
> # CONFIG_INLINE_WRITE_UNLOCK_BH is not set
> CONFIG_INLINE_WRITE_UNLOCK_IRQ=y
> # CONFIG_INLINE_WRITE_UNLOCK_IRQRESTORE is not set
> CONFIG_MUTEX_SPIN_ON_OWNER=y
> CONFIG_FREEZER=y
>
> #
> # Processor type and features
> #
> # CONFIG_NO_HZ is not set
> # CONFIG_HIGH_RES_TIMERS is not set
> CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
> CONFIG_SMP=y
> # CONFIG_SPARSE_IRQ is not set
> CONFIG_X86_MPPARSE=y
> CONFIG_X86_EXTENDED_PLATFORM=y
> # CONFIG_X86_VSMP is not set
> CONFIG_X86_SUPPORTS_MEMORY_FAILURE=y
> CONFIG_SCHED_OMIT_FRAME_POINTER=y
> # CONFIG_PARAVIRT_GUEST is not set
> CONFIG_NO_BOOTMEM=y
> # CONFIG_MEMTEST is not set
> # CONFIG_MK8 is not set
> # CONFIG_MPSC is not set
> # CONFIG_MCORE2 is not set
> # CONFIG_MATOM is not set
> CONFIG_GENERIC_CPU=y
> CONFIG_X86_CPU=y
> CONFIG_X86_INTERNODE_CACHE_SHIFT=7
> CONFIG_X86_CMPXCHG=y
> CONFIG_X86_L1_CACHE_SHIFT=6
> CONFIG_X86_XADD=y
> CONFIG_X86_WP_WORKS_OK=y
> CONFIG_X86_TSC=y
> CONFIG_X86_CMPXCHG64=y
> CONFIG_X86_CMOV=y
> CONFIG_X86_MINIMUM_CPU_FAMILY=64
> CONFIG_X86_DEBUGCTLMSR=y
> CONFIG_CPU_SUP_INTEL=y
> CONFIG_CPU_SUP_AMD=y
> CONFIG_CPU_SUP_CENTAUR=y
> CONFIG_HPET_TIMER=y
> CONFIG_HPET_EMULATE_RTC=y
> CONFIG_DMI=y
> CONFIG_GART_IOMMU=y
> CONFIG_CALGARY_IOMMU=y
> CONFIG_CALGARY_IOMMU_ENABLED_BY_DEFAULT=y
> # CONFIG_AMD_IOMMU is not set
> CONFIG_SWIOTLB=y
> CONFIG_IOMMU_HELPER=y
> # CONFIG_IOMMU_API is not set
> # CONFIG_MAXSMP is not set
> CONFIG_NR_CPUS=255
> CONFIG_SCHED_SMT=y
> CONFIG_SCHED_MC=y
> CONFIG_PREEMPT_NONE=y
> # CONFIG_PREEMPT_VOLUNTARY is not set
> # CONFIG_PREEMPT is not set
> CONFIG_X86_LOCAL_APIC=y
> CONFIG_X86_IO_APIC=y
> CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y
> CONFIG_X86_MCE=y
> CONFIG_X86_MCE_INTEL=y
> CONFIG_X86_MCE_AMD=y
> CONFIG_X86_MCE_THRESHOLD=y
> # CONFIG_X86_MCE_INJECT is not set
> CONFIG_X86_THERMAL_VECTOR=y
> # CONFIG_I8K is not set
> CONFIG_MICROCODE=m
> CONFIG_MICROCODE_INTEL=y
> # CONFIG_MICROCODE_AMD is not set
> CONFIG_MICROCODE_OLD_INTERFACE=y
> CONFIG_X86_MSR=y
> CONFIG_X86_CPUID=y
> CONFIG_ARCH_PHYS_ADDR_T_64BIT=y
> CONFIG_DIRECT_GBPAGES=y
> CONFIG_NUMA=y
> CONFIG_K8_NUMA=y
> CONFIG_X86_64_ACPI_NUMA=y
> CONFIG_NODES_SPAN_OTHER_NODES=y
> # CONFIG_NUMA_EMU is not set
> CONFIG_NODES_SHIFT=6
> CONFIG_ARCH_PROC_KCORE_TEXT=y
> CONFIG_ARCH_SPARSEMEM_DEFAULT=y
> CONFIG_ARCH_SPARSEMEM_ENABLE=y
> CONFIG_ARCH_SELECT_MEMORY_MODEL=y
> CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
> CONFIG_SELECT_MEMORY_MODEL=y
> CONFIG_SPARSEMEM_MANUAL=y
> CONFIG_SPARSEMEM=y
> CONFIG_NEED_MULTIPLE_NODES=y
> CONFIG_HAVE_MEMORY_PRESENT=y
> CONFIG_SPARSEMEM_EXTREME=y
> CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
> CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER=y
> CONFIG_SPARSEMEM_VMEMMAP=y
> # CONFIG_MEMORY_HOTPLUG is not set
> CONFIG_PAGEFLAGS_EXTENDED=y
> CONFIG_SPLIT_PTLOCK_CPUS=4
> # CONFIG_COMPACTION is not set
> CONFIG_MIGRATION=y
> CONFIG_PHYS_ADDR_T_64BIT=y
> CONFIG_ZONE_DMA_FLAG=1
> CONFIG_BOUNCE=y
> CONFIG_VIRT_TO_BUS=y
> CONFIG_MMU_NOTIFIER=y
> # CONFIG_KSM is not set
> CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
> CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
> # CONFIG_MEMORY_FAILURE is not set
> CONFIG_X86_CHECK_BIOS_CORRUPTION=y
> CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK=y
> CONFIG_X86_LOW_RESERVE=64
> CONFIG_MTRR=y
> # CONFIG_MTRR_SANITIZER is not set
> CONFIG_X86_PAT=y
> CONFIG_ARCH_USES_PG_UNCACHED=y
> # CONFIG_EFI is not set
> # CONFIG_SECCOMP is not set
> # CONFIG_CC_STACKPROTECTOR is not set
> # CONFIG_HZ_100 is not set
> CONFIG_HZ_250=y
> # CONFIG_HZ_300 is not set
> # CONFIG_HZ_1000 is not set
> CONFIG_HZ=250
> # CONFIG_SCHED_HRTICK is not set
> CONFIG_KEXEC=y
> # CONFIG_CRASH_DUMP is not set
> CONFIG_PHYSICAL_START=0x1000000
> # CONFIG_RELOCATABLE is not set
> CONFIG_PHYSICAL_ALIGN=0x1000000
> CONFIG_HOTPLUG_CPU=y
> CONFIG_COMPAT_VDSO=y
> # CONFIG_CMDLINE_BOOL is not set
> CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
> CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID=y
> CONFIG_USE_PERCPU_NUMA_NODE_ID=y
>
> #
> # Power management and ACPI options
> #
> CONFIG_PM=y
> # CONFIG_PM_DEBUG is not set
> CONFIG_PM_SLEEP_SMP=y
> CONFIG_PM_SLEEP=y
> CONFIG_SUSPEND_NVS=y
> CONFIG_SUSPEND=y
> CONFIG_SUSPEND_FREEZER=y
> # CONFIG_HIBERNATION is not set
> # CONFIG_PM_RUNTIME is not set
> CONFIG_PM_OPS=y
> CONFIG_ACPI=y
> CONFIG_ACPI_SLEEP=y
> CONFIG_ACPI_PROCFS=y
> CONFIG_ACPI_PROCFS_POWER=y
> # CONFIG_ACPI_POWER_METER is not set
> CONFIG_ACPI_SYSFS_POWER=y
> # CONFIG_ACPI_EC_DEBUGFS is not set
> CONFIG_ACPI_PROC_EVENT=y
> CONFIG_ACPI_AC=m
> CONFIG_ACPI_BATTERY=m
> CONFIG_ACPI_BUTTON=m
> CONFIG_ACPI_VIDEO=m
> CONFIG_ACPI_FAN=y
> CONFIG_ACPI_DOCK=y
> CONFIG_ACPI_PROCESSOR=y
> CONFIG_ACPI_HOTPLUG_CPU=y
> # CONFIG_ACPI_PROCESSOR_AGGREGATOR is not set
> CONFIG_ACPI_THERMAL=y
> CONFIG_ACPI_NUMA=y
> # CONFIG_ACPI_CUSTOM_DSDT is not set
> CONFIG_ACPI_BLACKLIST_YEAR=0
> # CONFIG_ACPI_DEBUG is not set
> # CONFIG_ACPI_PCI_SLOT is not set
> CONFIG_X86_PM_TIMER=y
> CONFIG_ACPI_CONTAINER=y
> CONFIG_ACPI_SBS=m
> # CONFIG_ACPI_HED is not set
> # CONFIG_ACPI_APEI is not set
> # CONFIG_SFI is not set
>
> #
> # CPU Frequency scaling
> #
> CONFIG_CPU_FREQ=y
> CONFIG_CPU_FREQ_TABLE=y
> CONFIG_CPU_FREQ_DEBUG=y
> CONFIG_CPU_FREQ_STAT=m
> CONFIG_CPU_FREQ_STAT_DETAILS=y
> # CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
> CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y
> # CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
> # CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
> CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
> CONFIG_CPU_FREQ_GOV_POWERSAVE=m
> CONFIG_CPU_FREQ_GOV_USERSPACE=y
> CONFIG_CPU_FREQ_GOV_ONDEMAND=m
> CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m
>
> #
> # CPUFreq processor drivers
> #
> # CONFIG_X86_PCC_CPUFREQ is not set
> CONFIG_X86_ACPI_CPUFREQ=y
> CONFIG_X86_POWERNOW_K8=y
> # CONFIG_X86_SPEEDSTEP_CENTRINO is not set
> # CONFIG_X86_P4_CLOCKMOD is not set
>
> #
> # shared options
> #
> # CONFIG_X86_SPEEDSTEP_LIB is not set
> CONFIG_CPU_IDLE=y
> CONFIG_CPU_IDLE_GOV_LADDER=y
> # CONFIG_INTEL_IDLE is not set
>
> #
> # Memory power savings
> #
> # CONFIG_I7300_IDLE is not set
>
> #
> # Bus options (PCI etc.)
> #
> CONFIG_PCI=y
> CONFIG_PCI_DIRECT=y
> CONFIG_PCI_MMCONFIG=y
> CONFIG_PCI_DOMAINS=y
> # CONFIG_PCI_CNB20LE_QUIRK is not set
> CONFIG_PCIEPORTBUS=y
> CONFIG_HOTPLUG_PCI_PCIE=m
> CONFIG_PCIEAER=y
> # CONFIG_PCIE_ECRC is not set
> # CONFIG_PCIEAER_INJECT is not set
> CONFIG_PCIEASPM=y
> # CONFIG_PCIEASPM_DEBUG is not set
> CONFIG_ARCH_SUPPORTS_MSI=y
> # CONFIG_PCI_MSI is not set
> # CONFIG_PCI_DEBUG is not set
> # CONFIG_PCI_STUB is not set
> CONFIG_HT_IRQ=y
> # CONFIG_PCI_IOV is not set
> CONFIG_PCI_IOAPIC=y
> CONFIG_ISA_DMA_API=y
> CONFIG_K8_NB=y
> CONFIG_PCCARD=y
> CONFIG_PCMCIA=y
> CONFIG_PCMCIA_LOAD_CIS=y
> CONFIG_CARDBUS=y
>
> #
> # PC-card bridges
> #
> CONFIG_YENTA=y
> CONFIG_YENTA_O2=y
> CONFIG_YENTA_RICOH=y
> CONFIG_YENTA_TI=y
> CONFIG_YENTA_ENE_TUNE=y
> CONFIG_YENTA_TOSHIBA=y
> CONFIG_PD6729=m
> CONFIG_I82092=m
> CONFIG_PCCARD_NONSTATIC=y
> CONFIG_HOTPLUG_PCI=y
> CONFIG_HOTPLUG_PCI_FAKE=m
> CONFIG_HOTPLUG_PCI_ACPI=m
> CONFIG_HOTPLUG_PCI_ACPI_IBM=m
> # CONFIG_HOTPLUG_PCI_CPCI is not set
> CONFIG_HOTPLUG_PCI_SHPC=m
>
> #
> # Executable file formats / Emulations
> #
> CONFIG_BINFMT_ELF=y
> CONFIG_COMPAT_BINFMT_ELF=y
> # CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
> # CONFIG_HAVE_AOUT is not set
> CONFIG_BINFMT_MISC=y
> CONFIG_IA32_EMULATION=y
> # CONFIG_IA32_AOUT is not set
> CONFIG_COMPAT=y
> CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
> CONFIG_SYSVIPC_COMPAT=y
> CONFIG_NET=y
>
> #
> # Networking options
> #
> CONFIG_PACKET=y
> CONFIG_UNIX=y
> CONFIG_XFRM=y
> CONFIG_XFRM_USER=y
> # CONFIG_XFRM_SUB_POLICY is not set
> CONFIG_XFRM_MIGRATE=y
> # CONFIG_XFRM_STATISTICS is not set
> CONFIG_XFRM_IPCOMP=m
> CONFIG_NET_KEY=m
> CONFIG_NET_KEY_MIGRATE=y
> CONFIG_INET=y
> CONFIG_IP_MULTICAST=y
> CONFIG_IP_ADVANCED_ROUTER=y
> CONFIG_ASK_IP_FIB_HASH=y
> # CONFIG_IP_FIB_TRIE is not set
> CONFIG_IP_FIB_HASH=y
> CONFIG_IP_MULTIPLE_TABLES=y
> CONFIG_IP_ROUTE_MULTIPATH=y
> CONFIG_IP_ROUTE_VERBOSE=y
> # CONFIG_IP_PNP is not set
> CONFIG_NET_IPIP=m
> CONFIG_NET_IPGRE=m
> CONFIG_NET_IPGRE_BROADCAST=y
> CONFIG_IP_MROUTE=y
> # CONFIG_IP_MROUTE_MULTIPLE_TABLES is not set
> CONFIG_IP_PIMSM_V1=y
> CONFIG_IP_PIMSM_V2=y
> # CONFIG_ARPD is not set
> CONFIG_SYN_COOKIES=y
> CONFIG_INET_AH=m
> CONFIG_INET_ESP=m
> CONFIG_INET_IPCOMP=m
> CONFIG_INET_XFRM_TUNNEL=m
> CONFIG_INET_TUNNEL=m
> CONFIG_INET_XFRM_MODE_TRANSPORT=m
> CONFIG_INET_XFRM_MODE_TUNNEL=m
> CONFIG_INET_XFRM_MODE_BEET=m
> CONFIG_INET_LRO=y
> CONFIG_INET_DIAG=m
> CONFIG_INET_TCP_DIAG=m
> CONFIG_TCP_CONG_ADVANCED=y
> CONFIG_TCP_CONG_BIC=y
> CONFIG_TCP_CONG_CUBIC=m
> CONFIG_TCP_CONG_WESTWOOD=m
> CONFIG_TCP_CONG_HTCP=m
> CONFIG_TCP_CONG_HSTCP=m
> CONFIG_TCP_CONG_HYBLA=m
> CONFIG_TCP_CONG_VEGAS=m
> CONFIG_TCP_CONG_SCALABLE=m
> CONFIG_TCP_CONG_LP=m
> CONFIG_TCP_CONG_VENO=m
> # CONFIG_TCP_CONG_YEAH is not set
> # CONFIG_TCP_CONG_ILLINOIS is not set
> CONFIG_DEFAULT_BIC=y
> # CONFIG_DEFAULT_RENO is not set
> CONFIG_DEFAULT_TCP_CONG="bic"
> CONFIG_TCP_MD5SIG=y
> CONFIG_IPV6=m
> CONFIG_IPV6_PRIVACY=y
> CONFIG_IPV6_ROUTER_PREF=y
> CONFIG_IPV6_ROUTE_INFO=y
> # CONFIG_IPV6_OPTIMISTIC_DAD is not set
> CONFIG_INET6_AH=m
> CONFIG_INET6_ESP=m
> CONFIG_INET6_IPCOMP=m
> # CONFIG_IPV6_MIP6 is not set
> CONFIG_INET6_XFRM_TUNNEL=m
> CONFIG_INET6_TUNNEL=m
> CONFIG_INET6_XFRM_MODE_TRANSPORT=m
> CONFIG_INET6_XFRM_MODE_TUNNEL=m
> CONFIG_INET6_XFRM_MODE_BEET=m
> CONFIG_INET6_XFRM_MODE_ROUTEOPTIMIZATION=m
> CONFIG_IPV6_SIT=m
> # CONFIG_IPV6_SIT_6RD is not set
> CONFIG_IPV6_NDISC_NODETYPE=y
> CONFIG_IPV6_TUNNEL=m
> # CONFIG_IPV6_MULTIPLE_TABLES is not set
> # CONFIG_IPV6_MROUTE is not set
> CONFIG_NETLABEL=y
> CONFIG_NETWORK_SECMARK=y
> # CONFIG_NETWORK_PHY_TIMESTAMPING is not set
> CONFIG_NETFILTER=y
> CONFIG_NETFILTER_DEBUG=y
> CONFIG_NETFILTER_ADVANCED=y
> CONFIG_BRIDGE_NETFILTER=y
>
> #
> # Core Netfilter Configuration
> #
> CONFIG_NETFILTER_NETLINK=m
> CONFIG_NETFILTER_NETLINK_QUEUE=m
> CONFIG_NETFILTER_NETLINK_LOG=m
> CONFIG_NF_CONNTRACK=y
> CONFIG_NF_CONNTRACK_MARK=y
> CONFIG_NF_CONNTRACK_SECMARK=y
> CONFIG_NF_CONNTRACK_EVENTS=y
> CONFIG_NF_CT_PROTO_DCCP=m
> CONFIG_NF_CT_PROTO_GRE=m
> CONFIG_NF_CT_PROTO_SCTP=m
> # CONFIG_NF_CT_PROTO_UDPLITE is not set
> CONFIG_NF_CONNTRACK_AMANDA=m
> CONFIG_NF_CONNTRACK_FTP=m
> CONFIG_NF_CONNTRACK_H323=m
> CONFIG_NF_CONNTRACK_IRC=m
> CONFIG_NF_CONNTRACK_NETBIOS_NS=m
> CONFIG_NF_CONNTRACK_PPTP=m
> CONFIG_NF_CONNTRACK_SANE=m
> CONFIG_NF_CONNTRACK_SIP=m
> CONFIG_NF_CONNTRACK_TFTP=m
> # CONFIG_NF_CT_NETLINK is not set
> # CONFIG_NETFILTER_TPROXY is not set
> CONFIG_NETFILTER_XTABLES=m
>
> #
> # Xtables combined modules
> #
> CONFIG_NETFILTER_XT_MARK=m
> CONFIG_NETFILTER_XT_CONNMARK=m
>
> #
> # Xtables targets
> #
> # CONFIG_NETFILTER_XT_TARGET_CHECKSUM is not set
> CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
> CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
> CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=m
> # CONFIG_NETFILTER_XT_TARGET_CT is not set
> CONFIG_NETFILTER_XT_TARGET_DSCP=m
> CONFIG_NETFILTER_XT_TARGET_HL=m
> # CONFIG_NETFILTER_XT_TARGET_IDLETIMER is not set
> # CONFIG_NETFILTER_XT_TARGET_LED is not set
> CONFIG_NETFILTER_XT_TARGET_MARK=m
> CONFIG_NETFILTER_XT_TARGET_NFLOG=m
> CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m
> CONFIG_NETFILTER_XT_TARGET_NOTRACK=m
> # CONFIG_NETFILTER_XT_TARGET_RATEEST is not set
> # CONFIG_NETFILTER_XT_TARGET_TEE is not set
> # CONFIG_NETFILTER_XT_TARGET_TRACE is not set
> CONFIG_NETFILTER_XT_TARGET_SECMARK=m
> CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
> # CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP is not set
>
> #
> # Xtables matches
> #
> # CONFIG_NETFILTER_XT_MATCH_CLUSTER is not set
> CONFIG_NETFILTER_XT_MATCH_COMMENT=m
> CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
> # CONFIG_NETFILTER_XT_MATCH_CONNLIMIT is not set
> CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
> CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m
> # CONFIG_NETFILTER_XT_MATCH_CPU is not set
> CONFIG_NETFILTER_XT_MATCH_DCCP=m
> CONFIG_NETFILTER_XT_MATCH_DSCP=m
> CONFIG_NETFILTER_XT_MATCH_ESP=m
> CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
> CONFIG_NETFILTER_XT_MATCH_HELPER=m
> CONFIG_NETFILTER_XT_MATCH_HL=m
> # CONFIG_NETFILTER_XT_MATCH_IPRANGE is not set
> # CONFIG_NETFILTER_XT_MATCH_IPVS is not set
> CONFIG_NETFILTER_XT_MATCH_LENGTH=m
> CONFIG_NETFILTER_XT_MATCH_LIMIT=m
> CONFIG_NETFILTER_XT_MATCH_MAC=m
> CONFIG_NETFILTER_XT_MATCH_MARK=m
> CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
> # CONFIG_NETFILTER_XT_MATCH_OSF is not set
> # CONFIG_NETFILTER_XT_MATCH_OWNER is not set
> CONFIG_NETFILTER_XT_MATCH_POLICY=m
> CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m
> CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
> CONFIG_NETFILTER_XT_MATCH_QUOTA=m
> # CONFIG_NETFILTER_XT_MATCH_RATEEST is not set
> CONFIG_NETFILTER_XT_MATCH_REALM=m
> # CONFIG_NETFILTER_XT_MATCH_RECENT is not set
> CONFIG_NETFILTER_XT_MATCH_SCTP=m
> CONFIG_NETFILTER_XT_MATCH_STATE=m
> CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
> CONFIG_NETFILTER_XT_MATCH_STRING=m
> CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
> # CONFIG_NETFILTER_XT_MATCH_TIME is not set
> # CONFIG_NETFILTER_XT_MATCH_U32 is not set
> CONFIG_IP_VS=m
> # CONFIG_IP_VS_IPV6 is not set
> # CONFIG_IP_VS_DEBUG is not set
> CONFIG_IP_VS_TAB_BITS=12
>
> #
> # IPVS transport protocol load balancing support
> #
> CONFIG_IP_VS_PROTO_TCP=y
> CONFIG_IP_VS_PROTO_UDP=y
> CONFIG_IP_VS_PROTO_AH_ESP=y
> CONFIG_IP_VS_PROTO_ESP=y
> CONFIG_IP_VS_PROTO_AH=y
> # CONFIG_IP_VS_PROTO_SCTP is not set
>
> #
> # IPVS scheduler
> #
> CONFIG_IP_VS_RR=m
> CONFIG_IP_VS_WRR=m
> CONFIG_IP_VS_LC=m
> CONFIG_IP_VS_WLC=m
> CONFIG_IP_VS_LBLC=m
> CONFIG_IP_VS_LBLCR=m
> CONFIG_IP_VS_DH=m
> CONFIG_IP_VS_SH=m
> CONFIG_IP_VS_SED=m
> CONFIG_IP_VS_NQ=m
>
> #
> # IPVS application helper
> #
> CONFIG_IP_VS_FTP=m
>
> #
> # IP: Netfilter Configuration
> #
> CONFIG_NF_DEFRAG_IPV4=m
> CONFIG_NF_CONNTRACK_IPV4=m
> CONFIG_NF_CONNTRACK_PROC_COMPAT=y
> CONFIG_IP_NF_QUEUE=m
> CONFIG_IP_NF_IPTABLES=m
> CONFIG_IP_NF_MATCH_ADDRTYPE=m
> CONFIG_IP_NF_MATCH_AH=m
> CONFIG_IP_NF_MATCH_ECN=m
> CONFIG_IP_NF_MATCH_TTL=m
> CONFIG_IP_NF_FILTER=m
> CONFIG_IP_NF_TARGET_REJECT=m
> CONFIG_IP_NF_TARGET_LOG=m
> CONFIG_IP_NF_TARGET_ULOG=m
> CONFIG_NF_NAT=m
> CONFIG_NF_NAT_NEEDED=y
> CONFIG_IP_NF_TARGET_MASQUERADE=m
> CONFIG_IP_NF_TARGET_NETMAP=m
> CONFIG_IP_NF_TARGET_REDIRECT=m
> CONFIG_NF_NAT_SNMP_BASIC=m
> CONFIG_NF_NAT_PROTO_DCCP=m
> CONFIG_NF_NAT_PROTO_GRE=m
> CONFIG_NF_NAT_PROTO_SCTP=m
> CONFIG_NF_NAT_FTP=m
> CONFIG_NF_NAT_IRC=m
> CONFIG_NF_NAT_TFTP=m
> CONFIG_NF_NAT_AMANDA=m
> CONFIG_NF_NAT_PPTP=m
> CONFIG_NF_NAT_H323=m
> CONFIG_NF_NAT_SIP=m
> CONFIG_IP_NF_MANGLE=m
> CONFIG_IP_NF_TARGET_CLUSTERIP=m
> CONFIG_IP_NF_TARGET_ECN=m
> CONFIG_IP_NF_TARGET_TTL=m
> CONFIG_IP_NF_RAW=m
> # CONFIG_IP_NF_SECURITY is not set
> CONFIG_IP_NF_ARPTABLES=m
> CONFIG_IP_NF_ARPFILTER=m
> CONFIG_IP_NF_ARP_MANGLE=m
>
> #
> # IPv6: Netfilter Configuration
> #
> CONFIG_NF_CONNTRACK_IPV6=m
> CONFIG_IP6_NF_QUEUE=m
> CONFIG_IP6_NF_IPTABLES=m
> CONFIG_IP6_NF_MATCH_AH=m
> CONFIG_IP6_NF_MATCH_EUI64=m
> CONFIG_IP6_NF_MATCH_FRAG=m
> CONFIG_IP6_NF_MATCH_OPTS=m
> CONFIG_IP6_NF_MATCH_HL=m
> CONFIG_IP6_NF_MATCH_IPV6HEADER=m
> CONFIG_IP6_NF_MATCH_MH=m
> CONFIG_IP6_NF_MATCH_RT=m
> CONFIG_IP6_NF_TARGET_HL=m
> CONFIG_IP6_NF_TARGET_LOG=m
> CONFIG_IP6_NF_FILTER=m
> CONFIG_IP6_NF_TARGET_REJECT=m
> CONFIG_IP6_NF_MANGLE=m
> CONFIG_IP6_NF_RAW=m
> # CONFIG_IP6_NF_SECURITY is not set
>
> #
> # DECnet: Netfilter Configuration
> #
> # CONFIG_DECNET_NF_GRABULATOR is not set
> CONFIG_BRIDGE_NF_EBTABLES=m
> CONFIG_BRIDGE_EBT_BROUTE=m
> CONFIG_BRIDGE_EBT_T_FILTER=m
> CONFIG_BRIDGE_EBT_T_NAT=m
> CONFIG_BRIDGE_EBT_802_3=m
> CONFIG_BRIDGE_EBT_AMONG=m
> CONFIG_BRIDGE_EBT_ARP=m
> CONFIG_BRIDGE_EBT_IP=m
> # CONFIG_BRIDGE_EBT_IP6 is not set
> CONFIG_BRIDGE_EBT_LIMIT=m
> CONFIG_BRIDGE_EBT_MARK=m
> CONFIG_BRIDGE_EBT_PKTTYPE=m
> CONFIG_BRIDGE_EBT_STP=m
> CONFIG_BRIDGE_EBT_VLAN=m
> CONFIG_BRIDGE_EBT_ARPREPLY=m
> CONFIG_BRIDGE_EBT_DNAT=m
> CONFIG_BRIDGE_EBT_MARK_T=m
> CONFIG_BRIDGE_EBT_REDIRECT=m
> CONFIG_BRIDGE_EBT_SNAT=m
> CONFIG_BRIDGE_EBT_LOG=m
> CONFIG_BRIDGE_EBT_ULOG=m
> # CONFIG_BRIDGE_EBT_NFLOG is not set
> CONFIG_IP_DCCP=m
> CONFIG_INET_DCCP_DIAG=m
>
> #
> # DCCP CCIDs Configuration (EXPERIMENTAL)
> #
> # CONFIG_IP_DCCP_CCID2_DEBUG is not set
> CONFIG_IP_DCCP_CCID3=y
> # CONFIG_IP_DCCP_CCID3_DEBUG is not set
> CONFIG_IP_DCCP_CCID3_RTO=100
> CONFIG_IP_DCCP_TFRC_LIB=y
>
> #
> # DCCP Kernel Hacking
> #
> # CONFIG_IP_DCCP_DEBUG is not set
> # CONFIG_NET_DCCPPROBE is not set
> CONFIG_IP_SCTP=m
> # CONFIG_NET_SCTPPROBE is not set
> # CONFIG_SCTP_DBG_MSG is not set
> # CONFIG_SCTP_DBG_OBJCNT is not set
> # CONFIG_SCTP_HMAC_NONE is not set
> # CONFIG_SCTP_HMAC_SHA1 is not set
> CONFIG_SCTP_HMAC_MD5=y
> # CONFIG_RDS is not set
> CONFIG_TIPC=m
> # CONFIG_TIPC_ADVANCED is not set
> # CONFIG_TIPC_DEBUG is not set
> CONFIG_ATM=m
> CONFIG_ATM_CLIP=m
> # CONFIG_ATM_CLIP_NO_ICMP is not set
> CONFIG_ATM_LANE=m
> # CONFIG_ATM_MPOA is not set
> CONFIG_ATM_BR2684=m
> # CONFIG_ATM_BR2684_IPFILTER is not set
> # CONFIG_L2TP is not set
> CONFIG_STP=m
> CONFIG_BRIDGE=m
> CONFIG_BRIDGE_IGMP_SNOOPING=y
> # CONFIG_NET_DSA is not set
> CONFIG_VLAN_8021Q=m
> # CONFIG_VLAN_8021Q_GVRP is not set
> CONFIG_DECNET=m
> CONFIG_DECNET_ROUTER=y
> CONFIG_LLC=y
> # CONFIG_LLC2 is not set
> CONFIG_IPX=m
> # CONFIG_IPX_INTERN is not set
> CONFIG_ATALK=m
> CONFIG_DEV_APPLETALK=m
> CONFIG_IPDDP=m
> CONFIG_IPDDP_ENCAP=y
> CONFIG_IPDDP_DECAP=y
> # CONFIG_X25 is not set
> # CONFIG_LAPB is not set
> # CONFIG_ECONET is not set
> CONFIG_WAN_ROUTER=m
> # CONFIG_PHONET is not set
> # CONFIG_IEEE802154 is not set
> CONFIG_NET_SCHED=y
>
> #
> # Queueing/Scheduling
> #
> CONFIG_NET_SCH_CBQ=m
> CONFIG_NET_SCH_HTB=m
> CONFIG_NET_SCH_HFSC=m
> CONFIG_NET_SCH_ATM=m
> CONFIG_NET_SCH_PRIO=m
> # CONFIG_NET_SCH_MULTIQ is not set
> CONFIG_NET_SCH_RED=m
> CONFIG_NET_SCH_SFQ=m
> CONFIG_NET_SCH_TEQL=m
> CONFIG_NET_SCH_TBF=m
> CONFIG_NET_SCH_GRED=m
> CONFIG_NET_SCH_DSMARK=m
> CONFIG_NET_SCH_NETEM=m
> # CONFIG_NET_SCH_DRR is not set
> CONFIG_NET_SCH_INGRESS=m
>
> #
> # Classification
> #
> CONFIG_NET_CLS=y
> CONFIG_NET_CLS_BASIC=m
> CONFIG_NET_CLS_TCINDEX=m
> CONFIG_NET_CLS_ROUTE4=m
> CONFIG_NET_CLS_ROUTE=y
> CONFIG_NET_CLS_FW=m
> CONFIG_NET_CLS_U32=m
> CONFIG_CLS_U32_PERF=y
> CONFIG_CLS_U32_MARK=y
> CONFIG_NET_CLS_RSVP=m
> CONFIG_NET_CLS_RSVP6=m
> # CONFIG_NET_CLS_FLOW is not set
> CONFIG_NET_EMATCH=y
> CONFIG_NET_EMATCH_STACK=32
> CONFIG_NET_EMATCH_CMP=m
> CONFIG_NET_EMATCH_NBYTE=m
> CONFIG_NET_EMATCH_U32=m
> CONFIG_NET_EMATCH_META=m
> CONFIG_NET_EMATCH_TEXT=m
> CONFIG_NET_CLS_ACT=y
> CONFIG_NET_ACT_POLICE=m
> CONFIG_NET_ACT_GACT=m
> CONFIG_GACT_PROB=y
> CONFIG_NET_ACT_MIRRED=m
> CONFIG_NET_ACT_IPT=m
> # CONFIG_NET_ACT_NAT is not set
> CONFIG_NET_ACT_PEDIT=m
> CONFIG_NET_ACT_SIMP=m
> # CONFIG_NET_ACT_SKBEDIT is not set
> CONFIG_NET_CLS_IND=y
> CONFIG_NET_SCH_FIFO=y
> # CONFIG_DCB is not set
> CONFIG_DNS_RESOLVER=y
> CONFIG_RPS=y
>
> #
> # Network testing
> #
> CONFIG_NET_PKTGEN=m
> # CONFIG_NET_TCPPROBE is not set
> # CONFIG_NET_DROP_MONITOR is not set
> # CONFIG_HAMRADIO is not set
> # CONFIG_CAN is not set
> CONFIG_IRDA=m
>
> #
> # IrDA protocols
> #
> CONFIG_IRLAN=m
> CONFIG_IRNET=m
> CONFIG_IRCOMM=m
> # CONFIG_IRDA_ULTRA is not set
>
> #
> # IrDA options
> #
> CONFIG_IRDA_CACHE_LAST_LSAP=y
> CONFIG_IRDA_FAST_RR=y
> # CONFIG_IRDA_DEBUG is not set
>
> #
> # Infrared-port device drivers
> #
>
> #
> # SIR device drivers
> #
> CONFIG_IRTTY_SIR=m
>
> #
> # Dongle support
> #
> CONFIG_DONGLE=y
> CONFIG_ESI_DONGLE=m
> CONFIG_ACTISYS_DONGLE=m
> CONFIG_TEKRAM_DONGLE=m
> CONFIG_TOIM3232_DONGLE=m
> CONFIG_LITELINK_DONGLE=m
> CONFIG_MA600_DONGLE=m
> CONFIG_GIRBIL_DONGLE=m
> CONFIG_MCP2120_DONGLE=m
> CONFIG_OLD_BELKIN_DONGLE=m
> CONFIG_ACT200L_DONGLE=m
> # CONFIG_KINGSUN_DONGLE is not set
> # CONFIG_KSDAZZLE_DONGLE is not set
> # CONFIG_KS959_DONGLE is not set
>
> #
> # FIR device drivers
> #
> CONFIG_USB_IRDA=m
> CONFIG_SIGMATEL_FIR=m
> CONFIG_NSC_FIR=m
> CONFIG_WINBOND_FIR=m
> CONFIG_SMC_IRCC_FIR=m
> CONFIG_ALI_FIR=m
> CONFIG_VLSI_FIR=m
> CONFIG_VIA_FIR=m
> CONFIG_MCS_FIR=m
> CONFIG_BT=m
> CONFIG_BT_L2CAP=m
> CONFIG_BT_SCO=m
> CONFIG_BT_RFCOMM=m
> CONFIG_BT_RFCOMM_TTY=y
> CONFIG_BT_BNEP=m
> CONFIG_BT_BNEP_MC_FILTER=y
> CONFIG_BT_BNEP_PROTO_FILTER=y
> CONFIG_BT_HIDP=m
>
> #
> # Bluetooth device drivers
> #
> # CONFIG_BT_HCIBTUSB is not set
> # CONFIG_BT_HCIBTSDIO is not set
> CONFIG_BT_HCIUART=m
> CONFIG_BT_HCIUART_H4=y
> CONFIG_BT_HCIUART_BCSP=y
> # CONFIG_BT_HCIUART_ATH3K is not set
> # CONFIG_BT_HCIUART_LL is not set
> CONFIG_BT_HCIBCM203X=m
> CONFIG_BT_HCIBPA10X=m
> CONFIG_BT_HCIBFUSB=m
> CONFIG_BT_HCIDTL1=m
> CONFIG_BT_HCIBT3C=m
> CONFIG_BT_HCIBLUECARD=m
> CONFIG_BT_HCIBTUART=m
> CONFIG_BT_HCIVHCI=m
> # CONFIG_BT_MRVL is not set
> # CONFIG_AF_RXRPC is not set
> CONFIG_FIB_RULES=y
> CONFIG_WIRELESS=y
> # CONFIG_CFG80211 is not set
> # CONFIG_LIB80211 is not set
>
> #
> # CFG80211 needs to be enabled for MAC80211
> #
>
> #
> # Some wireless drivers require a rate control algorithm
> #
> # CONFIG_WIMAX is not set
> CONFIG_RFKILL=m
> CONFIG_RFKILL_LEDS=y
> CONFIG_RFKILL_INPUT=y
> # CONFIG_CAIF is not set
>
> #
> # Device Drivers
> #
>
> #
> # Generic Driver Options
> #
> CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
> # CONFIG_DEVTMPFS is not set
> CONFIG_STANDALONE=y
> CONFIG_PREVENT_FIRMWARE_BUILD=y
> CONFIG_FW_LOADER=y
> CONFIG_FIRMWARE_IN_KERNEL=y
> CONFIG_EXTRA_FIRMWARE=""
> # CONFIG_DEBUG_DRIVER is not set
> # CONFIG_DEBUG_DEVRES is not set
> # CONFIG_SYS_HYPERVISOR is not set
> CONFIG_CONNECTOR=y
> CONFIG_PROC_EVENTS=y
> # CONFIG_MTD is not set
> CONFIG_PARPORT=m
> CONFIG_PARPORT_PC=m
> CONFIG_PARPORT_SERIAL=m
> # CONFIG_PARPORT_PC_FIFO is not set
> # CONFIG_PARPORT_PC_SUPERIO is not set
> CONFIG_PARPORT_PC_PCMCIA=m
> # CONFIG_PARPORT_GSC is not set
> # CONFIG_PARPORT_AX88796 is not set
> CONFIG_PARPORT_1284=y
> CONFIG_PARPORT_NOT_PC=y
> CONFIG_PNP=y
> CONFIG_PNP_DEBUG_MESSAGES=y
>
> #
> # Protocols
> #
> CONFIG_PNPACPI=y
> CONFIG_BLK_DEV=y
> CONFIG_BLK_DEV_FD=m
> # CONFIG_PARIDE is not set
> CONFIG_BLK_CPQ_DA=y
> CONFIG_BLK_CPQ_CISS_DA=m
> CONFIG_CISS_SCSI_TAPE=y
> CONFIG_BLK_DEV_DAC960=m
> CONFIG_BLK_DEV_UMEM=m
> # CONFIG_BLK_DEV_COW_COMMON is not set
> CONFIG_BLK_DEV_LOOP=m
> CONFIG_BLK_DEV_CRYPTOLOOP=m
> # CONFIG_BLK_DEV_DRBD is not set
> CONFIG_BLK_DEV_NBD=m
> CONFIG_BLK_DEV_SX8=m
> CONFIG_BLK_DEV_UB=m
> CONFIG_BLK_DEV_RAM=y
> CONFIG_BLK_DEV_RAM_COUNT=16
> CONFIG_BLK_DEV_RAM_SIZE=16384
> # CONFIG_BLK_DEV_XIP is not set
> CONFIG_CDROM_PKTCDVD=m
> CONFIG_CDROM_PKTCDVD_BUFFERS=8
> # CONFIG_CDROM_PKTCDVD_WCACHE is not set
> CONFIG_ATA_OVER_ETH=m
> # CONFIG_BLK_DEV_HD is not set
> CONFIG_MISC_DEVICES=y
> # CONFIG_AD525X_DPOT is not set
> # CONFIG_IBM_ASM is not set
> # CONFIG_PHANTOM is not set
> CONFIG_SGI_IOC4=m
> CONFIG_TIFM_CORE=m
> CONFIG_TIFM_7XX1=m
> # CONFIG_ICS932S401 is not set
> # CONFIG_ENCLOSURE_SERVICES is not set
> # CONFIG_CS5535_MFGPT is not set
> # CONFIG_HP_ILO is not set
> # CONFIG_ISL29003 is not set
> # CONFIG_SENSORS_TSL2550 is not set
> # CONFIG_SENSORS_BH1780 is not set
> # CONFIG_HMC6352 is not set
> # CONFIG_DS1682 is not set
> # CONFIG_VMWARE_BALLOON is not set
> # CONFIG_BMP085 is not set
> # CONFIG_C2PORT is not set
>
> #
> # EEPROM support
> #
> # CONFIG_EEPROM_AT24 is not set
> # CONFIG_EEPROM_LEGACY is not set
> # CONFIG_EEPROM_MAX6875 is not set
> # CONFIG_EEPROM_93CX6 is not set
> # CONFIG_CB710_CORE is not set
> # CONFIG_IWMC3200TOP is not set
> CONFIG_HAVE_IDE=y
> # CONFIG_IDE is not set
>
> #
> # SCSI device support
> #
> CONFIG_SCSI_MOD=y
> CONFIG_RAID_ATTRS=m
> CONFIG_SCSI=y
> CONFIG_SCSI_DMA=y
> CONFIG_SCSI_TGT=m
> CONFIG_SCSI_NETLINK=y
> CONFIG_SCSI_PROC_FS=y
>
> #
> # SCSI support type (disk, tape, CD-ROM)
> #
> CONFIG_BLK_DEV_SD=y
> CONFIG_CHR_DEV_ST=m
> CONFIG_CHR_DEV_OSST=m
> CONFIG_BLK_DEV_SR=m
> CONFIG_BLK_DEV_SR_VENDOR=y
> CONFIG_CHR_DEV_SG=m
> CONFIG_CHR_DEV_SCH=m
> CONFIG_SCSI_MULTI_LUN=y
> CONFIG_SCSI_CONSTANTS=y
> CONFIG_SCSI_LOGGING=y
> # CONFIG_SCSI_SCAN_ASYNC is not set
> CONFIG_SCSI_WAIT_SCAN=m
>
> #
> # SCSI Transports
> #
> CONFIG_SCSI_SPI_ATTRS=y
> CONFIG_SCSI_FC_ATTRS=m
> # CONFIG_SCSI_FC_TGT_ATTRS is not set
> CONFIG_SCSI_ISCSI_ATTRS=m
> CONFIG_SCSI_SAS_ATTRS=m
> # CONFIG_SCSI_SAS_LIBSAS is not set
> # CONFIG_SCSI_SRP_ATTRS is not set
> CONFIG_SCSI_LOWLEVEL=y
> CONFIG_ISCSI_TCP=m
> # CONFIG_ISCSI_BOOT_SYSFS is not set
> # CONFIG_SCSI_CXGB3_ISCSI is not set
> # CONFIG_SCSI_BNX2_ISCSI is not set
> # CONFIG_BE2ISCSI is not set
> CONFIG_BLK_DEV_3W_XXXX_RAID=m
> # CONFIG_SCSI_HPSA is not set
> CONFIG_SCSI_3W_9XXX=m
> # CONFIG_SCSI_3W_SAS is not set
> CONFIG_SCSI_ACARD=m
> CONFIG_SCSI_AACRAID=m
> CONFIG_SCSI_AIC7XXX=y
> CONFIG_AIC7XXX_CMDS_PER_DEVICE=4
> CONFIG_AIC7XXX_RESET_DELAY_MS=15000
> # CONFIG_AIC7XXX_DEBUG_ENABLE is not set
> CONFIG_AIC7XXX_DEBUG_MASK=0
> # CONFIG_AIC7XXX_REG_PRETTY_PRINT is not set
> CONFIG_SCSI_AIC7XXX_OLD=m
> CONFIG_SCSI_AIC79XX=m
> CONFIG_AIC79XX_CMDS_PER_DEVICE=4
> CONFIG_AIC79XX_RESET_DELAY_MS=15000
> # CONFIG_AIC79XX_DEBUG_ENABLE is not set
> CONFIG_AIC79XX_DEBUG_MASK=0
> # CONFIG_AIC79XX_REG_PRETTY_PRINT is not set
> # CONFIG_SCSI_AIC94XX is not set
> # CONFIG_SCSI_MVSAS is not set
> # CONFIG_SCSI_DPT_I2O is not set
> # CONFIG_SCSI_ADVANSYS is not set
> CONFIG_SCSI_ARCMSR=m
> # CONFIG_SCSI_ARCMSR_AER is not set
> CONFIG_MEGARAID_NEWGEN=y
> CONFIG_MEGARAID_MM=m
> CONFIG_MEGARAID_MAILBOX=m
> CONFIG_MEGARAID_LEGACY=m
> CONFIG_MEGARAID_SAS=m
> # CONFIG_SCSI_MPT2SAS is not set
> CONFIG_SCSI_HPTIOP=m
> CONFIG_SCSI_BUSLOGIC=m
> # CONFIG_VMWARE_PVSCSI is not set
> # CONFIG_LIBFC is not set
> # CONFIG_LIBFCOE is not set
> # CONFIG_FCOE is not set
> # CONFIG_FCOE_FNIC is not set
> # CONFIG_SCSI_DMX3191D is not set
> # CONFIG_SCSI_EATA is not set
> # CONFIG_SCSI_FUTURE_DOMAIN is not set
> CONFIG_SCSI_GDTH=m
> CONFIG_SCSI_IPS=m
> CONFIG_SCSI_INITIO=m
> CONFIG_SCSI_INIA100=m
> CONFIG_SCSI_PPA=m
> CONFIG_SCSI_IMM=m
> # CONFIG_SCSI_IZIP_EPP16 is not set
> # CONFIG_SCSI_IZIP_SLOW_CTR is not set
> CONFIG_SCSI_STEX=m
> CONFIG_SCSI_SYM53C8XX_2=m
> CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1
> CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16
> CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64
> CONFIG_SCSI_SYM53C8XX_MMIO=y
> # CONFIG_SCSI_IPR is not set
> CONFIG_SCSI_QLOGIC_1280=m
> CONFIG_SCSI_QLA_FC=m
> CONFIG_SCSI_QLA_ISCSI=m
> CONFIG_SCSI_LPFC=m
> # CONFIG_SCSI_LPFC_DEBUG_FS is not set
> CONFIG_SCSI_DC395x=m
> CONFIG_SCSI_DC390T=m
> # CONFIG_SCSI_PMCRAID is not set
> # CONFIG_SCSI_PM8001 is not set
> CONFIG_SCSI_SRP=m
> # CONFIG_SCSI_BFA_FC is not set
> # CONFIG_SCSI_LOWLEVEL_PCMCIA is not set
> # CONFIG_SCSI_DH is not set
> # CONFIG_SCSI_OSD_INITIATOR is not set
> CONFIG_ATA=y
> # CONFIG_ATA_NONSTANDARD is not set
> CONFIG_ATA_VERBOSE_ERROR=y
> CONFIG_ATA_ACPI=y
> CONFIG_SATA_PMP=y
>
> #
> # Controllers with non-SFF native interface
> #
> CONFIG_SATA_AHCI=y
> # CONFIG_SATA_AHCI_PLATFORM is not set
> CONFIG_SATA_INIC162X=m
> CONFIG_SATA_SIL24=m
> CONFIG_ATA_SFF=y
>
> #
> # SFF controllers with custom DMA interface
> #
> CONFIG_PDC_ADMA=m
> CONFIG_SATA_QSTOR=m
> CONFIG_SATA_SX4=m
> CONFIG_ATA_BMDMA=y
>
> #
> # SATA SFF controllers with BMDMA
> #
> CONFIG_ATA_PIIX=y
> CONFIG_SATA_MV=m
> CONFIG_SATA_NV=y
> CONFIG_SATA_PROMISE=m
> CONFIG_SATA_SIL=m
> CONFIG_SATA_SIS=m
> CONFIG_SATA_SVW=m
> CONFIG_SATA_ULI=m
> CONFIG_SATA_VIA=m
> CONFIG_SATA_VITESSE=m
>
> #
> # PATA SFF controllers with BMDMA
> #
> CONFIG_PATA_ALI=m
> CONFIG_PATA_AMD=y
> CONFIG_PATA_ARTOP=m
> CONFIG_PATA_ATIIXP=m
> # CONFIG_PATA_ATP867X is not set
> CONFIG_PATA_CMD64X=m
> CONFIG_PATA_CS5520=m
> CONFIG_PATA_CS5530=m
> CONFIG_PATA_CYPRESS=m
> CONFIG_PATA_EFAR=m
> CONFIG_PATA_HPT366=m
> CONFIG_PATA_HPT37X=m
> CONFIG_PATA_HPT3X2N=m
> CONFIG_PATA_HPT3X3=m
> # CONFIG_PATA_HPT3X3_DMA is not set
> CONFIG_PATA_IT8213=m
> CONFIG_PATA_IT821X=m
> CONFIG_PATA_JMICRON=m
> CONFIG_PATA_MARVELL=m
> CONFIG_PATA_NETCELL=m
> # CONFIG_PATA_NINJA32 is not set
> # CONFIG_PATA_NS87415 is not set
> CONFIG_PATA_OLDPIIX=y
> CONFIG_PATA_OPTIDMA=m
> CONFIG_PATA_PDC2027X=m
> CONFIG_PATA_PDC_OLD=m
> CONFIG_PATA_RADISYS=m
> # CONFIG_PATA_RDC is not set
> CONFIG_PATA_SC1200=m
> # CONFIG_PATA_SCH is not set
> CONFIG_PATA_SERVERWORKS=m
> CONFIG_PATA_SIL680=m
> CONFIG_PATA_SIS=m
> # CONFIG_PATA_TOSHIBA is not set
> CONFIG_PATA_TRIFLEX=m
> CONFIG_PATA_VIA=m
> CONFIG_PATA_WINBOND=m
>
> #
> # PIO-only SFF controllers
> #
> # CONFIG_PATA_CMD640_PCI is not set
> CONFIG_PATA_MPIIX=m
> CONFIG_PATA_NS87410=m
> CONFIG_PATA_OPTI=m
> CONFIG_PATA_PCMCIA=m
> CONFIG_PATA_RZ1000=m
>
> #
> # Generic fallback / legacy drivers
> #
> # CONFIG_PATA_ACPI is not set
> CONFIG_ATA_GENERIC=m
> # CONFIG_PATA_LEGACY is not set
> CONFIG_MD=y
> CONFIG_BLK_DEV_MD=y
> CONFIG_MD_AUTODETECT=y
> CONFIG_MD_LINEAR=m
> CONFIG_MD_RAID0=m
> CONFIG_MD_RAID1=m
> CONFIG_MD_RAID10=m
> CONFIG_MD_RAID456=m
> # CONFIG_MULTICORE_RAID456 is not set
> CONFIG_MD_MULTIPATH=m
> CONFIG_MD_FAULTY=m
> CONFIG_BLK_DEV_DM=m
> # CONFIG_DM_DEBUG is not set
> CONFIG_DM_CRYPT=m
> CONFIG_DM_SNAPSHOT=m
> CONFIG_DM_MIRROR=m
> # CONFIG_DM_LOG_USERSPACE is not set
> CONFIG_DM_ZERO=m
> CONFIG_DM_MULTIPATH=m
> # CONFIG_DM_MULTIPATH_QL is not set
> # CONFIG_DM_MULTIPATH_ST is not set
> # CONFIG_DM_DELAY is not set
> # CONFIG_DM_UEVENT is not set
> CONFIG_FUSION=y
> CONFIG_FUSION_SPI=m
> CONFIG_FUSION_FC=m
> CONFIG_FUSION_SAS=m
> CONFIG_FUSION_MAX_SGE=40
> CONFIG_FUSION_CTL=m
> CONFIG_FUSION_LAN=m
> # CONFIG_FUSION_LOGGING is not set
>
> #
> # IEEE 1394 (FireWire) support
> #
>
> #
> # You can enable one or both FireWire driver stacks.
> #
>
> #
> # The newer stack is recommended.
> #
> # CONFIG_FIREWIRE is not set
> CONFIG_IEEE1394=m
> CONFIG_IEEE1394_OHCI1394=m
> CONFIG_IEEE1394_PCILYNX=m
> CONFIG_IEEE1394_SBP2=m
> # CONFIG_IEEE1394_SBP2_PHYS_DMA is not set
> CONFIG_IEEE1394_ETH1394_ROM_ENTRY=y
> CONFIG_IEEE1394_ETH1394=m
> CONFIG_IEEE1394_RAWIO=m
> CONFIG_IEEE1394_VIDEO1394=m
> CONFIG_IEEE1394_DV1394=m
> # CONFIG_IEEE1394_VERBOSEDEBUG is not set
> # CONFIG_FIREWIRE_NOSY is not set
> CONFIG_I2O=m
> # CONFIG_I2O_LCT_NOTIFY_ON_CHANGES is not set
> CONFIG_I2O_EXT_ADAPTEC=y
> CONFIG_I2O_EXT_ADAPTEC_DMA64=y
> # CONFIG_I2O_CONFIG is not set
> CONFIG_I2O_BUS=m
> CONFIG_I2O_BLOCK=m
> CONFIG_I2O_SCSI=m
> CONFIG_I2O_PROC=m
> # CONFIG_MACINTOSH_DRIVERS is not set
> CONFIG_NETDEVICES=y
> CONFIG_IFB=m
> CONFIG_DUMMY=m
> CONFIG_BONDING=m
> # CONFIG_MACVLAN is not set
> CONFIG_EQUALIZER=m
> CONFIG_TUN=m
> # CONFIG_VETH is not set
> CONFIG_NET_SB1000=m
> # CONFIG_ARCNET is not set
> CONFIG_PHYLIB=y
>
> #
> # MII PHY device drivers
> #
> CONFIG_MARVELL_PHY=m
> CONFIG_DAVICOM_PHY=m
> CONFIG_QSEMI_PHY=m
> CONFIG_LXT_PHY=m
> CONFIG_CICADA_PHY=m
> CONFIG_VITESSE_PHY=m
> CONFIG_SMSC_PHY=m
> CONFIG_BROADCOM_PHY=m
> # CONFIG_ICPLUS_PHY is not set
> # CONFIG_REALTEK_PHY is not set
> # CONFIG_NATIONAL_PHY is not set
> # CONFIG_STE10XP is not set
> # CONFIG_LSI_ET1011C_PHY is not set
> # CONFIG_MICREL_PHY is not set
> # CONFIG_FIXED_PHY is not set
> # CONFIG_MDIO_BITBANG is not set
> CONFIG_NET_ETHERNET=y
> CONFIG_MII=y
> CONFIG_HAPPYMEAL=m
> CONFIG_SUNGEM=m
> CONFIG_CASSINI=m
> CONFIG_NET_VENDOR_3COM=y
> CONFIG_VORTEX=y
> CONFIG_TYPHOON=m
> # CONFIG_ETHOC is not set
> # CONFIG_DNET is not set
> CONFIG_NET_TULIP=y
> CONFIG_DE2104X=m
> CONFIG_DE2104X_DSL=0
> CONFIG_TULIP=m
> # CONFIG_TULIP_MWI is not set
> CONFIG_TULIP_MMIO=y
> # CONFIG_TULIP_NAPI is not set
> CONFIG_DE4X5=m
> CONFIG_WINBOND_840=m
> CONFIG_DM9102=m
> CONFIG_ULI526X=m
> CONFIG_PCMCIA_XIRCOM=m
> # CONFIG_HP100 is not set
> # CONFIG_IBM_NEW_EMAC_ZMII is not set
> # CONFIG_IBM_NEW_EMAC_RGMII is not set
> # CONFIG_IBM_NEW_EMAC_TAH is not set
> # CONFIG_IBM_NEW_EMAC_EMAC4 is not set
> # CONFIG_IBM_NEW_EMAC_NO_FLOW_CTRL is not set
> # CONFIG_IBM_NEW_EMAC_MAL_CLR_ICINTSTAT is not set
> # CONFIG_IBM_NEW_EMAC_MAL_COMMON_ERR is not set
> CONFIG_NET_PCI=y
> CONFIG_PCNET32=m
> CONFIG_AMD8111_ETH=m
> CONFIG_ADAPTEC_STARFIRE=m
> # CONFIG_KSZ884X_PCI is not set
> CONFIG_B44=m
> CONFIG_B44_PCI_AUTOSELECT=y
> CONFIG_B44_PCICORE_AUTOSELECT=y
> CONFIG_B44_PCI=y
> CONFIG_FORCEDETH=y
> CONFIG_E100=y
> CONFIG_FEALNX=m
> CONFIG_NATSEMI=m
> CONFIG_NE2K_PCI=m
> CONFIG_8139CP=m
> CONFIG_8139TOO=y
> # CONFIG_8139TOO_PIO is not set
> # CONFIG_8139TOO_TUNE_TWISTER is not set
> CONFIG_8139TOO_8129=y
> # CONFIG_8139_OLD_RX_RESET is not set
> # CONFIG_R6040 is not set
> CONFIG_SIS900=m
> CONFIG_EPIC100=m
> # CONFIG_SMSC9420 is not set
> CONFIG_SUNDANCE=m
> # CONFIG_SUNDANCE_MMIO is not set
> # CONFIG_TLAN is not set
> # CONFIG_KS8851_MLL is not set
> CONFIG_VIA_RHINE=m
> CONFIG_VIA_RHINE_MMIO=y
> CONFIG_SC92031=m
> CONFIG_NET_POCKET=y
> CONFIG_ATP=m
> CONFIG_DE600=m
> CONFIG_DE620=m
> # CONFIG_ATL2 is not set
> CONFIG_NETDEV_1000=y
> CONFIG_ACENIC=m
> # CONFIG_ACENIC_OMIT_TIGON_I is not set
> CONFIG_DL2K=m
> CONFIG_E1000=y
> CONFIG_E1000E=y
> # CONFIG_IP1000 is not set
> CONFIG_IGB=y
> # CONFIG_IGBVF is not set
> CONFIG_NS83820=m
> CONFIG_HAMACHI=m
> CONFIG_YELLOWFIN=m
> CONFIG_R8169=m
> CONFIG_R8169_VLAN=y
> # CONFIG_SIS190 is not set
> CONFIG_SKGE=m
> # CONFIG_SKGE_DEBUG is not set
> CONFIG_SKY2=m
> # CONFIG_SKY2_DEBUG is not set
> CONFIG_VIA_VELOCITY=m
> CONFIG_TIGON3=y
> CONFIG_BNX2=m
> # CONFIG_CNIC is not set
> CONFIG_QLA3XXX=m
> CONFIG_ATL1=m
> # CONFIG_ATL1E is not set
> # CONFIG_ATL1C is not set
> # CONFIG_JME is not set
> CONFIG_NETDEV_10000=y
> CONFIG_MDIO=m
> CONFIG_CHELSIO_T1=m
> CONFIG_CHELSIO_T1_1G=y
> CONFIG_CHELSIO_T3_DEPENDS=y
> CONFIG_CHELSIO_T3=m
> CONFIG_CHELSIO_T4_DEPENDS=y
> # CONFIG_CHELSIO_T4 is not set
> CONFIG_CHELSIO_T4VF_DEPENDS=y
> # CONFIG_CHELSIO_T4VF is not set
> # CONFIG_ENIC is not set
> # CONFIG_IXGBE is not set
> CONFIG_IXGB=m
> CONFIG_S2IO=m
> CONFIG_MYRI10GE=m
> # CONFIG_NIU is not set
> # CONFIG_MLX4_EN is not set
> # CONFIG_MLX4_CORE is not set
> # CONFIG_TEHUTI is not set
> # CONFIG_BNX2X is not set
> # CONFIG_QLCNIC is not set
> # CONFIG_QLGE is not set
> # CONFIG_SFC is not set
> # CONFIG_BE2NET is not set
> CONFIG_TR=y
> CONFIG_IBMOL=m
> CONFIG_3C359=m
> # CONFIG_TMS380TR is not set
> CONFIG_WLAN=y
> # CONFIG_PCMCIA_RAYCS is not set
> # CONFIG_AIRO is not set
> # CONFIG_ATMEL is not set
> # CONFIG_AIRO_CS is not set
> # CONFIG_PCMCIA_WL3501 is not set
> # CONFIG_PRISM54 is not set
> # CONFIG_USB_ZD1201 is not set
> # CONFIG_HOSTAP is not set
>
> #
> # Enable WiMAX (Networking options) to see the WiMAX drivers
> #
>
> #
> # USB Network Adapters
> #
> CONFIG_USB_CATC=m
> CONFIG_USB_KAWETH=m
> CONFIG_USB_PEGASUS=m
> CONFIG_USB_RTL8150=m
> CONFIG_USB_USBNET=m
> CONFIG_USB_NET_AX8817X=m
> CONFIG_USB_NET_CDCETHER=m
> # CONFIG_USB_NET_CDC_EEM is not set
> CONFIG_USB_NET_DM9601=m
> # CONFIG_USB_NET_SMSC75XX is not set
> # CONFIG_USB_NET_SMSC95XX is not set
> CONFIG_USB_NET_GL620A=m
> CONFIG_USB_NET_NET1080=m
> CONFIG_USB_NET_PLUSB=m
> CONFIG_USB_NET_MCS7830=m
> CONFIG_USB_NET_RNDIS_HOST=m
> CONFIG_USB_NET_CDC_SUBSET=m
> CONFIG_USB_ALI_M5632=y
> CONFIG_USB_AN2720=y
> CONFIG_USB_BELKIN=y
> CONFIG_USB_ARMLINUX=y
> CONFIG_USB_EPSON2888=y
> CONFIG_USB_KC2190=y
> CONFIG_USB_NET_ZAURUS=m
> # CONFIG_USB_HSO is not set
> # CONFIG_USB_NET_INT51X1 is not set
> # CONFIG_USB_IPHETH is not set
> # CONFIG_USB_SIERRA_NET is not set
> CONFIG_NET_PCMCIA=y
> CONFIG_PCMCIA_3C589=m
> CONFIG_PCMCIA_3C574=m
> CONFIG_PCMCIA_FMVJ18X=m
> CONFIG_PCMCIA_PCNET=m
> CONFIG_PCMCIA_NMCLAN=m
> CONFIG_PCMCIA_SMC91C92=m
> CONFIG_PCMCIA_XIRC2PS=m
> CONFIG_PCMCIA_AXNET=m
> # CONFIG_PCMCIA_IBMTR is not set
> # CONFIG_WAN is not set
> CONFIG_ATM_DRIVERS=y
> # CONFIG_ATM_DUMMY is not set
> CONFIG_ATM_TCP=m
> CONFIG_ATM_LANAI=m
> CONFIG_ATM_ENI=m
> # CONFIG_ATM_ENI_DEBUG is not set
> # CONFIG_ATM_ENI_TUNE_BURST is not set
> CONFIG_ATM_FIRESTREAM=m
> # CONFIG_ATM_ZATM is not set
> # CONFIG_ATM_NICSTAR is not set
> CONFIG_ATM_IDT77252=m
> # CONFIG_ATM_IDT77252_DEBUG is not set
> # CONFIG_ATM_IDT77252_RCV_ALL is not set
> CONFIG_ATM_IDT77252_USE_SUNI=y
> CONFIG_ATM_AMBASSADOR=m
> # CONFIG_ATM_AMBASSADOR_DEBUG is not set
> CONFIG_ATM_HORIZON=m
> # CONFIG_ATM_HORIZON_DEBUG is not set
> # CONFIG_ATM_IA is not set
> # CONFIG_ATM_FORE200E is not set
> CONFIG_ATM_HE=m
> # CONFIG_ATM_HE_USE_SUNI is not set
> # CONFIG_ATM_SOLOS is not set
>
> #
> # CAIF transport drivers
> #
> CONFIG_FDDI=y
> # CONFIG_DEFXX is not set
> CONFIG_SKFP=m
> # CONFIG_HIPPI is not set
> CONFIG_PLIP=m
> CONFIG_PPP=m
> CONFIG_PPP_MULTILINK=y
> CONFIG_PPP_FILTER=y
> CONFIG_PPP_ASYNC=m
> CONFIG_PPP_SYNC_TTY=m
> CONFIG_PPP_DEFLATE=m
> # CONFIG_PPP_BSDCOMP is not set
> CONFIG_PPP_MPPE=m
> CONFIG_PPPOE=m
> CONFIG_PPPOATM=m
> CONFIG_SLIP=m
> CONFIG_SLIP_COMPRESSED=y
> CONFIG_SLHC=m
> CONFIG_SLIP_SMART=y
> # CONFIG_SLIP_MODE_SLIP6 is not set
> CONFIG_NET_FC=y
> CONFIG_NETCONSOLE=y
> # CONFIG_NETCONSOLE_DYNAMIC is not set
> CONFIG_NETPOLL=y
> CONFIG_NETPOLL_TRAP=y
> CONFIG_NET_POLL_CONTROLLER=y
> # CONFIG_VMXNET3 is not set
> # CONFIG_ISDN is not set
> # CONFIG_PHONE is not set
>
> #
> # Input device support
> #
> CONFIG_INPUT=y
> CONFIG_INPUT_FF_MEMLESS=y
> CONFIG_INPUT_POLLDEV=y
> CONFIG_INPUT_SPARSEKMAP=m
>
> #
> # Userland interfaces
> #
> CONFIG_INPUT_MOUSEDEV=y
> # CONFIG_INPUT_MOUSEDEV_PSAUX is not set
> CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
> CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
> CONFIG_INPUT_JOYDEV=m
> CONFIG_INPUT_EVDEV=y
> # CONFIG_INPUT_EVBUG is not set
>
> #
> # Input Device Drivers
> #
> CONFIG_INPUT_KEYBOARD=y
> # CONFIG_KEYBOARD_ADP5588 is not set
> CONFIG_KEYBOARD_ATKBD=y
> # CONFIG_KEYBOARD_QT2160 is not set
> # CONFIG_KEYBOARD_LKKBD is not set
> # CONFIG_KEYBOARD_TCA6416 is not set
> # CONFIG_KEYBOARD_LM8323 is not set
> # CONFIG_KEYBOARD_MAX7359 is not set
> # CONFIG_KEYBOARD_MCS is not set
> # CONFIG_KEYBOARD_NEWTON is not set
> # CONFIG_KEYBOARD_OPENCORES is not set
> CONFIG_KEYBOARD_STOWAWAY=m
> # CONFIG_KEYBOARD_SUNKBD is not set
> # CONFIG_KEYBOARD_XTKBD is not set
> CONFIG_INPUT_MOUSE=y
> CONFIG_MOUSE_PS2=y
> CONFIG_MOUSE_PS2_ALPS=y
> CONFIG_MOUSE_PS2_LOGIPS2PP=y
> CONFIG_MOUSE_PS2_SYNAPTICS=y
> CONFIG_MOUSE_PS2_LIFEBOOK=y
> CONFIG_MOUSE_PS2_TRACKPOINT=y
> # CONFIG_MOUSE_PS2_ELANTECH is not set
> # CONFIG_MOUSE_PS2_SENTELIC is not set
> # CONFIG_MOUSE_PS2_TOUCHKIT is not set
> CONFIG_MOUSE_SERIAL=m
> # CONFIG_MOUSE_APPLETOUCH is not set
> # CONFIG_MOUSE_BCM5974 is not set
> CONFIG_MOUSE_VSXXXAA=m
> # CONFIG_MOUSE_SYNAPTICS_I2C is not set
> CONFIG_INPUT_JOYSTICK=y
> CONFIG_JOYSTICK_ANALOG=m
> CONFIG_JOYSTICK_A3D=m
> CONFIG_JOYSTICK_ADI=m
> CONFIG_JOYSTICK_COBRA=m
> CONFIG_JOYSTICK_GF2K=m
> CONFIG_JOYSTICK_GRIP=m
> CONFIG_JOYSTICK_GRIP_MP=m
> CONFIG_JOYSTICK_GUILLEMOT=m
> CONFIG_JOYSTICK_INTERACT=m
> CONFIG_JOYSTICK_SIDEWINDER=m
> CONFIG_JOYSTICK_TMDC=m
> CONFIG_JOYSTICK_IFORCE=m
> CONFIG_JOYSTICK_IFORCE_USB=y
> CONFIG_JOYSTICK_IFORCE_232=y
> CONFIG_JOYSTICK_WARRIOR=m
> CONFIG_JOYSTICK_MAGELLAN=m
> CONFIG_JOYSTICK_SPACEORB=m
> CONFIG_JOYSTICK_SPACEBALL=m
> CONFIG_JOYSTICK_STINGER=m
> CONFIG_JOYSTICK_TWIDJOY=m
> # CONFIG_JOYSTICK_ZHENHUA is not set
> CONFIG_JOYSTICK_DB9=m
> CONFIG_JOYSTICK_GAMECON=m
> CONFIG_JOYSTICK_TURBOGRAFX=m
> CONFIG_JOYSTICK_JOYDUMP=m
> # CONFIG_JOYSTICK_XPAD is not set
> # CONFIG_INPUT_TABLET is not set
> CONFIG_INPUT_TOUCHSCREEN=y
> # CONFIG_TOUCHSCREEN_AD7879 is not set
> # CONFIG_TOUCHSCREEN_DYNAPRO is not set
> # CONFIG_TOUCHSCREEN_HAMPSHIRE is not set
> # CONFIG_TOUCHSCREEN_EETI is not set
> # CONFIG_TOUCHSCREEN_FUJITSU is not set
> CONFIG_TOUCHSCREEN_GUNZE=m
> CONFIG_TOUCHSCREEN_ELO=m
> # CONFIG_TOUCHSCREEN_WACOM_W8001 is not set
> # CONFIG_TOUCHSCREEN_MCS5000 is not set
> CONFIG_TOUCHSCREEN_MTOUCH=m
> # CONFIG_TOUCHSCREEN_INEXIO is not set
> CONFIG_TOUCHSCREEN_MK712=m
> CONFIG_TOUCHSCREEN_PENMOUNT=m
> # CONFIG_TOUCHSCREEN_QT602240 is not set
> CONFIG_TOUCHSCREEN_TOUCHRIGHT=m
> CONFIG_TOUCHSCREEN_TOUCHWIN=m
> # CONFIG_TOUCHSCREEN_WM97XX is not set
> # CONFIG_TOUCHSCREEN_USB_COMPOSITE is not set
> # CONFIG_TOUCHSCREEN_TOUCHIT213 is not set
> # CONFIG_TOUCHSCREEN_TSC2007 is not set
> # CONFIG_TOUCHSCREEN_TPS6507X is not set
> CONFIG_INPUT_MISC=y
> # CONFIG_INPUT_AD714X is not set
> CONFIG_INPUT_PCSPKR=m
> # CONFIG_INPUT_APANEL is not set
> CONFIG_INPUT_ATLAS_BTNS=m
> # CONFIG_INPUT_ATI_REMOTE is not set
> # CONFIG_INPUT_ATI_REMOTE2 is not set
> # CONFIG_INPUT_KEYSPAN_REMOTE is not set
> # CONFIG_INPUT_POWERMATE is not set
> # CONFIG_INPUT_YEALINK is not set
> # CONFIG_INPUT_CM109 is not set
> CONFIG_INPUT_UINPUT=m
> # CONFIG_INPUT_WINBOND_CIR is not set
> # CONFIG_INPUT_PCF8574 is not set
> # CONFIG_INPUT_ADXL34X is not set
>
> #
> # Hardware I/O ports
> #
> CONFIG_SERIO=y
> CONFIG_SERIO_I8042=y
> CONFIG_SERIO_SERPORT=y
> # CONFIG_SERIO_CT82C710 is not set
> # CONFIG_SERIO_PARKBD is not set
> # CONFIG_SERIO_PCIPS2 is not set
> CONFIG_SERIO_LIBPS2=y
> CONFIG_SERIO_RAW=m
> # CONFIG_SERIO_ALTERA_PS2 is not set
> CONFIG_GAMEPORT=m
> CONFIG_GAMEPORT_NS558=m
> CONFIG_GAMEPORT_L4=m
> CONFIG_GAMEPORT_EMU10K1=m
> CONFIG_GAMEPORT_FM801=m
>
> #
> # Character devices
> #
> CONFIG_VT=y
> CONFIG_CONSOLE_TRANSLATIONS=y
> CONFIG_VT_CONSOLE=y
> CONFIG_HW_CONSOLE=y
> CONFIG_VT_HW_CONSOLE_BINDING=y
> CONFIG_DEVKMEM=y
> CONFIG_SERIAL_NONSTANDARD=y
> # CONFIG_COMPUTONE is not set
> # CONFIG_ROCKETPORT is not set
> CONFIG_CYCLADES=m
> # CONFIG_CYZ_INTR is not set
> # CONFIG_DIGIEPCA is not set
> # CONFIG_MOXA_INTELLIO is not set
> # CONFIG_MOXA_SMARTIO is not set
> # CONFIG_ISI is not set
> CONFIG_SYNCLINK=m
> CONFIG_SYNCLINKMP=m
> CONFIG_SYNCLINK_GT=m
> CONFIG_N_HDLC=m
> # CONFIG_N_GSM is not set
> # CONFIG_RISCOM8 is not set
> # CONFIG_SPECIALIX is not set
> # CONFIG_STALDRV is not set
> # CONFIG_NOZOMI is not set
>
> #
> # Serial drivers
> #
> CONFIG_SERIAL_8250=y
> CONFIG_SERIAL_8250_CONSOLE=y
> CONFIG_FIX_EARLYCON_MEM=y
> CONFIG_SERIAL_8250_PCI=y
> CONFIG_SERIAL_8250_PNP=y
> CONFIG_SERIAL_8250_CS=m
> CONFIG_SERIAL_8250_NR_UARTS=32
> CONFIG_SERIAL_8250_RUNTIME_UARTS=4
> CONFIG_SERIAL_8250_EXTENDED=y
> CONFIG_SERIAL_8250_MANY_PORTS=y
> CONFIG_SERIAL_8250_SHARE_IRQ=y
> CONFIG_SERIAL_8250_DETECT_IRQ=y
> CONFIG_SERIAL_8250_RSA=y
>
> #
> # Non-8250 serial port support
> #
> # CONFIG_SERIAL_MFD_HSU is not set
> CONFIG_SERIAL_CORE=y
> CONFIG_SERIAL_CORE_CONSOLE=y
> CONFIG_SERIAL_JSM=m
> # CONFIG_SERIAL_TIMBERDALE is not set
> # CONFIG_SERIAL_ALTERA_JTAGUART is not set
> # CONFIG_SERIAL_ALTERA_UART is not set
> CONFIG_UNIX98_PTYS=y
> # CONFIG_DEVPTS_MULTIPLE_INSTANCES is not set
> # CONFIG_LEGACY_PTYS is not set
> CONFIG_PRINTER=m
> CONFIG_LP_CONSOLE=y
> CONFIG_PPDEV=m
> CONFIG_IPMI_HANDLER=m
> # CONFIG_IPMI_PANIC_EVENT is not set
> CONFIG_IPMI_DEVICE_INTERFACE=m
> CONFIG_IPMI_SI=m
> CONFIG_IPMI_WATCHDOG=m
> CONFIG_IPMI_POWEROFF=m
> CONFIG_HW_RANDOM=y
> # CONFIG_HW_RANDOM_TIMERIOMEM is not set
> CONFIG_HW_RANDOM_INTEL=m
> CONFIG_HW_RANDOM_AMD=m
> CONFIG_HW_RANDOM_VIA=y
> CONFIG_NVRAM=y
> CONFIG_R3964=m
> # CONFIG_APPLICOM is not set
>
> #
> # PCMCIA character devices
> #
> # CONFIG_SYNCLINK_CS is not set
> CONFIG_CARDMAN_4000=m
> CONFIG_CARDMAN_4040=m
> # CONFIG_IPWIRELESS is not set
> CONFIG_MWAVE=m
> # CONFIG_RAW_DRIVER is not set
> # CONFIG_HPET is not set
> CONFIG_HANGCHECK_TIMER=m
> CONFIG_TCG_TPM=m
> CONFIG_TCG_TIS=m
> CONFIG_TCG_NSC=m
> CONFIG_TCG_ATMEL=m
> CONFIG_TCG_INFINEON=m
> # CONFIG_TELCLOCK is not set
> CONFIG_DEVPORT=y
> # CONFIG_RAMOOPS is not set
> CONFIG_I2C=y
> CONFIG_I2C_BOARDINFO=y
> CONFIG_I2C_COMPAT=y
> CONFIG_I2C_CHARDEV=m
> # CONFIG_I2C_MUX is not set
> CONFIG_I2C_HELPER_AUTO=y
> CONFIG_I2C_SMBUS=m
> CONFIG_I2C_ALGOBIT=m
>
> #
> # I2C Hardware Bus support
> #
>
> #
> # PC SMBus host controller drivers
> #
> # CONFIG_I2C_ALI1535 is not set
> # CONFIG_I2C_ALI1563 is not set
> # CONFIG_I2C_ALI15X3 is not set
> CONFIG_I2C_AMD756=m
> CONFIG_I2C_AMD8111=m
> CONFIG_I2C_I801=m
> # CONFIG_I2C_ISCH is not set
> CONFIG_I2C_PIIX4=y
> CONFIG_I2C_NFORCE2=y
> # CONFIG_I2C_SIS5595 is not set
> # CONFIG_I2C_SIS630 is not set
> CONFIG_I2C_SIS96X=m
> CONFIG_I2C_VIA=m
> CONFIG_I2C_VIAPRO=m
>
> #
> # ACPI drivers
> #
> # CONFIG_I2C_SCMI is not set
>
> #
> # I2C system bus drivers (mostly embedded / system-on-chip)
> #
> # CONFIG_I2C_OCORES is not set
> # CONFIG_I2C_PCA_PLATFORM is not set
> # CONFIG_I2C_SIMTEC is not set
> # CONFIG_I2C_XILINX is not set
>
> #
> # External I2C/SMBus adapter drivers
> #
> CONFIG_I2C_PARPORT=m
> CONFIG_I2C_PARPORT_LIGHT=m
> # CONFIG_I2C_TAOS_EVM is not set
> # CONFIG_I2C_TINY_USB is not set
>
> #
> # Other I2C/SMBus bus drivers
> #
> CONFIG_I2C_STUB=m
> # CONFIG_I2C_DEBUG_CORE is not set
> # CONFIG_I2C_DEBUG_ALGO is not set
> # CONFIG_I2C_DEBUG_BUS is not set
> # CONFIG_SPI is not set
>
> #
> # PPS support
> #
> # CONFIG_PPS is not set
> CONFIG_ARCH_WANT_OPTIONAL_GPIOLIB=y
> # CONFIG_GPIOLIB is not set
> CONFIG_W1=m
> CONFIG_W1_CON=y
>
> #
> # 1-wire Bus Masters
> #
> CONFIG_W1_MASTER_MATROX=m
> CONFIG_W1_MASTER_DS2490=m
> CONFIG_W1_MASTER_DS2482=m
>
> #
> # 1-wire Slaves
> #
> CONFIG_W1_SLAVE_THERM=m
> CONFIG_W1_SLAVE_SMEM=m
> # CONFIG_W1_SLAVE_DS2431 is not set
> CONFIG_W1_SLAVE_DS2433=m
> CONFIG_W1_SLAVE_DS2433_CRC=y
> # CONFIG_W1_SLAVE_DS2760 is not set
> # CONFIG_W1_SLAVE_BQ27000 is not set
> CONFIG_POWER_SUPPLY=y
> # CONFIG_POWER_SUPPLY_DEBUG is not set
> # CONFIG_PDA_POWER is not set
> # CONFIG_TEST_POWER is not set
> # CONFIG_BATTERY_DS2760 is not set
> # CONFIG_BATTERY_DS2782 is not set
> # CONFIG_BATTERY_BQ27x00 is not set
> # CONFIG_BATTERY_MAX17040 is not set
> CONFIG_HWMON=y
> CONFIG_HWMON_VID=m
> # CONFIG_HWMON_DEBUG_CHIP is not set
>
> #
> # Native drivers
> #
> CONFIG_SENSORS_ABITUGURU=m
> # CONFIG_SENSORS_ABITUGURU3 is not set
> # CONFIG_SENSORS_AD7414 is not set
> # CONFIG_SENSORS_AD7418 is not set
> CONFIG_SENSORS_ADM1021=m
> CONFIG_SENSORS_ADM1025=m
> CONFIG_SENSORS_ADM1026=m
> CONFIG_SENSORS_ADM1029=m
> CONFIG_SENSORS_ADM1031=m
> CONFIG_SENSORS_ADM9240=m
> # CONFIG_SENSORS_ADT7411 is not set
> # CONFIG_SENSORS_ADT7462 is not set
> # CONFIG_SENSORS_ADT7470 is not set
> # CONFIG_SENSORS_ADT7475 is not set
> # CONFIG_SENSORS_ASC7621 is not set
> CONFIG_SENSORS_K8TEMP=m
> # CONFIG_SENSORS_K10TEMP is not set
> CONFIG_SENSORS_ASB100=m
> CONFIG_SENSORS_ATXP1=m
> CONFIG_SENSORS_DS1621=m
> # CONFIG_SENSORS_I5K_AMB is not set
> CONFIG_SENSORS_F71805F=m
> # CONFIG_SENSORS_F71882FG is not set
> # CONFIG_SENSORS_F75375S is not set
> # CONFIG_SENSORS_FSCHMD is not set
> # CONFIG_SENSORS_G760A is not set
> CONFIG_SENSORS_GL518SM=m
> CONFIG_SENSORS_GL520SM=m
> # CONFIG_SENSORS_CORETEMP is not set
> # CONFIG_SENSORS_PKGTEMP is not set
> # CONFIG_SENSORS_IBMAEM is not set
> # CONFIG_SENSORS_IBMPEX is not set
> CONFIG_SENSORS_IT87=m
> # CONFIG_SENSORS_JC42 is not set
> CONFIG_SENSORS_LM63=m
> # CONFIG_SENSORS_LM73 is not set
> CONFIG_SENSORS_LM75=m
> CONFIG_SENSORS_LM77=m
> CONFIG_SENSORS_LM78=m
> CONFIG_SENSORS_LM80=m
> CONFIG_SENSORS_LM83=m
> CONFIG_SENSORS_LM85=m
> CONFIG_SENSORS_LM87=m
> CONFIG_SENSORS_LM90=m
> CONFIG_SENSORS_LM92=m
> # CONFIG_SENSORS_LM93 is not set
> # CONFIG_SENSORS_LTC4215 is not set
> # CONFIG_SENSORS_LTC4245 is not set
> # CONFIG_SENSORS_LM95241 is not set
> CONFIG_SENSORS_MAX1619=m
> # CONFIG_SENSORS_MAX6650 is not set
> CONFIG_SENSORS_PC87360=m
> CONFIG_SENSORS_PC87427=m
> CONFIG_SENSORS_PCF8591=m
> CONFIG_SENSORS_SIS5595=m
> # CONFIG_SENSORS_SMM665 is not set
> # CONFIG_SENSORS_DME1737 is not set
> # CONFIG_SENSORS_EMC1403 is not set
> # CONFIG_SENSORS_EMC2103 is not set
> CONFIG_SENSORS_SMSC47M1=m
> CONFIG_SENSORS_SMSC47M192=m
> CONFIG_SENSORS_SMSC47B397=m
> # CONFIG_SENSORS_ADS7828 is not set
> # CONFIG_SENSORS_AMC6821 is not set
> # CONFIG_SENSORS_THMC50 is not set
> # CONFIG_SENSORS_TMP102 is not set
> # CONFIG_SENSORS_TMP401 is not set
> # CONFIG_SENSORS_TMP421 is not set
> # CONFIG_SENSORS_VIA_CPUTEMP is not set
> CONFIG_SENSORS_VIA686A=m
> CONFIG_SENSORS_VT1211=m
> CONFIG_SENSORS_VT8231=m
> CONFIG_SENSORS_W83781D=m
> CONFIG_SENSORS_W83791D=m
> CONFIG_SENSORS_W83792D=m
> CONFIG_SENSORS_W83793=m
> CONFIG_SENSORS_W83L785TS=m
> # CONFIG_SENSORS_W83L786NG is not set
> CONFIG_SENSORS_W83627HF=m
> CONFIG_SENSORS_W83627EHF=m
> CONFIG_SENSORS_HDAPS=m
> # CONFIG_SENSORS_LIS3_I2C is not set
> # CONFIG_SENSORS_APPLESMC is not set
>
> #
> # ACPI drivers
> #
> # CONFIG_SENSORS_ATK0110 is not set
> # CONFIG_SENSORS_LIS3LV02D is not set
> CONFIG_THERMAL=y
> # CONFIG_THERMAL_HWMON is not set
> CONFIG_WATCHDOG=y
> # CONFIG_WATCHDOG_NOWAYOUT is not set
>
> #
> # Watchdog Device Drivers
> #
> CONFIG_SOFT_WATCHDOG=m
> # CONFIG_ACQUIRE_WDT is not set
> # CONFIG_ADVANTECH_WDT is not set
> CONFIG_ALIM1535_WDT=m
> CONFIG_ALIM7101_WDT=m
> # CONFIG_F71808E_WDT is not set
> # CONFIG_SC520_WDT is not set
> # CONFIG_EUROTECH_WDT is not set
> # CONFIG_IB700_WDT is not set
> CONFIG_IBMASR=m
> # CONFIG_WAFER_WDT is not set
> CONFIG_I6300ESB_WDT=m
> CONFIG_ITCO_WDT=m
> CONFIG_ITCO_VENDOR_SUPPORT=y
> # CONFIG_IT8712F_WDT is not set
> # CONFIG_IT87_WDT is not set
> # CONFIG_HP_WATCHDOG is not set
> # CONFIG_SC1200_WDT is not set
> CONFIG_PC87413_WDT=m
> # CONFIG_60XX_WDT is not set
> # CONFIG_SBC8360_WDT is not set
> # CONFIG_CPU5_WDT is not set
> # CONFIG_SMSC_SCH311X_WDT is not set
> # CONFIG_SMSC37B787_WDT is not set
> CONFIG_W83627HF_WDT=m
> CONFIG_W83697HF_WDT=m
> # CONFIG_W83697UG_WDT is not set
> CONFIG_W83877F_WDT=m
> CONFIG_W83977F_WDT=m
> CONFIG_MACHZ_WDT=m
> # CONFIG_SBC_EPX_C3_WATCHDOG is not set
>
> #
> # PCI-based Watchdog Cards
> #
> CONFIG_PCIPCWATCHDOG=m
> CONFIG_WDTPCI=m
>
> #
> # USB-based Watchdog Cards
> #
> CONFIG_USBPCWATCHDOG=m
> CONFIG_SSB_POSSIBLE=y
>
> #
> # Sonics Silicon Backplane
> #
> CONFIG_SSB=m
> CONFIG_SSB_SPROM=y
> CONFIG_SSB_PCIHOST_POSSIBLE=y
> CONFIG_SSB_PCIHOST=y
> # CONFIG_SSB_B43_PCI_BRIDGE is not set
> CONFIG_SSB_PCMCIAHOST_POSSIBLE=y
> # CONFIG_SSB_PCMCIAHOST is not set
> CONFIG_SSB_SDIOHOST_POSSIBLE=y
> # CONFIG_SSB_SDIOHOST is not set
> # CONFIG_SSB_DEBUG is not set
> CONFIG_SSB_DRIVER_PCICORE_POSSIBLE=y
> CONFIG_SSB_DRIVER_PCICORE=y
> CONFIG_MFD_SUPPORT=y
> # CONFIG_MFD_CORE is not set
> # CONFIG_MFD_88PM860X is not set
> CONFIG_MFD_SM501=m
> # CONFIG_HTC_PASIC3 is not set
> # CONFIG_TPS6507X is not set
> # CONFIG_TWL4030_CORE is not set
> # CONFIG_MFD_STMPE is not set
> # CONFIG_MFD_TC35892 is not set
> # CONFIG_MFD_TMIO is not set
> # CONFIG_PMIC_DA903X is not set
> # CONFIG_PMIC_ADP5520 is not set
> # CONFIG_MFD_MAX8925 is not set
> # CONFIG_MFD_MAX8998 is not set
> # CONFIG_MFD_WM8400 is not set
> # CONFIG_MFD_WM831X is not set
> # CONFIG_MFD_WM8350_I2C is not set
> # CONFIG_MFD_WM8994 is not set
> # CONFIG_MFD_PCF50633 is not set
> # CONFIG_ABX500_CORE is not set
> # CONFIG_LPC_SCH is not set
> # CONFIG_MFD_RDC321X is not set
> # CONFIG_MFD_JANZ_CMODIO is not set
> # CONFIG_REGULATOR is not set
> # CONFIG_MEDIA_SUPPORT is not set
>
> #
> # Graphics support
> #
> CONFIG_AGP=y
> CONFIG_AGP_AMD64=y
> CONFIG_AGP_INTEL=y
> CONFIG_AGP_SIS=y
> CONFIG_AGP_VIA=y
> CONFIG_VGA_ARB=y
> CONFIG_VGA_ARB_MAX_GPUS=16
> # CONFIG_VGA_SWITCHEROO is not set
> CONFIG_DRM=m
> CONFIG_DRM_KMS_HELPER=m
> CONFIG_DRM_TTM=m
> CONFIG_DRM_TDFX=m
> CONFIG_DRM_R128=m
> CONFIG_DRM_RADEON=m
> CONFIG_DRM_I810=m
> CONFIG_DRM_I830=m
> CONFIG_DRM_I915=m
> # CONFIG_DRM_I915_KMS is not set
> CONFIG_DRM_MGA=m
> # CONFIG_DRM_SIS is not set
> CONFIG_DRM_VIA=m
> CONFIG_DRM_SAVAGE=m
> CONFIG_VGASTATE=m
> CONFIG_VIDEO_OUTPUT_CONTROL=m
> CONFIG_FB=y
> # CONFIG_FIRMWARE_EDID is not set
> CONFIG_FB_DDC=m
> # CONFIG_FB_BOOT_VESA_SUPPORT is not set
> CONFIG_FB_CFB_FILLRECT=m
> CONFIG_FB_CFB_COPYAREA=m
> CONFIG_FB_CFB_IMAGEBLIT=m
> # CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set
> # CONFIG_FB_SYS_FILLRECT is not set
> # CONFIG_FB_SYS_COPYAREA is not set
> # CONFIG_FB_SYS_IMAGEBLIT is not set
> # CONFIG_FB_FOREIGN_ENDIAN is not set
> # CONFIG_FB_SYS_FOPS is not set
> CONFIG_FB_SVGALIB=m
> # CONFIG_FB_MACMODES is not set
> CONFIG_FB_BACKLIGHT=y
> CONFIG_FB_MODE_HELPERS=y
> CONFIG_FB_TILEBLITTING=y
>
> #
> # Frame buffer hardware drivers
> #
> # CONFIG_FB_CIRRUS is not set
> # CONFIG_FB_PM2 is not set
> # CONFIG_FB_CYBER2000 is not set
> # CONFIG_FB_ARC is not set
> # CONFIG_FB_ASILIANT is not set
> # CONFIG_FB_IMSTT is not set
> # CONFIG_FB_VGA16 is not set
> # CONFIG_FB_UVESA is not set
> # CONFIG_FB_VESA is not set
> # CONFIG_FB_N411 is not set
> # CONFIG_FB_HGA is not set
> # CONFIG_FB_S1D13XXX is not set
> CONFIG_FB_NVIDIA=m
> CONFIG_FB_NVIDIA_I2C=y
> # CONFIG_FB_NVIDIA_DEBUG is not set
> CONFIG_FB_NVIDIA_BACKLIGHT=y
> CONFIG_FB_RIVA=m
> # CONFIG_FB_RIVA_I2C is not set
> # CONFIG_FB_RIVA_DEBUG is not set
> CONFIG_FB_RIVA_BACKLIGHT=y
> # CONFIG_FB_LE80578 is not set
> CONFIG_FB_MATROX=m
> CONFIG_FB_MATROX_MILLENIUM=y
> CONFIG_FB_MATROX_MYSTIQUE=y
> CONFIG_FB_MATROX_G=y
> CONFIG_FB_MATROX_I2C=m
> CONFIG_FB_MATROX_MAVEN=m
> # CONFIG_FB_RADEON is not set
> CONFIG_FB_ATY128=m
> CONFIG_FB_ATY128_BACKLIGHT=y
> CONFIG_FB_ATY=m
> CONFIG_FB_ATY_CT=y
> CONFIG_FB_ATY_GENERIC_LCD=y
> CONFIG_FB_ATY_GX=y
> CONFIG_FB_ATY_BACKLIGHT=y
> CONFIG_FB_S3=m
> CONFIG_FB_SAVAGE=m
> CONFIG_FB_SAVAGE_I2C=y
> CONFIG_FB_SAVAGE_ACCEL=y
> # CONFIG_FB_SIS is not set
> # CONFIG_FB_VIA is not set
> CONFIG_FB_NEOMAGIC=m
> CONFIG_FB_KYRO=m
> CONFIG_FB_3DFX=m
> CONFIG_FB_3DFX_ACCEL=y
> CONFIG_FB_3DFX_I2C=y
> CONFIG_FB_VOODOO1=m
> # CONFIG_FB_VT8623 is not set
> CONFIG_FB_TRIDENT=m
> # CONFIG_FB_ARK is not set
> # CONFIG_FB_PM3 is not set
> # CONFIG_FB_CARMINE is not set
> # CONFIG_FB_GEODE is not set
> CONFIG_FB_SM501=m
> # CONFIG_FB_VIRTUAL is not set
> # CONFIG_FB_METRONOME is not set
> # CONFIG_FB_MB862XX is not set
> # CONFIG_FB_BROADSHEET is not set
> CONFIG_BACKLIGHT_LCD_SUPPORT=y
> CONFIG_LCD_CLASS_DEVICE=m
> # CONFIG_LCD_PLATFORM is not set
> CONFIG_BACKLIGHT_CLASS_DEVICE=y
> CONFIG_BACKLIGHT_GENERIC=y
> CONFIG_BACKLIGHT_PROGEAR=m
> # CONFIG_BACKLIGHT_MBP_NVIDIA is not set
> # CONFIG_BACKLIGHT_SAHARA is not set
> # CONFIG_BACKLIGHT_ADP8860 is not set
>
> #
> # Display device support
> #
> # CONFIG_DISPLAY_SUPPORT is not set
>
> #
> # Console display driver support
> #
> CONFIG_VGA_CONSOLE=y
> CONFIG_VGACON_SOFT_SCROLLBACK=y
> CONFIG_VGACON_SOFT_SCROLLBACK_SIZE=64
> CONFIG_DUMMY_CONSOLE=y
> CONFIG_FRAMEBUFFER_CONSOLE=m
> # CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY is not set
> # CONFIG_FRAMEBUFFER_CONSOLE_ROTATION is not set
> # CONFIG_FONTS is not set
> CONFIG_FONT_8x8=y
> CONFIG_FONT_8x16=y
> CONFIG_LOGO=y
> # CONFIG_LOGO_LINUX_MONO is not set
> # CONFIG_LOGO_LINUX_VGA16 is not set
> CONFIG_LOGO_LINUX_CLUT224=y
> CONFIG_SOUND=m
> CONFIG_SOUND_OSS_CORE=y
> CONFIG_SOUND_OSS_CORE_PRECLAIM=y
> CONFIG_SND=m
> CONFIG_SND_TIMER=m
> CONFIG_SND_PCM=m
> CONFIG_SND_HWDEP=m
> CONFIG_SND_RAWMIDI=m
> CONFIG_SND_JACK=y
> CONFIG_SND_SEQUENCER=m
> CONFIG_SND_SEQ_DUMMY=m
> CONFIG_SND_OSSEMUL=y
> CONFIG_SND_MIXER_OSS=m
> CONFIG_SND_PCM_OSS=m
> CONFIG_SND_PCM_OSS_PLUGINS=y
> CONFIG_SND_SEQUENCER_OSS=y
> CONFIG_SND_DYNAMIC_MINORS=y
> # CONFIG_SND_SUPPORT_OLD_API is not set
> CONFIG_SND_VERBOSE_PROCFS=y
> # CONFIG_SND_VERBOSE_PRINTK is not set
> # CONFIG_SND_DEBUG is not set
> CONFIG_SND_VMASTER=y
> CONFIG_SND_DMA_SGBUF=y
> CONFIG_SND_RAWMIDI_SEQ=m
> CONFIG_SND_OPL3_LIB_SEQ=m
> # CONFIG_SND_OPL4_LIB_SEQ is not set
> # CONFIG_SND_SBAWE_SEQ is not set
> CONFIG_SND_EMU10K1_SEQ=m
> CONFIG_SND_MPU401_UART=m
> CONFIG_SND_OPL3_LIB=m
> CONFIG_SND_VX_LIB=m
> CONFIG_SND_AC97_CODEC=m
> CONFIG_SND_DRIVERS=y
> CONFIG_SND_DUMMY=m
> CONFIG_SND_VIRMIDI=m
> CONFIG_SND_MTS64=m
> # CONFIG_SND_SERIAL_U16550 is not set
> CONFIG_SND_MPU401=m
> CONFIG_SND_PORTMAN2X4=m
> CONFIG_SND_AC97_POWER_SAVE=y
> CONFIG_SND_AC97_POWER_SAVE_DEFAULT=0
> CONFIG_SND_SB_COMMON=m
> CONFIG_SND_PCI=y
> CONFIG_SND_AD1889=m
> CONFIG_SND_ALS300=m
> CONFIG_SND_ALS4000=m
> CONFIG_SND_ALI5451=m
> # CONFIG_SND_ASIHPI is not set
> CONFIG_SND_ATIIXP=m
> CONFIG_SND_ATIIXP_MODEM=m
> CONFIG_SND_AU8810=m
> CONFIG_SND_AU8820=m
> CONFIG_SND_AU8830=m
> # CONFIG_SND_AW2 is not set
> CONFIG_SND_AZT3328=m
> CONFIG_SND_BT87X=m
> # CONFIG_SND_BT87X_OVERCLOCK is not set
> CONFIG_SND_CA0106=m
> CONFIG_SND_CMIPCI=m
> # CONFIG_SND_OXYGEN is not set
> CONFIG_SND_CS4281=m
> CONFIG_SND_CS46XX=m
> CONFIG_SND_CS46XX_NEW_DSP=y
> # CONFIG_SND_CS5530 is not set
> # CONFIG_SND_CS5535AUDIO is not set
> # CONFIG_SND_CTXFI is not set
> CONFIG_SND_DARLA20=m
> CONFIG_SND_GINA20=m
> CONFIG_SND_LAYLA20=m
> CONFIG_SND_DARLA24=m
> CONFIG_SND_GINA24=m
> CONFIG_SND_LAYLA24=m
> CONFIG_SND_MONA=m
> CONFIG_SND_MIA=m
> CONFIG_SND_ECHO3G=m
> CONFIG_SND_INDIGO=m
> CONFIG_SND_INDIGOIO=m
> CONFIG_SND_INDIGODJ=m
> # CONFIG_SND_INDIGOIOX is not set
> # CONFIG_SND_INDIGODJX is not set
> CONFIG_SND_EMU10K1=m
> CONFIG_SND_EMU10K1X=m
> CONFIG_SND_ENS1370=m
> CONFIG_SND_ENS1371=m
> CONFIG_SND_ES1938=m
> CONFIG_SND_ES1968=m
> # CONFIG_SND_ES1968_INPUT is not set
> CONFIG_SND_FM801=m
> CONFIG_SND_HDA_INTEL=m
> # CONFIG_SND_HDA_HWDEP is not set
> # CONFIG_SND_HDA_INPUT_BEEP is not set
> # CONFIG_SND_HDA_INPUT_JACK is not set
> # CONFIG_SND_HDA_PATCH_LOADER is not set
> CONFIG_SND_HDA_CODEC_REALTEK=y
> CONFIG_SND_HDA_CODEC_ANALOG=y
> CONFIG_SND_HDA_CODEC_SIGMATEL=y
> CONFIG_SND_HDA_CODEC_VIA=y
> CONFIG_SND_HDA_CODEC_ATIHDMI=y
> CONFIG_SND_HDA_CODEC_NVHDMI=y
> CONFIG_SND_HDA_CODEC_INTELHDMI=y
> CONFIG_SND_HDA_ELD=y
> CONFIG_SND_HDA_CODEC_CIRRUS=y
> CONFIG_SND_HDA_CODEC_CONEXANT=y
> CONFIG_SND_HDA_CODEC_CA0110=y
> CONFIG_SND_HDA_CODEC_CMEDIA=y
> CONFIG_SND_HDA_CODEC_SI3054=y
> CONFIG_SND_HDA_GENERIC=y
> # CONFIG_SND_HDA_POWER_SAVE is not set
> CONFIG_SND_HDSP=m
> CONFIG_SND_HDSPM=m
> # CONFIG_SND_HIFIER is not set
> CONFIG_SND_ICE1712=m
> CONFIG_SND_ICE1724=m
> CONFIG_SND_INTEL8X0=m
> CONFIG_SND_INTEL8X0M=m
> CONFIG_SND_KORG1212=m
> # CONFIG_SND_LX6464ES is not set
> CONFIG_SND_MAESTRO3=m
> # CONFIG_SND_MAESTRO3_INPUT is not set
> CONFIG_SND_MIXART=m
> CONFIG_SND_NM256=m
> CONFIG_SND_PCXHR=m
> CONFIG_SND_RIPTIDE=m
> CONFIG_SND_RME32=m
> CONFIG_SND_RME96=m
> CONFIG_SND_RME9652=m
> CONFIG_SND_SONICVIBES=m
> CONFIG_SND_TRIDENT=m
> CONFIG_SND_VIA82XX=m
> CONFIG_SND_VIA82XX_MODEM=m
> # CONFIG_SND_VIRTUOSO is not set
> CONFIG_SND_VX222=m
> CONFIG_SND_YMFPCI=m
> CONFIG_SND_USB=y
> CONFIG_SND_USB_AUDIO=m
> # CONFIG_SND_USB_UA101 is not set
> CONFIG_SND_USB_USX2Y=m
> # CONFIG_SND_USB_CAIAQ is not set
> # CONFIG_SND_USB_US122L is not set
> CONFIG_SND_PCMCIA=y
> # CONFIG_SND_VXPOCKET is not set
> # CONFIG_SND_PDAUDIOCF is not set
> CONFIG_SND_SOC=m
> CONFIG_SND_SOC_I2C_AND_SPI=m
> # CONFIG_SND_SOC_ALL_CODECS is not set
> # CONFIG_SOUND_PRIME is not set
> CONFIG_AC97_BUS=m
> CONFIG_HID_SUPPORT=y
> CONFIG_HID=y
> # CONFIG_HIDRAW is not set
>
> #
> # USB Input Devices
> #
> CONFIG_USB_HID=y
> CONFIG_HID_PID=y
> CONFIG_USB_HIDDEV=y
>
> #
> # Special HID drivers
> #
> # CONFIG_HID_3M_PCT is not set
> CONFIG_HID_A4TECH=y
> # CONFIG_HID_ACRUX_FF is not set
> CONFIG_HID_APPLE=y
> CONFIG_HID_BELKIN=y
> # CONFIG_HID_CANDO is not set
> CONFIG_HID_CHERRY=y
> CONFIG_HID_CHICONY=y
> # CONFIG_HID_PRODIKEYS is not set
> CONFIG_HID_CYPRESS=y
> CONFIG_HID_DRAGONRISE=y
> # CONFIG_DRAGONRISE_FF is not set
> # CONFIG_HID_EGALAX is not set
> # CONFIG_HID_ELECOM is not set
> CONFIG_HID_EZKEY=y
> CONFIG_HID_KYE=y
> CONFIG_HID_GYRATION=y
> CONFIG_HID_TWINHAN=y
> CONFIG_HID_KENSINGTON=y
> CONFIG_HID_LOGITECH=y
> CONFIG_LOGITECH_FF=y
> # CONFIG_LOGIRUMBLEPAD2_FF is not set
> # CONFIG_LOGIG940_FF is not set
> # CONFIG_HID_MAGICMOUSE is not set
> CONFIG_HID_MICROSOFT=y
> # CONFIG_HID_MOSART is not set
> CONFIG_HID_MONTEREY=y
> CONFIG_HID_NTRIG=y
> CONFIG_HID_ORTEK=y
> CONFIG_HID_PANTHERLORD=y
> CONFIG_PANTHERLORD_FF=y
> CONFIG_HID_PETALYNX=y
> # CONFIG_HID_PICOLCD is not set
> # CONFIG_HID_QUANTA is not set
> # CONFIG_HID_ROCCAT is not set
> # CONFIG_HID_ROCCAT_KONE is not set
> CONFIG_HID_SAMSUNG=y
> CONFIG_HID_SONY=y
> # CONFIG_HID_STANTUM is not set
> CONFIG_HID_SUNPLUS=y
> CONFIG_HID_GREENASIA=y
> # CONFIG_GREENASIA_FF is not set
> CONFIG_HID_SMARTJOYPLUS=y
> # CONFIG_SMARTJOYPLUS_FF is not set
> CONFIG_HID_TOPSEED=y
> CONFIG_HID_THRUSTMASTER=y
> CONFIG_THRUSTMASTER_FF=y
> # CONFIG_HID_WACOM is not set
> CONFIG_HID_ZEROPLUS=y
> CONFIG_ZEROPLUS_FF=y
> # CONFIG_HID_ZYDACRON is not set
> CONFIG_USB_SUPPORT=y
> CONFIG_USB_ARCH_HAS_HCD=y
> CONFIG_USB_ARCH_HAS_OHCI=y
> CONFIG_USB_ARCH_HAS_EHCI=y
> CONFIG_USB=y
> # CONFIG_USB_DEBUG is not set
> # CONFIG_USB_ANNOUNCE_NEW_DEVICES is not set
>
> #
> # Miscellaneous USB options
> #
> CONFIG_USB_DEVICEFS=y
> CONFIG_USB_DEVICE_CLASS=y
> # CONFIG_USB_DYNAMIC_MINORS is not set
> CONFIG_USB_MON=y
> # CONFIG_USB_WUSB is not set
> # CONFIG_USB_WUSB_CBAF is not set
>
> #
> # USB Host Controller Drivers
> #
> # CONFIG_USB_C67X00_HCD is not set
> # CONFIG_USB_XHCI_HCD is not set
> CONFIG_USB_EHCI_HCD=y
> CONFIG_USB_EHCI_ROOT_HUB_TT=y
> CONFIG_USB_EHCI_TT_NEWSCHED=y
> # CONFIG_USB_OXU210HP_HCD is not set
> CONFIG_USB_ISP116X_HCD=m
> # CONFIG_USB_ISP1760_HCD is not set
> # CONFIG_USB_ISP1362_HCD is not set
> CONFIG_USB_OHCI_HCD=y
> # CONFIG_USB_OHCI_BIG_ENDIAN_DESC is not set
> # CONFIG_USB_OHCI_BIG_ENDIAN_MMIO is not set
> CONFIG_USB_OHCI_LITTLE_ENDIAN=y
> CONFIG_USB_UHCI_HCD=y
> CONFIG_USB_U132_HCD=m
> CONFIG_USB_SL811_HCD=m
> CONFIG_USB_SL811_CS=m
> # CONFIG_USB_R8A66597_HCD is not set
> # CONFIG_USB_HWA_HCD is not set
>
> #
> # USB Device Class drivers
> #
> CONFIG_USB_ACM=m
> CONFIG_USB_PRINTER=m
> # CONFIG_USB_WDM is not set
> # CONFIG_USB_TMC is not set
>
> #
> # NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may
> #
>
> #
> # also be needed; see USB_STORAGE Help for more info
> #
> CONFIG_USB_STORAGE=m
> # CONFIG_USB_STORAGE_DEBUG is not set
> CONFIG_USB_STORAGE_DATAFAB=m
> CONFIG_USB_STORAGE_FREECOM=m
> CONFIG_USB_STORAGE_ISD200=m
> CONFIG_USB_STORAGE_USBAT=m
> CONFIG_USB_STORAGE_SDDR09=m
> CONFIG_USB_STORAGE_SDDR55=m
> CONFIG_USB_STORAGE_JUMPSHOT=m
> CONFIG_USB_STORAGE_ALAUDA=m
> # CONFIG_USB_STORAGE_ONETOUCH is not set
> CONFIG_USB_STORAGE_KARMA=m
> # CONFIG_USB_STORAGE_CYPRESS_ATACB is not set
> CONFIG_USB_LIBUSUAL=y
>
> #
> # USB Imaging devices
> #
> CONFIG_USB_MDC800=m
> CONFIG_USB_MICROTEK=m
>
> #
> # USB port drivers
> #
> CONFIG_USB_USS720=m
> CONFIG_USB_SERIAL=m
> CONFIG_USB_EZUSB=y
> CONFIG_USB_SERIAL_GENERIC=y
> CONFIG_USB_SERIAL_AIRCABLE=m
> CONFIG_USB_SERIAL_ARK3116=m
> CONFIG_USB_SERIAL_BELKIN=m
> # CONFIG_USB_SERIAL_CH341 is not set
> CONFIG_USB_SERIAL_WHITEHEAT=m
> CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m
> # CONFIG_USB_SERIAL_CP210X is not set
> CONFIG_USB_SERIAL_CYPRESS_M8=m
> CONFIG_USB_SERIAL_EMPEG=m
> CONFIG_USB_SERIAL_FTDI_SIO=m
> CONFIG_USB_SERIAL_FUNSOFT=m
> CONFIG_USB_SERIAL_VISOR=m
> CONFIG_USB_SERIAL_IPAQ=m
> CONFIG_USB_SERIAL_IR=m
> CONFIG_USB_SERIAL_EDGEPORT=m
> CONFIG_USB_SERIAL_EDGEPORT_TI=m
> CONFIG_USB_SERIAL_GARMIN=m
> CONFIG_USB_SERIAL_IPW=m
> # CONFIG_USB_SERIAL_IUU is not set
> CONFIG_USB_SERIAL_KEYSPAN_PDA=m
> CONFIG_USB_SERIAL_KEYSPAN=m
> CONFIG_USB_SERIAL_KEYSPAN_MPR=y
> CONFIG_USB_SERIAL_KEYSPAN_USA28=y
> CONFIG_USB_SERIAL_KEYSPAN_USA28X=y
> CONFIG_USB_SERIAL_KEYSPAN_USA28XA=y
> CONFIG_USB_SERIAL_KEYSPAN_USA28XB=y
> CONFIG_USB_SERIAL_KEYSPAN_USA19=y
> CONFIG_USB_SERIAL_KEYSPAN_USA18X=y
> CONFIG_USB_SERIAL_KEYSPAN_USA19W=y
> CONFIG_USB_SERIAL_KEYSPAN_USA19QW=y
> CONFIG_USB_SERIAL_KEYSPAN_USA19QI=y
> CONFIG_USB_SERIAL_KEYSPAN_USA49W=y
> CONFIG_USB_SERIAL_KEYSPAN_USA49WLC=y
> CONFIG_USB_SERIAL_KLSI=m
> CONFIG_USB_SERIAL_KOBIL_SCT=m
> CONFIG_USB_SERIAL_MCT_U232=m
> CONFIG_USB_SERIAL_MOS7720=m
> # CONFIG_USB_SERIAL_MOS7715_PARPORT is not set
> CONFIG_USB_SERIAL_MOS7840=m
> # CONFIG_USB_SERIAL_MOTOROLA is not set
> CONFIG_USB_SERIAL_NAVMAN=m
> CONFIG_USB_SERIAL_PL2303=m
> # CONFIG_USB_SERIAL_OTI6858 is not set
> # CONFIG_USB_SERIAL_QCAUX is not set
> # CONFIG_USB_SERIAL_QUALCOMM is not set
> # CONFIG_USB_SERIAL_SPCP8X5 is not set
> CONFIG_USB_SERIAL_HP4X=m
> CONFIG_USB_SERIAL_SAFE=m
> CONFIG_USB_SERIAL_SAFE_PADDED=y
> # CONFIG_USB_SERIAL_SIEMENS_MPI is not set
> CONFIG_USB_SERIAL_SIERRAWIRELESS=m
> # CONFIG_USB_SERIAL_SYMBOL is not set
> CONFIG_USB_SERIAL_TI=m
> CONFIG_USB_SERIAL_CYBERJACK=m
> CONFIG_USB_SERIAL_XIRCOM=m
> CONFIG_USB_SERIAL_WWAN=m
> CONFIG_USB_SERIAL_OPTION=m
> CONFIG_USB_SERIAL_OMNINET=m
> # CONFIG_USB_SERIAL_OPTICON is not set
> # CONFIG_USB_SERIAL_VIVOPAY_SERIAL is not set
> # CONFIG_USB_SERIAL_ZIO is not set
> # CONFIG_USB_SERIAL_SSU100 is not set
> CONFIG_USB_SERIAL_DEBUG=m
>
> #
> # USB Miscellaneous drivers
> #
> CONFIG_USB_EMI62=m
> CONFIG_USB_EMI26=m
> CONFIG_USB_ADUTUX=m
> # CONFIG_USB_SEVSEG is not set
> CONFIG_USB_RIO500=m
> CONFIG_USB_LEGOTOWER=m
> CONFIG_USB_LCD=m
> CONFIG_USB_LED=m
> # CONFIG_USB_CYPRESS_CY7C63 is not set
> # CONFIG_USB_CYTHERM is not set
> CONFIG_USB_IDMOUSE=m
> CONFIG_USB_FTDI_ELAN=m
> CONFIG_USB_APPLEDISPLAY=m
> CONFIG_USB_SISUSBVGA=m
> CONFIG_USB_SISUSBVGA_CON=y
> CONFIG_USB_LD=m
> CONFIG_USB_TRANCEVIBRATOR=m
> CONFIG_USB_IOWARRIOR=m
> CONFIG_USB_TEST=m
> # CONFIG_USB_ISIGHTFW is not set
> CONFIG_USB_ATM=m
> CONFIG_USB_SPEEDTOUCH=m
> CONFIG_USB_CXACRU=m
> CONFIG_USB_UEAGLEATM=m
> CONFIG_USB_XUSBATM=m
> # CONFIG_USB_GADGET is not set
>
> #
> # OTG and related infrastructure
> #
> # CONFIG_NOP_USB_XCEIV is not set
> # CONFIG_UWB is not set
> CONFIG_MMC=m
> # CONFIG_MMC_DEBUG is not set
> # CONFIG_MMC_UNSAFE_RESUME is not set
>
> #
> # MMC/SD/SDIO Card Drivers
> #
> CONFIG_MMC_BLOCK=m
> CONFIG_MMC_BLOCK_BOUNCE=y
> # CONFIG_SDIO_UART is not set
> # CONFIG_MMC_TEST is not set
>
> #
> # MMC/SD/SDIO Host Controller Drivers
> #
> CONFIG_MMC_SDHCI=m
> # CONFIG_MMC_SDHCI_PCI is not set
> # CONFIG_MMC_SDHCI_PLTFM is not set
> CONFIG_MMC_WBSD=m
> CONFIG_MMC_TIFM_SD=m
> # CONFIG_MMC_SDRICOH_CS is not set
> # CONFIG_MMC_CB710 is not set
> # CONFIG_MMC_VIA_SDMMC is not set
> # CONFIG_MEMSTICK is not set
> CONFIG_NEW_LEDS=y
> CONFIG_LEDS_CLASS=y
>
> #
> # LED drivers
> #
> # CONFIG_LEDS_ALIX2 is not set
> # CONFIG_LEDS_PCA9532 is not set
> # CONFIG_LEDS_LP3944 is not set
> # CONFIG_LEDS_CLEVO_MAIL is not set
> # CONFIG_LEDS_PCA955X is not set
> # CONFIG_LEDS_BD2802 is not set
> CONFIG_LEDS_TRIGGERS=y
>
> #
> # LED Triggers
> #
> CONFIG_LEDS_TRIGGER_TIMER=m
> CONFIG_LEDS_TRIGGER_HEARTBEAT=m
> # CONFIG_LEDS_TRIGGER_BACKLIGHT is not set
> # CONFIG_LEDS_TRIGGER_DEFAULT_ON is not set
>
> #
> # iptables trigger is under Netfilter config (LED target)
> #
> # CONFIG_ACCESSIBILITY is not set
> # CONFIG_INFINIBAND is not set
> CONFIG_EDAC=y
>
> #
> # Reporting subsystems
> #
> # CONFIG_EDAC_DEBUG is not set
> CONFIG_EDAC_DECODE_MCE=y
> CONFIG_EDAC_MM_EDAC=m
> # CONFIG_EDAC_AMD64 is not set
> CONFIG_EDAC_E752X=m
> # CONFIG_EDAC_I82975X is not set
> # CONFIG_EDAC_I3000 is not set
> # CONFIG_EDAC_I3200 is not set
> # CONFIG_EDAC_X38 is not set
> # CONFIG_EDAC_I5400 is not set
> # CONFIG_EDAC_I7CORE is not set
> # CONFIG_EDAC_I5000 is not set
> # CONFIG_EDAC_I5100 is not set
> CONFIG_RTC_LIB=m
> CONFIG_RTC_CLASS=m
>
> #
> # RTC interfaces
> #
> CONFIG_RTC_INTF_SYSFS=y
> CONFIG_RTC_INTF_PROC=y
> CONFIG_RTC_INTF_DEV=y
> # CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
> # CONFIG_RTC_DRV_TEST is not set
>
> #
> # I2C RTC drivers
> #
> CONFIG_RTC_DRV_DS1307=m
> # CONFIG_RTC_DRV_DS1374 is not set
> CONFIG_RTC_DRV_DS1672=m
> # CONFIG_RTC_DRV_DS3232 is not set
> # CONFIG_RTC_DRV_MAX6900 is not set
> CONFIG_RTC_DRV_RS5C372=m
> CONFIG_RTC_DRV_ISL1208=m
> # CONFIG_RTC_DRV_ISL12022 is not set
> CONFIG_RTC_DRV_X1205=m
> CONFIG_RTC_DRV_PCF8563=m
> # CONFIG_RTC_DRV_PCF8583 is not set
> # CONFIG_RTC_DRV_M41T80 is not set
> # CONFIG_RTC_DRV_BQ32K is not set
> # CONFIG_RTC_DRV_S35390A is not set
> # CONFIG_RTC_DRV_FM3130 is not set
> # CONFIG_RTC_DRV_RX8581 is not set
> # CONFIG_RTC_DRV_RX8025 is not set
>
> #
> # SPI RTC drivers
> #
>
> #
> # Platform RTC drivers
> #
> CONFIG_RTC_DRV_CMOS=m
> # CONFIG_RTC_DRV_DS1286 is not set
> # CONFIG_RTC_DRV_DS1511 is not set
> CONFIG_RTC_DRV_DS1553=m
> CONFIG_RTC_DRV_DS1742=m
> # CONFIG_RTC_DRV_STK17TA8 is not set
> # CONFIG_RTC_DRV_M48T86 is not set
> # CONFIG_RTC_DRV_M48T35 is not set
> # CONFIG_RTC_DRV_M48T59 is not set
> # CONFIG_RTC_DRV_MSM6242 is not set
> # CONFIG_RTC_DRV_BQ4802 is not set
> # CONFIG_RTC_DRV_RP5C01 is not set
> CONFIG_RTC_DRV_V3020=m
>
> #
> # on-CPU RTC drivers
> #
> # CONFIG_DMADEVICES is not set
> # CONFIG_AUXDISPLAY is not set
> # CONFIG_UIO is not set
> # CONFIG_STAGING is not set
> CONFIG_X86_PLATFORM_DEVICES=y
> # CONFIG_ACER_WMI is not set
> CONFIG_ASUS_LAPTOP=m
> # CONFIG_DELL_LAPTOP is not set
> # CONFIG_FUJITSU_LAPTOP is not set
> CONFIG_MSI_LAPTOP=m
> # CONFIG_PANASONIC_LAPTOP is not set
> # CONFIG_COMPAL_LAPTOP is not set
> CONFIG_SONY_LAPTOP=m
> # CONFIG_SONYPI_COMPAT is not set
> # CONFIG_IDEAPAD_ACPI is not set
> # CONFIG_THINKPAD_ACPI is not set
> # CONFIG_INTEL_MENLOW is not set
> # CONFIG_EEEPC_LAPTOP is not set
> # CONFIG_ACPI_WMI is not set
> CONFIG_ACPI_ASUS=m
> # CONFIG_TOPSTAR_LAPTOP is not set
> CONFIG_ACPI_TOSHIBA=m
> # CONFIG_TOSHIBA_BT_RFKILL is not set
> # CONFIG_ACPI_CMPC is not set
> # CONFIG_INTEL_IPS is not set
>
> #
> # Firmware Drivers
> #
> CONFIG_EDD=m
> # CONFIG_EDD_OFF is not set
> CONFIG_FIRMWARE_MEMMAP=y
> CONFIG_DELL_RBU=m
> CONFIG_DCDBAS=m
> CONFIG_DMIID=y
> # CONFIG_ISCSI_IBFT_FIND is not set
>
> #
> # File systems
> #
> CONFIG_EXT2_FS=y
> CONFIG_EXT2_FS_XATTR=y
> CONFIG_EXT2_FS_POSIX_ACL=y
> CONFIG_EXT2_FS_SECURITY=y
> CONFIG_EXT2_FS_XIP=y
> CONFIG_EXT3_FS=y
> # CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set
> CONFIG_EXT3_FS_XATTR=y
> CONFIG_EXT3_FS_POSIX_ACL=y
> CONFIG_EXT3_FS_SECURITY=y
> # CONFIG_EXT4_FS is not set
> CONFIG_FS_XIP=y
> CONFIG_JBD=y
> # CONFIG_JBD_DEBUG is not set
> CONFIG_JBD2=m
> # CONFIG_JBD2_DEBUG is not set
> CONFIG_FS_MBCACHE=y
> CONFIG_REISERFS_FS=m
> # CONFIG_REISERFS_CHECK is not set
> CONFIG_REISERFS_PROC_INFO=y
> CONFIG_REISERFS_FS_XATTR=y
> CONFIG_REISERFS_FS_POSIX_ACL=y
> CONFIG_REISERFS_FS_SECURITY=y
> CONFIG_JFS_FS=m
> CONFIG_JFS_POSIX_ACL=y
> CONFIG_JFS_SECURITY=y
> # CONFIG_JFS_DEBUG is not set
> # CONFIG_JFS_STATISTICS is not set
> CONFIG_FS_POSIX_ACL=y
> CONFIG_XFS_FS=m
> CONFIG_XFS_QUOTA=y
> CONFIG_XFS_POSIX_ACL=y
> # CONFIG_XFS_RT is not set
> # CONFIG_XFS_DEBUG is not set
> CONFIG_GFS2_FS=m
> # CONFIG_GFS2_FS_LOCKING_DLM is not set
> CONFIG_OCFS2_FS=m
> CONFIG_OCFS2_FS_O2CB=m
> CONFIG_OCFS2_FS_USERSPACE_CLUSTER=m
> CONFIG_OCFS2_FS_STATS=y
> # CONFIG_OCFS2_DEBUG_MASKLOG is not set
> # CONFIG_OCFS2_DEBUG_FS is not set
> # CONFIG_BTRFS_FS is not set
> # CONFIG_NILFS2_FS is not set
> CONFIG_FILE_LOCKING=y
> CONFIG_FSNOTIFY=y
> CONFIG_DNOTIFY=y
> CONFIG_INOTIFY_USER=y
> # CONFIG_FANOTIFY is not set
> CONFIG_QUOTA=y
> # CONFIG_QUOTA_NETLINK_INTERFACE is not set
> CONFIG_PRINT_QUOTA_WARNING=y
> # CONFIG_QUOTA_DEBUG is not set
> CONFIG_QUOTA_TREE=y
> # CONFIG_QFMT_V1 is not set
> CONFIG_QFMT_V2=y
> CONFIG_QUOTACTL=y
> CONFIG_QUOTACTL_COMPAT=y
> CONFIG_AUTOFS_FS=m
> CONFIG_AUTOFS4_FS=m
> CONFIG_FUSE_FS=m
> # CONFIG_CUSE is not set
> CONFIG_GENERIC_ACL=y
>
> #
> # Caches
> #
> # CONFIG_FSCACHE is not set
>
> #
> # CD-ROM/DVD Filesystems
> #
> CONFIG_ISO9660_FS=y
> CONFIG_JOLIET=y
> CONFIG_ZISOFS=y
> CONFIG_UDF_FS=m
> CONFIG_UDF_NLS=y
>
> #
> # DOS/FAT/NT Filesystems
> #
> CONFIG_FAT_FS=m
> CONFIG_MSDOS_FS=m
> CONFIG_VFAT_FS=m
> CONFIG_FAT_DEFAULT_CODEPAGE=437
> CONFIG_FAT_DEFAULT_IOCHARSET="ascii"
> # CONFIG_NTFS_FS is not set
>
> #
> # Pseudo filesystems
> #
> CONFIG_PROC_FS=y
> CONFIG_PROC_KCORE=y
> CONFIG_PROC_SYSCTL=y
> CONFIG_PROC_PAGE_MONITOR=y
> CONFIG_SYSFS=y
> CONFIG_TMPFS=y
> CONFIG_TMPFS_POSIX_ACL=y
> CONFIG_HUGETLBFS=y
> CONFIG_HUGETLB_PAGE=y
> CONFIG_CONFIGFS_FS=m
> CONFIG_MISC_FILESYSTEMS=y
> # CONFIG_ADFS_FS is not set
> CONFIG_AFFS_FS=m
> # CONFIG_ECRYPT_FS is not set
> CONFIG_HFS_FS=m
> CONFIG_HFSPLUS_FS=m
> CONFIG_BEFS_FS=m
> # CONFIG_BEFS_DEBUG is not set
> CONFIG_BFS_FS=m
> CONFIG_EFS_FS=m
> # CONFIG_LOGFS is not set
> CONFIG_CRAMFS=m
> # CONFIG_SQUASHFS is not set
> CONFIG_VXFS_FS=m
> CONFIG_MINIX_FS=m
> # CONFIG_OMFS_FS is not set
> # CONFIG_HPFS_FS is not set
> CONFIG_QNX4FS_FS=m
> CONFIG_ROMFS_FS=m
> CONFIG_ROMFS_BACKED_BY_BLOCK=y
> CONFIG_ROMFS_ON_BLOCK=y
> CONFIG_SYSV_FS=m
> CONFIG_UFS_FS=m
> # CONFIG_UFS_FS_WRITE is not set
> # CONFIG_UFS_DEBUG is not set
> CONFIG_NETWORK_FILESYSTEMS=y
> CONFIG_NFS_FS=m
> CONFIG_NFS_V3=y
> CONFIG_NFS_V3_ACL=y
> CONFIG_NFS_V4=y
> # CONFIG_NFS_V4_1 is not set
> # CONFIG_NFS_USE_LEGACY_DNS is not set
> CONFIG_NFS_USE_KERNEL_DNS=y
> CONFIG_NFSD=m
> CONFIG_NFSD_V2_ACL=y
> CONFIG_NFSD_V3=y
> CONFIG_NFSD_V3_ACL=y
> CONFIG_NFSD_V4=y
> CONFIG_LOCKD=m
> CONFIG_LOCKD_V4=y
> CONFIG_EXPORTFS=m
> CONFIG_NFS_ACL_SUPPORT=m
> CONFIG_NFS_COMMON=y
> CONFIG_SUNRPC=m
> CONFIG_SUNRPC_GSS=m
> CONFIG_RPCSEC_GSS_KRB5=m
> CONFIG_RPCSEC_GSS_SPKM3=m
> # CONFIG_SMB_FS is not set
> # CONFIG_CEPH_FS is not set
> CONFIG_CIFS=m
> # CONFIG_CIFS_STATS is not set
> CONFIG_CIFS_WEAK_PW_HASH=y
> # CONFIG_CIFS_UPCALL is not set
> CONFIG_CIFS_XATTR=y
> CONFIG_CIFS_POSIX=y
> # CONFIG_CIFS_DEBUG2 is not set
> # CONFIG_CIFS_DFS_UPCALL is not set
> # CONFIG_CIFS_EXPERIMENTAL is not set
> CONFIG_NCP_FS=m
> CONFIG_NCPFS_PACKET_SIGNING=y
> CONFIG_NCPFS_IOCTL_LOCKING=y
> CONFIG_NCPFS_STRONG=y
> CONFIG_NCPFS_NFS_NS=y
> CONFIG_NCPFS_OS2_NS=y
> CONFIG_NCPFS_SMALLDOS=y
> CONFIG_NCPFS_NLS=y
> CONFIG_NCPFS_EXTRAS=y
> CONFIG_CODA_FS=m
> # CONFIG_AFS_FS is not set
>
> #
> # Partition Types
> #
> CONFIG_PARTITION_ADVANCED=y
> # CONFIG_ACORN_PARTITION is not set
> CONFIG_OSF_PARTITION=y
> CONFIG_AMIGA_PARTITION=y
> # CONFIG_ATARI_PARTITION is not set
> CONFIG_MAC_PARTITION=y
> CONFIG_MSDOS_PARTITION=y
> CONFIG_BSD_DISKLABEL=y
> CONFIG_MINIX_SUBPARTITION=y
> CONFIG_SOLARIS_X86_PARTITION=y
> CONFIG_UNIXWARE_DISKLABEL=y
> # CONFIG_LDM_PARTITION is not set
> CONFIG_SGI_PARTITION=y
> # CONFIG_ULTRIX_PARTITION is not set
> CONFIG_SUN_PARTITION=y
> CONFIG_KARMA_PARTITION=y
> CONFIG_EFI_PARTITION=y
> # CONFIG_SYSV68_PARTITION is not set
> CONFIG_NLS=y
> CONFIG_NLS_DEFAULT="utf8"
> CONFIG_NLS_CODEPAGE_437=y
> CONFIG_NLS_CODEPAGE_737=m
> CONFIG_NLS_CODEPAGE_775=m
> CONFIG_NLS_CODEPAGE_850=m
> CONFIG_NLS_CODEPAGE_852=m
> CONFIG_NLS_CODEPAGE_855=m
> CONFIG_NLS_CODEPAGE_857=m
> CONFIG_NLS_CODEPAGE_860=m
> CONFIG_NLS_CODEPAGE_861=m
> CONFIG_NLS_CODEPAGE_862=m
> CONFIG_NLS_CODEPAGE_863=m
> CONFIG_NLS_CODEPAGE_864=m
> CONFIG_NLS_CODEPAGE_865=m
> CONFIG_NLS_CODEPAGE_866=m
> CONFIG_NLS_CODEPAGE_869=m
> CONFIG_NLS_CODEPAGE_936=m
> CONFIG_NLS_CODEPAGE_950=m
> CONFIG_NLS_CODEPAGE_932=m
> CONFIG_NLS_CODEPAGE_949=m
> CONFIG_NLS_CODEPAGE_874=m
> CONFIG_NLS_ISO8859_8=m
> CONFIG_NLS_CODEPAGE_1250=m
> CONFIG_NLS_CODEPAGE_1251=m
> CONFIG_NLS_ASCII=y
> CONFIG_NLS_ISO8859_1=m
> CONFIG_NLS_ISO8859_2=m
> CONFIG_NLS_ISO8859_3=m
> CONFIG_NLS_ISO8859_4=m
> CONFIG_NLS_ISO8859_5=m
> CONFIG_NLS_ISO8859_6=m
> CONFIG_NLS_ISO8859_7=m
> CONFIG_NLS_ISO8859_9=m
> CONFIG_NLS_ISO8859_13=m
> CONFIG_NLS_ISO8859_14=m
> CONFIG_NLS_ISO8859_15=m
> CONFIG_NLS_KOI8_R=m
> CONFIG_NLS_KOI8_U=m
> CONFIG_NLS_UTF8=m
> CONFIG_DLM=m
> CONFIG_DLM_DEBUG=y
>
> #
> # Kernel hacking
> #
> CONFIG_TRACE_IRQFLAGS_SUPPORT=y
> # CONFIG_PRINTK_TIME is not set
> # CONFIG_ENABLE_WARN_DEPRECATED is not set
> # CONFIG_ENABLE_MUST_CHECK is not set
> CONFIG_FRAME_WARN=2048
> CONFIG_MAGIC_SYSRQ=y
> # CONFIG_STRIP_ASM_SYMS is not set
> # CONFIG_UNUSED_SYMBOLS is not set
> CONFIG_DEBUG_FS=y
> # CONFIG_HEADERS_CHECK is not set
> CONFIG_DEBUG_KERNEL=y
> # CONFIG_DEBUG_SHIRQ is not set
> # CONFIG_LOCKUP_DETECTOR is not set
> # CONFIG_HARDLOCKUP_DETECTOR is not set
> CONFIG_DETECT_HUNG_TASK=y
> # CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
> CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=0
> CONFIG_SCHED_DEBUG=y
> # CONFIG_SCHEDSTATS is not set
> # CONFIG_TIMER_STATS is not set
> # CONFIG_DEBUG_OBJECTS is not set
> # CONFIG_DEBUG_SLAB is not set
> # CONFIG_DEBUG_RT_MUTEXES is not set
> # CONFIG_RT_MUTEX_TESTER is not set
> # CONFIG_DEBUG_SPINLOCK is not set
> # CONFIG_DEBUG_MUTEXES is not set
> # CONFIG_DEBUG_LOCK_ALLOC is not set
> # CONFIG_PROVE_LOCKING is not set
> # CONFIG_SPARSE_RCU_POINTER is not set
> # CONFIG_LOCK_STAT is not set
> # CONFIG_DEBUG_SPINLOCK_SLEEP is not set
> # CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
> CONFIG_STACKTRACE=y
> # CONFIG_DEBUG_KOBJECT is not set
> CONFIG_DEBUG_BUGVERBOSE=y
> # CONFIG_DEBUG_INFO is not set
> # CONFIG_DEBUG_VM is not set
> # CONFIG_DEBUG_VIRTUAL is not set
> # CONFIG_DEBUG_WRITECOUNT is not set
> CONFIG_DEBUG_MEMORY_INIT=y
> # CONFIG_DEBUG_LIST is not set
> # CONFIG_DEBUG_SG is not set
> # CONFIG_DEBUG_NOTIFIERS is not set
> # CONFIG_DEBUG_CREDENTIALS is not set
> CONFIG_ARCH_WANT_FRAME_POINTERS=y
> CONFIG_FRAME_POINTER=y
> # CONFIG_BOOT_PRINTK_DELAY is not set
> # CONFIG_RCU_TORTURE_TEST is not set
> # CONFIG_RCU_CPU_STALL_DETECTOR is not set
> # CONFIG_KPROBES_SANITY_TEST is not set
> # CONFIG_BACKTRACE_SELF_TEST is not set
> # CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
> # CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
> # CONFIG_LKDTM is not set
> # CONFIG_CPU_NOTIFIER_ERROR_INJECT is not set
> # CONFIG_FAULT_INJECTION is not set
> # CONFIG_LATENCYTOP is not set
> CONFIG_SYSCTL_SYSCALL_CHECK=y
> # CONFIG_DEBUG_PAGEALLOC is not set
> CONFIG_USER_STACKTRACE_SUPPORT=y
> CONFIG_NOP_TRACER=y
> CONFIG_HAVE_FUNCTION_TRACER=y
> CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
> CONFIG_HAVE_FUNCTION_GRAPH_FP_TEST=y
> CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST=y
> CONFIG_HAVE_DYNAMIC_FTRACE=y
> CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
> CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
> CONFIG_RING_BUFFER=y
> CONFIG_EVENT_TRACING=y
> CONFIG_CONTEXT_SWITCH_TRACER=y
> CONFIG_RING_BUFFER_ALLOW_SWAP=y
> CONFIG_TRACING=y
> CONFIG_TRACING_SUPPORT=y
> CONFIG_FTRACE=y
> # CONFIG_FUNCTION_TRACER is not set
> # CONFIG_IRQSOFF_TRACER is not set
> # CONFIG_SCHED_TRACER is not set
> CONFIG_ENABLE_DEFAULT_TRACERS=y
> # CONFIG_FTRACE_SYSCALLS is not set
> CONFIG_BRANCH_PROFILE_NONE=y
> # CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
> # CONFIG_PROFILE_ALL_BRANCHES is not set
> # CONFIG_STACK_TRACER is not set
> # CONFIG_BLK_DEV_IO_TRACE is not set
> # CONFIG_KPROBE_EVENT is not set
> # CONFIG_MMIOTRACE is not set
> # CONFIG_RING_BUFFER_BENCHMARK is not set
> # CONFIG_PROVIDE_OHCI1394_DMA_INIT is not set
> # CONFIG_DYNAMIC_DEBUG is not set
> # CONFIG_DMA_API_DEBUG is not set
> # CONFIG_ATOMIC64_SELFTEST is not set
> # CONFIG_SAMPLES is not set
> CONFIG_HAVE_ARCH_KGDB=y
> # CONFIG_KGDB is not set
> CONFIG_HAVE_ARCH_KMEMCHECK=y
> # CONFIG_STRICT_DEVMEM is not set
> CONFIG_X86_VERBOSE_BOOTUP=y
> CONFIG_EARLY_PRINTK=y
> # CONFIG_EARLY_PRINTK_DBGP is not set
> # CONFIG_DEBUG_STACKOVERFLOW is not set
> # CONFIG_DEBUG_STACK_USAGE is not set
> # CONFIG_DEBUG_PER_CPU_MAPS is not set
> # CONFIG_X86_PTDUMP is not set
> CONFIG_DEBUG_RODATA=y
> CONFIG_DEBUG_RODATA_TEST=y
> # CONFIG_DEBUG_NX_TEST is not set
> # CONFIG_IOMMU_DEBUG is not set
> # CONFIG_IOMMU_STRESS is not set
> CONFIG_HAVE_MMIOTRACE_SUPPORT=y
> # CONFIG_X86_DECODER_SELFTEST is not set
> CONFIG_IO_DELAY_TYPE_0X80=0
> CONFIG_IO_DELAY_TYPE_0XED=1
> CONFIG_IO_DELAY_TYPE_UDELAY=2
> CONFIG_IO_DELAY_TYPE_NONE=3
> CONFIG_IO_DELAY_0X80=y
> # CONFIG_IO_DELAY_0XED is not set
> # CONFIG_IO_DELAY_UDELAY is not set
> # CONFIG_IO_DELAY_NONE is not set
> CONFIG_DEFAULT_IO_DELAY_TYPE=0
> # CONFIG_DEBUG_BOOT_PARAMS is not set
> # CONFIG_CPA_DEBUG is not set
> CONFIG_OPTIMIZE_INLINING=y
>
> #
> # Security options
> #
> CONFIG_KEYS=y
> CONFIG_KEYS_DEBUG_PROC_KEYS=y
> CONFIG_SECURITY=y
> CONFIG_SECURITYFS=y
> CONFIG_SECURITY_NETWORK=y
> # CONFIG_SECURITY_NETWORK_XFRM is not set
> # CONFIG_SECURITY_PATH is not set
> CONFIG_LSM_MMAP_MIN_ADDR=65536
> CONFIG_SECURITY_SELINUX=y
> CONFIG_SECURITY_SELINUX_BOOTPARAM=y
> CONFIG_SECURITY_SELINUX_BOOTPARAM_VALUE=1
> CONFIG_SECURITY_SELINUX_DISABLE=y
> CONFIG_SECURITY_SELINUX_DEVELOP=y
> CONFIG_SECURITY_SELINUX_AVC_STATS=y
> CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE=1
> # CONFIG_SECURITY_SELINUX_POLICYDB_VERSION_MAX is not set
> # CONFIG_SECURITY_SMACK is not set
> # CONFIG_SECURITY_TOMOYO is not set
> # CONFIG_SECURITY_APPARMOR is not set
> # CONFIG_IMA is not set
> CONFIG_DEFAULT_SECURITY_SELINUX=y
> # CONFIG_DEFAULT_SECURITY_DAC is not set
> CONFIG_DEFAULT_SECURITY="selinux"
> CONFIG_XOR_BLOCKS=m
> CONFIG_ASYNC_CORE=m
> CONFIG_ASYNC_MEMCPY=m
> CONFIG_ASYNC_XOR=m
> CONFIG_ASYNC_PQ=m
> CONFIG_ASYNC_RAID6_RECOV=m
> # CONFIG_ASYNC_RAID6_TEST is not set
> CONFIG_CRYPTO=y
>
> #
> # Crypto core or helper
> #
> CONFIG_CRYPTO_ALGAPI=y
> CONFIG_CRYPTO_ALGAPI2=y
> CONFIG_CRYPTO_AEAD=m
> CONFIG_CRYPTO_AEAD2=y
> CONFIG_CRYPTO_BLKCIPHER=m
> CONFIG_CRYPTO_BLKCIPHER2=y
> CONFIG_CRYPTO_HASH=y
> CONFIG_CRYPTO_HASH2=y
> CONFIG_CRYPTO_RNG2=y
> CONFIG_CRYPTO_PCOMP2=y
> CONFIG_CRYPTO_MANAGER=y
> CONFIG_CRYPTO_MANAGER2=y
> CONFIG_CRYPTO_MANAGER_TESTS=y
> CONFIG_CRYPTO_GF128MUL=m
> CONFIG_CRYPTO_NULL=m
> # CONFIG_CRYPTO_PCRYPT is not set
> CONFIG_CRYPTO_WORKQUEUE=y
> # CONFIG_CRYPTO_CRYPTD is not set
> CONFIG_CRYPTO_AUTHENC=m
> # CONFIG_CRYPTO_TEST is not set
>
> #
> # Authenticated Encryption with Associated Data
> #
> # CONFIG_CRYPTO_CCM is not set
> # CONFIG_CRYPTO_GCM is not set
> # CONFIG_CRYPTO_SEQIV is not set
>
> #
> # Block modes
> #
> CONFIG_CRYPTO_CBC=m
> # CONFIG_CRYPTO_CTR is not set
> # CONFIG_CRYPTO_CTS is not set
> CONFIG_CRYPTO_ECB=m
> CONFIG_CRYPTO_LRW=m
> CONFIG_CRYPTO_PCBC=m
> # CONFIG_CRYPTO_XTS is not set
>
> #
> # Hash modes
> #
> CONFIG_CRYPTO_HMAC=y
> CONFIG_CRYPTO_XCBC=m
> # CONFIG_CRYPTO_VMAC is not set
>
> #
> # Digest
> #
> CONFIG_CRYPTO_CRC32C=y
> # CONFIG_CRYPTO_CRC32C_INTEL is not set
> # CONFIG_CRYPTO_GHASH is not set
> CONFIG_CRYPTO_MD4=m
> CONFIG_CRYPTO_MD5=y
> CONFIG_CRYPTO_MICHAEL_MIC=m
> # CONFIG_CRYPTO_RMD128 is not set
> # CONFIG_CRYPTO_RMD160 is not set
> # CONFIG_CRYPTO_RMD256 is not set
> # CONFIG_CRYPTO_RMD320 is not set
> CONFIG_CRYPTO_SHA1=y
> CONFIG_CRYPTO_SHA256=m
> CONFIG_CRYPTO_SHA512=m
> CONFIG_CRYPTO_TGR192=m
> CONFIG_CRYPTO_WP512=m
> # CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL is not set
>
> #
> # Ciphers
> #
> CONFIG_CRYPTO_AES=m
> CONFIG_CRYPTO_AES_X86_64=m
> # CONFIG_CRYPTO_AES_NI_INTEL is not set
> CONFIG_CRYPTO_ANUBIS=m
> CONFIG_CRYPTO_ARC4=m
> CONFIG_CRYPTO_BLOWFISH=m
> CONFIG_CRYPTO_CAMELLIA=m
> CONFIG_CRYPTO_CAST5=m
> CONFIG_CRYPTO_CAST6=m
> CONFIG_CRYPTO_DES=m
> CONFIG_CRYPTO_FCRYPT=m
> CONFIG_CRYPTO_KHAZAD=m
> # CONFIG_CRYPTO_SALSA20 is not set
> # CONFIG_CRYPTO_SALSA20_X86_64 is not set
> # CONFIG_CRYPTO_SEED is not set
> CONFIG_CRYPTO_SERPENT=m
> CONFIG_CRYPTO_TEA=m
> CONFIG_CRYPTO_TWOFISH=m
> CONFIG_CRYPTO_TWOFISH_COMMON=m
> CONFIG_CRYPTO_TWOFISH_X86_64=m
>
> #
> # Compression
> #
> CONFIG_CRYPTO_DEFLATE=m
> # CONFIG_CRYPTO_ZLIB is not set
> # CONFIG_CRYPTO_LZO is not set
>
> #
> # Random Number Generation
> #
> # CONFIG_CRYPTO_ANSI_CPRNG is not set
> CONFIG_CRYPTO_HW=y
> # CONFIG_CRYPTO_DEV_PADLOCK is not set
> # CONFIG_CRYPTO_DEV_HIFN_795X is not set
> CONFIG_HAVE_KVM=y
> CONFIG_HAVE_KVM_IRQCHIP=y
> CONFIG_HAVE_KVM_EVENTFD=y
> CONFIG_KVM_APIC_ARCHITECTURE=y
> CONFIG_KVM_MMIO=y
> CONFIG_VIRTUALIZATION=y
> CONFIG_KVM=m
> CONFIG_KVM_INTEL=m
> CONFIG_KVM_AMD=m
> # CONFIG_VHOST_NET is not set
> # CONFIG_VIRTIO_PCI is not set
> # CONFIG_VIRTIO_BALLOON is not set
> CONFIG_BINARY_PRINTF=y
>
> #
> # Library routines
> #
> CONFIG_RAID6_PQ=m
> CONFIG_BITREVERSE=y
> CONFIG_GENERIC_FIND_FIRST_BIT=y
> CONFIG_GENERIC_FIND_NEXT_BIT=y
> CONFIG_GENERIC_FIND_LAST_BIT=y
> CONFIG_CRC_CCITT=m
> CONFIG_CRC16=m
> CONFIG_CRC_T10DIF=y
> CONFIG_CRC_ITU_T=m
> CONFIG_CRC32=y
> # CONFIG_CRC7 is not set
> CONFIG_LIBCRC32C=y
> CONFIG_ZLIB_INFLATE=y
> CONFIG_ZLIB_DEFLATE=m
> CONFIG_LZO_DECOMPRESS=y
> CONFIG_DECOMPRESS_GZIP=y
> CONFIG_DECOMPRESS_BZIP2=y
> CONFIG_DECOMPRESS_LZMA=y
> CONFIG_DECOMPRESS_LZO=y
> CONFIG_TEXTSEARCH=y
> CONFIG_TEXTSEARCH_KMP=m
> CONFIG_TEXTSEARCH_BM=m
> CONFIG_TEXTSEARCH_FSM=m
> CONFIG_HAS_IOMEM=y
> CONFIG_HAS_IOPORT=y
> CONFIG_HAS_DMA=y
> CONFIG_NLATTR=y
On Wed, Aug 25, 2010 at 04:11:06PM -0400, Don Zickus wrote:
...
> > Uhhuh. NMI received for unknown reason 00 on CPU 15.
> > Do you have a strange power saving mode enabled?
> > Dazed and confused, but trying to continue
>
> So I found a Nehalem box that can reliably reproduce Ingo's problem using
> something as simple as 'perf top'.  But like above, I am noticing the
> same thing, an extra NMI (PMI?) that comes out of nowhere.
>
> Looking at the data above, the delta between NMIs is very small compared to
> the other NMIs. It almost suggests that this is an extra PMI.
> Considering there are already two cpu errata discussing extra PMIs under
> certain configurations, I wouldn't be surprised if this was a third.
>
> Cheers,
> Don
>
Oh, I'm not sure whether it would be a good idea at all, but maybe we
could use a variant of Robert's idea about a "pmu nmi relaxing time",
i.e. some time slice in which we treat NMIs as coming from the PMU,
except bounded not by an arbitrary number but by the number of PMIs
still outstanding. Say we handle an NMI and find that 4 events have
overflowed: we clear them, arm a timer and wait for 3 unknown NMIs to
happen. If they do not happen within some time period we clear this
expectation; if they happen, fully or partially, we destroy the timer.
I.e. almost the same as Robert's idea, but without the TSC? Just a
thought.
-- Cyrill
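A minimal sketch of the scheme described above, under heavy assumptions:
the per-cpu variables and both helpers (pmu_note_handled(),
pmu_swallow_unknown_nmi()) are invented names, and a jiffies deadline
stands in for the timer, since arming a real timer from NMI context
would not be safe. None of this is posted code.

#include <linux/percpu.h>
#include <linux/jiffies.h>

/* How many trailing NMIs we still expect from the PMU, and until when. */
static DEFINE_PER_CPU(unsigned int, expected_pmu_nmis);
static DEFINE_PER_CPU(unsigned long, expected_pmu_nmis_deadline);

/* Called from the PMI handler with the number of counters it serviced. */
static void pmu_note_handled(int handled)
{
	if (handled > 1) {
		__get_cpu_var(expected_pmu_nmis) = handled - 1;
		__get_cpu_var(expected_pmu_nmis_deadline) =
					jiffies + msecs_to_jiffies(10);
	}
}

/* Called from the unknown-NMI path; returns 1 if the NMI should be eaten. */
static int pmu_swallow_unknown_nmi(void)
{
	unsigned int *n = &__get_cpu_var(expected_pmu_nmis);

	if (!*n || time_after(jiffies, __get_cpu_var(expected_pmu_nmis_deadline)))
		return 0;		/* genuinely unknown */
	(*n)--;
	return 1;			/* treat it as a trailing PMI */
}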
On Thu, Aug 26, 2010 at 12:24:58AM +0400, Cyrill Gorcunov wrote:
> On Wed, Aug 25, 2010 at 04:11:06PM -0400, Don Zickus wrote:
> ...
> > > Uhhuh. NMI received for unknown reason 00 on CPU 15.
> > > Do you have a strange power saving mode enabled?
> > > Dazed and confused, but trying to continue
> >
> > So I found a Nehalem box that can reliably reproduce Ingo's problem using
> > something as simple as 'perf top'. But like above, I am noticing the
> > same thing, an extra NMI (PMI?) that comes out of nowhere.
> >
> > Looking at the data above, the delta between NMIs is very small compared to
> > the other NMIs. It almost suggests that this is an extra PMI.
> > Considering there are already two cpu errata discussing extra PMIs under
> > certain configurations, I wouldn't be surprised if this was a third.
> >
> > Cheers,
> > Don
> >
>
> Oh, I'm not sure whether it would be a good idea at all, but maybe we
> could use a variant of Robert's idea about a "pmu nmi relaxing time",
> i.e. some time slice in which we treat NMIs as coming from the PMU,
> except bounded not by an arbitrary number but by the number of PMIs
> still outstanding. Say we handle an NMI and find that 4 events have
> overflowed: we clear them, arm a timer and wait for 3 unknown NMIs to
> happen. If they do not happen within some time period we clear this
> expectation; if they happen, fully or partially, we destroy the timer.
> I.e. almost the same as Robert's idea, but without the TSC? Just a
> thought.
The only problem is only one counter is overflowing in these cases, so we
would have to do it all the time, which may not be hard. But I was
thinking of something similar.
For now, I am trying to force counter0 off, seeing that most of the perf
errata on nehalem have been on counter0. Or maybe I can get 'perf top' to
use something other than counter0 by running 'perf record' first?
Cheers,
Don
>
> -- Cyrill
On Wed, Aug 25, 2010 at 05:20:37PM -0400, Don Zickus wrote:
...
> The only problem is only one counter is overflowing in these cases, so we
> would have to do it all the time, which may not be hard. But I was
> thinking of something similar.
>
> For now, I am trying to force counter0 off, seeing that most of the perf
> errata on nehalem have been on counter0. Or maybe I can get 'perf top' to
> use something other than counter0 by running 'perf record' first?
>
> Cheers,
> Don
>
> >
> > -- Cyrill
>
Well, I have never looked deep inside the perf tool code itself. Perhaps
the easiest way (if you want to disable counter0) is just to disable it
in the kernel and see how it goes, no?
-- Cyrill
On Wed, Aug 25, 2010 at 12:40:38AM +0400, Cyrill Gorcunov wrote:
> On Tue, Aug 24, 2010 at 04:27:18PM -0400, Don Zickus wrote:
> > On Tue, Aug 24, 2010 at 11:52:40PM +0400, Cyrill Gorcunov wrote:
> > > > I use 2.6.34 atm. Letme try 2.6.36 (which might require some time to recompile).
> > > >
> > > > -- Cyrill
> > >
> > > Don, for me it fails with a somewhat unrelated page handling fault - in
> > > reiserfs_evict_inode O_o It fails in __get_cpu_var, which I suspect means
> > > a problem either in the per-cpu allocator itself or we screw up a pointer
> > > somewhere. Weird.
> >
> > I just found out (with the help of the crash utility and Dave A.) that
> > Robert's percpu struct nmi clashes with the exception entry point .nmi.
> > I only see this problem in 2.6.36, so I am not sure what changed with
> > regards to compiler flags to confuse variables with text segments.
> >
>
> yeah, I suspected name clashes here, but then I did grep over per-cpu
> variables in the whole kernel and didn't find a match, so I thought the
> assumption was wrong, but eventually it turned out to be true via another
> route :) good to know, thanks!
>
> > But renaming the percpu struct nmi to nmidon fixed the problem for me (I
> > am open to other suggestions :-) ).
>
> nmi_don_zickus ;) well, I think nmi_pmu or something like that
> might be a bit modest ;)
>
> >
> > Regarding your reiserfs, what was the variable's name?
> >
>
> It seems to be different; it's a pity that I had only 80x25 vga
> mode and was unable to snap the whole log. But actually I didn't
> even check precisely all the .config options I had set, since I was
> more interested in the early stage, where per-cpu access should
> already happen, rather than in a fully init'ed environment. But I
> think I'll be moving completely to .36 this week so we will see how
> it goes.
There was a missing "return" at the end of reiserfs_evict_inode.
It has been fixed here:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=f4ae2faa40199b97b12f508234640bc565d166f8
On 25.08.10 16:11:06, Don Zickus wrote:
> > cpu #15, nmi #160, marked #0, handled = 1, time = 333392635730, delta = 11238255
> > cpu #15, nmi #161, marked #0, handled = 1, time = 333403779380, delta = 11143650
> > cpu #15, nmi #162, marked #0, handled = 1, time = 333415418497, delta = 11639117
> > cpu #15, nmi #163, marked #0, handled = 1, time = 333415467084, delta = 48587
> > cpu #15, nmi #164, marked #0, handled = 1, time = 333415501531, delta = 34447
> > cpu #15, nmi #165, marked #0, handled = 1, time = 333459918106, delta = 44416575
> > cpu #15, nmi #166, marked #0, handled = 0, time = 333459923167, delta = 1666
> > cpu #15, nmi #151, marked #0, handled = 1, time = 332978597882, delta = 11447002
> > cpu #15, nmi #152, marked #0, handled = 1, time = 332978657151, delta = 59269
> > cpu #15, nmi #153, marked #0, handled = 1, time = 332978667847, delta = 10696
> > cpu #15, nmi #154, marked #0, handled = 1, time = 333023125757, delta = 44457910
> > cpu #15, nmi #155, marked #0, handled = 1, time = 333291980833, delta = 268855076
> > cpu #15, nmi #156, marked #0, handled = 1, time = 333325663125, delta = 33682292
> > cpu #15, nmi #157, marked #0, handled = 1, time = 333348216481, delta = 22553356
> > cpu #15, nmi #158, marked #0, handled = 1, time = 333370168887, delta = 21952406
> > cpu #15, nmi #159, marked #0, handled = 1, time = 333381397475, delta = 11228588
> > Uhhuh. NMI received for unknown reason 00 on CPU 15.
> > Do you have a strange power saving mode enabled?
> > Dazed and confused, but trying to continue
>
> So I found a Nehalem box that can reliably reproduce Ingo's problem using
> something as simple 'perf top'. But like above, I am noticing the
> samething, an extra NMI(PMI??) that comes out of nowhere.
This could also be a race in the counter handling code, or we do not
properly count the number of handled counters. Maybe 2 counters actually
fired but we only noticed one counter and then accidentally cleared
the 2nd without processing it.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
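For reference, a stripped-down sketch of the handler shape this race
would need, assuming the deferred ack is the culprit. This is an
illustration only, not the perf_event_intel.c code, and
process_overflowed_counters() is a made-up stand-in for the per-bit
handling loop.

static void buggy_handler_shape(void)
{
	u64 status, ack;

again:
	status = intel_pmu_get_status();	/* e.g. counter 0 overflowed */
	ack = status;

	/*
	 * Counter 0 is handled and reprogrammed in here.  If it overflows
	 * again before we reach the ack below, a second PMI is latched.
	 */
	process_overflowed_counters(status);	/* hypothetical helper */

	/*
	 * Acking with the stale mask also clears the freshly re-set
	 * overflow bit of the counter we just handled, without that
	 * second overflow ever being processed ...
	 */
	intel_pmu_ack_status(ack);

	/*
	 * ... so the re-read sees nothing, we fall out of the loop, and
	 * the already-latched second NMI later arrives as "unknown".
	 */
	status = intel_pmu_get_status();
	if (status)
		goto again;
}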
On Thu, Aug 26, 2010 at 3:52 AM, Frederic Weisbecker <[email protected]> wrote:
...
>>
>> It seems to be different; it's a pity that I had only 80x25 vga
>> mode and was unable to snap the whole log. But actually I didn't
>> even check precisely all the .config options I had set, since I was
>> more interested in the early stage, where per-cpu access should
>> already happen, rather than in a fully init'ed environment. But I
>> think I'll be moving completely to .36 this week so we will see how
>> it goes.
>
>
> There was a missing "return" at the end of reiserfs_evict_inode.
>
> It has been fixed here:
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=f4ae2faa40199b97b12f508234640bc565d166f8
>
>
Thanks for the info, Frederic!
On Thu, Aug 26, 2010 at 1:00 PM, Robert Richter <[email protected]> wrote:
...
>
> This could also be a race in the counter handling code, or we do not
> properly count the number of handled counters. Maybe 2 counters actually
> fired but we only noticed one counter and then accidentally cleared
> the 2nd without processing it.
>
> -Robert
>
Any chance to get it tested on P4 machine since it has a bit
different design?
Cyrill
On Thu, Aug 26, 2010 at 01:18:29PM +0400, Cyrill Gorcunov wrote:
> On Thu, Aug 26, 2010 at 1:00 PM, Robert Richter <[email protected]> wrote:
> ...
> >
> > This could also be a race in the counter handling code, or we do not
> > properly count the number of handled counters. Maybe 2 counters actually
> > fired but we only noticed one counter and then accidentally cleared
> > the 2nd without processing it.
> >
> > -Robert
> >
>
> Any chance to get it tested on P4 machine since it has a bit
> different design?
Well, the P4 uses a different PMU irq handler, so I am not sure it will give
us much insight. I haven't noticed this on an AMD box or an Intel i5 either.
Cheers,
Don
On Thu, Aug 26, 2010 at 01:18:29PM +0400, Cyrill Gorcunov wrote:
> On Thu, Aug 26, 2010 at 1:00 PM, Robert Richter <[email protected]> wrote:
> ...
> >
> > This could also be a race in the counter handling code, or we do not
> > properly count the number of handled counters. Maybe 2 counters actually
> > fired but we only noticed one counter and then accidentally cleared
> > the 2nd without processing it.
> >
> > -Robert
> >
>
> Any chance to get it tested on P4 machine since it has a bit
> different design?
Hmm, I take that back.  I guess I can reproduce this on my i5 using
Ingo's config.
Working on Robert's assumption, I added code to perf_event_intel.c that
says if handled != 0, just add one to it (IOW always report handled as 0
or something > 1).  That seems to be working well and catches the NMIs
that Ingo was seeing.
I'll keep looking for the race condition to better fix it.
Cheers,
Don
On Thu, Aug 26, 2010 at 11:22:46AM -0400, Don Zickus wrote:
> On Thu, Aug 26, 2010 at 01:18:29PM +0400, Cyrill Gorcunov wrote:
> > On Thu, Aug 26, 2010 at 1:00 PM, Robert Richter <[email protected]> wrote:
> > ...
> > >
> > > This could also be a race in the counter handling code, or we do not
> > > properly count the number of handled counters. Maybe 2 counters actually
> > > fired but we only noticed one counter and then accidentally cleared
> > > the 2nd without processing it.
> > >
> > > -Robert
> > >
> >
> > Any chance to get it tested on P4 machine since it has a bit
> > different design?
>
> Hmm, I take that back.  I guess I can reproduce this on my i5 using
> Ingo's config.
>
> Working on Robert's assumption, I added code to perf_event_intel.c that
> says if handled != 0, just add one to it (IOW always report handled as 0
> or something > 1).  That seems to be working well and catches the NMIs
> that Ingo was seeing.
>
> I'll keep looking for the race condition to better fix it.
>
> Cheers,
> Don
>
Sounds promising, mind posting the new interdiff? I.e. what you have
changed relative to Robert's patch.
-- Cyrill
On Thu, Aug 26, 2010 at 07:34:04PM +0400, Cyrill Gorcunov wrote:
> > I'll keep looking for the race condition to better fix it.
> >
> > Cheers,
> > Don
> >
>
> Sounds promising, mind posting the new interdiff? I.e. what you have
> changed relative to Robert's patch.
It was really in Peter's patch, just a stupid hack for now.
Cheers,
Don
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 4539b4b..9e65a7b 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -777,7 +777,9 @@ again:
done:
intel_pmu_enable_all(0);
- return handled;
+ if (!handled)
+ return handled;
+ return ++handled;
}
static struct event_constraint *
On Thu, Aug 26, 2010 at 12:40:31PM -0400, Don Zickus wrote:
> On Thu, Aug 26, 2010 at 07:34:04PM +0400, Cyrill Gorcunov wrote:
> > > I'll keep looking for the race condition to better fix it.
> > >
> > > Cheers,
> > > Don
> > >
> >
> > Sounds promising, mind posting the new interdiff? I.e. what you have
> > changed relative to Robert's patch.
>
> It was really in Peter's patch, just a stupid hack for now.
>
> Cheers,
> Don
>
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index 4539b4b..9e65a7b 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -777,7 +777,9 @@ again:
>
> done:
> intel_pmu_enable_all(0);
> - return handled;
> + if (!handled)
> + return handled;
> + return ++handled;
> }
>
> static struct event_constraint *
>
ok, it seems it just treats any unknown nmi as having come from the PMU, no?
-- Cyrill
On Mon, Aug 23, 2010 at 10:53:39AM +0200, Ingo Molnar wrote:
>
> * Ingo Molnar <[email protected]> wrote:
>
> >
> > * Don Zickus <[email protected]> wrote:
> >
> > > I'll test tip later today to see if I can reproduce it.
> > >
> > > Cheers,
> > > Don
> > >
> > > Ingo Molnar <[email protected]> wrote:
> > >
> > > >
> > > >it's not working so well, i'm getting:
> > > >
> > > > Uhhuh. NMI received for unknown reason 00 on CPU 9.
> > > > Do you have a strange power saving mode enabled?
> > > > Dazed and confused, but trying to continue
> > > >
> > > >on a nehalem box, after a perf top and perf stat run.
> >
> > FYI, it does not trigger on an AMD box.
>
> Ok, to not hold up the perf/urgent flow i zapped these two commits for
> the time being:
>
> 4a31beb: perf, x86: Fix handle_irq return values
> 8e3e42b: perf, x86: Try to handle unknown nmis with an enabled PMU
>
> We can apply them if they take a form that dont introduce a different
> kind of (and more visible) regression.
So this patch fixes it, though I haven't convinced myself why (perhaps
babysitting my 4-month-old isn't helping :-)).
The code now re-enters the loop and reprocesses the new status, which
properly increments handled to 2, and thus the new logic takes care of it.
Cheers,
Don
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 4539b4b..d16ebd8 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -738,6 +738,7 @@ again:
inc_irq_stat(apic_perf_irqs);
ack = status;
+ intel_pmu_ack_status(ack);
intel_pmu_lbr_read();
@@ -766,8 +767,6 @@ again:
x86_pmu_stop(event);
}
- intel_pmu_ack_status(ack);
-
/*
* Repeat if there is more work to be done:
*/
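Roughly, the loop then takes the shape sketched below (a simplification,
not the verbatim handler; process_overflowed_counters() is again a
made-up stand-in): because the status is acked before the counters are
serviced, a counter that re-triggers during servicing sets its bit
again, the re-read picks it up, and handled climbs to 2, which is what
the unknown-NMI logic keys off.

again:
	status = intel_pmu_get_status();
	intel_pmu_ack_status(status);	/* clear only what we have seen */

	handled += process_overflowed_counters(status);

	/*
	 * A counter that overflowed while being serviced has set its
	 * status bit again; it is picked up here instead of being
	 * silently wiped by a stale ack at the end of the loop body.
	 */
	status = intel_pmu_get_status();
	if (status)
		goto again;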
On 26.08.10 17:14:24, Don Zickus wrote:
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index 4539b4b..d16ebd8 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -738,6 +738,7 @@ again:
>
> inc_irq_stat(apic_perf_irqs);
> ack = status;
> + intel_pmu_ack_status(ack);
Yes, not immediately ack'ing the status looked suspect to me too. Though
it must then be the same counter that retriggers. Or it is a cpu
bug. You could add a debug print of the status register for the case
where the loop is re-entered; that would be interesting...
Thanks for fixing this.
-Robert
>
> intel_pmu_lbr_read();
>
> @@ -766,8 +767,6 @@ again:
> x86_pmu_stop(event);
> }
>
> - intel_pmu_ack_status(ack);
> -
> /*
> * Repeat if there is more work to be done:
> */
>
--
Advanced Micro Devices, Inc.
Operating System Research Center
On 26.08.10 14:02:50, Cyrill Gorcunov wrote:
> > diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> > index 4539b4b..9e65a7b 100644
> > --- a/arch/x86/kernel/cpu/perf_event_intel.c
> > +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> > @@ -777,7 +777,9 @@ again:
> >
> > done:
> > intel_pmu_enable_all(0);
> > - return handled;
> > + if (!handled)
> > + return handled;
> > + return ++handled;
> > }
> >
> > static struct event_constraint *
> >
>
> ok, it seems it just treats any unknown nmi as having come from the PMU, no?
Yes, this just throws away all unknown nmis after a perf nmi. It
disables unknown nmi detection on this cpu type.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Fri, 2010-08-27 at 09:57 +0200, Robert Richter wrote:
> On 26.08.10 14:02:50, Cyrill Gorcunov wrote:
> > > diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> > > index 4539b4b..9e65a7b 100644
> > > --- a/arch/x86/kernel/cpu/perf_event_intel.c
> > > +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> > > @@ -777,7 +777,9 @@ again:
> > >
> > > done:
> > > intel_pmu_enable_all(0);
> > > - return handled;
> > > + if (!handled)
> > > + return handled;
> > > + return ++handled;
> > > }
> > >
> > > static struct event_constraint *
> > >
> >
> > ok, it seems it just treats any unknown nmi as having come from the PMU, no?
>
> Yes, this just throws away all unknown nmis after a perf nmi. It
> disables unknown nmi detection on this cpu type.
Wouldn't returning 2 be more sensible? Then it would only eat a few
unknowns after each pmi. (Still assuming you return 0 when there really
was nothing to do.)
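To make the suggestion concrete, the handler's tail could look something
like this under that scheme (a sketch only, not a patch from this
thread):

done:
	intel_pmu_enable_all(0);
	/*
	 * 0: not ours, let the other NMI handlers / unknown-NMI code run.
	 * 2: ours, and at most one extra latched NMI may follow, so the
	 *    unknown-NMI logic eats a bounded number of unknowns instead
	 *    of swallowing all of them forever.
	 */
	return handled ? 2 : 0;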
On 26.08.10 17:14:24, Don Zickus wrote:
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index 4539b4b..d16ebd8 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -738,6 +738,7 @@ again:
>
> inc_irq_stat(apic_perf_irqs);
> ack = status;
> + intel_pmu_ack_status(ack);
I would slightly change the patch:
There is no need for the ack variable anymore, you could directly work
with the status.
I would call intel_pmu_ack_status() as close as possible after the
intel_pmu_get_status(), which is after 'again:'.
-Robert
>
> intel_pmu_lbr_read();
>
> @@ -766,8 +767,6 @@ again:
> x86_pmu_stop(event);
> }
>
> - intel_pmu_ack_status(ack);
> -
> /*
> * Repeat if there is more work to be done:
> */
>
--
Advanced Micro Devices, Inc.
Operating System Research Center
On 27.08.10 04:11:21, Peter Zijlstra wrote:
> On Fri, 2010-08-27 at 09:57 +0200, Robert Richter wrote:
> > On 26.08.10 14:02:50, Cyrill Gorcunov wrote:
> > > > diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> > > > index 4539b4b..9e65a7b 100644
> > > > --- a/arch/x86/kernel/cpu/perf_event_intel.c
> > > > +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> > > > @@ -777,7 +777,9 @@ again:
> > > >
> > > > done:
> > > > intel_pmu_enable_all(0);
> > > > - return handled;
> > > > + if (!handled)
> > > > + return handled;
> > > > + return ++handled;
> > > > }
> > > >
> > > > static struct event_constraint *
> > > >
> > >
> > > ok, it seems it just treats any unknown nmi as having come from the PMU, no?
> >
> > Yes, this just throws away all unknown nmis after a perf nmi. It
> > disables unknown nmi detection on this cpu type.
>
> Wouldn't returning 2 be more sensible? Then it would only eat a few
> unknowns after each pmi. (Still assuming you return 0 when there really
> was nothing to do.)
Yes, this would be the best workaround for cpus where the detection
logic does not work properly. But I think Don found a solution
already.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
On Fri, Aug 27, 2010 at 09:51:32AM +0200, Robert Richter wrote:
> On 26.08.10 17:14:24, Don Zickus wrote:
> > diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> > index 4539b4b..d16ebd8 100644
> > --- a/arch/x86/kernel/cpu/perf_event_intel.c
> > +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> > @@ -738,6 +738,7 @@ again:
> >
> > inc_irq_stat(apic_perf_irqs);
> > ack = status;
> > + intel_pmu_ack_status(ack);
>
> Yes, not immediately ack'ing the status looked suspect to me too. Though
> it must then be the same counter that retriggers. Or it is a cpu
> bug. You could add a debug print of the status register for the case
> where the loop is re-entered; that would be interesting...
It seems to be the same counter.  I wonder if the act of processing it
in 'intel_pmu_save_and_restart' cleared the overflow condition at the
counter itself but not in the global status register.  Thus it triggered
again and we accidentally cleared it when we ack'd later in the code.
Cheers,
Don
On Fri, Aug 27, 2010 at 10:10:38AM +0200, Robert Richter wrote:
> On 26.08.10 17:14:24, Don Zickus wrote:
> > diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> > index 4539b4b..d16ebd8 100644
> > --- a/arch/x86/kernel/cpu/perf_event_intel.c
> > +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> > @@ -738,6 +738,7 @@ again:
> >
> > inc_irq_stat(apic_perf_irqs);
> > ack = status;
> > + intel_pmu_ack_status(ack);
>
> I would slightly change the patch:
>
> There is no need for the ack variable anymore, you could directly work
> with the status.
>
> I would call intel_pmu_ack_status() as close as possible after the
> intel_pmu_get_status(), which is after 'again:'.
Yeah, I can do that. The other patch was just a proof of concept to see
what others thought.
What is funny is that this problem was masked by the
perf_event_nmi_handler swallowing all the nmis. I wonder if we were
losing events as a result of this bug too because if you think about it,
we processed the first event, a second event came in and we accidentally
ack'd it, thus dropping it on the floor. Now I wonder how the event was
ever reloaded, unless it was by accident because of how the scheduler
deals with perf counters (perf_start/stop all the time).
Cheers,
Don
On 27.08.10 09:44:29, Don Zickus wrote:
> On Fri, Aug 27, 2010 at 10:10:38AM +0200, Robert Richter wrote:
> > On 26.08.10 17:14:24, Don Zickus wrote:
> > > diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> > > index 4539b4b..d16ebd8 100644
> > > --- a/arch/x86/kernel/cpu/perf_event_intel.c
> > > +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> > > @@ -738,6 +738,7 @@ again:
> > >
> > > inc_irq_stat(apic_perf_irqs);
> > > ack = status;
> > > + intel_pmu_ack_status(ack);
> >
> > I would slightly change the patch:
> >
> > There is no need for the ack variable anymore, you could directly work
> > with the status.
> >
> > I would call intel_pmu_ack_status() as close as possible after the
> > intel_pmu_get_status(), which is after 'again:'.
>
> Yeah, I can do that. The other patch was just a proof of concept to see
> what others thought.
>
> What is funny is that this problem was masked by the
> perf_event_nmi_handler swallowing all the nmis. I wonder if we were
> losing events as a result of this bug too because if you think about it,
> we processed the first event, a second event came in and we accidentally
> ack'd it, thus dropping it on the floor.
Yes, this could be the case, but only for handled counters. So it
would be interesting to see for this case the status mask of the
current and previous get_status call.
> Now I wonder how the event was
> ever reloaded, unless it was by accident because of how the scheduler
> deals with perf counters (perf_start/stop all the time).
The nmi might be queued by the cpu regardless of the overflow
state.
I am wondering why this happens at all, because events are disabled by
wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0). Hmm, maybe this is exactly the
reason: the nmi could fire again after re-enabling the counters.
Is there a reason for disabling all counters?
-Robert
>
> Cheers,
> Don
>
--
Advanced Micro Devices, Inc.
Operating System Research Center
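The bracketing being referred to looks roughly like this (simplified;
handle_overflows() is a placeholder for the status-processing loop): the
counters are globally disabled for the duration of the handler, yet a
PMI latched just before the disable, or raised right after the
re-enable, can still arrive with nothing left in the status register.

static int intel_pmu_handle_irq(struct pt_regs *regs)
{
	int handled;

	intel_pmu_disable_all();	/* MSR_CORE_PERF_GLOBAL_CTRL <- 0 */
	handled = handle_overflows();	/* loop over the overflow status  */
	intel_pmu_enable_all(0);	/* re-arm the counters            */

	return handled;
}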
On Fri, Aug 27, 2010 at 04:05:23PM +0200, Robert Richter wrote:
> > What is funny is that this problem was masked by the
> > perf_event_nmi_handler swallowing all the nmis. I wonder if we were
> > losing events as a result of this bug too because if you think about it,
> > we processed the first event, a second event came in and we accidentally
> > ack'd it, thus dropping it on the floor.
>
> Yes, this could be the case, but only for handled counters. So it
> would be interesting to see for this case the status mask of the
> current and previous get_status call.
The status masks seem to be identical, 0x1 (and when I forced pmc0
unusable, everything was 0x2).
>
> > Now I wonder how the event was
> > ever reloaded, unless it was by accident because of how the scheduler
> > deals with perf counters (perf_start/stop all the time).
>
> The nmi might be queued by the cpu regardless of the overflow
> state.
>
> I am wondering why this happens at all, because events are disabled by
> wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0). Hmm, maybe this is exactly the
Heh. Not sure why it isn't working then. Then again you shouldn't need
the loop if it was working I would think.
> reason because the nmi could fire again after reenabling the counters.
>
> Is there a reason for disabling all counters?
It would be nice to have; that way we wouldn't have to 'eat' all these
extra nmis.  But I guess it isn't working correctly.
Cheers,
Don
On 27.08.10 11:05:23, Don Zickus wrote:
> The status masks seem to be identical, 0x1 (and when I forced pmc0
> unusable, everything was 0x2).
So this should also happen if only one counter is running?
Back-to-back nmis actually only occur when 2 different counters
trigger simultaneously.
> > > Now I wonder how the event was
> > > ever reloaded, unless it was by accident because of how the scheduler
> > > deals with perf counters (perf_start/stop all the time).
> >
> > The nmi might be queued by the cpu regardless of the overflow
> > state.
> >
> > I am wondering why this happens at all, because events are disabled by
> > wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0). Hmm, maybe this is exactly the
>
> Heh. Not sure why it isn't working then. Then again you shouldn't need
> the loop if it was working I would think.
>
> > reason because the nmi could fire again after reenabling the counters.
> >
> > Is there a reason for disabling all counters?
>
> It would be nice to have; that way we wouldn't have to 'eat' all these
> extra nmis.  But I guess it isn't working correctly.
What about the erratum mentioned earlier in this thread? We might
identify the affected cpus and return handled=2 for them. This solution
would still be better than before, and all other cpu models would keep
working unknown-nmi detection. In a next step we could try to use a
timer for detection.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
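One possible reading of that suggestion, sketched loosely (the quirk
flag, the helper, and the model numbers are assumptions, not code from
this thread): only the affected models lose strict unknown-NMI
detection, every other CPU keeps it.

/* Hypothetical per-model quirk, set once during Intel PMU init. */
static int fixup_extra_pmi_quirk;

static void __init intel_pmu_check_extra_pmi_quirk(struct cpuinfo_x86 *c)
{
	if (c->x86 != 6)
		return;
	switch (c->x86_model) {
	case 26:			/* assumed affected Nehalem models */
	case 30:
		fixup_extra_pmi_quirk = 1;
		break;
	}
}

/* ... and in the PMI handler's exit path: */
	if (fixup_extra_pmi_quirk && handled)
		handled = 2;	/* let the NMI code eat one trailing unknown */
	return handled;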
On Fri, Aug 27, 2010 at 10:10:38AM +0200, Robert Richter wrote:
> >
> > inc_irq_stat(apic_perf_irqs);
> > ack = status;
> > + intel_pmu_ack_status(ack);
>
> I would slightly change the patch:
>
> There is no need for the ack variable anymore, you could directly work
> with the status.
>
> I would call intel_pmu_ack_status() as close as possible after the
> intel_pmu_get_status(), which is after 'again:'.
>
Here is another version of the patch, which uses Robert's suggestion of
removing the ack variable
Cheers,
Don
--
From: Don Zickus <[email protected]>
Date: Fri, 27 Aug 2010 14:43:03 -0400
Subject: [PATCH] [x86] perf: fix accidentally ack'ing a second event on intel perf counter
During testing of a patch to stop having the perf subsystem swallow nmis,
it was uncovered that Nehalem boxes were randomly getting unknown nmis
when using the perf tool.
Moving the ack'ing of the PMI closer to when we get the status allows
the hardware to properly re-set the PMU bit signaling another PMI was
triggered during the processing of the first PMI.  This allows the new
logic for dealing with the shortcomings of multiple PMIs to handle the
extra NMI by 'eat'ing it later.
Now one can wonder why we are getting a second PMI when we disable all
the PMUs at the beginning of the NMI handler to prevent such a case; that
I do not know.  But I know the fix below helps deal with this quirk.
Tested on multiple Nehalems where the problem was occurring.  With the
patch, the code now loops a second time to handle the second PMI (whereas
before it did not).
Signed-off-by: Don Zickus <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel.c | 6 ++----
1 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 4539b4b..ee05c90 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -712,7 +712,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
struct perf_sample_data data;
struct cpu_hw_events *cpuc;
int bit, loops;
- u64 ack, status;
+ u64 status;
int handled = 0;
perf_sample_data_init(&data, 0);
@@ -729,6 +729,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
loops = 0;
again:
+ intel_pmu_ack_status(status);
if (++loops > 100) {
WARN_ONCE(1, "perfevents: irq loop stuck!\n");
perf_event_print_debug();
@@ -737,7 +738,6 @@ again:
}
inc_irq_stat(apic_perf_irqs);
- ack = status;
intel_pmu_lbr_read();
@@ -766,8 +766,6 @@ again:
x86_pmu_stop(event);
}
- intel_pmu_ack_status(ack);
-
/*
* Repeat if there is more work to be done:
*/
--
1.7.2.1
On 08/27/2010 11:57 AM, Don Zickus wrote:
> On Fri, Aug 27, 2010 at 10:10:38AM +0200, Robert Richter wrote:
>>>
>>> inc_irq_stat(apic_perf_irqs);
>>> ack = status;
>>> + intel_pmu_ack_status(ack);
>>
>> I would slightly change the patch:
>>
>> There is no need for the ack variable anymore, you could directly work
>> with the status.
>>
>> I would call intel_pmu_ack_status() as close as possible after the
>> intel_pmu_get_status(), which is after 'again:'.
>>
>
> Here is another version of the patch, which uses Robert's suggestion of
> removing the ack variable
Can you resend all three updated patches?
Thanks
Yinghai
On 27.08.10 14:57:44, Don Zickus wrote:
> From: Don Zickus <[email protected]>
> Date: Fri, 27 Aug 2010 14:43:03 -0400
> Subject: [PATCH] [x86] perf: fix accidentally ack'ing a second event on intel perf counter
>
> During testing of a patch to stop having the perf subsystem swallow nmis,
> it was uncovered that Nehalem boxes were randomly getting unknown nmis
> when using the perf tool.
>
> Moving the ack'ing of the PMI closer to when we get the status allows
> the hardware to properly re-set the PMU bit signaling another PMI was
> triggered during the processing of the first PMI.  This allows the new
> logic for dealing with the shortcomings of multiple PMIs to handle the
> extra NMI by 'eat'ing it later.
>
> Now one can wonder why we are getting a second PMI when we disable all
> the PMUs at the beginning of the NMI handler to prevent such a case; that
> I do not know.  But I know the fix below helps deal with this quirk.
>
> Tested on multiple Nehalems where the problem was occurring.  With the
> patch, the code now loops a second time to handle the second PMI (whereas
> before it did not).
>
> Signed-off-by: Don Zickus <[email protected]>
Looks good to me. Thanks Don.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center