2002-07-23 17:33:57

by Isabelle, Francois

Subject: Handling NMI in a kernel module

Is it possible to request_nmi() the way you can request_irq() from a kernel
driver on the i386 arch?

Our hardware watchdog is dual stage and can generate an NMI on the first stage;
our cPCI handle switch can also be used for hot swap requests via NMI.
I'd like to make use of this. I noticed the cpqhealth module already
implements some NMI handling, but this driver is closed source.

Should we patch i386/kernel/traps.c to add a callback to our stuff in
unknown_nmi_error()?

I'd like my driver to register a callback there. What about maintaining a
list of user callback functions which could be registered via:

request_nmi(int option, void (*handler)(void *dev_id, struct pt_regs *regs),
unsigned long flags, const char *dev_name, void *dev_id);

where option could take meanings such as
- prepend : place at start of nmi callback functions
- append : place at end of nmi callback functions
- truncate : replace callback chain
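
The three options can be modelled in user space as operations on a singly linked list. This is only a sketch under stated assumptions: the constants NMI_PREPEND, NMI_APPEND and NMI_TRUNCATE, the helper run_nmi_chain() and the demo trace_handler() are hypothetical names, and a real kernel version would need NMI-safe locking and could not call malloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct pt_regs; /* opaque in this model; the kernel defines the real layout */

typedef void (*nmi_handler_t)(void *dev_id, struct pt_regs *regs);

/* Hypothetical option values; the proposal names behaviours, not constants. */
enum nmi_option { NMI_PREPEND, NMI_APPEND, NMI_TRUNCATE };

struct nmi_action {
    nmi_handler_t handler;
    unsigned long flags;
    const char *dev_name;
    void *dev_id;
    struct nmi_action *next;
};

struct nmi_action *nmi_chain; /* head of the callback list */

/* Register a callback; returns 0 on success, -1 on bad arguments or OOM. */
int request_nmi(int option, nmi_handler_t handler, unsigned long flags,
                const char *dev_name, void *dev_id)
{
    struct nmi_action *a, **pp;

    if (!handler || option < NMI_PREPEND || option > NMI_TRUNCATE)
        return -1;
    a = malloc(sizeof(*a));
    if (!a)
        return -1;
    a->handler = handler;
    a->flags = flags;
    a->dev_name = dev_name;
    a->dev_id = dev_id;
    a->next = NULL;

    if (option == NMI_TRUNCATE) {
        /* truncate: drop every existing callback before adding ours */
        while (nmi_chain) {
            struct nmi_action *old = nmi_chain;
            nmi_chain = old->next;
            free(old);
        }
    }
    if (option == NMI_APPEND) {
        /* append: walk to the tail and link the new entry there */
        for (pp = &nmi_chain; *pp; pp = &(*pp)->next)
            ;
        *pp = a;
    } else {
        /* prepend (and truncate, where the list is now empty) */
        a->next = nmi_chain;
        nmi_chain = a;
    }
    return 0;
}

/* Stand-in for the hook in unknown_nmi_error(): call every callback in order. */
void run_nmi_chain(struct pt_regs *regs)
{
    struct nmi_action *a;

    for (a = nmi_chain; a; a = a->next) {
        assert(a->handler != NULL);
        a->handler(a->dev_id, regs);
    }
}

/* Demo handler: records the first character of dev_id so call order is visible. */
char nmi_trace[16];
size_t nmi_trace_len;

void trace_handler(void *dev_id, struct pt_regs *regs)
{
    (void)regs;
    if (nmi_trace_len + 1 < sizeof(nmi_trace))
        nmi_trace[nmi_trace_len++] = *(const char *)dev_id;
}
```

Registering "a" with append, "b" with prepend and "c" with append, then calling run_nmi_chain(NULL), runs the handlers in the order b, a, c; a later truncate registration leaves only the new callback.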

Is there any standard mechanism to implement such features (dual stage
watchdog)?

Comments are welcome.

Francois Isabelle
[email protected]
Kontron Canada Inc

2002-07-23 18:24:30

by John Levon

Subject: Re: Handling NMI in a kernel module

On Tue, Jul 23, 2002 at 01:37:01PM -0400, Isabelle, Francois wrote:

> Is it possible to request_nmi() the way you can request_irq() from a kernel
> driver on the i386 arch?

Not currently, no.

> Our hardware watchdog is dual stage and can generate an NMI on the first stage;
> our cPCI handle switch can also be used for hot swap requests via NMI.
> I'd like to make use of this. I noticed the cpqhealth module already
> implements some NMI handling, but this driver is closed source.

You can do some horrible hack with sidt + _set_gate() to replace the NMI trap handler.

> Should we patch i386/kernel/traps.c to add a callback to our stuff in
> unknown_nmi_error()?
>
> I'd like my driver to register a callback there. What about maintaining a
> list of user callback functions which could be registered via:
>
> request_nmi(int option, void (*handler)(void *dev_id, struct pt_regs *regs),
> unsigned long flags, const char *dev_name, void *dev_id);
>
> where option could take meanings such as
> - prepend : place at start of nmi callback functions
> - append : place at end of nmi callback functions
> - truncate : replace callback chain

Why all three? When would anything other than prepend be useful? Each
handler must simply check whether the NMI is its responsibility, and pass
the duty along to the next handler if not.
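
A rough user-space model of that prepend-only chain follows. It assumes handlers return nonzero when they claim the NMI, which differs from the void handler in the proposed signature; nmi_prepend(), nmi_dispatch(), struct demo_dev and demo_nmi() are hypothetical names, and the nmi_pending flag stands in for whatever status register a real device would be polled through.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NMI_HANDLERS 8

/* Assumed convention: return 1 if the NMI was ours, 0 to pass it on. */
typedef int (*nmi_claim_fn)(void *dev_id);

struct nmi_slot {
    nmi_claim_fn fn;
    void *dev_id;
};

struct nmi_slot nmi_slots[MAX_NMI_HANDLERS];
int nmi_nslots;

/* Insert a handler at the front of the chain; -1 if the table is full. */
int nmi_prepend(nmi_claim_fn fn, void *dev_id)
{
    int i;

    if (!fn || nmi_nslots == MAX_NMI_HANDLERS)
        return -1;
    for (i = nmi_nslots; i > 0; i--)
        nmi_slots[i] = nmi_slots[i - 1];
    nmi_slots[0].fn = fn;
    nmi_slots[0].dev_id = dev_id;
    nmi_nslots++;
    return 0;
}

/*
 * Walk the chain until one handler claims the NMI.
 * Returns the index of the claiming handler, or -1 if nobody
 * claimed it (the "unknown NMI" case).
 */
int nmi_dispatch(void)
{
    int i;

    for (i = 0; i < nmi_nslots; i++) {
        assert(nmi_slots[i].fn != NULL);
        if (nmi_slots[i].fn(nmi_slots[i].dev_id))
            return i;
    }
    return -1;
}

/* Demo device: nmi_pending models a hardware status bit. */
struct demo_dev {
    int nmi_pending;
    int handled;
};

int demo_nmi(void *dev_id)
{
    struct demo_dev *d = dev_id;

    if (!d->nmi_pending)
        return 0;       /* not ours: let the next handler look */
    d->nmi_pending = 0; /* acknowledge */
    d->handled++;
    return 1;
}
```

With a watchdog device and a hot swap device both registered through nmi_prepend(), an NMI raised by either one is claimed only by its own handler, and nmi_dispatch() returns -1 when neither has anything pending.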

What is the purpose of dev_name, dev_id, and flags exactly?

Personally, I'd like to see such a patch.

regards
john

--
"If you cannot convince them, confuse them."
- Harry S. Truman

2002-07-23 21:14:29

by Alan Cox

Subject: Re: Handling NMI in a kernel module

On Tue, 2002-07-23 at 18:37, Isabelle, Francois wrote:
> I'd like my driver to register a callback there. What about maintaining a
> list of user callback functions which could be registered via:
>
> request_nmi(int option, void (*handler)(void *dev_id, struct pt_regs *regs),
> unsigned long flags, const char *dev_name, void *dev_id);

Doesn't seem unreasonable.

> Is there any standard mechanism to implement such features (dual stage
> watchdog)?

We have a watchdog API but not yet dual stage stuff. It's becoming a
must-have for HA stuff, so the API needs extending.

2002-07-26 21:53:09

by Alan Robertson

Subject: Re: Handling NMI in a kernel module

Isabelle, Francois wrote:
> Is it possible to request_nmi() the way you can request_irq() from a kernel
> driver on the i386 arch?
>
> Our hardware watchdog is dual stage and can generate an NMI on the first stage;
> our cPCI handle switch can also be used for hot swap requests via NMI.
> I'd like to make use of this. I noticed the cpqhealth module already
> implements some NMI handling, but this driver is closed source.
>
> Should we patch i386/kernel/traps.c to add a callback to our stuff in
> unknown_nmi_error()?
>
> I'd like my driver to register a callback there. What about maintaining a
> list of user callback functions which could be registered via:
>
> request_nmi(int option, void (*handler)(void *dev_id, struct pt_regs *regs),
> unsigned long flags, const char *dev_name, void *dev_id);
>
> where option could take meanings such as
> - prepend : place at start of nmi callback functions
> - append : place at end of nmi callback functions
> - truncate : replace callback chain
>
> Is there any standard mechanism to implement such features (dual stage
> watchdog)?

We've created a separate mailing list to talk about enhancements to the
watchdog driver API. Dual stage watchdog is on the list. It's pretty quiet
now, but perhaps now that summer is winding down, we can get started again...


Info on the list is here:
http://lists.tummy.com/mailman/listinfo/watchdogng



-- Alan Robertson
[email protected]

2002-07-27 00:48:37

by Alan Cox

Subject: Re: Handling NMI in a kernel module

I've been tracking other lists. The current state is very much that we
need the dual notifier. I now have some draft code that allows us to do
this even on hardware that doesn't support it, and where the read()
function gets told when an event is about to occur

2002-07-27 03:28:21

by Alan Robertson

Subject: Re: Handling NMI in a kernel module

Alan Cox wrote:
> I've been tracking other lists. The current state is very much that we
> need the dual notifier. I now have some draft code that allows us to do
> this even on hardware that doesn't support it, and where the read()
> function gets told when an event is about to occur

I know what had been requested from the telco crowd was the ability to
register a function to get called (in the kernel) when an event was about to
occur.

I'm not sure what it means to say that "the read() function gets told when
an event is about to occur".

-- Alan Robertson
[email protected]

2002-07-27 03:43:12

by Jonathan Lundell

Subject: Re: Handling NMI in a kernel module

At 3:05 AM +0100 2002-07-27, Alan Cox wrote:
>I've been tracking other lists. The current state is very much that we
>need the dual notifier. I now have some draft code that allows us to do
>this even on hardware that doesn't support it, and where the read()
>function gets told when an event is about to occur

I'd be grateful for a copy of the draft code. We've done a machine
with a hardware-based two-stage watchdog, and are in the process of
implementing one on a more-vanilla piece of hardware.
--
/Jonathan Lundell.

2002-07-27 11:22:09

by Alan Cox

Subject: Re: Handling NMI in a kernel module

On Sat, 2002-07-27 at 04:30, Alan Robertson wrote:
> Alan Cox wrote:
> > I've been tracking other lists. The current state is very much that we
> > need the dual notifier. I now have some draft code that allows us to do
> > this even on hardware that doesn't support it, and where the read()
> > function gets told when an event is about to occur
>
> I know what had been requested from the telco crowd was the ability to
> register a function to get called (in the kernel) when an event was about to
> occur.

They can already do that anyway. It's called add_timer() 8)

2002-07-27 13:48:53

by Alan Robertson

Subject: Re: Handling NMI in a kernel module

Alan Cox wrote:
> On Sat, 2002-07-27 at 04:30, Alan Robertson wrote:
>
>>Alan Cox wrote:
>>
>>>I've been tracking other lists. The current state is very much that we
>>>need the dual notifier. I now have some draft code that allows us to do
>>>this even on hardware that doesn't support it, and where the read()
>>>function gets told when an event is about to occur
>>>
>>I know what had been requested from the telco crowd was the ability to
>>register a function to get called (in the kernel) when an event was about to
>>occur.
>>
>
> They can already do that anyway. It's called add_timer() 8)

Given how vaguely I stated it, I guess that's what I deserve ;-)
However, it's not quite what I had in mind. ;-)

You'd like to see the driver add a timer when someone opens the device, and
remove and re-add it each time they tickle the watchdog timer. It's this
interaction which the driver has to provide support for.

Also, you'd like to specify how long before the watchdog timer goes off that
it "pops".

So, you really want something like register_watchdog_pretimeout() and pass
it a function and a pretimeout time in ticks or milliseconds, or whatever.
You'd also need an unregister_watchdog_pretimeout() function of course as
well... IIRC, this is what I mentioned on the watchdog timer list.
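
To make the idea concrete, here is a small user-space model of that interface, counting in abstract ticks. The two function names come from the paragraph above; their signatures, the extra data argument, struct wdt_model, wdt_tickle(), wdt_tick() and count_warning() are all assumptions made for the sketch. A real driver would arm a kernel timer instead of being stepped by hand.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*pretimeout_fn)(void *data);

/* Model of a watchdog that is advanced one tick at a time by wdt_tick(). */
struct wdt_model {
    unsigned int timeout;    /* ticks from a tickle until expiry */
    unsigned int remaining;  /* ticks left before expiry */
    unsigned int pretimeout; /* warn this many ticks before expiry */
    pretimeout_fn fn;        /* registered pretimeout callback, or NULL */
    void *data;
    int warned;              /* callback already fired for this period */
    int expired;             /* watchdog went off */
};

struct wdt_model wdt = { .timeout = 10, .remaining = 10 };

/* Returns 0 on success, -1 if the request cannot be honoured. */
int register_watchdog_pretimeout(pretimeout_fn fn, unsigned int ticks,
                                 void *data)
{
    if (!fn || ticks == 0 || ticks >= wdt.timeout)
        return -1;
    wdt.fn = fn;
    wdt.pretimeout = ticks;
    wdt.data = data;
    wdt.warned = 0;
    return 0;
}

void unregister_watchdog_pretimeout(void)
{
    wdt.fn = NULL;
    wdt.pretimeout = 0;
}

/* Tickling the watchdog restarts the period and re-arms the warning. */
void wdt_tickle(void)
{
    wdt.remaining = wdt.timeout;
    wdt.warned = 0;
}

/* Advance time by one tick; fire the callback once per period. */
void wdt_tick(void)
{
    if (wdt.expired)
        return;
    assert(wdt.remaining > 0);
    wdt.remaining--;
    if (wdt.remaining == 0) {
        wdt.expired = 1; /* a real watchdog would reset the machine here */
        return;
    }
    if (wdt.fn && !wdt.warned && wdt.remaining <= wdt.pretimeout) {
        wdt.warned = 1;
        wdt.fn(wdt.data);
    }
}

/* Demo callback: counts how many warnings were delivered. */
void count_warning(void *data)
{
    ++*(int *)data;
}
```

With a 10-tick timeout and a 3-tick pretimeout, the callback fires on the seventh tick after each tickle, and never fires once it has been unregistered.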

-- Alan Robertson
[email protected]