the Blackfin on-chip watchdog has controllable behavior ... it can be
configured to reset the processor (like a normal watchdog), or it can
be configured to simply generate an interrupt.
i can see embedded systems where simply resetting the system is not
desirable ... perhaps it's the control system for some machinery and
resetting the system will force it to reinitialize itself which could
cause problems if a guy is servicing the insides at the time ;)
the Blackfin watchdog driver has a module parameter, "action" ... the
default will have the watchdog act like every other watchdog out there
-- it reboots after a timeout. however, by setting the action param
appropriately, the watchdog goes into simple interrupt generation.
the question then becomes, how do people developing their
board-specific version customize what happens when the timeout occurs
and the interrupt is fired ? making every customer who wishes to
customize the watchdog behavior edit the watchdog driver is
troublesome as they're now blurring the distinct parts: the
watchdog-specific piece and their board-specific piece.
what i'm doing now is weak symbols:
...
extern irqreturn_t bfin_board_watchdog_interrupt(void) __attribute__((weak));
static irqreturn_t bfin_wdt_interrupt(int irq, void *dev_id)
{
if (bfin_board_watchdog_interrupt) {
return bfin_board_watchdog_interrupt();
} else {
bfin_wdt_stop();
bfin_wdt_keepalive();
bfin_wdt_start();
return IRQ_HANDLED;
}
}
...
is this completely bad mojo ? is there some other mechanism that
provides what i want and i just dont know about it ? or do i just
make people change the driver to fit their application, thus throwing
out the idea of keeping all board-specific details in just the boards
file ...
-mike
On Thu, May 24, 2007 at 12:21:47AM -0400, Mike Frysinger wrote:
> is this completely bad mojo ? is there some other mechanism that
> provides what i want and i just dont know about it ? or do i just
> make people change the driver to fit their application, thus throwing
> out the idea of keeping all board-specific details in just the boards
> file ...
It sounds like your constraining your driver based on terminology.
Watchdogs on most embedded platforms support either a 'reset' mode or
otherwise act as periodic timers, trying to push both of these
functionalities in to a watchdog driver is rather pointless.
CONFIG_WATCHDOG implies 'reset' mode by definition.
If you wish to use your watchdog timer as a periodic timer, simply have a
clocksource/clockevents established for it, leave the watchdog driver as
a reset-only thing, and let the user decide which one they want either
via Kconfig or the kernel command line. (The watchdog driver can just
-ENODEV or -EBUSY if the clocksource is active).
On 5/24/07, Paul Mundt <[email protected]> wrote:
> It sounds like your constraining your driver based on terminology.
> Watchdogs on most embedded platforms support either a 'reset' mode or
> otherwise act as periodic timers, trying to push both of these
> functionalities in to a watchdog driver is rather pointless.
> CONFIG_WATCHDOG implies 'reset' mode by definition.
I agree for a product in the hands of a customer: let the watchdog
pull your bacon out of the fire.
But what about debugging? Suppose your embedded computer with custom
drivers locks up solid every few hundred hours. It would be nice if
the watchdog gave a stack dump instead erasing the evidence. How about
having "action=reset" and "action=debug"?
-- Daniel
On Thu, May 24, 2007 at 03:47:45AM -0500, Daniel Newby wrote:
> On 5/24/07, Paul Mundt <[email protected]> wrote:
> > It sounds like your constraining your driver based on terminology.
> >Watchdogs on most embedded platforms support either a 'reset' mode or
> >otherwise act as periodic timers, trying to push both of these
> >functionalities in to a watchdog driver is rather pointless.
> >CONFIG_WATCHDOG implies 'reset' mode by definition.
>
> I agree for a product in the hands of a customer: let the watchdog
> pull your bacon out of the fire.
>
> But what about debugging? Suppose your embedded computer with custom
> drivers locks up solid every few hundred hours. It would be nice if
> the watchdog gave a stack dump instead erasing the evidence. How about
> having "action=reset" and "action=debug"?
>
Again, CONFIG_WATCHDOG implies reset by definition. If you'd like to
propose pluggable policies for CONFIG_WATCHDOG and post the code for
that, go right ahead.
For soft lockups, there's already the softlockup code which does what
you seem to be leaning towards. For hard lockups (which is where
CONFIG_WATCHDOG comes in handy), you're likely not going to get any
output anyways. In either case, wiggling this in to CONFIG_WATCHDOG is
very much changing the meaning of what CONFIG_WATCHDOG means today, and
doesn't seem to buy us anything.
> is this completely bad mojo ? is there some other mechanism that
> provides what i want and i just dont know about it ? or do i just
> make people change the driver to fit their application, thus throwing
> out the idea of keeping all board-specific details in just the boards
> file ...
There are two possibilities of interest I can think of (and maybe both
are useful). One is to deliver a signal to someone on expiry the other
would be to use notifier chains and export either the notifier or
add/remove operations. That allows multiple modules and users to chain
onto the expiry event
Take a look at include/linux/notifier.h
On Thu 24 May 2007 01:23, Paul Mundt pondered:
> On Thu, May 24, 2007 at 12:21:47AM -0400, Mike Frysinger wrote:
> > is this completely bad mojo ? is there some other mechanism that
> > provides what i want and i just dont know about it ? or do i just
> > make people change the driver to fit their application, thus throwing
> > out the idea of keeping all board-specific details in just the boards
> > file ...
>
> It sounds like your constraining your driver based on terminology.
> Watchdogs on most embedded platforms support either a 'reset' mode or
> otherwise act as periodic timers, trying to push both of these
> functionalities in to a watchdog driver is rather pointless.
> CONFIG_WATCHDOG implies 'reset' mode by definition.
I understand what you mean - typically - most people think of watchdog ==
reset.
But, calling it a periodic timer, and servicing it with the watchdog user
space demon is even more confusing - isn't it?
-Robin
On 5/24/07, Daniel Newby <[email protected]> wrote:
> On 5/24/07, Paul Mundt <[email protected]> wrote:
> > It sounds like your constraining your driver based on terminology.
> > Watchdogs on most embedded platforms support either a 'reset' mode or
> > otherwise act as periodic timers, trying to push both of these
> > functionalities in to a watchdog driver is rather pointless.
> > CONFIG_WATCHDOG implies 'reset' mode by definition.
>
> I agree for a product in the hands of a customer: let the watchdog
> pull your bacon out of the fire.
>
> But what about debugging? Suppose your embedded computer with custom
> drivers locks up solid every few hundred hours. It would be nice if
> the watchdog gave a stack dump instead erasing the evidence. How about
> having "action=reset" and "action=debug"?
"action" corresponds one to one with the functionality of the hardware
device ... what the board guy wishes to do when the interrupt expires
is up to the board guy and that may include doing a debug dump
somewhere ... that is most definitely not a realm i want to tread into
it ;)
-mike
On 5/24/07, Paul Mundt <[email protected]> wrote:
> On Thu, May 24, 2007 at 12:21:47AM -0400, Mike Frysinger wrote:
> > is this completely bad mojo ? is there some other mechanism that
> > provides what i want and i just dont know about it ? or do i just
> > make people change the driver to fit their application, thus throwing
> > out the idea of keeping all board-specific details in just the boards
> > file ...
>
> It sounds like your constraining your driver based on terminology.
> Watchdogs on most embedded platforms support either a 'reset' mode or
> otherwise act as periodic timers, trying to push both of these
> functionalities in to a watchdog driver is rather pointless.
> CONFIG_WATCHDOG implies 'reset' mode by definition.
my constraint was trying to keep all of the code that deals with the
watchdog in one file ... those were the blinders i had on from the get
go so the idea of having different drivers that work with the watchdog
hardware hadnt even occurred to me
> If you wish to use your watchdog timer as a periodic timer, simply have a
> clocksource/clockevents established for it, leave the watchdog driver as
> a reset-only thing, and let the user decide which one they want either
> via Kconfig or the kernel command line. (The watchdog driver can just
> -ENODEV or -EBUSY if the clocksource is active).
hmm, i'll poke the clocksource/clockevents stuff as well as the
notifier idea from Alan
thanks
-mike
On Thu, May 24, 2007 at 09:29:51AM -0400, Robin Getz wrote:
> On Thu 24 May 2007 01:23, Paul Mundt pondered:
> > On Thu, May 24, 2007 at 12:21:47AM -0400, Mike Frysinger wrote:
> > > is this completely bad mojo ? is there some other mechanism that
> > > provides what i want and i just dont know about it ? or do i just
> > > make people change the driver to fit their application, thus throwing
> > > out the idea of keeping all board-specific details in just the boards
> > > file ...
> >
> > It sounds like your constraining your driver based on terminology.
> > Watchdogs on most embedded platforms support either a 'reset' mode or
> > otherwise act as periodic timers, trying to push both of these
> > functionalities in to a watchdog driver is rather pointless.
> > CONFIG_WATCHDOG implies 'reset' mode by definition.
>
> I understand what you mean - typically - most people think of watchdog ==
> reset.
>
No, not typically. This is _precisely_ what CONFIG_WATCHDOG means, and
how the entire set of drivers underneath it behave.
> But, calling it a periodic timer, and servicing it with the watchdog user
> space demon is even more confusing - isn't it?
>
Calling it a periodic timer when its in periodic timer mode makes sense.
Why you would want to interface that with a userspace watchdog daemon is
beyond me, they're conceptually unrelated.
Please read my original mail on the subject. I'm not advocating hiding a
clocksource somewhere in the depths of CONFIG_WATCHDOG, they're
completely unrelated.
On Thu 24 May 2007 11:23, Paul Mundt pondered:
>
> Calling it a periodic timer when its in periodic timer mode makes sense.
No disagreements - but I don't think that a watchdog that doesn't cause a
reset is a periodic timer.
> Why you would want to interface that with a userspace watchdog daemon is
> beyond me, they're conceptually unrelated.
Agreed again - periodic timers have nothing to do with watchdogs. This is
where I am confused about why you are saying that the only event a watchdog
can have is a hard reset.
> Please read my original mail on the subject.
I did. Twice - but maybe I am still missing something. (sorry)
> I'm not advocating hiding a
> clocksource somewhere in the depths of CONFIG_WATCHDOG, they're
> completely unrelated.
I (and many others) consider a "watchdog" a clock sink - something that needs
to be poked within certain limits (too fast can indicate a failures just as
too slow is a failure).
The event or how something is notified of the failure of the watchdog to be
serviced shouldn't determine what the name is.
What's in a name? that which we call a watchdog
By any other name would smell as sweet;
-Bill S
-Robin
On Thu, May 24, 2007 at 01:32:30PM -0400, Robin Getz wrote:
> On Thu 24 May 2007 11:23, Paul Mundt pondered:
> >
> > Calling it a periodic timer when its in periodic timer mode makes sense.
>
> No disagreements - but I don't think that a watchdog that doesn't cause a
> reset is a periodic timer.
>
I'm not sure what else you think it is? On most platforms, when it's not
in reset mode, it works as a free-running timer with an IRQ generated on
overflow. I've certainly used the watchdog as a system timer before on
boards where all of the timer channels are tied up for other uses.
> > Why you would want to interface that with a userspace watchdog daemon is
> > beyond me, they're conceptually unrelated.
>
> Agreed again - periodic timers have nothing to do with watchdogs. This is
> where I am confused about why you are saying that the only event a watchdog
> can have is a hard reset.
>
No, what I said was that the only event that _matters_ to CONFIG_WATCHDOG
is a hard reset. So far no one has suggested anything outside of hard
reset, periodic timer, or softlockup detection that would be useful to
extend CONFIG_WATCHDOG for.
If you're talking about specific events, clockevents are still a much
better way to go than trying to hammer something in to CONFIG_WATCHDOG
that it was never designed for. If you have some 'special' events for
your watchdog that would be of use to others, tying these in as
additional flags for clockevents would be far more beneficial anyways.
> > I'm not advocating hiding a clocksource somewhere in the depths of
> > CONFIG_WATCHDOG, they're completely unrelated.
>
> I (and many others) consider a "watchdog" a clock sink - something that needs
> to be poked within certain limits (too fast can indicate a failures just as
> too slow is a failure).
>
Currently there's nothing in the kernel that cares about clearing 'too fast'.
I can't imagine why this _should_ be treated as a failure, but feel free
to code up a solution if you feel it will be useful.
> The event or how something is notified of the failure of the watchdog to be
> serviced shouldn't determine what the name is.
>
If all you want is a timer that you occasionally have to poke and then
take some notification when it expires, you can just use a regular
one-shot timer anyways and bank off of the system timer, the 'watchdog'
is certainly not doing anything useful at this point.
So far the only example anyone has provided outside of periodic timers or
hardware reset has been dumping the stack when something gets stuck.
Softlockup does this already today, using a timer.
If your system is completely dead, you won't have any way to trigger or
see the stack dump anyways, so the watchdog doesn't buy you anything
there, either.
What many watchdogs do today is simply to have split timer for userspace
and the actual hardware (where userspace has to poke the timer every now
and then, or the kernel will allow the overflow). This is pretty common
for watchdogs with very fast overflow periods.
There's certainly nothing wrong with having a timer that runs out and
kicks a notifier chain if there's something special you want to do, but
tying up the watchdog hardware for that is silly. There are many other
things one has to use the watchdog hardware for anyways (reset, periodic
timing, frequency scaling -- waiting for PLL stabilization, etc.). Tying
down a hardware block where a struct timer_list will do the same work
makes no sense.
On 5/24/07, Paul Mundt <[email protected]> wrote:
> So far the only example anyone has provided outside of periodic timers or
> hardware reset has been dumping the stack when something gets stuck.
> Softlockup does this already today, using a timer.
Many watchdogs can be hooked up to a non-maskable interrupt (NMI) that
cannot be disabled or preempted. You get a stack dump even for drastic
bugs: ISR lock up, timer misconfiguration, level-sensitive interrupt
line stuck asserted, and so forth. Getting that information by other
means can be painful and/or expensive.
The Blackfin chip in the original message appears to support watchdog
NMI.
-- Daniel
On 5/25/07, Daniel Newby <[email protected]> wrote:
> On 5/24/07, Paul Mundt <[email protected]> wrote:
> > So far the only example anyone has provided outside of periodic timers or
> > hardware reset has been dumping the stack when something gets stuck.
> > Softlockup does this already today, using a timer.
>
> Many watchdogs can be hooked up to a non-maskable interrupt (NMI) that
> cannot be disabled or preempted. You get a stack dump even for drastic
> bugs: ISR lock up, timer misconfiguration, level-sensitive interrupt
> line stuck asserted, and so forth. Getting that information by other
> means can be painful and/or expensive.
>
> The Blackfin chip in the original message appears to support watchdog
> NMI.
it does ... it has four modes:
- reset (drivers/char/watchdog/bfin_wdt.c)
- interrupt (i'll prob write a clockevents driver for this)
- NMI (no plans to do anything for this as NMI is unused in Blackfin)
- nothing (have yet to find a use case for this)
-mike