Hello everyone,
I have an Intel board (D915GEV/D915GRF) with an onboard i8xx TCO timer
watchdog on it. I compiled a kernel and tried to make it reset my
machine, but it simply doesn't. I use Linus' Linux tree (GIT HEAD) with the
following watchdog-related configuration:
CONFIG_WATCHDOG=y
CONFIG_WATCHDOG_NOWAYOUT=y
CONFIG_I6300ESB_WDT=y
CONFIG_I8XX_TCO=y
I tried to test the watchdog using the following:
cat > /dev/watchdog
and waited a few minutes, but that didn't reset my machine. dmesg shows the
following:
(webfarm) [~] dmesg | grep TCO
i8xx TCO timer: heartbeat value must be 2<heartbeat<39, using 30
i8xx TCO timer: initialized (0x0460). heartbeat=30 sec (nowayout=1)
lspci is this:
0000:00:00.0 Host bridge: Intel Corp. 915G/P/GV Processor to I/O Controller (rev 04)
0000:00:01.0 PCI bridge: Intel Corp. 915G/P/GV PCI Express Root Port (rev 04)
0000:00:02.0 VGA compatible controller: Intel Corp. 82915G Express Chipset Family Graphics Controller (rev 04)
0000:00:1c.0 PCI bridge: Intel Corp. 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 1 (rev 03)
0000:00:1c.1 PCI bridge: Intel Corp. 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 2 (rev 03)
0000:00:1c.2 PCI bridge: Intel Corp. 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 3 (rev 03)
0000:00:1c.3 PCI bridge: Intel Corp. 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 4 (rev 03)
0000:00:1d.0 USB Controller: Intel Corp. 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #1 (rev 03)
0000:00:1d.1 USB Controller: Intel Corp. 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #2 (rev 03)
0000:00:1d.2 USB Controller: Intel Corp. 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #3 (rev 03)
0000:00:1d.3 USB Controller: Intel Corp. 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #4 (rev 03)
0000:00:1d.7 USB Controller: Intel Corp. 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB2 EHCI Controller (rev 03)
0000:00:1e.0 PCI bridge: Intel Corp. 82801 PCI Bridge (rev d3)
0000:00:1f.0 ISA bridge: Intel Corp. 82801FB/FR (ICH6/ICH6R) LPC Interface Bridge (rev 03)
0000:00:1f.1 IDE interface: Intel Corp. 82801FB/FBM/FR/FW/FRW (ICH6 Family) IDE Controller (rev 03)
0000:00:1f.2 IDE interface: Intel Corp. 82801FB/FW (ICH6/ICH6W) SATA Controller (rev 03)
0000:00:1f.3 SMBus: Intel Corp. 82801FB/FBM/FR/FW/FRW (ICH6 Family) SMBus Controller (rev 03)
0000:04:00.0 Ethernet controller: Marvell Technology Group Ltd.: Unknown device 4361 (rev 17)
Does anyone have any ideas? Did I do something wrong? From looking at the
source code it looks like the watchdog is enabled as soon as I open the
device. And if I don't feed anything in, it shouldn't reload the timer.
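The expectation described above (opening the device arms the timer, only a
write reloads it) can be written down as a toy model. This is a sketch of the
expected semantics only, not the i8xx_tco driver or the TCO hardware; the
class and method names are hypothetical and time is passed in explicitly so
the behaviour is easy to check.

```python
class WatchdogModel:
    """Toy model of the expected semantics; not the real driver."""

    def __init__(self, heartbeat=30):
        self.heartbeat = heartbeat  # seconds, as reported by dmesg
        self.deadline = None        # None means the timer is not armed

    def open(self, now):
        # Assumption from the mail: opening the device starts the timer.
        self.deadline = now + self.heartbeat

    def ping(self, now):
        # A write to the device reloads the timer if it is armed.
        if self.deadline is not None:
            self.deadline = now + self.heartbeat

    def expired(self, now):
        # True once the armed timer has run out without a reload.
        return self.deadline is not None and now >= self.deadline
```

Under this model, an opened device that never receives a write should expire
one heartbeat after the open. The real hardware may behave differently.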
Thomas
On Tue, 2 May 2006 00:59:48 +0200 Thomas Glanzmann wrote:
> Hello everyone,
> I have an Intel board (D915GEV/D915GRF) with an onboard i8xx TCO timer
> watchdog on it. I compiled a kernel and tried to make it reset my
> machine, but it simply doesn't. I use Linus' Linux tree (GIT HEAD) with the
> following watchdog-related configuration:
>
> CONFIG_WATCHDOG=y
> CONFIG_WATCHDOG_NOWAYOUT=y
> CONFIG_I6300ESB_WDT=y
> CONFIG_I8XX_TCO=y
>
> I tried to test the watchdog using the following:
>
> cat > /dev/watchdog
*** see below
> and waited a few minutes, but that didn't reset my machine. dmesg shows the
> following:
>
> (webfarm) [~] dmesg | grep TCO
> i8xx TCO timer: heartbeat value must be 2<heartbeat<39, using 30
> i8xx TCO timer: initialized (0x0460). heartbeat=30 sec (nowayout=1)
>
> lspci is this:
>
[snip]
>
> Does anyone have any ideas? Did I do something wrong? From looking at the
> source code it looks like the watchdog is enabled as soon as I open the
> device. And if I don't feed anything in, it shouldn't reload the timer.
TCO watchdog works on my old P-III machine. After writing to
/dev/watchdog, about 50 (not 30) seconds later, it reboots.
Oh, I see. Two choices:
use: echo -n 1 > /dev/watchdog
or when you use: cat > /dev/watchdog
and press CR, that doesn't close /dev/watchdog yet.
You need to kill cat (^C); then /dev/watchdog is closed
and the watchdog timer starts counting. You will (should) see
a message like this:
i8xx TCO timer: Unexpected close, not stopping watchdog!
Then wait a while, and it reboots.
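For a more controlled test than cat or echo, the same open/write/close
sequence can be scripted. The following is only a sketch: the function name
and parameters are made up for illustration, and it assumes the usual
watchdog device convention that any written byte counts as a keepalive and
that a final 'V' requests a clean stop. With CONFIG_WATCHDOG_NOWAYOUT=y the
'V' is not honoured, so the timer keeps running after close.

```python
import time


def feed_watchdog(path="/dev/watchdog", pings=3, interval=1.0,
                  magic_close=False):
    """Open the device, ping it `pings` times, then close it.

    Hypothetical helper for testing; run it against the real device
    only on a machine that is allowed to reboot.
    """
    # buffering=0 so every write() goes to the device immediately
    with open(path, "wb", buffering=0) as wd:
        for _ in range(pings):
            wd.write(b"\0")      # any byte acts as a keepalive
            time.sleep(interval)
        if magic_close:
            wd.write(b"V")       # magic close; ignored with nowayout=1
    # leaving the with-block closes the device
```

Calling feed_watchdog(pings=1, interval=0) and then waiting corresponds to
the echo variant above: one write, then a close without the magic character.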
---
~Randy
Hello Randy,
thanks for the reply, but the problem isn't how I tried to trigger it
from userland but how the new chips work. See the attached email.
Nevertheless, at the moment the software watchdog is okay with me. But
the userland 'watchdog' program is driving me crazy: 6 false positives
in less than 48 hours, which resulted in reboots. I think that I will
put some work into writing something more reliable.
Thomas