2005-04-24 04:08:37

by Parag Warudkar

[permalink] [raw]
Subject: X86_64: 2.6.12-rc3 spontaneous reboot

While running a 32 bit Java program 2.6.12-rc3 rebooted spontaneously leaving
a corrupt partition table and disk with errors. There was nothing in dmesg
(no oops/panic) except some -MARK- entries during the reboot.

No such problems with 2.6.11. Hardware is a laptop - Compaq presario R3240us
- Athlon64. It runs solid when running 2.6.11.

Parag


2005-04-24 08:42:15

by Alexander Nyberg

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

s?n 2005-04-24 klockan 00:08 -0400 skrev Parag Warudkar:
> While running a 32 bit Java program 2.6.12-rc3 rebooted spontaneously leaving
> a corrupt partition table and disk with errors. There was nothing in dmesg
> (no oops/panic) except some -MARK- entries during the reboot.
>

Is this reproducible? If so, can you give a detailed description of how.

Thanks

2005-04-24 13:02:42

by Ed Tomlinson

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

On Sunday 24 April 2005 04:41, Alexander Nyberg wrote:
> s?n 2005-04-24 klockan 00:08 -0400 skrev Parag Warudkar:
> > While running a 32 bit Java program 2.6.12-rc3 rebooted spontaneously leaving
> > a corrupt partition table and disk with errors. There was nothing in dmesg
> > (no oops/panic) except some -MARK- entries during the reboot.
> >
>
> Is this reproducible? If so, can you give a detailed description of how.

I think rc3 has code from rc2-mm2/3. Both of these reboot here randomly. Nothing
shows up on a serial console... Think something is seriously wrong with x86_64 in rc3.
That being said its possible its fixed in HEAD by.

[PATCH] x86_64: fix new out of line put_user()
[PATCH] x86_64: Bug in new out of line put_user()

some people have reported reversing (in -mm)

sched-unlocked-context-switches.patch

Helps too.

Ed Tomlinson

2005-04-24 13:35:15

by ismail dönmez

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

Running 2.6.12-rc3 here on amd64 and I don't see any reboots. So it
might be config dependent.


On 4/24/05, Ed Tomlinson <[email protected]> wrote:
> On Sunday 24 April 2005 04:41, Alexander Nyberg wrote:
> > s?n 2005-04-24 klockan 00:08 -0400 skrev Parag Warudkar:
> > > While running a 32 bit Java program 2.6.12-rc3 rebooted spontaneously
> leaving
> > > a corrupt partition table and disk with errors. There was nothing in
> dmesg
> > > (no oops/panic) except some -MARK- entries during the reboot.
> > >
> >
> > Is this reproducible? If so, can you give a detailed description of how.
>
> I think rc3 has code from rc2-mm2/3. Both of these reboot here randomly.
> Nothing
> shows up on a serial console... Think something is seriously wrong with
> x86_64 in rc3.
> That being said its possible its fixed in HEAD by.
>
> [PATCH] x86_64: fix new out of line put_user()
> [PATCH] x86_64: Bug in new out of line put_user()
>
> some people have reported reversing (in -mm)
>
> sched-unlocked-context-switches.patch
>
> Helps too.
>
> Ed Tomlinson
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

--
Time is what you make of it

2005-04-24 21:59:04

by Parag Warudkar

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

I haven't tried to reproduce it since it will cause me data loss :(
I am not sure it is possible to go to this extent in order to reproduce
the reboot but still... Here is what I was doing when the error occured -
I installed Ubuntu AMD64 5.04 release. Installed 2.6.12-rc3.
Installed and configured a chroot env for 32bit programs. (dchroot)
Installed 32bit JRE from Sun - latest one. Tried to install
Borland Together J Eclipse edition under the chrooted environment
- Just before the install finished, machine rebooted. Machine was
running for couple hours before the reboot.

Parag


> On Sunday 24 April 2005 04:41, Alexander Nyberg wrote:
> > s?n 2005-04-24 klockan 00:08 -0400 skrev Parag Warudkar:
> > > While running a 32 bit Java program 2.6.12-rc3 rebooted spontaneously
> leaving
> > > a corrupt partition table and disk with errors. There was nothing in dmesg
> > > (no oops/panic) except some -MARK- entries during the reboot.
> > >
> >
> > Is this reproducible? If so, can you give a detailed description of how.
>
> I think rc3 has code from rc2-mm2/3. Both of these reboot here randomly.
> Nothing
> shows up on a serial console... Think something is seriously wrong with x86_64
> in rc3.
> That being said its possible its fixed in HEAD by.
>
> [PATCH] x86_64: fix new out of line put_user()
> [PATCH] x86_64: Bug in new out of line put_user()
>
> some people have reported reversing (in -mm)
>
> sched-unlocked-context-switches.patch
>
> Helps too.
>
> Ed Tomlinson

2005-04-25 08:44:43

by Patrick McHardy

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

Ed Tomlinson wrote:
> I think rc3 has code from rc2-mm2/3. Both of these reboot here randomly. Nothing
> shows up on a serial console... Think something is seriously wrong with x86_64 in rc3.
> That being said its possible its fixed in HEAD by.
>
> [PATCH] x86_64: fix new out of line put_user()
> [PATCH] x86_64: Bug in new out of line put_user()

I'm seeing the same problem with a fresh git checkout when running uml
or gcc in 32bit mode. Nothing is received from netconsole. If anyone
can suggest which patches might be worth reverting I'll try that.

Regards
Patrick

2005-04-25 15:48:32

by Andi Kleen

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

On Mon, Apr 25, 2005 at 10:44:33AM +0200, Patrick McHardy wrote:
> Ed Tomlinson wrote:
> >I think rc3 has code from rc2-mm2/3. Both of these reboot here randomly.
> >Nothing
> >shows up on a serial console... Think something is seriously wrong with
> >x86_64 in rc3.
> >That being said its possible its fixed in HEAD by.
> >
> >[PATCH] x86_64: fix new out of line put_user()
> >[PATCH] x86_64: Bug in new out of line put_user()
>
> I'm seeing the same problem with a fresh git checkout when running uml
> or gcc in 32bit mode. Nothing is received from netconsole. If anyone
> can suggest which patches might be worth reverting I'll try that.

Well, you can revert all my x86-64 changes for testing that went
in after rc2. Does that make a difference? If yes then please
do a binary search or give me a test case that shows the problem.

-Andi

2005-04-25 18:15:07

by Alexander Nyberg

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

> > >I think rc3 has code from rc2-mm2/3. Both of these reboot here randomly.
> > >Nothing
> > >shows up on a serial console... Think something is seriously wrong with
> > >x86_64 in rc3.
> > >That being said its possible its fixed in HEAD by.
> > >
> > >[PATCH] x86_64: fix new out of line put_user()
> > >[PATCH] x86_64: Bug in new out of line put_user()
> >
> > I'm seeing the same problem with a fresh git checkout when running uml
> > or gcc in 32bit mode. Nothing is received from netconsole. If anyone
> > can suggest which patches might be worth reverting I'll try that.
>
> Well, you can revert all my x86-64 changes for testing that went
> in after rc2. Does that make a difference? If yes then please
> do a binary search or give me a test case that shows the problem.
>

Usually gives in after about 20 minutes of gcc compiling, sometimes even
up to 40 minutes. I had 2.6.12-rc2 stand for 2-3 hours so it seems ok.
If anyone has a better recipe for it please do tell.

It doesn't appear to be any of the obvious patch candidates...

2005-04-26 00:04:42

by Patrick McHardy

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

Alexander Nyberg wrote:
> Usually gives in after about 20 minutes of gcc compiling, sometimes even
> up to 40 minutes. I had 2.6.12-rc2 stand for 2-3 hours so it seems ok.
> If anyone has a better recipe for it please do tell.

uml seems to trigger the bug faster, I'm usually seeing it within
a couple of minutes. I'm doing a binary search now.

> It doesn't appear to be any of the obvious patch candidates...

Which ones did you try?

2005-04-26 08:26:22

by Alexander Nyberg

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

tis 2005-04-26 klockan 02:04 +0200 skrev Patrick McHardy:
> Alexander Nyberg wrote:
> > Usually gives in after about 20 minutes of gcc compiling, sometimes even
> > up to 40 minutes. I had 2.6.12-rc2 stand for 2-3 hours so it seems ok.
> > If anyone has a better recipe for it please do tell.
>
> uml seems to trigger the bug faster, I'm usually seeing it within
> a couple of minutes. I'm doing a binary search now.

Ok, I'll try to compile UML, most of the times it doesn't work out for
me, but having something that triggers more reliably would be useful.
Sometimes it has taken even over an hour with gcc.

> > It doesn't appear to be any of the obvious patch candidates...
>
> Which ones did you try?

freepgt series and everything that directly touches ia32 emulation
between rc2 and rc3. The problem is I have only my saved commit mails
right now and some x64 patches seem to be missing, I need to learn git
and start reversing stuff from there...

No snapshots available either, whoever introduced the bug chose a bad
timing ;-)

2005-04-26 13:05:19

by Patrick McHardy

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

Andi Kleen wrote:
> Well, you can revert all my x86-64 changes for testing that went
> in after rc2. Does that make a difference? If yes then please
> do a binary search or give me a test case that shows the problem.

It's this one:

[PATCH] x86_64: Fix a small missing schedule race

Uml seems to be a good testcase, the box reliably reboots while uml
is starting up, usually shortly after the root-filesystem has been
mounted.

Regards
Patrick

2005-04-26 13:53:26

by Andi Kleen

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

On Tue, Apr 26, 2005 at 03:04:47PM +0200, Patrick McHardy wrote:
> Andi Kleen wrote:
> >Well, you can revert all my x86-64 changes for testing that went
> >in after rc2. Does that make a difference? If yes then please
> >do a binary search or give me a test case that shows the problem.
>
> It's this one:
>
> [PATCH] x86_64: Fix a small missing schedule race
>
> Uml seems to be a good testcase, the box reliably reboots while uml
> is starting up, usually shortly after the root-filesystem has been
> mounted.

Hmm, thats hard to believe. And are you sure the NMI watchdog
did not trigger? (e.g. did you run it with serial or netconsole)?

-Andi

2005-04-26 13:57:31

by Patrick McHardy

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

Andi Kleen wrote:
> Hmm, thats hard to believe. And are you sure the NMI watchdog
> did not trigger? (e.g. did you run it with serial or netconsole)?

I ran it with netconsole, but nothing was received. I'm going to retry
with and without the patch once more to make sure.

Regards
Patrick

2005-04-26 14:19:20

by Patrick McHardy

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

Patrick McHardy wrote:
> Andi Kleen wrote:
>
>> Hmm, thats hard to believe. And are you sure the NMI watchdog
>> did not trigger? (e.g. did you run it with serial or netconsole)?
>
> I ran it with netconsole, but nothing was received. I'm going to retry
> with and without the patch once more to make sure.

I tried starting uml with and without the patch, five times each.
Without the patch it worked fine every time, with the patch it
instantly rebooted every time. Nothing was logged on netconsole.
Perhaps also interesting is that if I revert all patches after
this change up to HEAD, it doesn't always instantly reboot, but
sometimes just hangs. When using HEAD, it rebooted each time so far.

Regards
Patrick

2005-04-26 14:23:30

by Andi Kleen

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

On Tue, Apr 26, 2005 at 04:18:58PM +0200, Patrick McHardy wrote:
> Patrick McHardy wrote:
> >Andi Kleen wrote:
> >
> >>Hmm, thats hard to believe. And are you sure the NMI watchdog
> >>did not trigger? (e.g. did you run it with serial or netconsole)?
> >
> >I ran it with netconsole, but nothing was received. I'm going to retry
> >with and without the patch once more to make sure.
>
> I tried starting uml with and without the patch, five times each.
> Without the patch it worked fine every time, with the patch it
> instantly rebooted every time. Nothing was logged on netconsole.
> Perhaps also interesting is that if I revert all patches after
> this change up to HEAD, it doesn't always instantly reboot, but
> sometimes just hangs. When using HEAD, it rebooted each time so far.

Ok thanks for the information. I will stare a bit at the patch.

It is very mysterious though. Even if the patch was somehow wrong
the worst thing that could happen is that you end up with interrupts
off when you shouldnt, and the NMI watchdog is very good
at catching that.

Hmm actually - on some systems I broke the NMI watchdog. Can you
check your dmesg to see if check_nmi_watchdog doesnt report it
as stuck? If yes please put a return on top of check_nmi_watchdog
that should fix it. You can verify it works by looking at the
per CPU NMI counters in /proc/interrupts. An nmi watchdog
backtrace would be nice to see.

-Andi

2005-04-26 14:51:47

by Patrick McHardy

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

Andi Kleen wrote:
> Hmm actually - on some systems I broke the NMI watchdog. Can you
> check your dmesg to see if check_nmi_watchdog doesnt report it
> as stuck? If yes please put a return on top of check_nmi_watchdog
> that should fix it. You can verify it works by looking at the
> per CPU NMI counters in /proc/interrupts. An nmi watchdog
> backtrace would be nice to see.

No occurences of check_nmi_watchdog in dmesg or my logs, just
"Using local APIC NMI watchdog using perfctr0". /proc/interrupts
shows:

NMI: 181

Regards
Patrick

2005-04-26 14:54:00

by Andi Kleen

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

On Tue, Apr 26, 2005 at 04:51:29PM +0200, Patrick McHardy wrote:
> Andi Kleen wrote:
> >Hmm actually - on some systems I broke the NMI watchdog. Can you
> >check your dmesg to see if check_nmi_watchdog doesnt report it
> >as stuck? If yes please put a return on top of check_nmi_watchdog
> >that should fix it. You can verify it works by looking at the
> >per CPU NMI counters in /proc/interrupts. An nmi watchdog
> >backtrace would be nice to see.
>
> No occurences of check_nmi_watchdog in dmesg or my logs, just
> "Using local APIC NMI watchdog using perfctr0". /proc/interrupts
> shows:
>
> NMI: 181

It is Checking NMI watchdog .... if you see "stuck" after that it is
broken. Or do the NMIs actually increase when you make the machine
busy (run while true ; do true ; done on each CPU)?

-Andi

2005-04-26 15:06:13

by Patrick McHardy

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

Bootdata ok (command line is root=/dev/md0 ro [email protected]/eth0,[email protected]/ff:ff:ff:ff:ff:ff )
Linux version 2.6.12-rc3-2005-04-26 (kaber@kaber) (gcc-Version 3.3.5 (Debian 1:3.3.5-12)) #4 Tue Apr 26 16:07:53 CEST 2005
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000007ff30000 (usable)
BIOS-e820: 000000007ff30000 - 000000007ff40000 (ACPI data)
BIOS-e820: 000000007ff40000 - 000000007fff0000 (ACPI NVS)
BIOS-e820: 000000007fff0000 - 0000000080000000 (reserved)
BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
ACPI: RSDP (v002 ACPIAM ) @ 0x00000000000faac0
ACPI: XSDT (v001 A M I OEMXSDT 0x08000403 MSFT 0x00000097) @ 0x000000007ff30100
ACPI: FADT (v003 A M I OEMFACP 0x08000403 MSFT 0x00000097) @ 0x000000007ff30290
ACPI: MADT (v001 A M I OEMAPIC 0x08000403 MSFT 0x00000097) @ 0x000000007ff30390
ACPI: OEMB (v001 A M I OEMBIOS 0x08000403 MSFT 0x00000097) @ 0x000000007ff40040
ACPI: DSDT (v001 A0091 A0091006 0x00000006 MSFT 0x0100000d) @ 0x0000000000000000
On node 0 totalpages: 524080
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 519984 pages, LIFO batch:16
HighMem zone: 0 pages, LIFO batch:1
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:12 APIC version 16
ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 1, version 3, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 80000000 (gap: 80000000:7ff80000)
Checking aperture...
CPU 0: aperture @ e0000000 size 256 MB
Built 1 zonelists
Kernel command line: root=/dev/md0 ro [email protected]/eth0,[email protected]/ff:ff:ff:ff:ff:ff
netconsole: local port 4444
netconsole: local IP 172.16.1.123
netconsole: interface eth0
netconsole: remote port 6666
netconsole: remote IP 172.16.195.3
netconsole: remote ethernet address ff:ff:ff:ff:ff:ff
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 131072 bytes)
time.c: Using 1.193182 MHz PIT timer.
time.c: Detected 2002.579 MHz processor.
time.c: Using PIT/TSC based timekeeping.
Console: colour VGA+ 80x25
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
Memory: 2056404k/2096320k available (2487k kernel code, 39224k reserved, 1309k data, 152k init)
Calibrating delay loop... 3923.96 BogoMIPS (lpj=1961984)
Security Framework v1.0.0 initialized
Capability LSM initialized
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU: AMD Athlon(tm) 64 Processor 3000+ stepping 00
Using local APIC NMI watchdog using perfctr0
Using local APIC timer interrupts.
Detected 12.516 MHz APIC timer.
NET: Registered protocol family 16
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20050309
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
PCI: Via IRQ fixup
Boot video device is 0000:01:00.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 10 *11 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 *10 11 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 *5 7 10 11 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
SCSI subsystem initialized
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
TC classifier action (bugs to [email protected] cc [email protected])
agpgart: Detected AGP bridge 0
agpgart: AGP aperture is 256M @ 0xe0000000
PCI-DMA: Disabling IOMMU.
IA32 emulation $Id: sys_ia32.c,v 1.32 2002/03/24 13:02:28 ak Exp $
Initializing Cryptographic API
ACPI: Power Button (FF) [PWRF]
ACPI: Sleep Button (CM) [SLPB]
Linux agpgart interface v0.101 (c) Dave Jones
[drm] Initialized drm 1.0.0 20040925
ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 169
[drm] Initialized mga 3.1.0 20021029 on minor 0: Matrox Graphics, Inc. MGA G550 AGP
Hangcheck: starting hangcheck timer 0.5.0 (tick is 180 seconds, margin is 60 seconds).
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
io scheduler noop registered
io scheduler cfq registered
ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 17 (level, low) -> IRQ 177
ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 17 (level, low) -> IRQ 177
sk98lin: Asus mainboard with buggy VPD? Correcting data.
eth0: Yukon Gigabit Ethernet 10/100/1000Base-T Adapter
PrefPort:A RlmtMode:Check Link State
netconsole: device eth0 not up yet, forcing it
netconsole: carrier detect appears untrustworthy, waiting 4 seconds
eth0: network connection up using port A
speed: 100
autonegotiation: yes
duplex mode: full
flowctrl: symmetric
irq moderation: disabled
scatter-gather: enabled
netconsole: network logging started
libata version 1.10 loaded.
sata_via version 1.1
ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 185
sata_via(0000:00:0f.0): routed to hard irq line 4
ata1: SATA max UDMA/133 cmd 0xE800 ctl 0xE402 bmdma 0xD400 irq 185
ata2: SATA max UDMA/133 cmd 0xE000 ctl 0xD802 bmdma 0xD408 irq 185
ata1: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4003 85:3469 86:3c01 87:4003 88:407f
ata1: dev 0 ATA, max UDMA/133, 390721968 sectors: lba48
ata1: dev 0 configured for UDMA/133
scsi0 : sata_via
ata2: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4003 85:3469 86:3c01 87:4003 88:407f
ata2: dev 0 ATA, max UDMA/133, 390721968 sectors: lba48
ata2: dev 0 configured for UDMA/133
scsi1 : sata_via
Vendor: ATA Model: ST3200822AS Rev: 3.01
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: ST3200822AS Rev: 3.01
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
SCSI device sda: drive cache: write back
sda: sda1 sda2
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sdb: 390721968 512-byte hdwr sectors (200050 MB)
SCSI device sdb: drive cache: write back
SCSI device sdb: 390721968 512-byte hdwr sectors (200050 MB)
SCSI device sdb: drive cache: write back
sdb: sdb1 sdb2
Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0
mice: PS/2 mouse device common for all mice
md: raid0 personality registered as nr 2
md: md driver 0.90.1 MAX_MD_DEVS=256, MD_SB_DISKS=27
Advanced Linux Sound Architecture Driver Version 1.0.9rc2 (Thu Mar 24 10:33:39 2005 UTC).
ACPI: PCI Interrupt 0000:00:11.5[C] -> GSI 22 (level, low) -> IRQ 193
PCI: Setting latency timer of device 0000:00:11.5 to 64
input: AT Translated Set 2 keyboard on isa0060/serio0
logips2pp: Detected unknown logitech mouse model 99
input: ImExPS/2 Logitech Explorer Mouse on isa0060/serio1
codec_read: codec 0 is not valid [0xfe0000]
codec_read: codec 0 is not valid [0xfe0000]
codec_read: codec 0 is not valid [0xfe0000]
codec_read: codec 0 is not valid [0xfe0000]
ALSA device list:
#0: VIA 8237 with AD1980 at 0xc800, irq 193
NET: Registered protocol family 2
IP: routing cache hash table of 16384 buckets, 128Kbytes
TCP established hash table entries: 524288 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 7, 524288 bytes)
TCP: Hash tables configured (established 524288 bind 65536)
NET: Registered protocol family 10
Disabled Privacy Extensions on device ffffffff80466480(lo)
IPv6 over IPv4 tunneling driver
powernow-k8: Found 1 AMD Athlon 64 / Opteron processors (version 1.00.09e)
powernow-k8: 0 : fid 0x2 (1000 MHz), vid 0x12 (1100 mV)
powernow-k8: 1 : fid 0xa (1800 MHz), vid 0x6 (1400 mV)
powernow-k8: 2 : fid 0xc (2000 MHz), vid 0x2 (1500 mV)
cpu_init done, current fid 0xc, vid 0x0
powernow-k8: ph2 null fid transition 0xc
ACPI wakeup devices:
PCI0 PS2K PS2M UAR1 AC97 USB1 USB2 USB3 USB4 EHCI PWRB SLPB
ACPI: (supports S0 S1 S3 S4 S5)
md: Autodetecting RAID arrays.
md: autorun ...
md: considering sdb2 ...
md: adding sdb2 ...
md: adding sda2 ...
md: created md0
md: bind<sda2>
md: bind<sdb2>
md: running: <sdb2><sda2>
md0: setting max_sectors to 128, segment boundary to 32767
raid0: looking at sdb2
raid0: comparing sdb2(194378368) with sdb2(194378368)
raid0: END
raid0: ==> UNIQUE
raid0: 1 zones
raid0: looking at sda2
raid0: comparing sda2(194378368) with sdb2(194378368)
raid0: EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 388756736 blocks.
raid0 : conf->hash_spacing is 388756736 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 8 bytes for hash.
md: ... autorun DONE.
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
EXT3-fs: md0: orphan cleanup on readonly fs
ext3_orphan_cleanup: deleting unreferenced inode 18105388
ext3_orphan_cleanup: deleting unreferenced inode 18105387
ext3_orphan_cleanup: deleting unreferenced inode 18105386
ext3_orphan_cleanup: deleting unreferenced inode 18104335
EXT3-fs: md0: 4 orphan inodes deleted
EXT3-fs: recovery complete.
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 152k freed
NET: Registered protocol family 1
Adding 979924k swap on /dev/sdb1. Priority:-1 extents:1
EXT3 FS on md0, internal journal
Generic RTC Driver v1.07
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
VP_IDE: IDE controller at PCI slot 0000:00:0f.1
ACPI: PCI Interrupt 0000:00:0f.1[A] -> GSI 20 (level, low) -> IRQ 185
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt8237 (rev 00) IDE UDMA133 controller on pci0000:00:0f.1
ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:DMA, hdd:pio
Probing IDE interface ide0...
hda: PLEXTOR DVD-ROM PX-116A2 0100, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: ATAPI 48X DVD-ROM drive, 256kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
Probing IDE interface ide1...
hdc: HDS722540VLAT20, ATA DISK drive
ide1 at 0x170-0x177,0x376 on irq 15
USB Universal Host Controller Interface driver v2.2
ACPI: PCI Interrupt 0000:00:10.0[A] -> GSI 21 (level, low) -> IRQ 201
uhci_hcd 0000:00:10.0: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller
uhci_hcd 0000:00:10.0: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:10.0: irq 201, io base 0x0000dc00
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:10.1[A] -> GSI 21 (level, low) -> IRQ 201
uhci_hcd 0000:00:10.1: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (#2)
uhci_hcd 0000:00:10.1: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:10.1: irq 201, io base 0x0000ec00
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:10.2[B] -> GSI 21 (level, low) -> IRQ 201
uhci_hcd 0000:00:10.2: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (#3)
eth0: no IPv6 routers present
uhci_hcd 0000:00:10.2: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:10.2: irq 201, io base 0x0000c000
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:10.3[B] -> GSI 21 (level, low) -> IRQ 201
uhci_hcd 0000:00:10.3: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (#4)
uhci_hcd 0000:00:10.3: new USB bus registered, assigned bus number 4
uhci_hcd 0000:00:10.3: irq 201, io base 0x0000c400
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
hdc: max request size: 1024KiB
hdc: 80418240 sectors (41174 MB) w/1794KiB Cache, CHS=16383/255/63, UDMA(100)
hdc: cache flushes supported
hdc: hdc1 hdc2


Attachments:
dmesg (12.65 kB)

2005-04-26 20:28:43

by Patrick McHardy

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

Index: arch/x86_64/kernel/entry.S
===================================================================
--- 585883113da6fe9142de95138c8ed8ca898a4ccc/arch/x86_64/kernel/entry.S (mode:100644 sha1:3233a15cc4e074c00b75569f21c2844ee280b214)
+++ uncommitted/arch/x86_64/kernel/entry.S (mode:100644)
@@ -307,6 +307,7 @@
1: movl $_TIF_NEED_RESCHED,%edi
int_restore_rest:
RESTORE_REST
+ cli
jmp int_with_check
CFI_ENDPROC


Attachments:
x (426.00 B)

2005-04-27 12:59:37

by Andi Kleen

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot


Can you please test this patch?

It should fix the problem.

Add missing clis

Signed-off-by: Andi Kleen <[email protected]>

diff -u linux-2.6.12rc3/arch/x86_64/kernel/entry.S-o linux-2.6.12rc3/arch/x86_64/kernel/entry.S
--- linux-2.6.12rc3/arch/x86_64/kernel/entry.S-o 2005-04-22 12:48:11.000000000 +0200
+++ linux-2.6.12rc3/arch/x86_64/kernel/entry.S 2005-04-27 14:52:41.998972964 +0200
@@ -296,6 +296,7 @@
call syscall_trace_leave
popq %rdi
andl $~(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SINGLESTEP),%edi
+ cli
jmp int_restore_rest

int_signal:
@@ -307,6 +308,7 @@
1: movl $_TIF_NEED_RESCHED,%edi
int_restore_rest:
RESTORE_REST
+ cli
jmp int_with_check
CFI_ENDPROC

2005-04-27 22:05:40

by Patrick McHardy

[permalink] [raw]
Subject: Re: X86_64: 2.6.12-rc3 spontaneous reboot

Andi Kleen wrote:
> Can you please test this patch?

It works fine, thanks.