2008-01-28 02:22:49

by Gene Heskett

Subject: Problem with ata layer in 2.6.24

Greeting;

I had to reboot early this morning due to a freezeup, and I had a
bunch of these in the messages log:
==============
Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jan 27 19:42:11 coyote kernel: [42461.915973] ata1.00: cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 dma 4096 out
Jan 27 19:42:11 coyote kernel: [42461.915974] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 27 19:42:11 coyote kernel: [42461.915978] ata1.00: status: { DRDY }
Jan 27 19:42:11 coyote kernel: [42461.916005] ata1: soft resetting link
Jan 27 19:42:12 coyote kernel: [42462.078216] ata1.00: configured for UDMA/100
Jan 27 19:42:12 coyote kernel: [42462.078232] ata1: EH complete
Jan 27 19:42:12 coyote kernel: [42462.090700] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
Jan 27 19:42:12 coyote kernel: [42462.114230] sd 0:0:0:0: [sda] Write Protect is off
Jan 27 19:42:12 coyote kernel: [42462.115079] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
===============
That one showed up about 2 hours ago, so I expect I'll be locked
up again before I've managed a 24 hour uptime. This drive passed
a 'smartctl -t long /dev/sda' with flying colors after the reboot
this morning.
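
For readers following along, the `cmd` line in a stanza like the one above can be decoded by hand. What follows is a rough Python sketch and not part of the original report. It assumes libata prints the taskfile as command/feature:nsect:lbal:lbam:lbah/(five "hob" bytes)/device, and it uses a small opcode table taken from the ATA command set. The names `decode_cmd`, `ATA_OPCODES` and `CMD_RE` are made up for this illustration.

```python
import re

# A few ATA opcodes relevant to this thread: (name, uses 48-bit LBA).
ATA_OPCODES = {
    0xC8: ("READ DMA", False),
    0xCA: ("WRITE DMA", False),
    0x25: ("READ DMA EXT", True),
    0x35: ("WRITE DMA EXT", True),
}

# Assumed layout: cmd CC/FF:NS:LL:LM:LH/hf:hn:hl:hm:hh/DV tag N ...
CMD_RE = re.compile(
    r"cmd ([0-9a-f]{2})/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})/([0-9a-f]{2}) tag (\d+)"
)

def decode_cmd(line):
    """Return a dict describing the taskfile in a libata 'cmd' line, or None."""
    m = CMD_RE.search(line)
    if m is None:
        return None
    command = int(m.group(1), 16)
    _feature, nsect, lbal, lbam, lbah = (int(b, 16) for b in m.group(2).split(":"))
    _hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(b, 16) for b in m.group(3).split(":")
    )
    device = int(m.group(4), 16)
    # Opcodes missing from the table are treated as 28-bit, which may be wrong.
    name, is_lba48 = ATA_OPCODES.get(command, ("unknown", False))
    low24 = (lbah << 16) | (lbam << 8) | lbal
    if is_lba48:
        lba = (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24) | low24
        sectors = ((hob_nsect << 8) | nsect) or 65536  # a count of 0 means 65536
    else:
        lba = ((device & 0x0F) << 24) | low24  # LBA bits 27:24 live in the device byte
        sectors = nsect or 256                 # a count of 0 means 256
    return {"opcode": name, "lba": lba, "sectors": sectors, "tag": int(m.group(5))}

if __name__ == "__main__":
    sample = ("Jan 27 19:42:11 coyote kernel: [42461.915973] ata1.00: "
              "cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 dma 4096 out")
    info = decode_cmd(sample)
    print(info["opcode"], "lba", info["lba"], "sectors", info["sectors"])
```

For the first stanza this yields a WRITE DMA of 8 sectors, which agrees with the "dma 4096 out" in the same log line (8 x 512 bytes).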

Two instances were logged after I had rebooted to 2.6.24 from 2.6.24-rc8:

Jan 24 20:46:33 coyote kernel: [ 0.000000] Linux version 2.6.24 ([email protected]) (gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)) #1 SMP Thu Jan 24 20:17:55 EST 2008
----
Jan 27 02:28:29 coyote kernel: [193207.445158] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jan 27 02:28:29 coyote kernel: [193207.445170] ata1.00: cmd 35/00:08:f9:24:0a/00:00:17:00:00/e0 tag 0 dma 4096 out
Jan 27 02:28:29 coyote kernel: [193207.445172] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 27 02:28:29 coyote kernel: [193207.445175] ata1.00: status: { DRDY }
Jan 27 02:28:29 coyote kernel: [193207.445202] ata1: soft resetting link
Jan 27 02:28:29 coyote kernel: [193207.607384] ata1.00: configured for UDMA/100
Jan 27 02:28:29 coyote kernel: [193207.607399] ata1: EH complete
Jan 27 02:28:29 coyote kernel: [193207.609681] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
Jan 27 02:28:29 coyote kernel: [193207.619277] sd 0:0:0:0: [sda] Write Protect is off
Jan 27 02:28:29 coyote kernel: [193207.649041] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jan 27 02:30:06 coyote kernel: [193304.336929] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jan 27 02:30:06 coyote kernel: [193304.336940] ata1.00: cmd ca/00:20:69:22:a6/00:00:00:00:00/e7 tag 0 dma 16384 out
Jan 27 02:30:06 coyote kernel: [193304.336942] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 27 02:30:06 coyote kernel: [193304.336945] ata1.00: status: { DRDY }
Jan 27 02:30:06 coyote kernel: [193304.336972] ata1: soft resetting link
Jan 27 02:30:06 coyote kernel: [193304.499210] ata1.00: configured for UDMA/100
Jan 27 02:30:06 coyote kernel: [193304.499226] ata1: EH complete
Jan 27 02:30:06 coyote kernel: [193304.499714] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
Jan 27 02:30:06 coyote kernel: [193304.499857] sd 0:0:0:0: [sda] Write Protect is off
Jan 27 02:30:06 coyote kernel: [193304.502315] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

None were logged during the time I was running an -rc7 or -rc8.
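
One way to check a statement like this against a long messages log is to count the exception stanzas per booted kernel. The following Python sketch is only an illustration and was not used in the report. It assumes syslog-style lines like the ones quoted above, where each boot starts with a "Linux version" line. The function name `exceptions_by_kernel` is hypothetical.

```python
import re
from collections import Counter

# Both patterns assume the "kernel: [timestamp] ..." layout seen in this thread.
VERSION_RE = re.compile(r"kernel: \[\s*[\d.]+\] Linux version (\S+)")
EXCEPTION_RE = re.compile(r"kernel: \[\s*[\d.]+\] ata\d+\.\d+: exception Emask")

def exceptions_by_kernel(lines):
    """Map each kernel version seen in the log to its number of ATA exceptions."""
    counts = Counter()
    current = "unknown"  # used for lines that appear before any boot banner
    for line in lines:
        m = VERSION_RE.search(line)
        if m:
            current = m.group(1)
            counts.setdefault(current, 0)  # list kernels with zero hits too
            continue
        if EXCEPTION_RE.search(line):
            counts[current] += 1
    return dict(counts)

if __name__ == "__main__":
    # Lines shortened from the log excerpts in this message.
    sample = [
        "Jan 24 20:46:33 coyote kernel: [ 0.000000] Linux version 2.6.24 (...)",
        "Jan 27 02:28:29 coyote kernel: [193207.445158] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen",
        "Jan 27 02:30:06 coyote kernel: [193304.336929] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen",
    ]
    print(exceptions_by_kernel(sample))
```

A kernel that booted but logged no exceptions still shows up with a count of 0, so an absence of hits is visible rather than silent.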

The previous hits on this resulted in the udma speed being downgraded
till it was actually running in pio just before the freeze that
required the hardware reset button.
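
A downgrade of this kind should be visible in the "configured for ..." lines that follow each reset. As a hedged sketch (not something run for this report), the Python below collects the sequence of transfer modes per device from a list of log lines. It assumes the "ataN.NN: configured for MODE" wording shown in the stanzas above. The name `mode_history` is invented for the example.

```python
import re

# Matches e.g. "ata1.00: configured for UDMA/100".
MODE_RE = re.compile(r"(ata\d+\.\d+): configured for (\S+)")

def mode_history(lines):
    """Return {device: [modes in order]}, collapsing consecutive repeats."""
    history = {}
    for line in lines:
        m = MODE_RE.search(line)
        if not m:
            continue
        dev, mode = m.groups()
        modes = history.setdefault(dev, [])
        if not modes or modes[-1] != mode:
            modes.append(mode)
    return history

if __name__ == "__main__":
    # Two lines taken from the stanzas in this message.
    sample = [
        "Jan 27 02:28:29 coyote kernel: [193207.607384] ata1.00: configured for UDMA/100",
        "Jan 27 02:30:06 coyote kernel: [193304.499210] ata1.00: configured for UDMA/100",
    ]
    for dev, modes in mode_history(sample).items():
        print(dev, " -> ".join(modes))
```

A device whose list has more than one entry changed mode at some point. In the excerpts posted here the drive is reconfigured for UDMA/100 every time, so the list stays at a single entry.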

I'll reboot to -rc8 right now and resume. If it's the drive, I should see it.
If not, then 2.6.24 is where I'll point the finger.

Ideas, anyone?

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Yow! Am I in Milwaukee?


2008-01-28 03:19:55

by Kasper Sandberg

Subject: Re: Problem with ata layer in 2.6.24

On Sun, 2008-01-27 at 21:22 -0500, Gene Heskett wrote:
> Greeting;
>
<snip>
> None were logged during the time I was running an -rc7 or -rc8.
>
> The previous hits on this resulted in the udma speed being downgraded
> till it was actually running in pio just before the freeze that
> required the hardware reset button.
>
> I'll reboot to -rc8 right now and resume. If it's the drive, I should see it.
> If not, then 2.6.24 is where I'll point the finger.
>
> Ideas, anyone?
I believe there is some sort of bug in libata, not just for this kernel
version.

I run a fileserver with .20, and I get these resets a few times a day.
No freezes though, except while a reset is in progress; then all I/O
stalls and locks up the applications using it, but it passes after ~30 sec.

I know that since Ubuntu started shipping with libata by default for
IDE, a large number of people have been seeing these intermittent freezes
as well (which pass after half a minute or less).

I reported this before; however, as far as I know, no cause, much
less a fix, has been found.

It would be great to get this solved though.


>
> --
> Cheers, Gene
> "There are four boxes to be used in defense of liberty:
> soap, ballot, jury, and ammo. Please use in that order."
> -Ed Howdershelt (Author)
> Yow! Am I in Milwaukee?

2008-01-28 08:17:31

by Mikael Pettersson

Subject: Re: Problem with ata layer in 2.6.24

Gene Heskett writes:
> Greeting;
>
> I had to reboot early this morning due to a freezeup, and I had a
> bunch of these in the messages log:
> ==============
> Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> Jan 27 19:42:11 coyote kernel: [42461.915973] ata1.00: cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 dma 4096 out
> Jan 27 19:42:11 coyote kernel: [42461.915974] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> Jan 27 19:42:11 coyote kernel: [42461.915978] ata1.00: status: { DRDY }
> Jan 27 19:42:11 coyote kernel: [42461.916005] ata1: soft resetting link
> Jan 27 19:42:12 coyote kernel: [42462.078216] ata1.00: configured for UDMA/100
> Jan 27 19:42:12 coyote kernel: [42462.078232] ata1: EH complete
> Jan 27 19:42:12 coyote kernel: [42462.090700] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> Jan 27 19:42:12 coyote kernel: [42462.114230] sd 0:0:0:0: [sda] Write Protect is off
> Jan 27 19:42:12 coyote kernel: [42462.115079] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> ===============
> That one showed up about 2 hours ago, so I expect I'll be locked
> up again before I've managed a 24 hour uptime. This drive passed
> a 'smartctl -t long /dev/sda' with flying colors after the reboot
> this morning.
>
> Two instances were logged after I had rebooted to 2.6.24 from 2.6.24-rc8:
>
> Jan 24 20:46:33 coyote kernel: [ 0.000000] Linux version 2.6.24 ([email protected]) (gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)) #1 SMP Thu Jan 24 20:17:55 EST 2008
> ----
> Jan 27 02:28:29 coyote kernel: [193207.445158] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> Jan 27 02:28:29 coyote kernel: [193207.445170] ata1.00: cmd 35/00:08:f9:24:0a/00:00:17:00:00/e0 tag 0 dma 4096 out
> Jan 27 02:28:29 coyote kernel: [193207.445172] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> Jan 27 02:28:29 coyote kernel: [193207.445175] ata1.00: status: { DRDY }
> Jan 27 02:28:29 coyote kernel: [193207.445202] ata1: soft resetting link
> Jan 27 02:28:29 coyote kernel: [193207.607384] ata1.00: configured for UDMA/100
> Jan 27 02:28:29 coyote kernel: [193207.607399] ata1: EH complete
> Jan 27 02:28:29 coyote kernel: [193207.609681] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> Jan 27 02:28:29 coyote kernel: [193207.619277] sd 0:0:0:0: [sda] Write Protect is off
> Jan 27 02:28:29 coyote kernel: [193207.649041] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Jan 27 02:30:06 coyote kernel: [193304.336929] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> Jan 27 02:30:06 coyote kernel: [193304.336940] ata1.00: cmd ca/00:20:69:22:a6/00:00:00:00:00/e7 tag 0 dma 16384 out
> Jan 27 02:30:06 coyote kernel: [193304.336942] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> Jan 27 02:30:06 coyote kernel: [193304.336945] ata1.00: status: { DRDY }
> Jan 27 02:30:06 coyote kernel: [193304.336972] ata1: soft resetting link
> Jan 27 02:30:06 coyote kernel: [193304.499210] ata1.00: configured for UDMA/100
> Jan 27 02:30:06 coyote kernel: [193304.499226] ata1: EH complete
> Jan 27 02:30:06 coyote kernel: [193304.499714] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> Jan 27 02:30:06 coyote kernel: [193304.499857] sd 0:0:0:0: [sda] Write Protect is off
> Jan 27 02:30:06 coyote kernel: [193304.502315] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>
> None were logged during the time I was running an -rc7 or -rc8.
>
> The previous hits on this resulted in the udma speed being downgraded
> till it was actually running in pio just before the freeze that
> required the hardware reset button.
>
> I'll reboot to -rc8 right now and resume. If it's the drive, I should see it.
> If not, then 2.6.24 is where I'll point the finger.
>
> Ideas, anyone?

1. Wrong mailing list; use linux-ide (@vger) instead.
2. Incomplete dmesg; in particular, we can't see what your hardware is.
Just post the complete dmesg.

2008-01-28 12:03:17

by Peter Zijlstra

Subject: Re: Problem with ata layer in 2.6.24


On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote:

> 1. Wrong mailing list; use linux-ide (@vger) instead.

What, and keep all us other interested people in the dark?

2008-01-28 12:26:31

by Mikael Pettersson

Subject: Re: Problem with ata layer in 2.6.24

Peter Zijlstra writes:
>
> On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote:
>
> > 1. Wrong mailing list; use linux-ide (@vger) instead.
>
> What, and keep all us other interested people in the dark?

MAINTAINERS clearly lists linux-ide as the primary mailing
list for all things IDE/ATA.

The original report only went to LKML, thus it has a high
chance of being missed or ignored by those most capable of
dealing with it.

If a topic is of general interest a simple Cc: lkml will
keep other parties in the loop.

2008-01-28 12:46:19

by Ingo Molnar

Subject: Re: Problem with ata layer in 2.6.24


* Mikael Pettersson <[email protected]> wrote:

> Peter Zijlstra writes:
> >
> > On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote:
> >
> > > 1. Wrong mailing list; use linux-ide (@vger) instead.
> >
> > What, and keep all us other interested people in the dark?
>
> MAINTAINERS clearly lists linux-ide as the primary mailing list for
> all things IDE/ATA.
>
> The original report only went to LKML, thus it has a high chance of
> being missed or ignored by those most capable of dealing with it.

that is a fatal misunderstanding on your part. lkml is a perfectly fine
place to report Linux bugs, why should testers be aware of the zillions
of tiny, mostly irrelevant lists mentioned in the MAINTAINERS file?

Maintainers are required to read lkml for bugreports regarding their
subsystems - not the other way around. If a tester manages to Cc: a
maintainer (be that a person or a list alias) that's a bonus, but not a
requirement at all ...

Ingo

2008-01-28 12:55:43

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Peter Zijlstra wrote:
>On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote:
>> 1. Wrong mailing list; use linux-ide (@vger) instead.
>
>What, and keep all us other interested people in the dark?

As a test, I tried rebooting to the latest Fedora kernel and found it kills X,
so I'm back to the second-to-last Fedora version ATM, and the
third 'smartctl -t long /dev/sda' in 24 hours is running now. The first two
completed with no errors.

I've added the linux-ide list to refresh those people on the problem;
the logs are being spammed by this message stanza:

Jan 28 04:46:25 coyote kernel: [26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jan 28 04:46:25 coyote kernel: [26550.290028] ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out
Jan 28 04:46:25 coyote kernel: [26550.290029] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 28 04:46:25 coyote kernel: [26550.290032] ata1.00: status: { DRDY }
Jan 28 04:46:25 coyote kernel: [26550.290060] ata1: soft resetting link
Jan 28 04:46:25 coyote kernel: [26550.452301] ata1.00: configured for UDMA/100
Jan 28 04:46:25 coyote kernel: [26550.452318] ata1: EH complete
Jan 28 04:46:25 coyote kernel: [26550.455898] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
Jan 28 04:46:25 coyote kernel: [26550.456151] sd 0:0:0:0: [sda] Write Protect is off
Jan 28 04:46:25 coyote kernel: [26550.456403] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA


And it just did it again, using the Fedora kernel, but without logging
anything at all when it froze. In other words, I had to reboot between
the word "list" and the word "to" above. So now I'm booted to 2.6.24-rc7.

Before it crashes again, here is the dmesg:
[ 0.000000] Linux version 2.6.24-rc7 ([email protected]) (gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)) #1 SMP Mon Jan 14 10:00:40 EST 2008
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
[ 0.000000] BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
[ 0.000000] BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
[ 0.000000] BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
[ 0.000000] BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
[ 0.000000] BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
[ 0.000000] BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
[ 0.000000] 127MB HIGHMEM available.
[ 0.000000] 896MB LOWMEM available.
[ 0.000000] Entering add_active_range(0, 0, 262128) 0 entries of 256 used
[ 0.000000] Zone PFN ranges:
[ 0.000000] DMA 0 -> 4096
[ 0.000000] Normal 4096 -> 229376
[ 0.000000] HighMem 229376 -> 262128
[ 0.000000] Movable zone start PFN for each node
[ 0.000000] early_node_map[1] active PFN ranges
[ 0.000000] 0: 0 -> 262128
[ 0.000000] On node 0 totalpages: 262128
[ 0.000000] DMA zone: 32 pages used for memmap
[ 0.000000] DMA zone: 0 pages reserved
[ 0.000000] DMA zone: 4064 pages, LIFO batch:0
[ 0.000000] Normal zone: 1760 pages used for memmap
[ 0.000000] Normal zone: 223520 pages, LIFO batch:31
[ 0.000000] HighMem zone: 255 pages used for memmap
[ 0.000000] HighMem zone: 32497 pages, LIFO batch:7
[ 0.000000] Movable zone: 0 pages used for memmap
[ 0.000000] DMI 2.2 present.
[ 0.000000] ACPI: RSDP 000F7220, 0014 (r0 Nvidia)
[ 0.000000] ACPI: RSDT 3FFF3000, 002C (r1 Nvidia AWRDACPI 42302E31 AWRD 0)
[ 0.000000] ACPI: FACP 3FFF3040, 0074 (r1 Nvidia AWRDACPI 42302E31 AWRD 0)
[ 0.000000] ACPI: DSDT 3FFF30C0, 4CC4 (r1 NVIDIA AWRDACPI 1000 MSFT 100000E)
[ 0.000000] ACPI: FACS 3FFF0000, 0040
[ 0.000000] ACPI: APIC 3FFF7DC0, 006E (r1 Nvidia AWRDACPI 42302E31 AWRD 0)
[ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
[ 0.000000] If you got timer trouble try acpi_use_timer_override
[ 0.000000] ACPI: PM-Timer IO Port: 0x4008
[ 0.000000] ACPI: Local APIC address 0xfee00000
[ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[ 0.000000] Processor #0 6:10 APIC version 16
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
[ 0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
[ 0.000000] IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-23
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.000000] ACPI: BIOS IRQ0 pin2 override ignored.
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)
[ 0.000000] ACPI: IRQ9 used by override.
[ 0.000000] ACPI: IRQ14 used by override.
[ 0.000000] ACPI: IRQ15 used by override.
[ 0.000000] Enabling APIC mode: Flat. Using 1 I/O APICs
[ 0.000000] Using ACPI (MADT) for SMP configuration information
[ 0.000000] Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000)
[ 0.000000] swsusp: Registered nosave memory region: 000000000009f000 - 00000000000a0000
[ 0.000000] swsusp: Registered nosave memory region: 00000000000a0000 - 00000000000f0000
[ 0.000000] swsusp: Registered nosave memory region: 00000000000f0000 - 0000000000100000
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 260081
[ 0.000000] Kernel command line: ro root=/dev/VolGroup00/LogVol00 rhgb quiet
[ 0.000000] mapped APIC to ffffb000 (fee00000)
[ 0.000000] mapped IOAPIC to ffffa000 (fec00000)
[ 0.000000] Enabling fast FPU save and restore... done.
[ 0.000000] Enabling unmasked SIMD FPU exception support... done.
[ 0.000000] Initializing CPU#0
[ 0.000000] CPU 0 irqstacks, hard=c073a000 soft=c071a000
[ 0.000000] PID hash table entries: 4096 (order: 12, 16384 bytes)
[ 0.000000] Detected 2079.551 MHz processor.
[ 28.725256] Console: colour VGA+ 80x25
[ 28.725259] console [tty0] enabled
[ 28.725828] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[ 28.726361] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[ 28.756701] Memory: 1031116k/1048512k available (1938k kernel code, 16656k reserved, 967k data, 236k init, 131008k highmem)
[ 28.756710] virtual kernel memory layout:
[ 28.756711] fixmap : 0xffc55000 - 0xfffff000 (3752 kB)
[ 28.756713] pkmap : 0xff800000 - 0xffc00000 (4096 kB)
[ 28.756714] vmalloc : 0xf8800000 - 0xff7fe000 ( 111 MB)
[ 28.756715] lowmem : 0xc0000000 - 0xf8000000 ( 896 MB)
[ 28.756716] .init : 0xc06dc000 - 0xc0717000 ( 236 kB)
[ 28.756718] .data : 0xc05e4944 - 0xc06d66e4 ( 967 kB)
[ 28.756719] .text : 0xc0400000 - 0xc05e4944 (1938 kB)
[ 28.756722] Checking if this processor honours the WP bit even in supervisor mode... Ok.
[ 28.756770] SLUB: Genslabs=11, HWalign=32, Order=0-1, MinObjects=4, CPUs=1, Nodes=1
[ 28.816731] Calibrating delay using timer specific routine.. 4160.90 BogoMIPS (lpj=2080452)
[ 28.816763] Security Framework initialized
[ 28.816770] SELinux: Initializing.
[ 28.816784] SELinux: Starting in permissive mode
[ 28.816797] selinux_register_security: Registering secondary module capability
[ 28.816800] Capability LSM initialized as secondary
[ 28.816809] Mount-cache hash table entries: 512
[ 28.816976] CPU: After generic identify, caps: 0383fbff c1c3fbff 00000000 00000000 00000000 00000000 00000000 00000000
[ 28.816985] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
[ 28.816987] CPU: L2 Cache: 512K (64 bytes/line)
[ 28.816990] CPU: After all inits, caps: 0383fbff c1c3fbff 00000000 00000420 00000000 00000000 00000000 00000000
[ 28.816996] Intel machine check architecture supported.
[ 28.816998] Intel machine check reporting enabled on CPU#0.
[ 28.817003] Compat vDSO mapped to ffffe000.
[ 28.817017] Checking 'hlt' instruction... OK.
[ 28.820895] SMP alternatives: switching to UP code
[ 28.821401] Freeing SMP alternatives: 12k freed
[ 28.821404] ACPI: Core revision 20070126
[ 28.824590] CPU0: AMD Athlon(tm) XP 2800+ stepping 00
[ 28.824614] Total of 1 processors activated (4160.90 BogoMIPS).
[ 28.824820] ENABLING IO-APIC IRQs
[ 28.825012] ..TIMER: vector=0x31 apic1=0 pin1=0 apic2=-1 pin2=-1
[ 28.936680] Brought up 1 CPUs
[ 28.936708] CPU0 attaching sched-domain:
[ 28.936711] domain 0: span 00000001
[ 28.936713] groups: 00000001
[ 28.936925] net_namespace: 64 bytes
[ 28.937409] Time: 12:43:09 Date: 01/28/08
[ 28.937442] NET: Registered protocol family 16
[ 28.937683] ACPI: bus type pci registered
[ 28.972986] PCI: PCI BIOS revision 2.10 entry at 0xfb4c0, last bus=2
[ 28.972989] PCI: Using configuration type 1
[ 28.972991] Setting up standard PCI resources
[ 28.980763] ACPI: EC: Look up EC in DSDT
[ 28.986590] ACPI: Interpreter enabled
[ 28.986593] ACPI: (supports S0 S1 S4 S5)
[ 28.986608] ACPI: Using IOAPIC for interrupt routing
[ 28.997079] ACPI: PCI Root Bridge [PCI0] (0000:00)
[ 28.997157] PCI: nForce2 C1 Halt Disconnect fixup
[ 28.998175] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
[ 28.998355] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB0._PRT]
[ 28.998631] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.AGPB._PRT]
[ 29.054757] ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 6 7 10 *11 12 14 15)
[ 29.054952] ACPI: PCI Interrupt Link [LNK2] (IRQs 3 4 5 6 7 10 *11 12 14 15)
[ 29.055144] ACPI: PCI Interrupt Link [LNK3] (IRQs 3 4 *5 6 7 10 11 12 14 15)
[ 29.055334] ACPI: PCI Interrupt Link [LNK4] (IRQs 3 4 *5 6 7 10 11 12 14 15)
[ 29.055529] ACPI: PCI Interrupt Link [LNK5] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
[ 29.055724] ACPI: PCI Interrupt Link [LUBA] (IRQs 3 4 *5 6 7 10 11 12 14 15)
[ 29.055918] ACPI: PCI Interrupt Link [LUBB] (IRQs 3 4 5 6 7 10 11 *12 14 15)
[ 29.056109] ACPI: PCI Interrupt Link [LMAC] (IRQs 3 4 5 6 7 10 11 *12 14 15)
[ 29.056298] ACPI: PCI Interrupt Link [LAPU] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
[ 29.056489] ACPI: PCI Interrupt Link [LACI] (IRQs 3 4 5 6 7 10 11 *12 14 15)
[ 29.056685] ACPI: PCI Interrupt Link [LMCI] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
[ 29.056876] ACPI: PCI Interrupt Link [LSMB] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
[ 29.057066] ACPI: PCI Interrupt Link [LUB2] (IRQs 3 4 5 6 7 10 *11 12 14 15)
[ 29.057258] ACPI: PCI Interrupt Link [LFIR] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
[ 29.057448] ACPI: PCI Interrupt Link [L3CM] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
[ 29.057642] ACPI: PCI Interrupt Link [LIDE] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
[ 29.057803] ACPI: PCI Interrupt Link [APC1] (IRQs *16)
[ 29.057951] ACPI: PCI Interrupt Link [APC2] (IRQs *17)
[ 29.058099] ACPI: PCI Interrupt Link [APC3] (IRQs *18)
[ 29.058246] ACPI: PCI Interrupt Link [APC4] (IRQs *19)
[ 29.058402] ACPI: PCI Interrupt Link [APC5] (IRQs *16), disabled.
[ 29.058617] ACPI: PCI Interrupt Link [APCF] (IRQs 20 21 22) *0
[ 29.058827] ACPI: PCI Interrupt Link [APCG] (IRQs 20 21 22) *0
[ 29.059036] ACPI: PCI Interrupt Link [APCH] (IRQs 20 21 22) *0
[ 29.059245] ACPI: PCI Interrupt Link [APCI] (IRQs 20 21 22) *0, disabled.
[ 29.059454] ACPI: PCI Interrupt Link [APCJ] (IRQs 20 21 22) *0
[ 29.059669] ACPI: PCI Interrupt Link [APCK] (IRQs 20 21 22) *0, disabled.
[ 29.059819] ACPI: PCI Interrupt Link [APCS] (IRQs *23), disabled.
[ 29.060028] ACPI: PCI Interrupt Link [APCL] (IRQs 20 21 22) *0
[ 29.060236] ACPI: PCI Interrupt Link [APCM] (IRQs 20 21 22) *0, disabled.
[ 29.060446] ACPI: PCI Interrupt Link [AP3C] (IRQs 20 21 22) *0, disabled.
[ 29.060661] ACPI: PCI Interrupt Link [APCZ] (IRQs 20 21 22) *0, disabled.
[ 29.060822] ACPI: Power Resource [ISAV] (on)
[ 29.060880] Linux Plug and Play Support v0.97 (c) Adam Belay
[ 29.060917] pnp: PnP ACPI init
[ 29.060926] ACPI: bus type pnp registered
[ 29.066989] pnp: PnP ACPI: found 16 devices
[ 29.066992] ACPI: ACPI bus type pnp unregistered
[ 29.067179] usbcore: registered new interface driver usbfs
[ 29.067257] usbcore: registered new interface driver hub
[ 29.067309] usbcore: registered new device driver usb
[ 29.067395] PCI: Using ACPI for IRQ routing
[ 29.067399] PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
[ 29.117453] NetLabel: Initializing
[ 29.117455] NetLabel: domain hash size = 128
[ 29.117457] NetLabel: protocols = UNLABELED CIPSOv4
[ 29.117471] NetLabel: unlabeled traffic allowed by default
[ 29.118443] Time: tsc clocksource has been installed.
[ 29.120481] system 00:00: ioport range 0x4000-0x407f has been reserved
[ 29.120484] system 00:00: ioport range 0x4080-0x40ff has been reserved
[ 29.120487] system 00:00: ioport range 0x4400-0x447f has been reserved
[ 29.120490] system 00:00: ioport range 0x4480-0x44ff has been reserved
[ 29.120492] system 00:00: ioport range 0x4200-0x427f has been reserved
[ 29.120495] system 00:00: ioport range 0x4280-0x42ff has been reserved
[ 29.120502] system 00:01: ioport range 0x5000-0x503f has been reserved
[ 29.120505] system 00:01: ioport range 0x5100-0x513f has been reserved
[ 29.120511] system 00:02: iomem range 0xda800-0xdbfff has been reserved
[ 29.120514] system 00:02: iomem range 0xf0000-0xf7fff could not be reserved
[ 29.120516] system 00:02: iomem range 0xf8000-0xfbfff could not be reserved
[ 29.120519] system 00:02: iomem range 0xfc000-0xfffff could not be reserved
[ 29.120522] system 00:02: iomem range 0x3fff0000-0x3fffffff could not be reserved
[ 29.120525] system 00:02: iomem range 0xffff0000-0xffffffff could not be reserved
[ 29.120528] system 00:02: iomem range 0x0-0x9ffff could not be reserved
[ 29.120531] system 00:02: iomem range 0x100000-0x3ffeffff could not be reserved
[ 29.120534] system 00:02: iomem range 0xfec00000-0xfec00fff could not be reserved
[ 29.120537] system 00:02: iomem range 0xfee00000-0xfee00fff could not be reserved
[ 29.120543] system 00:04: ioport range 0xb78-0xb7b has been reserved
[ 29.120546] system 00:04: ioport range 0xf78-0xf7b has been reserved
[ 29.120548] system 00:04: ioport range 0xa78-0xa7b has been reserved
[ 29.120551] system 00:04: ioport range 0xe78-0xe7b has been reserved
[ 29.120554] system 00:04: ioport range 0xbbc-0xbbf has been reserved
[ 29.120556] system 00:04: ioport range 0xfbc-0xfbf has been reserved
[ 29.120559] system 00:04: ioport range 0x4d0-0x4d1 has been reserved
[ 29.120562] system 00:04: ioport range 0x294-0x297 has been reserved
[ 29.151040] PCI: Bridge: 0000:00:08.0
[ 29.151044] IO window: 9000-afff
[ 29.151049] MEM window: e3000000-e6ffffff
[ 29.151053] PREFETCH window: 50000000-500fffff
[ 29.151058] PCI: Bridge: 0000:00:1e.0
[ 29.151059] IO window: disabled.
[ 29.151063] MEM window: e0000000-e2ffffff
[ 29.151066] PREFETCH window: d0000000-dfffffff
[ 29.151077] PCI: Setting latency timer of device 0000:00:08.0 to 64
[ 29.151093] NET: Registered protocol family 2
[ 29.160585] IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
[ 29.160952] TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
[ 29.162429] TCP bind hash table entries: 65536 (order: 7, 524288 bytes)
[ 29.163092] TCP: Hash tables configured (established 131072 bind 65536)
[ 29.163095] TCP reno registered
[ 29.165574] checking if image is initramfs... it is
[ 29.446295] Freeing initrd memory: 3628k freed
[ 29.446709] apm: BIOS version 1.2 Flags 0x07 (Driver version 1.16ac)
[ 29.446712] apm: overridden by ACPI.
[ 29.447133] audit: initializing netlink socket (disabled)
[ 29.447149] audit(1201524188.569:1): initialized
[ 29.447287] highmem bounce pool size: 64 pages
[ 29.447291] Total HugeTLB memory allocated, 0
[ 29.449941] SELinux: Registering netfilter hooks
[ 29.450082] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254)
[ 29.450086] io scheduler noop registered
[ 29.450088] io scheduler anticipatory registered
[ 29.450090] io scheduler deadline registered
[ 29.450101] io scheduler cfq registered (default)
[ 29.472109] Boot video device is 0000:02:00.0
[ 29.477398] ACPI: Thermal Zone [THRM] (51 C)
[ 29.477413] isapnp: Scanning for PnP cards...
[ 29.650914] Switched to high resolution mode on CPU 0
[ 29.834322] isapnp: No Plug & Play device found
[ 29.837157] Real Time Clock Driver v1.12ac
[ 29.837309] Non-volatile memory driver v1.2
[ 29.837312] Linux agpgart interface v0.102
[ 29.837365] agpgart: Detected NVIDIA nForce2 chipset
[ 29.853228] agpgart: AGP aperture is 256M @ 0xc0000000
[ 29.853255] Serial: 8250/16550 driver $Revision: 1.90 $ 2 ports, IRQ sharing disabled
[ 29.853403] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 29.853542] serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[ 29.853854] 00:0a: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 29.854037] 00:0b: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[ 29.855000] RAMDISK driver initialized: 16 RAM disks of 16384K size 4096 blocksize
[ 29.855188] PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1
[ 29.855191] PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
[ 29.855565] serio: i8042 KBD port at 0x60,0x64 irq 1
[ 29.855638] mice: PS/2 mouse device common for all mice
[ 29.876081] input: AT Translated Set 2 keyboard as /class/input/input0
[ 29.878901] cpuidle: using governor ladder
[ 29.878904] cpuidle: using governor menu
[ 29.878982] usbcore: registered new interface driver hiddev
[ 29.879024] usbcore: registered new interface driver usbhid
[ 29.879027] drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver
[ 29.879098] TCP cubic registered
[ 29.879100] Initializing XFRM netlink socket
[ 29.879180] NET: Registered protocol family 1
[ 29.879196] NET: Registered protocol family 17
[ 29.879204] Using IPI No-Shortcut mode
[ 29.879217] registered taskstats version 1
[ 29.879349] Magic number: 8:30:735
[ 29.879657] Freeing unused kernel memory: 236k freed
[ 29.879695] Write protecting the kernel text: 1940k
[ 29.879708] Write protecting the kernel read-only data: 758k
[ 30.175117] ACPI: PCI Interrupt Link [APCL] enabled at IRQ 22
[ 30.175126] ACPI: PCI Interrupt 0000:00:02.2[C] -> Link [APCL] -> GSI 22 (level, high) -> IRQ 16
[ 30.175139] PCI: Setting latency timer of device 0000:00:02.2 to 64
[ 30.175143] ehci_hcd 0000:00:02.2: EHCI Host Controller
[ 30.175235] ehci_hcd 0000:00:02.2: new USB bus registered, assigned bus number 1
[ 30.175275] ehci_hcd 0000:00:02.2: debug port 1
[ 30.175280] PCI: cache line size of 64 is not supported by device 0000:00:02.2
[ 30.175291] ehci_hcd 0000:00:02.2: irq 16, io mem 0xe7005000
[ 30.180677] ehci_hcd 0000:00:02.2: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
[ 30.180795] usb usb1: configuration #1 chosen from 1 choice
[ 30.180823] hub 1-0:1.0: USB hub found
[ 30.180834] hub 1-0:1.0: 6 ports detected
[ 30.287626] ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver
[ 30.288031] ACPI: PCI Interrupt Link [APCF] enabled at IRQ 21
[ 30.288038] ACPI: PCI Interrupt 0000:00:02.0[A] -> Link [APCF] -> GSI 21 (level, high) -> IRQ 17
[ 30.288052] PCI: Setting latency timer of device 0000:00:02.0 to 64
[ 30.288055] ohci_hcd 0000:00:02.0: OHCI Host Controller
[ 30.288129] ohci_hcd 0000:00:02.0: new USB bus registered, assigned bus number 2
[ 30.288148] ohci_hcd 0000:00:02.0: irq 17, io mem 0xe7003000
[ 30.340664] usb usb2: configuration #1 chosen from 1 choice
[ 30.340691] hub 2-0:1.0: USB hub found
[ 30.340704] hub 2-0:1.0: 3 ports detected
[ 30.441860] ACPI: PCI Interrupt Link [APCG] enabled at IRQ 20
[ 30.441865] ACPI: PCI Interrupt 0000:00:02.1[B] -> Link [APCG] -> GSI 20 (level, high) -> IRQ 18
[ 30.441873] PCI: Setting latency timer of device 0000:00:02.1 to 64
[ 30.441876] ohci_hcd 0000:00:02.1: OHCI Host Controller
[ 30.441932] ohci_hcd 0000:00:02.1: new USB bus registered, assigned bus number 3
[ 30.441945] ohci_hcd 0000:00:02.1: irq 18, io mem 0xe7004000
[ 30.487468] usb 1-1: new high speed USB device using ehci_hcd and address 2
[ 30.494540] usb usb3: configuration #1 chosen from 1 choice
[ 30.494569] hub 3-0:1.0: USB hub found
[ 30.494579] hub 3-0:1.0: 3 ports detected
[ 30.601427] USB Universal Host Controller Interface driver v3.0
[ 30.601865] usb 1-1: configuration #1 chosen from 1 choice
[ 30.602052] hub 1-1:1.0: USB hub found
[ 30.602151] hub 1-1:1.0: 4 ports detected
[ 30.660576] SCSI subsystem initialized
[ 30.673851] Driver 'sd' needs updating - please use bus_type methods
[ 30.700319] libata version 3.00 loaded.
[ 30.702887] pata_amd 0000:00:09.0: version 0.3.10
[ 30.703052] PCI: Setting latency timer of device 0000:00:09.0 to 64
[ 30.703188] scsi0 : pata_amd
[ 30.709313] scsi1 : pata_amd
[ 30.710076] ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14
[ 30.710079] ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15
[ 30.864753] ata1.00: ATA-6: WDC WD2000JB-00EVA0, 15.05R15, max UDMA/100
[ 30.864756] ata1.00: 390721968 sectors, multi 16: LBA48
[ 30.871629] ata1.00: configured for UDMA/100
[ 31.195305] ata2.00: ATAPI: LITE-ON DVDRW SHM-165H6S, HS06, max UDMA/66
[ 31.243813] ata2.01: ATA-7: MAXTOR STM3320620A, 3.AAE, max UDMA/100
[ 31.243816] ata2.01: 625142448 sectors, multi 16: LBA48
[ 31.243825] ata2.00: limited to UDMA/33 due to 40-wire cable
[ 31.417074] ata2.00: configured for UDMA/33
[ 31.451769] ata2.01: configured for UDMA/100
[ 31.451873] scsi 0:0:0:0: Direct-Access ATA WDC WD2000JB-00E 15.0 PQ: 0 ANSI: 5
[ 31.451953] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
[ 31.451967] sd 0:0:0:0: [sda] Write Protect is off
[ 31.451970] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 31.451989] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 31.452040] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
[ 31.452051] sd 0:0:0:0: [sda] Write Protect is off
[ 31.452054] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 31.452071] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 31.452075] sda: sda1 sda2
[ 31.467219] sd 0:0:0:0: [sda] Attached SCSI disk
[ 31.468093] scsi 1:0:0:0: CD-ROM LITE-ON DVDRW SHM-165H6S HS06 PQ: 0 ANSI: 5
[ 31.468208] scsi 1:0:1:0: Direct-Access ATA MAXTOR STM332062 3.AA PQ: 0 ANSI: 5
[ 31.468272] sd 1:0:1:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
[ 31.468283] sd 1:0:1:0: [sdb] Write Protect is off
[ 31.468286] sd 1:0:1:0: [sdb] Mode Sense: 00 3a 00 00
[ 31.468303] sd 1:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 31.468338] sd 1:0:1:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
[ 31.468349] sd 1:0:1:0: [sdb] Write Protect is off
[ 31.468352] sd 1:0:1:0: [sdb] Mode Sense: 00 3a 00 00
[ 31.468370] sd 1:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 31.468373] sdb:<6>usb 2-2: new full speed USB device using ohci_hcd and address 2
[ 31.499690] sdb1 sdb2 sdb3
[ 31.500119] sd 1:0:1:0: [sdb] Attached SCSI disk
[ 31.637428] usb 2-2: configuration #1 chosen from 1 choice
[ 31.856522] usb 2-3: new low speed USB device using ohci_hcd and address 3
[ 32.020045] usb 2-3: configuration #1 chosen from 1 choice
[ 32.035222] input: Chicony Saitek Eclipse II Keyboard as /class/input/input1
[ 32.038424] input,hidraw0: USB HID v1.11 Keyboard [Chicony Saitek Eclipse II Keyboard] on usb-0000:00:02.0-3
[ 32.067995] input: Chicony Saitek Eclipse II Keyboard as /class/input/input2
[ 32.070422] input,hiddev96,hidraw1: USB HID v1.11 Device [Chicony Saitek Eclipse II Keyboard] on usb-0000:00:02.0-3
[ 32.287225] usb 3-3: new full speed USB device using ohci_hcd and address 2
[ 32.422699] usb 3-3: configuration #1 chosen from 1 choice
[ 32.425658] hub 3-3:1.0: USB hub found
[ 32.428631] hub 3-3:1.0: 4 ports detected
[ 32.724000] usb 1-1.1: new low speed USB device using ehci_hcd and address 6
[ 32.919001] usb 1-1.1: configuration #1 chosen from 1 choice
[ 33.655893] hiddev97hidraw2: USB HID v1.11 Device [Belkin Belkin UPS] on usb-0000:00:02.2-1.1
[ 33.833315] usb 1-1.2: new full speed USB device using ehci_hcd and address 7
[ 33.925926] usb 1-1.2: configuration #1 chosen from 1 choice
[ 34.018028] device-mapper: ioctl: 4.12.0-ioctl (2007-10-02) initialised: dm-devel@redhat.com
[ 34.043070] sata_sil 0000:01:0a.0: version 2.3
[ 34.043348] ACPI: PCI Interrupt Link [APC1] enabled at IRQ 16
[ 34.043355] ACPI: PCI Interrupt 0000:01:0a.0[A] -> Link [APC1] -> GSI 16 (level, high) -> IRQ 19
[ 34.045029] scsi2 : sata_sil
[ 34.050031] scsi3 : sata_sil
[ 34.050064] ata3: SATA max UDMA/100 mmio m512@0xe6004000 tf 0xe6004080 irq 19
[ 34.050068] ata4: SATA max UDMA/100 mmio m512@0xe6004000 tf 0xe60040c0 irq 19
[ 34.107056] usb 1-1.4: new high speed USB device using ehci_hcd and address 8
[ 34.192310] usb 1-1.4: configuration #1 chosen from 1 choice
[ 34.192499] hub 1-1.4:1.0: USB hub found
[ 34.192597] hub 1-1.4:1.0: 4 ports detected
[ 34.352811] ata3: SATA link down (SStatus 0 SControl 310)
[ 34.481823] usb 1-1.4.3: new low speed USB device using ehci_hcd and address 9
[ 34.573676] usb 1-1.4.3: configuration #1 chosen from 1 choice
[ 34.577917] input: Logitech USB Receiver as /class/input/input3
[ 34.580691] input,hidraw3: USB HID v1.10 Mouse [Logitech USB Receiver] on usb-0000:00:02.2-1.4.3
[ 34.757557] usb 1-1.4.4: new full speed USB device using ehci_hcd and address 10
[ 34.806514] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 34.810071] ata4.00: ATA-7: Hitachi HDT725040VLA360, V5COA7EA, max UDMA/133
[ 34.810074] ata4.00: 781422768 sectors, multi 16: LBA48 NCQ (depth 0/32)
[ 34.816059] ata4.00: configured for UDMA/100
[ 34.816175] scsi 3:0:0:0: Direct-Access ATA Hitachi HDT72504 V5CO PQ: 0 ANSI: 5
[ 34.816257] sd 3:0:0:0: [sdc] 781422768 512-byte hardware sectors (400088 MB)
[ 34.816271] sd 3:0:0:0: [sdc] Write Protect is off
[ 34.816274] sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[ 34.816293] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 34.816344] sd 3:0:0:0: [sdc] 781422768 512-byte hardware sectors (400088 MB)
[ 34.816355] sd 3:0:0:0: [sdc] Write Protect is off
[ 34.816358] sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[ 34.816375] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 34.816379] sdc: sdc1
[ 34.829837] sd 3:0:0:0: [sdc] Attached SCSI disk
[ 34.843933] usb 1-1.4.4: configuration #1 chosen from 1 choice
[ 37.852458] EXT3-fs: INFO: recovery required on readonly filesystem.
[ 37.852464] EXT3-fs: write access will be enabled during recovery.
[ 41.522866] kjournald starting. Commit interval 5 seconds
[ 41.522885] EXT3-fs: recovery complete.
[ 41.524254] EXT3-fs: mounted filesystem with ordered data mode.
[ 41.972449] audit(1201524201.103:2): enforcing=1 old_enforcing=0 auid=4294967295
[ 42.187011] SELinux:8192 avtab hash slots allocated.Num of rules:213166
[ 42.260611] SELinux:8192 avtab hash slots allocated.Num of rules:213166
[ 42.314117] security: 8 users, 11 roles, 2363 types, 114 bools, 1 sens, 1024 cats
[ 42.314122] security: 67 classes, 213166 rules
[ 42.327287] SELinux: Completing initialization.
[ 42.327290] SELinux: Setting up existing superblocks.
[ 42.353550] SELinux: initialized (dev dm-0, type ext3), uses xattr
[ 42.515071] SELinux: initialized (dev usbfs, type usbfs), uses genfs_contexts
[ 42.515088] SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
[ 42.515193] SELinux: initialized (dev debugfs, type debugfs), uses genfs_contexts
[ 42.515205] SELinux: initialized (dev selinuxfs, type selinuxfs), uses genfs_contexts
[ 42.515235] SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
[ 42.515244] SELinux: initialized (dev hugetlbfs, type hugetlbfs), uses genfs_contexts
[ 42.515250] SELinux: initialized (dev devpts, type devpts), uses transition SIDs
[ 42.515262] SELinux: initialized (dev inotifyfs, type inotifyfs), uses genfs_contexts
[ 42.515266] SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
[ 42.515274] SELinux: initialized (dev futexfs, type futexfs), uses genfs_contexts
[ 42.515280] SELinux: initialized (dev anon_inodefs, type anon_inodefs), uses genfs_contexts
[ 42.515285] SELinux: initialized (dev pipefs, type pipefs), uses task SIDs
[ 42.515290] SELinux: initialized (dev sockfs, type sockfs), uses task SIDs
[ 42.515299] SELinux: initialized (dev proc, type proc), uses genfs_contexts
[ 42.515312] SELinux: initialized (dev bdev, type bdev), uses genfs_contexts
[ 42.515318] SELinux: initialized (dev rootfs, type rootfs), uses genfs_contexts
[ 42.515341] SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
[ 42.520002] SELinux: policy loaded with handle_unknown=allow
[ 42.520011] audit(1201524201.651:3): policy loaded auid=4294967295
[ 46.528101] sd 0:0:0:0: Attached scsi generic sg0 type 0
[ 46.528126] scsi 1:0:0:0: Attached scsi generic sg1 type 5
[ 46.528149] sd 1:0:1:0: Attached scsi generic sg2 type 0
[ 46.528174] sd 3:0:0:0: Attached scsi generic sg3 type 0
[ 46.931288] input: Power Button (FF) as /class/input/input4
[ 46.938141] ACPI: Power Button (FF) [PWRF]
[ 46.938186] ACPI Error (evxfevnt-0186): Could not enable SleepButton event [20070126]
[ 46.938192] ACPI Warning (evxface-0145): Could not enable fixed event 3 [20070126]
[ 46.938283] input: Power Button (CM) as /class/input/input5
[ 46.941132] ACPI: Power Button (CM) [PWRB]
[ 46.941190] input: Sleep Button (CM) as /class/input/input6
[ 46.944347] ACPI: Sleep Button (CM) [SLPB]
[ 47.285717] usblp0: USB Bidirectional printer dev 2 if 0 alt 0 proto 2 vid 0x04B8 pid 0x0005
[ 47.285742] usbcore: registered new interface driver usblp
[ 47.308848] i2c-adapter i2c-0: nForce2 SMBus adapter at 0x5000
[ 47.308876] i2c-adapter i2c-1: nForce2 SMBus adapter at 0x5100
[ 47.352146] usbcore: registered new interface driver usbserial
[ 47.352152] drivers/usb/serial/usb-serial.c: USB Serial Driver core
[ 47.455275] drivers/usb/serial/usb-serial.c: USB Serial support registered for FTDI USB Serial Device
[ 47.455308] ftdi_sio 1-1.2:1.0: FTDI USB Serial Device converter detected
[ 47.455342] drivers/usb/serial/ftdi_sio.c: Detected FT232RL
[ 47.455381] usb 1-1.2: FTDI USB Serial Device converter now attached to ttyUSB0
[ 47.455398] usbcore: registered new interface driver ftdi_sio
[ 47.455401] drivers/usb/serial/ftdi_sio.c: v1.4.3:USB FTDI Serial Converters Driver
[ 47.530751] input: PC Speaker as /class/input/input7
[ 47.572764] forcedeth: Reverse Engineered nForce ethernet driver. Version 0.61.
[ 47.573147] ACPI: PCI Interrupt Link [APCH] enabled at IRQ 22
[ 47.573151] ACPI: PCI Interrupt 0000:00:04.0[A] -> Link [APCH] -> GSI 22 (level, high) -> IRQ 16
[ 47.573158] PCI: Setting latency timer of device 0000:00:04.0 to 64
[ 47.615492] Floppy drive(s): fd0 is 1.44M
[ 47.630509] FDC 0 is a post-1991 82077
[ 48.084769] forcedeth 0000:00:04.0: ifname eth0, PHY OUI 0x20 @ 1, addr 00:04:4b:5d:eb:7d
[ 48.084775] forcedeth 0000:00:04.0: timirq lnktim desc-v1
[ 48.085214] ACPI: PCI Interrupt Link [APC4] enabled at IRQ 19
[ 48.085222] ACPI: PCI Interrupt 0000:01:09.0[A] -> Link [APC4] -> GSI 19 (level, high) -> IRQ 20
[ 48.141308] firewire_ohci: Added fw-ohci device 0000:01:09.0, OHCI version 1.10
[ 48.285456] nvidia: module license 'NVIDIA' taints kernel.
[ 48.549725] ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [APC4] -> GSI 19 (level, high) -> IRQ 20
[ 48.550149] NVRM: loading NVIDIA UNIX x86 Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007
[ 48.641028] firewire_core: created new fw device fw0 (0 config rom retries, S400)
[ 48.848094] Linux video capture interface: v2.00
[ 49.104974] cx88/2: cx2388x MPEG-TS Driver Manager version 0.0.6 loaded
[ 49.105072] cx88[0]: subsystem: 7063:3000, board: pcHDTV HD3000 HDTV [card=22,autodetected]
[ 49.105076] cx88[0]: TV tuner type 60, Radio tuner type -1
[ 49.151205] cx88/0: cx2388x v4l2 driver version 0.0.6 loaded
[ 49.255507] cx88[0]/2: cx2388x 8802 Driver Manager
[ 49.257461] ACPI: PCI Interrupt Link [APC2] enabled at IRQ 17
[ 49.257472] ACPI: PCI Interrupt 0000:01:07.2[A] -> Link [APC2] -> GSI 17 (level, high) -> IRQ 21
[ 49.257484] cx88[0]/2: found at 0000:01:07.2, rev: 5, irq: 21, latency: 32, mmio: 0xe4000000
[ 49.257561] ACPI: PCI Interrupt 0000:01:07.0[A] -> Link [APC2] -> GSI 17 (level, high) -> IRQ 21
[ 49.257573] cx88[0]/0: found at 0000:01:07.0, rev: 5, irq: 21, latency: 32, mmio: 0xe3000000
[ 49.440179] tda8290_probe: not probed - driver disabled by Kconfig
[ 49.440185] tuner 2-0043: chip found @ 0x86 (cx88[0])
[ 49.440208] tda9887 2-0043: tda988[5/6/7] found @ 0x43 (tuner)
[ 49.440211] tuner 2-0043: type set to tda9887
[ 49.442442] tuner 2-0061: chip found @ 0xc2 (cx88[0])
[ 49.442458] tuner-simple 2-0061: type set to 60 (Thomson DTT 761X (ATSC/NTSC))
[ 49.442461] tuner 2-0061: type set to Thomson DTT 761X (A
[ 49.442464] tuner-simple 2-0061: type set to 60 (Thomson DTT 761X (ATSC/NTSC))
[ 49.442466] tuner 2-0061: type set to Thomson DTT 761X (A
[ 49.451016] cx88[0]/0: registered device video0 [v4l2]
[ 49.451038] cx88[0]/0: registered device vbi0
[ 49.451064] cx88[0]/0: registered device radio0
[ 49.454722] ACPI: PCI Interrupt Link [APC3] enabled at IRQ 18
[ 49.454731] ACPI: PCI Interrupt 0000:01:08.0[A] -> Link [APC3] -> GSI 18 (level, high) -> IRQ 22
[ 49.459555] Audigy2 value: Special config.
[ 49.532042] cx88/2: cx2388x dvb driver version 0.0.6 loaded
[ 49.532047] cx88/2: registering cx8802 driver, type: dvb access: shared
[ 49.532052] cx88[0]/2: subsystem: 7063:3000, board: pcHDTV HD3000 HDTV [card=22]
[ 49.532055] cx88[0]/2: cx2388x based DVB/ATSC card
[ 49.548119] ACPI: PCI Interrupt Link [APCJ] enabled at IRQ 21
[ 49.548125] ACPI: PCI Interrupt 0000:00:06.0[A] -> Link [APCJ] -> GSI 21 (level, high) -> IRQ 17
[ 49.548162] PCI: Setting latency timer of device 0000:00:06.0 to 64
[ 49.654575] DVB: registering new adapter (cx88[0])
[ 49.654582] DVB: registering frontend 0 (Oren OR51132 VSB/QAM Frontend)...
[ 49.859126] intel8x0_measure_ac97_clock: measured 50668 usecs
[ 49.859131] intel8x0: clocking to 47378
[ 52.910654] EXT3 FS on dm-0, internal journal
[ 53.162621] kjournald starting. Commit interval 5 seconds
[ 53.170013] EXT3 FS on sda1, internal journal
[ 53.170019] EXT3-fs: mounted filesystem with ordered data mode.
[ 53.170144] SELinux: initialized (dev sda1, type ext3), uses xattr
[ 53.170540] SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
[ 53.174936] kjournald starting. Commit interval 5 seconds
[ 53.182987] EXT3 FS on sdc1, internal journal
[ 53.182992] EXT3-fs: mounted filesystem with ordered data mode.
[ 53.188827] SELinux: initialized (dev sdc1, type ext3), uses xattr
[ 54.005729] Adding 2031608k swap on /dev/mapper/VolGroup00-LogVol01. Priority:-1 extents:1 across:2031608k
[ 54.009417] SELinux: initialized (dev binfmt_misc, type binfmt_misc), uses genfs_contexts



--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
...[Linux's] capacity to talk via any medium except smoke signals.
-- Dr. Greg Wettstein, Roger Maris Cancer Center

2008-01-28 13:19:58

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Gene Heskett wrote:
>[    0.000000] If you got timer trouble try acpi_use_timer_override
This is from the dmesg of my previous post.

Can anyone tell me what it actually means?


--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
I have a simple rule in life: If I don't understand something, it must be bad.

- Linus Torvalds

2008-01-28 13:57:39

by Mikael Pettersson

Subject: Re: Problem with ata layer in 2.6.24

Gene Heskett writes:
> On Monday 28 January 2008, Peter Zijlstra wrote:
> >On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote:
> >> 1. Wrong mailing list; use linux-ide (@vger) instead.
> >
> >What, and keep all us other interested people in the dark?
>
> As a test, I tried rebooting to the latest fedora kernel and found it kills X,
> so I'm back to the second to last fedora version ATM, and the
> third 'smartctl -t long /dev/sda' in 24 hours is running now. The first two
> completed with no errors.
>
> I've added the linux-ide list to refresh those people of the problem,
> the logs are being spammed by this message stanza:
>
> Jan 28 04:46:25 coyote kernel: [26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> Jan 28 04:46:25 coyote kernel: [26550.290028] ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out
> Jan 28 04:46:25 coyote kernel: [26550.290029] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> Jan 28 04:46:25 coyote kernel: [26550.290032] ata1.00: status: { DRDY }
> Jan 28 04:46:25 coyote kernel: [26550.290060] ata1: soft resetting link
> Jan 28 04:46:25 coyote kernel: [26550.452301] ata1.00: configured for UDMA/100
> Jan 28 04:46:25 coyote kernel: [26550.452318] ata1: EH complete
> Jan 28 04:46:25 coyote kernel: [26550.455898] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> Jan 28 04:46:25 coyote kernel: [26550.456151] sd 0:0:0:0: [sda] Write Protect is off
> Jan 28 04:46:25 coyote kernel: [26550.456403] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
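
The "cmd" line in the stanza above can be decoded by hand. The following is a minimal Python sketch, assuming the twelve hex bytes follow the field order libata uses in its error report (command/feature:nsect:lbal:lbam:lbah, then the five "hob" high-order bytes, then device). The function name `decode_taskfile` and the small opcode table are illustrative, not part of any kernel tool.

```python
import re

# Assumed field order of the libata "cmd" report:
# command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
TF_RE = re.compile(r"cmd ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})")

# A handful of ATA DMA opcodes; deliberately not an exhaustive table.
OPCODES = {
    0xC8: "READ DMA",
    0xCA: "WRITE DMA",
    0x25: "READ DMA EXT",
    0x35: "WRITE DMA EXT",
}
LBA48_OPCODES = (0x25, 0x35)


def decode_taskfile(line):
    """Return opcode name, start LBA, sector count and byte count, or None."""
    m = TF_RE.search(line)
    if m is None:
        return None
    (command, feature, nsect, lbal, lbam, lbah,
     hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah,
     device) = [int(x, 16) for x in re.split(r"[/:]", m.group(1))]

    if command in LBA48_OPCODES:
        # LBA48: the "hob" bytes carry the high-order halves.
        sectors = ((hob_nsect << 8) | nsect) or 65536
        lba = ((hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
               | (lbah << 16) | (lbam << 8) | lbal)
    else:
        # LBA28: the top four address bits live in the device register.
        sectors = nsect or 256
        lba = ((device & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal

    return {
        "opcode": OPCODES.get(command, "unknown (0x%02x)" % command),
        "lba": lba,
        "sectors": sectors,
        "bytes": sectors * 512,  # assumes 512-byte sectors, as sd reports for sda
    }


if __name__ == "__main__":
    print(decode_taskfile(
        "ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out"))
```

For the line quoted above this gives a WRITE DMA EXT of 0x0158 = 344 sectors, and 344 * 512 = 176128 bytes agrees with the "dma 176128 out" that the kernel printed, which is a useful sanity check on the assumed field order.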

It's not obvious from this incomplete dmesg log what HW or driver
is behind ata1, but if the 2.6.24-rc7 kernel matches the 2.6.24 one,
it should be pata_amd driving a WDC disk:

> [ 30.702887] pata_amd 0000:00:09.0: version 0.3.10
> [ 30.703052] PCI: Setting latency timer of device 0000:00:09.0 to 64
> [ 30.703188] scsi0 : pata_amd
> [ 30.709313] scsi1 : pata_amd
> [ 30.710076] ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14
> [ 30.710079] ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15
> [ 30.864753] ata1.00: ATA-6: WDC WD2000JB-00EVA0, 15.05R15, max UDMA/100
> [ 30.864756] ata1.00: 390721968 sectors, multi 16: LBA48
> [ 30.871629] ata1.00: configured for UDMA/100

Unfortunately we also see:

> [ 48.285456] nvidia: module license 'NVIDIA' taints kernel.
> [ 48.549725] ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [APC4] -> GSI 19 (level, high) -> IRQ 20
> [ 48.550149] NVRM: loading NVIDIA UNIX x86 Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007

We have no way of debugging that module, so please try 2.6.24 without it.
If the problems persist, please try to capture a complete log from the
failing kernel -- the interesting bits are everything from initial boot
up to and including the first few errors. You may need to increase the
kernel's log buffer size if the log gets truncated (CONFIG_LOG_BUF_SHIFT).
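
As a rough guide to picking a value: the kernel log buffer is 2^CONFIG_LOG_BUF_SHIFT bytes, so each increment doubles it. The Python sketch below only does that arithmetic; the helper names are made up for illustration, and the size of your own dmesg output is something you would have to measure yourself (for example from the saved log file).

```python
def log_buf_bytes(shift):
    """Size in bytes of a log buffer built with CONFIG_LOG_BUF_SHIFT=shift."""
    return 1 << shift


def min_shift_for(nbytes):
    """Smallest shift whose buffer holds at least nbytes of log text."""
    return max(nbytes - 1, 0).bit_length()


if __name__ == "__main__":
    for shift in (14, 16, 17, 18):
        print("CONFIG_LOG_BUF_SHIFT=%d -> %d KiB" % (shift, log_buf_bytes(shift) // 1024))
    # Hypothetical example: a boot log of about 100000 bytes.
    print("needed shift:", min_shift_for(100000))
```

If rebuilding the kernel is inconvenient, the log_buf_len= boot parameter is an alternative way to request a larger buffer at boot time.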

There are no pata_amd changes from 2.6.24-rc7 to 2.6.24 final.

2008-01-28 15:26:49

by rgheck

Subject: Re: Problem with ata layer in 2.6.24


I've recently seen this kind of error myself under Fedora 8 with the
Fedora 2.6.23 kernels. I'd see a train of the same sort of error:
> Jan 28 04:46:25 coyote kernel: [26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> Jan 28 04:46:25 coyote kernel: [26550.290028] ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out
> Jan 28 04:46:25 coyote kernel: [26550.290029] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>
usually associated with the optical drive, and then the whole SATA
subsystem would seem to lock up, leaving the machine useless: I'd get
journal commit errors if I was lucky; if I wasn't, it would just
lock up. My system is also using the pata_amd driver.

I have not seen these sorts of errors with the 2.6.24 kernels.

Richard Heck

Gene Heskett wrote:
> On Monday 28 January 2008, Peter Zijlstra wrote:
>
>> On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote:
>>
>>> 1. Wrong mailing list; use linux-ide (@vger) instead.
>>>
>> What, and keep all us other interested people in the dark?
>>
>
> As a test, I tried rebooting to the latest fedora kernel and found it kills X,
> so I'm back to the second to last fedora version ATM, and the
> third 'smartctl -t long /dev/sda' in 24 hours is running now. The first two
> completed with no errors.
>
> I've added the linux-ide list to refresh those people of the problem,
> the logs are being spammed by this message stanza:
>
> Jan 28 04:46:25 coyote kernel: [26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> Jan 28 04:46:25 coyote kernel: [26550.290028] ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out
> Jan 28 04:46:25 coyote kernel: [26550.290029] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> Jan 28 04:46:25 coyote kernel: [26550.290032] ata1.00: status: { DRDY }
> Jan 28 04:46:25 coyote kernel: [26550.290060] ata1: soft resetting link
> Jan 28 04:46:25 coyote kernel: [26550.452301] ata1.00: configured for UDMA/100
> Jan 28 04:46:25 coyote kernel: [26550.452318] ata1: EH complete
> Jan 28 04:46:25 coyote kernel: [26550.455898] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> Jan 28 04:46:25 coyote kernel: [26550.456151] sd 0:0:0:0: [sda] Write Protect is off
> Jan 28 04:46:25 coyote kernel: [26550.456403] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>
>
> And it just did it again, using the fedora kernel but without logging
> anything at all when it froze. In other words I had to reboot between
> the word list and the word to above. So now I'm booted to 2.6.24-rc7.
>
> Before it crashes again, here is the dmesg:
> [ 0.000000] Linux version 2.6.24-rc7 ([email protected]) (gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)) #1 SMP Mon Jan 14 10:00:40 EST 2008
> [ 0.000000] BIOS-provided physical RAM map:
> [ 0.000000] BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
> [ 0.000000] BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
> [ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
> [ 0.000000] BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
> [ 0.000000] BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
> [ 0.000000] BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
> [ 0.000000] BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
> [ 0.000000] BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
> [ 0.000000] BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
> [ 0.000000] 127MB HIGHMEM available.
> [ 0.000000] 896MB LOWMEM available.
> [ 0.000000] Entering add_active_range(0, 0, 262128) 0 entries of 256 used
> [ 0.000000] Zone PFN ranges:
> [ 0.000000] DMA 0 -> 4096
> [ 0.000000] Normal 4096 -> 229376
> [ 0.000000] HighMem 229376 -> 262128
> [ 0.000000] Movable zone start PFN for each node
> [ 0.000000] early_node_map[1] active PFN ranges
> [ 0.000000] 0: 0 -> 262128
> [ 0.000000] On node 0 totalpages: 262128
> [ 0.000000] DMA zone: 32 pages used for memmap
> [ 0.000000] DMA zone: 0 pages reserved
> [ 0.000000] DMA zone: 4064 pages, LIFO batch:0
> [ 0.000000] Normal zone: 1760 pages used for memmap
> [ 0.000000] Normal zone: 223520 pages, LIFO batch:31
> [ 0.000000] HighMem zone: 255 pages used for memmap
> [ 0.000000] HighMem zone: 32497 pages, LIFO batch:7
> [ 0.000000] Movable zone: 0 pages used for memmap
> [ 0.000000] DMI 2.2 present.
> [ 0.000000] ACPI: RSDP 000F7220, 0014 (r0 Nvidia)
> [ 0.000000] ACPI: RSDT 3FFF3000, 002C (r1 Nvidia AWRDACPI 42302E31 AWRD 0)
> [ 0.000000] ACPI: FACP 3FFF3040, 0074 (r1 Nvidia AWRDACPI 42302E31 AWRD 0)
> [ 0.000000] ACPI: DSDT 3FFF30C0, 4CC4 (r1 NVIDIA AWRDACPI 1000 MSFT 100000E)
> [ 0.000000] ACPI: FACS 3FFF0000, 0040
> [ 0.000000] ACPI: APIC 3FFF7DC0, 006E (r1 Nvidia AWRDACPI 42302E31 AWRD 0)
> [ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
> [ 0.000000] If you got timer trouble try acpi_use_timer_override
> [ 0.000000] ACPI: PM-Timer IO Port: 0x4008
> [ 0.000000] ACPI: Local APIC address 0xfee00000
> [ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
> [ 0.000000] Processor #0 6:10 APIC version 16
> [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
> [ 0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
> [ 0.000000] IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-23
> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> [ 0.000000] ACPI: BIOS IRQ0 pin2 override ignored.
> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)
> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)
> [ 0.000000] ACPI: IRQ9 used by override.
> [ 0.000000] ACPI: IRQ14 used by override.
> [ 0.000000] ACPI: IRQ15 used by override.
> [ 0.000000] Enabling APIC mode: Flat. Using 1 I/O APICs
> [ 0.000000] Using ACPI (MADT) for SMP configuration information
> [ 0.000000] Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000)
> [ 0.000000] swsusp: Registered nosave memory region: 000000000009f000 - 00000000000a0000
> [ 0.000000] swsusp: Registered nosave memory region: 00000000000a0000 - 00000000000f0000
> [ 0.000000] swsusp: Registered nosave memory region: 00000000000f0000 - 0000000000100000
> [ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 260081
> [ 0.000000] Kernel command line: ro root=/dev/VolGroup00/LogVol00 rhgb quiet
> [ 0.000000] mapped APIC to ffffb000 (fee00000)
> [ 0.000000] mapped IOAPIC to ffffa000 (fec00000)
> [ 0.000000] Enabling fast FPU save and restore... done.
> [ 0.000000] Enabling unmasked SIMD FPU exception support... done.
> [ 0.000000] Initializing CPU#0
> [ 0.000000] CPU 0 irqstacks, hard=c073a000 soft=c071a000
> [ 0.000000] PID hash table entries: 4096 (order: 12, 16384 bytes)
> [ 0.000000] Detected 2079.551 MHz processor.
> [ 28.725256] Console: colour VGA+ 80x25
> [ 28.725259] console [tty0] enabled
> [ 28.725828] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
> [ 28.726361] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
> [ 28.756701] Memory: 1031116k/1048512k available (1938k kernel code, 16656k reserved, 967k data, 236k init, 131008k highmem)
> [ 28.756710] virtual kernel memory layout:
> [ 28.756711] fixmap : 0xffc55000 - 0xfffff000 (3752 kB)
> [ 28.756713] pkmap : 0xff800000 - 0xffc00000 (4096 kB)
> [ 28.756714] vmalloc : 0xf8800000 - 0xff7fe000 ( 111 MB)
> [ 28.756715] lowmem : 0xc0000000 - 0xf8000000 ( 896 MB)
> [ 28.756716] .init : 0xc06dc000 - 0xc0717000 ( 236 kB)
> [ 28.756718] .data : 0xc05e4944 - 0xc06d66e4 ( 967 kB)
> [ 28.756719] .text : 0xc0400000 - 0xc05e4944 (1938 kB)
> [ 28.756722] Checking if this processor honours the WP bit even in supervisor mode... Ok.
> [ 28.756770] SLUB: Genslabs=11, HWalign=32, Order=0-1, MinObjects=4, CPUs=1, Nodes=1
> [ 28.816731] Calibrating delay using timer specific routine.. 4160.90 BogoMIPS (lpj=2080452)
> [ 28.816763] Security Framework initialized
> [ 28.816770] SELinux: Initializing.
> [ 28.816784] SELinux: Starting in permissive mode
> [ 28.816797] selinux_register_security: Registering secondary module capability
> [ 28.816800] Capability LSM initialized as secondary
> [ 28.816809] Mount-cache hash table entries: 512
> [ 28.816976] CPU: After generic identify, caps: 0383fbff c1c3fbff 00000000 00000000 00000000 00000000 00000000 00000000
> [ 28.816985] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
> [ 28.816987] CPU: L2 Cache: 512K (64 bytes/line)
> [ 28.816990] CPU: After all inits, caps: 0383fbff c1c3fbff 00000000 00000420 00000000 00000000 00000000 00000000
> [ 28.816996] Intel machine check architecture supported.
> [ 28.816998] Intel machine check reporting enabled on CPU#0.
> [ 28.817003] Compat vDSO mapped to ffffe000.
> [ 28.817017] Checking 'hlt' instruction... OK.
> [ 28.820895] SMP alternatives: switching to UP code
> [ 28.821401] Freeing SMP alternatives: 12k freed
> [ 28.821404] ACPI: Core revision 20070126
> [ 28.824590] CPU0: AMD Athlon(tm) XP 2800+ stepping 00
> [ 28.824614] Total of 1 processors activated (4160.90 BogoMIPS).
> [ 28.824820] ENABLING IO-APIC IRQs
> [ 28.825012] ..TIMER: vector=0x31 apic1=0 pin1=0 apic2=-1 pin2=-1
> [ 28.936680] Brought up 1 CPUs
> [ 28.936708] CPU0 attaching sched-domain:
> [ 28.936711] domain 0: span 00000001
> [ 28.936713] groups: 00000001
> [ 28.936925] net_namespace: 64 bytes
> [ 28.937409] Time: 12:43:09 Date: 01/28/08
> [ 28.937442] NET: Registered protocol family 16
> [ 28.937683] ACPI: bus type pci registered
> [ 28.972986] PCI: PCI BIOS revision 2.10 entry at 0xfb4c0, last bus=2
> [ 28.972989] PCI: Using configuration type 1
> [ 28.972991] Setting up standard PCI resources
> [ 28.980763] ACPI: EC: Look up EC in DSDT
> [ 28.986590] ACPI: Interpreter enabled
> [ 28.986593] ACPI: (supports S0 S1 S4 S5)
> [ 28.986608] ACPI: Using IOAPIC for interrupt routing
> [ 28.997079] ACPI: PCI Root Bridge [PCI0] (0000:00)
> [ 28.997157] PCI: nForce2 C1 Halt Disconnect fixup
> [ 28.998175] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
> [ 28.998355] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB0._PRT]
> [ 28.998631] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.AGPB._PRT]
> [ 29.054757] ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 6 7 10 *11 12 14 15)
> [ 29.054952] ACPI: PCI Interrupt Link [LNK2] (IRQs 3 4 5 6 7 10 *11 12 14 15)
> [ 29.055144] ACPI: PCI Interrupt Link [LNK3] (IRQs 3 4 *5 6 7 10 11 12 14 15)
> [ 29.055334] ACPI: PCI Interrupt Link [LNK4] (IRQs 3 4 *5 6 7 10 11 12 14 15)
> [ 29.055529] ACPI: PCI Interrupt Link [LNK5] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
> [ 29.055724] ACPI: PCI Interrupt Link [LUBA] (IRQs 3 4 *5 6 7 10 11 12 14 15)
> [ 29.055918] ACPI: PCI Interrupt Link [LUBB] (IRQs 3 4 5 6 7 10 11 *12 14 15)
> [ 29.056109] ACPI: PCI Interrupt Link [LMAC] (IRQs 3 4 5 6 7 10 11 *12 14 15)
> [ 29.056298] ACPI: PCI Interrupt Link [LAPU] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
> [ 29.056489] ACPI: PCI Interrupt Link [LACI] (IRQs 3 4 5 6 7 10 11 *12 14 15)
> [ 29.056685] ACPI: PCI Interrupt Link [LMCI] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
> [ 29.056876] ACPI: PCI Interrupt Link [LSMB] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
> [ 29.057066] ACPI: PCI Interrupt Link [LUB2] (IRQs 3 4 5 6 7 10 *11 12 14 15)
> [ 29.057258] ACPI: PCI Interrupt Link [LFIR] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
> [ 29.057448] ACPI: PCI Interrupt Link [L3CM] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
> [ 29.057642] ACPI: PCI Interrupt Link [LIDE] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
> [ 29.057803] ACPI: PCI Interrupt Link [APC1] (IRQs *16)
> [ 29.057951] ACPI: PCI Interrupt Link [APC2] (IRQs *17)
> [ 29.058099] ACPI: PCI Interrupt Link [APC3] (IRQs *18)
> [ 29.058246] ACPI: PCI Interrupt Link [APC4] (IRQs *19)
> [ 29.058402] ACPI: PCI Interrupt Link [APC5] (IRQs *16), disabled.
> [ 29.058617] ACPI: PCI Interrupt Link [APCF] (IRQs 20 21 22) *0
> [ 29.058827] ACPI: PCI Interrupt Link [APCG] (IRQs 20 21 22) *0
> [ 29.059036] ACPI: PCI Interrupt Link [APCH] (IRQs 20 21 22) *0
> [ 29.059245] ACPI: PCI Interrupt Link [APCI] (IRQs 20 21 22) *0, disabled.
> [ 29.059454] ACPI: PCI Interrupt Link [APCJ] (IRQs 20 21 22) *0
> [ 29.059669] ACPI: PCI Interrupt Link [APCK] (IRQs 20 21 22) *0, disabled.
> [ 29.059819] ACPI: PCI Interrupt Link [APCS] (IRQs *23), disabled.
> [ 29.060028] ACPI: PCI Interrupt Link [APCL] (IRQs 20 21 22) *0
> [ 29.060236] ACPI: PCI Interrupt Link [APCM] (IRQs 20 21 22) *0, disabled.
> [ 29.060446] ACPI: PCI Interrupt Link [AP3C] (IRQs 20 21 22) *0, disabled.
> [ 29.060661] ACPI: PCI Interrupt Link [APCZ] (IRQs 20 21 22) *0, disabled.
> [ 29.060822] ACPI: Power Resource [ISAV] (on)
> [ 29.060880] Linux Plug and Play Support v0.97 (c) Adam Belay
> [ 29.060917] pnp: PnP ACPI init
> [ 29.060926] ACPI: bus type pnp registered
> [ 29.066989] pnp: PnP ACPI: found 16 devices
> [ 29.066992] ACPI: ACPI bus type pnp unregistered
> [ 29.067179] usbcore: registered new interface driver usbfs
> [ 29.067257] usbcore: registered new interface driver hub
> [ 29.067309] usbcore: registered new device driver usb
> [ 29.067395] PCI: Using ACPI for IRQ routing
> [ 29.067399] PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
> [ 29.117453] NetLabel: Initializing
> [ 29.117455] NetLabel: domain hash size = 128
> [ 29.117457] NetLabel: protocols = UNLABELED CIPSOv4
> [ 29.117471] NetLabel: unlabeled traffic allowed by default
> [ 29.118443] Time: tsc clocksource has been installed.
> [ 29.120481] system 00:00: ioport range 0x4000-0x407f has been reserved
> [ 29.120484] system 00:00: ioport range 0x4080-0x40ff has been reserved
> [ 29.120487] system 00:00: ioport range 0x4400-0x447f has been reserved
> [ 29.120490] system 00:00: ioport range 0x4480-0x44ff has been reserved
> [ 29.120492] system 00:00: ioport range 0x4200-0x427f has been reserved
> [ 29.120495] system 00:00: ioport range 0x4280-0x42ff has been reserved
> [ 29.120502] system 00:01: ioport range 0x5000-0x503f has been reserved
> [ 29.120505] system 00:01: ioport range 0x5100-0x513f has been reserved
> [ 29.120511] system 00:02: iomem range 0xda800-0xdbfff has been reserved
> [ 29.120514] system 00:02: iomem range 0xf0000-0xf7fff could not be reserved
> [ 29.120516] system 00:02: iomem range 0xf8000-0xfbfff could not be reserved
> [ 29.120519] system 00:02: iomem range 0xfc000-0xfffff could not be reserved
> [ 29.120522] system 00:02: iomem range 0x3fff0000-0x3fffffff could not be reserved
> [ 29.120525] system 00:02: iomem range 0xffff0000-0xffffffff could not be reserved
> [ 29.120528] system 00:02: iomem range 0x0-0x9ffff could not be reserved
> [ 29.120531] system 00:02: iomem range 0x100000-0x3ffeffff could not be reserved
> [ 29.120534] system 00:02: iomem range 0xfec00000-0xfec00fff could not be reserved
> [ 29.120537] system 00:02: iomem range 0xfee00000-0xfee00fff could not be reserved
> [ 29.120543] system 00:04: ioport range 0xb78-0xb7b has been reserved
> [ 29.120546] system 00:04: ioport range 0xf78-0xf7b has been reserved
> [ 29.120548] system 00:04: ioport range 0xa78-0xa7b has been reserved
> [ 29.120551] system 00:04: ioport range 0xe78-0xe7b has been reserved
> [ 29.120554] system 00:04: ioport range 0xbbc-0xbbf has been reserved
> [ 29.120556] system 00:04: ioport range 0xfbc-0xfbf has been reserved
> [ 29.120559] system 00:04: ioport range 0x4d0-0x4d1 has been reserved
> [ 29.120562] system 00:04: ioport range 0x294-0x297 has been reserved
> [ 29.151040] PCI: Bridge: 0000:00:08.0
> [ 29.151044] IO window: 9000-afff
> [ 29.151049] MEM window: e3000000-e6ffffff
> [ 29.151053] PREFETCH window: 50000000-500fffff
> [ 29.151058] PCI: Bridge: 0000:00:1e.0
> [ 29.151059] IO window: disabled.
> [ 29.151063] MEM window: e0000000-e2ffffff
> [ 29.151066] PREFETCH window: d0000000-dfffffff
> [ 29.151077] PCI: Setting latency timer of device 0000:00:08.0 to 64
> [ 29.151093] NET: Registered protocol family 2
> [ 29.160585] IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
> [ 29.160952] TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
> [ 29.162429] TCP bind hash table entries: 65536 (order: 7, 524288 bytes)
> [ 29.163092] TCP: Hash tables configured (established 131072 bind 65536)
> [ 29.163095] TCP reno registered
> [ 29.165574] checking if image is initramfs... it is
> [ 29.446295] Freeing initrd memory: 3628k freed
> [ 29.446709] apm: BIOS version 1.2 Flags 0x07 (Driver version 1.16ac)
> [ 29.446712] apm: overridden by ACPI.
> [ 29.447133] audit: initializing netlink socket (disabled)
> [ 29.447149] audit(1201524188.569:1): initialized
> [ 29.447287] highmem bounce pool size: 64 pages
> [ 29.447291] Total HugeTLB memory allocated, 0
> [ 29.449941] SELinux: Registering netfilter hooks
> [ 29.450082] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254)
> [ 29.450086] io scheduler noop registered
> [ 29.450088] io scheduler anticipatory registered
> [ 29.450090] io scheduler deadline registered
> [ 29.450101] io scheduler cfq registered (default)
> [ 29.472109] Boot video device is 0000:02:00.0
> [ 29.477398] ACPI: Thermal Zone [THRM] (51 C)
> [ 29.477413] isapnp: Scanning for PnP cards...
> [ 29.650914] Switched to high resolution mode on CPU 0
> [ 29.834322] isapnp: No Plug & Play device found
> [ 29.837157] Real Time Clock Driver v1.12ac
> [ 29.837309] Non-volatile memory driver v1.2
> [ 29.837312] Linux agpgart interface v0.102
> [ 29.837365] agpgart: Detected NVIDIA nForce2 chipset
> [ 29.853228] agpgart: AGP aperture is 256M @ 0xc0000000
> [ 29.853255] Serial: 8250/16550 driver $Revision: 1.90 $ 2 ports, IRQ sharing disabled
> [ 29.853403] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> [ 29.853542] serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
> [ 29.853854] 00:0a: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> [ 29.854037] 00:0b: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
> [ 29.855000] RAMDISK driver initialized: 16 RAM disks of 16384K size 4096 blocksize
> [ 29.855188] PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1
> [ 29.855191] PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
> [ 29.855565] serio: i8042 KBD port at 0x60,0x64 irq 1
> [ 29.855638] mice: PS/2 mouse device common for all mice
> [ 29.876081] input: AT Translated Set 2 keyboard as /class/input/input0
> [ 29.878901] cpuidle: using governor ladder
> [ 29.878904] cpuidle: using governor menu
> [ 29.878982] usbcore: registered new interface driver hiddev
> [ 29.879024] usbcore: registered new interface driver usbhid
> [ 29.879027] drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver
> [ 29.879098] TCP cubic registered
> [ 29.879100] Initializing XFRM netlink socket
> [ 29.879180] NET: Registered protocol family 1
> [ 29.879196] NET: Registered protocol family 17
> [ 29.879204] Using IPI No-Shortcut mode
> [ 29.879217] registered taskstats version 1
> [ 29.879349] Magic number: 8:30:735
> [ 29.879657] Freeing unused kernel memory: 236k freed
> [ 29.879695] Write protecting the kernel text: 1940k
> [ 29.879708] Write protecting the kernel read-only data: 758k
> [ 30.175117] ACPI: PCI Interrupt Link [APCL] enabled at IRQ 22
> [ 30.175126] ACPI: PCI Interrupt 0000:00:02.2[C] -> Link [APCL] -> GSI 22 (level, high) -> IRQ 16
> [ 30.175139] PCI: Setting latency timer of device 0000:00:02.2 to 64
> [ 30.175143] ehci_hcd 0000:00:02.2: EHCI Host Controller
> [ 30.175235] ehci_hcd 0000:00:02.2: new USB bus registered, assigned bus number 1
> [ 30.175275] ehci_hcd 0000:00:02.2: debug port 1
> [ 30.175280] PCI: cache line size of 64 is not supported by device 0000:00:02.2
> [ 30.175291] ehci_hcd 0000:00:02.2: irq 16, io mem 0xe7005000
> [ 30.180677] ehci_hcd 0000:00:02.2: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
> [ 30.180795] usb usb1: configuration #1 chosen from 1 choice
> [ 30.180823] hub 1-0:1.0: USB hub found
> [ 30.180834] hub 1-0:1.0: 6 ports detected
> [ 30.287626] ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver
> [ 30.288031] ACPI: PCI Interrupt Link [APCF] enabled at IRQ 21
> [ 30.288038] ACPI: PCI Interrupt 0000:00:02.0[A] -> Link [APCF] -> GSI 21 (level, high) -> IRQ 17
> [ 30.288052] PCI: Setting latency timer of device 0000:00:02.0 to 64
> [ 30.288055] ohci_hcd 0000:00:02.0: OHCI Host Controller
> [ 30.288129] ohci_hcd 0000:00:02.0: new USB bus registered, assigned bus number 2
> [ 30.288148] ohci_hcd 0000:00:02.0: irq 17, io mem 0xe7003000
> [ 30.340664] usb usb2: configuration #1 chosen from 1 choice
> [ 30.340691] hub 2-0:1.0: USB hub found
> [ 30.340704] hub 2-0:1.0: 3 ports detected
> [ 30.441860] ACPI: PCI Interrupt Link [APCG] enabled at IRQ 20
> [ 30.441865] ACPI: PCI Interrupt 0000:00:02.1[B] -> Link [APCG] -> GSI 20 (level, high) -> IRQ 18
> [ 30.441873] PCI: Setting latency timer of device 0000:00:02.1 to 64
> [ 30.441876] ohci_hcd 0000:00:02.1: OHCI Host Controller
> [ 30.441932] ohci_hcd 0000:00:02.1: new USB bus registered, assigned bus number 3
> [ 30.441945] ohci_hcd 0000:00:02.1: irq 18, io mem 0xe7004000
> [ 30.487468] usb 1-1: new high speed USB device using ehci_hcd and address 2
> [ 30.494540] usb usb3: configuration #1 chosen from 1 choice
> [ 30.494569] hub 3-0:1.0: USB hub found
> [ 30.494579] hub 3-0:1.0: 3 ports detected
> [ 30.601427] USB Universal Host Controller Interface driver v3.0
> [ 30.601865] usb 1-1: configuration #1 chosen from 1 choice
> [ 30.602052] hub 1-1:1.0: USB hub found
> [ 30.602151] hub 1-1:1.0: 4 ports detected
> [ 30.660576] SCSI subsystem initialized
> [ 30.673851] Driver 'sd' needs updating - please use bus_type methods
> [ 30.700319] libata version 3.00 loaded.
> [ 30.702887] pata_amd 0000:00:09.0: version 0.3.10
> [ 30.703052] PCI: Setting latency timer of device 0000:00:09.0 to 64
> [ 30.703188] scsi0 : pata_amd
> [ 30.709313] scsi1 : pata_amd
> [ 30.710076] ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14
> [ 30.710079] ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15
> [ 30.864753] ata1.00: ATA-6: WDC WD2000JB-00EVA0, 15.05R15, max UDMA/100
> [ 30.864756] ata1.00: 390721968 sectors, multi 16: LBA48
> [ 30.871629] ata1.00: configured for UDMA/100
> [ 31.195305] ata2.00: ATAPI: LITE-ON DVDRW SHM-165H6S, HS06, max UDMA/66
> [ 31.243813] ata2.01: ATA-7: MAXTOR STM3320620A, 3.AAE, max UDMA/100
> [ 31.243816] ata2.01: 625142448 sectors, multi 16: LBA48
> [ 31.243825] ata2.00: limited to UDMA/33 due to 40-wire cable
> [ 31.417074] ata2.00: configured for UDMA/33
> [ 31.451769] ata2.01: configured for UDMA/100
> [ 31.451873] scsi 0:0:0:0: Direct-Access ATA WDC WD2000JB-00E 15.0 PQ: 0 ANSI: 5
> [ 31.451953] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> [ 31.451967] sd 0:0:0:0: [sda] Write Protect is off
> [ 31.451970] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> [ 31.451989] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [ 31.452040] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> [ 31.452051] sd 0:0:0:0: [sda] Write Protect is off
> [ 31.452054] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> [ 31.452071] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [ 31.452075] sda: sda1 sda2
> [ 31.467219] sd 0:0:0:0: [sda] Attached SCSI disk
> [ 31.468093] scsi 1:0:0:0: CD-ROM LITE-ON DVDRW SHM-165H6S HS06 PQ: 0 ANSI: 5
> [ 31.468208] scsi 1:0:1:0: Direct-Access ATA MAXTOR STM332062 3.AA PQ: 0 ANSI: 5
> [ 31.468272] sd 1:0:1:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
> [ 31.468283] sd 1:0:1:0: [sdb] Write Protect is off
> [ 31.468286] sd 1:0:1:0: [sdb] Mode Sense: 00 3a 00 00
> [ 31.468303] sd 1:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [ 31.468338] sd 1:0:1:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
> [ 31.468349] sd 1:0:1:0: [sdb] Write Protect is off
> [ 31.468352] sd 1:0:1:0: [sdb] Mode Sense: 00 3a 00 00
> [ 31.468370] sd 1:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [ 31.468373] sdb:<6>usb 2-2: new full speed USB device using ohci_hcd and address 2
> [ 31.499690] sdb1 sdb2 sdb3
> [ 31.500119] sd 1:0:1:0: [sdb] Attached SCSI disk
> [ 31.637428] usb 2-2: configuration #1 chosen from 1 choice
> [ 31.856522] usb 2-3: new low speed USB device using ohci_hcd and address 3
> [ 32.020045] usb 2-3: configuration #1 chosen from 1 choice
> [ 32.035222] input: Chicony Saitek Eclipse II Keyboard as /class/input/input1
> [ 32.038424] input,hidraw0: USB HID v1.11 Keyboard [Chicony Saitek Eclipse II Keyboard] on usb-0000:00:02.0-3
> [ 32.067995] input: Chicony Saitek Eclipse II Keyboard as /class/input/input2
> [ 32.070422] input,hiddev96,hidraw1: USB HID v1.11 Device [Chicony Saitek Eclipse II Keyboard] on usb-0000:00:02.0-3
> [ 32.287225] usb 3-3: new full speed USB device using ohci_hcd and address 2
> [ 32.422699] usb 3-3: configuration #1 chosen from 1 choice
> [ 32.425658] hub 3-3:1.0: USB hub found
> [ 32.428631] hub 3-3:1.0: 4 ports detected
> [ 32.724000] usb 1-1.1: new low speed USB device using ehci_hcd and address 6
> [ 32.919001] usb 1-1.1: configuration #1 chosen from 1 choice
> [ 33.655893] hiddev97hidraw2: USB HID v1.11 Device [Belkin Belkin UPS] on usb-0000:00:02.2-1.1
> [ 33.833315] usb 1-1.2: new full speed USB device using ehci_hcd and address 7
> [ 33.925926] usb 1-1.2: configuration #1 chosen from 1 choice
> [ 34.018028] device-mapper: ioctl: 4.12.0-ioctl (2007-10-02) initialised: [email protected]
> [ 34.043070] sata_sil 0000:01:0a.0: version 2.3
> [ 34.043348] ACPI: PCI Interrupt Link [APC1] enabled at IRQ 16
> [ 34.043355] ACPI: PCI Interrupt 0000:01:0a.0[A] -> Link [APC1] -> GSI 16 (level, high) -> IRQ 19
> [ 34.045029] scsi2 : sata_sil
> [ 34.050031] scsi3 : sata_sil
> [ 34.050064] ata3: SATA max UDMA/100 mmio m512@0xe6004000 tf 0xe6004080 irq 19
> [ 34.050068] ata4: SATA max UDMA/100 mmio m512@0xe6004000 tf 0xe60040c0 irq 19
> [ 34.107056] usb 1-1.4: new high speed USB device using ehci_hcd and address 8
> [ 34.192310] usb 1-1.4: configuration #1 chosen from 1 choice
> [ 34.192499] hub 1-1.4:1.0: USB hub found
> [ 34.192597] hub 1-1.4:1.0: 4 ports detected
> [ 34.352811] ata3: SATA link down (SStatus 0 SControl 310)
> [ 34.481823] usb 1-1.4.3: new low speed USB device using ehci_hcd and address 9
> [ 34.573676] usb 1-1.4.3: configuration #1 chosen from 1 choice
> [ 34.577917] input: Logitech USB Receiver as /class/input/input3
> [ 34.580691] input,hidraw3: USB HID v1.10 Mouse [Logitech USB Receiver] on usb-0000:00:02.2-1.4.3
> [ 34.757557] usb 1-1.4.4: new full speed USB device using ehci_hcd and address 10
> [ 34.806514] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> [ 34.810071] ata4.00: ATA-7: Hitachi HDT725040VLA360, V5COA7EA, max UDMA/133
> [ 34.810074] ata4.00: 781422768 sectors, multi 16: LBA48 NCQ (depth 0/32)
> [ 34.816059] ata4.00: configured for UDMA/100
> [ 34.816175] scsi 3:0:0:0: Direct-Access ATA Hitachi HDT72504 V5CO PQ: 0 ANSI: 5
> [ 34.816257] sd 3:0:0:0: [sdc] 781422768 512-byte hardware sectors (400088 MB)
> [ 34.816271] sd 3:0:0:0: [sdc] Write Protect is off
> [ 34.816274] sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> [ 34.816293] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [ 34.816344] sd 3:0:0:0: [sdc] 781422768 512-byte hardware sectors (400088 MB)
> [ 34.816355] sd 3:0:0:0: [sdc] Write Protect is off
> [ 34.816358] sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> [ 34.816375] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [ 34.816379] sdc: sdc1
> [ 34.829837] sd 3:0:0:0: [sdc] Attached SCSI disk
> [ 34.843933] usb 1-1.4.4: configuration #1 chosen from 1 choice
> [ 37.852458] EXT3-fs: INFO: recovery required on readonly filesystem.
> [ 37.852464] EXT3-fs: write access will be enabled during recovery.
> [ 41.522866] kjournald starting. Commit interval 5 seconds
> [ 41.522885] EXT3-fs: recovery complete.
> [ 41.524254] EXT3-fs: mounted filesystem with ordered data mode.
> [ 41.972449] audit(1201524201.103:2): enforcing=1 old_enforcing=0 auid=4294967295
> [ 42.187011] SELinux:8192 avtab hash slots allocated.Num of rules:213166
> [ 42.260611] SELinux:8192 avtab hash slots allocated.Num of rules:213166
> [ 42.314117] security: 8 users, 11 roles, 2363 types, 114 bools, 1 sens, 1024 cats
> [ 42.314122] security: 67 classes, 213166 rules
> [ 42.327287] SELinux: Completing initialization.
> [ 42.327290] SELinux: Setting up existing superblocks.
> [ 42.353550] SELinux: initialized (dev dm-0, type ext3), uses xattr
> [ 42.515071] SELinux: initialized (dev usbfs, type usbfs), uses genfs_contexts
> [ 42.515088] SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
> [ 42.515193] SELinux: initialized (dev debugfs, type debugfs), uses genfs_contexts
> [ 42.515205] SELinux: initialized (dev selinuxfs, type selinuxfs), uses genfs_contexts
> [ 42.515235] SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
> [ 42.515244] SELinux: initialized (dev hugetlbfs, type hugetlbfs), uses genfs_contexts
> [ 42.515250] SELinux: initialized (dev devpts, type devpts), uses transition SIDs
> [ 42.515262] SELinux: initialized (dev inotifyfs, type inotifyfs), uses genfs_contexts
> [ 42.515266] SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
> [ 42.515274] SELinux: initialized (dev futexfs, type futexfs), uses genfs_contexts
> [ 42.515280] SELinux: initialized (dev anon_inodefs, type anon_inodefs), uses genfs_contexts
> [ 42.515285] SELinux: initialized (dev pipefs, type pipefs), uses task SIDs
> [ 42.515290] SELinux: initialized (dev sockfs, type sockfs), uses task SIDs
> [ 42.515299] SELinux: initialized (dev proc, type proc), uses genfs_contexts
> [ 42.515312] SELinux: initialized (dev bdev, type bdev), uses genfs_contexts
> [ 42.515318] SELinux: initialized (dev rootfs, type rootfs), uses genfs_contexts
> [ 42.515341] SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
> [ 42.520002] SELinux: policy loaded with handle_unknown=allow
> [ 42.520011] audit(1201524201.651:3): policy loaded auid=4294967295
> [ 46.528101] sd 0:0:0:0: Attached scsi generic sg0 type 0
> [ 46.528126] scsi 1:0:0:0: Attached scsi generic sg1 type 5
> [ 46.528149] sd 1:0:1:0: Attached scsi generic sg2 type 0
> [ 46.528174] sd 3:0:0:0: Attached scsi generic sg3 type 0
> [ 46.931288] input: Power Button (FF) as /class/input/input4
> [ 46.938141] ACPI: Power Button (FF) [PWRF]
> [ 46.938186] ACPI Error (evxfevnt-0186): Could not enable SleepButton event [20070126]
> [ 46.938192] ACPI Warning (evxface-0145): Could not enable fixed event 3 [20070126]
> [ 46.938283] input: Power Button (CM) as /class/input/input5
> [ 46.941132] ACPI: Power Button (CM) [PWRB]
> [ 46.941190] input: Sleep Button (CM) as /class/input/input6
> [ 46.944347] ACPI: Sleep Button (CM) [SLPB]
> [ 47.285717] usblp0: USB Bidirectional printer dev 2 if 0 alt 0 proto 2 vid 0x04B8 pid 0x0005
> [ 47.285742] usbcore: registered new interface driver usblp
> [ 47.308848] i2c-adapter i2c-0: nForce2 SMBus adapter at 0x5000
> [ 47.308876] i2c-adapter i2c-1: nForce2 SMBus adapter at 0x5100
> [ 47.352146] usbcore: registered new interface driver usbserial
> [ 47.352152] drivers/usb/serial/usb-serial.c: USB Serial Driver core
> [ 47.455275] drivers/usb/serial/usb-serial.c: USB Serial support registered for FTDI USB Serial Device
> [ 47.455308] ftdi_sio 1-1.2:1.0: FTDI USB Serial Device converter detected
> [ 47.455342] drivers/usb/serial/ftdi_sio.c: Detected FT232RL
> [ 47.455381] usb 1-1.2: FTDI USB Serial Device converter now attached to ttyUSB0
> [ 47.455398] usbcore: registered new interface driver ftdi_sio
> [ 47.455401] drivers/usb/serial/ftdi_sio.c: v1.4.3:USB FTDI Serial Converters Driver
> [ 47.530751] input: PC Speaker as /class/input/input7
> [ 47.572764] forcedeth: Reverse Engineered nForce ethernet driver. Version 0.61.
> [ 47.573147] ACPI: PCI Interrupt Link [APCH] enabled at IRQ 22
> [ 47.573151] ACPI: PCI Interrupt 0000:00:04.0[A] -> Link [APCH] -> GSI 22 (level, high) -> IRQ 16
> [ 47.573158] PCI: Setting latency timer of device 0000:00:04.0 to 64
> [ 47.615492] Floppy drive(s): fd0 is 1.44M
> [ 47.630509] FDC 0 is a post-1991 82077
> [ 48.084769] forcedeth 0000:00:04.0: ifname eth0, PHY OUI 0x20 @ 1, addr 00:04:4b:5d:eb:7d
> [ 48.084775] forcedeth 0000:00:04.0: timirq lnktim desc-v1
> [ 48.085214] ACPI: PCI Interrupt Link [APC4] enabled at IRQ 19
> [ 48.085222] ACPI: PCI Interrupt 0000:01:09.0[A] -> Link [APC4] -> GSI 19 (level, high) -> IRQ 20
> [ 48.141308] firewire_ohci: Added fw-ohci device 0000:01:09.0, OHCI version 1.10
> [ 48.285456] nvidia: module license 'NVIDIA' taints kernel.
> [ 48.549725] ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [APC4] -> GSI 19 (level, high) -> IRQ 20
> [ 48.550149] NVRM: loading NVIDIA UNIX x86 Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007
> [ 48.641028] firewire_core: created new fw device fw0 (0 config rom retries, S400)
> [ 48.848094] Linux video capture interface: v2.00
> [ 49.104974] cx88/2: cx2388x MPEG-TS Driver Manager version 0.0.6 loaded
> [ 49.105072] cx88[0]: subsystem: 7063:3000, board: pcHDTV HD3000 HDTV [card=22,autodetected]
> [ 49.105076] cx88[0]: TV tuner type 60, Radio tuner type -1
> [ 49.151205] cx88/0: cx2388x v4l2 driver version 0.0.6 loaded
> [ 49.255507] cx88[0]/2: cx2388x 8802 Driver Manager
> [ 49.257461] ACPI: PCI Interrupt Link [APC2] enabled at IRQ 17
> [ 49.257472] ACPI: PCI Interrupt 0000:01:07.2[A] -> Link [APC2] -> GSI 17 (level, high) -> IRQ 21
> [ 49.257484] cx88[0]/2: found at 0000:01:07.2, rev: 5, irq: 21, latency: 32, mmio: 0xe4000000
> [ 49.257561] ACPI: PCI Interrupt 0000:01:07.0[A] -> Link [APC2] -> GSI 17 (level, high) -> IRQ 21
> [ 49.257573] cx88[0]/0: found at 0000:01:07.0, rev: 5, irq: 21, latency: 32, mmio: 0xe3000000
> [ 49.440179] tda8290_probe: not probed - driver disabled by Kconfig
> [ 49.440185] tuner 2-0043: chip found @ 0x86 (cx88[0])
> [ 49.440208] tda9887 2-0043: tda988[5/6/7] found @ 0x43 (tuner)
> [ 49.440211] tuner 2-0043: type set to tda9887
> [ 49.442442] tuner 2-0061: chip found @ 0xc2 (cx88[0])
> [ 49.442458] tuner-simple 2-0061: type set to 60 (Thomson DTT 761X (ATSC/NTSC))
> [ 49.442461] tuner 2-0061: type set to Thomson DTT 761X (A
> [ 49.442464] tuner-simple 2-0061: type set to 60 (Thomson DTT 761X (ATSC/NTSC))
> [ 49.442466] tuner 2-0061: type set to Thomson DTT 761X (A
> [ 49.451016] cx88[0]/0: registered device video0 [v4l2]
> [ 49.451038] cx88[0]/0: registered device vbi0
> [ 49.451064] cx88[0]/0: registered device radio0
> [ 49.454722] ACPI: PCI Interrupt Link [APC3] enabled at IRQ 18
> [ 49.454731] ACPI: PCI Interrupt 0000:01:08.0[A] -> Link [APC3] -> GSI 18 (level, high) -> IRQ 22
> [ 49.459555] Audigy2 value: Special config.
> [ 49.532042] cx88/2: cx2388x dvb driver version 0.0.6 loaded
> [ 49.532047] cx88/2: registering cx8802 driver, type: dvb access: shared
> [ 49.532052] cx88[0]/2: subsystem: 7063:3000, board: pcHDTV HD3000 HDTV [card=22]
> [ 49.532055] cx88[0]/2: cx2388x based DVB/ATSC card
> [ 49.548119] ACPI: PCI Interrupt Link [APCJ] enabled at IRQ 21
> [ 49.548125] ACPI: PCI Interrupt 0000:00:06.0[A] -> Link [APCJ] -> GSI 21 (level, high) -> IRQ 17
> [ 49.548162] PCI: Setting latency timer of device 0000:00:06.0 to 64
> [ 49.654575] DVB: registering new adapter (cx88[0])
> [ 49.654582] DVB: registering frontend 0 (Oren OR51132 VSB/QAM Frontend)...
> [ 49.859126] intel8x0_measure_ac97_clock: measured 50668 usecs
> [ 49.859131] intel8x0: clocking to 47378
> [ 52.910654] EXT3 FS on dm-0, internal journal
> [ 53.162621] kjournald starting. Commit interval 5 seconds
> [ 53.170013] EXT3 FS on sda1, internal journal
> [ 53.170019] EXT3-fs: mounted filesystem with ordered data mode.
> [ 53.170144] SELinux: initialized (dev sda1, type ext3), uses xattr
> [ 53.170540] SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
> [ 53.174936] kjournald starting. Commit interval 5 seconds
> [ 53.182987] EXT3 FS on sdc1, internal journal
> [ 53.182992] EXT3-fs: mounted filesystem with ordered data mode.
> [ 53.188827] SELinux: initialized (dev sdc1, type ext3), uses xattr
> [ 54.005729] Adding 2031608k swap on /dev/mapper/VolGroup00-LogVol01. Priority:-1 extents:1 across:2031608k
> [ 54.009417] SELinux: initialized (dev binfmt_misc, type binfmt_misc), uses genfs_contexts

2008-01-28 16:35:49

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Mikael Pettersson wrote:
>Gene Heskett writes:
> > On Monday 28 January 2008, Peter Zijlstra wrote:
> > >On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote:
> > >> 1. Wrong mailing list; use linux-ide (@vger) instead.
> > >
> > >What, and keep all us other interested people in the dark?
> >
> > As a test, I tried rebooting to the latest fedora kernel and found it
> > kills X, so I'm back to the second to last fedora version ATM, and the
> > > third 'smartctl -t long /dev/sda' in 24 hours is running now. The first
> > two completed with no errors.
> >
> > I've added the linux-ide list to refresh those people of the problem,
> > the logs are being spammed by this message stanza:
> >
> > > Jan 28 04:46:25 coyote kernel: [26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> > > Jan 28 04:46:25 coyote kernel: [26550.290028] ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out
> > > Jan 28 04:46:25 coyote kernel: [26550.290029] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> > > Jan 28 04:46:25 coyote kernel: [26550.290032] ata1.00: status: { DRDY }
> > > Jan 28 04:46:25 coyote kernel: [26550.290060] ata1: soft resetting link
> > > Jan 28 04:46:25 coyote kernel: [26550.452301] ata1.00: configured for UDMA/100
> > > Jan 28 04:46:25 coyote kernel: [26550.452318] ata1: EH complete
> > > Jan 28 04:46:25 coyote kernel: [26550.455898] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> > > Jan 28 04:46:25 coyote kernel: [26550.456151] sd 0:0:0:0: [sda] Write Protect is off
> > > Jan 28 04:46:25 coyote kernel: [26550.456403] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>
>It's not obvious from this incomplete dmesg log what HW or driver
>is behind ata1, but if the 2.6.24-rc7 kernel matches the 2.6.24 one,
>it should be pata_amd driving a WDC disk:
> > [ 30.702887] pata_amd 0000:00:09.0: version 0.3.10
> > [ 30.703052] PCI: Setting latency timer of device 0000:00:09.0 to 64
> > [ 30.703188] scsi0 : pata_amd
> > [ 30.709313] scsi1 : pata_amd
> > [ 30.710076] ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14
> > [ 30.710079] ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15
> > [ 30.864753] ata1.00: ATA-6: WDC WD2000JB-00EVA0, 15.05R15, max UDMA/100
> > [ 30.864756] ata1.00: 390721968 sectors, multi 16: LBA48
> > [ 30.871629] ata1.00: configured for UDMA/100
>
>Unfortunately we also see:
> > [ 48.285456] nvidia: module license 'NVIDIA' taints kernel.
> > [ 48.549725] ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [APC4] -> GSI 19 (level, high) -> IRQ 20
> > [ 48.550149] NVRM: loading NVIDIA UNIX x86 Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007
>
>We have no way of debugging that module, so please try 2.6.24 without it.

Sorry, I can't do this and have a working machine. The nv driver has suffered
bit rot or something since the FC2 days when it COULD run a 19" crt at
1600x1200, and will not drive this 20" wide screen lcd 1680x1050 monitor at
more than 800x600, which is absolutely butt ugly fuzzy, looking like a jpg
compressed to 10%. The system is not usable on a day-to-day basis without the
nvidia driver.

Fix the nv driver so it will run this screen at its native resolution and I'll
be glad to run it even if it won't run google earth, which I do use from time
to time. Now, if in all the hits you can get from google on this, currently
14,800 just for 'exception Emask', apparently caused by a timeout, if 100% of
the complainers are running nvidia drivers also, then I see a legit
complaint. Again, fix the nv driver so it will run my screen & I'll be glad
to switch. I can see the reason, sure, but the machine must be capable of
doing its common day to day stuff, while using that driver, like running kde
for kmail, and browsers that work.

>If the problems persist, please try to capture a complete log from the
>failing kernel -- the interesting bits are everything from initial boot
>up to and including the first few errors. You may need to increase the
>kernel's log buffer size if the log gets truncated (CONFIG_LOG_BUF_SHIFT).

If by log you mean /var/log/messages, I have several megabytes of those.
If you mean a live dmesg capture taken right now, it's attached. It contains
several of these at the bottom. I long ago made the kernel log buffer
bigger, cuz it couldn't even show the start immediately after the boot, and
even the dump to syslog was truncated.
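
[Editor's sketch] On the CONFIG_LOG_BUF_SHIFT suggestion quoted above: the option is a power-of-two exponent, so the kernel log buffer is 2**shift bytes. The Python below only does that arithmetic; the function names and the 37000-byte target are illustrative assumptions, not anything taken from the kernel tree or from this report.

```python
def log_buf_bytes(shift: int) -> int:
    """Buffer size in bytes for a given CONFIG_LOG_BUF_SHIFT value."""
    if shift < 0:
        raise ValueError("shift must be non-negative")
    return 1 << shift


def min_shift_for(nbytes: int) -> int:
    """Smallest shift whose 2**shift buffer holds at least nbytes."""
    if nbytes < 1:
        raise ValueError("nbytes must be positive")
    return (nbytes - 1).bit_length()


if __name__ == "__main__":
    for shift in (14, 16, 17, 18):
        print(f"CONFIG_LOG_BUF_SHIFT={shift}: {log_buf_bytes(shift) // 1024} KiB")
    # Hypothetical target: a dmesg dump of roughly 37000 bytes.
    print("shift needed for 37000 bytes:", min_shift_for(37000))
```

A buffer that is only just big enough will still wrap once the error stanzas start repeating, so picking a shift one or two steps above the computed minimum is probably the safer choice. The log_buf_len= boot parameter should allow the same adjustment without a rebuild, though that is worth checking against the kernel-parameters documentation for the kernel in use.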

>There are no pata_amd changes from 2.6.24-rc7 to 2.6.24 final.

That is what I was afraid of. I've done some limited grepping in that branch
of the kernel tree, and cannot seem to locate where this EH handler is being
invoked from.
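
[Editor's sketch] For anyone trying to read the "cmd" line in the exception stanzas quoted earlier in this thread, the following Python decodes it by hand. It assumes the field order that libata appears to print (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device); that order is an assumption here, although the decoded byte counts do agree with the "dma N out" figures in the logs. Only the two opcodes seen in this thread are named, and decode_cmd is a hypothetical helper, not part of any tool.

```python
import re

# ATA opcodes seen in this thread; anything else is shown as raw hex.
ATA_OPCODES = {0x35: "WRITE DMA EXT", 0xCA: "WRITE DMA"}
LBA48_OPCODES = {0x35}  # treated as 48-bit in this sketch
SECTOR_BYTES = 512

# Twelve two-digit hex bytes separated by '/' or ':' after the word "cmd".
TF_RE = re.compile(r"cmd ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})")


def decode_cmd(line: str) -> dict:
    """Decode the taskfile printed on a libata 'cmd' log line."""
    m = TF_RE.search(line)
    if m is None:
        raise ValueError("no libata cmd taskfile in line")
    (command, feature, nsect, lbal, lbam, lbah,
     hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah,
     device) = [int(x, 16) for x in re.split(r"[/:]", m.group(1))]

    if command in LBA48_OPCODES:
        # 48-bit addressing: the "hob" bytes carry the upper halves.
        lba = ((hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
               | (lbah << 16) | (lbam << 8) | lbal)
        sectors = ((hob_nsect << 8) | nsect) or 65536
    else:
        # 28-bit addressing: low nibble of the device byte is LBA 27:24.
        lba = ((device & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal
        sectors = nsect or 256

    return {
        "op": ATA_OPCODES.get(command, f"0x{command:02x}"),
        "lba": lba,
        "sectors": sectors,
        "bytes": sectors * SECTOR_BYTES,
    }


if __name__ == "__main__":
    print(decode_cmd("ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 "
                     "tag 0 dma 176128 out"))
```

Read this way, every stanza posted so far is a DMA write to sda that timed out, at unrelated block addresses, which is at least consistent with the drive passing its SMART long self-tests.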

There are two lines of interest in the dmesg:

[ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
[ 0.000000] If you got timer trouble try acpi_use_timer_override

But I have NDI what it means, kernel argument/xconfig option?

I've also done some googling, and it appears this problem is fairly widespread
since the switchover to libata was encouraged. A stock fedora F8 kernel
suffers the same freezes and eventually locks up, but does it without the
error messages being logged, it just freezes, feeling identical to this in
the minutes before the total freeze. I've tried 2 of those too, but the
newest one won't even run X.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
"Ada is PL/I trying to be Smalltalk.
-- Codoso diBlini


Attachments:
(No filename) (5.53 kB)
dmesg (36.51 kB)

2008-01-28 16:50:58

by Calvin Walton

Subject: Re: Problem with ata layer in 2.6.24

On Mon, 2008-01-28 at 11:35 -0500, Gene Heskett wrote:
> On Monday 28 January 2008, Mikael Pettersson wrote:
> >Unfortunately we also see:
> > > [ 48.285456] nvidia: module license 'NVIDIA' taints kernel.
> > > [ 48.549725] ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [APC4] -> GSI 19 (level, high) -> IRQ 20
> > > [ 48.550149] NVRM: loading NVIDIA UNIX x86 Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007
> >
> >We have no way of debugging that module, so please try 2.6.24 without it.
>
> Sorry, I can't do this and have a working machine. The nv driver has suffered
> bit rot or something since the FC2 days when it COULD run a 19" crt at
> 1600x1200, and will not drive this 20" wide screen lcd 1680x1050 monitor at
> more than 800x600, which is absolutely butt ugly fuzzy, looking like a jpg
> compressed to 10%. The system is not usable on a day-to-day basis without the
> nvidia driver.

You should probably give the nouveau[1] driver a try, if only for
testing purposes; if you are running an NV4x (G6x or G7x) card in
particular, it works a lot better than the nv driver for 2d support.

1. http://nouveau.freedesktop.org/wiki/InstallNouveau

--
Calvin Walton <[email protected]>

2008-01-28 17:00:47

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Gene Heskett wrote:
While reading this msg as it came back, I locked up again and rebooted to
2.6.24, and got lucky (maybe) as the attached dmesg will show quite a few
instances of this LOOOONNNGG before the nvidia driver is loaded to taint the
kernel. Have fun guys!

>On Monday 28 January 2008, Mikael Pettersson wrote:
>>Gene Heskett writes:
>> > On Monday 28 January 2008, Peter Zijlstra wrote:
>> > >On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote:
>> > >> 1. Wrong mailing list; use linux-ide (@vger) instead.
>> > >
>> > >What, and keep all us other interested people in the dark?
>> >
>> > As a test, I tried rebooting to the latest fedora kernel and found it
>> > kills X, so I'm back to the second to last fedora version ATM, and the
>> > third 'smartctl -t long /dev/sda' in 24 hours is running now. The first
>> > two completed with no errors.
>> >
>> > I've added the linux-ide list to refresh those people of the problem,
>> > the logs are being spammed by this message stanza:
>> >
>> > Jan 28 04:46:25 coyote kernel: [26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> > Jan 28 04:46:25 coyote kernel: [26550.290028] ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out
>> > Jan 28 04:46:25 coyote kernel: [26550.290029] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>> > Jan 28 04:46:25 coyote kernel: [26550.290032] ata1.00: status: { DRDY }
>> > Jan 28 04:46:25 coyote kernel: [26550.290060] ata1: soft resetting link
>> > Jan 28 04:46:25 coyote kernel: [26550.452301] ata1.00: configured for UDMA/100
>> > Jan 28 04:46:25 coyote kernel: [26550.452318] ata1: EH complete
>> > Jan 28 04:46:25 coyote kernel: [26550.455898] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
>> > Jan 28 04:46:25 coyote kernel: [26550.456151] sd 0:0:0:0: [sda] Write Protect is off
>> > Jan 28 04:46:25 coyote kernel: [26550.456403] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>>
>>It's not obvious from this incomplete dmesg log what HW or driver
>>is behind ata1, but if the 2.6.24-rc7 kernel matches the 2.6.24 one,
>>it should be pata_amd driving a WDC disk:
>> > [ 30.702887] pata_amd 0000:00:09.0: version 0.3.10
>> > [ 30.703052] PCI: Setting latency timer of device 0000:00:09.0 to 64
>> > [ 30.703188] scsi0 : pata_amd
>> > [ 30.709313] scsi1 : pata_amd
>> > [ 30.710076] ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14
>> > [ 30.710079] ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15
>> > [ 30.864753] ata1.00: ATA-6: WDC WD2000JB-00EVA0, 15.05R15, max UDMA/100
>> > [ 30.864756] ata1.00: 390721968 sectors, multi 16: LBA48
>> > [ 30.871629] ata1.00: configured for UDMA/100
>>
>>Unfortunately we also see:
>> > [ 48.285456] nvidia: module license 'NVIDIA' taints kernel.
>> > [ 48.549725] ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [APC4] -> GSI 19 (level, high) -> IRQ 20
>> > [ 48.550149] NVRM: loading NVIDIA UNIX x86 Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007
>>
>>We have no way of debugging that module, so please try 2.6.24 without it.
>
>Sorry, I can't do this and have a working machine. The nv driver has
> suffered bit rot or something since the FC2 days when it COULD run a 19"
> crt at 1600x1200, and will not drive this 20" wide screen lcd 1680x1050
> monitor at more than 800x600, which is absolutely butt ugly fuzzy, looking
> like a jpg compressed to 10%. The system is not usable on a day to day basis
> without the nvidia driver.
>
>Fix the nv driver so it will run this screen at its native resolution and
> I'll be glad to run it even if it won't run google earth, which I do use
> from time to time. Now, if in all the hits you can get from google on
> this, currently 14,800 just for 'exception Emask', apparently caused by a
> timeout, if 100% of the complainers are running nvidia drivers also, then I
> see a legit complaint. Again, fix the nv driver so it will run my screen &
> I'll be glad to switch. I can see the reason, sure, but the machine must
> be capable of doing its common day to day stuff, while using that driver,
> like running kde for kmail, and browsers that work.
>
>>If the problems persist, please try to capture a complete log from the
>>failing kernel -- the interesting bits are everything from initial boot
>>up to and including the first few errors. You may need to increase the
>>kernel's log buffer size if the log gets truncated (CONFIG_LOG_BUF_SHIFT).
>
>If by log you mean /var/log/messages, I have several megabytes of those.
>If you mean a live dmesg capture taken right now, it's attached. It contains
>several of these at the bottom. I long ago made the kernel log buffer
>bigger, cuz it couldn't even show the start immediately after the boot, and
>even the dump to syslog was truncated.
>
>>There are no pata_amd changes from 2.6.24-rc7 to 2.6.24 final.
>
>That is what I was afraid of. I've done some limited grepping in that
> branch of the kernel tree, and cannot seem to locate where this EH handler
> is being invoked from.
>
>There are 2 lines of interest in the dmesg:
>
>[ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
>[ 0.000000] If you got timer trouble try acpi_use_timer_override
>
>But I have NDI what it means, kernel argument/xconfig option?
>
>I've also done some googling, and it appears this problem is fairly
> widespread since the switchover to libata was encouraged. A stock fedora
> F8 kernel suffers the same freezes and eventually locks up, but does it
> without the error messages being logged, it just freezes, feeling identical
> to this in the minutes before the total freeze. I've tried 2 of those too,
> but the newest one won't even run X.
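
A note on the CONFIG_LOG_BUF_SHIFT advice quoted above: the option is the base-2 logarithm of the kernel log buffer size in bytes, so a boot log only survives intact if 2**shift is at least as large as the log. The Python sketch below works out the smallest shift for a given log size. The function name is invented for illustration, and reading the 42.08 kB size listed for this message's dmesg attachment as 1024-byte kilobytes is an assumption.

```python
def min_log_buf_shift(log_bytes: int) -> int:
    """Return the smallest shift such that 2**shift >= log_bytes.

    Hypothetical helper, not part of any kernel tooling. The kernel's
    Kconfig also limits the allowed range of the option; that limit is
    not modelled here.
    """
    if log_bytes <= 0:
        raise ValueError("log size must be positive")
    # (n - 1).bit_length() is ceil(log2(n)) for every n >= 1.
    return (log_bytes - 1).bit_length()


if __name__ == "__main__":
    # Assumption: 42.08 kB means 42.08 * 1024 bytes, rounded up.
    dmesg_bytes = 43090
    shift = min_log_buf_shift(dmesg_bytes)
    print(f"CONFIG_LOG_BUF_SHIFT >= {shift} ({1 << shift} bytes)")
```

For a capture of that size the sketch suggests a shift of at least 16, i.e. a 64 KiB buffer. The log keeps growing after boot, so some headroom on top of that is probably wise.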



--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Deprive a mirror of its silver and even the Czar won't see his face.


Attachments:
(No filename) (5.93 kB)
dmesg (42.08 kB)
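
The `cmd` line in the error stanza quoted in this message can be decoded by hand, which shows what the drive was asked to do when it timed out. The Python sketch below assumes the field order libata appears to use when printing the taskfile (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device). That order is an assumption, checked here only against the `dma` byte counts in the logs. The function and table names are made up for illustration, and only the four DMA read/write opcodes are covered.

```python
import re

# Assumed layout of the libata "cmd" line (see the lead-in):
# cmd CMD/FEAT:NSECT:LBAL:LBAM:LBAH/HOB_FEAT:HOB_NSECT:HOB_LBAL:HOB_LBAM:HOB_LBAH/DEV
CMD_RE = re.compile(
    r"cmd (?P<cmd>[0-9a-f]{2})"
    r"/(?P<lo>(?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/(?P<hi>(?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/(?P<dev>[0-9a-f]{2})"
    r" tag (?P<tag>\d+) dma (?P<bytes>\d+) (?P<dir>in|out)"
)

# Standard ATA opcodes; only the DMA read/write commands are listed.
LBA48_COMMANDS = {0x25: "READ DMA EXT", 0x35: "WRITE DMA EXT"}
LBA28_COMMANDS = {0xC8: "READ DMA", 0xCA: "WRITE DMA"}


def decode_cmd(line: str) -> dict:
    """Decode one unwrapped libata 'cmd' log line into opcode, LBA and length."""
    m = CMD_RE.search(line)
    if m is None:
        raise ValueError("no libata cmd taskfile found in line")
    opcode = int(m["cmd"], 16)
    _feat, nsect, lbal, lbam, lbah = (int(x, 16) for x in m["lo"].split(":"))
    _hfeat, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in m["hi"].split(":")
    )
    device = int(m["dev"], 16)
    if opcode in LBA48_COMMANDS:
        name = LBA48_COMMANDS[opcode]
        lba = (hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24
               | lbah << 16 | lbam << 8 | lbal)
        sectors = (hob_nsect << 8 | nsect) or 65536  # 0 means 65536 sectors
    else:
        name = LBA28_COMMANDS.get(opcode, "unknown")
        # LBA28 keeps address bits 24-27 in the low nibble of the device register.
        lba = (device & 0x0F) << 24 | lbah << 16 | lbam << 8 | lbal
        sectors = nsect or 256  # 0 means 256 sectors
    return {"name": name, "lba": lba, "sectors": sectors,
            "dma_bytes": int(m["bytes"]), "direction": m["dir"]}


if __name__ == "__main__":
    line = "ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out"
    print(decode_cmd(line))
```

Read this way, the stanza is a WRITE DMA EXT of 344 sectors, and 344 * 512 = 176128 matches the `dma 176128 out` byte count on the same line, which supports the assumed field order. The function expects the log line on a single line, not wrapped the way the mail client wrapped it in the quotes.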

2008-01-28 17:01:42

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Richard Heck wrote:
>I've recently seen this kind of error myself, under Fedora 8, using the
>Fedora 2.6.23 kernels: I'd see a train of the same sort of error:
>> Jan 28 04:46:25 coyote kernel: [26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> Jan 28 04:46:25 coyote kernel: [26550.290028] ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out
>> Jan 28 04:46:25 coyote kernel: [26550.290029] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>
>usually associated with the optical drive, and then it seems as if the
>whole SATA subsystem would lock up, and the machine then becomes
>useless: I get journal commit errors if I'm lucky; if I'm not, it just
>locks up. My system is also using the pata_amd driver.
>
>I have not seen these sorts of errors with the 2.6.24 kernels.
>
>Richard Heck
>
Unforch, this is my only bootable drive, and it's raising hell with things,
about 6 hardware reset initiated reboots so far today since 6:15 am. If it
persists I'll go see if Circuit City still has any pata drives left as this
mobo won't boot from a sata card.

>Gene Heskett wrote:
>> On Monday 28 January 2008, Peter Zijlstra wrote:
>>> On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote:
>>>> 1. Wrong mailing list; use linux-ide (@vger) instead.
>>>
>>> What, and keep all us other interested people in the dark?
>>
>> As a test, I tried rebooting to the latest fedora kernel and found it
>> kills X, so I'm back to the second to last fedora version ATM, and the
>> third 'smartctl -t long /dev/sda' in 24 hours is running now. The first
>> two completed with no errors.
>>
>> I've added the linux-ide list to refresh those people of the problem,
>> the logs are being spammed by this message stanza:
>>
>> Jan 28 04:46:25 coyote kernel: [26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> Jan 28 04:46:25 coyote kernel: [26550.290028] ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out
>> Jan 28 04:46:25 coyote kernel: [26550.290029] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>> Jan 28 04:46:25 coyote kernel: [26550.290032] ata1.00: status: { DRDY }
>> Jan 28 04:46:25 coyote kernel: [26550.290060] ata1: soft resetting link
>> Jan 28 04:46:25 coyote kernel: [26550.452301] ata1.00: configured for UDMA/100
>> Jan 28 04:46:25 coyote kernel: [26550.452318] ata1: EH complete
>> Jan 28 04:46:25 coyote kernel: [26550.455898] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
>> Jan 28 04:46:25 coyote kernel: [26550.456151] sd 0:0:0:0: [sda] Write Protect is off
>> Jan 28 04:46:25 coyote kernel: [26550.456403] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>>
>>
>> And it just did it again, using the fedora kernel but without logging
>> anything at all when it froze. In other words I had to reboot between
>> the word list and the word to above. So now I'm booted to 2.6.24-rc7.
>>
>> Before it crashes again, here is the dmesg:
>> [ 0.000000] Linux version 2.6.24-rc7 ([email protected]) (gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)) #1 SMP Mon Jan 14 10:00:40 EST 2008
>> [ 0.000000] BIOS-provided physical RAM map:
>> [ 0.000000] BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
>> [ 0.000000] BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
>> [ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
>> [ 0.000000] BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
>> [ 0.000000] BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
>> [ 0.000000] BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
>> [ 0.000000] BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
>> [ 0.000000] BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
>> [ 0.000000] BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
>> [ 0.000000] 127MB HIGHMEM available.
>> [ 0.000000] 896MB LOWMEM available.
>> [ 0.000000] Entering add_active_range(0, 0, 262128) 0 entries of 256 used
>> [ 0.000000] Zone PFN ranges:
>> [ 0.000000] DMA 0 -> 4096
>> [ 0.000000] Normal 4096 -> 229376
>> [ 0.000000] HighMem 229376 -> 262128
>> [ 0.000000] Movable zone start PFN for each node
>> [ 0.000000] early_node_map[1] active PFN ranges
>> [ 0.000000] 0: 0 -> 262128
>> [ 0.000000] On node 0 totalpages: 262128
>> [ 0.000000] DMA zone: 32 pages used for memmap
>> [ 0.000000] DMA zone: 0 pages reserved
>> [ 0.000000] DMA zone: 4064 pages, LIFO batch:0
>> [ 0.000000] Normal zone: 1760 pages used for memmap
>> [ 0.000000] Normal zone: 223520 pages, LIFO batch:31
>> [ 0.000000] HighMem zone: 255 pages used for memmap
>> [ 0.000000] HighMem zone: 32497 pages, LIFO batch:7
>> [ 0.000000] Movable zone: 0 pages used for memmap
>> [ 0.000000] DMI 2.2 present.
>> [ 0.000000] ACPI: RSDP 000F7220, 0014 (r0 Nvidia)
>> [ 0.000000] ACPI: RSDT 3FFF3000, 002C (r1 Nvidia AWRDACPI 42302E31 AWRD 0)
>> [ 0.000000] ACPI: FACP 3FFF3040, 0074 (r1 Nvidia AWRDACPI 42302E31 AWRD 0)
>> [ 0.000000] ACPI: DSDT 3FFF30C0, 4CC4 (r1 NVIDIA AWRDACPI 1000 MSFT 100000E)
>> [ 0.000000] ACPI: FACS 3FFF0000, 0040
>> [ 0.000000] ACPI: APIC 3FFF7DC0, 006E (r1 Nvidia AWRDACPI 42302E31 AWRD 0)
>> [ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
>> [ 0.000000] If you got timer trouble try acpi_use_timer_override
>> [ 0.000000] ACPI: PM-Timer IO Port: 0x4008
>> [ 0.000000] ACPI: Local APIC address 0xfee00000
>> [ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
>> [ 0.000000] Processor #0 6:10 APIC version 16
>> [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
>> [ 0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
>> [ 0.000000] IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-23
>> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
>> [ 0.000000] ACPI: BIOS IRQ0 pin2 override ignored.
>> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
>> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)
>> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)
>> [ 0.000000] ACPI: IRQ9 used by override.
>> [ 0.000000] ACPI: IRQ14 used by override.
>> [ 0.000000] ACPI: IRQ15 used by override.
>> [ 0.000000] Enabling APIC mode: Flat. Using 1 I/O APICs
>> [ 0.000000] Using ACPI (MADT) for SMP configuration information
>> [ 0.000000] Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000)
>> [ 0.000000] swsusp: Registered nosave memory region: 000000000009f000 - 00000000000a0000
>> [ 0.000000] swsusp: Registered nosave memory region: 00000000000a0000 - 00000000000f0000
>> [ 0.000000] swsusp: Registered nosave memory region: 00000000000f0000 - 0000000000100000
>> [ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 260081
>> [ 0.000000] Kernel command line: ro root=/dev/VolGroup00/LogVol00 rhgb quiet
>> [ 0.000000] mapped APIC to ffffb000 (fee00000)
>> [ 0.000000] mapped IOAPIC to ffffa000 (fec00000)
>> [ 0.000000] Enabling fast FPU save and restore... done.
>> [ 0.000000] Enabling unmasked SIMD FPU exception support... done.
>> [ 0.000000] Initializing CPU#0
>> [ 0.000000] CPU 0 irqstacks, hard=c073a000 soft=c071a000
>> [ 0.000000] PID hash table entries: 4096 (order: 12, 16384 bytes)
>> [ 0.000000] Detected 2079.551 MHz processor.
>> [ 28.725256] Console: colour VGA+ 80x25
>> [ 28.725259] console [tty0] enabled
>> [ 28.725828] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
>> [ 28.726361] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
>> [ 28.756701] Memory: 1031116k/1048512k available (1938k kernel code, 16656k reserved, 967k data, 236k init, 131008k highmem)
>> [ 28.756710] virtual kernel memory layout:
>> [ 28.756711] fixmap : 0xffc55000 - 0xfffff000 (3752 kB)
>> [ 28.756713] pkmap : 0xff800000 - 0xffc00000 (4096 kB)
>> [ 28.756714] vmalloc : 0xf8800000 - 0xff7fe000 ( 111 MB)
>> [ 28.756715] lowmem : 0xc0000000 - 0xf8000000 ( 896 MB)
>> [ 28.756716] .init : 0xc06dc000 - 0xc0717000 ( 236 kB)
>> [ 28.756718] .data : 0xc05e4944 - 0xc06d66e4 ( 967 kB)
>> [ 28.756719] .text : 0xc0400000 - 0xc05e4944 (1938 kB)
>> [ 28.756722] Checking if this processor honours the WP bit even in supervisor mode... Ok.
>> [ 28.756770] SLUB: Genslabs=11, HWalign=32, Order=0-1, MinObjects=4, CPUs=1, Nodes=1
>> [ 28.816731] Calibrating delay using timer specific routine.. 4160.90 BogoMIPS (lpj=2080452)
>> [ 28.816763] Security Framework initialized
>> [ 28.816770] SELinux: Initializing.
>> [ 28.816784] SELinux: Starting in permissive mode
>> [ 28.816797] selinux_register_security: Registering secondary module capability
>> [ 28.816800] Capability LSM initialized as secondary
>> [ 28.816809] Mount-cache hash table entries: 512
>> [ 28.816976] CPU: After generic identify, caps: 0383fbff c1c3fbff 00000000 00000000 00000000 00000000 00000000 00000000
>> [ 28.816985] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
>> [ 28.816987] CPU: L2 Cache: 512K (64 bytes/line)
>> [ 28.816990] CPU: After all inits, caps: 0383fbff c1c3fbff 00000000 00000420 00000000 00000000 00000000 00000000
>> [ 28.816996] Intel machine check architecture supported.
>> [ 28.816998] Intel machine check reporting enabled on CPU#0.
>> [ 28.817003] Compat vDSO mapped to ffffe000.
>> [ 28.817017] Checking 'hlt' instruction... OK.
>> [ 28.820895] SMP alternatives: switching to UP code
>> [ 28.821401] Freeing SMP alternatives: 12k freed
>> [ 28.821404] ACPI: Core revision 20070126
>> [ 28.824590] CPU0: AMD Athlon(tm) XP 2800+ stepping 00
>> [ 28.824614] Total of 1 processors activated (4160.90 BogoMIPS).
>> [ 28.824820] ENABLING IO-APIC IRQs
>> [ 28.825012] ..TIMER: vector=0x31 apic1=0 pin1=0 apic2=-1 pin2=-1
>> [ 28.936680] Brought up 1 CPUs
>> [ 28.936708] CPU0 attaching sched-domain:
>> [ 28.936711] domain 0: span 00000001
>> [ 28.936713] groups: 00000001
>> [ 28.936925] net_namespace: 64 bytes
>> [ 28.937409] Time: 12:43:09 Date: 01/28/08
>> [ 28.937442] NET: Registered protocol family 16
>> [ 28.937683] ACPI: bus type pci registered
>> [ 28.972986] PCI: PCI BIOS revision 2.10 entry at 0xfb4c0, last bus=2
>> [ 28.972989] PCI: Using configuration type 1
>> [ 28.972991] Setting up standard PCI resources
>> [ 28.980763] ACPI: EC: Look up EC in DSDT
>> [ 28.986590] ACPI: Interpreter enabled
>> [ 28.986593] ACPI: (supports S0 S1 S4 S5)
>> [ 28.986608] ACPI: Using IOAPIC for interrupt routing
>> [ 28.997079] ACPI: PCI Root Bridge [PCI0] (0000:00)
>> [ 28.997157] PCI: nForce2 C1 Halt Disconnect fixup
>> [ 28.998175] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
>> [ 28.998355] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB0._PRT]
>> [ 28.998631] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.AGPB._PRT]
>> [ 29.054757] ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 6 7 10 *11 12 14 15)
>> [ 29.054952] ACPI: PCI Interrupt Link [LNK2] (IRQs 3 4 5 6 7 10 *11 12 14 15)
>> [ 29.055144] ACPI: PCI Interrupt Link [LNK3] (IRQs 3 4 *5 6 7 10 11 12 14 15)
>> [ 29.055334] ACPI: PCI Interrupt Link [LNK4] (IRQs 3 4 *5 6 7 10 11 12 14 15)
>> [ 29.055529] ACPI: PCI Interrupt Link [LNK5] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
>> [ 29.055724] ACPI: PCI Interrupt Link [LUBA] (IRQs 3 4 *5 6 7 10 11 12 14 15)
>> [ 29.055918] ACPI: PCI Interrupt Link [LUBB] (IRQs 3 4 5 6 7 10 11 *12 14 15)
>> [ 29.056109] ACPI: PCI Interrupt Link [LMAC] (IRQs 3 4 5 6 7 10 11 *12 14 15)
>> [ 29.056298] ACPI: PCI Interrupt Link [LAPU] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
>> [ 29.056489] ACPI: PCI Interrupt Link [LACI] (IRQs 3 4 5 6 7 10 11 *12 14 15)
>> [ 29.056685] ACPI: PCI Interrupt Link [LMCI] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
>> [ 29.056876] ACPI: PCI Interrupt Link [LSMB] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
>> [ 29.057066] ACPI: PCI Interrupt Link [LUB2] (IRQs 3 4 5 6 7 10 *11 12 14 15)
>> [ 29.057258] ACPI: PCI Interrupt Link [LFIR] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
>> [ 29.057448] ACPI: PCI Interrupt Link [L3CM] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
>> [ 29.057642] ACPI: PCI Interrupt Link [LIDE] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
>> [ 29.057803] ACPI: PCI Interrupt Link [APC1] (IRQs *16)
>> [ 29.057951] ACPI: PCI Interrupt Link [APC2] (IRQs *17)
>> [ 29.058099] ACPI: PCI Interrupt Link [APC3] (IRQs *18)
>> [ 29.058246] ACPI: PCI Interrupt Link [APC4] (IRQs *19)
>> [ 29.058402] ACPI: PCI Interrupt Link [APC5] (IRQs *16), disabled.
>> [ 29.058617] ACPI: PCI Interrupt Link [APCF] (IRQs 20 21 22) *0
>> [ 29.058827] ACPI: PCI Interrupt Link [APCG] (IRQs 20 21 22) *0
>> [ 29.059036] ACPI: PCI Interrupt Link [APCH] (IRQs 20 21 22) *0
>> [ 29.059245] ACPI: PCI Interrupt Link [APCI] (IRQs 20 21 22) *0, disabled.
>> [ 29.059454] ACPI: PCI Interrupt Link [APCJ] (IRQs 20 21 22) *0
>> [ 29.059669] ACPI: PCI Interrupt Link [APCK] (IRQs 20 21 22) *0, disabled.
>> [ 29.059819] ACPI: PCI Interrupt Link [APCS] (IRQs *23), disabled.
>> [ 29.060028] ACPI: PCI Interrupt Link [APCL] (IRQs 20 21 22) *0
>> [ 29.060236] ACPI: PCI Interrupt Link [APCM] (IRQs 20 21 22) *0, disabled.
>> [ 29.060446] ACPI: PCI Interrupt Link [AP3C] (IRQs 20 21 22) *0, disabled.
>> [ 29.060661] ACPI: PCI Interrupt Link [APCZ] (IRQs 20 21 22) *0, disabled.
>> [ 29.060822] ACPI: Power Resource [ISAV] (on)
>> [ 29.060880] Linux Plug and Play Support v0.97 (c) Adam Belay
>> [ 29.060917] pnp: PnP ACPI init
>> [ 29.060926] ACPI: bus type pnp registered
>> [ 29.066989] pnp: PnP ACPI: found 16 devices
>> [ 29.066992] ACPI: ACPI bus type pnp unregistered
>> [ 29.067179] usbcore: registered new interface driver usbfs
>> [ 29.067257] usbcore: registered new interface driver hub
>> [ 29.067309] usbcore: registered new device driver usb
>> [ 29.067395] PCI: Using ACPI for IRQ routing
>> [ 29.067399] PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
>> [ 29.117453] NetLabel: Initializing
>> [ 29.117455] NetLabel: domain hash size = 128
>> [ 29.117457] NetLabel: protocols = UNLABELED CIPSOv4
>> [ 29.117471] NetLabel: unlabeled traffic allowed by default
>> [ 29.118443] Time: tsc clocksource has been installed.
>> [ 29.120481] system 00:00: ioport range 0x4000-0x407f has been reserved
>> [ 29.120484] system 00:00: ioport range 0x4080-0x40ff has been reserved
>> [ 29.120487] system 00:00: ioport range 0x4400-0x447f has been reserved
>> [ 29.120490] system 00:00: ioport range 0x4480-0x44ff has been reserved
>> [ 29.120492] system 00:00: ioport range 0x4200-0x427f has been reserved
>> [ 29.120495] system 00:00: ioport range 0x4280-0x42ff has been reserved
>> [ 29.120502] system 00:01: ioport range 0x5000-0x503f has been reserved
>> [ 29.120505] system 00:01: ioport range 0x5100-0x513f has been reserved
>> [ 29.120511] system 00:02: iomem range 0xda800-0xdbfff has been reserved
>> [ 29.120514] system 00:02: iomem range 0xf0000-0xf7fff could not be reserved
>> [ 29.120516] system 00:02: iomem range 0xf8000-0xfbfff could not be reserved
>> [ 29.120519] system 00:02: iomem range 0xfc000-0xfffff could not be reserved
>> [ 29.120522] system 00:02: iomem range 0x3fff0000-0x3fffffff could not be reserved
>> [ 29.120525] system 00:02: iomem range 0xffff0000-0xffffffff could not be reserved
>> [ 29.120528] system 00:02: iomem range 0x0-0x9ffff could not be reserved
>> [ 29.120531] system 00:02: iomem range 0x100000-0x3ffeffff could not be reserved
>> [ 29.120534] system 00:02: iomem range 0xfec00000-0xfec00fff could not be reserved
>> [ 29.120537] system 00:02: iomem range 0xfee00000-0xfee00fff could not be reserved
>> [ 29.120543] system 00:04: ioport range 0xb78-0xb7b has been reserved
>> [ 29.120546] system 00:04: ioport range 0xf78-0xf7b has been reserved
>> [ 29.120548] system 00:04: ioport range 0xa78-0xa7b has been reserved
>> [ 29.120551] system 00:04: ioport range 0xe78-0xe7b has been reserved
>> [ 29.120554] system 00:04: ioport range 0xbbc-0xbbf has been reserved
>> [ 29.120556] system 00:04: ioport range 0xfbc-0xfbf has been reserved
>> [ 29.120559] system 00:04: ioport range 0x4d0-0x4d1 has been reserved
>> [ 29.120562] system 00:04: ioport range 0x294-0x297 has been reserved
>> [ 29.151040] PCI: Bridge: 0000:00:08.0
>> [ 29.151044] IO window: 9000-afff
>> [ 29.151049] MEM window: e3000000-e6ffffff
>> [ 29.151053] PREFETCH window: 50000000-500fffff
>> [ 29.151058] PCI: Bridge: 0000:00:1e.0
>> [ 29.151059] IO window: disabled.
>> [ 29.151063] MEM window: e0000000-e2ffffff
>> [ 29.151066] PREFETCH window: d0000000-dfffffff
>> [ 29.151077] PCI: Setting latency timer of device 0000:00:08.0 to 64
>> [ 29.151093] NET: Registered protocol family 2
>> [ 29.160585] IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
>> [ 29.160952] TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
>> [ 29.162429] TCP bind hash table entries: 65536 (order: 7, 524288 bytes)
>> [ 29.163092] TCP: Hash tables configured (established 131072 bind 65536)
>> [ 29.163095] TCP reno registered
>> [ 29.165574] checking if image is initramfs... it is
>> [ 29.446295] Freeing initrd memory: 3628k freed
>> [ 29.446709] apm: BIOS version 1.2 Flags 0x07 (Driver version 1.16ac)
>> [ 29.446712] apm: overridden by ACPI.
>> [ 29.447133] audit: initializing netlink socket (disabled)
>> [ 29.447149] audit(1201524188.569:1): initialized
>> [ 29.447287] highmem bounce pool size: 64 pages
>> [ 29.447291] Total HugeTLB memory allocated, 0
>> [ 29.449941] SELinux: Registering netfilter hooks
>> [ 29.450082] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254)
>> [ 29.450086] io scheduler noop registered
>> [ 29.450088] io scheduler anticipatory registered
>> [ 29.450090] io scheduler deadline registered
>> [ 29.450101] io scheduler cfq registered (default)
>> [ 29.472109] Boot video device is 0000:02:00.0
>> [ 29.477398] ACPI: Thermal Zone [THRM] (51 C)
>> [ 29.477413] isapnp: Scanning for PnP cards...
>> [ 29.650914] Switched to high resolution mode on CPU 0
>> [ 29.834322] isapnp: No Plug & Play device found
>> [ 29.837157] Real Time Clock Driver v1.12ac
>> [ 29.837309] Non-volatile memory driver v1.2
>> [ 29.837312] Linux agpgart interface v0.102
>> [ 29.837365] agpgart: Detected NVIDIA nForce2 chipset
>> [ 29.853228] agpgart: AGP aperture is 256M @ 0xc0000000
>> [ 29.853255] Serial: 8250/16550 driver $Revision: 1.90 $ 2 ports, IRQ sharing disabled
>> [ 29.853403] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
>> [ 29.853542] serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
>> [ 29.853854] 00:0a: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
>> [ 29.854037] 00:0b: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
>> [ 29.855000] RAMDISK driver initialized: 16 RAM disks of 16384K size 4096 blocksize
>> [ 29.855188] PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1
>> [ 29.855191] PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
>> [ 29.855565] serio: i8042 KBD port at 0x60,0x64 irq 1
>> [ 29.855638] mice: PS/2 mouse device common for all mice
>> [ 29.876081] input: AT Translated Set 2 keyboard as /class/input/input0
>> [ 29.878901] cpuidle: using governor ladder
>> [ 29.878904] cpuidle: using governor menu
>> [ 29.878982] usbcore: registered new interface driver hiddev
>> [ 29.879024] usbcore: registered new interface driver usbhid
>> [ 29.879027] drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver
>> [ 29.879098] TCP cubic registered
>> [ 29.879100] Initializing XFRM netlink socket
>> [ 29.879180] NET: Registered protocol family 1
>> [ 29.879196] NET: Registered protocol family 17
>> [ 29.879204] Using IPI No-Shortcut mode
>> [ 29.879217] registered taskstats version 1
>> [ 29.879349] Magic number: 8:30:735
>> [ 29.879657] Freeing unused kernel memory: 236k freed
>> [ 29.879695] Write protecting the kernel text: 1940k
>> [ 29.879708] Write protecting the kernel read-only data: 758k
>> [ 30.175117] ACPI: PCI Interrupt Link [APCL] enabled at IRQ 22
>> [ 30.175126] ACPI: PCI Interrupt 0000:00:02.2[C] -> Link [APCL] -> GSI 22 (level, high) -> IRQ 16
>> [ 30.175139] PCI: Setting latency timer of device 0000:00:02.2 to 64
>> [ 30.175143] ehci_hcd 0000:00:02.2: EHCI Host Controller
>> [ 30.175235] ehci_hcd 0000:00:02.2: new USB bus registered, assigned bus number 1
>> [ 30.175275] ehci_hcd 0000:00:02.2: debug port 1
>> [ 30.175280] PCI: cache line size of 64 is not supported by device 0000:00:02.2
>> [ 30.175291] ehci_hcd 0000:00:02.2: irq 16, io mem 0xe7005000
>> [ 30.180677] ehci_hcd 0000:00:02.2: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
>> [ 30.180795] usb usb1: configuration #1 chosen from 1 choice
>> [ 30.180823] hub 1-0:1.0: USB hub found
>> [ 30.180834] hub 1-0:1.0: 6 ports detected
>> [ 30.287626] ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver
>> [ 30.288031] ACPI: PCI Interrupt Link [APCF] enabled at IRQ 21
>> [ 30.288038] ACPI: PCI Interrupt 0000:00:02.0[A] -> Link [APCF] -> GSI 21 (level, high) -> IRQ 17
>> [ 30.288052] PCI: Setting latency timer of device 0000:00:02.0 to 64
>> [ 30.288055] ohci_hcd 0000:00:02.0: OHCI Host Controller
>> [ 30.288129] ohci_hcd 0000:00:02.0: new USB bus registered, assigned bus number 2
>> [ 30.288148] ohci_hcd 0000:00:02.0: irq 17, io mem 0xe7003000
>> [ 30.340664] usb usb2: configuration #1 chosen from 1 choice
>> [ 30.340691] hub 2-0:1.0: USB hub found
>> [ 30.340704] hub 2-0:1.0: 3 ports detected
>> [ 30.441860] ACPI: PCI Interrupt Link [APCG] enabled at IRQ 20
>> [ 30.441865] ACPI: PCI Interrupt 0000:00:02.1[B] -> Link [APCG] -> GSI 20 (level, high) -> IRQ 18
>> [ 30.441873] PCI: Setting latency timer of device 0000:00:02.1 to 64
>> [ 30.441876] ohci_hcd 0000:00:02.1: OHCI Host Controller
>> [ 30.441932] ohci_hcd 0000:00:02.1: new USB bus registered, assigned bus number 3
>> [ 30.441945] ohci_hcd 0000:00:02.1: irq 18, io mem 0xe7004000
>> [ 30.487468] usb 1-1: new high speed USB device using ehci_hcd and address 2
>> [ 30.494540] usb usb3: configuration #1 chosen from 1 choice
>> [ 30.494569] hub 3-0:1.0: USB hub found
>> [ 30.494579] hub 3-0:1.0: 3 ports detected
>> [ 30.601427] USB Universal Host Controller Interface driver v3.0
>> [ 30.601865] usb 1-1: configuration #1 chosen from 1 choice
>> [ 30.602052] hub 1-1:1.0: USB hub found
>> [ 30.602151] hub 1-1:1.0: 4 ports detected
>> [ 30.660576] SCSI subsystem initialized
>> [ 30.673851] Driver 'sd' needs updating - please use bus_type methods
>> [ 30.700319] libata version 3.00 loaded.
>> [ 30.702887] pata_amd 0000:00:09.0: version 0.3.10
>> [ 30.703052] PCI: Setting latency timer of device 0000:00:09.0 to 64
>> [ 30.703188] scsi0 : pata_amd
>> [ 30.709313] scsi1 : pata_amd
>> [ 30.710076] ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14
>> [ 30.710079] ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15
>> [ 30.864753] ata1.00: ATA-6: WDC WD2000JB-00EVA0, 15.05R15, max UDMA/100
>> [ 30.864756] ata1.00: 390721968 sectors, multi 16: LBA48
>> [ 30.871629] ata1.00: configured for UDMA/100
>> [ 31.195305] ata2.00: ATAPI: LITE-ON DVDRW SHM-165H6S, HS06, max UDMA/66
>> [ 31.243813] ata2.01: ATA-7: MAXTOR STM3320620A, 3.AAE, max UDMA/100
>> [ 31.243816] ata2.01: 625142448 sectors, multi 16: LBA48
>> [ 31.243825] ata2.00: limited to UDMA/33 due to 40-wire cable
>> [ 31.417074] ata2.00: configured for UDMA/33
>> [ 31.451769] ata2.01: configured for UDMA/100
>> [ 31.451873] scsi 0:0:0:0: Direct-Access ATA WDC WD2000JB-00E 15.0 PQ: 0 ANSI: 5
>> [ 31.451953] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
>> [ 31.451967] sd 0:0:0:0: [sda] Write Protect is off
>> [ 31.451970] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
>> [ 31.451989] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> [ 31.452040] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
>> [ 31.452051] sd 0:0:0:0: [sda] Write Protect is off
>> [ 31.452054] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
>> [ 31.452071] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> [ 31.452075] sda: sda1 sda2
>> [ 31.467219] sd 0:0:0:0: [sda] Attached SCSI disk
>> [ 31.468093] scsi 1:0:0:0: CD-ROM LITE-ON DVDRW SHM-165H6S HS06 PQ: 0 ANSI: 5
>> [ 31.468208] scsi 1:0:1:0: Direct-Access ATA MAXTOR STM332062 3.AA PQ: 0 ANSI: 5
>> [ 31.468272] sd 1:0:1:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
>> [ 31.468283] sd 1:0:1:0: [sdb] Write Protect is off
>> [ 31.468286] sd 1:0:1:0: [sdb] Mode Sense: 00 3a 00 00
>> [ 31.468303] sd 1:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> [ 31.468338] sd 1:0:1:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
>> [ 31.468349] sd 1:0:1:0: [sdb] Write Protect is off
>> [ 31.468352] sd 1:0:1:0: [sdb] Mode Sense: 00 3a 00 00
>> [ 31.468370] sd 1:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> [ 31.468373] sdb:<6>usb 2-2: new full speed USB device using ohci_hcd and address 2
>> [ 31.499690] sdb1 sdb2 sdb3
>> [ 31.500119] sd 1:0:1:0: [sdb] Attached SCSI disk
>> [ 31.637428] usb 2-2: configuration #1 chosen from 1 choice
>> [ 31.856522] usb 2-3: new low speed USB device using ohci_hcd and address 3
>> [ 32.020045] usb 2-3: configuration #1 chosen from 1 choice
>> [ 32.035222] input: Chicony Saitek Eclipse II Keyboard as /class/input/input1
>> [ 32.038424] input,hidraw0: USB HID v1.11 Keyboard [Chicony Saitek Eclipse II Keyboard] on usb-0000:00:02.0-3
>> [ 32.067995] input: Chicony Saitek Eclipse II Keyboard as /class/input/input2
>> [ 32.070422] input,hiddev96,hidraw1: USB HID v1.11 Device [Chicony Saitek Eclipse II Keyboard] on usb-0000:00:02.0-3
>> [ 32.287225] usb 3-3: new full speed USB device using ohci_hcd and address 2
>> [ 32.422699] usb 3-3: configuration #1 chosen from 1 choice
>> [ 32.425658] hub 3-3:1.0: USB hub found
>> [ 32.428631] hub 3-3:1.0: 4 ports detected
>> [ 32.724000] usb 1-1.1: new low speed USB device using ehci_hcd and address 6
>> [ 32.919001] usb 1-1.1: configuration #1 chosen from 1 choice
>> [ 33.655893] hiddev97hidraw2: USB HID v1.11 Device [Belkin Belkin UPS] on usb-0000:00:02.2-1.1
>> [ 33.833315] usb 1-1.2: new full speed USB device using ehci_hcd and address 7
>> [ 33.925926] usb 1-1.2: configuration #1 chosen from 1 choice
>> [ 34.018028] device-mapper: ioctl: 4.12.0-ioctl (2007-10-02) initialised: [email protected]
>> [ 34.043070] sata_sil 0000:01:0a.0: version 2.3
>> [ 34.043348] ACPI: PCI Interrupt Link [APC1] enabled at IRQ 16
>> [ 34.043355] ACPI: PCI Interrupt 0000:01:0a.0[A] -> Link [APC1] -> GSI 16 (level, high) -> IRQ 19
>> [ 34.045029] scsi2 : sata_sil
>> [ 34.050031] scsi3 : sata_sil
>> [ 34.050064] ata3: SATA max UDMA/100 mmio m512@0xe6004000 tf 0xe6004080 irq 19
>> [ 34.050068] ata4: SATA max UDMA/100 mmio m512@0xe6004000 tf 0xe60040c0 irq 19
>> [ 34.107056] usb 1-1.4: new high speed USB device using ehci_hcd and address 8
>> [ 34.192310] usb 1-1.4: configuration #1 chosen from 1 choice
>> [ 34.192499] hub 1-1.4:1.0: USB hub found
>> [ 34.192597] hub 1-1.4:1.0: 4 ports detected
>> [ 34.352811] ata3: SATA link down (SStatus 0 SControl 310)
>> [ 34.481823] usb 1-1.4.3: new low speed USB device using ehci_hcd and address 9
>> [ 34.573676] usb 1-1.4.3: configuration #1 chosen from 1 choice
>> [ 34.577917] input: Logitech USB Receiver as /class/input/input3
>> [ 34.580691] input,hidraw3: USB HID v1.10 Mouse [Logitech USB Receiver] on usb-0000:00:02.2-1.4.3
>> [ 34.757557] usb 1-1.4.4: new full speed USB device using ehci_hcd and address 10
>> [ 34.806514] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
>> [ 34.810071] ata4.00: ATA-7: Hitachi HDT725040VLA360, V5COA7EA, max UDMA/133
>> [ 34.810074] ata4.00: 781422768 sectors, multi 16: LBA48 NCQ (depth 0/32)
>> [ 34.816059] ata4.00: configured for UDMA/100
>> [ 34.816175] scsi 3:0:0:0: Direct-Access ATA Hitachi HDT72504 V5CO PQ: 0 ANSI: 5
>> [ 34.816257] sd 3:0:0:0: [sdc] 781422768 512-byte hardware sectors (400088 MB)
>> [ 34.816271] sd 3:0:0:0: [sdc] Write Protect is off
>> [ 34.816274] sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
>> [ 34.816293] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> [ 34.816344] sd 3:0:0:0: [sdc] 781422768 512-byte hardware sectors (400088 MB)
>> [ 34.816355] sd 3:0:0:0: [sdc] Write Protect is off
>> [ 34.816358] sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
>> [ 34.816375] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> [ 34.816379] sdc: sdc1
>> [ 34.829837] sd 3:0:0:0: [sdc] Attached SCSI disk
>> [ 34.843933] usb 1-1.4.4: configuration #1 chosen from 1 choice
>> [ 37.852458] EXT3-fs: INFO: recovery required on readonly filesystem.
>> [ 37.852464] EXT3-fs: write access will be enabled during recovery.
>> [ 41.522866] kjournald starting. Commit interval 5 seconds
>> [ 41.522885] EXT3-fs: recovery complete.
>> [ 41.524254] EXT3-fs: mounted filesystem with ordered data mode.
>> [ 41.972449] audit(1201524201.103:2): enforcing=1 old_enforcing=0
>> auid=4294967295 [ 42.187011] SELinux:8192 avtab hash slots allocated.Num
>> of rules:213166 [ 42.260611] SELinux:8192 avtab hash slots allocated.Num
>> of rules:213166 [ 42.314117] security: 8 users, 11 roles, 2363 types,
>> 114 bools, 1 sens, 1024 cats [ 42.314122] security: 67 classes, 213166
>> rules
>> [ 42.327287] SELinux: Completing initialization.
>> [ 42.327290] SELinux: Setting up existing superblocks.
>> [ 42.353550] SELinux: initialized (dev dm-0, type ext3), uses xattr
>> [ 42.515071] SELinux: initialized (dev usbfs, type usbfs), uses
>> genfs_contexts [ 42.515088] SELinux: initialized (dev tmpfs, type
>> tmpfs), uses transition SIDs [ 42.515193] SELinux: initialized (dev
>> debugfs, type debugfs), uses genfs_contexts [ 42.515205] SELinux:
>> initialized (dev selinuxfs, type selinuxfs), uses genfs_contexts [
>> 42.515235] SELinux: initialized (dev mqueue, type mqueue), uses transition
>> SIDs [ 42.515244] SELinux: initialized (dev hugetlbfs, type hugetlbfs),
>> uses genfs_contexts [ 42.515250] SELinux: initialized (dev devpts, type
>> devpts), uses transition SIDs [ 42.515262] SELinux: initialized (dev
>> inotifyfs, type inotifyfs), uses genfs_contexts [ 42.515266] SELinux:
>> initialized (dev tmpfs, type tmpfs), uses transition SIDs [ 42.515274]
>> SELinux: initialized (dev futexfs, type futexfs), uses genfs_contexts [
>> 42.515280] SELinux: initialized (dev anon_inodefs, type anon_inodefs),
>> uses genfs_contexts [ 42.515285] SELinux: initialized (dev pipefs, type
>> pipefs), uses task SIDs [ 42.515290] SELinux: initialized (dev sockfs,
>> type sockfs), uses task SIDs [ 42.515299] SELinux: initialized (dev
>> proc, type proc), uses genfs_contexts [ 42.515312] SELinux: initialized
>> (dev bdev, type bdev), uses genfs_contexts [ 42.515318] SELinux:
>> initialized (dev rootfs, type rootfs), uses genfs_contexts [ 42.515341]
>> SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts [
>> 42.520002] SELinux: policy loaded with handle_unknown=allow
>> [ 42.520011] audit(1201524201.651:3): policy loaded auid=4294967295
>> [ 46.528101] sd 0:0:0:0: Attached scsi generic sg0 type 0
>> [ 46.528126] scsi 1:0:0:0: Attached scsi generic sg1 type 5
>> [ 46.528149] sd 1:0:1:0: Attached scsi generic sg2 type 0
>> [ 46.528174] sd 3:0:0:0: Attached scsi generic sg3 type 0
>> [ 46.931288] input: Power Button (FF) as /class/input/input4
>> [ 46.938141] ACPI: Power Button (FF) [PWRF]
>> [ 46.938186] ACPI Error (evxfevnt-0186): Could not enable SleepButton
>> event [20070126] [ 46.938192] ACPI Warning (evxface-0145): Could not
>> enable fixed event 3 [20070126] [ 46.938283] input: Power Button (CM) as
>> /class/input/input5
>> [ 46.941132] ACPI: Power Button (CM) [PWRB]
>> [ 46.941190] input: Sleep Button (CM) as /class/input/input6
>> [ 46.944347] ACPI: Sleep Button (CM) [SLPB]
>> [ 47.285717] usblp0: USB Bidirectional printer dev 2 if 0 alt 0 proto 2
>> vid 0x04B8 pid 0x0005 [ 47.285742] usbcore: registered new interface
>> driver usblp
>> [ 47.308848] i2c-adapter i2c-0: nForce2 SMBus adapter at 0x5000
>> [ 47.308876] i2c-adapter i2c-1: nForce2 SMBus adapter at 0x5100
>> [ 47.352146] usbcore: registered new interface driver usbserial
>> [ 47.352152] drivers/usb/serial/usb-serial.c: USB Serial Driver core
>> [ 47.455275] drivers/usb/serial/usb-serial.c: USB Serial support
>> registered for FTDI USB Serial Device [ 47.455308] ftdi_sio 1-1.2:1.0:
>> FTDI USB Serial Device converter detected [ 47.455342]
>> drivers/usb/serial/ftdi_sio.c: Detected FT232RL
>> [ 47.455381] usb 1-1.2: FTDI USB Serial Device converter now attached to
>> ttyUSB0 [ 47.455398] usbcore: registered new interface driver ftdi_sio
>> [ 47.455401] drivers/usb/serial/ftdi_sio.c: v1.4.3:USB FTDI Serial
>> Converters Driver [ 47.530751] input: PC Speaker as /class/input/input7
>> [ 47.572764] forcedeth: Reverse Engineered nForce ethernet driver.
>> Version 0.61. [ 47.573147] ACPI: PCI Interrupt Link [APCH] enabled at
>> IRQ 22
>> [ 47.573151] ACPI: PCI Interrupt 0000:00:04.0[A] -> Link [APCH] -> GSI
>> 22 (level, high) -> IRQ 16 [ 47.573158] PCI: Setting latency timer of
>> device 0000:00:04.0 to 64 [ 47.615492] Floppy drive(s): fd0 is 1.44M
>> [ 47.630509] FDC 0 is a post-1991 82077
>> [ 48.084769] forcedeth 0000:00:04.0: ifname eth0, PHY OUI 0x20 @ 1, addr
>> 00:04:4b:5d:eb:7d [ 48.084775] forcedeth 0000:00:04.0: timirq lnktim
>> desc-v1
>> [ 48.085214] ACPI: PCI Interrupt Link [APC4] enabled at IRQ 19
>> [ 48.085222] ACPI: PCI Interrupt 0000:01:09.0[A] -> Link [APC4] -> GSI
>> 19 (level, high) -> IRQ 20 [ 48.141308] firewire_ohci: Added fw-ohci
>> device 0000:01:09.0, OHCI version 1.10 [ 48.285456] nvidia: module
>> license 'NVIDIA' taints kernel.
>> [ 48.549725] ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [APC4] -> GSI
>> 19 (level, high) -> IRQ 20 [ 48.550149] NVRM: loading NVIDIA UNIX x86
>> Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007 [ 48.641028]
>> firewire_core: created new fw device fw0 (0 config rom retries, S400) [
>> 48.848094] Linux video capture interface: v2.00
>> [ 49.104974] cx88/2: cx2388x MPEG-TS Driver Manager version 0.0.6 loaded
>> [ 49.105072] cx88[0]: subsystem: 7063:3000, board: pcHDTV HD3000 HDTV
>> [card=22,autodetected] [ 49.105076] cx88[0]: TV tuner type 60, Radio
>> tuner type -1
>> [ 49.151205] cx88/0: cx2388x v4l2 driver version 0.0.6 loaded
>> [ 49.255507] cx88[0]/2: cx2388x 8802 Driver Manager
>> [ 49.257461] ACPI: PCI Interrupt Link [APC2] enabled at IRQ 17
>> [ 49.257472] ACPI: PCI Interrupt 0000:01:07.2[A] -> Link [APC2] -> GSI
>> 17 (level, high) -> IRQ 21 [ 49.257484] cx88[0]/2: found at
>> 0000:01:07.2, rev: 5, irq: 21, latency: 32, mmio: 0xe4000000 [
>> 49.257561] ACPI: PCI Interrupt 0000:01:07.0[A] -> Link [APC2] -> GSI 17
>> (level, high) -> IRQ 21 [ 49.257573] cx88[0]/0: found at 0000:01:07.0,
>> rev: 5, irq: 21, latency: 32, mmio: 0xe3000000 [ 49.440179]
>> tda8290_probe: not probed - driver disabled by Kconfig [ 49.440185]
>> tuner 2-0043: chip found @ 0x86 (cx88[0])
>> [ 49.440208] tda9887 2-0043: tda988[5/6/7] found @ 0x43 (tuner)
>> [ 49.440211] tuner 2-0043: type set to tda9887
>> [ 49.442442] tuner 2-0061: chip found @ 0xc2 (cx88[0])
>> [ 49.442458] tuner-simple 2-0061: type set to 60 (Thomson DTT 761X
>> (ATSC/NTSC)) [ 49.442461] tuner 2-0061: type set to Thomson DTT 761X (A
>> [ 49.442464] tuner-simple 2-0061: type set to 60 (Thomson DTT 761X
>> (ATSC/NTSC)) [ 49.442466] tuner 2-0061: type set to Thomson DTT 761X (A
>> [ 49.451016] cx88[0]/0: registered device video0 [v4l2]
>> [ 49.451038] cx88[0]/0: registered device vbi0
>> [ 49.451064] cx88[0]/0: registered device radio0
>> [ 49.454722] ACPI: PCI Interrupt Link [APC3] enabled at IRQ 18
>> [ 49.454731] ACPI: PCI Interrupt 0000:01:08.0[A] -> Link [APC3] -> GSI
>> 18 (level, high) -> IRQ 22 [ 49.459555] Audigy2 value: Special config.
>> [ 49.532042] cx88/2: cx2388x dvb driver version 0.0.6 loaded
>> [ 49.532047] cx88/2: registering cx8802 driver, type: dvb access: shared
>> [ 49.532052] cx88[0]/2: subsystem: 7063:3000, board: pcHDTV HD3000 HDTV
>> [card=22] [ 49.532055] cx88[0]/2: cx2388x based DVB/ATSC card
>> [ 49.548119] ACPI: PCI Interrupt Link [APCJ] enabled at IRQ 21
>> [ 49.548125] ACPI: PCI Interrupt 0000:00:06.0[A] -> Link [APCJ] -> GSI
>> 21 (level, high) -> IRQ 17 [ 49.548162] PCI: Setting latency timer of
>> device 0000:00:06.0 to 64 [ 49.654575] DVB: registering new adapter
>> (cx88[0])
>> [ 49.654582] DVB: registering frontend 0 (Oren OR51132 VSB/QAM
>> Frontend)... [ 49.859126] intel8x0_measure_ac97_clock: measured 50668
>> usecs
>> [ 49.859131] intel8x0: clocking to 47378
>> [ 52.910654] EXT3 FS on dm-0, internal journal
>> [ 53.162621] kjournald starting. Commit interval 5 seconds
>> [ 53.170013] EXT3 FS on sda1, internal journal
>> [ 53.170019] EXT3-fs: mounted filesystem with ordered data mode.
>> [ 53.170144] SELinux: initialized (dev sda1, type ext3), uses xattr
>> [ 53.170540] SELinux: initialized (dev tmpfs, type tmpfs), uses
>> transition SIDs [ 53.174936] kjournald starting. Commit interval 5
>> seconds
>> [ 53.182987] EXT3 FS on sdc1, internal journal
>> [ 53.182992] EXT3-fs: mounted filesystem with ordered data mode.
>> [ 53.188827] SELinux: initialized (dev sdc1, type ext3), uses xattr
>> [ 54.005729] Adding 2031608k swap on /dev/mapper/VolGroup00-LogVol01.
>> Priority:-1 extents:1 across:2031608k [ 54.009417] SELinux: initialized
>> (dev binfmt_misc, type binfmt_misc), uses genfs_contexts



--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
"People should have access to the data which you have about them. There
should be a process for them to challenge any inaccuracies."
-- Arthur Miller

2008-01-28 17:07:05

by Dave Neuer

Subject: Re: Problem with ata layer in 2.6.24

On Jan 28, 2008 11:35 AM, Gene Heskett <[email protected]> wrote:
>
> On Monday 28 January 2008, Mikael Pettersson wrote:
> >
> >We have no way of debugging that module, so please try 2.6.24 without it.
>
> Sorry, I can't do this and have a working machine. The nv driver has suffered
> bit rot or something since the FC2 days when it COULD run a 19" crt at
> 1600x1200, and will not drive this 20" wide screen lcd 1680x1050 monitor at
> more than 800x600, which is absolutely butt ugly fuzzy, looking like a jpg
> compressed to 10%. The system is not usable on a day to basis without the
> nvidia driver.

I only have a 15.4" laptop screen, but it does 1650 x 1050 -- the
default w/ my Fedora 8 install -- with the nv driver no problem (I
change it because I have normal human eyesight rather than superhero
micro-vision).

>
> Fix the nv driver so it will run this screen at its native resolution and I'll
> be glad to run it even if it won't run google earth...

<snip>

> Again, fix the nv driver so it will run my screen & I'll be glad
> to switch.

Wow, 2 separate demands to a Linux-IDE developer to fix an X driver
for you. Pretty sure that the source is available to you as well.

>
> >If the problems persist, please try to capture a complete log from the
> >failing kernel -- the interesting bits are everything from initial boot
> >up to and including the first few errors. You may need to increase the
> >kernel's log buffer size if the log gets truncated (CONFIG_LOG_BUF_SHIFT).
>
> If by log you mean /var/log/messages, I have several megabytes of those.
> If you mean a live dmesg capture taken right now, its attached. It contains
> several of these at the bottom. I long ago made the kernel log buffer
> bigger, cuz it couldn't even show the start immediately after the boot, and
> even the dump to syslog was truncated.
>
> >There are no pata_amd changes from 2.6.24-rc7 to 2.6.24 final.
>
> That is what I was afraid of. I've done some limited grepping in that branch
> of the kernel tree, and cannot seem to locate where this EH handler is being
> invoked from.
>
> There is 2 lines of interest in the dmesg:
>
> [ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
> [ 0.000000] If you got timer trouble try acpi_use_timer_override
>
> But I have NDI what it means, kernel argument/xconfig option?

My own 15 seconds of googling informs me that it's a kernel
command-line option: put it on the command-line as-is.

>
> I've also done some googling, and it appears this problem is fairly widespread
> since the switchover to libata was encouraged. A stock fedora F8 kernel
> suffers the same freezes and eventually locks up, but does it without the
> error messages being logged, it just freezes, feeling identical to this in
> the minutes before the total freeze. I've tried 2 of those too, but the
> newest one won't even run X.

As I said, I run F8 w/ nv on my amd64 laptop, so I'm not sure what you
mean by the "won't even run X" part there.

Dave

2008-01-28 17:20:42

by Zan Lynx

Subject: Re: Problem with ata layer in 2.6.24


On Mon, 2008-01-28 at 11:50 -0500, Calvin Walton wrote:
> On Mon, 2008-01-28 at 11:35 -0500, Gene Heskett wrote:
> > On Monday 28 January 2008, Mikael Pettersson wrote:
> > >Unfortunately we also see:
> > > > [ 48.285456] nvidia: module license 'NVIDIA' taints kernel.
> > > > [ 48.549725] ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [APC4] -> GSI
> > > > 19 (level, high) -> IRQ 20 [ 48.550149] NVRM: loading NVIDIA UNIX x86
> > > > Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007
> > >
> > >We have no way of debugging that module, so please try 2.6.24 without it.
> >
> > Sorry, I can't do this and have a working machine. The nv driver has suffered
> > bit rot or something since the FC2 days when it COULD run a 19" crt at
> > 1600x1200, and will not drive this 20" wide screen lcd 1680x1050 monitor at
> > more than 800x600, which is absolutely butt ugly fuzzy, looking like a jpg
> > compressed to 10%. The system is not usable on a day to basis without the
> > nvidia driver.
>
> You should probably give the nouveau[1] driver a try, if only for
> testing purposes; if you are running an NV4x (G6x or G7x) card in
> particular, it works a lot better than the nv driver for 2d support.
>
> 1. http://nouveau.freedesktop.org/wiki/InstallNouveau

But nouveau is much less stable than nv. For testing purposes, go with
stable.

I'm not sure why it won't run his screen though. I can use nv to run a
1920x1200 laptop LCD. It *is* dog slow (although nouveau was not any
better with a NV17 / 440-Go -- render support for AA fonts seems to be
missing), but it does work.
--
Zan Lynx <[email protected]>


Attachments:
signature.asc (189.00 B)
This is a digitally signed message part

2008-01-28 17:31:35

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Zan Lynx wrote:
>On Mon, 2008-01-28 at 11:50 -0500, Calvin Walton wrote:
>> On Mon, 2008-01-28 at 11:35 -0500, Gene Heskett wrote:
>> > On Monday 28 January 2008, Mikael Pettersson wrote:
>> > >Unfortunately we also see:
>> > > > [ 48.285456] nvidia: module license 'NVIDIA' taints kernel.
>> > > > [ 48.549725] ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [APC4] ->
>> > > > GSI 19 (level, high) -> IRQ 20 [ 48.550149] NVRM: loading NVIDIA
>> > > > UNIX x86 Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007
>> > >
>> > >We have no way of debugging that module, so please try 2.6.24 without
>> > > it.
>> >
>> > Sorry, I can't do this and have a working machine. The nv driver has
>> > suffered bit rot or something since the FC2 days when it COULD run a 19"
>> > crt at 1600x1200, and will not drive this 20" wide screen lcd 1680x1050
>> > monitor at more than 800x600, which is absolutely butt ugly fuzzy,
>> > looking like a jpg compressed to 10%. The system is not usable on a day
>> > to basis without the nvidia driver.
>>
>> You should probably give the nouveau[1] driver a try, if only for
>> testing purposes; if you are running an NV4x (G6x or G7x) card in
>> particular, it works a lot better than the nv driver for 2d support.
>>
>> 1. http://nouveau.freedesktop.org/wiki/InstallNouveau
>
>But nouveau is much less stable than nv. For testing purposes, go with
>stable.
>
I believe at this point, it's moot. I captured quite a few instances of that
error message while rebooting the last time, all of which occurred long
before I logged in and did a startx (I boot to runlevel 3 here), so the
kernel was NOT tainted at that point. That dmesg has been posted and some
questions asked.

As this has gone on for a while, it seems to me that with 14,800 google hits
on this problem, Linus should call a halt until this is found and fixed. But
I'm not Linus. I'm also locking up for 30 at a time, & probably ready for
reboot #7 today.

>I'm not sure why it won't run his screen though. I can use nv to run a
>1920x1200 laptop LCD. It *is* dog slow (although nouveau was not any
>better with a NV17 / 440-Go -- render support for AA fonts seems to be
>missing), but it does work.



--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
There cannot be a crisis next week. My schedule is already full.
-- Henry Kissinger

2008-01-28 17:56:52

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Gene Heskett wrote:
>On Monday 28 January 2008, Zan Lynx wrote:
>>On Mon, 2008-01-28 at 11:50 -0500, Calvin Walton wrote:
>>> On Mon, 2008-01-28 at 11:35 -0500, Gene Heskett wrote:
>>> > On Monday 28 January 2008, Mikael Pettersson wrote:
>>> > >Unfortunately we also see:
>>> > > > [ 48.285456] nvidia: module license 'NVIDIA' taints kernel.
>>> > > > [ 48.549725] ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [APC4]
>>> > > > -> GSI 19 (level, high) -> IRQ 20 [ 48.550149] NVRM: loading
>>> > > > NVIDIA UNIX x86 Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007
>>> > >
>>> > >We have no way of debugging that module, so please try 2.6.24 without
>>> > > it.
>>> >
>>> > Sorry, I can't do this and have a working machine. The nv driver has
>>> > suffered bit rot or something since the FC2 days when it COULD run a
>>> > 19" crt at 1600x1200, and will not drive this 20" wide screen lcd
>>> > 1680x1050 monitor at more than 800x600, which is absolutely butt ugly
>>> > fuzzy, looking like a jpg compressed to 10%. The system is not usable
>>> > on a day to basis without the nvidia driver.
>>>
>>> You should probably give the nouveau[1] driver a try, if only for
>>> testing purposes; if you are running an NV4x (G6x or G7x) card in
>>> particular, it works a lot better than the nv driver for 2d support.
>>>
>>> 1. http://nouveau.freedesktop.org/wiki/InstallNouveau
>>
>>But nouveau is much less stable than nv. For testing purposes, go with
>>stable.
>
>I believe at this point, its moot. I captured quite a few instances of that
>error message while rebooting the last time, all of which occurred long
>before I logged in and did a startx (I boot to runlevel 3 here), so the
>kernel was NOT tainted at that point. That dmesg has been posted and some
>questions asked.
>
>As this has gone on for a while, it seems to me that with 14,800 google hits
>on this problem, Linus should call a halt until this is found and fixed.
> But I'm not Linus. I'm also locking up for 30 at a time, & probably ready
> for reboot #7 today.
>
>>I'm not sure why it won't run his screen though. I can use nv to run a
>>1920x1200 laptop LCD. It *is* dog slow (although nouveau was not any
>>better with a NV17 / 440-Go -- render support for AA fonts seems to be
>>missing), but it does work.

I've been trying to run a long selftest on that drive, but the constant
reboots are fscking that up. I have attached the last smartctl -a output,
indicating that the test was aborted, probably by all the resets that are
being issued; the last one froze me for around 5 minutes, but I haven't
rebooted yet. Can anyone see if there is actually anything
wrong with the drive? If a boot lasts long enough for the -t long to
complete, then it passes with no errors, but this has now been interrupted
for the 3rd time.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Well begun is half done.
-- Aristotle


Attachments:
(No filename) (2.98 kB)
smart.log (8.84 kB)

2008-01-28 17:59:28

by Daniel Barkalow

Subject: Re: Problem with ata layer in 2.6.24

On Mon, 28 Jan 2008, Gene Heskett wrote:

> I believe at this point, its moot. I captured quite a few instances of that
> error message while rebooting the last time, all of which occurred long
> before I logged in and did a startx (I boot to runlevel 3 here), so the
> kernel was NOT tainted at that point. That dmesg has been posted and some
> questions asked.
>
> As this has gone on for a while, it seems to me that with 14,800 google hits
> on this problem, Linus should call a halt until this is found and fixed. But
> I'm not Linus. I'm also locking up for 30 at a time, & probably ready for
> reboot #7 today.

Can you switch back to old IDE to get your work done (and to make sure
it's not a hardware issue that's developed recently)? I believe libata is
just a whole lot pickier about behavior than the IDE subsystem was, so
it's more likely to complain about stuff, both for good reasons and when
it shouldn't. There are a slew of potential "we have to accept that
old PATA hardware does this" bugs that all have the same symptom of "we go
into error handling when nothing is actually wrong", hence the vast
quantity of hits. I think it's not so much that it's a common problem as
that it's a lot of problems that aren't very distinguishable.

-Daniel
*This .sig left intentionally blank*

2008-01-28 18:20:48

by Mark Lord

Subject: Re: Problem with ata layer in 2.6.24

> [ 64.037975] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> [ 64.038102] ata1.00: BMDMA stat 0x65
> [ 64.038227] ata1.00: cmd c8/00:58:89:3d:07/00:00:00:00:00/e0 tag 0 dma 45056 in
> [ 64.038229] res 51/40:58:8b:3d:07/00:00:00:00:00/e0 Emask 0x9 (media error)
> [ 64.038432] ata1.00: status: { DRDY ERR }
> [ 64.038555] ata1.00: error: { UNC }
> [ 64.050125] ata1.00: configured for UDMA/100
> [ 64.050134] sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
> [ 64.050138] sd 0:0:0:0: [sda] Sense Key : 0x3 [current] [descriptor]
> [ 64.050142] Descriptor sense data with sense descriptors (in hex):
> [ 64.050143] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
> [ 64.050149] 00 07 3d 8b
> [ 64.050152] sd 0:0:0:0: [sda] ASC=0x11 ASCQ=0x4
> [ 64.050155] end_request: I/O error, dev sda, sector 474507
..

This error looks somewhat different from the samples posted earlier.
This one is quite definitively a "bad sector".
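
For anyone wanting to check that reading, here is a rough sketch of how the
"res" line decodes to the sector number in the end_request line. It is not
kernel code: decode_res is a made-up helper name, and it assumes the 28-bit
LBA layout that goes with the c8 (READ DMA) command shown above.

```python
# Hypothetical helper, not kernel code: decode the "res" taskfile that
# libata prints, assuming a 28-bit LBA command such as c8 (READ DMA).
def decode_res(res):
    # Layout: status/error:nsect:lbal:lbam:lbah/<hob bytes>/device
    status_s, low_s, _hob_s, device_s = res.split("/")
    status = int(status_s, 16)
    error, _nsect, lbal, lbam, lbah = (int(b, 16) for b in low_s.split(":"))
    device = int(device_s, 16)
    # LBA28: bits 27..24 live in the low nibble of the device register.
    lba = ((device & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal
    return {
        "drdy": bool(status & 0x40),  # device ready
        "err": bool(status & 0x01),   # error bit set in status
        "unc": bool(error & 0x40),    # uncorrectable data error
        "lba": lba,
    }


info = decode_res("51/40:58:8b:3d:07/00:00:00:00:00/e0")
print(info)  # lba comes out as 474507, with drdy, err and unc all set
```

The last four bytes of the descriptor sense data (00 07 3d 8b) are the same
number: 0x00073d8b is 474507. The earlier timeout samples decode with DRDY
set but neither ERR nor UNC, which is why they look like a different problem.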

It should also show up in "smartctl -a -d ata /dev/sda" (near the bottom)
if SMART was enabled on this drive at boot.

You could try reading that specific sector again just to make sure.
One way is to figure out how to use "dd" for this.
Another way is to use the "make_bad_sector" utility that
is included in the source tarball for hdparm-7.7, as follows:

make_bad_sector --readback /dev/sda 474507

(when invoked as above, it does *not* "make" a bad sector; no worries).
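
For the "dd" route, the same single-sector readback can be sketched in
Python. This is only an illustration under stated assumptions: read_sector
and sector_is_readable are made-up names, the sector size is taken to be the
512 bytes reported in the log, and the read goes through the page cache, so
the kernel may fetch neighbouring sectors too and a cached copy could mask a
retry. Reading /dev/sda this way needs root.

```python
import os

SECTOR_SIZE = 512  # the log reports "512-byte hardware sectors"


def read_sector(path, lba, sector_size=SECTOR_SIZE):
    """Read one sector at the given LBA; a failed read raises OSError."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # pread reads at an absolute byte offset without a separate seek.
        return os.pread(fd, sector_size, lba * sector_size)
    finally:
        os.close(fd)


def sector_is_readable(path, lba, attempts=3):
    """Return True only if every attempt returns a full sector."""
    for _ in range(attempts):
        try:
            data = read_sector(path, lba)
        except OSError:
            return False  # e.g. EIO from a bad sector
        if len(data) != SECTOR_SIZE:
            return False  # short read, e.g. past the end of the device
    return True


# Example (as root): sector_is_readable("/dev/sda", 474507)
```

This only reads; it never writes to the device.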

If it reports an I/O error consistently on that, then the sector is
indeed faulty, and its contents have long been lost.

You can repair the bad sector (but not the original contents) like this:

make_bad_sector --rewrite /dev/sda 474507

Cheers

2008-01-28 18:24:00

by rgheck

Subject: Re: Problem with ata layer in 2.6.24

Daniel Barkalow wrote:
> Can you switch back to old IDE to get your work done (and to make sure
> it's not a hardware issue that's developed recently)?
I think it'd be really, REALLY helpful to a lot of people if you, or
someone, could explain in moderate detail how this might be done. I
tried doing it myself, but I'm not sufficiently expert at configuring
kernels that I was ever able to figure out how to do it.

Obviously, the short version is: switch back to Fedora 6. But this kind
of problem with libata---and yes, you're almost surely right that it's
not one problem but lots---is sufficiently widespread that a Mini HOWTO,
say, would be really welcome and, I'm guessing, widely used.

Richard

2008-01-28 18:53:59

by Andrey Borzenkov

Subject: Re: Problem with ata layer in 2.6.24

Richard Heck wrote:

> Daniel Barkalow wrote:
>> Can you switch back to old IDE to get your work done (and to make sure
>> it's not a hardware issue that's developed recently)?
> I think it'd be really, REALLY helpful to a lot of people if you, or
> someone, could explain in moderate detail how this might be done. I
> tried doing it myself, but I'm not sufficiently expert at configuring
> kernels that I was ever able to figure out how to do it.
>

well, here on Mandriva I

1) compile both IDE and libata as modules
2) create initrd that contains either IDE or libata modules
3) use labels for file system mounts, swaps and resume device.


Now 1) should be pretty straightforward (I could send you my config if you
like; it is stripped down to the bare minimum on my system, so you will have
to check the drivers for your hardware). 2) and 3) are obviously distribution
dependent. I can explain how to do it on Mandriva, which at the moment has
nearly perfect support for addressing devices via label/UUID; the
ide/scsi/ata switch is also trivial using Mandriva's mkinitrd.

-andrey

> Obviously, the short version is: switch back to Fedora 6. But this kind
> of problem with libata---and yes, you're almost surely right that it's
> not one problem but lots---is sufficiently widespread that a Mini HOWTO,
> say, would be really welcome and, I'm guessing, widely used.
>
> Richard

2008-01-28 18:54:57

by Mark Lord

Subject: Re: Problem with ata layer in 2.6.24

Gene Heskett wrote:
> Greeting;
>
> I had to reboot early this morning due to a freezeup, and I had a
> bunch of these in the messages log:
> ==============
> Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> Jan 27 19:42:11 coyote kernel: [42461.915973] ata1.00: cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 dma 4096 out
> Jan 27 19:42:11 coyote kernel: [42461.915974] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> Jan 27 19:42:11 coyote kernel: [42461.915978] ata1.00: status: { DRDY }
> Jan 27 19:42:11 coyote kernel: [42461.916005] ata1: soft resetting link
> Jan 27 19:42:12 coyote kernel: [42462.078216] ata1.00: configured for UDMA/100
> Jan 27 19:42:12 coyote kernel: [42462.078232] ata1: EH complete
> Jan 27 19:42:12 coyote kernel: [42462.090700] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> Jan 27 19:42:12 coyote kernel: [42462.114230] sd 0:0:0:0: [sda] Write Protect is off
> Jan 27 19:42:12 coyote kernel: [42462.115079] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> ===============
> That one showed up about 2 hours ago, so I expect I'll be locked
> up again before I've managed a 24 hour uptime. This drive passed
> a 'smartctl -t long /dev/sda' with flying colors after the reboot
> this morning.
>
> Two instances were logged after I had rebooted to 2.6.24 from 2.6.24-rc8:
>
> Jan 24 20:46:33 coyote kernel: [ 0.000000] Linux version 2.6.24 ([email protected]) (gcc version 4.1.2 20070925
> (Red Hat 4.1.2-33)) #1 SMP Thu Jan 24 20:17:55 EST 2008
> ----
> Jan 27 02:28:29 coyote kernel: [193207.445158] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> Jan 27 02:28:29 coyote kernel: [193207.445170] ata1.00: cmd 35/00:08:f9:24:0a/00:00:17:00:00/e0 tag 0 dma 4096 out
> Jan 27 02:28:29 coyote kernel: [193207.445172] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> Jan 27 02:28:29 coyote kernel: [193207.445175] ata1.00: status: { DRDY }
> Jan 27 02:28:29 coyote kernel: [193207.445202] ata1: soft resetting link
> Jan 27 02:28:29 coyote kernel: [193207.607384] ata1.00: configured for UDMA/100
> Jan 27 02:28:29 coyote kernel: [193207.607399] ata1: EH complete
> Jan 27 02:28:29 coyote kernel: [193207.609681] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> Jan 27 02:28:29 coyote kernel: [193207.619277] sd 0:0:0:0: [sda] Write Protect is off
> Jan 27 02:28:29 coyote kernel: [193207.649041] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> Jan 27 02:30:06 coyote kernel: [193304.336929] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> Jan 27 02:30:06 coyote kernel: [193304.336940] ata1.00: cmd ca/00:20:69:22:a6/00:00:00:00:00/e7 tag 0 dma 16384 out
> Jan 27 02:30:06 coyote kernel: [193304.336942] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> Jan 27 02:30:06 coyote kernel: [193304.336945] ata1.00: status: { DRDY }
> Jan 27 02:30:06 coyote kernel: [193304.336972] ata1: soft resetting link
> Jan 27 02:30:06 coyote kernel: [193304.499210] ata1.00: configured for UDMA/100
> Jan 27 02:30:06 coyote kernel: [193304.499226] ata1: EH complete
> Jan 27 02:30:06 coyote kernel: [193304.499714] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> Jan 27 02:30:06 coyote kernel: [193304.499857] sd 0:0:0:0: [sda] Write Protect is off
> Jan 27 02:30:06 coyote kernel: [193304.502315] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
>
> None were logged during the time I was running an -rc7 or -rc8.
>
> The previous hits on this resulted in the udma speed being downgraded
> till it was actually running in pio just before the freeze that
> required the hardware reset button.
>
> I'll reboot to -rc8 right now and resume. If its the drive, I should see it.
> If not, then 2.6.24 is where I'll point the finger.
..

The only libata change I can see that could possibly affect your setup,
is this one here, which went in sometime between -rc7 and -final:

--- linux-2.6.24-rc7/drivers/ata/libata-eh.c 2008-01-06 16:45:38.000000000 -0500
+++ linux-2.6.24/drivers/ata/libata-eh.c 2008-01-24 17:58:37.000000000 -0500
@@ -1733,11 +1733,15 @@
 		ehc->i.action &= ~ATA_EH_PERDEV_MASK;
 	}
 
-	/* consider speeding down */
+	/* propagate timeout to host link */
+	if ((all_err_mask & AC_ERR_TIMEOUT) && !ata_is_host_link(link))
+		ap->link.eh_context.i.err_mask |= AC_ERR_TIMEOUT;
+

It looks pretty innocent to me, though.
If you want to try reverting just that change
(comment out the two lines and rebuild),
then that might provide useful information here.

If -final is still b0rked even with those two lines changed back,
then I suspect you're just "getting lucky" when switching between
the -rc7/-rc8 kernel and the -final kernel.

"Lucky" in a bad way, that is.

The real test would be to rebuild the kernel without libata,
and *with* the old IDE driver instead, and see if the problems persist.

If you need help with that, then perhaps someone familiar with Fedora
might be able to assist.

Cheers

2008-01-28 19:02:14

by Mark Lord

Subject: Re: Problem with ata layer in 2.6.24

Gene Heskett wrote:
> Greeting;
>
> I had to reboot early this morning due to a freezeup, and I had a bunch of these in the messages log:
> ==============
> Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> Jan 27 19:42:11 coyote kernel: [42461.915973] ata1.00: cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 dma 4096 out
> Jan 27 19:42:11 coyote kernel: [42461.915974] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> Jan 27 19:42:11 coyote kernel: [42461.915978] ata1.00: status: { DRDY }
> Jan 27 19:42:11 coyote kernel: [42461.916005] ata1: soft resetting link
> Jan 27 19:42:12 coyote kernel: [42462.078216] ata1.00: configured for UDMA/100
> Jan 27 19:42:12 coyote kernel: [42462.078232] ata1: EH complete
> Jan 27 19:42:12 coyote kernel: [42462.090700] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> Jan 27 19:42:12 coyote kernel: [42462.114230] sd 0:0:0:0: [sda] Write Protect is off
> Jan 27 19:42:12 coyote kernel: [42462.115079] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> ===============
> That one showed up about 2 hours ago, so I expect I'll be locked up again before I've managed a 24 hour uptime. This drive passed
> a 'smartctl -t long /dev/sda' with flying colors after the reboot
> this morning.
>
> Two instances were logged after I had rebooted to 2.6.24 from 2.6.24-rc8:
>
> Jan 24 20:46:33 coyote kernel: [ 0.000000] Linux version 2.6.24 ([email protected]) (gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)) #1 SMP Thu Jan 24 20:17:55 EST 2008
> ----
> Jan 27 02:28:29 coyote kernel: [193207.445158] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> Jan 27 02:28:29 coyote kernel: [193207.445170] ata1.00: cmd 35/00:08:f9:24:0a/00:00:17:00:00/e0 tag 0 dma 4096 out
> Jan 27 02:28:29 coyote kernel: [193207.445172] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> Jan 27 02:28:29 coyote kernel: [193207.445175] ata1.00: status: { DRDY }
> Jan 27 02:28:29 coyote kernel: [193207.445202] ata1: soft resetting link
> Jan 27 02:28:29 coyote kernel: [193207.607384] ata1.00: configured for UDMA/100
> Jan 27 02:28:29 coyote kernel: [193207.607399] ata1: EH complete
> Jan 27 02:28:29 coyote kernel: [193207.609681] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> Jan 27 02:28:29 coyote kernel: [193207.619277] sd 0:0:0:0: [sda] Write Protect is off
> Jan 27 02:28:29 coyote kernel: [193207.649041] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Jan 27 02:30:06 coyote kernel: [193304.336929] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> Jan 27 02:30:06 coyote kernel: [193304.336940] ata1.00: cmd ca/00:20:69:22:a6/00:00:00:00:00/e7 tag 0 dma 16384 out
> Jan 27 02:30:06 coyote kernel: [193304.336942] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> Jan 27 02:30:06 coyote kernel: [193304.336945] ata1.00: status: { DRDY }
> Jan 27 02:30:06 coyote kernel: [193304.336972] ata1: soft resetting link
> Jan 27 02:30:06 coyote kernel: [193304.499210] ata1.00: configured for UDMA/100
> Jan 27 02:30:06 coyote kernel: [193304.499226] ata1: EH complete
> Jan 27 02:30:06 coyote kernel: [193304.499714] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> Jan 27 02:30:06 coyote kernel: [193304.499857] sd 0:0:0:0: [sda] Write Protect is off
> Jan 27 02:30:06 coyote kernel: [193304.502315] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>
> None were logged during the time I was running an -rc7 or -rc8.
>
> The previous hits on this resulted in the udma speed being downgraded till it was actually running in pio just before the freeze that required the hardware reset button.
>
> I'll reboot to -rc8 right now and resume. If its the drive, I should see it.
> If not, then 2.6.24 is where I'll point the finger.
..

The only libata change I can see that could possibly affect your setup,
is this one here, which went in sometime between -rc7 and -final:

--- linux-2.6.24-rc7/drivers/ata/libata-eh.c 2008-01-06 16:45:38.000000000 -0500
+++ linux-2.6.24/drivers/ata/libata-eh.c 2008-01-24 17:58:37.000000000 -0500
@@ -1733,11 +1733,15 @@
ehc->i.action &= ~ATA_EH_PERDEV_MASK;
}

- /* consider speeding down */
+ /* propagate timeout to host link */
+ if ((all_err_mask & AC_ERR_TIMEOUT) && !ata_is_host_link(link))
+ ap->link.eh_context.i.err_mask |= AC_ERR_TIMEOUT;
+

It looks pretty innocent to me, though.
If you want to try reverting just that change
(comment out the two lines and rebuild),
then that might provide useful information here.

If -final is still b0rked even with those two lines changed back,
then I suspect you're just "getting lucky" when switching between
the -rc7/-rc8 kernel and the -final kernel.

"Lucky" in a bad way, that is.

The real test would be to rebuild the kernel without libata,
and *with* the old IDE driver instead, and see if the problems persist.

If you need help with that, then perhaps someone familiar with Fedora
might be able to assist.

Cheers
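
Editorial note: the "Emask" values in the quoted logs are bitmasks of libata's AC_ERR_* flags, which is why the logs print 0x4 as "(timeout)" and a later report in this thread prints 0x9 as "(media error)". The sketch below decodes such a value. The bit table is an assumption recalled from include/linux/libata.h and covers only the four lowest bits; the function name decode_emask is made up for illustration.

```python
# Assumed bit assignments (lowest four AC_ERR_* flags only), recalled from
# include/linux/libata.h; check them against your own kernel tree.
AC_ERR_BITS = {
    0x1: "DEV",      # device reported an error
    0x2: "HSM",      # host state machine violation
    0x4: "TIMEOUT",  # command timed out
    0x8: "MEDIA",    # media error
}


def decode_emask(emask):
    """Return the names of the known AC_ERR_* bits set in an Emask value."""
    names = [name for bit, name in sorted(AC_ERR_BITS.items()) if emask & bit]
    # Report any bits this small table does not cover instead of hiding them.
    unknown = emask & ~sum(AC_ERR_BITS)
    if unknown:
        names.append("unknown bits 0x%x" % unknown)
    return names


if __name__ == "__main__":
    for value in (0x4, 0x9):
        print("Emask 0x%x -> %s" % (value, ", ".join(decode_emask(value))))
```

Read this way, 0x9 is the device-error bit combined with the media-error bit, while 0x4 on its own only says that the command did not complete in time.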

2008-01-28 19:04:27

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Mark Lord wrote:
>Gene Heskett wrote:
>> Greeting;
>>
>> I had to reboot early this morning due to a freezeup, and I had a
>> bunch of these in the messages log:
>> ==============
>> Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> Jan 27 19:42:11 coyote kernel: [42461.915973] ata1.00: cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 dma 4096 out
>> Jan 27 19:42:11 coyote kernel: [42461.915974] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>> Jan 27 19:42:11 coyote kernel: [42461.915978] ata1.00: status: { DRDY }
>> Jan 27 19:42:11 coyote kernel: [42461.916005] ata1: soft resetting link
>> Jan 27 19:42:12 coyote kernel: [42462.078216] ata1.00: configured for UDMA/100
>> Jan 27 19:42:12 coyote kernel: [42462.078232] ata1: EH complete
>> Jan 27 19:42:12 coyote kernel: [42462.090700] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
>> Jan 27 19:42:12 coyote kernel: [42462.114230] sd 0:0:0:0: [sda] Write Protect is off
>> Jan 27 19:42:12 coyote kernel: [42462.115079] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> ===============
>> That one showed up about 2 hours ago, so I expect I'll be locked
>> up again before I've managed a 24 hour uptime. This drive passed
>> a 'smartctl -t long /dev/sda' with flying colors after the reboot
>> this morning.
>>
>> Two instances were logged after I had rebooted to 2.6.24 from 2.6.24-rc8:
>>
>> Jan 24 20:46:33 coyote kernel: [ 0.000000] Linux version 2.6.24
>> ([email protected]) (gcc version 4.1.2 20070925 (Red Hat 4.1.2-33))
>> #1 SMP Thu Jan 24 20:17:55 EST 2008
>> ----
>> Jan 27 02:28:29 coyote kernel: [193207.445158] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> Jan 27 02:28:29 coyote kernel: [193207.445170] ata1.00: cmd 35/00:08:f9:24:0a/00:00:17:00:00/e0 tag 0 dma 4096 out
>> Jan 27 02:28:29 coyote kernel: [193207.445172] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>> Jan 27 02:28:29 coyote kernel: [193207.445175] ata1.00: status: { DRDY }
>> Jan 27 02:28:29 coyote kernel: [193207.445202] ata1: soft resetting link
>> Jan 27 02:28:29 coyote kernel: [193207.607384] ata1.00: configured for UDMA/100
>> Jan 27 02:28:29 coyote kernel: [193207.607399] ata1: EH complete
>> Jan 27 02:28:29 coyote kernel: [193207.609681] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
>> Jan 27 02:28:29 coyote kernel: [193207.619277] sd 0:0:0:0: [sda] Write Protect is off
>> Jan 27 02:28:29 coyote kernel: [193207.649041] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> Jan 27 02:30:06 coyote kernel: [193304.336929] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> Jan 27 02:30:06 coyote kernel: [193304.336940] ata1.00: cmd ca/00:20:69:22:a6/00:00:00:00:00/e7 tag 0 dma 16384 out
>> Jan 27 02:30:06 coyote kernel: [193304.336942] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>> Jan 27 02:30:06 coyote kernel: [193304.336945] ata1.00: status: { DRDY }
>> Jan 27 02:30:06 coyote kernel: [193304.336972] ata1: soft resetting link
>> Jan 27 02:30:06 coyote kernel: [193304.499210] ata1.00: configured for UDMA/100
>> Jan 27 02:30:06 coyote kernel: [193304.499226] ata1: EH complete
>> Jan 27 02:30:06 coyote kernel: [193304.499714] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
>> Jan 27 02:30:06 coyote kernel: [193304.499857] sd 0:0:0:0: [sda] Write Protect is off
>> Jan 27 02:30:06 coyote kernel: [193304.502315] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>>
>> None were logged during the time I was running an -rc7 or -rc8.
>>
>> The previous hits on this resulted in the udma speed being downgraded
>> till it was actually running in pio just before the freeze that
>> required the hardware reset button.
>>
>> I'll reboot to -rc8 right now and resume. If its the drive, I should see
>> it. If not, then 2.6.24 is where I'll point the finger.
>
>..
>
>The only libata change I can see that could possibly affect your setup,
>is this one here, which went in sometime between -rc7 and -final:
>
>--- linux-2.6.24-rc7/drivers/ata/libata-eh.c 2008-01-06 16:45:38.000000000 -0500
>+++ linux-2.6.24/drivers/ata/libata-eh.c 2008-01-24 17:58:37.000000000 -0500
>@@ -1733,11 +1733,15 @@
> ehc->i.action &= ~ATA_EH_PERDEV_MASK;
> }
>
>- /* consider speeding down */
>+ /* propagate timeout to host link */
>+ if ((all_err_mask & AC_ERR_TIMEOUT) && !ata_is_host_link(link))
>+ ap->link.eh_context.i.err_mask |= AC_ERR_TIMEOUT;
>+
>
>It looks pretty innocent to me, though.
>If you want to try reverting just that change
>(comment out the two lines and rebuild),
>then that might provide useful information here.
>
>If -final is still b0rked even with those two lines changed back,
>then I suspect you're just "getting lucky" when switching between
>the -rc7/-rc8 kernel and the -final kernel.
>
>"Lucky" in a bad way, that is.
>
>The real test would be to rebuild the kernel without libata,
>and *with* the old IDE driver instead, and see if the problems persist.

I can do that, but going to this was pretty painful, probably 5 or 6 reboots
to get it right.

And so far no one has tried to comment on those 2 dmesg lines I've quoted a
couple of times now. Here's another:
[ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
[ 0.000000] If you got timer trouble try acpi_use_timer_override
What the heck is that trying to tell me to do, in some sort of broken English?


>If you need help with that, then perhaps someone familiar with Fedora
>might be able to assist.
>
>Cheers



--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
I used to be disgusted, now I find I'm just amused.
-- Elvis Costello

2008-01-28 19:08:46

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Mark Lord wrote:
>> [ 64.037975] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
>> [ 64.038102] ata1.00: BMDMA stat 0x65
>> [ 64.038227] ata1.00: cmd c8/00:58:89:3d:07/00:00:00:00:00/e0 tag 0 dma 45056 in
>> [ 64.038229] res 51/40:58:8b:3d:07/00:00:00:00:00/e0 Emask 0x9 (media error)
>> [ 64.038432] ata1.00: status: { DRDY ERR }
>> [ 64.038555] ata1.00: error: { UNC }
>> [ 64.050125] ata1.00: configured for UDMA/100
>> [ 64.050134] sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
>> [ 64.050138] sd 0:0:0:0: [sda] Sense Key : 0x3 [current] [descriptor]
>> [ 64.050142] Descriptor sense data with sense descriptors (in hex):
>> [ 64.050143] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
>> [ 64.050149] 00 07 3d 8b
>> [ 64.050152] sd 0:0:0:0: [sda] ASC=0x11 ASCQ=0x4
>> [ 64.050155] end_request: I/O error, dev sda, sector 474507
>
>..
>
>This error looks somewhat different from the samples posted earlier.
>This one is quite definitively a "bad sector".
>
>It should also show up in "smartctl -a -data /dev/sda" (near the bottom)
>if SMART was enabled on this drive at boot.
>
It does not, unfortunately.
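
Editorial note: the sector number in the end_request line can be cross-checked against the "res" taskfile that libata printed. The sketch below assumes the usual libata layout of those fields (status/error:nsect:lbal:lbam:lbah, then the five high-order "hob" bytes, then the device register) and that the low four bits of the device register carry LBA bits 24-27 for 28-bit commands. The function name taskfile_lba is hypothetical.

```python
def taskfile_lba(taskfile, lba48=False):
    """Extract the LBA from a libata 'cmd' or 'res' taskfile string.

    Assumed field order:
    command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
    """
    _command, low, hob, device = taskfile.split("/")
    _feature, _nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    _hfeat, _hnsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in hob.split(":")
    )
    lba = (lbah << 16) | (lbam << 8) | lbal
    if lba48:
        # 48-bit commands (e.g. 0x35) keep the upper 24 bits in the hob bytes.
        return (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24) | lba
    # 28-bit commands (e.g. 0xc8, 0xca) keep bits 24-27 in the device register.
    return ((int(device, 16) & 0x0F) << 24) | lba


if __name__ == "__main__":
    # The 'res' line of the media error quoted above.
    print(taskfile_lba("51/40:58:8b:3d:07/00:00:00:00:00/e0"))
```

For the media error quoted above this yields 474507, the same sector that end_request reported, so the drive itself named that sector as unreadable.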

>You could try reading that specific sector again just to make sure.
>One way is to figure out how to use "dd" for this.
[root@coyote ~]# dd if=/dev/sda bs=512 skip=474506 count=3
[binary output trimmed: directory-entry data naming libkdecorations.so.1.0.0,
libkfontinst.so.0.0.0, libkhotkeys_shared.so.1.0.0, libkickermain.so.1.0.0,
libkonq.so.4.2.0, libkonqsidebarplugin.so.1.2.0, libksgrd.so.1.2.0,
libksplashthemes.so.0.0.0, libtaskbar.so.1.2.0 and libtaskmanager.so.1.0.0]
3+0 records in
3+0 records out
1536 bytes (1.5 kB) copied, 6.1403e-05 s, 25.0 MB/s
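
Editorial note: with bs=512, dd's skip counts 512-byte blocks, so skip=474506 count=3 covers sectors 474506 through 474508, including the suspect one. The same check can be sketched in Python without dumping binary to the terminal. This is a minimal sketch, not a replacement for make_bad_sector; the helper name read_sector is made up, and a read that is satisfied from the page cache may succeed without touching the disk at all (the very short copy time reported above may point that way, though that is a guess).

```python
import errno
import os

SECTOR_SIZE = 512  # matches the "512-byte hardware sectors" in the logs


def read_sector(path, lba, sector_size=SECTOR_SIZE):
    """Read one sector; return its bytes, or None if the kernel reports EIO."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, lba * sector_size, os.SEEK_SET)
        return os.read(fd, sector_size)
    except OSError as exc:
        if exc.errno == errno.EIO:
            return None  # an I/O error is what a bad sector would produce
        raise
    finally:
        os.close(fd)


if __name__ == "__main__":
    import sys

    # Usage: python read_sector.py /dev/sda 474507  (needs read permission)
    data = read_sector(sys.argv[1], int(sys.argv[2]))
    print("I/O error" if data is None else "read %d bytes OK" % len(data))
```

Running it several times after dropping caches, or after a reboot, gives a better idea of whether the error is consistent.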

>Another way is to use the "make_bad_sector" utility that
>is included in the source tarball for hdparm-7.7, as follows:
>
> make_bad_sector --readback /dev/sda 474507
>
Apparently not in the rpm, darnit.

>(when invoked as above, it does *not* "make" a bad sector; no worries).
>
>If it reports an I/O error consistently on that, then the sector is
>indeed faulty, and its contents have long been lost.
>
>You can repair the bad sector (but not the original contents) like this:
>
> make_bad_sector --rewrite /dev/sda 474507
>
>Cheers

I'm going up to Clarksburg this afternoon to see if I can find a couple of
drives: one a 2.5" bigger than 40GB for my 2.5" Maxtor USB housing, and
another PATA drive big enough to run this thing. I'll just re-install the
December respin after I save as much of this as I can; there's nearly 50GB
here now.

Maybe it won't be so fscking picky about the next drive.

I was hoping someone could look at that last dmesg I attached, but apparently
everybody is blinded by unrelated details, as that bad sector may have been
transient, caused by the multiple hardware reset type reboots so far today :(

The last 3 reboots have interrupted a 'smartctl -t long /dev/sda' in
progress. :(

If I reconvert to non-libata, can I do that only for the PATA drives, of which
there are 3 here including the DVD writer, and still use libata for the lone
SATA drive left?

And can I do that without mucking with the device map, which will make
amanda/tar attempt to do a level 0 on the whole system if it's changed? I see
the drives are at 254 again. When are they going to be given a stable device
address out of the LANANA experimental group so we can reboot without mucking
with that and driving tar crazy?

Thanks everybody.

--
Cheers, Gene

2008-01-28 19:08:59

by Jeff Garzik

Subject: Re: Problem with ata layer in 2.6.24

Gene Heskett wrote:
> Greeting;
>
> I had to reboot early this morning due to a freezeup, and I had a
> bunch of these in the messages log:
> ==============
> Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> Jan 27 19:42:11 coyote kernel: [42461.915973] ata1.00: cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 dma 4096 out
> Jan 27 19:42:11 coyote kernel: [42461.915974] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> Jan 27 19:42:11 coyote kernel: [42461.915978] ata1.00: status: { DRDY }
> Jan 27 19:42:11 coyote kernel: [42461.916005] ata1: soft resetting link
> Jan 27 19:42:12 coyote kernel: [42462.078216] ata1.00: configured for UDMA/100
> Jan 27 19:42:12 coyote kernel: [42462.078232] ata1: EH complete
> Jan 27 19:42:12 coyote kernel: [42462.090700] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> Jan 27 19:42:12 coyote kernel: [42462.114230] sd 0:0:0:0: [sda] Write Protect is off
> Jan 27 19:42:12 coyote kernel: [42462.115079] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> ===============
> That one showed up about 2 hours ago, so I expect I'll be locked
> up again before I've managed a 24 hour uptime. This drive passed
> a 'smartctl -t long /dev/sda' with flying colors after the reboot
> this morning.
>
> Two instances were logged after I had rebooted to 2.6.24 from 2.6.24-rc8:
>
> Jan 24 20:46:33 coyote kernel: [ 0.000000] Linux version 2.6.24 ([email protected]) (gcc version 4.1.2 20070925
> (Red Hat 4.1.2-33)) #1 SMP Thu Jan 24 20:17:55 EST 2008
> ----
> Jan 27 02:28:29 coyote kernel: [193207.445158] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> Jan 27 02:28:29 coyote kernel: [193207.445170] ata1.00: cmd 35/00:08:f9:24:0a/00:00:17:00:00/e0 tag 0 dma 4096 out
> Jan 27 02:28:29 coyote kernel: [193207.445172] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> Jan 27 02:28:29 coyote kernel: [193207.445175] ata1.00: status: { DRDY }
> Jan 27 02:28:29 coyote kernel: [193207.445202] ata1: soft resetting link
> Jan 27 02:28:29 coyote kernel: [193207.607384] ata1.00: configured for UDMA/100
> Jan 27 02:28:29 coyote kernel: [193207.607399] ata1: EH complete
> Jan 27 02:28:29 coyote kernel: [193207.609681] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> Jan 27 02:28:29 coyote kernel: [193207.619277] sd 0:0:0:0: [sda] Write Protect is off
> Jan 27 02:28:29 coyote kernel: [193207.649041] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> Jan 27 02:30:06 coyote kernel: [193304.336929] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> Jan 27 02:30:06 coyote kernel: [193304.336940] ata1.00: cmd ca/00:20:69:22:a6/00:00:00:00:00/e7 tag 0 dma 16384 out
> Jan 27 02:30:06 coyote kernel: [193304.336942] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> Jan 27 02:30:06 coyote kernel: [193304.336945] ata1.00: status: { DRDY }
> Jan 27 02:30:06 coyote kernel: [193304.336972] ata1: soft resetting link
> Jan 27 02:30:06 coyote kernel: [193304.499210] ata1.00: configured for UDMA/100
> Jan 27 02:30:06 coyote kernel: [193304.499226] ata1: EH complete
> Jan 27 02:30:06 coyote kernel: [193304.499714] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> Jan 27 02:30:06 coyote kernel: [193304.499857] sd 0:0:0:0: [sda] Write Protect is off
> Jan 27 02:30:06 coyote kernel: [193304.502315] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
>
> None were logged during the time I was running an -rc7 or -rc8.
>
> The previous hits on this resulted in the udma speed being downgraded
> till it was actually running in pio just before the freeze that
> required the hardware reset button.

Unfortunately there are 1001 different causes for timeouts, so we need
to drill down into the hardware, libata version, and ACPI version (most
notably).


> I'll reboot to -rc8 right now and resume. If its the drive, I should see it.
> If not, then 2.6.24 is where I'll point the finger.

There was also an ACPI update, which always affects interrupt handling
(whose symptom can sometimes be a timeout).

Definitely interested in test results from what you describe.

Jeff

2008-01-28 19:15:59

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Andrey Borzenkov wrote:
>Richard Heck wrote:
>> Daniel Barkalow wrote:
>>> Can you switch back to old IDE to get your work done (and to make sure
>>> it's not a hardware issue that's developed recently)?
>>
>> I think it'd be really, REALLY helpful to a lot of people if you, or
>> someone, could explain in moderate detail how this might be done. I
>> tried doing it myself, but I'm not sufficiently expert at configuring
>> kernels that I was ever able to figure out how to do it.
>
>well, here on Mandriva I
>
>1) compile both IDE and libata as modules
>2) create initrd that contains either IDE or libata modules
>3) use labels for file system mounts, swaps and resume device.
>
>
>Now 1) should be pretty straightforward (I could send you config if you
>like, it is stripped down to bare minimum on my system, you will have to
>check drivers for your hardware). 2 and 3 are obviously distribution
>dependent. I can explain how to do it on Mandriva that ATM has near to
>perfect support for addressing devices via label/UUID; also ide/scsi/ata
>switch is trivial using Mandriva mkinitrd.
>
>-andrey
>
>> Obviously, the short version is: switch back to Fedora 6. But this kind
>> of problem with libata---and yes, you're almost surely right that it's
>> not one problem but lots---is sufficiently widespread that a Mini HOWTO,
>> say, would be really welcome and, I'm guessing, widely used.
>>
>> Richard

I already build as modules, and it would be relatively easy to make 2 boot
stanzas that used the different initrds if there were examples that could
be used as 'excludes' when building the initrds. Is such a creature
breedable?

--
Cheers, Gene

2008-01-28 19:17:19

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Jeff Garzik wrote:
>Gene Heskett wrote:
>> Greeting;
>>
>> I had to reboot early this morning due to a freezeup, and I had a
>> bunch of these in the messages log:
>> ==============
>> Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> Jan 27 19:42:11 coyote kernel: [42461.915973] ata1.00: cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 dma 4096 out
>> Jan 27 19:42:11 coyote kernel: [42461.915974] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>> Jan 27 19:42:11 coyote kernel: [42461.915978] ata1.00: status: { DRDY }
>> Jan 27 19:42:11 coyote kernel: [42461.916005] ata1: soft resetting link
>> Jan 27 19:42:12 coyote kernel: [42462.078216] ata1.00: configured for UDMA/100
>> Jan 27 19:42:12 coyote kernel: [42462.078232] ata1: EH complete
>> Jan 27 19:42:12 coyote kernel: [42462.090700] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
>> Jan 27 19:42:12 coyote kernel: [42462.114230] sd 0:0:0:0: [sda] Write Protect is off
>> Jan 27 19:42:12 coyote kernel: [42462.115079] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> ===============
>> That one showed up about 2 hours ago, so I expect I'll be locked
>> up again before I've managed a 24 hour uptime. This drive passed
>> a 'smartctl -t long /dev/sda' with flying colors after the reboot
>> this morning.
>>
>> Two instances were logged after I had rebooted to 2.6.24 from 2.6.24-rc8:
>>
>> Jan 24 20:46:33 coyote kernel: [ 0.000000] Linux version 2.6.24
>> ([email protected]) (gcc version 4.1.2 20070925 (Red Hat 4.1.2-33))
>> #1 SMP Thu Jan 24 20:17:55 EST 2008
>> ----
>> Jan 27 02:28:29 coyote kernel: [193207.445158] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> Jan 27 02:28:29 coyote kernel: [193207.445170] ata1.00: cmd 35/00:08:f9:24:0a/00:00:17:00:00/e0 tag 0 dma 4096 out
>> Jan 27 02:28:29 coyote kernel: [193207.445172] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>> Jan 27 02:28:29 coyote kernel: [193207.445175] ata1.00: status: { DRDY }
>> Jan 27 02:28:29 coyote kernel: [193207.445202] ata1: soft resetting link
>> Jan 27 02:28:29 coyote kernel: [193207.607384] ata1.00: configured for UDMA/100
>> Jan 27 02:28:29 coyote kernel: [193207.607399] ata1: EH complete
>> Jan 27 02:28:29 coyote kernel: [193207.609681] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
>> Jan 27 02:28:29 coyote kernel: [193207.619277] sd 0:0:0:0: [sda] Write Protect is off
>> Jan 27 02:28:29 coyote kernel: [193207.649041] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> Jan 27 02:30:06 coyote kernel: [193304.336929] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> Jan 27 02:30:06 coyote kernel: [193304.336940] ata1.00: cmd ca/00:20:69:22:a6/00:00:00:00:00/e7 tag 0 dma 16384 out
>> Jan 27 02:30:06 coyote kernel: [193304.336942] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>> Jan 27 02:30:06 coyote kernel: [193304.336945] ata1.00: status: { DRDY }
>> Jan 27 02:30:06 coyote kernel: [193304.336972] ata1: soft resetting link
>> Jan 27 02:30:06 coyote kernel: [193304.499210] ata1.00: configured for UDMA/100
>> Jan 27 02:30:06 coyote kernel: [193304.499226] ata1: EH complete
>> Jan 27 02:30:06 coyote kernel: [193304.499714] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
>> Jan 27 02:30:06 coyote kernel: [193304.499857] sd 0:0:0:0: [sda] Write Protect is off
>> Jan 27 02:30:06 coyote kernel: [193304.502315] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>>
>> None were logged during the time I was running an -rc7 or -rc8.
>>
>> The previous hits on this resulted in the udma speed being downgraded
>> till it was actually running in pio just before the freeze that
>> required the hardware reset button.
>
>Unfortunately there are 1001 different causes for timeouts, so we need
>to drill down into the hardware, libata version, and ACPI version (most
>notably).
>
>> I'll reboot to -rc8 right now and resume. If its the drive, I should see
>> it. If not, then 2.6.24 is where I'll point the finger.
Both rc8 and rc7 do it. The Fedora kernels do too, but without the error
messages being logged. I assume the messages are an attempt to trace this?

>There was also an ACPI update, which always affects interrupt handling
>(whose symptom can sometimes be a timeout).

I'm thinking Bingo!, please pay the man. See my posts asking about a couple of
lines very early in the dmesg, asking for an English explanation that no one
has proffered as yet.

>Definitely interesting in test results from what you describe.
>
> Jeff



--
Cheers, Gene

2008-01-28 19:22:12

by Andrey Borzenkov

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Gene Heskett wrote:
> On Monday 28 January 2008, Andrey Borzenkov wrote:
> >Richard Heck wrote:
> >> Daniel Barkalow wrote:
> >>> Can you switch back to old IDE to get your work done (and to make sure
> >>> it's not a hardware issue that's developed recently)?
> >>
> >> I think it'd be really, REALLY helpful to a lot of people if you, or
> >> someone, could explain in moderate detail how this might be done. I
> >> tried doing it myself, but I'm not sufficiently expert at configuring
> >> kernels that I was ever able to figure out how to do it.
> >
> >well, here on Mandriva I
> >
> >1) compile both IDE and libata as modules
> >2) create initrd that contains either IDE or libata modules
> >3) use labels for file system mounts, swaps and resume device.
> >
> >
> >Now 1) should be pretty straightforward (I could send you config if you
> >like, it is stripped down to bare minimum on my system, you will have to
> >check drivers for your hardware). 2 and 3 are obviously distribution
> >dependent. I can explain how to do it on Mandriva that ATM has near to
> >perfect support for addressing devices via label/UUID; also ide/scsi/ata
> >switch is trivial using Mandriva mkinitrd.
> >
>
> I already build as modules, and it would be relatively easy to make 2 boot
> stanza's that used the different initrd's if there were examples that could
> be used as 'excludes' when building the initrd's. Is such a creature
> breedable?
>

I am not sure I understand the question (English is not my native language),
but here I simply do

mkinitrd --omit-ide-modules --preload pata_ali --preload sd_mod ...

or

mkinitrd --omit-scsi-modules --preload alim15x3 --preload ide-disk ...

If you are asking how the --omit part is implemented, I will happily send you
the mkinitrd script.



2008-01-28 19:35:32

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Andrey Borzenkov wrote:
>On Monday 28 January 2008, Gene Heskett wrote:
>> On Monday 28 January 2008, Andrey Borzenkov wrote:
>> >Richard Heck wrote:
>> >> Daniel Barkalow wrote:
>> >>> Can you switch back to old IDE to get your work done (and to make sure
>> >>> it's not a hardware issue that's developed recently)?
>> >>
>> >> I think it'd be really, REALLY helpful to a lot of people if you, or
>> >> someone, could explain in moderate detail how this might be done. I
>> >> tried doing it myself, but I'm not sufficiently expert at configuring
>> >> kernels that I was ever able to figure out how to do it.
>> >
>> >well, here on Mandriva I
>> >
>> >1) compile both IDE and libata as modules
>> >2) create initrd that contains either IDE or libata modules
>> >3) use labels for file system mounts, swaps and resume device.
>> >
>> >
>> >Now 1) should be pretty straightforward (I could send you config if you
>> >like, it is stripped down to bare minimum on my system, you will have to
>> >check drivers for your hardware). 2 and 3 are obviously distribution
>> >dependent. I can explain how to do it on Mandriva that ATM has near to
>> >perfect support for addressing devices via label/UUID; also ide/scsi/ata
>> >switch is trivial using Mandriva mkinitrd.
>>
>> I already build as modules, and it would be relatively easy to make 2 boot
>> stanza's that used the different initrd's if there were examples that
>> could be used as 'excludes' when building the initrd's. Is such a
>> creature breedable?
>
>I am not sure I understand a question (it is not my native language) but
> here I simply do
>
>mkinitrd --omit-ide-modules --preload pata_ali --preload sd_mod ...
>
>or
>
>mkinitrd --omit-scsi-modules --preload alim15x3 --preload ide-disk ...
>
This looks doable, thanks. I was trying to be cute above when I'm rather
frustrated by all this. I might have to fiddle a bit, but I got the idea.

OTOH, I and about 15,000 others, according to Google, would be everlastingly
grateful if it were just fixed. :)

Thanks

>If you ask how --omit part is implemented I happily send you mkinitrd
> script.



--
Cheers, Gene

2008-01-28 20:01:18

by rgheck

Subject: Re: Problem with ata layer in 2.6.24

Andrey Borzenkov wrote:
> Richard Heck wrote:
>
>
>> Daniel Barkalow wrote:
>>
>>> Can you switch back to old IDE to get your work done (and to make sure
>>> it's not a hardware issue that's developed recently)?
>>>
>> I think it'd be really, REALLY helpful to a lot of people if you, or
>> someone, could explain in moderate detail how this might be done. I
>> tried doing it myself, but I'm not sufficiently expert at configuring
>> kernels that I was ever able to figure out how to do it.
>>
>>
>
> well, here on Mandriva I
>
> 1) compile both IDE and libata as modules
> 2) create initrd that contains either IDE or libata modules
> 3) use labels for file system mounts, swaps and resume device.
>
>
> Now 1) should be pretty straightforward (I could send you config if you
> like, it is stripped down to bare minimum on my system, you will have to
> check drivers for your hardware). 2 and 3 are obviously distribution
> dependent. I can explain how to do it on Mandriva that ATM has near to
> perfect support for addressing devices via label/UUID; also ide/scsi/ata
> switch is trivial using Mandriva mkinitrd.
>
>
Thanks for this. Compiling the IDE stuff as a module is indeed the easy
part, though I suppose I need to make sure I get the right drivers for
my chipset, too. Loading e.g. the Fedora 6 LiveCD and then lsmod'ing
should do it, though. Labels are used by default in Fedora now, so
that's fine, too. Getting mkinitrd to work right shouldn't be too bad,
either. So I'll have a go at this when I get some time and report on it.
What might be REALLY helpful to people would be if we Fedora types could
produce a modified kernel rpm that would handle this....though, I should
say, I've also seen a lot of complaints along these same lines on Ubuntu.

Richard

2008-01-28 20:02:04

by Daniel Barkalow

Subject: Re: Problem with ata layer in 2.6.24

On Mon, 28 Jan 2008, Richard Heck wrote:

> Daniel Barkalow wrote:
> > Can you switch back to old IDE to get your work done (and to make sure it's
> > not a hardware issue that's developed recently)?
> I think it'd be really, REALLY helpful to a lot of people if you, or someone,
> could explain in moderate detail how this might be done. I tried doing it
> myself, but I'm not sufficiently expert at configuring kernels that I was ever
> able to figure out how to do it.

As far as configuring the kernel, I can help:

Go to Device Drivers, ATA/ATAPI/MFM/RLL support, and turn on anything that
looks relevant; go to Device Drivers, Serial ATA and Parallel ATA drivers,
and turn off anything that's PATA and looks relevant.

(Whether a device uses IDE or PATA depends on which driver that supports
the device is present and finds it first, not on any sort of global
configuration, which is probably what tripped you up.)
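
A minimal sketch of checking which of the two competing drivers a kernel
configuration enables. The option names are the Kconfig symbols for the
drivers named elsewhere in this thread (amd74xx, pata_amd, sata_sil); the
sample config contents are assumed values for illustration, not anyone's
real configuration.

```shell
#!/bin/sh
# Report how a kernel option is set in a .config file: y, m, or n.
# Usage: option_state CONFIG_FILE OPTION_NAME
option_state() {
    # A set option appears as "CONFIG_FOO=y" or "CONFIG_FOO=m";
    # an unset one appears as "# CONFIG_FOO is not set" or not at all.
    value=$(sed -n "s/^$2=\([ym]\)\$/\1/p" "$1" | head -n 1)
    echo "${value:-n}"
}

# Demo on a throwaway config fragment (assumed values).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
CONFIG_BLK_DEV_AMD74XX=m
# CONFIG_PATA_AMD is not set
CONFIG_SATA_SIL=m
EOF

for opt in CONFIG_BLK_DEV_AMD74XX CONFIG_PATA_AMD CONFIG_SATA_SIL; do
    echo "$opt: $(option_state "$cfg" "$opt")"
done
rm -f "$cfg"
```

Pointing option_state at the .config in the kernel build directory shows
whether both an old IDE driver and a libata PATA driver for the same
chipset are enabled at once.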

Building this and installing it along with the appropriate initrd (which
might be handled by Fedora's install scripts) will either get you back to
old IDE or will make your kernel panic on boot, depending on whether you
got it right (so make sure you can still boot the kernel you're sure of or
something from a boot disk). This will also cause your hard drives to show
up as different device nodes, so if your boot process doesn't mount by
disk uuid but by some other feature (and I don't know what Fedora does),
you'll also need to change it to something either stable across access
methods or which works for the one you're now using.

> Obviously, the short version is: switch back to Fedora 6. But this kind of
> problem with libata---and yes, you're almost surely right that it's not one
> problem but lots---is sufficiently widespread that a Mini HOWTO, say, would be
> really welcome and, I'm guessing, widely used.

Fedora really ought to provide documentation, because there's some
distro-specific stuff (like how you deal with the kernel's device node for
the root partition changing), and they're using code by default that's at
least somewhat documented as experimental (although it doesn't seem to be
actually marked as experimental in all cases).

-Daniel
*This .sig left intentionally blank*

2008-01-28 20:23:23

by Mark Lord

Subject: Re: Problem with ata layer in 2.6.24

Gene Heskett wrote:
>..
> And so far no one has tried to comment on those 2 dmesg lines I've quoted a
> couple of times now, here's another:
> [ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
> [ 0.000000] If you got timer trouble try acpi_use_timer_override
> what the heck is that trying to tell me to do, in some sort of broken english?
..

I think it says this:

"If your system is misbehaving, then try adding the acpi_use_timer_override
keyword to your kernel command line (/boot/grub/menu.lst) and see if it helps."

So, you can either hardcode it in /boot/grub/menu.lst (just add it to the end
of the first line you see there that begins with the word "kernel").

Or you can just try it temporarily at boot time (safer, but trickier),
by catching GRUB (the bootloader) before it actually loads Linux.

Usually there's some key or something it says you have 3 seconds to hit for a "menu",
so do that, and then use the cursor keys to find the first "kernel" line in that menu
and hit "e" (edit) to go and add the acpi_use_timer_override keyword to the end of
that line (same as above).

Hit enter when done, and then the letter b (boot) to load Linux with that option.

Clear as mud, right? :)
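
The "hardcode it" variant can be sketched as a small filter, assuming a
GRUB legacy menu.lst layout. The function name add_kernel_arg and the
sample stanza (kernel file name, root= value) are hypothetical. The
filter only prints the edited text, so the real menu.lst stays untouched
until the result has been checked.

```shell
#!/bin/sh
# Append one argument to the first "kernel" line of a GRUB legacy menu file.
# Prints the edited file to stdout; the input file is not modified.
# Usage: add_kernel_arg MENU_FILE ARGUMENT
add_kernel_arg() {
    awk -v arg="$2" '
        !seen && $1 == "kernel" { print $0 " " arg; seen = 1; next }
        { print }
    ' "$1"
}

# Demo on a throwaway copy with a made-up boot stanza.
menu=$(mktemp)
cat > "$menu" <<'EOF'
title Linux 2.6.24-rc8
    root (hd0,0)
    kernel /vmlinuz-2.6.24-rc8 ro root=LABEL=/
    initrd /initrd-2.6.24-rc8.img
EOF

add_kernel_arg "$menu" acpi_use_timer_override
rm -f "$menu"
```

Only the first "kernel" line is changed, so other boot entries keep their
original arguments and remain available as a fallback.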

2008-01-28 20:32:43

by Mark Lord

Subject: Re: Problem with ata layer in 2.6.24

Mark Lord wrote:
> Gene Heskett wrote:
>> ..
>> And so far no one has tried to comment on those 2 dmesg lines I've quoted a couple of times now, here's another:
>> [ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
>> [ 0.000000] If you got timer trouble try acpi_use_timer_override
>> what the heck is that trying to tell me to do, in some sort of broken english?
> ..
>
> I think it says this:
>
> "If your system is misbehaving, then try adding the acpi_use_timer_override
> keyword to your kernel command line (/boot/grub/menu.lst) and see if it helps."
>
> So, you can either hardcode it in /boot/grub/menu.lst (just add it to the end
> of the first line you see there that begins with the word "kernel".
>
> Or you can just try it temporarily at boot time (safer, but tricker),
> by catching GRUB (the bootloader) before it actually loads Linux.
>
> Usually there's some key or something it says you have 3 seconds to hit for a "menu",
> so do that, and then use the cursor keys to find the first "kernel" line in that menu
> and hit "e" (edit) to go and add the acpi_use_timer_override keyword to the end of
> that line (same as above).
..

Minor correction (having just tried it here): once you see the GRUB (boot) menu,
hit the letter e to edit the first entry, then scroll to the "kernel" line,
and hit the letter e again to edit that line. It should put you at the end of the
line, where you can just type a space and then acpi_use_timer_override and then
hit enter to finish the (temporary) edit. Then hit b for boot.

-ml

2008-01-28 20:43:53

by Mark Lord

Subject: Re: Problem with ata layer in 2.6.24

Gene Heskett wrote:
> On Monday 28 January 2008, Mark Lord wrote:
>..
>> Another way is to use the "make_bad_sector" utility that
>> is included in the source tarball for hdparm-7.7, as follows:
>>
>> make_bad_sector --readback /dev/sda 474507
>>
> Apparently not in the rpm, darnit.
..

That's okay. It should still be in the SRPM source file.
And it's a tiny download from sourceforge.net:

http://sourceforge.net/search/?type_of_search=soft&type_of_search=soft&words=hdparm

Cheers

2008-01-29 00:06:12

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Daniel Barkalow wrote:
>On Mon, 28 Jan 2008, Richard Heck wrote:
>> Daniel Barkalow wrote:
>> > Can you switch back to old IDE to get your work done (and to make sure
>> > it's not a hardware issue that's developed recently)?
>>
>> I think it'd be really, REALLY helpful to a lot of people if you, or
>> someone, could explain in moderate detail how this might be done. I tried
>> doing it myself, but I'm not sufficiently expert at configuring kernels
>> that I was ever able to figure out how to do it.
>
>As far as configuring the kernel, I can help:
>
>Go to Device Drivers, ATA/ATAPI/MFM/RLL support, and turn on anything that
>looks relevant; go to Device Drivers, Serial ATA and Parallel ATA drivers,
>and turn off anything that's PATA and looks relevant.
>
Done.

>(Whether a device uses IDE or PATA depends on which driver that supports
>the device is present and find it first, not on any sort of global
>configuration, which is probably what tripped you up)
>
>Building this and installing it along with the appropriate initrd (which
>might be handled by Fedora's install scripts)

Or mine, which I've been using for years.

>will either get you back to
>old IDE or will make your kernel panic on boot, depending on whether you
>got it right (so make sure you can still boot the kernel you're sure of or
>something from a boot disk). This will also cause your hard drives to show
>up as different device nodes, so if your boot process doesn't mount by
>disk uuid but by some other feature (and I don't know what Fedora does),
>you'll also need to change it to something either stable across access
>methods or which works for the one you're now using.

It mounts by LABEL=. All of it.

>> Obviously, the short version is: switch back to Fedora 6. But this kind of
>> problem with libata---and yes, you're almost surely right that it's not
>> one problem but lots---is sufficiently widespread that a Mini HOWTO, say,
>> would be really welcome and, I'm guessing, widely used.
>
>Fedora really ought to provide documentation, because there's some
>distro-specific stuff (like how you deal with the kernel's device node for
>the root partition changing), and they're using code by default that's at
>least somewhat documented as experimental (although it doesn't seem to be
>actually marked as experimental in all cases).

Fedora is not the only distro having trouble; name a distro, and it's probably
someplace in the 14,800 hits Google returns.

> -Daniel
>*This .sig left intentionally blank*

Thanks Daniel, try #1 is building now.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Those who do not understand Unix are condemned to reinvent it, poorly.
-- Henry Spencer

2008-01-29 00:07:40

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Mark Lord wrote:
>Gene Heskett wrote:
>> On Monday 28 January 2008, Mark Lord wrote:
>>..
>>
>>> Another way is to use the "make_bad_sector" utility that
>>> is included in the source tarball for hdparm-7.7, as follows:
>>>
>>> make_bad_sector --readback /dev/sda 474507
>>
>> Apparently not in the rpm, darnit.
>
>..
>
>That's okay. It should still be in the SRPM source file.
>And it's a tiny download from sourceforge.net:
>
>http://sourceforge.net/search/?type_of_search=soft&type_of_search=soft&words=hdparm
>
>Cheers

That's ok, dd seemed to do the job also.

Thanks Mark.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Those who do not understand Unix are condemned to reinvent it, poorly.
-- Henry Spencer

2008-01-29 00:10:28

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Mark Lord wrote:
>Gene Heskett wrote:
>>..
>> And so far no one has tried to comment on those 2 dmesg lines I've quoted
>> a couple of times now, here's another:
>> [ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
>> [ 0.000000] If you got timer trouble try acpi_use_timer_override
>> what the heck is that trying to tell me to do, in some sort of broken
>> english?
>
>..
>
>I think it says this:
>
> "If your system is misbehaving, then try adding the
> acpi_use_timer_override keyword to your kernel command line
> (/boot/grub/menu.lst) and see if it helps."
>
>So, you can either hardcode it in /boot/grub/menu.lst (just add it to the
> end of the first line you see there that begins with the word "kernel".
>
>Or you can just try it temporarily at boot time (safer, but tricker),
>by catching GRUB (the bootloader) before it actually loads Linux.
>
>Usually there's some key or something it says you have 3 seconds to hit for
> a "menu", so do that, and then use the cursor keys to find the first
> "kernel" line in that menu and hit "e" (edit) to go and add the
> acpi_use_timer_override keyword to the end of that line (same as above).
>
>Hit enter when done, and then the letter b (boot) to load Linux with that
> option.
>
>Clear as mud, right? :)

Precisely Mark. Thanks, I'm building an ide-ata kernel 2.6.24 now, and I've
added that to the argument line for 2.6.24-rc8.

Thanks, Mark.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Of all men's miseries, the bitterest is this:
to know so much and have control over nothing.
-- Herodotus

2008-01-29 00:17:49

by Robert Hancock

Subject: Re: Problem with ata layer in 2.6.24

Gene Heskett wrote:
> And so far no one has tried to comment on those 2 dmesg lines I've quoted a
> couple of times now, here's another:
> [ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
> [ 0.000000] If you got timer trouble try acpi_use_timer_override
> what the heck is that trying to tell me to do, in some sort of broken english?

A lot of NVIDIA-chipset motherboards have BIOS problems where they
include an incorrect ACPI interrupt override for the timer interrupt,
which tends to cause the system to fail to boot due to the timer
interrupt not working. The kernel normally ignores ACPI interrupt
overrides on the timer interrupt for NVIDIA chipsets for this reason.
Unfortunately on some such boards the override is actually correct and
needed, and so this actually causes problems. Hence the
acpi_use_timer_override option.

In any case this is unlikely to have anything to do with your problem,
since if that was messed up you likely would never have even booted.

2008-01-29 00:35:20

by Daniel Barkalow

Subject: Re: Problem with ata layer in 2.6.24

On Mon, 28 Jan 2008, Gene Heskett wrote:

> On Monday 28 January 2008, Daniel Barkalow wrote:
> >Building this and installing it along with the appropriate initrd (which
> >might be handled by Fedora's install scripts)
>
> Or mine, which I've been using for years.

You're ahead of a surprising number of people, including me, if you
understand making initrds.

> >will either get you back to
> >old IDE or will make your kernel panic on boot, depending on whether you
> >got it right (so make sure you can still boot the kernel you're sure of or
> >something from a boot disk). This will also cause your hard drives to show
> >up as different device nodes, so if your boot process doesn't mount by
> >disk uuid but by some other feature (and I don't know what Fedora does),
> >you'll also need to change it to something either stable across access
> >methods or which works for the one you're now using.
>
> It mounts by LABEL=. All of it.

That'll save a huge amount of hassle. So long as you manage to get the
right drivers included and the wrong drivers not included, you should be
pretty much set.

> Fedora is not the only people having trouble, name a distro, its probably
> someplace in that 14,800 hit google returns.

Yeah, but they each may need different instructions, particularly if
they're not mounting by label in general, or not mounting the root
partition by label. That was the big hassle going the opposite direction.
And the procedure is 4 lines to describe to somebody who knows how to
build and install a new kernel for the distro, which is much shorter than
the explanation of how you generally build and install a kernel. A real
howto would have to explain where to get the distro's kernel sources and
default configuration, for example.

-Daniel
*This .sig left intentionally blank*

2008-01-29 00:55:22

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Robert Hancock wrote:
>Gene Heskett wrote:
>> And so far no one has tried to comment on those 2 dmesg lines I've quoted
>> a couple of times now, here's another:
>> [ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
>> [ 0.000000] If you got timer trouble try acpi_use_timer_override
>> what the heck is that trying to tell me to do, in some sort of broken
>> english?
>
>A lot of NVIDIA-chipset motherboards have BIOS problems where they
>include an incorrect ACPI interrupt override for the timer interrupt,
>which tends to cause the system to fail to boot due to the timer
>interrupt not working. The kernel normally ignores ACPI interrupt
>overrides on the timer interrupt for NVIDIA chipsets for this reason.
>Unfortunately on some such boards the override is actually correct and
>needed, and so this actually causes problems. Hence the
>acpi_use_timer_override option.
>
>In any case this is unlikely to have anything to do with your problem,
>since if that was messed up you likely would never have even booted.
>--
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to [email protected]
>More majordomo info at http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at http://www.tux.org/lkml/

In this case, there seems to be a buglet. I turned on the nvidia/amd drivers
under the ATA section of the menu, and turned off pata_amd under the sata
menu in xconfig.

But I've tried twice now and it fails to build the initrd because the pata_amd
module is on the missing list. Of course it's missing, I didn't have it
built...

Next?

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Of course it's possible to love a human being if you don't know them too well.
-- Charles Bukowski

2008-01-29 01:30:42

by Robert Hancock

Subject: Re: Problem with ata layer in 2.6.24

Gene Heskett wrote:
> On Monday 28 January 2008, Robert Hancock wrote:
>> Gene Heskett wrote:
>>> And so far no one has tried to comment on those 2 dmesg lines I've quoted
>>> a couple of times now, here's another:
>>> [ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
>>> [ 0.000000] If you got timer trouble try acpi_use_timer_override
>>> what the heck is that trying to tell me to do, in some sort of broken
>>> english?
>> A lot of NVIDIA-chipset motherboards have BIOS problems where they
>> include an incorrect ACPI interrupt override for the timer interrupt,
>> which tends to cause the system to fail to boot due to the timer
>> interrupt not working. The kernel normally ignores ACPI interrupt
>> overrides on the timer interrupt for NVIDIA chipsets for this reason.
>> Unfortunately on some such boards the override is actually correct and
>> needed, and so this actually causes problems. Hence the
>> acpi_use_timer_override option.
>>
>> In any case this is unlikely to have anything to do with your problem,
>> since if that was messed up you likely would never have even booted.
>
> In this case, there seems to be a buglet. I turned on the nvidia/amd drives
> under the ATA section of the menu, and turned off the pata_amd under the sata
> menu in xconfig.
>
> But I've tried twice now and it fails to build the initrd because the pata_amd
> module is on the missing list. Of course its missing, I didn't have it
> built...
>
> Next?

Check the /etc/modprobe.conf file; a lot of distributions use this to
generate the initrd. If there are references to pata_amd, it'll try to
include it.
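
A small sketch of that check. The function name module_refs is
hypothetical, and the sample file contents (including the
scsi_hostadapter alias lines) are assumed for illustration; a real
/etc/modprobe.conf may look different.

```shell
#!/bin/sh
# List non-comment lines in a modprobe.conf-style file that mention a module.
# Usage: module_refs CONF_FILE MODULE_NAME
# Exit status is non-zero when nothing references the module.
module_refs() {
    grep -v '^[[:space:]]*#' "$1" | grep -w -- "$2"
}

# Demo on a throwaway file with assumed contents.
conf=$(mktemp)
cat > "$conf" <<'EOF'
alias eth0 forcedeth
alias scsi_hostadapter pata_amd
alias scsi_hostadapter1 sata_sil
EOF

module_refs "$conf" pata_amd || echo "no references to pata_amd"
rm -f "$conf"
```

Lines that are already commented out are skipped, so the output shows
only the references an initrd generator would still act on.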

2008-01-29 01:32:21

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Daniel Barkalow wrote:
>On Mon, 28 Jan 2008, Gene Heskett wrote:
>> On Monday 28 January 2008, Daniel Barkalow wrote:
>> >Building this and installing it along with the appropriate initrd (which
>> >might be handled by Fedora's install scripts)
>>
>> Or mine, which I've been using for years.
>
>You're ahead of a surprising number of people, including me, if you
>understand making initrds.

In my script, it's one line:
mkinitrd -f initrd-$VER.img $VER && \

where $VER is the shell variable I edit to = the version number, located at
the top of the script.

Unfortunately, it's failing:
No module pata_amd found for kernel 2.6.24, aborting.
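
A hedged sketch of confirming which module files actually exist for a
kernel version before running mkinitrd. The function name module_built
is hypothetical, and the demo uses a scratch directory in place of a
real /lib/modules/$VER tree.

```shell
#!/bin/sh
# Succeed if a module file NAME.ko exists anywhere under a modules tree.
# Usage: module_built MODULES_DIR MODULE_NAME
module_built() {
    [ -n "$(find "$1" -name "$2.ko" 2>/dev/null | head -n 1)" ]
}

# Demo on a scratch tree that stands in for /lib/modules/$VER.
tree=$(mktemp -d)
mkdir -p "$tree/kernel/drivers/ide/pci"
: > "$tree/kernel/drivers/ide/pci/amd74xx.ko"

for mod in amd74xx pata_amd; do
    if module_built "$tree" "$mod"; then
        echo "$mod: built"
    else
        echo "$mod: missing"
    fi
done
rm -rf "$tree"
```

If a module is reported missing here but mkinitrd still asks for it, the
request is coming from configuration rather than from the kernel build.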

This is with pata_amd turned off and its counterpart under ATA/RLL/etc turned
on. So something is still dependent on it. I do have one sata drive, on an
accessory card in the box, so I need the rest of the sata_sil and friends
stuff. It holds my virtual tapes for amanda. Amanda is also home built; its
security model cannot be successfully bent into the shape of an rpm. They,
BTW, are #2 on Coverity's list of most secure software.

So I've rebuilt 2.6.24 as it originally was, and added the acpi timer line to
the 2.6.24-rc8 stanza's kernel argument list. It will boot one or the other
when I next reboot. It's been about 8 hours since the last error was logged,
which is totally weirdsville to this old fart. Phase of the moon maybe? The
visit to the sawbones to see about my heart? They are going to fit me with a
30 day recorder tomorrow, my skip a beat problem is getting worse. The sort
of stuff that goes with the 7th decade I guess. Officially, I'm wearing out
me, too much sugar, too many times nearly electrocuted=shingles yadda
yadda. :-) Oh, and don't forget Arther, he moved in uninvited about 25 years
ago too. Those people that talk about the golden years? They're full of
excrement...

>> >will either get you back to
>> >old IDE or will make your kernel panic on boot, depending on whether you
>> >got it right (so make sure you can still boot the kernel you're sure of
>> > or something from a boot disk). This will also cause your hard drives to
>> > show up as different device nodes, so if your boot process doesn't mount
>> > by disk uuid but by some other feature (and I don't know what Fedora
>> > does), you'll also need to change it to something either stable across
>> > access methods or which works for the one you're now using.
>>
>> It mounts by LABEL=. All of it.
>
>That'll save a huge amount of hassle. So long as you manage to get the
>right drivers included and the wrong drivers not included, you should be
>pretty much set.
>
>> Fedora is not the only people having trouble, name a distro, its probably
>> someplace in that 14,800 hit google returns.
>
>Yeah, but they each may need different instructions, particularly if
>they're not mounting by label in general, or not mounting the root
>partition by label. That was the big hassle going the opposite direction.
>And the procedure is 4 lines to describe to somebody who knows how to
>build and install a new kernel for the distro, which is much shorter than
>the explanation of how you generally build and install a kernel. A real
>howto would have to explain where to get the distro's kernel sources and
>default configuration, for example.
>
> -Daniel
>*This .sig left intentionally blank*



--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Never drink from your finger bowl -- it contains only water.

2008-01-29 01:51:28

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Robert Hancock wrote:
[...]
>Check the /etc/modprobe.conf file, a lot of distributions use this to
>generate the initrd. If there's references to pata_amd it'll try and
>include it.

Bingo! Thanks Robert, I'll try it again with that line commented out. I wasn't
aware of that connection at all. Yup, it worked, I feel a reboot coming
on. :)

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
If everything seems to be going well, you have obviously overlooked something.

2008-01-29 01:51:59

by Daniel Barkalow

Subject: Re: Problem with ata layer in 2.6.24

On Mon, 28 Jan 2008, Gene Heskett wrote:

> On Monday 28 January 2008, Daniel Barkalow wrote:
> >On Mon, 28 Jan 2008, Gene Heskett wrote:
> >> On Monday 28 January 2008, Daniel Barkalow wrote:
> >> >Building this and installing it along with the appropriate initrd (which
> >> >might be handled by Fedora's install scripts)
> >>
> >> Or mine, which I've been using for years.
> >
> >You're ahead of a surprising number of people, including me, if you
> >understand making initrds.
>
> In my script, its one line:
> mkinitrd -f initrd-$VER.img $VER && \
>
> where $VER is the shell variable I edit to = the version number, located at
> the top of the script.
>
> Unforch, its failing:
> No module pata_amd found for kernel 2.6.24, aborting.
>
> This is with pata_amd turned off and its counterpart under ATA/RLL/etc turned
> on. So something is still dependent on it.

That looks like something in the guts of the initrd; it probably thinks
you need pata_amd and it's unhappy that you don't have it.

Actually, another thing to try is making the ATA/etc one be "y" and
pata_amd be "m". Most likely, this should lead to the ATA one claiming the
drive before the module is loaded (but the module would be loaded later,
to avoid upsetting the initrd); you should be able to tell from dmesg (or
/dev, for that matter) which one got it, and I think built-in drivers will
claim everything they can before an initrd gets loaded.
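
A rough sketch of the /dev check: the node name alone hints at which
stack claimed a disk. The function name disk_stack is hypothetical. An
sdX name only says the disk sits behind the SCSI disk driver, which
libata uses but so do real SCSI and USB storage, so treat the answer as
a hint and confirm it in dmesg.

```shell
#!/bin/sh
# Guess which driver stack owns a disk from its device node name.
# hdX nodes come from the old IDE drivers; sdX nodes come from the SCSI
# disk driver, which is what libata-attached disks use.
# Usage: disk_stack DEVICE_NODE
disk_stack() {
    case "${1##*/}" in
        hd[a-z]*) echo "old IDE" ;;
        sd[a-z]*) echo "libata/SCSI" ;;
        *)        echo "unknown" ;;
    esac
}

for dev in /dev/hda /dev/sda /dev/md0; do
    echo "$dev: $(disk_stack "$dev")"
done
```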

> I do have one sata drive, on an accessory card in the box, so I need the
> rest of the sata_sil and friends stuff.

Assuming it isn't picking up your hard drive, which it isn't, that
shouldn't matter.

-Daniel
*This .sig left intentionally blank*

2008-01-29 02:20:51

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Gene Heskett wrote:
>On Monday 28 January 2008, Robert Hancock wrote:
>[...]
>
>>Check the /etc/modprobe.conf file, a lot of distributions use this to
>>generate the initrd. If there's references to pata_amd it'll try and
>>include it.
>
>Bingo! Thanks Robert, I'll try it again with that line commented. I wasn't
>aware of that connection at all. Yup, it worked, I feel a reboot coming
>on. :)

But it didn't work. Apparently commenting that line out needs to be balanced
by adding another line telling it amd74xx is the 'hostadapter', not
necessarily scsi.

Can this be made more universal so I don't have to edit /etc/modprobe.conf?

Thanks.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Because we don't think about future generations, they will never forget us.
-- Henrik Tikkanen

2008-01-29 03:16:19

by Mark Lord

Subject: Re: Problem with ata layer in 2.6.24

Gene Heskett wrote:
..
> That's ok, dd seemed to do the job also.
..

The two programs operate entirely differently from each other,
so it may still be worth trying the make_bad_sector utility there.

dd goes through the regular kernel I/O calls,
whereas make_bad_sector sends raw ATA commands
directly (more or less) to the drive.
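
For comparison, one read-only way to exercise a single sector through
the regular kernel I/O path with dd. This is an assumed usage, since the
thread does not say how dd was actually invoked. The helper name
read_sector_cmd is hypothetical, and the sector number is the one from
the make_bad_sector example quoted earlier.

```shell
#!/bin/sh
# Print a dd command that reads one 512-byte sector from a device.
# It only prints the command; run the output by hand once it looks right.
# Usage: read_sector_cmd DEVICE SECTOR_NUMBER
read_sector_cmd() {
    echo "dd if=$1 of=/dev/null bs=512 skip=$2 count=1"
}

read_sector_cmd /dev/sda 474507
```

With bs=512, skip counts in 512-byte blocks, so the skip value is the
sector number itself.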

-ml

2008-01-29 03:22:18

by Mark Lord

Subject: Re: Problem with ata layer in 2.6.24

Gene Heskett wrote:
> On Monday 28 January 2008, Gene Heskett wrote:
>> On Monday 28 January 2008, Robert Hancock wrote:
>> [...]
>>
>>> Check the /etc/modprobe.conf file, a lot of distributions use this to
>>> generate the initrd. If there's references to pata_amd it'll try and
>>> include it.
>> Bingo! Thanks Robert, I'll try it again with that line commented. I wasn't
>> aware of that connection at all. Yup, it worked, I feel a reboot coming
>> on. :)
>
> But it didn't work, apparently commenting that line out needs to be balanced
> by adding another line telling it amd74xx is the 'hostadapter', not
> necessarily scsi.
>
> Can this be made more universal so I don't have to edit /etc/modprobe.conf?
>..

You could really do it like Linus (and me), and not bother with modules
for critical services like hard disks.

Just build them *into* the core kernel (select "y" or "checkmark" rather
than "m" or "dot" for modules). This eliminates a ton of crap that can fail,
and may also make your kernel a micro-MIP faster (core memory is often mapped
without page table entries, whereas loaded modules use page tables.. slower, slightly).

Linus just edits the /boot/grub/menu.lst, and clones an existing boot entry
for the new kernel, editing the "kernel" line to match the name of the file
that got installed in /boot by "make install" (from the kernel directory).
He just leaves the ramdisk/initrd line as-was --> wrong version, but that's okay.

I totally get rid of them here, but that requires hardcoding the root=/dev/xxxx
part on the "kernel" line. No big deal, it works just fine that way.
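
A minimal sketch for spotting boot-critical options that are still built
as modules, under the assumption that dropping the initrd needs both the
disk driver and the root filesystem driver built in. The function name
modular_options and the sample config values are hypothetical; the
option list should be adjusted to the hardware in question.

```shell
#!/bin/sh
# Print each listed option that is set to "m" in a kernel .config file.
# Usage: modular_options CONFIG_FILE OPTION...
modular_options() {
    cfg_file=$1
    shift
    for opt in "$@"; do
        grep -q "^$opt=m\$" "$cfg_file" && echo "$opt"
    done
    return 0
}

# Demo on a throwaway config fragment (assumed values).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_BLK_DEV_AMD74XX=m
CONFIG_EXT3_FS=m
EOF

modular_options "$cfg" CONFIG_BLK_DEV_IDEDISK CONFIG_BLK_DEV_AMD74XX CONFIG_EXT3_FS
rm -f "$cfg"
```

Anything this prints would need to be switched from "m" to "y" before
booting without an initrd.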

Cheers

2008-01-29 04:08:07

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Mark Lord wrote:
>Gene Heskett wrote:
>..
>
>> That's ok, dd seemed to do the job also.
>
>..
>
>The two programs operate entirely differently from each other,
>so it may still be worth trying the make_bad_sector utility there.
>
>dd goes through the regular kernel I/O calls,
>whereas make_bad_sector sends raw ATA commands
>directly (more or less) to the drive.
>
Humm, if it (the sector error) continues. I'm rather convinced that was a
one-time transient item caused by doing so many hardware resets. It has not
repeated in subsequent stanzas of this error. Several times it went away
while the drive's long self-test was in progress, and the resets that go with
the reboot, or one of these errors, seem to stop the long test. From my
reading, it should resume with no delay, but maybe that only applies to a
powerdown restart, which I haven't been doing. The last such error was about
11 hours ago now. I just started another long test, which, if OK, should clear
the stuff it's showing now because the test was interrupted. It has passed
that test twice before in the last 36 hours.

Thanks Mark.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
You are a fluke of the universe; you have no right to be here.

2008-01-29 04:24:44

by Kasper Sandberg

Subject: Re: Problem with ata layer in 2.6.24

On Mon, 2008-01-28 at 11:35 -0500, Gene Heskett wrote:
> On Monday 28 January 2008, Mikael Pettersson wrote:
> >Gene Heskett writes:
> > > On Monday 28 January 2008, Peter Zijlstra wrote:
> > > >On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote:
> > > >> 1. Wrong mailing list; use linux-ide (@vger) instead.
> > > >
> > > >What, and keep all us other interested people in the dark?
> > >
> > > As a test, I tried rebooting to the latest fedora kernel and found it
> > > kills X, so I'm back to the second to last fedora version ATM, and the
> > > third 'smartctl -t lng /dev/sda' in 24 hours is running now. The first
> > > two completed with no errors.
> > >
> > > I've added the linux-ide list to refresh those people of the problem,
> > > the logs are being spammed by this message stanza:
> > >
> > > Jan 28 04:46:25 coyote kernel: [26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> > > Jan 28 04:46:25 coyote kernel: [26550.290028] ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out
> > > Jan 28 04:46:25 coyote kernel: [26550.290029] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> > > Jan 28 04:46:25 coyote kernel: [26550.290032] ata1.00: status: { DRDY }
> > > Jan 28 04:46:25 coyote kernel: [26550.290060] ata1: soft resetting link
> > > Jan 28 04:46:25 coyote kernel: [26550.452301] ata1.00: configured for UDMA/100
> > > Jan 28 04:46:25 coyote kernel: [26550.452318] ata1: EH complete
> > > Jan 28 04:46:25 coyote kernel: [26550.455898] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> > > Jan 28 04:46:25 coyote kernel: [26550.456151] sd 0:0:0:0: [sda] Write Protect is off
> > > Jan 28 04:46:25 coyote kernel: [26550.456403] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> >
> >It's not obvious from this incomplete dmesg log what HW or driver
> >is behind ata1, but if the 2.6.24-rc7 kernel matches the 2.6.24 one,
> >
> >it should be pata_amd driving a WDC disk:
> > > [ 30.702887] pata_amd 0000:00:09.0: version 0.3.10
> > > [ 30.703052] PCI: Setting latency timer of device 0000:00:09.0 to 64
> > > [ 30.703188] scsi0 : pata_amd
> > > [ 30.709313] scsi1 : pata_amd
> > > [ 30.710076] ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14
> > > [ 30.710079] ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15
> > > [ 30.864753] ata1.00: ATA-6: WDC WD2000JB-00EVA0, 15.05R15, max UDMA/100
> > > [ 30.864756] ata1.00: 390721968 sectors, multi 16: LBA48
> > > [ 30.871629] ata1.00: configured for UDMA/100
> >
> >Unfortunately we also see:
> > > [ 48.285456] nvidia: module license 'NVIDIA' taints kernel.
> > > [ 48.549725] ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [APC4] -> GSI 19 (level, high) -> IRQ 20
> > > [ 48.550149] NVRM: loading NVIDIA UNIX x86 Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007
> >
> >We have no way of debugging that module, so please try 2.6.24 without it.
>
> Sorry, I can't do this and have a working machine. The nv driver has suffered
> bit rot or something since the FC2 days when it COULD run a 19" crt at
> 1600x1200, and will not drive this 20" wide screen lcd 1680x1050 monitor at
> more than 800x600, which is absolutely butt ugly fuzzy, looking like a jpg
> compressed to 10%. The system is not usable on a day to basis without the
> nvidia driver.
>
> Fix the nv driver so it will run this screen at its native resolution and I'll
> be glad to run it even if it won't run google earth, which I do use from time
> to time. Now, if in all the hits you can get from google on this, currently
> 14,800 just for 'exception Emask', apparently caused by a timeout, if 100% of
> the complainers are running nvidia drivers also, then I see a legit
I can invalidate this theory...
I helped a guy on IRC debug this problem, and he had ATI. I tried having
him stop using fglrx and go to r300.. same problem, and same problem
even with vesa.. :)

Also, I have this on my fileserver with .20, which doesn't even run X
or have module support in the kernel :)

> complaint. Again, fix the nv driver so it will run my screen & I'll be glad
> to switch. I can see the reason, sure, but the machine must be capable of
> doing its common day to day stuff, while using that driver, like running kde
> for kmail, and browsers that work.
>
> >If the problems persist, please try to capture a complete log from the
> >failing kernel -- the interesting bits are everything from initial boot
> >up to and including the first few errors. You may need to increase the
> >kernel's log buffer size if the log gets truncated (CONFIG_LOG_BUF_SHIFT).
>
> If by log you mean /var/log/messages, I have several megabytes of those.
> If you mean a live dmesg capture taken right now, its attached. It contains
> several of these at the bottom. I long ago made the kernel log buffer
> bigger, cuz it couldn't even show the start immediately after the boot, and
> even the dump to syslog was truncated.
>
> >There are no pata_amd changes from 2.6.24-rc7 to 2.6.24 final.
>
> That is what I was afraid of. I've done some limited grepping in that branch
> of the kernel tree, and cannot seem to locate where this EH handler is being
> invoked from.
>
> There is 2 lines of interest in the dmesg:
>
> [ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
> [ 0.000000] If you got timer trouble try acpi_use_timer_override
>
> But I have NDI what it means, kernel argument/xconfig option?
>
> I've also done some googling, and it appears this problem is fairly widespread
> since the switchover to libata was encouraged. A stock fedora F8 kernel
> suffers the same freezes and eventually locks up, but does it without the
> error messages being logged, it just freezes, feeling identical to this in
> the minutes before the total freeze. I've tried 2 of those too, but the
> newest one won't even run X.
>

2008-01-29 04:49:44

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Monday 28 January 2008, Kasper Sandberg wrote:
[...]
>> >We have no way of debugging that module, so please try 2.6.24 without it.
>>
>> Sorry, I can't do this and have a working machine. The nv driver has
>> suffered bit rot or something since the FC2 days when it COULD run a 19"
>> crt at 1600x1200, and will not drive this 20" wide screen lcd 1680x1050
>> monitor at more than 800x600, which is absolutely butt ugly fuzzy, looking
>> like a jpg compressed to 10%. The system is not usable on a day to basis
>> without the nvidia driver.
>>
>> Fix the nv driver so it will run this screen at its native resolution and
>> I'll be glad to run it even if it won't run google earth, which I do use
>> from time to time. Now, if in all the hits you can get from google on
>> this, currently 14,800 just for 'exception Emask', apparently caused by a
>> timeout, if 100% of the complainers are running nvidia drivers also, then
>> I see a legit
>
>I can invalidate this theory...
>i helped a guy on irc debug this problem, and he had ati. I tried having
>him stop using fglrx, and go to r300.. same problem, and same problem
>even with vesa.. :)
>
No Kasper, you are validating it, that it is not nvidia related, which is what
I was also saying.

>also, i have this on my fileserver with .20, which doesent even run X,
>or module support in kernel :)

That far back? Although ISTR I saw it happen once only when I was running
2.6.18-somethingorother.

>> complaint. Again, fix the nv driver so it will run my screen & I'll be
>> glad to switch. I can see the reason, sure, but the machine must be
>> capable of doing its common day to day stuff, while using that driver,
>> like running kde for kmail, and browsers that work.
>>
>> >If the problems persist, please try to capture a complete log from the
>> >failing kernel -- the interesting bits are everything from initial boot
>> >up to and including the first few errors. You may need to increase the
>> >kernel's log buffer size if the log gets truncated
>> > (CONFIG_LOG_BUF_SHIFT).
>>
>> If by log you mean /var/log/messages, I have several megabytes of those.
>> If you mean a live dmesg capture taken right now, its attached. It
>> contains several of these at the bottom. I long ago made the kernel log
>> buffer bigger, cuz it couldn't even show the start immediately after the
>> boot, and even the dump to syslog was truncated.
>>
>> >There are no pata_amd changes from 2.6.24-rc7 to 2.6.24 final.
>>
>> That is what I was afraid of. I've done some limited grepping in that
>> branch of the kernel tree, and cannot seem to locate where this EH handler
>> is being invoked from.
>>
>> There is 2 lines of interest in the dmesg:
>>
>> [ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
>> [ 0.000000] If you got timer trouble try acpi_use_timer_override
>>
>> But I have NDI what it means, kernel argument/xconfig option?
>>
>> I've also done some googling, and it appears this problem is fairly
>> widespread since the switchover to libata was encouraged. A stock fedora
>> F8 kernel suffers the same freezes and eventually locks up, but does it
>> without the error messages being logged, it just freezes, feeling
>> identical to this in the minutes before the total freeze. I've tried 2 of
>> those too, but the newest one won't even run X.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
bureaucrat, n:
A politician who has tenure.

2008-01-29 05:02:23

by Kasper Sandberg

Subject: Re: Problem with ata layer in 2.6.24

On Mon, 2008-01-28 at 23:49 -0500, Gene Heskett wrote:
> On Monday 28 January 2008, Kasper Sandberg wrote:
> [...]
<snip>
> >
> >I can invalidate this theory...
> >i helped a guy on irc debug this problem, and he had ati. I tried having
> >him stop using fglrx, and go to r300.. same problem, and same problem
> >even with vesa.. :)
> >
> No Kasper, you are validating it, that it is not nvidia related, which is what
> I was also saying.
Yeah, that's what I mean - I can invalidate the theory that all the
affected boxes run nvidia.

>
> >also, i have this on my fileserver with .20, which doesent even run X,
> >or module support in kernel :)
>
> That far back? Although ISTR I saw it happen once only when I was running
> 2.6.18-somethingorother.

Yes, I'm afraid so.. I will now provide some complete details, as I feel
they are relevant.

The thing is, I run 6x300GB IDE disks in RAID5.

I have both an onboard VIA IDE controller, and then I bought a Promise
pdc 202 new thingie. I had a problem, however..

After a bit of time I would get a DMA reset error, and it all kind of
went NUTS. It was as if all data access was skewed, and as you might
imagine, this made everything fail badly.

I purchased an ITE-based controller for the drives on the Promise, but
exactly the same thing happened.

The errors I got were:
hdf: dma_intr: bad DMA status (dma_stat=75)
hdf: dma_intr: status=0x50 { DriveReady SeekComplete }
ide: failed opcode was: unknown
---

I then found new hope when I heard that libata provided much better
error handling, so I upgraded to .20.

This made my box usable.

The error happens once or twice a day: the disk LED turns on
constantly and all IO freezes for about half a minute, after which it
returns PROPERLY (thank you libata!). As far as I can tell, the only side
effect is that I get messages like those described here, the ones Google
is flooded with.

To put some timeline perspective on this:
I believe it was in 2005 that I assembled the system, and when I realized
it was faulty on the old IDE driver, I stopped using it - that might have
been at the beginning of 2006. Then for almost a year I wasn't using it,
hoping to somehow fix it, but in January 2007 I think it was, at least in
the very beginning of 2007, I hit upon the idea of trying libata, and ever
since the system has been running 24/7 - producing these errors around 2
times a day.

I have reported my problems to lkml multiple times, but nothing has
happened. I also tried to approach jgarzik directly, but he was not
interested.

I really hope this can be solved now, it's a huge problem.

My fileserver has an Asus K8V motherboard with a VIA chipset (K8T880 I
think it is, or something like it). I am currently using the Promise
controller again, in conjunction with the onboard VIA. Strangely enough,
all the timeouts seem to happen on the Promise (and, when the ITE was
installed, on the ITE), not on the onboard one.


> >> complaint. Again, fix the nv driver so it will run my screen & I'll be
<snip>

2008-01-29 05:48:17

by Michal Jaegermann

Subject: Re: Problem with ata layer in 2.6.24

On Mon, Jan 28, 2008 at 08:31:57PM -0500, Gene Heskett wrote:
>
> In my script, its one line:
> mkinitrd -f initrd-$VER.img $VER && \
>
> where $VER is the shell variable I edit to = the version number, located at
> the top of the script.
>
> Unforch, its failing:
> No module pata_amd found for kernel 2.6.24, aborting.

mkinitrd is just a shell script. Even if its options, and there are
quite a number of these, do not let you influence the choice of
modules in the desired manner, it is pretty trivial to make yourself a
custom version of it and just hardwire there a fixed list of modules
to use instead of relying on general mechanisms which are trying
hard to guess what you may need.

That way your regular 'mkinitrd' will build something to boot with
libata and 'mkinitrd.ide' will use IDE modules for that purpose using
the same "core" kernel.

If you are using distribution kernels, as opposed to your own
configuration, it is quite likely that you will need to install the
'kernel-devel' package and recompile and add the required IDE modules
yourself, as those may not be provided. This is done the same way
as for any other "external" module.

Michal

2008-01-29 06:41:49

by Florian Attenberger

Subject: Re: Problem with ata layer in 2.6.24

On Mon, 28 Jan 2008 14:13:21 -0500
Gene Heskett <[email protected]> wrote:


> >> I had to reboot early this morning due to a freezeup, and I had a
> >> bunch of these in the messages log:
> >> ==============
> >> Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> >> Jan 27 19:42:11 coyote kernel: [42461.915973] ata1.00: cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 dma 4096 out
> >> Jan 27 19:42:11 coyote kernel: [42461.915974] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> >> Jan 27 19:42:11 coyote kernel: [42461.915978] ata1.00: status: { DRDY }
> >> Jan 27 19:42:11 coyote kernel: [42461.916005] ata1: soft resetting link
> >> Jan 27 19:42:12 coyote kernel: [42462.078216] ata1.00: configured for UDMA/100
> >> Jan 27 19:42:12 coyote kernel: [42462.078232] ata1: EH complete
> >> Jan 27 19:42:12 coyote kernel: [42462.090700] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> >> Jan 27 19:42:12 coyote kernel: [42462.114230] sd 0:0:0:0: [sda] Write Protect is off
> >> Jan 27 19:42:12 coyote kernel: [42462.115079] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> >> ===============
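
For anyone trying to read the "cmd" line in the stanza above, here is a minimal Python sketch that decodes it. It assumes the taskfile dump is laid out as command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device. That layout is my reading of the output, not something stated in this thread, although the byte counts logged here are consistent with it. The names decode_cmd, ATA_OPCODES and CMD_RE are made up for illustration.

```python
import re

# Names for the four DMA read/write opcodes from the ATA command set.
# Anything else is reported as "unknown" rather than guessed.
ATA_OPCODES = {
    0xC8: "READ DMA",
    0x25: "READ DMA EXT",
    0xCA: "WRITE DMA",
    0x35: "WRITE DMA EXT",
}

# Assumed layout of the taskfile dump (hypothetical reading, see above):
#   command/feature:nsect:lbal:lbam:lbah
#          /hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
CMD_RE = re.compile(
    r"cmd (?P<tf>[0-9a-f]{2}/(?:[0-9a-f]{2}:){4}[0-9a-f]{2}"
    r"/(?:[0-9a-f]{2}:){4}[0-9a-f]{2}/[0-9a-f]{2})"
    r" tag (?P<tag>\d+) dma (?P<nbytes>\d+) (?P<direction>in|out)"
)

SECTOR_BYTES = 512  # the drive in the log reports 512-byte hardware sectors


def decode_cmd(line):
    """Decode one 'ataN.NN: cmd ...' line; return None if it does not match."""
    m = CMD_RE.search(line)
    if m is None:
        return None
    command_hex, low, high, _device = m.group("tf").split("/")
    command = int(command_hex, 16)
    nsect = int(low.split(":")[1], 16)
    hob_nsect = int(high.split(":")[1], 16)
    if command in (0x25, 0x35):
        # 48-bit commands carry a 16-bit sector count; 0 means 65536.
        sectors = ((hob_nsect << 8) | nsect) or 65536
    else:
        # 28-bit commands carry an 8-bit sector count; 0 means 256.
        sectors = nsect or 256
    nbytes = int(m.group("nbytes"))
    return {
        "opcode": command,
        "name": ATA_OPCODES.get(command, "unknown"),
        "sectors": sectors,
        "dma_bytes": nbytes,
        "direction": m.group("direction"),
        # True when the decoded sector count agrees with the logged size.
        "count_matches": sectors * SECTOR_BYTES == nbytes,
    }
```

Under that assumption, both stanzas posted in this thread decode to writes that timed out: "ca" is a WRITE DMA of 8 sectors and "35" is a WRITE DMA EXT of 344 sectors, which agrees with the "out" direction and the logged byte counts.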


I had this error too, or maybe only a similar one, and another one. I no
longer have the error output for either lying around, so I'm posting both
fixes that I found here on lkml:
1) disabling NCQ like this:
"echo 1 > /sys/block/sda/device/queue_depth"
2) this patch: libata_drain_fifo_on_stuck_drq_hsm.patch
( applies to 2.6.24 too )

Signed-off-by: Mark Lord <[email protected]>
---

--- old/drivers/ata/libata-sff.c 2007-09-28 09:29:22.000000000 -0400
+++ linux/drivers/ata/libata-sff.c 2007-09-28 09:39:44.000000000 -0400
@@ -420,6 +420,28 @@
 	ap->ops->irq_on(ap);
 }
 
+static void ata_drain_fifo(struct ata_port *ap, struct ata_queued_cmd *qc)
+{
+	u8 stat = ata_chk_status(ap);
+	/*
+	 * Try to clear stuck DRQ if necessary,
+	 * by reading/discarding up to two sectors worth of data.
+	 */
+	if ((stat & ATA_DRQ) && (!qc || qc->dma_dir != DMA_TO_DEVICE)) {
+		unsigned int i;
+		unsigned int limit = qc ? qc->sect_size : ATA_SECT_SIZE;
+
+		printk(KERN_WARNING "Draining up to %u words from data FIFO.\n",
+				limit);
+		for (i = 0; i < limit ; ++i) {
+			ioread16(ap->ioaddr.data_addr);
+			if (!(ata_chk_status(ap) & ATA_DRQ))
+				break;
+		}
+		printk(KERN_WARNING "Drained %u/%u words.\n", i, limit);
+	}
+}
+
 /**
  * ata_bmdma_drive_eh - Perform EH with given methods for BMDMA controller
  * @ap: port to handle error for
@@ -476,7 +498,7 @@
 	}
 
 	ata_altstatus(ap);
-	ata_chk_status(ap);
+	ata_drain_fifo(ap, qc);
 	ap->ops->irq_clear(ap);
 
 	spin_unlock_irqrestore(ap->lock, flags);
-
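
For the first workaround above, a small Python sketch to check the current queue depth before and after the echo. The sysfs path is the one quoted above; the helper names are hypothetical. As far as I know NCQ is a SATA feature, so on a PATA disk the file may already read 1, in which case this workaround would change nothing.

```python
from pathlib import Path


def read_queue_depth(path="/sys/block/sda/device/queue_depth"):
    """Return the queue depth currently set for the device, as an integer."""
    return int(Path(path).read_text().strip())


def ncq_effectively_off(depth):
    """A depth of 1 means only one command is outstanding at a time."""
    return depth <= 1
```

Calling ncq_effectively_off(read_queue_depth()) on the affected box should return True once the echo has been applied.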





--
Florian Attenberger <[email protected]>


Attachments:
(No filename) (2.76 kB)
(No filename) (189.00 B)

2008-01-29 12:16:37

by Alan

Subject: Re: Problem with ata layer in 2.6.24

> not one problem but lots---is sufficiently widespread that a Mini HOWTO,
> say, would be really welcome and, I'm guessing, widely used.

We don't see very many libata problems at the distro level and they for
the most part boil down to

- error messages looking different - Most bugs I get are things like
media errors (timeout looks different, UNC report looks different)

- broken hardware - I've closed a whole raft of bugs that turn out to be
new PC systems where even the BIOS doesn't see the drives

- faulty hardware being picked up because we actually do real error
checking now. We now check for and give some devices more slack while
still doing error checking. Both IDE layers also added blacklists for
stuff like the TSScorp DVD drives. Qemu has now had its bugs patched.

- sata_nv with >4GB of RAM, known, being worked on, no old IDE driver
anyway

- pata_ali MWDMA with ATAPI, PIO works fine, all a bit of a mystery and
as it affects only a few chip variants hard to figure out. Workaround
libata.dma=1

- CS handling. On a few boxes using cable select (particularly on one
drive and not the other) shows up a problem, normally a failed SRST.
That's still under investigation.

- Promise timeouts. The old IDE times out then polls the device and finds
the IRQ was never sent and then recovers so the user sees a short stall
but no errors. The new libata doesn't do this and pdc202xx_old thus
produces some error messages on some boxes. Backup polling is on my todo
list.

2008-01-29 14:30:29

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Tuesday 29 January 2008, Alan Cox wrote:
>> not one problem but lots---is sufficiently widespread that a Mini HOWTO,
>> say, would be really welcome and, I'm guessing, widely used.
>
>We don't see very many libata problems at the distro level and they for
>the most part boil down to
>
>- error messages looking different - Most bugs I get are things like
>media errors (timeout looks different, UNC report looks different)
>
>- broken hardware - I've closed a whole raft of bugs that turn out to be
>new PC systems where even the BIOS doesn't see the drives
>
>- faulty hardware being picked up because we actually do real error
>checking now. We now check for and give some devices more slack while
>still doing error checking. Both IDE layers also added blacklists for
>stuff like the TSScorp DVD drives. Qemu has now had its bugs patched.
>
>- sata_nv with >4GB of RAM, knowing being worked on, no old IDE driver
>anyway
>
>- pata_ali MWDMA with ATAPI, PIO works fine, all a bit of a mystery and
>as it affects only a few chip variants hard to figure out. Workaround
>libata.dma=1
>
>- CS handling. On a few boxes using cable select (particularly on one
>drive and not the other) shows up a problem, normally a failed SRST.
>That's still under investigation.
>
>- Promise timeouts. The old IDE times out then polls the device and finds
>the IRQ was never sent and then recovers so the user sees a short stall
>but no errors. The new libata doesn't do this and pdc202xx_old thus
>produces some error messages on some boxes. Backup polling is on my todo
>list.

I have not had a problem, no errors at all, since I rebooted to
2.6.24-rc8 with the added argument in the kernel line in grub
(from dmesg):
[ 0.000000] Kernel command line: ro root=/dev/VolGroup00/LogVol00 acpi_use_timer_override rhgb quiet

which causes dmesg to log, some time later:

[ 27.581823] ENABLING IO-APIC IRQs
[ 27.582014] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 27.592017] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
[ 27.592068] ...trying to set up timer (IRQ0) through the 8259A ... failed.
[ 27.592071] ...trying to set up timer as Virtual Wire IRQ... works.
[ 27.703623] Brought up 1 CPUs

This was about noonish yesterday, and the logs have been silent
regarding this 'exception Emask' error since then. The drive itself
has also passed a smartctl -t long test with no errors since then.

Now, the last boot that had the problem was to 2.6.24, which did
NOT have that 'acpi_use_timer_override' argument, and its dmesg logged:

[ 24.934176] ENABLING IO-APIC IRQs
[ 24.934367] ..TIMER: vector=0x31 apic1=0 pin1=0 apic2=-1 pin2=-1
[ 25.045973] Brought up 1 CPUs

Now, my question is, did the use of that argument, while it looked
like it failed, cause the setup code to do something correct that
the default path didn't do? Is this the clue we're all looking for?
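
To make the comparison between the two boots less error-prone, here is a minimal Python sketch that parses the "..TIMER:" line from each dmesg and reports which fields differ. It only compares the text of the two lines and says nothing about why they differ. The function names are made up for illustration.

```python
import re

# Matches lines such as:
#   ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
TIMER_RE = re.compile(
    r"\.\.TIMER: vector=(0x[0-9a-fA-F]+) apic1=(-?\d+) pin1=(-?\d+) "
    r"apic2=(-?\d+) pin2=(-?\d+)"
)


def parse_timer_line(line):
    """Return the TIMER routing fields as a dict, or None if no match."""
    m = TIMER_RE.search(line)
    if m is None:
        return None
    vector, apic1, pin1, apic2, pin2 = m.groups()
    return {
        "vector": int(vector, 16),
        "apic1": int(apic1),
        "pin1": int(pin1),
        "apic2": int(apic2),
        "pin2": int(pin2),
    }


def diff_timer_routing(a, b):
    """Return {field: (value_in_a, value_in_b)} for fields that differ."""
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}
```

Fed the two lines quoted above, the only difference it reports is pin1: 2 with the argument, 0 without it.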

Since libata is apparently the path taken by TPTB, I'm going to build
and boot to a 2.6.24 using libata, but add that argument to grub's kernel
line in only one of 2 copies of that stanza.

Wish me luck.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The intelligence of any discussion diminishes with the square of the
number of participants.
-- Adam Walinsky

2008-01-29 14:52:29

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Tuesday 29 January 2008, Alan Cox wrote:
>> not one problem but lots---is sufficiently widespread that a Mini HOWTO,
>> say, would be really welcome and, I'm guessing, widely used.
>
>We don't see very many libata problems at the distro level and they for
>the most part boil down to
>
>- error messages looking different - Most bugs I get are things like
>media errors (timeout looks different, UNC report looks different)
>
>- broken hardware - I've closed a whole raft of bugs that turn out to be
>new PC systems where even the BIOS doesn't see the drives
>
>- faulty hardware being picked up because we actually do real error
>checking now. We now check for and give some devices more slack while
>still doing error checking. Both IDE layers also added blacklists for
>stuff like the TSScorp DVD drives. Qemu has now had its bugs patched.
>
>- sata_nv with >4GB of RAM, knowing being worked on, no old IDE driver
>anyway
>
>- pata_ali MWDMA with ATAPI, PIO works fine, all a bit of a mystery and
>as it affects only a few chip variants hard to figure out. Workaround
>libata.dma=1
>
>- CS handling. On a few boxes using cable select (particularly on one
>drive and not the other) shows up a problem, normally a failed SRST.
>That's still under investigation.
>
>- Promise timeouts. The old IDE times out then polls the device and finds
>the IRQ was never sent and then recovers so the user sees a short stall
>but no errors. The new libata doesn't do this and pdc202xx_old thus
>produces some error messages on some boxes. Backup polling is on my todo
>list.

A slight change here: I was going to use the same .config as 2.6.24-rc8, but
just discovered that neither rc8 nor final is finding the drivers for my
dvd writer while using libata, so it's not usable. So I've enabled a couple of
things in the 2.6.24 build that aren't in the 2.6.24-rc8. When I find the
magic twanger, I'll rebuild -rc8 with it too.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
River: "He didn't lie down. They never lie down."
--"Serenity"

2008-01-29 15:05:25

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Tuesday 29 January 2008, Florian Attenberger wrote:
>On Mon, 28 Jan 2008 14:13:21 -0500
>
>Gene Heskett <[email protected]> wrote:
>> >> I had to reboot early this morning due to a freezeup, and I had a
>> >> bunch of these in the messages log:
>> >> ==============
>> >> Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> >> Jan 27 19:42:11 coyote kernel: [42461.915973] ata1.00: cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 dma 4096 out
>> >> Jan 27 19:42:11 coyote kernel: [42461.915974] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>> >> Jan 27 19:42:11 coyote kernel: [42461.915978] ata1.00: status: { DRDY }
>> >> Jan 27 19:42:11 coyote kernel: [42461.916005] ata1: soft resetting link
>> >> Jan 27 19:42:12 coyote kernel: [42462.078216] ata1.00: configured for UDMA/100
>> >> Jan 27 19:42:12 coyote kernel: [42462.078232] ata1: EH complete
>> >> Jan 27 19:42:12 coyote kernel: [42462.090700] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
>> >> Jan 27 19:42:12 coyote kernel: [42462.114230] sd 0:0:0:0: [sda] Write Protect is off
>> >> Jan 27 19:42:12 coyote kernel: [42462.115079] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> >> ===============
>
>I had this error too, or maybe only a similar one, and another, neither
>of which of i still have the error output laying around, so I'm posting both
>fixes, that i found here on lkml:
>1) disabling ncq like that:
>"echo 1 > /sys/block/sda/device/queue_depth"

Interesting..

>2) this patch: libata_drain_fifo_on_stuck_drq_hsm.patch
>( applies to 2.6.24 too )
>
>Signed-off-by: Mark Lord <[email protected]>
>---
>
>--- old/drivers/ata/libata-sff.c 2007-09-28 09:29:22.000000000 -0400
>+++ linux/drivers/ata/libata-sff.c 2007-09-28 09:39:44.000000000 -0400
>@@ -420,6 +420,28 @@
> 	ap->ops->irq_on(ap);
> }
>
>+static void ata_drain_fifo(struct ata_port *ap, struct ata_queued_cmd *qc)
>+{
>+	u8 stat = ata_chk_status(ap);
>+	/*
>+	 * Try to clear stuck DRQ if necessary,
>+	 * by reading/discarding up to two sectors worth of data.
>+	 */
>+	if ((stat & ATA_DRQ) && (!qc || qc->dma_dir != DMA_TO_DEVICE)) {
>+		unsigned int i;
>+		unsigned int limit = qc ? qc->sect_size : ATA_SECT_SIZE;
>+
>+		printk(KERN_WARNING "Draining up to %u words from data FIFO.\n",
>+				limit);
>+		for (i = 0; i < limit ; ++i) {
>+			ioread16(ap->ioaddr.data_addr);
>+			if (!(ata_chk_status(ap) & ATA_DRQ))
>+				break;
>+		}
>+		printk(KERN_WARNING "Drained %u/%u words.\n", i, limit);
>+	}
>+}
>+
> /**
>  * ata_bmdma_drive_eh - Perform EH with given methods for BMDMA controller
>  * @ap: port to handle error for
>@@ -476,7 +498,7 @@
> 	}
>
> 	ata_altstatus(ap);
>-	ata_chk_status(ap);
>+	ata_drain_fifo(ap, qc);
> 	ap->ops->irq_clear(ap);
>
> 	spin_unlock_irqrestore(ap->lock, flags);
>-

This too. Thanks Florian. I'll keep these in mind as there may be more than
one cat in need of skinning here.
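
To see what the drain loop in the quoted patch does, here is a small Python model of it. This is a sketch, not kernel code. It assumes the device is a simple counter of 16-bit words still pending and that DRQ drops when the counter reaches zero. The default limit of 512 assumes ATA_SECT_SIZE is 512. The function name and return shape are made up for illustration.

```python
def drain_fifo(words_pending, limit=512):
    """Model of the patch's for-loop.

    Returns (words_read, counter_as_logged). Call it only with
    words_pending >= 1, mirroring the patch, which enters the loop
    only when DRQ is already set.
    """
    i = 0
    words_read = 0
    while i < limit:
        words_read += 1          # ioread16(): read and discard one word
        words_pending -= 1
        if words_pending <= 0:   # DRQ has dropped, nothing left to drain
            break                # i is not incremented on this path
        i += 1
    return words_read, i
```

If I read the loop right, the counter printed by the second printk is one less than the number of words actually read whenever the loop exits through the break, and equals the limit when DRQ never clears. That only affects the log message, not the draining itself.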

See a couple of posts I made to lkml this morning for the investigation I'm
doing re the kernel argument 'acpi_use_timer_override', experimental builds
under way right now.

Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number
when dmesg says it's found ok at ata2.00? I've turned on an option that says
something about using the bios for device access this build, but I'll be
surprised if that's it. :)

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Ah, sweet Springtime, when a young man lightly turns his fancy over!

2008-01-29 15:54:07

by Alan

Subject: Re: Problem with ata layer in 2.6.24

> As slight change here, I was going to use the same .config as 2.6.24-rc8, but
> just discovered that neither rc8 nor final is finding the drivers for my

If it is not finding a driver that is nothing to do with libata. It means
it's not being loaded by the distribution, or the distribution kernel is
too old (2.6.22) for the hardware - in which case see the Fedora respins
which are on 2.6.23.something right now.

Alan

2008-01-29 16:12:55

by Mark Lord

Subject: Re: Problem with ata layer in 2.6.24

Gene Heskett wrote:
>..
> Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number
> when dmesg says its found ok at ata2.00? I've turned on an option that says
> something about using the bios for device access this build, but I'll be
> surprised if that's it. :)
..

It should show up as /dev/scd0 or something very similar.

2008-01-29 16:33:36

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Tuesday 29 January 2008, Alan Cox wrote:
>> As slight change here, I was going to use the same .config as 2.6.24-rc8,
>> but just discovered that neither rc8 nor final is finding the drivers for
>> my
>
>If it is not finding a driver that is nothing to do with libata. It means
>it's not being loaded by the distribution, or the distribution kernel is
>too old (2.6.22) for the hardware - in which case see the Fedora respins
>which are on 2.6.23.something right now.
>
>Alan

Home built kernel Alan. But you are as good as anyone to tell me what I
need to turn on in order for this dvdwriter to be enabled:
[ 28.862478] ata2.00: ATAPI: LITE-ON DVDRW SHM-165H6S, HS06, max UDMA/66
....
[ 28.908647] ata2.00: limited to UDMA/33 due to 40-wire cable
[ 29.081253] ata2.00: configured for UDMA/33
....
It has had several 80 wire cables tried, which hasn't fixed this, and it
does not seem to affect its operation when it does work.
....
[ 29.132405] scsi 1:0:0:0: CD-ROM LITE-ON DVDRW SHM-165H6S HS06 PQ: 0 ANSI: 5
....
[ 43.450795] scsi 1:0:0:0: Attached scsi generic sg1 type 5
-------
No further mention of it in dmesg, and k3b cannot find the drive at any
/dev/sgX address.

.config attached, what else do I need to turn on?

There is also this in the log since I logged in and startx'd:

Jan 29 11:21:26 coyote automount[1923]: create_udp_client:101: hostname lookup failed: Operation not permitted
Jan 29 11:21:26 coyote automount[1923]: create_tcp_client:321: hostname lookup failed: Operation not permitted
Jan 29 11:21:26 coyote automount[1923]: lookup_mount: exports lookup failed for .directory
Jan 29 11:21:26 coyote automount[1923]: lookup_mount: lookup(file): key ".directory" not found in map

however a stop and restart of k3b does not regenerate another set of those.
So I have NDI what actually generated those.

I also discovered that my build nvidia script needs at least one run of
the complete .run version to get all the right versions of the GL stuffs
installed. That isn't related though, just a passing comment.

FWIW, this is a 2.6.24 kernel ATM, without that kernel argument acpi_use_timer_override.
If my theory is right, I should see some of those errors now.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
You can move the world with an idea, but you have to think of it first.


Attachments:
(No filename) (2.38 kB)
.config (60.68 kB)

2008-01-29 16:36:56

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Tuesday 29 January 2008, Mark Lord wrote:
>Gene Heskett wrote:
>>..
>> Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number
>> when dmesg says its found ok at ata2.00? I've turned on an option that
>> says something about using the bios for device access this build, but I'll
>> be surprised if that's it. :)
>
>..
>
>It should show up as /dev/scd0 or something very similar.

Tisn't. Darnit.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
clock speed

2008-01-29 16:49:34

by Mikael Pettersson

Subject: Re: Problem with ata layer in 2.6.24

Gene Heskett writes:
> On Tuesday 29 January 2008, Alan Cox wrote:
> >> As slight change here, I was going to use the same .config as 2.6.24-rc8,
> >> but just discovered that neither rc8 nor final is finding the drivers for
> >> my
> >
> >If it is not finding a driver that is nothing to do with libata. It means
> >it's not being loaded by the distribution, or the distribution kernel is
> >too old (2.6.22) for the hardware - in which case see the Fedora respins
> >which are on 2.6.23.something right now.
> >
> >Alan
>
> Home built kernel Alan. But you are as good as anyone to tell me what I
> need to turn on in order for this dvdwriter to be enabled:
> [ 28.862478] ata2.00: ATAPI: LITE-ON DVDRW SHM-165H6S, HS06, max UDMA/66
> ....
> [ 28.908647] ata2.00: limited to UDMA/33 due to 40-wire cable
> [ 29.081253] ata2.00: configured for UDMA/33
> ....
> it has had several 80 wire cables tried, hasn't fixed this, and does not
> seem to effect its operation when it does work.
> ....
> [ 29.132405] scsi 1:0:0:0: CD-ROM LITE-ON DVDRW SHM-165H6S HS06 PQ: 0 ANSI: 5
> ....
> [ 43.450795] scsi 1:0:0:0: Attached scsi generic sg1 type 5
> -------
> No further mention of it in dmesg, and k3b cannot find the drive at any
> /dev/sgX address.
>
> .config attached, what else do I need to turn on?

...

> # CONFIG_BLK_DEV_SR is not set

For starters, enable CONFIG_BLK_DEV_SR.

2008-01-29 16:50:34

by rgheck

Subject: Re: Problem with ata layer in 2.6.24

Mark Lord wrote:
> Gene Heskett wrote:
>> ..
>> Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx'
>> number when dmesg says its found ok at ata2.00? I've turned on an
>> option that says something about using the bios for device access
>> this build, but I'll be surprised if that's it. :)
> ..
>
> It should show up as /dev/scd0 or something very similar.
>
Does it appear as /dev/sr0? Try ll /dev/s* and see what you get.

Anyway, these /dev/ entries are produced by udev, not by libata.
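
A minimal sketch of the check suggested above, written in Python rather than as a shell one-liner. It only lists node names that commonly belong to optical drives and generic SCSI devices; the pattern list and the find_nodes name are illustrative assumptions, and which names actually appear depends on the udev rules of the distribution.

```python
import glob
import os

# Common node names for optical drives handled by the 'sr' driver, plus
# the generic SCSI nodes. This list is an assumption for illustration;
# the names that really exist depend on the udev rules in use.
PATTERNS = ("sr[0-9]*", "scd[0-9]*", "sg[0-9]*")


def find_nodes(dev_root="/dev"):
    """Return the sorted node names under dev_root matching PATTERNS."""
    found = set()
    for pattern in PATTERNS:
        for path in glob.glob(os.path.join(dev_root, pattern)):
            found.add(os.path.basename(path))
    return sorted(found)


if __name__ == "__main__":
    nodes = find_nodes()
    print("\n".join(nodes) if nodes else "no sr/scd/sg nodes found")
```

If only sgN nodes show up and no srN or scdN node does, that is consistent with the 'sr' driver not being built or loaded.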

rh

2008-01-29 16:58:47

by Jeff Garzik

Subject: Re: Problem with ata layer in 2.6.24

Gene Heskett wrote:
> Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number
> when dmesg says its found ok at ata2.00? I've turned on an option that says
> something about using the bios for device access this build, but I'll be
> surprised if that's it. :)

I think you mean /dev/scdx not /dev/sdx. Make sure you have the 'sr'
driver compiled and loaded (CONFIG_BLK_DEV_SR).

The bios-for-dev-access thing definitely won't help, and may hurt (by
taking over the device you wanted to test).

Jeff

2008-01-29 17:05:09

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Tuesday 29 January 2008, Mikael Pettersson wrote:
>Gene Heskett writes:
> > On Tuesday 29 January 2008, Alan Cox wrote:
> > >> As slight change here, I was going to use the same .config as
> > >> 2.6.24-rc8, but just discovered that neither rc8 nor final is finding
> > >> the drivers for my
> > >
> > >If it is not finding a driver that is nothing to do with libata. It
> > > means it's not being loaded by the distribution, or the distribution
> > > kernel is too old (2.6.22) for the hardware - in which case see the
> > > Fedora respins which are on 2.6.23.something right now.
> > >
> > >Alan
> >
> > Home built kernel Alan. But you are as good as anyone to tell me what I
> > need to turn on in order for this dvdwriter to be enabled:
> > [ 28.862478] ata2.00: ATAPI: LITE-ON DVDRW SHM-165H6S, HS06, max
> > UDMA/66 ....
> > [ 28.908647] ata2.00: limited to UDMA/33 due to 40-wire cable
> > [ 29.081253] ata2.00: configured for UDMA/33
> > ....
> > it has had several 80 wire cables tried, hasn't fixed this, and does not
> > seem to effect its operation when it does work.
> > ....
> > [ 29.132405] scsi 1:0:0:0: CD-ROM LITE-ON DVDRW SHM-165H6S
> > HS06 PQ: 0 ANSI: 5 ....
> > [ 43.450795] scsi 1:0:0:0: Attached scsi generic sg1 type 5
> > -------
> > No further mention of it in dmesg, and k3b cannot find the drive at any
> > /dev/sgX address.
> >
> > .config attached, what else do I need to turn on?
>
>...
>
> > # CONFIG_BLK_DEV_SR is not set
>
>For starters, enable CONFIG_BLK_DEV_SR.

That could stand to be moved or renamed, it is well buried in the menu for the
REAL scsi stuffs, which I don't have any of. Enabled & building now.
Thanks.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
An air of FRENCH FRIES permeates my nostrils!!

2008-01-29 17:07:09

by rgheck

Subject: Re: Problem with ata layer in 2.6.24

Alan Cox wrote:
>> not one problem but lots---is sufficiently widespread that a Mini HOWTO,
>> say, would be really welcome and, I'm guessing, widely used.
>>
>
> We don't see very many libata problems at the distro level and they for
> the most part boil down to
>
> - sata_nv with >4GB of RAM, knowing being worked on, no old IDE driver
> anyway
>
Is this >4GB or >=4GB? I've seen contradictory reports, and I've got 4GB.

Richard

2008-01-29 17:16:22

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Tuesday 29 January 2008, Jeff Garzik wrote:
>Gene Heskett wrote:
>> Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number
>> when dmesg says its found ok at ata2.00? I've turned on an option that
>> says something about using the bios for device access this build, but I'll
>> be surprised if that's it. :)
>
>I think you mean /dev/scdx not /dev/sdx. Make sure you have the 'sr'
>driver compiled and load (CONFIG_BLK_DEV_SR).
>
That menu item COULD be moved, I don't have any REAL scsi stuff, so I didn't
look there. My bad, with help from hiding it like that. :-)

>The bios-for-dev-access thing definitely won't help, and may hurt (by
>taking over the device you wanted to test).
>
Ok, if BLK_DEV_SR fails, I'll take that back out. I'm heating the room making
kernels here. :)

Thanks Jeff.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Life sucks, but death doesn't put out at all.
-- Thomas J. Kopp

2008-01-29 17:17:10

by Alan

Subject: Re: Problem with ata layer in 2.6.24


> Is this >4GB or >=4GB? I've seen contradictory reports, and I've got 4GB.

Depends on how the memory is mapped. It affects any memory physically
above the 4GB boundary.

Alan

2008-01-29 17:32:20

by Jeff Garzik

Subject: Re: Problem with ata layer in 2.6.24

Gene Heskett wrote:
> On Tuesday 29 January 2008, Jeff Garzik wrote:
>> Gene Heskett wrote:
>>> Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number
>>> when dmesg says its found ok at ata2.00? I've turned on an option that
>>> says something about using the bios for device access this build, but I'll
>>> be surprised if that's it. :)
>> I think you mean /dev/scdx not /dev/sdx. Make sure you have the 'sr'
>> driver compiled and load (CONFIG_BLK_DEV_SR).
>>
> That menu item COULD be moved, I don't have any REAL scsi stuff, so I didn't
> look there. My bad, with help from hiding it like that. :-)
>
>> The bios-for-dev-access thing definitely won't help, and may hurt (by
>> taking over the device you wanted to test).
>>
> Ok, if BLK_DEV_SR fails, I'll take that back out. I'm heating the room making
> kernels here. :)

I can say with 100% certainty that 'sr' is required in order to use your
dvd writer with libata. :)

Jeff


2008-01-29 17:38:36

by Daniel Barkalow

Subject: Re: Problem with ata layer in 2.6.24

On Tue, 29 Jan 2008, Gene Heskett wrote:

> >For starters, enable CONFIG_BLK_DEV_SR.
>
> That could stand to be moved or renamed, it is well buried in the menu for the
> REAL scsi stuffs, which I don't have any of. Enabled & building now.

The "SCSI support type (disk, tape, CD-ROM)" section of that menu actually
applies to all ATA-command-set devices that don't use the old IDE code.
For example, usb-storage uses "SCSI disk" out of that section, and
I've only seen "Probe all LUNs on each SCSI device" be needed for a
particular USB card reader with two slots. At this point, most of the
things in the kernel that refer to SCSI probably should say "storage" (or
"ATA", really, but that would make the acronyms confusing).

Incidentally, you should be able to save debugging time for problems like
missing "sr" by building it as a module, which will build really quickly
and not require a reboot to test.

-Daniel
*This .sig left intentionally blank*

2008-01-29 17:42:24

by Adam Turk

Subject: Re: Problem with ata layer in 2.6.24


I just found this thread and it looks like it will fix my problem too. I
have an IDE cd-rw drive and 2 SCSI hard drives. My ide cd-rw drive hasn't
been showing up. I looked at setting scsi cdrom support
(CONFIG_BLK_DEV_SR) but it doesn't mention anything about ide drives
using libata.
I know the drive is being detected by looking at dmesg:
ata_piix 0000:00:07.1: version 2.12
scsi1 : ata_piix
scsi2 : ata_piix
ata1: PATA max UDMA/33 cmd 0x1f0 ctl 0x3f6 bmdma 0xffa0 irq 14
ata2: PATA max UDMA/33 cmd 0x170 ctl 0x376 bmdma 0xffa8 irq 15
ata1.00: ATAPI: Memorex 52MAXX 3252AJ1, 4WS2, max UDMA/33
ata1.00: configured for UDMA/33
ata2: port disabled. ignoring.
scsi 1:0:0:0: CD-ROM Memorex 52MAXX 3252AJ1 4WS2 PQ: 0 ANSI: 5
If this works, then it really needs to be moved and renamed. I am compiling with BLK_DEV_SR set.

Just my $0.02 but may be worth more or less,
Adam



2008-01-29 17:49:39

by Alan

Subject: Re: Problem with ata layer in 2.6.24

> things in the kernel that refer to SCSI probably should say "storage" (or
> "ATA", really, but that would make the acronyms confusing).

SCSI is a command protocol. It is what your CD-ROM drive and USB storage
devices talk (albeit with a bit of an accent).

Alan

2008-01-29 17:57:57

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Tuesday 29 January 2008, Adam Turk wrote:
>I just found this thread and it looks like it will fix my problem too. I
> have an IDE cd-rw drive and 2 SCSI hard drives. My ide cd-rw drive hasn't
> been showing up. I looked at setting scsi cdrom support
> (CONFIG_BLK_DEV_SR) but it doesn't mention anything about ide drives
> using libata. I know the drive is being detected by looking at dmesg:
>ata_piix 0000:00:07.1: version 2.12
>scsi1 : ata_piix
>scsi2 : ata_piix
>ata1: PATA max UDMA/33 cmd 0x1f0 ctl 0x3f6 bmdma 0xffa0 irq 14
>ata2: PATA max UDMA/33 cmd 0x170 ctl 0x376 bmdma 0xffa8 irq 15
>ata1.00: ATAPI: Memorex 52MAXX 3252AJ1, 4WS2, max UDMA/33
>ata1.00: configured for UDMA/33
>ata2: port disabled. ignoring.
>scsi 1:0:0:0: CD-ROM Memorex 52MAXX 3252AJ1 4WS2 PQ: 0 ANSI: 5
>if this works then it really needs to move and be renamed. I am compiling
> with DEV_SR set.
>
>Just my $0.02 but may be worth more or less,
>Adam
>
That fixed me right up, Adam, & k3b is once again as happy as a clam.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Main's Law:
For every action there is an equal and opposite government program.

2008-01-29 18:00:24

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Tuesday 29 January 2008, Daniel Barkalow wrote:
>On Tue, 29 Jan 2008, Gene Heskett wrote:
>> >For starters, enable CONFIG_BLK_DEV_SR.
>>
>> That could stand to be moved or renamed, it is well buried in the menu for
>> the REAL scsi stuffs, which I don't have any of. Enabled & building now.
>
>The "SCSI support type (disk, tape, CD-ROM)" section of that menu actually
>applies to all ATA-command-set devices that don't use the old IDE code.
>For example, usb-storage uses "SCSI disk" out of that section, and
>I've only seen "Probe all LUNs on each SCSI device" be needed for a
>particular USB card reader with two slots. At this point, most of the
>things in the kernel that refer to SCSI probably should say "storage" (or
>"ATA", really, but that would make the acronyms confusing).
>
>Incidentally, you should be able to save debugging time for problems like
>missing "sr" by building it as a module, which will build really quickly
>and not require a reboot to test.
>
> -Daniel
>*This .sig left intentionally blank*

I did, Daniel, but while that has worked, it's not been 100% foolproof in the
past, so I just waste the 9 minutes building a new kernel as cheap insurance.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Mal: "If it's Alliance trouble you got, you might want to consider another
ship. Some onboard here fought for the Independents."
--Episode #8, "Out of Gas"

2008-01-29 18:09:45

by Mark Lord

Subject: Re: Problem with ata layer in 2.6.24

Gene Heskett wrote:
> On Tuesday 29 January 2008, Mark Lord wrote:
>> Gene Heskett wrote:
>>> ..
>>> Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number
>>> when dmesg says its found ok at ata2.00? I've turned on an option that
>>> says something about using the bios for device access this build, but I'll
>>> be surprised if that's it. :)
>> ..
>>
>> It should show up as /dev/scd0 or something very similar.
>
> Tisn't. Darnit.
..

It requires CONFIG_SCSI, CONFIG_BLK_DEV_SD, CONFIG_BLK_DEV_SR, in the kernel .config.

The _SR one ("SCSI Reader") is for CD/DVD support.
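
A minimal sketch of how one might check a .config for the three options named above before spending time on a rebuild. The function names and the sample text are illustrative assumptions; the sample mirrors the "# CONFIG_BLK_DEV_SR is not set" line quoted earlier in the thread.

```python
import re

# The three options named in the message above.
REQUIRED = ("CONFIG_SCSI", "CONFIG_BLK_DEV_SD", "CONFIG_BLK_DEV_SR")


def parse_config(text):
    """Map option name -> value ('y', 'm', ...) for every option that is set.

    Lines such as '# CONFIG_FOO is not set' are comments and are skipped.
    """
    options = {}
    for line in text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if match:
            options[match.group(1)] = match.group(2)
    return options


def missing_options(text, required=REQUIRED):
    """Return the required options that are neither built in nor modular."""
    options = parse_config(text)
    return [name for name in required if options.get(name) not in ("y", "m")]


if __name__ == "__main__":
    # Hypothetical three-line excerpt, not the attached .config.
    sample = (
        "CONFIG_SCSI=y\n"
        "CONFIG_BLK_DEV_SD=y\n"
        "# CONFIG_BLK_DEV_SR is not set\n"
    )
    print(missing_options(sample))
```

To check a real tree, the text of the .config file in the kernel source directory can be read and passed to missing_options().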

Cheers

2008-01-29 18:13:38

by Mark Lord

Subject: Re: Problem with ata layer in 2.6.24

rgheck wrote:
> Alan Cox wrote:
>>> not one problem but lots---is sufficiently widespread that a Mini
>>> HOWTO, say, would be really welcome and, I'm guessing, widely used.
>>>
>>
>> We don't see very many libata problems at the distro level and they for
>> the most part boil down to
>>
>> - sata_nv with >4GB of RAM, knowing being worked on, no old IDE driver
>> anyway
>>
> Is this >4GB or >=4GB? I've seen contradictory reports, and I've got 4GB.
..

For all practical purposes, most memory over 3GB (or sometimes even 2GB)
on a 32-bit x86 system is treated as >4GB by the motherboard.

Because it's not the amount of *memory* that matters so much,
but rather the amount of *used address space*. Video cards,
PCI devices, other motherboard resources etc.. can all subtract
from the available address space, leaving much less than 4GB
for your RAM.
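
The arithmetic behind this can be sketched as follows. This is a simplified model, assuming the chipset remaps any RAM displaced by the device address hole to just above the 4GB boundary; some boards simply hide that RAM instead, and the 1GB hole size in the example is a made-up figure.

```python
GIB = 1024 ** 3
FOUR_GIB = 4 * GIB


def ram_above_4g(installed_bytes, mmio_hole_bytes):
    """Bytes of RAM that end up above the 4GB physical address boundary.

    Assumes the address space below 4GB is shared between RAM and a
    single hole reserved for video cards, PCI devices and so on, and
    that displaced RAM is remapped above 4GB (an assumption, see above).
    """
    usable_below = FOUR_GIB - mmio_hole_bytes
    return max(0, installed_bytes - usable_below)


if __name__ == "__main__":
    # 4GB installed with a hypothetical 1GB hole: 1GB lands above 4GB.
    print(ram_above_4g(4 * GIB, 1 * GIB) // GIB)
    # 2GB installed with the same hole: everything fits below 4GB.
    print(ram_above_4g(2 * GIB, 1 * GIB) // GIB)
```

Under this model a machine with exactly 4GB installed can still have memory physically above the 4GB boundary.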

-ml

2008-01-29 18:14:05

by Daniel Barkalow

Subject: Re: Problem with ata layer in 2.6.24

On Tue, 29 Jan 2008, Alan Cox wrote:

> > things in the kernel that refer to SCSI probably should say "storage" (or
> > "ATA", really, but that would make the acronyms confusing).
>
> SCSI is a command protocol. It is what your CD-ROM drive and USB storage
> devices talk (albeit with a bit of an accent).

Among other things, yes. But SCSI standards also specify electrical
interfaces that aren't at all related to the electrical interfaces used by
a lot of devices, and a lot of the places the kernel uses the term suggest
that it's also talking about the electrical interface (or, at least,
connector shape). For example, it's misleading to talk about "SCSI CDROM
support" meaning the command protocol when hardly anybody has ever seen a
CDROM drive that doesn't use the SCSI command protocol, but most people
know about both SCSI-connector and PATA-connector CDROM drives.

-Daniel
*This .sig left intentionally blank*

2008-01-29 18:15:18

by Daniel Barkalow

Subject: Re: Problem with ata layer in 2.6.24

On Tue, 29 Jan 2008, Alan Cox wrote:

> > not one problem but lots---is sufficiently widespread that a Mini HOWTO,
> > say, would be really welcome and, I'm guessing, widely used.
>
> We don't see very many libata problems at the distro level and they for
> the most part boil down to
>
> - error messages looking different - Most bugs I get are things like
> media errors (timeout looks different, UNC report looks different)

The SCSI error reporting really ought to include a simple interpretation
of the error for end users ("The drive doesn't support this command" "A
sector's data got lost" "The drive timed out" "The drive failed" "The
drive is entirely gone"). There's too much similarity between the message
you get when you try a SMART test that doesn't apply to the drive and what
you get when the drive is broken.
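
A rough sketch of what such an interpretation layer could look like for the libata "Emask" value seen in the logs in this thread. The bit values follow enum ata_completion_errors in include/linux/libata.h as far as I know and are worth checking against your own tree; the plain-language sentences and the explain_emask name are my own assumptions, not anything the kernel prints.

```python
# Bit values as in enum ata_completion_errors (include/linux/libata.h);
# verify against your kernel tree. The sentences are illustrative wording.
EMASK_BITS = {
    0x01: "The drive reported an error for this command",
    0x02: "The drive and controller got out of step (protocol violation)",
    0x04: "The drive timed out",
    0x08: "A sector's data could not be read or written (media error)",
    0x10: "Data was corrupted on the cable or link (ATA bus error)",
    0x20: "The controller had trouble talking to the system (host bus error)",
}


def explain_emask(emask):
    """Return one plain-language sentence per known bit set in emask."""
    return [text for bit, text in EMASK_BITS.items() if emask & bit]


if __name__ == "__main__":
    # 'Emask 0x4 (timeout)' is the result value in the log at the top
    # of this thread.
    for sentence in explain_emask(0x4):
        print(sentence)
```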

> - faulty hardware being picked up because we actually do real error
> checking now. We now check for and give some devices more slack while
> still doing error checking. Both IDE layers also added blacklists for
> stuff like the TSScorp DVD drives. Qemu has now had its bugs patched.

I think this is the big source of unhappy users (and, of course, they all
look the same and the reports stay findable by Google, so it looks a lot
worse than it is). People getting this problem in distro kernels probably
really do want to have a way to report it with enough detail from logs to
get it dealt with and then switch back to old IDE until the fix propagates
through.

And it's possible that the error recovery is suboptimal in some cases. It
seems to like resetting drives too much; perhaps if it keeps seeing the
same problem and resetting the drive, it should decide that the drive's
error reporting is just bad and just ignore that error like the old IDE
did (but, in this case, after saying what it's doing).

-Daniel
*This .sig left intentionally blank*

2008-01-29 18:32:23

by Mark Lord

Subject: Re: Problem with ata layer in 2.6.24

rgheck wrote:
> Mark Lord wrote:
>> rgheck wrote:
>>> Alan Cox wrote:
>>>>> not one problem but lots---is sufficiently widespread that a Mini
>>>>> HOWTO, say, would be really welcome and, I'm guessing, widely used.
>>>>>
>>>>
>>>> We don't see very many libata problems at the distro level and they for
>>>> the most part boil down to
>>>>
>>>> - sata_nv with >4GB of RAM, knowing being worked on, no old IDE driver
>>>> anyway
>>>>
>>> Is this >4GB or >=4GB? I've seen contradictory reports, and I've got
>>> 4GB.
>> ..
>>
>> For all practical purposes, most memory over 3GB (or sometimes even 2GB)
>> on a 32-bit x86 system is treated as >4GB by the motherboard.
>>
>> Because it's not the amount of *memory* that matters so much,
>> but rather the amount of *used address space*. Video cards,
>> PCI devices, other motherboard resources etc.. can all subtract
>> from the available address space, leaving much less than 4GB
>> for your RAM.
>
> Right. So it looks like I do have this issue, though I haven't seen any
> actual problems on 24. Is there a known workaround?
..

For now, the workaround is to not enable the RAM above 4GB.
Your kernel .config file should therefore have these two lines:

CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set

Later, once the issue is fixed at the driver level (soon),
you can get your high memory back again by enabling CONFIG_HIGHMEM64G,
though this will cost a few percent of performance in the extra
page table overhead it creates.

Cheers

2008-01-29 18:35:41

by rgheck

Subject: Re: Problem with ata layer in 2.6.24

Mark Lord wrote:
> rgheck wrote:
>> Alan Cox wrote:
>>>> not one problem but lots---is sufficiently widespread that a Mini
>>>> HOWTO, say, would be really welcome and, I'm guessing, widely used.
>>>>
>>>
>>> We don't see very many libata problems at the distro level and they for
>>> the most part boil down to
>>>
>>> - sata_nv with >4GB of RAM, knowing being worked on, no old IDE driver
>>> anyway
>>>
>> Is this >4GB or >=4GB? I've seen contradictory reports, and I've got
>> 4GB.
> ..
>
> For all practical purposes, most memory over 3GB (or sometimes even 2GB)
> on a 32-bit x86 system is treated as >4GB by the motherboard.
>
> Because it's not the amount of *memory* that matters so much,
> but rather the amount of *used address space*. Video cards,
> PCI devices, other motherboard resources etc.. can all subtract
> from the available address space, leaving much less than 4GB
> for your RAM.

Right. So it looks like I do have this issue, though I haven't seen any
actual problems on 24. Is there a known workaround?

rh

2008-01-29 18:47:26

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Tuesday 29 January 2008, Jeff Garzik wrote:
>Gene Heskett wrote:
>> On Tuesday 29 January 2008, Jeff Garzik wrote:
>>> Gene Heskett wrote:
>>>> Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx'
>>>> number when dmesg says its found ok at ata2.00? I've turned on an
>>>> option that says something about using the bios for device access this
>>>> build, but I'll be surprised if that's it. :)
>>>
>>> I think you mean /dev/scdx not /dev/sdx. Make sure you have the 'sr'
>>> driver compiled and load (CONFIG_BLK_DEV_SR).
>>
>> That menu item COULD be moved, I don't have any REAL scsi stuff, so I
>> didn't look there. My bad, with help from hiding it like that. :-)
>>
>>> The bios-for-dev-access thing definitely won't help, and may hurt (by
>>> taking over the device you wanted to test).
>>
>> Ok, if BLK_DEV_SR fails, I'll take that back out. I'm heating the room
>> making kernels here. :)
>
>I can say with 100% certainty that 'sr' is required in order to use your
>dvd writer with libata. :)
>
> Jeff

And as usual, you are 100% correct, thanks.

And now back to our regularly scheduled testing for 'exception Emask'
errors. :)

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Main's Law:
For every action there is an equal and opposite government program.

2008-01-29 18:50:52

by Alan

Subject: Re: Problem with ata layer in 2.6.24

> The SCSI error reporting really ought to include a simple interpretation
> of the error for end users ("The drive doesn't support this command" "A
> sector's data got lost" "The drive timed out" "The drive failed" "The
> drive is entirely gone"). There's too much similarity between the message
> you get when you try a SMART test that doesn't apply to the drive and what
> you get when the drive is broken.

That would be the SCSI verbose messages option. I think the Eric
Youngdale consortium added it about Linux 1.2. Nowadays it's always built
that way.

> And it's possible that the error recovery is suboptimal in some cases. It
> seems to like resetting drives too much; perhaps if it keeps seeing the
> same problem and resetting the drive, it should decide that the drive's
> error reporting is just bad and just ignore that error like the old IDE
> did (but, in this case, after saying what it's doing).

Nothing like casually praying the user's data hasn't gone for a walk, is
there? If we don't act on these errors, the users don't report them until
something really bad occurs, so that isn't an option.

Alan

2008-01-29 18:59:39

by Alan

Subject: Re: Problem with ata layer in 2.6.24

> That could stand to be moved or renamed, it is well buried in the menu for the
> REAL scsi stuffs, which I don't have any of.

Yes you do - USB storage and ATAPI are SCSI

2008-01-29 19:14:41

by Daniel Barkalow

Subject: Re: Problem with ata layer in 2.6.24

On Tue, 29 Jan 2008, Alan Cox wrote:

> > The SCSI error reporting really ought to include a simple interpretation
> > of the error for end users ("The drive doesn't support this command" "A
> > sector's data got lost" "The drive timed out" "The drive failed" "The
> > drive is entirely gone"). There's too much similarity between the message
> > you get when you try a SMART test that doesn't apply to the drive and what
> > you get when the drive is broken.
>
> That would be the SCSI verbose messages option. I think the Eric
> Youngdale consortium added it about Linux 1.2. Nowdays its always built
> that way.

I've seen a lot of verbosity out of SCSI messages, but I haven't seen a
straightforward interpretation of the problem in there. It's all
information useful for debugging, not information useful for system
administration.

> > And it's possible that the error recovery is suboptimal in some cases. It
> > seems to like resetting drives too much; perhaps if it keeps seeing the
> > same problem and resetting the drive, it should decide that the drive's
> > error reporting is just bad and just ignore that error like the old IDE
> > did (but, in this case, after saying what it's doing).
>
> Nothing like casually praying the users data hasn't gone for a walk is
> there. If we don't act on them the users don't report them until
> something really bad occurs so that isn't an option.

On the other hand, bringing the system down because a device is
misbehaving is a poor idea. I've personally recovered most of the data off
of a dying drive because the system was willing to let me keep using the
drive anyway; IIRC, the drive didn't work at all after a reboot, so I
would have lost all the data instead of only a little had the system
insisted on a perfectly functioning drive in order to use it at all.

There ought to be some middle ground between doing nothing until the
computer really breaks and breaking the computer before then, but that's
an issue not specific to libata.

-Daniel
*This .sig left intentionally blank*

2008-01-29 19:38:56

by Alan

Subject: Re: Problem with ata layer in 2.6.24

> I've seen a lot of verbosity out of SCSI messages, but I haven't seen a
> straightforward interpretation of the problem in there. It's all
> information useful for debugging, not information useful for system
> administration.

It tells you what is going on. Unfortunately that frequently requires
some basic knowledge of how to interpret the error report. Drive
interface behaviour simply doesn't boil down to a fault light on the
dashboard or a "tighten the cable". For most common fault types you'll
get errors most administrators should find meaningful - like "Media error"

> On the other hand, bringing the system down because a device is
> misbehaving is a poor idea. I've personally recovered most of the data off

Hence we have RAID and SATA hotplug.

Alan

2008-01-29 22:42:29

by Gene Heskett

Subject: Re: Problem with ata layer in 2.6.24

On Tuesday 29 January 2008, Alan Cox wrote:
>> That could stand to be moved or renamed, it is well buried in the menu for
>> the REAL scsi stuffs, which I don't have any of.
>
>Yes you do - USB storage and ATAPI are SCSI

By the linux software definition maybe. But I've defined scsi as that which
uses a 50 wire cable using 50 contact centronics connectors since the
mid '70's, and which often needs a ready supply of nubile virgins to
sacrifice to make it work, particularly with the old resistor pack
terminations & psu's whose 5 volt line is only 4.85 volts due to old age.
That's what I call REAL scsi. It's also a REAL PITA if the terms aren't
active.

You can call what you are doing 'scsi' because you are using much the same
command structure, and that is good, but it's not the real thing with all its
hardware warts and/or capabilities. For one thing, this version usually
works. :)

Furinstance, you can tell 2 scsi devices on the same controller to talk to
each other, moving files from one to the other, and the host controller can
then go to sleep & the cpu isn't involved until the devices send it a wakeup
to advise the controller that the transfer has been done, and the controller
may or may not then interrupt and advise the cpu. You can do that with
separate controllers too as long as they have a compatible DMA channel
available to both.

I doubt libata has that capability now, or ever will, cuz these ide/atapi
devices are generally dumber than rocks about that. But any device claiming
to be scsi-II is supposed to be able to do those sorts of things while the
cpu is off crunching numbers for BOINC or whatever.

But that puts my mild objections to classifying this as 'scsi' in a more
understandable context. :-)

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
When some people decide it's time for everyone to make big changes,
it means that they want you to change first.

2008-01-29 22:53:21

by Alan

Subject: Re: Problem with ata layer in 2.6.24

> By the linux software definition maybe. But I've defined scsi as that which
> uses a 50 wire cable using 50 contact centronics connectors since the
> mid '70's, and which often needs a ready supply of nubile virgins t

25, 50 or 68, with multiple voltage levels, plus of course it might be
over fibre or copper FC loop and ..

SCSI is a protocol.

2008-01-29 22:57:34

by Adam Turk

Subject: RE: Problem with ata layer in 2.6.24


> From: [email protected]
>>if this works then it really needs to move and be renamed. I am compiling
>> with DEV_SR set.
>>
> That fixed me right up, Adam, & k3b is once again as happy as a clam.

Fixed it for me too. I just realized the default config in 2.6.24 is way different than the default config in 2.6.23.

If I remember correctly, there was talk of separating the libata and scsi
code. This was a while ago. I am not a kernel programmer, only a user, but
either the scsi and libata kconfig menus should be joined and made generic,
or options like cdrom support should be in both kconfig menus. Alan says
libata is scsi with an accent, so maybe merging the two isn't as bad as it
sounds.

Just my $0.02, probably worth less in this case.
Adam

2008-01-30 00:19:55

by Mark Lord

Subject: Re: Problem with ata layer in 2.6.24

Gene Heskett wrote:
>
> I doubt libata has that capability now, or ever will, cuz these ide/atapi
> devices are generally dumber than rocks about that. But any device claiming
> to be scsi-II is supposed to be able to do those sorts of things while the
> cpu is off crunching numbers for BOINC or whatever.
..

The CD/DVD drives are all "MMC" devices internally, which means they speak
a SCSI command protocol, regardless of the electrical or optical interface.

Linux is software, and the software protocol is exactly the same for them,
no matter what the cable/bus type happens to be.
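
To make "the software protocol is exactly the same" a little more concrete, here is a small sketch of the standard SCSI INQUIRY command and the fixed part of its response, which is where the "CD-ROM", vendor, product and revision fields in the dmesg lines quoted in this thread come from. The byte layout follows the standard INQUIRY format; the sample response is assembled by hand from the strings in Gene's dmesg output and is not a capture from real hardware, and the function names are illustrative.

```python
import struct

INQUIRY_OPCODE = 0x12

# A few peripheral device type codes; 0x05 is what the kernel reports
# as 'type 5' for CD/DVD drives.
PERIPHERAL_TYPES = {0x00: "disk", 0x01: "tape", 0x05: "CD/DVD"}


def build_inquiry_cdb(alloc_len=36):
    """Build a 6-byte INQUIRY CDB asking for standard inquiry data."""
    # opcode, EVPD=0, page code=0, 16-bit allocation length, control=0
    return struct.pack(">BBBHB", INQUIRY_OPCODE, 0, 0, alloc_len, 0)


def parse_inquiry(data):
    """Decode device type, vendor, product and revision from a response."""
    return {
        "type": PERIPHERAL_TYPES.get(data[0] & 0x1F, "other"),
        "vendor": data[8:16].decode("ascii").strip(),
        "product": data[16:32].decode("ascii").strip(),
        "revision": data[32:36].decode("ascii").strip(),
    }


if __name__ == "__main__":
    # Hand-built sample: type 5, removable bit set, version 5, 31 more bytes.
    header = bytes([0x05, 0x80, 0x05, 0x02, 31, 0, 0, 0])
    sample = header + b"LITE-ON " + b"DVDRW SHM-165H6S" + b"HS06"
    print(build_inquiry_cdb().hex())
    print(parse_inquiry(sample))
```

The same command bytes would be sent whether the drive hangs off a parallel SCSI bus, a PATA cable via ATAPI, or USB; only the transport that carries them differs.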

Cheers

2008-02-02 07:13:54

by Tejun Heo

Subject: Re: Problem with ata layer in 2.6.24

Kasper Sandberg wrote:
> to put some timeline perspective into this.
> i believe it was in 2005 i assembled the system, and when i realized it
> was faulty, on old ide driver, i stopped using it - that miht have been
> in beginning of 2006. then for almost a year i werent using it, hoping
> to somehow fix it, but in january 2007 i think it was, atleast in the
> very beginning of 2007, i hit upon the idea of trying libata, and ever
> since the system has been running 24/7 - doing these errors around 2
> times a day.
>
> i have multiple times reported my problems to lkml, but nothing has
> happened, i also tried to aproeach jgarzik direcly, but he was not
> interested.
>
> i really hope this can be solved now, its a huge problem
>
> my fileserver has an asus k8v motherboard, with via chipset (k8t880 i
> think it is, or something like it). currently using the promise
> controller again(strangely enough all the timeouts seems to happen here,
> and when the ITE was on, there, not the onboard one), in conjunction
> with the onboard via.

Timeouts are nasty to debug. They can be caused by a whole range of
different problems including transmission errors, bad power, a faulty
drive, a mishandled media error, IRQ misrouting, or a dumb hardware bug.
A timeout is basically 'uh... I told the controller to do something but
it never called me back'.

If you see timeouts on multiple devices connected to different
controllers, chances are that the problem is somewhere else. The
most likely culprit is bad power. Please...

* Post the result of 'lspci -nn' and kernel log including full boot log
and error messages.

* Try to isolate the problem, i.e. does removing some of the drives
fix the problem? If the problem is localized to a certain device,
what happens if you move it? Does the problem follow the drive or stay
with the port? If the failing drives are SATA, it's a good idea to
power some of the failing drives with a separate PSU and see whether
anything is different.

By trying to isolate the hardware problem, more can be learned about the
error condition, and even when the problem turns out not to be a hardware
problem, it gives us much deeper insight into the problem and clues
regarding where to look.
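
As a small aid for the isolation steps above, here is a sketch that counts libata exception lines per port in a saved kernel log, so it is easier to see whether the errors follow a drive or stay with a port after hardware is moved. The regular expression is an assumption based on the "ataN.NN: exception Emask" lines quoted at the top of this thread; other kernel versions may word the message differently.

```python
import re
from collections import Counter

# Matches e.g. 'ata1.00: exception Emask 0x0 SAct 0x0 ...'
EXCEPTION_RE = re.compile(r"\b(ata\d+(?:\.\d+)?): exception Emask")


def count_exceptions(log_text):
    """Count libata exception lines per port/device name."""
    counts = Counter()
    for line in log_text.splitlines():
        match = EXCEPTION_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Two exception lines copied from the first message in this thread.
    sample = (
        "Jan 27 02:28:29 coyote kernel: [193207.445158] ata1.00: exception "
        "Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen\n"
        "Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception "
        "Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen\n"
        "Jan 27 19:42:11 coyote kernel: [42461.916005] ata1: soft resetting link\n"
    )
    for port, count in count_exceptions(sample).items():
        print(port, count)
```

Comparing the counts before and after swapping cables, ports or power supplies gives a quick view of where the errors went.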

Thanks.

--
tejun