2008-02-14 17:47:36

by Malte Schröder

Subject: Regression: kernel 2.6.24{,.1} ahci problem, does not boot (resend)

Hello,
on one of my machines neither 2.6.24 nor 2.6.24.1 works.
The system is 64bit on Athlon X2 and ATI-Chipset (SB600).

Extract from the kernel messages during boot:

[ 66.943103] ahci 0000:00:12.0: controller can't do 64bit DMA, forcing 32bit
[ 66.950374] ahci 0000:00:12.0: controller can't do PMP, turning off CAP_PMP
[ 67.956470] ahci 0000:00:12.0: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl SATA mode
[ 67.964996] ahci 0000:00:12.0: flags: ncq sntf ilck pm led clo pio slum part
[ 67.972820] scsi0 : ahci
[ 67.975699] scsi1 : ahci
[ 67.978445] scsi2 : ahci
[ 67.981178] scsi3 : ahci
[ 67.983949] ata1: SATA max UDMA/133 abar m1024@0xfadffc00 port 0xfadffd00 irq 509
[ 67.991825] ata2: SATA max UDMA/133 abar m1024@0xfadffc00 port 0xfadffd80 irq 509
[ 67.999729] ata3: SATA max UDMA/133 abar m1024@0xfadffc00 port 0xfadffe00 irq 509
[ 68.007619] ata4: SATA max UDMA/133 abar m1024@0xfadffc00 port 0xfadffe80 irq 509
[ 68.470669] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 98.431945] ata1.00: qc timeout (cmd 0xec)
[ 98.454907] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 98.461296] ata1: failed to recover some devices, retrying in 5 secs
[ 103.916773] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 133.878045] ata1.00: qc timeout (cmd 0xec)
[ 133.882371] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 133.888797] ata1: failed to recover some devices, retrying in 5 secs
[ 139.343901] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 169.305174] ata1.00: qc timeout (cmd 0xec)
[ 169.309534] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 169.315926] ata1: failed to recover some devices, retrying in 5 secs
[ 174.771030] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)

The complete boot-log (captured via serial console), lspci output,
output of hdparm -I and the kernel config are attached.

The PC has one IDE-drive and four SATA-drives (see hdparm.txt).
If I wait long enough, the IDE-drive gets recognized and I can
continue the boot process, but the SATA-drives are never recognized.
The system works fine with kernel 2.6.23.11 (later kernels not tested).
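
The retry pattern in the excerpt above can be checked mechanically. What follows is a minimal sketch, not part of the original report: it copies the ata1 lines from the excerpt and measures how long each IDENTIFY attempt (cmd 0xec) waits after link up before timing out. The function name `identify_timeouts` and the regular expression are illustrative assumptions.

```python
import re

# ata1 lines copied from the boot messages quoted in this report.
LOG = """\
[ 68.470669] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 98.431945] ata1.00: qc timeout (cmd 0xec)
[ 103.916773] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 133.878045] ata1.00: qc timeout (cmd 0xec)
[ 139.343901] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 169.305174] ata1.00: qc timeout (cmd 0xec)
"""

# "[ <seconds>] <message>" as printed by the kernel with timestamps enabled.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def identify_timeouts(log):
    """Return seconds elapsed between each 'link up' and the next 'qc timeout'."""
    intervals = []
    link_up_at = None
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp, message = float(match.group(1)), match.group(2)
        if "SATA link up" in message:
            link_up_at = stamp
        elif "qc timeout" in message and link_up_at is not None:
            intervals.append(stamp - link_up_at)
            link_up_at = None
    return intervals


if __name__ == "__main__":
    for seconds in identify_timeouts(LOG):
        print(f"IDENTIFY timed out {seconds:.2f} s after link up")
```

Each of the three attempts gives up just under 30 seconds after the link comes up, so the link itself is established every time and only the command completion never arrives.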

I hope someone can help ...

Regards
--
---------------------------------------
Malte Schröder
[email protected]
ICQ# 68121508
---------------------------------------


Attachments:
(No filename) (2.30 kB)
lspci.txt (40.60 kB)
boot.txt (27.64 kB)
config-2.6.24.1 (51.97 kB)
hdparm.txt (9.23 kB)
signature.asc (189.00 B)

2008-02-14 23:35:30

by Tejun Heo

Subject: Re: Regression: kernel 2.6.24{,.1} ahci problem, does not boot (resend)

Malte Schröder wrote:
> Hello,
> on one of my machines neither 2.6.24 nor 2.6.24.1 works.
> The system is 64bit on Athlon X2 and ATI-Chipset (SB600).
>
> Extract from the kernel messages during boot:
>
> [ 66.943103] ahci 0000:00:12.0: controller can't do 64bit DMA, forcing 32bit
> [ 66.950374] ahci 0000:00:12.0: controller can't do PMP, turning off CAP_PMP
> [ 67.956470] ahci 0000:00:12.0: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl SATA mode
> [ 67.964996] ahci 0000:00:12.0: flags: ncq sntf ilck pm led clo pio slum part
> [ 67.972820] scsi0 : ahci
> [ 67.975699] scsi1 : ahci
> [ 67.978445] scsi2 : ahci
> [ 67.981178] scsi3 : ahci
> [ 67.983949] ata1: SATA max UDMA/133 abar m1024@0xfadffc00 port 0xfadffd00 irq 509
> [ 67.991825] ata2: SATA max UDMA/133 abar m1024@0xfadffc00 port 0xfadffd80 irq 509
> [ 67.999729] ata3: SATA max UDMA/133 abar m1024@0xfadffc00 port 0xfadffe00 irq 509
> [ 68.007619] ata4: SATA max UDMA/133 abar m1024@0xfadffc00 port 0xfadffe80 irq 509
> [ 68.470669] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [ 98.431945] ata1.00: qc timeout (cmd 0xec)
> [ 98.454907] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> [ 98.461296] ata1: failed to recover some devices, retrying in 5 secs
> [ 103.916773] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [ 133.878045] ata1.00: qc timeout (cmd 0xec)
> [ 133.882371] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> [ 133.888797] ata1: failed to recover some devices, retrying in 5 secs
> [ 139.343901] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [ 169.305174] ata1.00: qc timeout (cmd 0xec)
> [ 169.309534] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> [ 169.315926] ata1: failed to recover some devices, retrying in 5 secs
> [ 174.771030] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>
> The complete boot-log (captured via serial console), lspci output,
> output of hdparm -I and the kernel config are attached.
>
> The PC has one IDE-drive and four SATA-drives (see hdparm.txt).
> If I wait long enough, the IDE-drive gets recognized and I can
> continue the boot process, but the SATA-drives are never recognized.
> The system works fine with kernel 2.6.23.11 (later kernels not tested).

Does irqpoll kernel parameter help?

--
tejun

2008-02-15 18:12:29

by Malte Schröder

Subject: Re: Regression: kernel 2.6.24{,.1} ahci problem, does not boot (resend)

On Fri, 15 Feb 2008 08:35:10 +0900
Tejun Heo <[email protected]> wrote:

> Malte Schröder wrote:
> > Hello,
> > on one of my machines neither 2.6.24 nor 2.6.24.1 works.
> > The system is 64bit on Athlon X2 and ATI-Chipset (SB600).
> >
> > Extract from the kernel messages during boot:
> >
> > [ 66.943103] ahci 0000:00:12.0: controller can't do 64bit DMA, forcing 32bit
> > [ 66.950374] ahci 0000:00:12.0: controller can't do PMP, turning off CAP_PMP
> > [ 67.956470] ahci 0000:00:12.0: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl SATA mode
> > [ 67.964996] ahci 0000:00:12.0: flags: ncq sntf ilck pm led clo pio slum part
> > [ 67.972820] scsi0 : ahci
> > [ 67.975699] scsi1 : ahci
> > [ 67.978445] scsi2 : ahci
> > [ 67.981178] scsi3 : ahci
> > [ 67.983949] ata1: SATA max UDMA/133 abar m1024@0xfadffc00 port 0xfadffd00 irq 509
> > [ 67.991825] ata2: SATA max UDMA/133 abar m1024@0xfadffc00 port 0xfadffd80 irq 509
> > [ 67.999729] ata3: SATA max UDMA/133 abar m1024@0xfadffc00 port 0xfadffe00 irq 509
> > [ 68.007619] ata4: SATA max UDMA/133 abar m1024@0xfadffc00 port 0xfadffe80 irq 509
> > [ 68.470669] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> > [ 98.431945] ata1.00: qc timeout (cmd 0xec)
> > [ 98.454907] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> > [ 98.461296] ata1: failed to recover some devices, retrying in 5 secs
> > [ 103.916773] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> > [ 133.878045] ata1.00: qc timeout (cmd 0xec)
> > [ 133.882371] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> > [ 133.888797] ata1: failed to recover some devices, retrying in 5 secs
> > [ 139.343901] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> > [ 169.305174] ata1.00: qc timeout (cmd 0xec)
> > [ 169.309534] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> > [ 169.315926] ata1: failed to recover some devices, retrying in 5 secs
> > [ 174.771030] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> >
> > The complete boot-log (captured via serial console), lspci output,
> > output of hdparm -I and the kernel config are attached.
> >
> > The PC has one IDE-drive and four SATA-drives (see hdparm.txt).
> > If I wait long enough, the IDE-drive gets recognized and I can
> > continue the boot process, but the SATA-drives are never recognized.
> > The system works fine with kernel 2.6.23.11 (later kernels not tested).
>
> Does irqpoll kernel parameter help?

No, problem stays.

--
---------------------------------------
Malte Schröder
[email protected]
ICQ# 68121508
---------------------------------------


Attachments:
signature.asc (189.00 B)

2008-02-21 02:51:29

by Tejun Heo

Subject: Re: Regression: kernel 2.6.24{,.1} ahci problem, does not boot (resend)

Malte Schröder wrote:
>> Does irqpoll kernel parameter help?
>
> No, problem stays.

Can you capture full boot log w/ irqpoll specified? If you have root
filesystem connected to ahci, you'll probably have to use serial or
netconsole. Also, please post full boot log from 2.6.23.11.

Thanks.

--
tejun

2008-02-21 16:28:50

by Malte Schröder

Subject: Re: Regression: kernel 2.6.24{,.1} ahci problem, does not boot (resend)

On Thu, 21 Feb 2008 11:51:11 +0900
Tejun Heo <[email protected]> wrote:

> Malte Schröder wrote:
> >> Does irqpoll kernel parameter help?
> >
> > No, problem stays.
>
> Can you capture full boot log w/ irqpoll specified? If you have root
> filesystem connected to ahci, you'll probably have to use serial or
> netconsole. Also, please post full boot log from 2.6.23.11.
>
> Thanks.
>

I "solved" the problem by updating the BIOS. It now works perfectly. I
thought I had mailed that .. maybe I forgot.

--
---------------------------------------
Malte Schröder
[email protected]
ICQ# 68121508
---------------------------------------


Attachments:
signature.asc (189.00 B)

2008-03-17 21:01:12

by Jan Kasprzak

Subject: Re: Regression: kernel 2.6.24{,.1} ahci problem, does not boot (resend)

Malte Schröder wrote:
: I "solved" the problem by updating the BIOS. It now works perfectly. I
: thought I had mailed that .. maybe I forgot.

I have the same problem (ASUS M2R32-MVP board with SB600 chipset,
Athlon64 X2), but I am already using the latest BIOS for this mainboard.

2.6.22.19 works
2.6.24.3 does not work
2.6.25-rc6 does not work

I can try to bisect it (maybe tomorrow).
I use AHCI mode, maybe I should try the legacy mode as well.
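
Bisecting here means a binary search over the commits between a kernel that works and one that does not, which is what `git bisect` automates. A minimal sketch of the idea follows, under loud assumptions: the commit labels are hypothetical, and `is_bad` is a stand-in for building and booting each candidate kernel by hand.

```python
def first_bad(commits, is_bad):
    """Return the first bad commit in an ordered history.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # breakage is at mid or earlier
        else:
            good = mid  # breakage is after mid
    return commits[bad]


if __name__ == "__main__":
    # Hypothetical linear history; in practice these would be commit IDs
    # between a working and a non-working kernel tag.
    history = [f"c{i}" for i in range(16)]
    # Stand-in for "build, boot, check whether the SATA drives show up".
    broken_from = history.index("c11")
    culprit = first_bad(history, lambda c: history.index(c) >= broken_from)
    print("first bad commit:", culprit)
```

Each round halves the remaining range, so even several thousand commits between two releases need only a dozen or so test boots.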

-Yenya

--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/ Journal: http://www.fi.muni.cz/~kas/blog/ |
>> If you find yourself arguing with Alan Cox, you’re _probably_ wrong. <<
>> --James Morris in "How and Why You Should Become a Kernel Hacker" <<

2008-03-18 14:01:33

by Jan Kasprzak

Subject: Re: Regression: kernel 2.6.24{,.1} ahci problem, does not boot (resend)

Jan Kasprzak wrote:
: Malte Schröder wrote:
: : I "solved" the problem by updating the BIOS. It now works perfectly. I
: : thought I had mailed that .. maybe I forgot.
:
: I have the same problem (ASUS M2R32-MVP board with SB600 chipset,
: Athlon64 X2), but I am already using the latest BIOS for this mainboard.

Sorry for the noise, there is an even newer BIOS (dated Mar 01),
and with this BIOS my mainboard works in AHCI mode even with 2.6.25-rc6.

-Yenya

--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/ Journal: http://www.fi.muni.cz/~kas/blog/ |
>> If you find yourself arguing with Alan Cox, you’re _probably_ wrong. <<
>> --James Morris in "How and Why You Should Become a Kernel Hacker" <<

2008-03-19 21:04:39

by Jan Kasprzak

Subject: Re: Regression: kernel 2.6.24{,.1} ahci problem, does not boot (resend)

Tejun Heo wrote:
: Hmm... I still wanna know why it got broke in the first place. The only
: thing I can think of is IRQ routing problem in which case pci=nomsi or
: irqpoll should help. Any chance you can test the old BIOS?

Yes, I can. What data are you interested in? Boot logs with
pci=nomsi and irqpoll ? Anything else?

-Yenya

--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/ Journal: http://www.fi.muni.cz/~kas/blog/ |
>> If you find yourself arguing with Alan Cox, you’re _probably_ wrong. <<
>> --James Morris in "How and Why You Should Become a Kernel Hacker" <<

2008-03-19 22:14:07

by Tejun Heo

Subject: Re: Regression: kernel 2.6.24{,.1} ahci problem, does not boot (resend)

Jan Kasprzak wrote:
> Jan Kasprzak wrote:
> : Malte Schröder wrote:
> : : I "solved" the problem by updating the BIOS. It now works perfectly. I
> : : thought I had mailed that .. maybe I forgot.
> :
> : I have the same problem (ASUS M2R32-MVP board with SB600 chipset,
> : Athlon64 X2), but I am already using the latest BIOS for this mainboard.
>
> Sorry for the noise, there is an even newer BIOS (dated Mar 01),
> and with this BIOS my mainboard works in AHCI mode even with 2.6.25-rc6.

Hmm... I still wanna know why it got broke in the first place. The only
thing I can think of is IRQ routing problem in which case pci=nomsi or
irqpoll should help. Any chance you can test the old BIOS?

--
tejun

2008-03-19 22:49:57

by Tejun Heo

Subject: Re: Regression: kernel 2.6.24{,.1} ahci problem, does not boot (resend)

Jan Kasprzak wrote:
> Tejun Heo wrote:
> : Hmm... I still wanna know why it got broke in the first place. The only
> : thing I can think of is IRQ routing problem in which case pci=nomsi or
> : irqpoll should help. Any chance you can test the old BIOS?
>
> Yes, I can. What data are you interested in? Boot logs with
> pci=nomsi and irqpoll ? Anything else?

pci=nomsi, then, irqpoll should be enough for now.
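
For reference, the two test boots could be prepared like this. This is only a sketch, assuming the two parameters are meant to be tried one after the other in separate boots; the base command line (root device, serial console settings) is hypothetical and has to be replaced with the machine's real one. The helper name `with_params` is made up for illustration.

```python
def with_params(cmdline, *params):
    """Append kernel parameters that are not already on the command line."""
    parts = cmdline.split()
    for param in params:
        if param not in parts:
            parts.append(param)
    return " ".join(parts)


# Hypothetical base command line; console=ttyS0,115200 sends kernel
# messages to the first serial port so the full boot log can be captured.
BASE = "root=/dev/sda1 ro console=ttyS0,115200"

# One boot per parameter, in the order requested above.
BOOTS = [
    with_params(BASE, "pci=nomsi"),  # disable Message Signaled Interrupts
    with_params(BASE, "irqpoll"),    # poll all handlers on every interrupt
]

if __name__ == "__main__":
    for cmdline in BOOTS:
        print(cmdline)
```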

Thanks.

--
tejun

2008-03-21 16:43:44

by Jan Kasprzak

Subject: Re: Regression: kernel 2.6.24{,.1} ahci problem, does not boot (resend)

Tejun Heo wrote:
: Jan Kasprzak wrote:
: > Tejun Heo wrote:
: > : Hmm... I still wanna know why it got broke in the first place. The only
: > : thing I can think of is IRQ routing problem in which case pci=nomsi or
: > : irqpoll should help. Any chance you can test the old BIOS?
: >
: > Yes, I can. What data are you interested in? Boot logs with
: > pci=nomsi and irqpoll ? Anything else?
:
: pci=nomsi, then, irqpoll should be enough for now.
:
I have tried to do this, but unfortunately the ASUS BIOS flash
utility does not let me downgrade to the 0906 BIOS, where the problem
occurs. It only lets me downgrade from 1101 to 100x, but not to 0906
(even when called from the 100x BIOS).

-Yenya

--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/ Journal: http://www.fi.muni.cz/~kas/blog/ |
>> If you find yourself arguing with Alan Cox, you’re _probably_ wrong. <<
>> --James Morris in "How and Why You Should Become a Kernel Hacker" <<

2008-03-22 08:13:38

by Tejun Heo

Subject: Re: Regression: kernel 2.6.24{,.1} ahci problem, does not boot (resend)

Jan Kasprzak wrote:
> Tejun Heo wrote:
> : Jan Kasprzak wrote:
> : > Tejun Heo wrote:
> : > : Hmm... I still wanna know why it got broke in the first place. The only
> : > : thing I can think of is IRQ routing problem in which case pci=nomsi or
> : > : irqpoll should help. Any chance you can test the old BIOS?
> : >
> : > Yes, I can. What data are you interested in? Boot logs with
> : > pci=nomsi and irqpoll ? Anything else?
> :
> : pci=nomsi, then, irqpoll should be enough for now.
> :
> I have tried to do this, but unfortunately the ASUS BIOS flash
> utility does not let me downgrade to the 0906 BIOS, where the problem
> occurs. It only lets me downgrade from 1101 to 100x, but not to 0906
> (even when called from the 100x BIOS).

Eee... Let's hope someone else reports again.

Thanks for the trouble.

--
tejun

2008-03-22 08:14:33

by Tejun Heo

Subject: Re: Regression: kernel 2.6.24{,.1} ahci problem, does not boot (resend)

Tejun Heo wrote:
>
> Eee... Let's hope someone else reports again.
>
> Thanks for the trouble.

That reads a bit strange, right? That should have been "Thanks for the
trouble you put into testing." :-)

--
tejun