2008-10-24 07:04:33

by Alexander Beregalov

Subject: 2.6.27-rc1 (2fca5c): libata: kernel cant boot

Hi

commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e
Author: Jens Axboe <[email protected]>
Date: Wed Oct 22 09:34:49 2008 +0200

libata: switch to using block layer tagging support


This kernel cannot read even sector 0 of the disk holding the rootfs. It
initializes the disk, but can't read it at all.
CMD646 on Sparc

Reverting helped.


2008-10-24 07:11:14

by Jens Axboe

Subject: Re: 2.6.27-rc1 (2fca5c): libata: kernel cant boot

On Fri, Oct 24 2008, Alexander Beregalov wrote:
> Hi
>
> commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e
> Author: Jens Axboe <[email protected]>
> Date: Wed Oct 22 09:34:49 2008 +0200
>
> libata: switch to using block layer tagging support
>
>
> This kernel can not read even 0 sector on disk with rootfs. It
> initialized disk, but cant read it at all.
> CMD646 on Sparc
>
> Rverting helped.

Doh, how annoying! What driver does that controller use? PATA doesn't
even use NCQ, so it's a bit of an oddity that it makes a difference at
all.

Can you provide the boot messages?

--
Jens Axboe

2008-10-24 07:14:43

by Jens Axboe

Subject: Re: 2.6.27-rc1 (2fca5c): libata: kernel cant boot

On Fri, Oct 24 2008, Jens Axboe wrote:
> On Fri, Oct 24 2008, Alexander Beregalov wrote:
> > Hi
> >
> > commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e
> > Author: Jens Axboe <[email protected]>
> > Date: Wed Oct 22 09:34:49 2008 +0200
> >
> > libata: switch to using block layer tagging support
> >
> >
> > This kernel can not read even 0 sector on disk with rootfs. It
> > initialized disk, but cant read it at all.
> > CMD646 on Sparc
> >
> > Rverting helped.
>
> Doh, how annoying! What driver does that controller use? PATA doesn't
> even use NCQ, so it's a bit of an oddity that it makes a difference at
> all.
>
> Can you provide the boot messages?

Darn, this smells like a train wreck. I'm assuming this fixes it?

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index d5b9b72..461ef5e 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -708,7 +708,11 @@ static struct ata_queued_cmd *ata_scsi_qc_new(struct ata_device *dev,
 {
 	struct ata_queued_cmd *qc;
 
-	qc = ata_qc_new_init(dev, cmd->request->tag);
+	if (cmd->request->tag != -1)
+		qc = ata_qc_new_init(dev, cmd->request->tag);
+	else
+		qc = ata_qc_new_init(dev, 0);
+
 	if (qc) {
 		qc->scsicmd = cmd;
 		qc->scsidone = done;

--
Jens Axboe

2008-10-24 07:16:59

by Paul Mundt

Subject: Re: 2.6.27-rc1 (2fca5c): libata: kernel cant boot

On Fri, Oct 24, 2008 at 09:13:29AM +0200, Jens Axboe wrote:
> On Fri, Oct 24 2008, Jens Axboe wrote:
> > On Fri, Oct 24 2008, Alexander Beregalov wrote:
> > > Hi
> > >
> > > commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e
> > > Author: Jens Axboe <[email protected]>
> > > Date: Wed Oct 22 09:34:49 2008 +0200
> > >
> > > libata: switch to using block layer tagging support
> > >
> > >
> > > This kernel can not read even 0 sector on disk with rootfs. It
> > > initialized disk, but cant read it at all.
> > > CMD646 on Sparc
> > >
> > > Rverting helped.
> >
> > Doh, how annoying! What driver does that controller use? PATA doesn't
> > even use NCQ, so it's a bit of an oddity that it makes a difference at
> > all.
> >
> > Can you provide the boot messages?
>
> Darn, this smells like a train wreck. I'm assuming this fixes it?
>
Yes, that fixes it.

2008-10-24 07:25:40

by Jens Axboe

Subject: Re: 2.6.27-rc1 (2fca5c): libata: kernel cant boot

On Fri, Oct 24 2008, Paul Mundt wrote:
> On Fri, Oct 24, 2008 at 09:13:29AM +0200, Jens Axboe wrote:
> > On Fri, Oct 24 2008, Jens Axboe wrote:
> > > On Fri, Oct 24 2008, Alexander Beregalov wrote:
> > > > Hi
> > > >
> > > > commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e
> > > > Author: Jens Axboe <[email protected]>
> > > > Date: Wed Oct 22 09:34:49 2008 +0200
> > > >
> > > > libata: switch to using block layer tagging support
> > > >
> > > >
> > > > This kernel can not read even 0 sector on disk with rootfs. It
> > > > initialized disk, but cant read it at all.
> > > > CMD646 on Sparc
> > > >
> > > > Rverting helped.
> > >
> > > Doh, how annoying! What driver does that controller use? PATA doesn't
> > > even use NCQ, so it's a bit of an oddity that it makes a difference at
> > > all.
> > >
> > > Can you provide the boot messages?
> >
> > Darn, this smells like a train wreck. I'm assuming this fixes it?
> >
> Yes, that fixes it.

OK, that's pretty bad. 2.6.28-rc1 will not work on any box using libata
with non-ncq disks. If you need me, I'll be at the bar.

----------

From e598055dde1951c47c8b3522616f6ebff0ed9847 Mon Sep 17 00:00:00 2001
From: Jens Axboe <[email protected]>
Date: Fri, 24 Oct 2008 09:22:42 +0200
Subject: [PATCH] libata: fix bug with non-ncq devices

The recent commit 201f1b98822078c808b5e2d379a6ddbfc0a06ee1 to enable
support for block layer tagging in libata was broken for non-NCQ
devices. The block layer initializes the tag field to -1 to detect
invalid uses of a tag, and if the libata device does NOT support
NCQ, we just used that field to index the internal command list.
So we need to check for -1 first and only use the tag field if
it's valid.

Signed-off-by: Jens Axboe <[email protected]>
---
drivers/ata/libata-scsi.c | 6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index d5b9b72..4b95c43 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -708,7 +708,11 @@ static struct ata_queued_cmd *ata_scsi_qc_new(struct ata_device *dev,
 {
 	struct ata_queued_cmd *qc;
 
-	qc = ata_qc_new_init(dev, cmd->request->tag);
+	if (cmd->request->tag != -1)
+		qc = ata_qc_new_init(dev, cmd->request->tag);
+	else
+		qc = ata_qc_new_init(dev, 0);
+
 	if (qc) {
 		qc->scsicmd = cmd;
 		qc->scsidone = done;
--
1.6.0.2.588.g3102


--
Jens Axboe

2008-10-24 07:33:02

by Paul Mundt

Subject: Re: 2.6.27-rc1 (2fca5c): libata: kernel cant boot

On Fri, Oct 24, 2008 at 09:24:29AM +0200, Jens Axboe wrote:
> On Fri, Oct 24 2008, Paul Mundt wrote:
> > On Fri, Oct 24, 2008 at 09:13:29AM +0200, Jens Axboe wrote:
> > > Darn, this smells like a train wreck. I'm assuming this fixes it?
> > >
> > Yes, that fixes it.
>
> OK, that's pretty bad. 2.6.28-rc1 will not work on any box using libata
> with non-ncq disks. If you need me, I'll be at the bar.
>
At least you only missed -rc1 by a few hours, it was a good effort.. :-)

> From e598055dde1951c47c8b3522616f6ebff0ed9847 Mon Sep 17 00:00:00 2001
> From: Jens Axboe <[email protected]>
> Date: Fri, 24 Oct 2008 09:22:42 +0200
> Subject: [PATCH] libata: fix bug with non-ncq devices
>
> The recent commit 201f1b98822078c808b5e2d379a6ddbfc0a06ee1 to enable
> support for block layer tagging in libata was broken for non-NCQ
> devices. The block layer initializes the tag field to -1 to detect
> invalid uses of a tag, and if the libata devices does NOT support
> NCQ, we just used that field to index the internal command list.
> So we need to check for -1 first and only use the tag field if
> it's valid.
>
> Signed-off-by: Jens Axboe <[email protected]>

Tested-by: Paul Mundt <[email protected]>

Thanks for the quick fix! Now I can go back to hacking on the stuff I
was avoiding..

2008-10-24 07:34:37

by Rafael J. Wysocki

Subject: [Regression] 2.6.28-rc1 (2fca5c): libata: kernel cant boot (was: Re: 2.6.27-rc1 (2fca5c): libata: kernel cant boot)

On Friday, 24 of October 2008, Jens Axboe wrote:
> On Fri, Oct 24 2008, Alexander Beregalov wrote:
> > Hi
> >
> > commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e
> > Author: Jens Axboe <[email protected]>
> > Date: Wed Oct 22 09:34:49 2008 +0200
> >
> > libata: switch to using block layer tagging support
> >
> >
> > This kernel can not read even 0 sector on disk with rootfs. It
> > initialized disk, but cant read it at all.
> > CMD646 on Sparc
> >
> > Rverting helped.

Confirmed on hp nx6325 w/ pata_atiixp (apparently, SCSI commands time out).

Reverting the commit also helps here.

[I guess it's going to break things left and right.]

> Doh, how annoying! What driver does that controller use? PATA doesn't
> even use NCQ, so it's a bit of an oddity that it makes a difference at
> all.
>
> Can you provide the boot messages?

If I'm able to reproduce the breakage on the Asus L5D, I will.

Thanks,
Rafael

2008-10-24 07:36:16

by Jens Axboe

Subject: Re: [Regression] 2.6.28-rc1 (2fca5c): libata: kernel cant boot (was: Re: 2.6.27-rc1 (2fca5c): libata: kernel cant boot)

On Fri, Oct 24 2008, Rafael J. Wysocki wrote:
> On Friday, 24 of October 2008, Jens Axboe wrote:
> > On Fri, Oct 24 2008, Alexander Beregalov wrote:
> > > Hi
> > >
> > > commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e
> > > Author: Jens Axboe <[email protected]>
> > > Date: Wed Oct 22 09:34:49 2008 +0200
> > >
> > > libata: switch to using block layer tagging support
> > >
> > >
> > > This kernel can not read even 0 sector on disk with rootfs. It
> > > initialized disk, but cant read it at all.
> > > CMD646 on Sparc
> > >
> > > Rverting helped.
>
> Confirmed on hp nx6325 w/ pata_atiixp (apparently, SCSI commands time out).

Basically it'll dereference ->qcmd[] with a -1 index, so it'll break in
various interesting ways, I'm sure.

> Reverting the commit also helps here.
>
> [I guess it's going to break things left and right.]

Indeed.

> > Doh, how annoying! What driver does that controller use? PATA doesn't
> > even use NCQ, so it's a bit of an oddity that it makes a difference at
> > all.
> >
> > Can you provide the boot messages?
>
> If I'm able to reproduce the breakage of Asus L5D, I will.

I don't need that anymore, I realized the mistake right after sending
that.

--
Jens Axboe

2008-10-24 07:45:03

by Dave Young

Subject: Re: 2.6.27-rc1 (2fca5c): libata: kernel cant boot

On Fri, Oct 24, 2008 at 3:24 PM, Jens Axboe <[email protected]> wrote:
> On Fri, Oct 24 2008, Paul Mundt wrote:
>> On Fri, Oct 24, 2008 at 09:13:29AM +0200, Jens Axboe wrote:
>> > On Fri, Oct 24 2008, Jens Axboe wrote:
>> > > On Fri, Oct 24 2008, Alexander Beregalov wrote:
>> > > > Hi
>> > > >
>> > > > commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e
>> > > > Author: Jens Axboe <[email protected]>
>> > > > Date: Wed Oct 22 09:34:49 2008 +0200
>> > > >
>> > > > libata: switch to using block layer tagging support
>> > > >
>> > > >
>> > > > This kernel can not read even 0 sector on disk with rootfs. It
>> > > > initialized disk, but cant read it at all.
>> > > > CMD646 on Sparc
>> > > >
>> > > > Rverting helped.
>> > >
>> > > Doh, how annoying! What driver does that controller use? PATA doesn't
>> > > even use NCQ, so it's a bit of an oddity that it makes a difference at
>> > > all.
>> > >
>> > > Can you provide the boot messages?
>> >
>> > Darn, this smells like a train wreck. I'm assuming this fixes it?
>> >
>> Yes, that fixes it.

To confirm: 2.6.28-rc1 ata-piix, fixed for me as well.

>
> OK, that's pretty bad. 2.6.28-rc1 will not work on any box using libata
> with non-ncq disks. If you need me, I'll be at the bar.
>
> ----------
>
> From e598055dde1951c47c8b3522616f6ebff0ed9847 Mon Sep 17 00:00:00 2001
> From: Jens Axboe <[email protected]>
> Date: Fri, 24 Oct 2008 09:22:42 +0200
> Subject: [PATCH] libata: fix bug with non-ncq devices
>
> The recent commit 201f1b98822078c808b5e2d379a6ddbfc0a06ee1 to enable
> support for block layer tagging in libata was broken for non-NCQ
> devices. The block layer initializes the tag field to -1 to detect
> invalid uses of a tag, and if the libata devices does NOT support
> NCQ, we just used that field to index the internal command list.
> So we need to check for -1 first and only use the tag field if
> it's valid.
>
> Signed-off-by: Jens Axboe <[email protected]>
> ---
> drivers/ata/libata-scsi.c | 6 +++++-
> 1 files changed, 5 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index d5b9b72..4b95c43 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -708,7 +708,11 @@ static struct ata_queued_cmd *ata_scsi_qc_new(struct ata_device *dev,
> {
> struct ata_queued_cmd *qc;
>
> - qc = ata_qc_new_init(dev, cmd->request->tag);
> + if (cmd->request->tag != -1)
> + qc = ata_qc_new_init(dev, cmd->request->tag);
> + else
> + qc = ata_qc_new_init(dev, 0);
> +
> if (qc) {
> qc->scsicmd = cmd;
> qc->scsidone = done;
> --
> 1.6.0.2.588.g3102
>
>
> --
> Jens Axboe
>



--
Regards
dave

2008-10-24 08:14:52

by Rafael J. Wysocki

Subject: Re: [Regression] 2.6.28-rc1 (2fca5c): libata: kernel cant boot (was: Re: 2.6.27-rc1 (2fca5c): libata: kernel cant boot)

On Friday, 24 of October 2008, Jens Axboe wrote:
> On Fri, Oct 24 2008, Rafael J. Wysocki wrote:
> > On Friday, 24 of October 2008, Jens Axboe wrote:
> > > On Fri, Oct 24 2008, Alexander Beregalov wrote:
> > > > Hi
> > > >
> > > > commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e
> > > > Author: Jens Axboe <[email protected]>
> > > > Date: Wed Oct 22 09:34:49 2008 +0200
> > > >
> > > > libata: switch to using block layer tagging support
> > > >
> > > >
> > > > This kernel can not read even 0 sector on disk with rootfs. It
> > > > initialized disk, but cant read it at all.
> > > > CMD646 on Sparc
> > > >
> > > > Rverting helped.
> >
> > Confirmed on hp nx6325 w/ pata_atiixp (apparently, SCSI commands time out).
>
> Basically it'll dererence ->qcmd[] with -1 index so it'll break in
> various interesting ways, I'm sure.
>
> > Reverting the commit also helps here.
> >
> > [I guess it's going to break things left and right.]
>
> Indeed.
>
> > > Doh, how annoying! What driver does that controller use? PATA doesn't
> > > even use NCQ, so it's a bit of an oddity that it makes a difference at
> > > all.
> > >
> > > Can you provide the boot messages?
> >
> > If I'm able to reproduce the breakage of Asus L5D, I will.
>
> I don't need that anymore, I realized the mistake right after sending
> that.

The fix you posted helps here too. :-)

Thanks,
Rafael

2008-10-24 08:17:44

by Jens Axboe

Subject: Re: [Regression] 2.6.28-rc1 (2fca5c): libata: kernel cant boot (was: Re: 2.6.27-rc1 (2fca5c): libata: kernel cant boot)

On Fri, Oct 24 2008, Rafael J. Wysocki wrote:
> On Friday, 24 of October 2008, Jens Axboe wrote:
> > On Fri, Oct 24 2008, Rafael J. Wysocki wrote:
> > > On Friday, 24 of October 2008, Jens Axboe wrote:
> > > > On Fri, Oct 24 2008, Alexander Beregalov wrote:
> > > > > Hi
> > > > >
> > > > > commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e
> > > > > Author: Jens Axboe <[email protected]>
> > > > > Date: Wed Oct 22 09:34:49 2008 +0200
> > > > >
> > > > > libata: switch to using block layer tagging support
> > > > >
> > > > >
> > > > > This kernel can not read even 0 sector on disk with rootfs. It
> > > > > initialized disk, but cant read it at all.
> > > > > CMD646 on Sparc
> > > > >
> > > > > Rverting helped.
> > >
> > > Confirmed on hp nx6325 w/ pata_atiixp (apparently, SCSI commands time out).
> >
> > Basically it'll dererence ->qcmd[] with -1 index so it'll break in
> > various interesting ways, I'm sure.
> >
> > > Reverting the commit also helps here.
> > >
> > > [I guess it's going to break things left and right.]
> >
> > Indeed.
> >
> > > > Doh, how annoying! What driver does that controller use? PATA doesn't
> > > > even use NCQ, so it's a bit of an oddity that it makes a difference at
> > > > all.
> > > >
> > > > Can you provide the boot messages?
> > >
> > > If I'm able to reproduce the breakage of Asus L5D, I will.
> >
> > I don't need that anymore, I realized the mistake right after sending
> > that.
>
> The fix you posted helps here too. :-)

Good, please boot your regression time machine and get it committed
before -rc1 was released :-)

--
Jens Axboe

2008-10-24 08:44:44

by Elias Oltmanns

Subject: Re: 2.6.27-rc1 (2fca5c): libata: kernel cant boot

Jens Axboe <[email protected]> wrote:
> From e598055dde1951c47c8b3522616f6ebff0ed9847 Mon Sep 17 00:00:00 2001
> From: Jens Axboe <[email protected]>
> Date: Fri, 24 Oct 2008 09:22:42 +0200
> Subject: [PATCH] libata: fix bug with non-ncq devices
>
> The recent commit 201f1b98822078c808b5e2d379a6ddbfc0a06ee1 to enable

Wouldn't that be commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e?

Regards,

Elias

2008-10-24 08:49:29

by Jens Axboe

Subject: Re: 2.6.27-rc1 (2fca5c): libata: kernel cant boot

On Fri, Oct 24 2008, Elias Oltmanns wrote:
> Jens Axboe <[email protected]> wrote:
> > From e598055dde1951c47c8b3522616f6ebff0ed9847 Mon Sep 17 00:00:00 2001
> > From: Jens Axboe <[email protected]>
> > Date: Fri, 24 Oct 2008 09:22:42 +0200
> > Subject: [PATCH] libata: fix bug with non-ncq devices
> >
> > The recent commit 201f1b98822078c808b5e2d379a6ddbfc0a06ee1 to enable
>
> Wouldn't that be commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e?

Yes that is correct, the other commit is actually a private one in my
tree for other libata changes. Updated patch below, thanks for checking!

From e598055dde1951c47c8b3522616f6ebff0ed9847 Mon Sep 17 00:00:00 2001
From: Jens Axboe <[email protected]>
Date: Fri, 24 Oct 2008 09:22:42 +0200
Subject: [PATCH] libata: fix bug with non-ncq devices

The recent commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e to enable
support for block layer tagging in libata was broken for non-NCQ
devices. The block layer initializes the tag field to -1 to detect
invalid uses of a tag, and if the libata device does NOT support
NCQ, we just used that field to index the internal command list.
So we need to check for -1 first and only use the tag field if
it's valid.

Signed-off-by: Jens Axboe <[email protected]>
---
drivers/ata/libata-scsi.c | 6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index d5b9b72..4b95c43 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -708,7 +708,11 @@ static struct ata_queued_cmd *ata_scsi_qc_new(struct ata_device *dev,
 {
 	struct ata_queued_cmd *qc;
 
-	qc = ata_qc_new_init(dev, cmd->request->tag);
+	if (cmd->request->tag != -1)
+		qc = ata_qc_new_init(dev, cmd->request->tag);
+	else
+		qc = ata_qc_new_init(dev, 0);
+
 	if (qc) {
 		qc->scsicmd = cmd;
 		qc->scsidone = done;
--
1.6.0.2.588.g3102


--
Jens Axboe

2008-10-24 08:51:52

by Rafael J. Wysocki

Subject: Re: [Regression] 2.6.28-rc1 (2fca5c): libata: kernel cant boot (was: Re: 2.6.27-rc1 (2fca5c): libata: kernel cant boot)

On Friday, 24 of October 2008, Jens Axboe wrote:
> On Fri, Oct 24 2008, Rafael J. Wysocki wrote:
> > On Friday, 24 of October 2008, Jens Axboe wrote:
> > > On Fri, Oct 24 2008, Rafael J. Wysocki wrote:
> > > > On Friday, 24 of October 2008, Jens Axboe wrote:
> > > > > On Fri, Oct 24 2008, Alexander Beregalov wrote:
> > > > > > Hi
> > > > > >
> > > > > > commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e
> > > > > > Author: Jens Axboe <[email protected]>
> > > > > > Date: Wed Oct 22 09:34:49 2008 +0200
> > > > > >
> > > > > > libata: switch to using block layer tagging support
> > > > > >
> > > > > >
> > > > > > This kernel can not read even 0 sector on disk with rootfs. It
> > > > > > initialized disk, but cant read it at all.
> > > > > > CMD646 on Sparc
> > > > > >
> > > > > > Rverting helped.
> > > >
> > > > Confirmed on hp nx6325 w/ pata_atiixp (apparently, SCSI commands time out).
> > >
> > > Basically it'll dererence ->qcmd[] with -1 index so it'll break in
> > > various interesting ways, I'm sure.
> > >
> > > > Reverting the commit also helps here.
> > > >
> > > > [I guess it's going to break things left and right.]
> > >
> > > Indeed.
> > >
> > > > > Doh, how annoying! What driver does that controller use? PATA doesn't
> > > > > even use NCQ, so it's a bit of an oddity that it makes a difference at
> > > > > all.
> > > > >
> > > > > Can you provide the boot messages?
> > > >
> > > > If I'm able to reproduce the breakage of Asus L5D, I will.
> > >
> > > I don't need that anymore, I realized the mistake right after sending
> > > that.
> >
> > The fix you posted helps here too. :-)
>
> Good, please boot your regression time machine and get it committed
> before -rc1 was released :-)

Well, in fact it won't be listed if Linus merges the fix quickly enough. ;-)

Thanks,
Rafael

2008-10-25 11:37:49

by Petr Vandrovec

Subject: Re: 2.6.27-rc1 (2fca5c): libata: kernel cant boot

Jens Axboe wrote:
> On Fri, Oct 24 2008, Elias Oltmanns wrote:
>> Jens Axboe <[email protected]> wrote:
>>> From e598055dde1951c47c8b3522616f6ebff0ed9847 Mon Sep 17 00:00:00 2001
>>> From: Jens Axboe <[email protected]>
>>> Date: Fri, 24 Oct 2008 09:22:42 +0200
>>> Subject: [PATCH] libata: fix bug with non-ncq devices
>>>
>>> The recent commit 201f1b98822078c808b5e2d379a6ddbfc0a06ee1 to enable
>> Wouldn't that be commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e?
>
> Yes that is correct, the other commit is actually a private one in my
> tree for other libata changes. Updated patch below, thanks for checking!
>
> From e598055dde1951c47c8b3522616f6ebff0ed9847 Mon Sep 17 00:00:00 2001
> From: Jens Axboe <[email protected]>
> Date: Fri, 24 Oct 2008 09:22:42 +0200
> Subject: [PATCH] libata: fix bug with non-ncq devices

Hello,
this fixes my DVD drive, but unfortunately NCQ devices connected to a PMP are
still dead, apparently as soon as mount() tries to do serious I/O on
the drive. Backing out both the post-2.6.28-rc1 fix and your
original change brings storage back. I suspect the problem is that
with a PMP the same tag cannot be (should not be? must not be?) used on
multiple devices behind the PMP, and before your change tags were allocated
per-port, while now they are allocated per-device.
Petr

P.S.: And the disk connected to port #0 required a poweroff/poweron cycle to
get back to life.



Oct 25 02:21:54 gwy kernel: sata_sil24 0000:05:00.0: version 1.1
Oct 25 02:21:54 gwy kernel: ACPI: PCI Interrupt Link [APC5] enabled at
IRQ 16
Oct 25 02:21:54 gwy kernel: sata_sil24 0000:05:00.0: PCI INT A ->
Link[APC5] -> GSI 16 (level, low) -> IRQ 16
Oct 25 02:21:54 gwy kernel: sata_sil24 0000:05:00.0: setting latency
timer to 64
Oct 25 02:21:54 gwy kernel: scsi6 : sata_sil24
Oct 25 02:21:54 gwy kernel: scsi7 : sata_sil24
Oct 25 02:21:54 gwy kernel: ata7: SATA max UDMA/100 host m128@0xfdaff000
port 0xfdaf8000 irq 16
Oct 25 02:21:54 gwy kernel: ata8: SATA max UDMA/100 host m128@0xfdaff000
port 0xfdafa000 irq 16
Oct 25 02:21:54 gwy kernel: ata8: SATA link up 3.0 Gbps (SStatus 123
SControl 0)
Oct 25 02:21:54 gwy kernel: ata8.15: Port Multiplier 1.1, 0x1095:0x3726
r23, 6 ports, feat 0x1/0x9
Oct 25 02:21:54 gwy kernel: ata8.00: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.00: SATA link up 3.0 Gbps (SStatus 123
SControl 320)
Oct 25 02:21:54 gwy kernel: ata8.01: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.01: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:21:54 gwy kernel: ata8.02: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.02: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:21:54 gwy kernel: ata8.03: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.03: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:21:54 gwy kernel: ata8.04: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.04: SATA link down (SStatus 0 SControl 320)
Oct 25 02:21:54 gwy kernel: ata8.05: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.05: SATA link up 1.5 Gbps (SStatus 113
SControl 320)
Oct 25 02:21:54 gwy kernel: ata8.00: failed to IDENTIFY (I/O error,
err_mask=0x11)
Oct 25 02:21:54 gwy kernel: ata8.15: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8: controller in dubious state,
performing PORT_RST
Oct 25 02:21:54 gwy kernel: ata8.15: SATA link up 3.0 Gbps (SStatus 123
SControl 0)
Oct 25 02:21:54 gwy kernel: ata8.00: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.00: SATA link up 3.0 Gbps (SStatus 123
SControl 320)
Oct 25 02:21:54 gwy kernel: ata8.01: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.01: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:21:54 gwy kernel: ata8.02: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.02: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:21:54 gwy kernel: ata8.03: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.03: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:21:54 gwy kernel: ata8.05: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.05: SATA link up 1.5 Gbps (SStatus 113
SControl 320)
Oct 25 02:21:54 gwy kernel: ata8.00: failed to IDENTIFY (I/O error,
err_mask=0x11)
Oct 25 02:21:54 gwy kernel: ata8.15: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8: controller in dubious state,
performing PORT_RST
Oct 25 02:21:54 gwy kernel: ata8.15: SATA link up 3.0 Gbps (SStatus 123
SControl 0)
Oct 25 02:21:54 gwy kernel: ata8.00: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.00: SATA link up 3.0 Gbps (SStatus 123
SControl 320)
Oct 25 02:21:54 gwy kernel: ata8.01: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.01: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:21:54 gwy kernel: ata8.02: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.02: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:21:54 gwy kernel: ata8.03: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.03: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:21:54 gwy kernel: ata8.04: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.04: SATA link down (SStatus 0 SControl 320)
Oct 25 02:21:54 gwy kernel: ata8.05: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.05: SATA link up 1.5 Gbps (SStatus 113
SControl 320)
Oct 25 02:21:54 gwy kernel: ata8.00: failed to IDENTIFY (I/O error,
err_mask=0x11)
Oct 25 02:21:54 gwy kernel: ata8.00: failed to recover link after 3
tries, disabling
Oct 25 02:21:54 gwy kernel: ata8.15: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8: controller in dubious state,
performing PORT_RST
Oct 25 02:21:54 gwy kernel: ata8.15: SATA link up 3.0 Gbps (SStatus 123
SControl 0)
Oct 25 02:21:54 gwy kernel: ata8.01: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.01: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:21:54 gwy kernel: ata8.02: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.02: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:21:54 gwy kernel: ata8.03: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.03: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:21:54 gwy kernel: ata8.05: hard resetting link
Oct 25 02:21:54 gwy kernel: ata8.05: SATA link up 1.5 Gbps (SStatus 113
SControl 320)
Oct 25 02:21:54 gwy kernel: ata8.01: ATA-7: Hitachi HDS721010KLA330,
GKAOA70M, max UDMA/133
Oct 25 02:21:54 gwy kernel: ata8.01: 1953525168 sectors, multi 16: LBA48
NCQ (depth 31/32)
Oct 25 02:21:54 gwy kernel: ata8.01: configured for UDMA/100
Oct 25 02:21:54 gwy kernel: ata8.02: ATA-7: Hitachi HDS721010KLA330,
GKAOA70F, max UDMA/133
Oct 25 02:21:54 gwy kernel: ata8.02: 1953525168 sectors, multi 16: LBA48
NCQ (depth 31/32)
Oct 25 02:21:54 gwy kernel: ata8.02: configured for UDMA/100
Oct 25 02:21:54 gwy kernel: ata8.03: ATA-7: Hitachi HDS721010KLA330,
GKAOA70M, max UDMA/133
Oct 25 02:21:54 gwy kernel: ata8.03: 1953525168 sectors, multi 16: LBA48
NCQ (depth 31/32)
Oct 25 02:21:54 gwy kernel: ata8.03: configured for UDMA/100
Oct 25 02:21:54 gwy kernel: ata8: EH complete
Oct 25 02:21:54 gwy kernel: scsi 7:1:0:0: Direct-Access ATA
Hitachi HDS72101 GKAO PQ: 0 ANSI: 5
Oct 25 02:21:54 gwy kernel: sd 7:1:0:0: [sdh] 1953525168 512-byte
hardware sectors: (1000GB/931GiB)
Oct 25 02:21:54 gwy kernel: sd 7:1:0:0: [sdh] Write Protect is off
Oct 25 02:21:54 gwy kernel: sd 7:1:0:0: [sdh] Mode Sense: 00 3a 00 00
Oct 25 02:21:54 gwy kernel: sd 7:1:0:0: [sdh] Write cache: enabled, read
cache: enabled, doesn't support DPO or FUA
Oct 25 02:21:54 gwy kernel: sd 7:1:0:0: [sdh] 1953525168 512-byte
hardware sectors: (1000GB/931GiB)
Oct 25 02:21:54 gwy kernel: sd 7:1:0:0: [sdh] Write Protect is off
Oct 25 02:21:54 gwy kernel: sd 7:1:0:0: [sdh] Mode Sense: 00 3a 00 00
Oct 25 02:21:54 gwy kernel: sd 7:1:0:0: [sdh] Write cache: enabled, read
cache: enabled, doesn't support DPO or FUA
Oct 25 02:21:54 gwy kernel: sdh: sdh1
Oct 25 02:21:54 gwy kernel: sd 7:1:0:0: [sdh] Attached SCSI disk
Oct 25 02:21:54 gwy kernel: sd 7:1:0:0: Attached scsi generic sg9 type 0
Oct 25 02:21:54 gwy kernel: scsi 7:2:0:0: Direct-Access ATA
Hitachi HDS72101 GKAO PQ: 0 ANSI: 5
Oct 25 02:21:54 gwy kernel: sd 7:2:0:0: [sdi] 1953525168 512-byte
hardware sectors: (1000GB/931GiB)
Oct 25 02:21:54 gwy kernel: sd 7:2:0:0: [sdi] Write Protect is off
Oct 25 02:21:54 gwy kernel: sd 7:2:0:0: [sdi] Mode Sense: 00 3a 00 00
Oct 25 02:21:54 gwy kernel: sd 7:2:0:0: [sdi] Write cache: enabled, read
cache: enabled, doesn't support DPO or FUA
Oct 25 02:21:54 gwy kernel: sd 7:2:0:0: [sdi] 1953525168 512-byte
hardware sectors: (1000GB/931GiB)
Oct 25 02:21:54 gwy kernel: sd 7:2:0:0: [sdi] Write Protect is off
Oct 25 02:21:54 gwy kernel: sd 7:2:0:0: [sdi] Mode Sense: 00 3a 00 00
Oct 25 02:21:54 gwy kernel: sd 7:2:0:0: [sdi] Write cache: enabled, read
cache: enabled, doesn't support DPO or FUA
Oct 25 02:21:54 gwy kernel: sdi: sdi1
Oct 25 02:21:54 gwy kernel: sd 7:2:0:0: [sdi] Attached SCSI disk
Oct 25 02:21:54 gwy kernel: sd 7:2:0:0: Attached scsi generic sg10 type 0
Oct 25 02:21:54 gwy kernel: scsi 7:3:0:0: Direct-Access ATA
Hitachi HDS72101 GKAO PQ: 0 ANSI: 5
Oct 25 02:21:54 gwy kernel: sd 7:3:0:0: [sdj] 1953525168 512-byte
hardware sectors: (1000GB/931GiB)
Oct 25 02:21:54 gwy kernel: sd 7:3:0:0: [sdj] Write Protect is off
Oct 25 02:21:54 gwy kernel: sd 7:3:0:0: [sdj] Mode Sense: 00 3a 00 00
Oct 25 02:21:54 gwy kernel: sd 7:3:0:0: [sdj] Write cache: enabled, read
cache: enabled, doesn't support DPO or FUA
Oct 25 02:21:54 gwy kernel: sd 7:3:0:0: [sdj] 1953525168 512-byte
hardware sectors: (1000GB/931GiB)
Oct 25 02:21:54 gwy kernel: sd 7:3:0:0: [sdj] Write Protect is off
Oct 25 02:21:54 gwy kernel: sd 7:3:0:0: [sdj] Mode Sense: 00 3a 00 00
Oct 25 02:21:54 gwy kernel: sd 7:3:0:0: [sdj] Write cache: enabled, read
cache: enabled, doesn't support DPO or FUA
Oct 25 02:21:54 gwy kernel: sdj: sdj1
Oct 25 02:21:54 gwy kernel: sd 7:3:0:0: [sdj] Attached SCSI disk
Oct 25 02:21:54 gwy kernel: sd 7:3:0:0: Attached scsi generic sg11 type 0
Oct 25 02:23:09 gwy kernel: ata8.03: exception Emask 0x10 SAct 0x0 SErr
0x4010000 action 0xf
Oct 25 02:23:09 gwy kernel: ata8.03: SError: { PHYRdyChg DevExch }
Oct 25 02:23:09 gwy kernel: ata8.00: failed to read SCR 2 (Emask=0x40)
Oct 25 02:23:09 gwy kernel: ata8.00: COMRESET failed (errno=-5)
Oct 25 02:23:09 gwy kernel: ata8.00: failed to write SCR 1 (Emask=0x40)
Oct 25 02:23:09 gwy kernel: ata8.00: failed to clear SError.N (errno=-5)
Oct 25 02:23:09 gwy kernel: ata8.15: hard resetting link
Oct 25 02:23:11 gwy kernel: ata8.15: SATA link down (SStatus 0 SControl 0)
Oct 25 02:23:14 gwy kernel: ata8.15: failed to read PMP GSCR[0] (Emask=0x1)
Oct 25 02:23:14 gwy kernel: ata8.15: PMP revalidation failed (errno=-5)
Oct 25 02:23:16 gwy kernel: ata8.15: hard resetting link
Oct 25 02:23:18 gwy kernel: ata8.15: SATA link up 3.0 Gbps (SStatus 123
SControl 0)
Oct 25 02:23:18 gwy kernel: ata8.01: hard resetting link
Oct 25 02:23:19 gwy kernel: ata8.01: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:23:19 gwy kernel: ata8.02: hard resetting link
Oct 25 02:23:19 gwy kernel: ata8.02: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:23:19 gwy kernel: ata8.03: hard resetting link
Oct 25 02:23:20 gwy kernel: ata8.03: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:23:20 gwy kernel: ata8.04: hard resetting link
Oct 25 02:23:20 gwy kernel: ata8.04: SATA link down (SStatus 0 SControl 320)
Oct 25 02:23:20 gwy kernel: ata8.05: hard resetting link
Oct 25 02:23:21 gwy kernel: ata8.05: SATA link up 1.5 Gbps (SStatus 113
SControl 320)
Oct 25 02:23:21 gwy kernel: ata8.01: failed to IDENTIFY (I/O error,
err_mask=0x11)
Oct 25 02:23:21 gwy kernel: ata8.01: revalidation failed (errno=-5)
Oct 25 02:23:23 gwy kernel: ata8.15: hard resetting link
Oct 25 02:23:23 gwy kernel: ata8: controller in dubious state,
performing PORT_RST
Oct 25 02:23:25 gwy kernel: ata8.15: SATA link up 3.0 Gbps (SStatus 123
SControl 0)
Oct 25 02:23:26 gwy kernel: ata8.01: hard resetting link
Oct 25 02:23:26 gwy kernel: ata8.01: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:23:26 gwy kernel: ata8.02: hard resetting link
Oct 25 02:23:26 gwy kernel: ata8.02: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:23:26 gwy kernel: ata8.03: hard resetting link
Oct 25 02:23:27 gwy kernel: ata8.03: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:23:27 gwy kernel: ata8.05: hard resetting link
Oct 25 02:23:27 gwy kernel: ata8.05: SATA link up 1.5 Gbps (SStatus 113
SControl 320)
Oct 25 02:23:27 gwy kernel: ata8.01: failed to IDENTIFY (I/O error,
err_mask=0x11)
Oct 25 02:23:27 gwy kernel: ata8.01: revalidation failed (errno=-5)
Oct 25 02:23:30 gwy kernel: ata8.15: hard resetting link
Oct 25 02:23:30 gwy kernel: ata8: controller in dubious state,
performing PORT_RST
Oct 25 02:23:33 gwy kernel: ata8.15: SATA link up 3.0 Gbps (SStatus 123
SControl 0)
Oct 25 02:23:33 gwy kernel: ata8.01: hard resetting link
Oct 25 02:23:33 gwy kernel: ata8.01: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:23:33 gwy kernel: ata8.02: hard resetting link
Oct 25 02:23:34 gwy kernel: ata8.02: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:23:34 gwy kernel: ata8.03: hard resetting link
Oct 25 02:23:34 gwy kernel: ata8.03: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:23:34 gwy kernel: ata8.04: hard resetting link
Oct 25 02:23:34 gwy kernel: ata8.04: SATA link down (SStatus 0 SControl 320)
Oct 25 02:23:34 gwy kernel: ata8.05: hard resetting link
Oct 25 02:23:35 gwy kernel: ata8.05: SATA link up 1.5 Gbps (SStatus 113
SControl 320)
Oct 25 02:23:35 gwy kernel: ata8.01: configured for UDMA/100
Oct 25 02:23:35 gwy kernel: ata8.02: failed to IDENTIFY (I/O error,
err_mask=0x11)
Oct 25 02:23:35 gwy kernel: ata8.02: revalidation failed (errno=-5)
Oct 25 02:23:38 gwy kernel: ata8.15: hard resetting link
Oct 25 02:23:38 gwy kernel: ata8: controller in dubious state,
performing PORT_RST
Oct 25 02:23:40 gwy kernel: ata8.15: SATA link up 3.0 Gbps (SStatus 123
SControl 0)
Oct 25 02:23:40 gwy kernel: ata8.01: hard resetting link
Oct 25 02:23:40 gwy kernel: ata8.01: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:23:40 gwy kernel: ata8.02: hard resetting link
Oct 25 02:23:41 gwy kernel: ata8.02: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:23:41 gwy kernel: ata8.03: hard resetting link
Oct 25 02:23:41 gwy kernel: ata8.03: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Oct 25 02:23:41 gwy kernel: ata8.05: hard resetting link
Oct 25 02:23:42 gwy kernel: ata8.05: SATA link up 1.5 Gbps (SStatus 113
SControl 320)
Oct 25 02:23:42 gwy kernel: ata8.01: configured for UDMA/100
Oct 25 02:23:42 gwy kernel: ata8.02: configured for UDMA/100
Oct 25 02:23:42 gwy kernel: ata8.03: configured for UDMA/100
Oct 25 02:23:42 gwy kernel: ata8.15: exception Emask 0x10 SAct 0x0 SErr
0x0 action 0x9 t4
Oct 25 02:23:42 gwy kernel: ata8.15: irq_stat 0x00060002
Oct 25 02:23:42 gwy kernel: ata8: EH complete
Oct 25 02:23:42 gwy kernel: sd 7:1:0:0: [sdh] 1953525168 512-byte
hardware sectors: (1000GB/931GiB)
Oct 25 02:23:42 gwy kernel: sd 7:1:0:0: [sdh] Write Protect is off
Oct 25 02:23:42 gwy kernel: sd 7:1:0:0: [sdh] Mode Sense: 00 3a 00 00
Oct 25 02:23:42 gwy kernel: sd 7:1:0:0: [sdh] Write cache: enabled, read
cache: enabled, doesn't support DPO or FUA
Oct 25 02:23:42 gwy kernel: sd 7:2:0:0: [sdi] 1953525168 512-byte
hardware sectors: (1000GB/931GiB)
Oct 25 02:23:42 gwy kernel: sd 7:2:0:0: [sdi] Write Protect is off
Oct 25 02:23:42 gwy kernel: sd 7:2:0:0: [sdi] Mode Sense: 00 3a 00 00
Oct 25 02:23:42 gwy kernel: sd 7:2:0:0: [sdi] Write cache: enabled, read
cache: enabled, doesn't support DPO or FUA
Oct 25 02:23:42 gwy kernel: sd 7:3:0:0: [sdj] 1953525168 512-byte
hardware sectors: (1000GB/931GiB)
Oct 25 02:23:42 gwy kernel: sd 7:3:0:0: [sdj] Write Protect is off
Oct 25 02:23:42 gwy kernel: sd 7:3:0:0: [sdj] Mode Sense: 00 3a 00 00
Oct 25 02:23:42 gwy kernel: sd 7:3:0:0: [sdj] Write cache: enabled, read
cache: enabled, doesn't support DPO or FUA
Oct 25 02:23:42 gwy kernel: sd 7:1:0:0: [sdh] 1953525168 512-byte
hardware sectors: (1000GB/931GiB)
Oct 25 02:23:42 gwy kernel: sd 7:1:0:0: [sdh] Write Protect is off
Oct 25 02:23:42 gwy kernel: sd 7:1:0:0: [sdh] Mode Sense: 00 3a 00 00
Oct 25 02:23:42 gwy kernel: sd 7:1:0:0: [sdh] Write cache: enabled, read
cache: enabled, doesn't support DPO or FUA
Oct 25 02:23:42 gwy kernel: sd 7:2:0:0: [sdi] 1953525168 512-byte
hardware sectors: (1000GB/931GiB)
Oct 25 02:23:42 gwy kernel: sd 7:2:0:0: [sdi] Write Protect is off
Oct 25 02:23:42 gwy kernel: sd 7:2:0:0: [sdi] Mode Sense: 00 3a 00 00
Oct 25 02:23:42 gwy kernel: sd 7:2:0:0: [sdi] Write cache: enabled, read
cache: enabled, doesn't support DPO or FUA
Oct 25 02:23:42 gwy kernel: sd 7:3:0:0: [sdj] 1953525168 512-byte
hardware sectors: (1000GB/931GiB)
Oct 25 02:23:42 gwy kernel: sd 7:3:0:0: [sdj] Write Protect is off
Oct 25 02:23:42 gwy kernel: sd 7:3:0:0: [sdj] Mode Sense: 00 3a 00 00
Oct 25 02:23:42 gwy kernel: sd 7:3:0:0: [sdj] Write cache: enabled, read
cache: enabled, doesn't support DPO or FUA
Oct 25 02:23:51 gwy kernel: kjournald starting. Commit interval 5 seconds
Oct 25 02:23:51 gwy kernel: EXT3 FS on sdi1, internal journal
Oct 25 02:23:51 gwy kernel: EXT3-fs: mounted filesystem with ordered
data mode.
Oct 25 02:23:51 gwy kernel: kjournald starting. Commit interval 5 seconds
Oct 25 02:23:51 gwy kernel: EXT3 FS on sdh1, internal journal
Oct 25 02:23:51 gwy kernel: EXT3-fs: mounted filesystem with ordered
data mode.
Oct 25 02:23:51 gwy kernel: kjournald starting. Commit interval 5 seconds
Oct 25 02:23:51 gwy kernel: EXT3 FS on sdj1, internal journal
Oct 25 02:23:51 gwy kernel: EXT3-fs: mounted filesystem with ordered
data mode.
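
A note for anyone reading the log above: the SStatus values that libata prints are hexadecimal, and the SATA specification splits that register into DET (bits 3:0), SPD (bits 7:4) and IPM (bits 11:8). The small decoder below is only a sketch for illustration; the function names are made up here and are not libata's.

```c
#include <assert.h>
#include <stdbool.h>

/* SStatus (SCR0) fields as laid out in the Serial ATA specification. */
unsigned int sstatus_det(unsigned int sstatus) { return sstatus & 0xf; }
unsigned int sstatus_spd(unsigned int sstatus) { return (sstatus >> 4) & 0xf; }
unsigned int sstatus_ipm(unsigned int sstatus) { return (sstatus >> 8) & 0xf; }

/* DET == 3 means a device is present and PHY communication is established. */
bool link_is_up(unsigned int sstatus)
{
    return sstatus_det(sstatus) == 0x3;
}

/* Negotiated speed in units of 100 Mbps: 15 for Gen1, 30 for Gen2,
 * 0 when no speed was negotiated or the value is not handled here. */
unsigned int link_speed_x100mbps(unsigned int sstatus)
{
    switch (sstatus_spd(sstatus)) {
    case 1:
        return 15;
    case 2:
        return 30;
    default:
        return 0;
    }
}

/* Hypothetical self-check against the three SStatus values seen in the log. */
void sstatus_selfcheck(void)
{
    assert(link_is_up(0x123) && link_speed_x100mbps(0x123) == 30);
    assert(link_is_up(0x113) && link_speed_x100mbps(0x113) == 15);
    assert(!link_is_up(0x0));
}
```

Read this way, "SStatus 123" is an active Gen2 link (3.0 Gbps), "SStatus 113" an active Gen1 link (1.5 Gbps), and "SStatus 0" means no device was detected, which matches the link up/down text on the same log lines.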

2008-10-25 18:46:35

by Jens Axboe

[permalink] [raw]
Subject: Re: 2.6.27-rc1 (2fca5c): libata: kernel cant boot

On Sat, Oct 25 2008, Petr Vandrovec wrote:
> Jens Axboe wrote:
> >On Fri, Oct 24 2008, Elias Oltmanns wrote:
> >>Jens Axboe <[email protected]> wrote:
> >>>From e598055dde1951c47c8b3522616f6ebff0ed9847 Mon Sep 17 00:00:00 2001
> >>>From: Jens Axboe <[email protected]>
> >>>Date: Fri, 24 Oct 2008 09:22:42 +0200
> >>>Subject: [PATCH] libata: fix bug with non-ncq devices
> >>>
> >>>The recent commit 201f1b98822078c808b5e2d379a6ddbfc0a06ee1 to enable
> >>Wouldn't that be commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e?
> >
> >Yes that is correct, the other commit is actually a private one in my
> >tree for other libata changes. Updated patch below, thanks for checking!
> >
> >From e598055dde1951c47c8b3522616f6ebff0ed9847 Mon Sep 17 00:00:00 2001
> >From: Jens Axboe <[email protected]>
> >Date: Fri, 24 Oct 2008 09:22:42 +0200
> >Subject: [PATCH] libata: fix bug with non-ncq devices
>
> Hello,
> this fixes my DVD, but unfortunately NCQ devices connected to PMP are
> still dead - apparently as soon as mount() tries to do serious I/O on
> the drive. Backing out both post-2.6.28-rc1 fix as well as your
> original change brings storage back. I suspect that problem is that
> with PMP same tag cannot be (should not be? must not be?) used on
> multiple devices behind PMP - and before your change tags were allocated
> per-port, while now they are allocated per-device.

That would indeed break, this requires allocating the tag map in the

--
Jens Axboe

2008-10-26 17:35:46

by Jens Axboe

[permalink] [raw]
Subject: Re: 2.6.27-rc1 (2fca5c): libata: kernel cant boot

On Sat, Oct 25 2008, Jens Axboe wrote:
> On Sat, Oct 25 2008, Petr Vandrovec wrote:
> > Jens Axboe wrote:
> > >On Fri, Oct 24 2008, Elias Oltmanns wrote:
> > >>Jens Axboe <[email protected]> wrote:
> > >>>From e598055dde1951c47c8b3522616f6ebff0ed9847 Mon Sep 17 00:00:00 2001
> > >>>From: Jens Axboe <[email protected]>
> > >>>Date: Fri, 24 Oct 2008 09:22:42 +0200
> > >>>Subject: [PATCH] libata: fix bug with non-ncq devices
> > >>>
> > >>>The recent commit 201f1b98822078c808b5e2d379a6ddbfc0a06ee1 to enable
> > >>Wouldn't that be commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e?
> > >
> > >Yes that is correct, the other commit is actually a private one in my
> > >tree for other libata changes. Updated patch below, thanks for checking!
> > >
> > >From e598055dde1951c47c8b3522616f6ebff0ed9847 Mon Sep 17 00:00:00 2001
> > >From: Jens Axboe <[email protected]>
> > >Date: Fri, 24 Oct 2008 09:22:42 +0200
> > >Subject: [PATCH] libata: fix bug with non-ncq devices
> >
> > Hello,
> > this fixes my DVD, but unfortunately NCQ devices connected to PMP are
> > still dead - apparently as soon as mount() tries to do serious I/O on
> > the drive. Backing out both post-2.6.28-rc1 fix as well as your
> > original change brings storage back. I suspect that problem is that
> > with PMP same tag cannot be (should not be? must not be?) used on
> > multiple devices behind PMP - and before your change tags were allocated
> > per-port, while now they are allocated per-device.
>
> That would indeed break, this requires allocating the tag map in the

Totally untested, does this work?

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 4b95c43..0785c46 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1107,6 +1107,10 @@ static int ata_scsi_dev_config(struct scsi_device *sdev,

 		depth = min(sdev->host->can_queue, ata_id_queue_depth(dev->id));
 		depth = min(ATA_MAX_QUEUE - 1, depth);
+
+		if (dev->link->ap->pmp_link)
+			scsi_init_shared_tag_map(sdev->host, ATA_MAX_QUEUE - 1);
+
 		scsi_set_tag_type(sdev, MSG_SIMPLE_TAG);
 		scsi_activate_tcq(sdev, depth);
 	}

--
Jens Axboe

2008-10-27 01:28:45

by Petr Vandrovec

[permalink] [raw]
Subject: Re: 2.6.27-rc1 (2fca5c): libata: kernel cant boot

Jens Axboe wrote:
> On Sat, Oct 25 2008, Jens Axboe wrote:
>> On Sat, Oct 25 2008, Petr Vandrovec wrote:
>>> Jens Axboe wrote:
>>>> On Fri, Oct 24 2008, Elias Oltmanns wrote:
>>>>> Jens Axboe <[email protected]> wrote:
>>>>>> From e598055dde1951c47c8b3522616f6ebff0ed9847 Mon Sep 17 00:00:00 2001
>>>>>> From: Jens Axboe <[email protected]>
>>>>>> Date: Fri, 24 Oct 2008 09:22:42 +0200
>>>>>> Subject: [PATCH] libata: fix bug with non-ncq devices
>>>>>>
>>>>>> The recent commit 201f1b98822078c808b5e2d379a6ddbfc0a06ee1 to enable
>>>>> Wouldn't that be commit 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e?
>>>> Yes that is correct, the other commit is actually a private one in my
>>>> tree for other libata changes. Updated patch below, thanks for checking!
>>>>
>>>> From e598055dde1951c47c8b3522616f6ebff0ed9847 Mon Sep 17 00:00:00 2001
>>>> From: Jens Axboe <[email protected]>
>>>> Date: Fri, 24 Oct 2008 09:22:42 +0200
>>>> Subject: [PATCH] libata: fix bug with non-ncq devices
>>> Hello,
>>> this fixes my DVD, but unfortunately NCQ devices connected to PMP are
>>> still dead - apparently as soon as mount() tries to do serious I/O on
>>> the drive. Backing out both post-2.6.28-rc1 fix as well as your
>>> original change brings storage back. I suspect that problem is that
>>> with PMP same tag cannot be (should not be? must not be?) used on
>>> multiple devices behind PMP - and before your change tags were allocated
>>> per-port, while now they are allocated per-device.
>> That would indeed break, this requires allocating the tag map in the
>
> Totally untested, does this work?
>
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index 4b95c43..0785c46 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -1107,6 +1107,10 @@ static int ata_scsi_dev_config(struct scsi_device *sdev,
>
>  		depth = min(sdev->host->can_queue, ata_id_queue_depth(dev->id));
>  		depth = min(ATA_MAX_QUEUE - 1, depth);
> +
> +		if (dev->link->ap->pmp_link)
> +			scsi_init_shared_tag_map(sdev->host, ATA_MAX_QUEUE - 1);
> +
>  		scsi_set_tag_type(sdev, MSG_SIMPLE_TAG);
>  		scsi_activate_tcq(sdev, depth);
>  	}

No. It went through the same story as without the patch: first it
declared drive #2 hung; after a port reset, drives #0, 1 and 2 were
declared hung; after the second port reset drive #2 was declared dead;
after the third port reset drive #3 was hung; after the fourth reset it
said that /dev/sde changed capacity from 0 to 1TB, and at that point I
decided it was time to hit alt-sysrq-b to prevent damage...

Also I think that this change leaks memory a bit...
Petr
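
To make the per-port versus per-device tag hypothesis from earlier in the thread concrete, here is a toy model in plain C. It is not libata or block-layer code: tag_alloc(), count_collisions(), MAX_DEVS and the bitmap layout are all invented for illustration, and it assumes every drive behind the PMP shares one set of 32 command slots on the port.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_TAGS 32  /* toy value, matching the 32 NCQ tags */
#define MAX_DEVS 4   /* hypothetical number of drives behind the PMP */

/* Take the lowest free tag from *map and mark it busy; -1 if none is free. */
int tag_alloc(unsigned int *map)
{
    for (int tag = 0; tag < MAX_TAGS; tag++) {
        if (!(*map & (1u << tag))) {
            *map |= 1u << tag;
            return tag;
        }
    }
    return -1;
}

/*
 * Issue cmds_per_dev commands from each of ndev devices without completing
 * any of them, and count how often a tag is already busy on the shared port.
 * shared == true:  one tag map for the whole port.
 * shared == false: one private tag map per device.
 */
int count_collisions(bool shared, int ndev, int cmds_per_dev)
{
    unsigned int maps[MAX_DEVS] = { 0 };
    unsigned int port_slots = 0;  /* tags currently outstanding on the port */
    int collisions = 0;

    assert(ndev >= 1 && ndev <= MAX_DEVS);

    for (int dev = 0; dev < ndev; dev++) {
        unsigned int *map = shared ? &maps[0] : &maps[dev];

        for (int i = 0; i < cmds_per_dev; i++) {
            int tag = tag_alloc(map);

            if (tag < 0)
                break;  /* this map has no free tags left */
            if (port_slots & (1u << tag))
                collisions++;  /* another device already owns this slot */
            else
                port_slots |= 1u << tag;
        }
    }
    return collisions;
}
```

In this model, count_collisions(false, 3, 4) reports 8 clashes, because each device's private map starts handing out tags from 0 again, while count_collisions(true, 3, 4) reports none. Whether this is what actually happens on the real hardware is exactly the open question in the thread.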