2014-11-02 18:17:06

by Meelis Roos

[permalink] [raw]
Subject: bisected regression: qla2xxx endianness on sparc64

Between 3.17 and 3.18-rc2, qla2xxx is broken on my sparc64 machines. It
fails to boot (hangs in firmware rings init).

This is the result of bisect:

98aee70d19a7e3203649fa2078464e4f402a0ad8 is the first bad commit
commit 98aee70d19a7e3203649fa2078464e4f402a0ad8
Author: Joe Carnuccio <[email protected]>
Date: Thu Sep 25 05:16:38 2014 -0400

qla2xxx: Add endianizer to max_payload_size modifier.

Signed-off-by: Joe Carnuccio <[email protected]>
Signed-off-by: Saurav Kashyap <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>

:040000 040000 041c1a26dc2a7988900acc982e3bd65d3cf7e751 9700d09c3226fc352d44352f22b84f1be632d324 M s

This may not be the only problem - when bisecting, I also came to commits
that got past this step but hang after about 165 seconds of uptime while
running userspace startup scripts. But let that be another issue at the
moment.

--
Meelis Roos ([email protected])


2014-11-03 20:10:08

by Chad Dupuis

[permalink] [raw]
Subject: Re: bisected regression: qla2xxx endianness on sparc64


On Sun, 2 Nov 2014, Meelis Roos wrote:

> Between 3.17 and 3.18-rc2, qla2xxx is broken on my sparc64 machines. It fails
> to boot (hangs in firmware rings init).
>
> This is the result of bisect:
>
> 98aee70d19a7e3203649fa2078464e4f402a0ad8 is the first bad commit
> commit 98aee70d19a7e3203649fa2078464e4f402a0ad8
> Author: Joe Carnuccio <[email protected]>
> Date: Thu Sep 25 05:16:38 2014 -0400
>
> qla2xxx: Add endianizer to max_payload_size modifier.
>
> Signed-off-by: Joe Carnuccio <[email protected]>
> Signed-off-by: Saurav Kashyap <[email protected]>
> Signed-off-by: Christoph Hellwig <[email protected]>
>
> :040000 040000 041c1a26dc2a7988900acc982e3bd65d3cf7e751
> 9700d09c3226fc352d44352f22b84f1be632d324 M s
>
> This may not be the only problem - when bisecting, I also came to commits
> that got past this step but hang after about 165 seconds of uptime while
> running userspace startup scripts. But let that be another issue at the
> moment.
>

We should revert that change. What were some of the other failures you
were seeing?

2014-11-03 21:30:50

by Meelis Roos

[permalink] [raw]
Subject: Re: bisected regression: qla2xxx endianness on sparc64

> > Between 3.17 and 3.18-rc2, qla2xxx is broken on my sparc64 machines. It
> > fails to boot (hangs in firmware rings init).
> >
> > This is the result of bisect:
> >
> > 98aee70d19a7e3203649fa2078464e4f402a0ad8 is the first bad commit
> > commit 98aee70d19a7e3203649fa2078464e4f402a0ad8
> > Author: Joe Carnuccio <[email protected]>
> > Date: Thu Sep 25 05:16:38 2014 -0400
> >
> > qla2xxx: Add endianizer to max_payload_size modifier.
> >
> > Signed-off-by: Joe Carnuccio <[email protected]>
> > Signed-off-by: Saurav Kashyap <[email protected]>
> > Signed-off-by: Christoph Hellwig <[email protected]>
> >
> > :040000 040000 041c1a26dc2a7988900acc982e3bd65d3cf7e751
> > 9700d09c3226fc352d44352f22b84f1be632d324 M s
> >
> > This may not be the only problem - when bisecting, I also came to commits
> > that got past this step but hang after about 165 seconds of uptime while
> > running userspace startup scripts. But let that be another issue at the
> > moment.
> >
>
> We should revert that change. What were some of the other failures you were
> seeing?

Yes. I took the same 3.18.0-rc1-00422-g2cc9188-dirty kernel that had
just this patch reverted, it started the controller fine, detected disk,
mounted root, started multiple tasks and then some time after startin
exim it just hangs. This is consisten with what I saw during bisection.

--
Meelis Roos ([email protected])

2014-11-03 22:26:41

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: bisected regression: qla2xxx endianness on sparc64

On Sun, Nov 2, 2014 at 7:10 PM, Meelis Roos <[email protected]> wrote:
> Between 3.17 and 3.18-rc2, qla2xxx is broken on my sparc64 machines. It
> fails to boot (hangs in firmware rings init).
>
> This is the result of bisect:
>
> 98aee70d19a7e3203649fa2078464e4f402a0ad8 is the first bad commit
> commit 98aee70d19a7e3203649fa2078464e4f402a0ad8
> Author: Joe Carnuccio <[email protected]>
> Date: Thu Sep 25 05:16:38 2014 -0400
>
> qla2xxx: Add endianizer to max_payload_size modifier.
>
> Signed-off-by: Joe Carnuccio <[email protected]>
> Signed-off-by: Saurav Kashyap <[email protected]>
> Signed-off-by: Christoph Hellwig <[email protected]>
>
> :040000 040000 041c1a26dc2a7988900acc982e3bd65d3cf7e751
> 9700d09c3226fc352d44352f22b84f1be632d324 M s
>
> This may not be the only problem - when bisecting, I also came to commits
> that got past this step but hang after about 165 seconds of uptime while
> running userspace startup scripts. But let that be another issue at the
> moment.

Switching from uint16_t to __le16 but _removing_ cpu_to_le16() operations
looks indeed very fishy.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2014-11-10 14:15:28

by Christoph Hellwig

[permalink] [raw]
Subject: Re: bisected regression: qla2xxx endianness on sparc64

On Mon, Nov 03, 2014 at 03:09:47PM -0500, Chad Dupuis wrote:
> We should revert that change. What were some of the other failures you were
> seeing?

Can you please send me the revert ASAP?

2014-11-10 14:16:00

by Christoph Hellwig

[permalink] [raw]
Subject: Re: bisected regression: qla2xxx endianness on sparc64

On Mon, Nov 03, 2014 at 11:32:14PM +0200, Meelis Roos wrote:
> Yes. I took the same 3.18.0-rc1-00422-g2cc9188-dirty kernel that had
> just this patch reverted, it started the controller fine, detected disk,
> mounted root, started multiple tasks and then some time after startin
> exim it just hangs. This is consisten with what I saw during bisection.

Can you try to bisect this one as well?

2014-11-10 14:19:26

by Meelis Roos

[permalink] [raw]
Subject: Re: bisected regression: qla2xxx endianness on sparc64

> On Mon, Nov 03, 2014 at 11:32:14PM +0200, Meelis Roos wrote:
> > Yes. I took the same 3.18.0-rc1-00422-g2cc9188-dirty kernel that had
> > just this patch reverted, it started the controller fine, detected disk,
> > mounted root, started multiple tasks and then some time after startin
> > exim it just hangs. This is consisten with what I saw during bisection.
>
> Can you try to bisect this one as well?

I have tried for many evenings but it is not so easy. With current
bisecting, I have lost good base version, even going back to 3.17 and
3.16 breaks for now. I will try more after some days when I have an idea
what did I change to break it (some conf option perhaps).

--
Meelis Roos ([email protected])

2015-06-03 08:17:51

by Meelis Roos

[permalink] [raw]
Subject: Re: bisected regression: qla2xxx endianness on sparc64

On Mon, 10 Nov 2014, Christoph Hellwig wrote:

> On Mon, Nov 03, 2014 at 03:09:47PM -0500, Chad Dupuis wrote:
> > We should revert that change. What were some of the other failures you were
> > seeing?
>
> Can you please send me the revert ASAP?

Since QLogic is still silent on this one, I will send it to you:

Revert change that breaks QLA2XXX on big-endian systems,
__constant_cpu_to_le16() is still needed.

Signed-off-by: Meelis Roos <[email protected]>

diff --git a/drivers/scsi/qla2xxx/qla_fw.h b/drivers/scsi/qla2xxx/qla_fw.h
index 42bb357..88d3143 100644
--- a/drivers/scsi/qla2xxx/qla_fw.h
+++ b/drivers/scsi/qla2xxx/qla_fw.h
@@ -91,7 +91,7 @@ struct nvram_24xx {
/* Firmware Initialization Control Block. */
uint16_t version;
uint16_t reserved_1;
- __le16 frame_payload_size;
+ uint16_t frame_payload_size;
uint16_t execution_throttle;
uint16_t exchange_count;
uint16_t hard_address;
diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index 285cb20..ed973a1 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -2658,18 +2658,18 @@ qla2x00_nvram_config(scsi_qla_host_t *vha)
nv->firmware_options[1] = BIT_7 | BIT_5;
nv->add_firmware_options[0] = BIT_5;
nv->add_firmware_options[1] = BIT_5 | BIT_4;
- nv->frame_payload_size = 2048;
+ nv->frame_payload_size = __constant_cpu_to_le16(2048);
nv->special_options[1] = BIT_7;
} else if (IS_QLA2200(ha)) {
nv->firmware_options[0] = BIT_2 | BIT_1;
nv->firmware_options[1] = BIT_7 | BIT_5;
nv->add_firmware_options[0] = BIT_5;
nv->add_firmware_options[1] = BIT_5 | BIT_4;
- nv->frame_payload_size = 1024;
+ nv->frame_payload_size = __constant_cpu_to_le16(1024);
} else if (IS_QLA2100(ha)) {
nv->firmware_options[0] = BIT_3 | BIT_1;
nv->firmware_options[1] = BIT_5;
- nv->frame_payload_size = 1024;
+ nv->frame_payload_size = __constant_cpu_to_le16(1024);
}

nv->max_iocb_allocation = __constant_cpu_to_le16(256);
@@ -2705,7 +2705,7 @@ qla2x00_nvram_config(scsi_qla_host_t *vha)
* are valid.
*/
if (ia64_platform_is("sn2")) {
- nv->frame_payload_size = 2048;
+ nv->frame_payload_size = __constant_cpu_to_le16(2048);
if (IS_QLA23XX(ha))
nv->special_options[1] = BIT_7;
}
@@ -5022,7 +5022,7 @@ qla24xx_nvram_config(scsi_qla_host_t *vha)
memset(nv, 0, ha->nvram_size);
nv->nvram_version = __constant_cpu_to_le16(ICB_VERSION);
nv->version = __constant_cpu_to_le16(ICB_VERSION);
- nv->frame_payload_size = 2048;
+ nv->frame_payload_size = __constant_cpu_to_le16(2048);
nv->execution_throttle = __constant_cpu_to_le16(0xFFFF);
nv->exchange_count = __constant_cpu_to_le16(0);
nv->hard_address = __constant_cpu_to_le16(124);
@@ -5969,7 +5969,7 @@ qla81xx_nvram_config(scsi_qla_host_t *vha)
memset(nv, 0, ha->nvram_size);
nv->nvram_version = __constant_cpu_to_le16(ICB_VERSION);
nv->version = __constant_cpu_to_le16(ICB_VERSION);
- nv->frame_payload_size = 2048;
+ nv->frame_payload_size = __constant_cpu_to_le16(2048);
nv->execution_throttle = __constant_cpu_to_le16(0xFFFF);
nv->exchange_count = __constant_cpu_to_le16(0);
nv->port_name[0] = 0x21;

--
Meelis Roos ([email protected])

2015-06-03 19:39:49

by Joe Carnuccio

[permalink] [raw]
Subject: RE: bisected regression: qla2xxx endianness on sparc64

Meelis,

Yes, please revert that patch (it should never have deleted the calls to __constant_cpu_to_le16 (those fields are all LE in the HW/nvram)).

-Joe

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Meelis Roos
Sent: Wednesday, June 03, 2015 1:21 AM
To: Christoph Hellwig
Cc: Chad Dupuis; Joe Carnuccio; Saurav Kashyap; Christoph Hellwig; Dept-Eng QLA2xxx Upstream; James E.J. Bottomley; linux-scsi; linux-kernel
Subject: Re: bisected regression: qla2xxx endianness on sparc64

On Mon, 10 Nov 2014, Christoph Hellwig wrote:

> On Mon, Nov 03, 2014 at 03:09:47PM -0500, Chad Dupuis wrote:
> > We should revert that change. What were some of the other failures
> > you were seeing?
>
> Can you please send me the revert ASAP?

Since QLogic is still silent on this one, I will send it to you:

Revert change that breaks QLA2XXX on big-endian systems,
__constant_cpu_to_le16() is still needed.

Signed-off-by: Meelis Roos <[email protected]>

diff --git a/drivers/scsi/qla2xxx/qla_fw.h b/drivers/scsi/qla2xxx/qla_fw.h index 42bb357..88d3143 100644
--- a/drivers/scsi/qla2xxx/qla_fw.h
+++ b/drivers/scsi/qla2xxx/qla_fw.h
@@ -91,7 +91,7 @@ struct nvram_24xx {
/* Firmware Initialization Control Block. */
uint16_t version;
uint16_t reserved_1;
- __le16 frame_payload_size;
+ uint16_t frame_payload_size;
uint16_t execution_throttle;
uint16_t exchange_count;
uint16_t hard_address;
diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c index 285cb20..ed973a1 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -2658,18 +2658,18 @@ qla2x00_nvram_config(scsi_qla_host_t *vha)
nv->firmware_options[1] = BIT_7 | BIT_5;
nv->add_firmware_options[0] = BIT_5;
nv->add_firmware_options[1] = BIT_5 | BIT_4;
- nv->frame_payload_size = 2048;
+ nv->frame_payload_size = __constant_cpu_to_le16(2048);
nv->special_options[1] = BIT_7;
} else if (IS_QLA2200(ha)) {
nv->firmware_options[0] = BIT_2 | BIT_1;
nv->firmware_options[1] = BIT_7 | BIT_5;
nv->add_firmware_options[0] = BIT_5;
nv->add_firmware_options[1] = BIT_5 | BIT_4;
- nv->frame_payload_size = 1024;
+ nv->frame_payload_size = __constant_cpu_to_le16(1024);
} else if (IS_QLA2100(ha)) {
nv->firmware_options[0] = BIT_3 | BIT_1;
nv->firmware_options[1] = BIT_5;
- nv->frame_payload_size = 1024;
+ nv->frame_payload_size = __constant_cpu_to_le16(1024);
}

nv->max_iocb_allocation = __constant_cpu_to_le16(256); @@ -2705,7 +2705,7 @@ qla2x00_nvram_config(scsi_qla_host_t *vha)
* are valid.
*/
if (ia64_platform_is("sn2")) {
- nv->frame_payload_size = 2048;
+ nv->frame_payload_size = __constant_cpu_to_le16(2048);
if (IS_QLA23XX(ha))
nv->special_options[1] = BIT_7;
}
@@ -5022,7 +5022,7 @@ qla24xx_nvram_config(scsi_qla_host_t *vha)
memset(nv, 0, ha->nvram_size);
nv->nvram_version = __constant_cpu_to_le16(ICB_VERSION);
nv->version = __constant_cpu_to_le16(ICB_VERSION);
- nv->frame_payload_size = 2048;
+ nv->frame_payload_size = __constant_cpu_to_le16(2048);
nv->execution_throttle = __constant_cpu_to_le16(0xFFFF);
nv->exchange_count = __constant_cpu_to_le16(0);
nv->hard_address = __constant_cpu_to_le16(124); @@ -5969,7 +5969,7 @@ qla81xx_nvram_config(scsi_qla_host_t *vha)
memset(nv, 0, ha->nvram_size);
nv->nvram_version = __constant_cpu_to_le16(ICB_VERSION);
nv->version = __constant_cpu_to_le16(ICB_VERSION);
- nv->frame_payload_size = 2048;
+ nv->frame_payload_size = __constant_cpu_to_le16(2048);
nv->execution_throttle = __constant_cpu_to_le16(0xFFFF);
nv->exchange_count = __constant_cpu_to_le16(0);
nv->port_name[0] = 0x21;

--
Meelis Roos ([email protected])