2016-04-11 17:26:36

by Joseph Salisbury

Subject: [4.5-rc4 Regression] qla2xxx: Add irq affinity notification

Hello Quinn,

A kernel bug report was opened against Ubuntu [0]. After a kernel
bisect, it was found that reverting the following commit resolved this bug:

commit cdb898c52d1dfad4b4800b83a58b3fe5d352edde
Author: Quinn Tran <[email protected]>
Date: Thu Dec 17 14:57:05 2015 -0500

qla2xxx: Add irq affinity notification


However, reverting that commit also required reverting the following
three commits:

commit 5327c7dbd1a7fd980608f44789076a636e5ee5fc
Author: Quinn Tran <[email protected]>
Date: Wed Feb 10 18:59:14 2016 -0500

qla2xxx: use TARGET_SCF_USE_CPUID flag to indiate CPU Affinity

commit 9095adaab8c1d82707e4e9961b6ad79b62f3361b
Author: Quinn Tran <[email protected]>
Date: Wed Feb 10 18:59:13 2016 -0500

target/transport: add flag to indicate CPU Affinity is observed

commit fb3269baf4ecc2ce6d17d4eb537080035bdf6d5b
Author: Quinn Tran <[email protected]>
Date: Thu Dec 17 14:57:06 2015 -0500

qla2xxx: Add selective command queuing



The regression was introduced as of v4.5-rc4.

I was hoping to get your feedback, since you are the patch author. The
dependent commits all look like they are improving CPU affinity, so
reverting them would likely impact performance. Do you think there is a
way forward instead of the reverts, or would it be best to submit a
revert request?


Thanks,

Joe


[0] http://pad.lv/1554003



2016-04-11 18:21:23

by Quinn Tran

Subject: Re: [4.5-rc4 Regression] qla2xxx: Add irq affinity notification

Joe,

How do I get access to the specific Ubuntu kernel where the bug was found? Is there a stack trace or bug report that you could share? Any data would be helpful. Thanks.

In the meantime, I will download 4.5-rc4 to re-verify.

Regards,
Quinn Tran







2016-04-11 18:41:46

by Joseph Salisbury

Subject: Re: [4.5-rc4 Regression] qla2xxx: Add irq affinity notification

On 04/11/2016 01:48 PM, Quinn Tran wrote:
> Joe,
>
> How do I get access to the specific Ubuntu kernel where the bug was found? Is there a stack trace or bug report that you could share? Any data would be helpful. Thanks.
The git tree for the specific Ubuntu kernel that exhibits this bug can
be found here:
git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/xenial

The bug report can be found here. It has screenshots of the panic as
well as dmesg output under the "Attachments" header:
http://pad.lv/1554003

Just let me know if additional debug information is needed.



2016-04-11 21:43:36

by Quinn Tran

Subject: Re: [4.5-rc4 Regression] qla2xxx: Add irq affinity notification

Joe,

I see the crash point: we’re dereferencing a NULL pointer. The adapter in use is an older 4G adapter that does not support MSI-X, but it goes through the same shared code path. The following is the proposed fix. Let me know if it works. I’ll follow up with a patch for upstream submission.

Thanks.

diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index 4af9547..79469de 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -2552,7 +2552,7 @@ void qla24xx_process_response_queue(struct scsi_qla_host *vha,
 	if (!vha->flags.online)
 		return;
 
-	if (rsp->msix->cpuid != smp_processor_id()) {
+	if (rsp->msix && (rsp->msix->cpuid != smp_processor_id())) {
 		/* if kernel does not notify qla of IRQ's CPU change,
 		 * then set it here.
 		 */
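
For readers following along, the guard in the patch above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the driver code: the struct and function names are hypothetical stand-ins, the `current_cpu` parameter takes the place of `smp_processor_id()`, and the assignment inside the branch is inferred from the comment in the diff. It shows why testing the pointer first is enough: `&&` short-circuits, so `cpuid` is never read when the MSI-X entry is missing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct msix_entry_sketch {
	int cpuid;                      /* CPU last seen handling the IRQ */
};

struct rsp_que_sketch {
	struct msix_entry_sketch *msix; /* NULL when the adapter has no MSI-X */
};

/*
 * Returns 1 if the cached CPU id was updated, 0 otherwise.
 * current_cpu is an assumption standing in for smp_processor_id().
 */
int update_cpuid(struct rsp_que_sketch *rsp, int current_cpu)
{
	assert(rsp != NULL);            /* the queue itself is assumed valid */

	/* && short-circuits: cpuid is only read when msix is non-NULL. */
	if (rsp->msix && rsp->msix->cpuid != current_cpu) {
		rsp->msix->cpuid = current_cpu;
		return 1;
	}
	return 0;
}
```

With a queue whose `msix` pointer is NULL, `update_cpuid()` simply returns 0 instead of faulting, which mirrors the behaviour the one-line change is meant to give adapters without MSI-X.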




Regards,
Quinn Tran







2016-04-12 02:10:21

by Joseph Salisbury

Subject: Re: [4.5-rc4 Regression] qla2xxx: Add irq affinity notification

On 04/11/2016 05:28 PM, Quinn Tran wrote:
> Joe,
>
> I see the crash point: we’re dereferencing a NULL pointer. The adapter in use is an older 4G adapter that does not support MSI-X, but it goes through the same shared code path. The following is the proposed fix. Let me know if it works. I’ll follow up with a patch for upstream submission.
>
> Thanks.
>
> diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
> index 4af9547..79469de 100644
> --- a/drivers/scsi/qla2xxx/qla_isr.c
> +++ b/drivers/scsi/qla2xxx/qla_isr.c
> @@ -2552,7 +2552,7 @@ void qla24xx_process_response_queue(struct scsi_qla_host *vha,
> if (!vha->flags.online)
> return;
>
> - if (rsp->msix->cpuid != smp_processor_id()) {
> + if (rsp->msix && (rsp->msix->cpuid != smp_processor_id())) {
> /* if kernel does not notify qla of IRQ's CPU change,
> * then set it here.
> */
>
>
>
>
> Regards,
> Quinn Tran
Good news, testing with your new patch resolved the bug and did not
require any reverts. Thanks for the quick response!

Testing results in comment #39: http://pad.lv/1554003



