2016-11-05 13:30:03

by James Bottomley

Subject: [GIT PULL] SCSI fixes for 4.9-rc3

Two more important data integrity fixes for RAID device drivers that
wrongly throw away the SYNCHRONIZE CACHE command in the non-RAID
path, plus a memory leak fix in the scsi_debug driver.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Ching Huang (1):
scsi: arcmsr: Send SYNCHRONIZE_CACHE command to firmware

Ewan D. Milne (1):
scsi: scsi_debug: Fix memory leak if LBP enabled and module is unloaded

Kashyap Desai (1):
scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices

And the diffstat:

drivers/scsi/arcmsr/arcmsr_hba.c | 9 ---------
drivers/scsi/megaraid/megaraid_sas_base.c | 13 +++++--------
drivers/scsi/scsi_debug.c | 1 +
3 files changed, 6 insertions(+), 17 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index 3d53d63..f0cfb04 100644
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -2636,18 +2636,9 @@ static int arcmsr_queue_command_lck(struct scsi_cmnd *cmd,
struct AdapterControlBlock *acb = (struct AdapterControlBlock *) host->hostdata;
struct CommandControlBlock *ccb;
int target = cmd->device->id;
- int lun = cmd->device->lun;
- uint8_t scsicmd = cmd->cmnd[0];
cmd->scsi_done = done;
cmd->host_scribble = NULL;
cmd->result = 0;
- if ((scsicmd == SYNCHRONIZE_CACHE) ||(scsicmd == SEND_DIAGNOSTIC)){
- if(acb->devstate[target][lun] == ARECA_RAID_GONE) {
- cmd->result = (DID_NO_CONNECT << 16);
- }
- cmd->scsi_done(cmd);
- return 0;
- }
if (target == 16) {
/* virtual device for iop message transfer */
arcmsr_handle_virtual_command(acb, cmd);
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 9ff57de..d8b1fbd 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -1700,16 +1700,13 @@ megasas_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
goto out_done;
}

- switch (scmd->cmnd[0]) {
- case SYNCHRONIZE_CACHE:
- /*
- * FW takes care of flush cache on its own
- * No need to send it down
- */
+ /*
+ * FW takes care of flush cache on its own for Virtual Disk.
+ * No need to send it down for VD. For JBOD send SYNCHRONIZE_CACHE to FW.
+ */
+ if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) && MEGASAS_IS_LOGICAL(scmd)) {
scmd->result = DID_OK << 16;
goto out_done;
- default:
- break;
}

return instance->instancet->build_and_issue_cmd(instance, scmd);
diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index c905709..cf04a36 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -5134,6 +5134,7 @@ static void __exit scsi_debug_exit(void)
bus_unregister(&pseudo_lld_bus);
root_device_unregister(pseudo_primary);

+ vfree(map_storep);
vfree(dif_storep);
vfree(fake_storep);
kfree(sdebug_q_arr);
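
As a reading aid for the megaraid_sas hunk above, here is a minimal user-space sketch of the dispatch decision it introduces: SYNCHRONIZE CACHE is completed in the driver only for logical (virtual disk) targets and is forwarded to firmware for JBOD targets. The `sketch_*` names and the `is_logical` flag are hypothetical stand-ins, not the driver's types; in the driver the test is `MEGASAS_IS_LOGICAL(scmd)`.

```c
#include <assert.h>
#include <stdbool.h>

/* Opcode of SYNCHRONIZE CACHE(10) in the SCSI command set. */
#define SKETCH_SYNCHRONIZE_CACHE 0x35
/* Opcode of READ(10), used only as an example of "any other command". */
#define SKETCH_READ_10 0x28

/* Hypothetical, simplified view of a queued command. */
struct sketch_cmd {
	unsigned char opcode;
	bool is_logical; /* true: RAID virtual disk; false: JBOD (passthrough) */
};

enum sketch_action {
	SKETCH_COMPLETE_IN_DRIVER, /* report success without involving firmware */
	SKETCH_SEND_TO_FIRMWARE    /* build and issue the command */
};

/* Only a cache flush aimed at a virtual disk is short-circuited. */
static enum sketch_action sketch_dispatch(const struct sketch_cmd *cmd)
{
	if (cmd->opcode == SKETCH_SYNCHRONIZE_CACHE && cmd->is_logical)
		return SKETCH_COMPLETE_IN_DRIVER;
	return SKETCH_SEND_TO_FIRMWARE;
}

/* Self-check covering the three interesting cases. */
void sketch_self_check(void)
{
	struct sketch_cmd vd_flush = { SKETCH_SYNCHRONIZE_CACHE, true };
	struct sketch_cmd jbod_flush = { SKETCH_SYNCHRONIZE_CACHE, false };
	struct sketch_cmd vd_read = { SKETCH_READ_10, true };

	assert(sketch_dispatch(&vd_flush) == SKETCH_COMPLETE_IN_DRIVER);
	assert(sketch_dispatch(&jbod_flush) == SKETCH_SEND_TO_FIRMWARE);
	assert(sketch_dispatch(&vd_read) == SKETCH_SEND_TO_FIRMWARE);
}
```

Calling `sketch_self_check()` from a `main` exercises all three cases. The old `switch` treated the first two cases identically, which is the JBOD data integrity problem the patch title refers to.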


2016-11-11 03:32:18

by Gabriel C

Subject: Re: [GIT PULL] SCSI fixes for 4.9-rc3


On 05.11.2016 14:29, James Bottomley wrote:


...

> Kashyap Desai (1):
> scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices
>
> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
> index 9ff57de..d8b1fbd 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> @@ -1700,16 +1700,13 @@ megasas_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
> goto out_done;
> }
>
> - switch (scmd->cmnd[0]) {
> - case SYNCHRONIZE_CACHE:
> - /*
> - * FW takes care of flush cache on its own
> - * No need to send it down
> - */
> + /*
> + * FW takes care of flush cache on its own for Virtual Disk.
> + * No need to send it down for VD. For JBOD send SYNCHRONIZE_CACHE to FW.
> + */
> + if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) && MEGASAS_IS_LOGICAL(scmd)) {
> scmd->result = DID_OK << 16;
> goto out_done;
> - default:
> - break;
> }
>
> return instance->instancet->build_and_issue_cmd(instance, scmd);

This patch breaks my box. I'm not able to boot it anymore.
It seems that with this patch I have /dev/sda[a-z] to /dev/sdz[a-z] ?!?

I'm not sure how to get a log, since dracut times out and, after a very long time
of probing 'ghost devices', I'm dropped into an emergency shell where journalctl doesn't work either.

After reverting this one I can boot normally.

The box is a FUJITSU PRIMERGY TX200 S5.

This is from a working kernel:

[ 5.119371] megaraid_sas 0000:01:00.0: FW now in Ready state
[ 5.119418] megaraid_sas 0000:01:00.0: firmware supports msix : (0)
[ 5.119420] megaraid_sas 0000:01:00.0: current msix/online cpus : (1/16)
[ 5.119422] megaraid_sas 0000:01:00.0: RDPQ mode : (disabled)
[ 5.123100] ehci-pci 0000:00:1a.7: cache line size of 32 is not supported
[ 5.123113] ehci-pci 0000:00:1a.7: irq 18, io mem 0xb0020000

...

[ 5.208063] megaraid_sas 0000:01:00.0: controller type : MR(256MB)
[ 5.208065] megaraid_sas 0000:01:00.0: Online Controller Reset(OCR) : Enabled
[ 5.208067] megaraid_sas 0000:01:00.0: Secure JBOD support : No
[ 5.208070] megaraid_sas 0000:01:00.0: megasas_init_mfi: fw_support_ieee=0
[ 5.208073] megaraid_sas 0000:01:00.0: INIT adapter done
[ 5.208075] megaraid_sas 0000:01:00.0: Jbod map is not supported megasas_setup_jbod_map 4967
[ 5.230163] megaraid_sas 0000:01:00.0: MR_DCMD_PD_LIST_QUERY failed/not supported by firmware
[ 5.252080] megaraid_sas 0000:01:00.0: DCMD not supported by firmware - megasas_ld_list_query 4369
[ 5.274086] megaraid_sas 0000:01:00.0: pci id : (0x1000)/(0x0060)/(0x1734)/(0x10f9)
[ 5.274089] megaraid_sas 0000:01:00.0: unevenspan support : no
[ 5.274090] megaraid_sas 0000:01:00.0: firmware crash dump : no
[ 5.274092] megaraid_sas 0000:01:00.0: jbod sync map : no
[ 5.274094] scsi host0: Avago SAS based MegaRAID driver
[ 5.280022] scsi 0:0:6:0: Direct-Access ATA WDC WD5002ABYS-5 3B06 PQ: 0 ANSI: 5
[ 5.282153] scsi 0:0:7:0: Direct-Access ATA WDC WD5002ABYS-5 3B06 PQ: 0 ANSI: 5
[ 5.285180] scsi 0:0:10:0: Direct-Access ATA ST500NM0011 FTM6 PQ: 0 ANSI: 5
[ 5.369885] scsi 0:2:0:0: Direct-Access LSI MegaRAID SAS RMB 1.40 PQ: 0 ANSI: 5

..

Please let me know if you need more info and/or want me to test patches.


Best Regards,

Gabriel C

2016-11-11 04:11:33

by Gabriel C

Subject: Re: [GIT PULL] SCSI fixes for 4.9-rc3



On 11.11.2016 04:30, Gabriel C wrote:
>
> On 05.11.2016 14:29, James Bottomley wrote:
>
>
> ...
>
>> Kashyap Desai (1):
>> scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices
>>
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
>> index 9ff57de..d8b1fbd 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>> @@ -1700,16 +1700,13 @@ megasas_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
>> goto out_done;
>> }
>>
>> - switch (scmd->cmnd[0]) {
>> - case SYNCHRONIZE_CACHE:
>> - /*
>> - * FW takes care of flush cache on its own
>> - * No need to send it down
>> - */
>> + /*
>> + * FW takes care of flush cache on its own for Virtual Disk.
>> + * No need to send it down for VD. For JBOD send SYNCHRONIZE_CACHE to FW.
>> + */
>> + if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) && MEGASAS_IS_LOGICAL(scmd)) {
>> scmd->result = DID_OK << 16;
>> goto out_done;
>> - default:
>> - break;
>> }
>>
>> return instance->instancet->build_and_issue_cmd(instance, scmd);
>
> This patch breaks my box. I'm not able to boot it anymore.
> It seems that with this patch I have /dev/sda[a-z] to /dev/sdz[a-z] ?!?
>
> I'm not sure how to get a log, since dracut times out and, after a very long time
> of probing 'ghost devices', I'm dropped into an emergency shell where journalctl doesn't work either.
>
> After reverting this one I can boot normally.
>
> The box is a FUJITSU PRIMERGY TX200 S5.
>
> This is from a working kernel:
>
> [ 5.119371] megaraid_sas 0000:01:00.0: FW now in Ready state
> [ 5.119418] megaraid_sas 0000:01:00.0: firmware supports msix : (0)
> [ 5.119420] megaraid_sas 0000:01:00.0: current msix/online cpus : (1/16)
> [ 5.119422] megaraid_sas 0000:01:00.0: RDPQ mode : (disabled)
> [ 5.123100] ehci-pci 0000:00:1a.7: cache line size of 32 is not supported
> [ 5.123113] ehci-pci 0000:00:1a.7: irq 18, io mem 0xb0020000
>
> ...
>
> [ 5.208063] megaraid_sas 0000:01:00.0: controller type : MR(256MB)
> [ 5.208065] megaraid_sas 0000:01:00.0: Online Controller Reset(OCR) : Enabled
> [ 5.208067] megaraid_sas 0000:01:00.0: Secure JBOD support : No
> [ 5.208070] megaraid_sas 0000:01:00.0: megasas_init_mfi: fw_support_ieee=0
> [ 5.208073] megaraid_sas 0000:01:00.0: INIT adapter done
> [ 5.208075] megaraid_sas 0000:01:00.0: Jbod map is not supported megasas_setup_jbod_map 4967
> [ 5.230163] megaraid_sas 0000:01:00.0: MR_DCMD_PD_LIST_QUERY failed/not supported by firmware
> [ 5.252080] megaraid_sas 0000:01:00.0: DCMD not supported by firmware - megasas_ld_list_query 4369
> [ 5.274086] megaraid_sas 0000:01:00.0: pci id : (0x1000)/(0x0060)/(0x1734)/(0x10f9)
> [ 5.274089] megaraid_sas 0000:01:00.0: unevenspan support : no
> [ 5.274090] megaraid_sas 0000:01:00.0: firmware crash dump : no
> [ 5.274092] megaraid_sas 0000:01:00.0: jbod sync map : no
> [ 5.274094] scsi host0: Avago SAS based MegaRAID driver
> [ 5.280022] scsi 0:0:6:0: Direct-Access ATA WDC WD5002ABYS-5 3B06 PQ: 0 ANSI: 5
> [ 5.282153] scsi 0:0:7:0: Direct-Access ATA WDC WD5002ABYS-5 3B06 PQ: 0 ANSI: 5
> [ 5.285180] scsi 0:0:10:0: Direct-Access ATA ST500NM0011 FTM6 PQ: 0 ANSI: 5
> [ 5.369885] scsi 0:2:0:0: Direct-Access LSI MegaRAID SAS RMB 1.40 PQ: 0 ANSI: 5
>
> ..
>
> Please let me know if you need more info and/or want me to test patches.
>
>

I managed to get some parts of the broken dmesg. Here it is:

http://ftp.frugalware.org/pub/other/people/crazy/kernel/broken-dmesg


2016-11-12 01:08:38

by Kashyap Desai

Subject: RE: [GIT PULL] SCSI fixes for 4.9-rc3

> -----Original Message-----
> From: [email protected] [mailto:linux-scsi-
> [email protected]] On Behalf Of Gabriel C
> Sent: Friday, November 11, 2016 9:40 AM
> To: James Bottomley; Andrew Morton; Linus Torvalds
> Cc: linux-scsi; linux-kernel; [email protected]
> Subject: Re: [GIT PULL] SCSI fixes for 4.9-rc3
>
>
>
> On 11.11.2016 04:30, Gabriel C wrote:
> >
> > On 05.11.2016 14:29, James Bottomley wrote:
> >
> >
> > ...
> >
> >> Kashyap Desai (1):
> >> scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices
> >>
> >> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
> >> index 9ff57de..d8b1fbd 100644
> >> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> >> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> >> @@ -1700,16 +1700,13 @@ megasas_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
> >> goto out_done;
> >> }
> >>
> >> - switch (scmd->cmnd[0]) {
> >> - case SYNCHRONIZE_CACHE:
> >> - /*
> >> - * FW takes care of flush cache on its own
> >> - * No need to send it down
> >> - */
> >> + /*
> >> + * FW takes care of flush cache on its own for Virtual Disk.
> >> + * No need to send it down for VD. For JBOD send SYNCHRONIZE_CACHE to FW.
> >> + */
> >> + if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) && MEGASAS_IS_LOGICAL(scmd)) {
> >> scmd->result = DID_OK << 16;
> >> goto out_done;
> >> - default:
> >> - break;
> >> }
> >>
> >> return instance->instancet->build_and_issue_cmd(instance, scmd);
> >
> > This patch breaks my box. I'm not able to boot it anymore.
> > It seems that with this patch I have /dev/sda[a-z] to /dev/sdz[a-z] ?!?
> >
> > I'm not sure how to get a log, since dracut times out and, after a very long time
> > of probing 'ghost devices', I'm dropped into an emergency shell where journalctl doesn't work either.
> >
> > After reverting this one I can boot normally.
> >
> > The box is a FUJITSU PRIMERGY TX200 S5.

Please check the commit below; it contains the complete fix.

http://git.kernel.org/cgit/linux/kernel/git/jejb/scsi.git/commit/?id=5e5ec1759dd663a1d5a2f10930224dd009e500e8
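
The thread does not say what the follow-up commit changes. One common way a condition of the form `(opcode == X) && SOME_MACRO(...)` can misfire is operator precedence: if the macro expands to a conditional expression without outer parentheses, `&&` binds tighter than `?:`. The sketch below illustrates that general C pitfall with hypothetical `SKETCH_*` names and a made-up channel split; treat it as an assumption about a possible cause, not as a description of the actual fix.

```c
#include <assert.h>

#define SKETCH_SYNC_CACHE 0x35   /* SYNCHRONIZE CACHE(10) opcode */
#define SKETCH_INQUIRY 0x12      /* INQUIRY opcode, sent while probing devices */
#define SKETCH_MAX_PD_CHANNELS 2 /* hypothetical: channels below this are physical */

/* Assumed shape: a conditional expression without outer parentheses. */
#define SKETCH_IS_LOGICAL_BAD(ch) ((ch) < SKETCH_MAX_PD_CHANNELS) ? 0 : 1
/* Same test wrapped in parentheses so it always yields a single value. */
#define SKETCH_IS_LOGICAL_OK(ch) (((ch) < SKETCH_MAX_PD_CHANNELS) ? 0 : 1)

/* Returns nonzero when the command would be completed without reaching firmware. */
int sketch_skips_firmware_bad(int opcode, int channel)
{
	/* Parses as ((opcode == SYNC) && (channel < N)) ? 0 : 1 */
	return (opcode == SKETCH_SYNC_CACHE) && SKETCH_IS_LOGICAL_BAD(channel);
}

int sketch_skips_firmware_ok(int opcode, int channel)
{
	/* Parses as (opcode == SYNC) && ((channel < N) ? 0 : 1) */
	return (opcode == SKETCH_SYNC_CACHE) && SKETCH_IS_LOGICAL_OK(channel);
}

void sketch_precedence_check(void)
{
	/* INQUIRY to a physical channel must reach firmware. */
	assert(sketch_skips_firmware_ok(SKETCH_INQUIRY, 0) == 0);
	/* With the unparenthesized macro it is wrongly short-circuited. */
	assert(sketch_skips_firmware_bad(SKETCH_INQUIRY, 0) == 1);
	/* SYNCHRONIZE CACHE to a logical channel is skipped in both variants. */
	assert(sketch_skips_firmware_ok(SKETCH_SYNC_CACHE, 2) == 1);
	assert(sketch_skips_firmware_bad(SKETCH_SYNC_CACHE, 2) == 1);
}
```

In the unparenthesized variant almost every command is reported as completed without being issued, which would at least be consistent with a scan that appears to find devices that do not exist.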


> >
> > This is from a working kernel:
> >
> > [ 5.119371] megaraid_sas 0000:01:00.0: FW now in Ready state
> > [ 5.119418] megaraid_sas 0000:01:00.0: firmware supports msix : (0)
> > [ 5.119420] megaraid_sas 0000:01:00.0: current msix/online cpus : (1/16)
> > [ 5.119422] megaraid_sas 0000:01:00.0: RDPQ mode : (disabled)
> > [ 5.123100] ehci-pci 0000:00:1a.7: cache line size of 32 is not supported
> > [ 5.123113] ehci-pci 0000:00:1a.7: irq 18, io mem 0xb0020000
> >
> > ...
> >
> > [ 5.208063] megaraid_sas 0000:01:00.0: controller type : MR(256MB)
> > [ 5.208065] megaraid_sas 0000:01:00.0: Online Controller Reset(OCR) : Enabled
> > [ 5.208067] megaraid_sas 0000:01:00.0: Secure JBOD support : No
> > [ 5.208070] megaraid_sas 0000:01:00.0: megasas_init_mfi: fw_support_ieee=0
> > [ 5.208073] megaraid_sas 0000:01:00.0: INIT adapter done
> > [ 5.208075] megaraid_sas 0000:01:00.0: Jbod map is not supported megasas_setup_jbod_map 4967
> > [ 5.230163] megaraid_sas 0000:01:00.0: MR_DCMD_PD_LIST_QUERY failed/not supported by firmware
> > [ 5.252080] megaraid_sas 0000:01:00.0: DCMD not supported by firmware - megasas_ld_list_query 4369
> > [ 5.274086] megaraid_sas 0000:01:00.0: pci id : (0x1000)/(0x0060)/(0x1734)/(0x10f9)
> > [ 5.274089] megaraid_sas 0000:01:00.0: unevenspan support : no
> > [ 5.274090] megaraid_sas 0000:01:00.0: firmware crash dump : no
> > [ 5.274092] megaraid_sas 0000:01:00.0: jbod sync map : no
> > [ 5.274094] scsi host0: Avago SAS based MegaRAID driver
> > [ 5.280022] scsi 0:0:6:0: Direct-Access ATA WDC WD5002ABYS-5 3B06 PQ: 0 ANSI: 5
> > [ 5.282153] scsi 0:0:7:0: Direct-Access ATA WDC WD5002ABYS-5 3B06 PQ: 0 ANSI: 5
> > [ 5.285180] scsi 0:0:10:0: Direct-Access ATA ST500NM0011 FTM6 PQ: 0 ANSI: 5
> > [ 5.369885] scsi 0:2:0:0: Direct-Access LSI MegaRAID SAS RMB 1.40 PQ: 0 ANSI: 5
> >
> > ..
> >
> > Please let me know if you need more info and/or want me to test patches.
> >
> >
>
> I managed to get some parts of the broken dmesg. Here it is:
>
> http://ftp.frugalware.org/pub/other/people/crazy/kernel/broken-dmesg
>
>

2016-11-12 02:25:30

by Gabriel C

Subject: Re: [GIT PULL] SCSI fixes for 4.9-rc3



On 12.11.2016 02:08, Kashyap Desai wrote:
>> -----Original Message-----
>> From: [email protected] [mailto:linux-scsi-
>> [email protected]] On Behalf Of Gabriel C
>> Sent: Friday, November 11, 2016 9:40 AM
>> To: James Bottomley; Andrew Morton; Linus Torvalds
>> Cc: linux-scsi; linux-kernel; [email protected]
>> Subject: Re: [GIT PULL] SCSI fixes for 4.9-rc3
>>
>>
>>
>> On 11.11.2016 04:30, Gabriel C wrote:
>>>
>>> On 05.11.2016 14:29, James Bottomley wrote:
>>>
>>>
>>> ...
>>>
>>>> Kashyap Desai (1):
>>>> scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices
>>>>
>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> index 9ff57de..d8b1fbd 100644
>>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> @@ -1700,16 +1700,13 @@ megasas_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
>>>> goto out_done;
>>>> }
>>>>
>>>> - switch (scmd->cmnd[0]) {
>>>> - case SYNCHRONIZE_CACHE:
>>>> - /*
>>>> - * FW takes care of flush cache on its own
>>>> - * No need to send it down
>>>> - */
>>>> + /*
>>>> + * FW takes care of flush cache on its own for Virtual Disk.
>>>> + * No need to send it down for VD. For JBOD send SYNCHRONIZE_CACHE to FW.
>>>> + */
>>>> + if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) && MEGASAS_IS_LOGICAL(scmd)) {
>>>> scmd->result = DID_OK << 16;
>>>> goto out_done;
>>>> - default:
>>>> - break;
>>>> }
>>>>
>>>> return instance->instancet->build_and_issue_cmd(instance, scmd);
>>>
>>> This patch breaks my box. I'm not able to boot it anymore.
>>> It seems that with this patch I have /dev/sda[a-z] to /dev/sdz[a-z] ?!?
>>>
>>> I'm not sure how to get a log, since dracut times out and, after a very long time
>>> of probing 'ghost devices', I'm dropped into an emergency shell where journalctl doesn't work either.
>>>
>>> After reverting this one I can boot normally.
>>>
>>> The box is a FUJITSU PRIMERGY TX200 S5.
>
> Please check the commit below; it contains the complete fix.
>
> http://git.kernel.org/cgit/linux/kernel/git/jejb/scsi.git/commit/?id=5e5ec1759dd663a1d5a2f10930224dd009e500e8
>


This patch fixes the problem for me. Thank you.


Regards,

Gabriel C