Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755024AbdDJPo0 (ORCPT ); Mon, 10 Apr 2017 11:44:26 -0400 Received: from mx2.suse.de ([195.135.220.15]:38447 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754664AbdDJPet (ORCPT ); Mon, 10 Apr 2017 11:34:49 -0400 X-Amavis-Alert: BAD HEADER SECTION, Duplicate header field: "References" From: Jiri Slaby To: stable@vger.kernel.org Cc: linux-kernel@vger.kernel.org, James Bottomley , "Martin K . Petersen" , Joe Korty , Jiri Slaby Subject: [PATCH 3.12 111/142] scsi: mpt3sas: fix hang on ata passthrough commands Date: Mon, 10 Apr 2017 17:33:12 +0200 Message-Id: X-Mailer: git-send-email 2.12.2 In-Reply-To: References: In-Reply-To: References: Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5138 Lines: 143 From: James Bottomley 3.12-stable review patch. If anyone has any objections, please let me know. =============== commit ffb58456589443ca572221fabbdef3db8483a779 upstream. mpt3sas has a firmware failure where it can only handle one pass through ATA command at a time. If another comes in, contrary to the SAT standard, it will hang until the first one completes (causing long commands like secure erase to timeout). The original fix was to block the device when an ATA command came in, but this caused a regression with commit 669f044170d8933c3d66d231b69ea97cb8447338 Author: Bart Van Assche Date: Tue Nov 22 16:17:13 2016 -0800 scsi: srp_transport: Move queuecommand() wait code to SCSI core So fix the original fix of the secure erase timeout by properly returning SAM_STAT_BUSY like the SAT recommends. The original patch also had a concurrency problem since scsih_qcmd is lockless at that point (this is fixed by using atomic bitops to set and test the flag). [mkp: addressed feedback wrt. test_bit and fixed whitespace] Fixes: 18f6084a989ba1b (mpt3sas: Fix secure erase premature termination) Signed-off-by: James Bottomley Acked-by: Sreekanth Reddy Reviewed-by: Christoph Hellwig Reported-by: Ingo Molnar Tested-by: Ingo Molnar Signed-off-by: Martin K. Petersen Cc: Joe Korty Signed-off-by: Jiri Slaby --- drivers/scsi/mpt3sas/mpt3sas_base.h | 12 ++++++++++++ drivers/scsi/mpt3sas/mpt3sas_scsih.c | 36 +++++++++++++++++++++++++----------- 2 files changed, 37 insertions(+), 11 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h index 0ebf5d913c80..c56ac73a8d05 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.h +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h @@ -219,6 +219,7 @@ struct MPT3SAS_TARGET { * @eedp_enable: eedp support enable bit * @eedp_type: 0(type_1), 1(type_2), 2(type_3) * @eedp_block_length: block size + * @ata_command_pending: SATL passthrough outstanding for device */ struct MPT3SAS_DEVICE { struct MPT3SAS_TARGET *sas_target; @@ -227,6 +228,17 @@ struct MPT3SAS_DEVICE { u8 configured_lun; u8 block; u8 tlr_snoop_check; + /* + * Bug workaround for SATL handling: the mpt2/3sas firmware + * doesn't return BUSY or TASK_SET_FULL for subsequent + * commands while a SATL pass through is in operation as the + * spec requires, it simply does nothing with them until the + * pass through completes, causing them possibly to timeout if + * the passthrough is a long executing command (like format or + * secure erase). This variable allows us to do the right + * thing while a SATL command is pending. + */ + unsigned long ata_command_pending; }; #define MPT3_CMD_NOT_USED 0x8000 /* free */ diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index ae1db5499ca6..3d3d37e4b37c 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -3516,9 +3516,18 @@ _scsih_eedp_error_handling(struct scsi_cmnd *scmd, u16 ioc_status) SAM_STAT_CHECK_CONDITION; } -static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd) +static int _scsih_set_satl_pending(struct scsi_cmnd *scmd, bool pending) { - return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16); + struct MPT3SAS_DEVICE *priv = scmd->device->hostdata; + + if (scmd->cmnd[0] != ATA_12 && scmd->cmnd[0] != ATA_16) + return 0; + + if (pending) + return test_and_set_bit(0, &priv->ata_command_pending); + + clear_bit(0, &priv->ata_command_pending); + return 0; } /** @@ -3548,13 +3557,6 @@ _scsih_qcmd_lck(struct scsi_cmnd *scmd, void (*done)(struct scsi_cmnd *)) scsi_print_command(scmd); #endif - /* - * Lock the device for any subsequent command until command is - * done. - */ - if (ata_12_16_cmd(scmd)) - scsi_internal_device_block(scmd->device); - scmd->scsi_done = done; sas_device_priv_data = scmd->device->hostdata; if (!sas_device_priv_data || !sas_device_priv_data->sas_target) { @@ -3569,6 +3571,19 @@ _scsih_qcmd_lck(struct scsi_cmnd *scmd, void (*done)(struct scsi_cmnd *)) return 0; } + /* + * Bug work around for firmware SATL handling. The loop + * is based on atomic operations and ensures consistency + * since we're lockless at this point + */ + do { + if (test_bit(0, &sas_device_priv_data->ata_command_pending)) { + scmd->result = SAM_STAT_BUSY; + scmd->scsi_done(scmd); + return 0; + } + } while (_scsih_set_satl_pending(scmd, true)); + sas_target_priv_data = sas_device_priv_data->sas_target; /* invalid device handle */ @@ -4058,8 +4073,7 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply) if (scmd == NULL) return 1; - if (ata_12_16_cmd(scmd)) - scsi_internal_device_unblock(scmd->device, SDEV_RUNNING); + _scsih_set_satl_pending(scmd, false); mpi_request = mpt3sas_base_get_msg_frame(ioc, smid); -- 2.12.2