From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Kevin Barnett, Don Brace, "Martin K. Petersen", Sasha Levin,
    esc.storagedev@microsemi.com, linux-scsi@vger.kernel.org
Subject: [PATCH AUTOSEL 4.20 104/117] scsi: smartpqi: correct lun reset issues
Date: Tue, 8 Jan 2019 14:26:12 -0500
Message-Id: <20190108192628.121270-104-sashal@kernel.org>
In-Reply-To: <20190108192628.121270-1-sashal@kernel.org>
References: <20190108192628.121270-1-sashal@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Kevin Barnett

[ Upstream commit 2ba55c9851d74eb015a554ef69ddf2ef061d5780 ]

Problem:
The Linux kernel takes a logical volume offline after a LUN reset. This is
generally accompanied by this message in the dmesg output:

Device offlined - not ready after error recovery

Root Cause:
The root cause is a "quirk" in the timeout handling in the Linux SCSI
layer. The Linux kernel places a 30-second timeout on most media access
commands (reads and writes) that it sends to device drivers. When a media
access command times out, the Linux kernel goes into error recovery mode
for the LUN that was the target of the command that timed out. Every
command that timed out is kept on a list inside of the Linux kernel to be
retried later. The kernel attempts to recover the command(s) that timed out
by issuing a LUN reset followed by a TEST UNIT READY. If the LUN reset and
TEST UNIT READY commands are successful, the kernel retries the command(s)
that timed out.

Each SCSI command issued by the kernel has a result field associated with
it. This field indicates the final result of the command (success or
error). When a command times out, the kernel places a value in this result
field indicating that the command timed out.

The "quirk" is that after the LUN reset and TEST UNIT READY commands are
completed, the kernel checks each command on the timed-out command list
before retrying it. If the result field is still "timed out", the kernel
treats that command as not having been successfully recovered for a retry.
If the number of commands in this state is greater than two, the kernel
takes the LUN offline.

Fix:
When our RAIDStack receives a LUN reset, it simply waits until all
outstanding commands complete. Generally, all of these outstanding commands
complete successfully. Therefore, the fix in the smartpqi driver is to
always set the command result field to indicate success when a request
completes successfully. This normally isn’t necessary because the result
field is always initialized to success when the command is submitted to the
driver. So when the command completes successfully, the result field is
left untouched. But in this case, the kernel changes the result field
behind the driver’s back and then expects the field to be changed by the
driver as the commands that timed out complete.

Reviewed-by: Dave Carroll
Reviewed-by: Scott Teel
Signed-off-by: Kevin Barnett
Signed-off-by: Don Brace
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
---
 drivers/scsi/smartpqi/smartpqi_init.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index a25a07a0b7f0..c1efc182f5ea 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -2704,6 +2704,9 @@ static unsigned int pqi_process_io_intr(struct pqi_ctrl_info *ctrl_info,
 		switch (response->header.iu_type) {
 		case PQI_RESPONSE_IU_RAID_PATH_IO_SUCCESS:
 		case PQI_RESPONSE_IU_AIO_PATH_IO_SUCCESS:
+			if (io_request->scmd)
+				io_request->scmd->result = 0;
+			/* fall through */
 		case PQI_RESPONSE_IU_GENERAL_MANAGEMENT:
 			break;
 		case PQI_RESPONSE_IU_TASK_MANAGEMENT:
-- 
2.19.1
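
For readers who want to see the interaction described in the commit message
in isolation, here is a minimal user-space sketch. It is not kernel code and
not part of the patch. The struct and every sim_* function are hypothetical
names made up for illustration, and SIM_MAX_CMDS is an arbitrary bound. The
"more than two stale commands" threshold is taken from the commit message
rather than from the SCSI midlayer source. The only values borrowed from the
kernel are the DID_OK and DID_TIME_OUT host byte codes and the convention
that the host byte sits in bits 16-23 of the result field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Host byte codes as defined in the kernel's SCSI headers. */
#define DID_OK       0x00
#define DID_TIME_OUT 0x03

/* Arbitrary bound for this sketch (assumption, not a kernel limit). */
#define SIM_MAX_CMDS 16

/* Hypothetical stand-in for struct scsi_cmnd; only "result" is modeled. */
struct sim_cmd {
	uint32_t result;
};

/* The host byte occupies bits 16-23 of the result field. */
unsigned int sim_host_byte(uint32_t result)
{
	return (result >> 16) & 0xff;
}

/* Midlayer side: a command timed out, so record that in its result. */
void sim_mark_timed_out(struct sim_cmd *cmd)
{
	cmd->result = (uint32_t)DID_TIME_OUT << 16;
}

/*
 * Driver side: the controller reports a successful completion.
 * Without the fix the driver leaves result untouched; with the fix it
 * writes 0 (success), mirroring the three added lines in the patch.
 */
void sim_complete_success(struct sim_cmd *cmd, bool with_fix)
{
	if (with_fix)
		cmd->result = 0;
}

/* Count commands whose result still reads "timed out" after recovery. */
size_t sim_count_unrecovered(const struct sim_cmd *cmds, size_t n)
{
	size_t stale = 0;

	for (size_t i = 0; i < n; i++)
		if (sim_host_byte(cmds[i].result) == DID_TIME_OUT)
			stale++;
	return stale;
}

/*
 * Run one simulated recovery cycle over n timed-out commands and report
 * whether the LUN would be taken offline under the rule described in the
 * commit message (more than two commands still marked "timed out").
 */
bool sim_lun_goes_offline(size_t n, bool with_fix)
{
	struct sim_cmd cmds[SIM_MAX_CMDS];

	assert(n <= SIM_MAX_CMDS);

	for (size_t i = 0; i < n; i++) {
		cmds[i].result = DID_OK;      /* initialized to success on submit */
		sim_mark_timed_out(&cmds[i]); /* 30-second timeout fires */
	}

	/* LUN reset: the controller waits for outstanding commands,
	 * which then complete successfully. */
	for (size_t i = 0; i < n; i++)
		sim_complete_success(&cmds[i], with_fix);

	return sim_count_unrecovered(cmds, n) > 2;
}
```

Calling sim_lun_goes_offline(4, false) models the pre-patch driver: all four
results keep the timed-out host byte and the simulated LUN goes offline.
With sim_lun_goes_offline(4, true) each completion resets the result to 0,
so no stale commands remain and the LUN stays online.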