From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Dave Carroll, Scott Teel, Kevin Barnett, Don Brace, "Martin K.
 Petersen", Sasha Levin
Subject: [PATCH 4.14 51/63] scsi: smartpqi: correct lun reset issues
Date: Thu, 24 Jan 2019 20:20:40 +0100
Message-Id: <20190124190201.473607160@linuxfoundation.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190124190155.176570028@linuxfoundation.org>
References: <20190124190155.176570028@linuxfoundation.org>
User-Agent: quilt/0.65
X-stable: review
X-Patchwork-Hint: ignore
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

[ Upstream commit 2ba55c9851d74eb015a554ef69ddf2ef061d5780 ]

Problem:
The Linux kernel takes a logical volume offline after a LUN reset.  This
is generally accompanied by this message in the dmesg output:

Device offlined - not ready after error recovery

Root Cause:
The root cause is a "quirk" in the timeout handling in the Linux SCSI
layer.  The Linux kernel places a 30-second timeout on most media access
commands (reads and writes) that it sends to device drivers.  When a
media access command times out, the Linux kernel goes into error
recovery mode for the LUN that was the target of the command that timed
out.  Every command that timed out is kept on a list inside of the Linux
kernel to be retried later.  The kernel attempts to recover the
command(s) that timed out by issuing a LUN reset followed by a TEST UNIT
READY.  If the LUN reset and TEST UNIT READY commands are successful,
the kernel retries the command(s) that timed out.

Each SCSI command issued by the kernel has a result field associated
with it.  This field indicates the final result of the command (success
or error).  When a command times out, the kernel places a value in this
result field indicating that the command timed out.
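
For readers unfamiliar with the result field described above: in the
Linux SCSI midlayer it is a 32-bit integer whose bits 16-23 hold a "host
byte", and DID_TIME_OUT (0x03) is the host-byte code used for a timeout.
The user-space sketch below only mirrors that layout for illustration.
The sketch_* helper names are hypothetical stand-ins, not the kernel's
own accessors.

```c
#include <stdint.h>

/* Host-byte status codes, with the values used by the Linux SCSI layer. */
#define DID_OK       0x00
#define DID_TIME_OUT 0x03

/* Hypothetical helper: extract the host byte (bits 16..23) of a result. */
static inline uint8_t sketch_host_byte(uint32_t result)
{
	return (uint8_t)((result >> 16) & 0xff);
}

/* Hypothetical helper: replace only the host byte, keeping the other bytes. */
static inline uint32_t sketch_set_host_byte(uint32_t result, uint8_t status)
{
	return (result & 0xff00ffffu) | ((uint32_t)status << 16);
}
```

A result of 0 therefore means success in every byte, which is why a
driver that stores 0 on successful completion clears any earlier
timed-out marking.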

The "quirk" is that after the LUN reset and TEST UNIT READY commands are
completed, the kernel checks each command on the timed-out command list
before retrying it.  If the result field is still "timed out", the
kernel treats that command as not having been successfully recovered for
a retry.  If the number of commands in this state is greater than two,
the kernel takes the LUN offline.

Fix:
When our RAIDStack receives a LUN reset, it simply waits until all
outstanding commands complete.  Generally, all of these outstanding
commands complete successfully.  Therefore, the fix in the smartpqi
driver is to always set the command result field to indicate success
when a request completes successfully.  This normally isn’t necessary
because the result field is always initialized to success when the
command is submitted to the driver.  So when the command completes
successfully, the result field is left untouched.  But in this case, the
kernel changes the result field behind the driver’s back and then
expects the field to be changed by the driver as the commands that timed
out complete.

Reviewed-by: Dave Carroll
Reviewed-by: Scott Teel
Signed-off-by: Kevin Barnett
Signed-off-by: Don Brace
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
---
 drivers/scsi/smartpqi/smartpqi_init.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 83bdbd84eb01..b662f58203ac 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -2709,6 +2709,9 @@ static unsigned int pqi_process_io_intr(struct pqi_ctrl_info *ctrl_info,
 		switch (response->header.iu_type) {
 		case PQI_RESPONSE_IU_RAID_PATH_IO_SUCCESS:
 		case PQI_RESPONSE_IU_AIO_PATH_IO_SUCCESS:
+			if (io_request->scmd)
+				io_request->scmd->result = 0;
+			/* fall through */
 		case PQI_RESPONSE_IU_GENERAL_MANAGEMENT:
 			break;
 		case PQI_RESPONSE_IU_TASK_MANAGEMENT:
-- 
2.19.1
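
The interaction described in the changelog can be modelled in user space.
The following is a simplified sketch under stated assumptions, not the
kernel's or the driver's actual code: struct sketch_cmd, the sketch_*
functions and the apply_fix flag are all hypothetical names introduced
for illustration.  It marks a batch of commands as timed out, lets the
"driver" complete them successfully with or without the fix, and counts
how many still carry the timed-out result.

```c
#include <stddef.h>

#define SKETCH_RESULT_OK        0
#define SKETCH_RESULT_TIMED_OUT (0x03 << 16)	/* DID_TIME_OUT in the host byte */
#define SKETCH_MAX_CMDS         8

/* Hypothetical, stripped-down stand-in for a SCSI command. */
struct sketch_cmd {
	int result;
};

/* Error-handling side: the timeout overwrites the result field. */
static void sketch_mark_timed_out(struct sketch_cmd *cmd)
{
	cmd->result = SKETCH_RESULT_TIMED_OUT;
}

/*
 * Driver side: a request completed successfully.  Without the fix the
 * result is left untouched; with the fix it is explicitly set to success.
 */
static void sketch_complete_success(struct sketch_cmd *cmd, int apply_fix)
{
	if (!cmd)
		return;
	if (apply_fix)
		cmd->result = SKETCH_RESULT_OK;
}

/* Count commands whose result still reads "timed out". */
static size_t sketch_count_unrecovered(const struct sketch_cmd *cmds, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (cmds[i].result == SKETCH_RESULT_TIMED_OUT)
			count++;
	return count;
}

/* Run the whole scenario for n commands (capped at SKETCH_MAX_CMDS). */
static size_t sketch_run(size_t n, int apply_fix)
{
	struct sketch_cmd cmds[SKETCH_MAX_CMDS];
	size_t i;

	if (n > SKETCH_MAX_CMDS)
		n = SKETCH_MAX_CMDS;

	for (i = 0; i < n; i++) {
		cmds[i].result = SKETCH_RESULT_OK;	/* initialized at submission */
		sketch_mark_timed_out(&cmds[i]);	/* changed behind the driver's back */
	}
	for (i = 0; i < n; i++)
		sketch_complete_success(&cmds[i], apply_fix);

	return sketch_count_unrecovered(cmds, n);
}
```

In this model, sketch_run(4, 0) leaves all four commands marked as timed
out, which per the changelog is enough for the kernel to take the LUN
offline, while sketch_run(4, 1) leaves none.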