Subject: Re: [GIT PULL] SCSI fixes for 4.13-rc6
From: Brian King
To: Bart Van Assche, "martin.petersen@oracle.com"
Cc: "torvalds@linux-foundation.org", "James.Bottomley@HansenPartnership.com", "linux-kernel@vger.kernel.org", "linux-scsi@vger.kernel.org", "akpm@linux-foundation.org"
Date: Wed, 23 Aug 2017 14:13:03 -0500
Message-Id: <715e97db-b110-bbc3-77a5-fa4d86610aa3@linux.vnet.ibm.com>
In-Reply-To: <1503503077.2484.5.camel@wdc.com>
References: <1503470559.2729.4.camel@HansenPartnership.com> <1503500560.2484.1.camel@wdc.com> <1503503077.2484.5.camel@wdc.com>

On 08/23/2017 10:44 AM, Bart Van Assche wrote:
> On Wed, 2017-08-23 at 11:27 -0400, Martin K. Petersen wrote:
>> However, what's more important is that we still need a good version of
>> your patch for 4.13. I took Brian's workaround for ipr but I still think
>> Christoph's concerns need to be addressed for me to put your change back
>> in.
>
> Hello Martin,
>
> I am not aware of any requests to modify the patch "scsi-mq: Always unprepare
> before requeuing a request". See also
> https://www.spinics.net/lists/linux-scsi/msg111541.html. Are you perhaps
> referring to another patch?
>
> What I remember is that my patch uncovered a bug in the ipr driver. As you
> mentioned, a workaround for that bug has already been queued for kernel v4.14
> (https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?h=4.13/scsi-fixes&id=723cd772fde2344a9810eeaf5106787d535ec4a4).

I don't completely agree with that statement. The patch "scsi-mq: Always
unprepare before requeuing a request" introduces a regression in the SCSI
stack. It alters the behavior of retries such that the retry counter no
longer works, and the jiffies_at_alloc check, which ensures we don't spend
a tremendous amount of time retrying ops, is broken as well.

As far as ipr is concerned, we do have the workaround in place now, and
I'll also queue up a further improvement to ipr to return a better failure
response. However, until we fix the retry counter in SCSI, any driver that
returns an error response that SCSI wants to retry could get us stuck in
an endless retry loop, like we were seeing with ipr.

> Further improvements for the SCSI core that are the result of the analysis of
> the behavior of the SCSI subsystem on PowerPC systems are under discussion.
> See also "[PATCHv2 1/2] scsi: Move scsi_cmd->jiffies_at_alloc initialization to
> allocation time" (https://marc.info/?l=linux-next&m=150335524812989) and
> "[PATCH 2/2] scsi: Preserve retry counter through scsi_prep_fn"
> (https://marc.info/?l=linux-scsi&m=150335371112485).

While my patches highlight the problem, I don't think they are the right
fix; they need to be reworked. It looks like we go through scsi_init_rq at
hctx setup time rather than for each new I/O submission. A simple kprobe
on scsi_init_rq never triggers when issuing I/O to already configured
devices. Therefore, we cannot simply move the initialization of
jiffies_at_alloc to scsi_init_rq. I've been working on moving the retry
counter and jiffies_at_alloc into struct scsi_request, as that doesn't get
reinitialized.

Thanks,

Brian

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center