Subject: Re: [PATCH v5 03/15] scsi: ufs: implement scsi host timeout handler
To: Yaniv Gardi, James.Bottomley@HansenPartnership.com
References: <1456666367-11418-1-git-send-email-ygardi@codeaurora.org> <1456666367-11418-4-git-send-email-ygardi@codeaurora.org>
Cc: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org, linux-arm-msm@vger.kernel.org, santoshsy@gmail.com, linux-scsi-owner@vger.kernel.org, Gilad Broner, Vinayak Holikatti, "James E.J. Bottomley", "Martin K. Petersen"
From: Hannes Reinecke
Message-ID: <56D544E6.8040005@suse.de>
Date: Tue, 1 Mar 2016 15:29:42 +0800
In-Reply-To: <1456666367-11418-4-git-send-email-ygardi@codeaurora.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/28/2016 09:32 PM, Yaniv Gardi wrote:
> A race condition exists between request requeueing and scsi layer
> error handling:
> When UFS driver queuecommand returns a busy status for a request,
> it will be requeued and its tag will be freed and set to -1.
> At the same time it is possible that the request will timeout and
> scsi layer will start error handling for it. The scsi layer reuses
> the request and its tag to send error related commands to the device,
> however its tag is no longer valid.

Hmm.
How can the host return a 'busy' status for a request?
From my understanding we have three possibilities:
1) queuecommand returns busy; however, that means that the command
   has never been sent and this issue shouldn't occur
2) The command returns with BUSY status. But in this case it has
   already been returned, so there cannot be any timeout coming in.
3) The host receives a command with a tag which is already in-use.
   However, that should have been prevented by the block-layer,
   which really should ensure that this situation never happens.

So either way I look at it, it really looks like a bug and adding a
timeout handler will just paper over it.
(Not that a timeout handler is a bad idea, in fact I'm convinced
that you need one. Just not for this purpose.)

So can you elaborate how this 'busy' status comes about?
Is the command sent to the device?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                zSeries & Storage
hare@suse.de                       +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)