Date: Thu, 4 Apr 2019 15:33:38 -0400 (EDT)
From: Alan Stern
To: Kento.A.Kobayashi@sony.com, "James E.J. Bottomley", "Martin K. Petersen"
cc: Oliver Neukum, USB Storage list, Kernel development list, SCSI development list, USB list
Subject: RE: [PATCH] usb: uas: fix usb subsystem hang after power off hub port

On Thu, 4 Apr 2019 Kento.A.Kobayashi@sony.com wrote:

> Hi,
>
> >> Root Cause
> >> - Block layer timeout happens after power off UAS USB device which is accessed as reproduce step. During timeout error handler process, scsi host state becomes SHOST_CANCEL_RECOVERY that causes IO hangs up and lock cannot be released. And in final, usb subsystem hangs up.
> >> Follow is function call:
> >> blk_mq_timeout_work
> >> …->scsi_times_out (… means some functions are not listed before this function.)
> >> …-> scsi_eh_scmd_add(scsi_host_set_state to SHOST_RECOVERY)
> >> … -> scsi_error_handler
> >> …-> uas_eh_device_reset_handler
> >> -> usb_lock_device_for_reset <- take lock
> >> -> usb_reset_device
> >> …-> rebind = uas_post_reset (return 1 since ENODEV)
> >> …-> usb_unbind_and_rebind_marked_interfaces (rebind=1)
> >> …-> uas_disconnect (scsi_host_set_state to SHOST_CANCEL_RECOVERY)
> >> … -> scsi_queue_rq
> >> -> scsi_host_queue_ready(return 0 causes IO hangs up.)
> >
> >How does scsi_queue_rq get called here? As far as I can see, this shouldn't happen.
>
> We confirmed the function call path on linux 4.9 when this problem occured since we are working on it. In linux 4.9, the last function is scsi_request_fn instead of scsi_queue_rq. In staging.git, we think the scsi_queue_rq is called by follow path.
> uas_disconnect
>  |- scsi_remove_host
>   |- scsi_forget_host
>    |- __scsi_remove_device
>     |- device_del
>      |- bus_remove_device
>       |- device_release_driver
>        |- device_release_driver_internal
>         |- __device_release_driver
>          |- drv->remove(dev) (sd_remove)
>           |- sd_shutdown
>            |- sd_sync_cache
>             |- scsi_execute
>              ... (unnecessary internal details elided)
>              |- blk_mq_dispatch_rq_list
>               |- q->mq_ops->queue_rq (scsi_queue_rq)

So it looks as though the SCSI subsystem doesn't like to have a reset handler call scsi_remove_host. Commands dispatched by the removal routines are forced to wait for the reset recovery to finish, which won't happen until those commands have been completed.

Is this a bug in the SCSI core? If not, we need to know what is the right way to do things when a reset handler detects that the SCSI host has been hot-unplugged.

James, Martin, any suggestions?

Alan Stern
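
For illustration only, the circular wait described in this message can be modeled in user space. The sketch below is not kernel code: every `toy_` name is hypothetical, the three states are simplified stand-ins for the SCSI host states named in the trace, and the only behavior assumed is that no command is dispatched while recovery is in progress. It shows that a command issued by host removal from inside the reset handler can never be dispatched, because the host cannot leave the recovery state until the handler returns.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the host states named in the trace. */
enum toy_host_state { TOY_RUNNING, TOY_RECOVERY, TOY_CANCEL_RECOVERY };

struct toy_host {
    enum toy_host_state state;
    int pending; /* commands queued but not yet dispatched */
};

/* Assumption: the queue is ready only when no recovery is in progress. */
static bool toy_host_queue_ready(const struct toy_host *h)
{
    return h->state == TOY_RUNNING;
}

/* Try to dispatch one queued command; returns true if one was dispatched. */
static bool toy_dispatch(struct toy_host *h)
{
    if (h->pending == 0 || !toy_host_queue_ready(h))
        return false;
    h->pending--;
    return true;
}

/*
 * Models a reset handler that ends up removing the host.
 * Returns true if the command issued by the removal could be dispatched.
 */
static bool toy_reset_handler_removes_host(struct toy_host *h)
{
    h->state = TOY_RECOVERY;        /* error handler has started */
    h->state = TOY_CANCEL_RECOVERY; /* disconnect detected during reset */
    h->pending++;                   /* removal queues a cache-sync command */

    /*
     * The real code would block here until the command completes.
     * The model only reports whether it could ever be dispatched.
     */
    return toy_dispatch(h);
}
```

In this model the call returns false and leaves one command pending, which corresponds to the hang reported in the thread; the same command dispatches normally once the host is back in the running state.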