Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp8465897imu; Tue, 4 Dec 2018 08:48:05 -0800 (PST) X-Google-Smtp-Source: AFSGD/UGtGKGXvs7hGpaevj7FfgwWPZGK514rDBVC0NHscdOetU5Hgr9hkgTbghwNFs58e4zaR0X X-Received: by 2002:a17:902:1101:: with SMTP id d1mr15278800pla.136.1543942085171; Tue, 04 Dec 2018 08:48:05 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1543942085; cv=none; d=google.com; s=arc-20160816; b=Yqn7YbBTbtJWw9xSQujZXe/UF40mRINcwovjkH2Z6d5lLc6/Xs6GIE4w011Sdc9InQ Uxf9DeUIr+2oSngIHzJW16G0m2cSS6SjJwB/sLFp/RAXcRbkZqcfkFflJmVMGiGAQ67a aPava69HJuY9oHHf4HR9yGJt+q7KmNB4eOAAIO5Hd3aKKrYHW4ETnZY9fOyAF3nJCa5a mUtAXUYYSvO1U6qSO8MS3Uebp5+cRQAytd8AnR4k4O3m/N2sKu8vaHVl+IDxD6QQI+7z na4fS+UfkxqYmMExxsDW+WPDR/26X5DXdq8S8vJVzwgCIQIcO9sTEPcv+zR5Xw93Uvme WISw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:message-id:in-reply-to :subject:cc:to:from:date; bh=/6Qik9kYIKGmsH1l7TcCg9FMjrzl47F8/2fzpzdjJ+g=; b=ml9PNV5tloQi7ikewh4i8X186VE72BiKG35K8tfOlXIbXQlMGXigTuNqb7vPq6o+LD Di+mPU04tD6ZHsQ45Un2QYLfUwzgrpjzLf19g/oBLVfMF0jdwEBlxMb7Jud7QRMf20P6 gwnQ+8TffJS7VbVrAxR6ld2vNNf8r05Gn3qs5B19VzZ/ptR/5z/mpeCh3KFIui3r+/M4 BkKR2dScYnaliRFxPTV42O7KxW7FIsd88IoLP/fLfFN2MgFYnKtngk69ry8AOsXzwkEn CkWYjCuV5yY1ZHi+fkszKz0+9ulwsnx3K5lVBJzBScST6av8TZM+oHl3yDEpc5AWoeg3 pPtg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 102si17457779plc.277.2018.12.04.08.47.49; Tue, 04 Dec 2018 08:48:05 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727189AbeLDQqj (ORCPT + 99 others); Tue, 4 Dec 2018 11:46:39 -0500 Received: from iolanthe.rowland.org ([192.131.102.54]:46636 "HELO iolanthe.rowland.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1727169AbeLDQqi (ORCPT ); Tue, 4 Dec 2018 11:46:38 -0500 Received: (qmail 5537 invoked by uid 2102); 4 Dec 2018 11:46:37 -0500 Received: from localhost (sendmail-bs@127.0.0.1) by localhost with SMTP; 4 Dec 2018 11:46:37 -0500 Date: Tue, 4 Dec 2018 11:46:37 -0500 (EST) From: Alan Stern X-X-Sender: stern@iolanthe.rowland.org To: Anurag Kumar Vulisha cc: Felipe Balbi , Greg Kroah-Hartman , Shuah Khan , Johan Hovold , Jaejoong Kim , Benjamin Herrenschmidt , Roger Quadros , Manu Gautam , "martin.petersen@oracle.com" , Bart Van Assche , Mike Christie , Matthew Wilcox , Colin Ian King , "linux-usb@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "v.anuragkumar@gmail.com" , Thinh Nguyen , Tejas Joglekar , Ajay Yugalkishore Pandey Subject: RE: [PATCH v7 01/10] usb: gadget: udc: Add timer support for usb requests In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 4 Dec 2018, Anurag Kumar Vulisha wrote: > >> Yes the host can cancel the transfer. This issue originated from the endpoints using > >bulk > >> streaming protocol and may not occur with normal endpoints. AFAIK bulk streaming > >is > >> gadget driven, where the gadget is allowed to select which stream id transfer the > >host > >> should work on . Since the host doesn't have control on when the transfer would be > >> selected by gadget, it may wait for longer timeouts before cancelling the transfer. > > > >You're missing the point. Although the device selects which stream ID > >gets transferred, the _host_ decides whether a stream transfer should > >occur in the first place. No matter how many ERDY packets the device > >controller tries to send, no transfer will occur until the host wants > >to do it. > > > >In this sense, stream transfers (like all other USB interactions except > >wakeup requests) are host-driven. > > > > I agree with you that without host willing to transfer, the transfer wouldn't have > started and also agree the host always has the capability of detecting the hang. But > we are not sure on what host platform and operating system the gadget would be > tested and also not sure whether the host software is capable of handling the deadlock. > Even if the host has a timer , it would be very long and waiting for the timer to get > timedout would reduce the performance. In this patch we are giving the user the > flexibility to opt the timeout value, so that the deadlock can be avoided without > effecting the performance. So I think adding the timer in gadget is more easier solution > than handling in host. I have seen this workaround working for both linux & windows. Is there any way for the device controller driver to detect the problem without relying on a timeout? In fact, is this problem a known bug in the existing device controller hardware? Have the controller manufacturers published a suggested scheme for working around it? > >> >> Since the gadget > >> >> controller driver is aware that the controller is stuck , makes it responsible > >> >> to recover the controller from hang condition by restarting the transfer (which > >> >> triggers the controller FSM to issue ERDY to host). > >> > > >> >Isn't there a cleaner way to recover than by cancelling the request and > >> >resubmitting it? > >> > > >> > >> dequeuing the request issues the stop transfer command to the controller, which > >> cleans all the hardware resource allocated for that endpoint. This also resets the > >> hardware FSMs for that endpoint . So, when re-queuing of the transfer happens > >> the controller starts allocating hardware resources again, thus avoiding the > >probability > >> of entering into the issue. I am not sure of other controllers, but for dwc3, issuing > >> the stop transfer is the only way to handle this issue. > > > >Again you're missing the point. Can't the controller driver issue the > >Stop Transfer command but still consider the request to be in progress > >(i.e., don't dequeue the request) so that the gadget driver's > >completion callback isn't invoked and the request does not need to be > >explicitly resubmitted? > > > > Yes , what you are saying is correct. My initial patches were following the > same approach that you have suggested. We switched to the dequeue > approach because there can be different gadgets which may enter into > this issue and it would be better to follow a generic approach rather than > having every controller driver implementing their own workaround. > Please find the conservation in this link https://patchwork.kernel.org/patch/10640145/ At this point, it seems that the generic approach will be messier than having every controller driver implement its own fix. At least, that's how it appears to me. > >> @Felipe: Can you please provide your suggestion on this. > > > >> >How can the gadget driver know what timeout to use? The host is > >> >allowed to be as slow as it wants; the gadget driver doesn't have any > >> >way to tell when the host wants to start the transfer. > >> > >> Yes , I agree with you that the timeout may vary from usage to usage. This timeout > >> should be decided by the class driver which queues the request. As discussed above > >> this issue was observed in streaming protocol , which is very much faster than > >normal > >> BOT modes and it works on super speed . > > > >Although USB mass storage is currently the only user of the stream > >protocol, that may not be true in the future. You should think in more > >general terms. A timeout which is appropriate for mass storage may not > >be appropriate for some other application. > > > > You are correct , this timeout may not be ideal. Since I was not able to reproduce this issue > on non-stream capable transfers , at present the only user of usb_ep_queue_timeout() API > is the streaming gadget. As we are not aware of the optimal timeout value, instead of > hardcoding the timeout value we can add module_param() for it. So that the user can configure > timeout based on their use case. Please let me know your suggestion on this. Ideally it would not be necessary to rely on a timeout at all. Also, maintainers dislike module parameters. It would be better not to add one. Alan Stern