Received: by 2002:a25:7ec1:0:0:0:0:0 with SMTP id z184csp737804ybc; Tue, 12 Nov 2019 08:27:10 -0800 (PST) X-Google-Smtp-Source: APXvYqzlQtltTFkPPwaUwN5a6f4haRiBIdn2EYiGAI42FMPA6pavfwXEPc4XPOGGu/g1gKvrT+jN X-Received: by 2002:a17:906:505:: with SMTP id j5mr29186394eja.167.1573576030739; Tue, 12 Nov 2019 08:27:10 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1573576030; cv=none; d=google.com; s=arc-20160816; b=LAbHeVuLt3RTMyOBCs0Tcricg/Pxdmya8pF55UHoELTyHmCS/b3jRqSAjardNQAsuP iP+5tZmYeMggduS8dlwGHIirLwKd2I/CSuQM6FBDcYJu1gtlL6oybeFCZ1JTZBitMWBw Rpkc9N2zlab+GIh6JxlFCq5oa5LjBqdO6RnoBaQvoyEpn3qE6K7vVAiTQn2+7zt8tEA+ IOcPYQ1CBFk12PPPAhJe8b5cGsk0139LPoI/QNIuZh8YdBBtubKpyc8iSCQTbeYUFhAa ZBqdMn1SLntdtUhqs59oaasJPPNq+BaclHOQzQsuV9vKCiitS/viqPsTlv53g8cDt8l7 XW3w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:user-agent:references :in-reply-to:subject:cc:to:from:message-id:date; bh=1Pa8boLy78njo+7s2SxWbpsUnbmVbLyPCM0OND07ksM=; b=ojKbanTHN0nAtSHcNll8HyoQbCRNFXTi8tR7wUBdfo9X51lL6oPvf2AoDj2vTTv15W idn0G9Y8/eToZKwXHDP3Fb4R9sTjBGiEtgVzEcqGjxyVVpYyOiHQklyrmbcKCeA3M8Ak l7LEJ34eHVaSqPFZc6ZIIzgR2tfMaKD2eggSPhu3uTP2r/S0cvlqgoIUdTUPMZVCGfdK k1n1vyPqp+HNiBBm0twuMi/uVi+03snO++HSLNcTjZVSVKnBqXR/Tbe8z3AgIVBXbZmU MOl4ZVR0+7Cy5onT8hmn6rGI4mSZ2hhXCYoDHntv0rjLPy9jXuazg0CaPal2VtEl9pIe bRXA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id os28si310709ejb.53.2019.11.12.08.26.46; Tue, 12 Nov 2019 08:27:10 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727329AbfKLQZP (ORCPT + 99 others); Tue, 12 Nov 2019 11:25:15 -0500 Received: from mx2.suse.de ([195.135.220.15]:46272 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726953AbfKLQZP (ORCPT ); Tue, 12 Nov 2019 11:25:15 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 4BA0CAC24; Tue, 12 Nov 2019 16:25:13 +0000 (UTC) Date: Tue, 12 Nov 2019 17:25:13 +0100 Message-ID: From: Takashi Iwai To: Henry Lin Cc: Jaroslav Kysela , Greg Kroah-Hartman , Allison Randal , "Thomas Gleixner" , Richard Fontana , "alsa-devel@alsa-project.org" , "linux-kernel@vger.kernel.org" Subject: Re: [PATCH] usb-audio: not submit urb for stopped endpoint In-Reply-To: References: <20191112065108.7766-1-henryl@nvidia.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI/1.14.6 (Maruoka) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 Emacs/25.3 (x86_64-suse-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI 1.14.6 - "Maruoka") Content-Type: text/plain; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 12 Nov 2019 17:13:37 +0100, Henry Lin wrote: > > >On Tue, 12 Nov 2019 07:51:06 +0100, > >Henry Lin wrote: > >> > >> While output urb's snd_complete_urb() is executing, calling > >> prepare_outbound_urb() may cause endpoint stopped before > >> prepare_outbound_urb() returns and result in next urb submitted > >> to stopped endpoint. usb-audio driver cannot re-use it afterwards as > >> the urb is still hold by usb stack. > >> > >> This change checks EP_FLAG_RUNNING flag after prepare_outbound_urb() again > >> to let snd_complete_urb() know the endpoint already stopped and does not > >> submit next urb. > > >OK, this part looks good and understandable. > > > >> We observed two scenario have this issue: > >> 1. While executing snd_complete_urb() to complete an output urb, calling > >> prepare_outbound_urb() let deactive_urbs() get called to unlink all > >> active urbs. > >> > >> [ 268.097066] [] deactivate_urbs+0xd4/0x108 > >> [ 268.102633] [] snd_usb_endpoint_stop+0x30/0x58 > >> [ 268.108636] [] snd_usb_substream_playback_trigger+0xa4/0xf4 > >> [ 268.115765] [] snd_pcm_do_stop+0x4c/0x58 > >> [ 268.121245] [] snd_pcm_action_single+0x40/0x88 > >> [ 268.127245] [] snd_pcm_action+0x30/0xf0 > >> [ 268.132632] [] snd_pcm_stop+0x24/0x2c > >> [ 268.137851] [] xrun+0x60/0x6c > >> [ 268.142374] [] snd_pcm_update_state+0xa8/0x10c > >> [ 268.148374] [] snd_pcm_update_hw_ptr0+0x328/0x344 > >> [ 268.154635] [] snd_pcm_period_elapsed+0x98/0xb0 > >> [ 268.160723] [] prepare_playback_urb+0x46c/0x488 > >> [ 268.166810] [] prepare_outbound_urb+0x60/0x1d4 > >> [ 268.172805] [] snd_complete_urb+0x244/0x264 > >> [ 268.178548] [] __usb_hcd_giveback_urb+0x94/0x104 > >> [ 268.184721] [] usb_hcd_giveback_urb+0x3c/0x114 > >> [ 268.190724] [] handle_tx_event+0x1304/0x1434 > >> [ 268.196552] [] xhci_handle_event+0x5dc/0x788 > >> [ 268.202378] [] xhci_irq+0x178/0x280 > >> > >> 2. Userspace application stops playback from sound subsystem with below > >> call stack: > >> > >> [ 28.506477] CPU: 5 PID: 1274 Comm: AudioOut_25 Not tainted 4.4.38-tegra #31 > >> [ 28.513430] Hardware name: quill (DT) > >> [ 28.517085] Call trace: > >> [ 28.519531] [] dump_backtrace+0x0/0xf8 > >> [ 28.524837] [] show_stack+0x14/0x1c > >> [ 28.529885] [] dump_stack+0xac/0xe0 > >> [ 28.534931] [] deactivate_urbs+0x148/0x180 > >> [ 28.540578] [] snd_usb_endpoint_stop+0x30/0x58 > >> [ 28.546571] [] snd_usb_substream_playback_trigger+0xa4/0xf4 > >> [ 28.553699] [] snd_pcm_do_stop+0x4c/0x58 > >> [ 28.559179] [] snd_pcm_action_single+0x40/0x88 > >> [ 28.565178] [] snd_pcm_action+0x30/0xf0 > >> [ 28.570568] [] snd_pcm_drop+0xac/0x140 > >> [ 28.575873] [] snd_pcm_release_substream+0x28/0xb0 > >> [ 28.582212] [] snd_pcm_release+0x3c/0x98 > >> [ 28.587686] [] __fput+0xe0/0x1ac > >> [ 28.592469] [] ____fput+0xc/0x14 > >> [ 28.597253] [] task_work_run+0xa0/0xc0 > >> [ 28.602558] [] do_notify_resume+0x48/0x60 > >> [ 28.608123] [] work_pending+0x1c/0x20 > >> > >> In the call path, snd_pcm_stream spinlock has been acquired in > >> snd_pcm_drop(). If an output urb is completed between the spinlock > >> acquired and deactivate_urbs() clears EP_FLAG_RUNNING for the endpoint, > >> its executing of snd_complete_urb() will be blocked for acquiring > >> snd_pcm_stream spinlock in snd_pcm_period_elapsed() until the lock is > >> released in snd_pcm_drop(). When snd_complete_urb() continues, all jobs > >> for deactivate_urbs() are finished. > > >... but this part is unclear to me. Do you mean that we have a > >deadlock in these two concurrent calls without your patch? > Above describes two different cases that endpoint is stopped before prepare_outbound_urb() returns in details. Listed two call stacks belong to different cases. Without this patch, both two cases will result in below error messages afterwards: > > [ 213.153103] usb 1-2: timeout: still 1 active urbs on EP #1 > [ 213.164121] usb 1-2: cannot submit urb 0, error -16: unknown error Ah, I see. Then you don't have to write up the full stack trace like the above. It's confusing as if it were some kernel Oops or WARNING. BTW, with your patch, is the error in the first line ("timeout: still 1 active urbs on EP #1") also fixed? thanks, Takashi