Subject: Re: [syzbot] INFO: rcu detected stall in tx
To: Thinh Nguyen, Alan Stern
Cc: Mathias Nyman, Guido Kiener, dave penkler, Dmitry Vyukov, syzbot,
 Greg Kroah-Hartman, "lee.jones@linaro.org", USB list, "bp@alien8.de",
 "dwmw@amazon.co.uk", "hpa@zytor.com", "linux-kernel@vger.kernel.org",
 "luto@kernel.org", "mingo@redhat.com", "syzkaller-bugs@googlegroups.com",
 "tglx@linutronix.de", "x86@kernel.org"
References: <20210519173545.GA1173157@rowland.harvard.edu>
 <12088413-2f7d-a1e5-5e8a-25876d85d18a@synopsys.com>
 <20210520020117.GA1186755@rowland.harvard.edu>
 <74b2133b-2f77-c86f-4c8b-1189332617d3@synopsys.com>
 <37c41d87-6e30-1557-7991-0b7bca615be1@linux.intel.com>
 <20210524185520.GA1332625@rowland.harvard.edu>
 <354a16cb-ba96-aa6f-7f10-388e6201e56d@synopsys.com>
From: Mathias Nyman
Date: Tue, 25 May 2021 01:16:54 +0300
In-Reply-To: <354a16cb-ba96-aa6f-7f10-388e6201e56d@synopsys.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 24.5.2021 22.23, Thinh Nguyen wrote:
> Alan Stern wrote:
>> On Mon, May 24, 2021 at 06:18:59PM +0300, Mathias Nyman wrote:
>>> On 20.5.2021 23.30, Thinh Nguyen wrote:
>>>> As for the xhci driver, there may be a case where the stream URB never
>>>> gets to complete because the transaction err_count is not properly
>>>> updated. The err_count for transaction error is stored in ep_ring, but
>>>> the xhci driver may not be able to look up the correct ep_ring based on
>>>> TRB address for streams. There are cases for streams where the event
>>>> TRBs have their TRB pointer field cleared to '0' (xhci spec section
>>>> 4.12.2). If the xhci driver doesn't see ep_ring for transaction error,
>>>> it automatically does a soft-retry. This is seen from one of our
>>>> testings that the driver was repeatedly doing soft-retry until the class
>>>> driver timed out.
>>>>
>>>> Hi Mathias, maybe you have some comment on this? Thanks.
>>>
>>> This is true, if TRB pointer is 0 then there is no retry limit for soft retry.
>>> We should add one and prevent a loop. After a few soft resets we can end with a
>>> hard reset to clear the host side endpoint halt.
>>>
>>> We don't know the URB that was being transferred during the error, and can't
>>> give it back with a proper error code.
>>> In that sense we still end up waiting for a timeout and someone to cancel
>>> the urb.
>>
>> That's not good. There may not be a timeout; drivers expect transfers
>> to complete with a failure, not to be retried indefinitely.
>>
>> However, if you do know which endpoint/stream the error is connected to,
>> you should be able to get the URB. It will be the first one queued for
>> that endpoint/stream.
>>
>
> When the xhci can't recover a transfer with soft-retry, no outstanding
> transfer can proceed/complete for the endpoint. If the TRB pointer is 0,
> we just don't know which stream or endpoint ring it's for, but we know
> all the outstanding URBs of an endpoint. We may as well return an
> error status for all of them after a limited number of soft-retries.

We get the endpoint, but not the stream.

I guess we could walk through each stream of this endpoint, and return the
first URB of every stream that has a pending URB.

The xHCI spec claims to support 65533 streams per endpoint, but in real life
UAS probably only uses a few per endpoint?

-Mathias
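
To make the idea above concrete, here is a minimal user-space sketch in C
of a bounded soft-retry counter, followed by giving back the first pending
URB of every stream once the limit is hit. This is not xhci driver code.
Every name in it (sketch_urb, sketch_stream, sketch_ep,
SKETCH_MAX_SOFT_RETRY, SKETCH_NUM_STREAMS) is hypothetical, and the retry
limit of 3 and the -EPROTO status are assumptions made for illustration.
Real code would have to walk the endpoint's actual stream rings with proper
locking and give URBs back through the normal completion path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_SOFT_RETRY 3	/* hypothetical retry limit */
#define SKETCH_NUM_STREAMS    4	/* hypothetical; real endpoints may have far more */

/* Hypothetical stand-ins, not the real struct urb or xhci structures. */
struct sketch_urb {
	int status;			/* 0 while pending, negative errno once given back */
	struct sketch_urb *next;	/* next URB queued on the same stream */
};

struct sketch_stream {
	struct sketch_urb *head;	/* first (oldest) pending URB, or NULL */
};

struct sketch_ep {
	unsigned int soft_retries;	/* per endpoint, because the stream is unknown */
	struct sketch_stream streams[SKETCH_NUM_STREAMS];
};

enum sketch_action { SKETCH_SOFT_RETRY, SKETCH_HARD_RESET };

/* Give back the first pending URB of every stream; returns how many. */
static int sketch_fail_first_urbs(struct sketch_ep *ep, int status)
{
	int given_back = 0;

	for (size_t i = 0; i < SKETCH_NUM_STREAMS; i++) {
		struct sketch_urb *urb = ep->streams[i].head;

		if (!urb)
			continue;	/* nothing pending on this stream */
		ep->streams[i].head = urb->next;	/* unlink from the queue */
		urb->next = NULL;
		urb->status = status;
		given_back++;
	}
	return given_back;
}

/* Transaction error whose event TRB pointer is 0, so the stream is unknown. */
static enum sketch_action sketch_handle_tx_error(struct sketch_ep *ep)
{
	assert(ep != NULL);

	if (ep->soft_retries < SKETCH_MAX_SOFT_RETRY) {
		ep->soft_retries++;
		return SKETCH_SOFT_RETRY;
	}

	/* Retry budget used up: fail the head URBs, then request a hard reset. */
	ep->soft_retries = 0;
	sketch_fail_first_urbs(ep, -EPROTO);
	return SKETCH_HARD_RESET;
}
```

In this sketch the counter lives in the endpoint rather than in a ring,
since with a zero TRB pointer only the endpoint is known. The fourth
consecutive error on an endpoint is the one that fails the head URBs and
asks the caller for a hard reset.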