Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932199AbcDKMx6 (ORCPT ); Mon, 11 Apr 2016 08:53:58 -0400 Received: from mga02.intel.com ([134.134.136.20]:25458 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754384AbcDKMx4 (ORCPT ); Mon, 11 Apr 2016 08:53:56 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.24,462,1455004800"; d="asc'?scan'208";a="684363658" From: Felipe Balbi To: Roger Quadros , John Youn Cc: "nsekhar\@ti.com" , "linux-usb\@vger.kernel.org" , "linux-kernel\@vger.kernel.org" , "Shankar\, Abhishek" , "B\, Ravi" Subject: Re: [PATCH 1/2] usb: dwc3: core: Introduce dwc3_device_reinit() In-Reply-To: <570B9A84.1020607@ti.com> References: <1458133551-3071-1-git-send-email-rogerq@ti.com> <1458133551-3071-2-git-send-email-rogerq@ti.com> <87r3fah8h3.fsf@intel.com> <87lh5ih6hs.fsf@intel.com> <56EC5452.8030509@synopsys.com> <56F03FD7.30202@synopsys.com> <874mbzyq1l.fsf@intel.com> <56F3376E.8050305@synopsys.com> <87d1qkxtbb.fsf@intel.com> <56FC1E92.60403@synopsys.com> <56FD2E13.2010903@ti.com> <87r3eqg1us.fsf@intel.com> <57021CB9.3090803@ti.com> <87inzx7q0l.fsf@intel.com> <570B9A84.1020607@ti.com> User-Agent: Notmuch/0.21+96~g9bbc54b (http://notmuchmail.org) Emacs/25.0.90.3 (x86_64-pc-linux-gnu) Date: Mon, 11 Apr 2016 15:51:38 +0300 Message-ID: <8737qsnw9x.fsf@intel.com> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha1; protocol="application/pgp-signature" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6260 Lines: 155 --=-=-= Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Hi, Roger Quadros writes: >> I don't have this text anywhere so I don't know. Is this something TI >> came up with or Synopsys ? Unless I can see a document (preferrably from >> Synopsys) stating this, I can't really accept $subject. > > OK. I'll try to find out if there is an official document about this. > >>=20 >> Another question is: if all it takes is an extra SoftReset, why don't we >> just reset it during probe() if max_speed < SUPER and we're running on >> rev < 2.20a ? BTW, which revision of the IP is on AM57x/DRA7x ? > > The issue might happen on any Run/Stop transition so not sure if doing it > SoftReset just at probe fixes anything. > > On DRA7x it is rev 2.02a. oh, same block as OMAP5 ES2.0 :-( >>>> question is, then: How are you sure that resetting the device actually >>>> solves the issue ? Did you really hit the metastability problem and >>>> noted that it works after a soft-reset ? How did you verify that >>> >>> I don't know if it solves the issue or not. It was suggested by >>> Synopsis to TI's silicon team. >>=20 >> now that's a bummer ;-) >>=20 >>> I never hit the metastability problem detection condition in my >>> overnight tests (i.e. LTDB_LINK_STATE !=3D 4). >>=20 >> overnight is not enough. You need to keep this running for weeks. > > how many weeks is acceptable for you? I can run for that long, no problem. > And what if the issue doesn't happen in that time frame, would you still > consider this case? Well, there's always the possibility we have never triggered the issue to start with :-) What happens if you remove the the current workaround, set maximum-speed to high-speed and constantly toggle run/stop bit (there's a soft-connect file under the UDC's directory in sysfs). Can you ever cause the problems ? >>>> Run/Stop was in a metastable state, considering that Run/Stop signal is >>>> not visible outside the die ? >>> >>> LTDB_LINK_STATE !=3D 4 within 100ms or RUNSTOP set is the condition to >>> detect that the issue occurred. >>=20 >> this doesn't prove anything. This just means that your 100ms timer >> expired. Unless you can verify that Run/Stop is in metastability, you >> cannot be sure this workaround works. >>=20 >> Did anybody run silicon simulation to verify this case ? It's really the >> only way to be sure. > > AFAIK this wasn't reproducible during silicon simulation either. now this is a big problem. We just don't know if $subject is really avoiding the problem ;-) Unless we can trigger the problem, we can't be sure. We are, however, sure that current workaround avoids the problem completely. >>>> It seems to me that resetting the IP is just as "dangerous" as setting >>>> the IP to High-speed in the first place. No ? >>> >>> The soft-reset is just a recovery mechanism if that error is ever hit. >>=20 >> but you don't know if that's a *proper* recovery mechanism because you >> never even *hit* the error. >>=20 >>> Putting the controller in reset state means it is in a known >>> state. Why do you say it would be dangerous? >>=20 >> Because you can't predict the systems' behavior. If the flip-flop didn't >> have time to settle into 0 or 1 state, you don't know what the >> combinatorial part of the IP will do with that bogus value. It's truly >> unpredictable. You also cannot know, for sure, that a SoftReset will be >> enough to bring that flip-flop out of metastability. > > I'm not an expert in this area and can only follow the advice the > Silicon team gives. fair enough. But you must understand we can't just accept anything even if we never trigger we problems. Unless we're certain about the fix, without a shadow of a doubt, we might be creating a very, very hard to debug regression which might end up with sales drop and what not. It's the kinda thing that we all must be concerned about ;-) >>> The original workaround i.e. setting the High-speed instance to >>> Super-speed to avoid this errata is causing another side >>> effect. i.e. erratic interrupts happen and more than 2 seconds delay >>=20 >> this should have been an expected side-effect when you design a >> SuperSpeed controller without a SuperSpeed PHY and don't properly >> terminate inputs. What you have is a floating PIPE3 interface not >> properly terminated and capturing random noise (basically acting as a >> very poor antenna inside your die). Of course the IP will go bonkers and >> give you "erratic error" interrupts. It has no idea what the hell this >> "PHY" on the PIPE3 interface is doing. > > We know that. The damage is already done. :) right, and I'm trying to avoid further damage caused by a fix that hasn't been properly validated :) >>> to enumerations. This problem is more serious and so we have to keep >>> the controller in High-speed and tackle the meta-stability condition >>> if it happens. >>=20 >> you have to tackle the meta-stability, sure, but we need guarantee that >> $subject IS indeed tackling that problem. Unless there's proof that this >> really solves the meta-stability issue, I won't take $subject. Sorry >> dude, but I don't want regressions elsewhere because of a badly >> validated patch. > > I understand. I will see if someone from TI can provide me official > documentation about the workaround. thank you =2D-=20 balbi --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBAgAGBQJXC53bAAoJEIaOsuA1yqREg1wP/iu7X9mQ7MB/MhFyvqsaLZOi z3WSi1PIkMPS28PcbkFAyK+G7rYP5swwvgpypYB3NB93+mzFiALGynagwmr4OmsK wpfmcedxPNRrLGaFvbLwee2GvPNQbiMTnlMby7Vbus0KkbnwjbF2U5fgbPJ6AkQM IcAPvNcrQtZgom7qQ2mEurTE4kKDR+Z+jYlKjnRl2VM1DbM6rA1k2EWEg5GzDae4 Aubx4wj4zHQ6j7V+B6KeKj0cNsRVaAyIHS8c2AZR6bIp5iZ+OHjignQH1l5EKjXl 47Lv6TDg1IihIlSqQQoc1jBYo/Q2pVrPLstjAZx5SJrqnshsZHPqZZ5She1PEzmb i+gVhOugEbaDZAa0dqBzOCME1r4lNngWmhtBf/80rYOrwZzDt7NU1AnvOUdZw9q9 EnqKJWV6m1CBkW0BmIjGXI3i32uGAaCBmk2yi5ki9BeGI0pW0QwvvoTzel6fXggx 3qVK1x5IGLkb2mpAQ3f6UZJu3zhw0Q91yXWRGAass8FONL2FNhwkDa6fsgOEld0b 03unzbOmRMmNPYQ3F0QN6NIFFo67PZ47c8jZCKqXlNyJ23KRwXlSf6pEsL3fdWhk 063ikSQxypcVTZb5y3HJKBDXUdntp8M1OO4jqP/8QHFND+OxAlugyceLNfGF/Ubz bsWtI8YZqBuKi6aDTGJU =K5SZ -----END PGP SIGNATURE----- --=-=-=--