Subject: Re: [PATCH v3] platform/chrome: Use proper protocol transfer function
To: Shawn N
CC: Olof Johansson, Benson Leung, Lee Jones, Doug Anderson, Brian Norris,
 Gwendal Grignou, Enric Balletbo, Tomeu Vizoso, linux-tegra@vger.kernel.org
References: <20170908205011.77986-1-briannorris@chromium.org>
 <02aa65e7-e967-055b-2af3-2e9b6ef77935@nvidia.com>
 <20170919171401.GA10968@google.com>
 <20170920061317.GB13616@google.com>
From: Jon Hunter
Message-ID: <6beb7b8c-f602-112b-73b2-22188dcea28e@nvidia.com>
Date: Tue, 10 Oct 2017 14:35:55 +0100
X-Mailing-List: linux-kernel@vger.kernel.org

On 26/09/17 00:15, Shawn N wrote:
> On Wed, Sep 20, 2017 at 1:22 PM, Shawn N wrote:
>> On Tue, Sep 19, 2017 at 11:13 PM, Brian Norris wrote:
>>> Hi,
>>>
>>> On Tue, Sep 19, 2017 at 11:05:38PM -0700, Shawn N wrote:
>>>> This is failing because our EC_CMD_GET_PROTOCOL_INFO host command is
>>>> getting messed up, or the reply buffer is getting corrupted somehow.
>>>>
>>>> ec_dev->proto_version =
>>>>         min(EC_HOST_REQUEST_VERSION,
>>>>             fls(proto_info->protocol_versions) - 1);
>>>>
>>
>> Checking this closer, the first host command we send after we boot the
>> kernel (EC_CMD_GET_PROTOCOL_INFO) is failing due to protocol error
>> (see 'SPI rx bad data' / 'SPI not ready' on the EC console). Since
>> this doesn't seem to happen on the Chromium OS nyan_big release
>> kernel, I suggest to hook up a logic analyzer and see if the SPI
>> master is doing something bad.
>>
>> The error handling in cros_ec_cmd_xfer_spi() is completely wrong and
>> we return -EAGAIN / EC_RES_IN_PROGRESS, which the caller interprets
>> "the host command was received by the EC and is currently being
>> handled, poll status until completion".
>> So the caller polls status
>> with EC_CMD_GET_COMMS_STATUS, sees no host command is in progress
>> (which is interpreted to mean "the host command I sent previously has
>> now successfully completed"), and returns success. The problem here is
>> that the initial host command was never received at all, and no reply
>> was ever received, so our reply data is all zero.
>>
>> Two things need to be fixed here:
>>
>> 1) Find out why the first host command after boot is failing. Probe
>> SPI pins and see what's going on.
>> 2) Fix error handling so we properly return an error (or properly
>> retry the entire command) when a protocol error occurs (I made some
>> attempt in https://chromium-review.googlesource.com/385080/, probably
>> I should revisit that).
>
> The below patch will fix error handling and will make things mostly
> work on nyan_big, because we'll fall back to V2 protocol after the
> initial failure. But we should still investigate why we're getting
> errors on the first host command. We aren't seeing these errors when
> we send commands from firmware, so I suspect something is wrong in
> kernel SPI HW initialization that causes the first command to fail.

I have been looking into this a bit more and it appears to be timing
related. I found that enabling some debug output in the Tegra SPI driver
makes the problem go away, and it seems that adding a delay before
sending the SPI message also works around the problem.

Looking back at the Tegra Linux test history [0], it appears that after
v4.10-rc5 [1] I start to see the following message, which is the first
indication of some SPI issues ...

 cros-ec-spi spi32766.0: packet too long (249 bytes, expected 4)

I attempted to bisect this, but I was not successful because each time
I ended up somewhere different. I found that even with v4.9 I may see
the issue 1 in 20 times, and so I realised that I am not even sure when
the problem really started or if it has always been there.
It seems that for older kernels it is harder to reproduce. I am
wondering now if some timing has changed somewhere, causing us to see
the problem more frequently with newer kernels?

I see that the cros-ec binding defines the following ...

"google,cros-ec-spi-pre-delay: Some implementations of the EC need a
 little time to wake up from sleep before they can receive SPI transfers
 at a high clock rate. This property specifies the delay, in usecs,
 between the assertion of the CS to the start of the first clock pulse."

I found that adding the following also worked around the problem ...

diff --git a/arch/arm/boot/dts/tegra124-nyan.dtsi b/arch/arm/boot/dts/tegra124-nyan.dtsi
index 5cf987b5401e..0baa6bfc0f36 100644
--- a/arch/arm/boot/dts/tegra124-nyan.dtsi
+++ b/arch/arm/boot/dts/tegra124-nyan.dtsi
@@ -317,6 +317,7 @@
                interrupts = ;
                reg = <0>;

+               google,cros-ec-spi-pre-delay = <10>;
                google,cros-ec-spi-msg-delay = <2000>;

                i2c-tunnel {

I have tried 50 boots with the above and I have seen no SPI failures on
boot.

I did look to see if it is possible to probe the SPI signals with a
scope, but from the schematics I am not sure if they are accessible or
buried in the PCB.

Is it possible that Tegra is sending the SPI message too soon for the
EC?

Cheers
Jon

[0] https://nvtb.github.io/linux/
[1] https://nvtb.github.io/linux/test_v4.10-rc5/20170123023102/boot/tegra124-nyan-big/tegra124-nyan-big/tegra_defconfig_log.txt

--
nvpublic
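For reference, the min()/fls() expression quoted at the top of the thread can be tried in userspace to see what an all-zero reply does to the negotiated version. This is only a rough sketch, not the driver code: fls_sketch() is a stand-in for the kernel's fls(), pick_proto_version() and reply_is_plausible() are hypothetical helper names, and the value 3 for EC_HOST_REQUEST_VERSION is assumed here and should be checked against ec_commands.h.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed value for illustration; check ec_commands.h in your tree. */
#define EC_HOST_REQUEST_VERSION 3

/*
 * Userspace stand-in for the kernel's fls(): returns the 1-based index
 * of the most significant set bit, or 0 when no bit is set.
 */
int fls_sketch(uint32_t x)
{
	int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/*
 * Hypothetical helper mirroring the quoted expression:
 *   min(EC_HOST_REQUEST_VERSION, fls(protocol_versions) - 1)
 */
int pick_proto_version(uint32_t protocol_versions)
{
	int highest = fls_sketch(protocol_versions) - 1;

	return highest < EC_HOST_REQUEST_VERSION ? highest
						 : EC_HOST_REQUEST_VERSION;
}

/*
 * Hypothetical sanity check: a reply that advertises no protocol
 * version at all is more likely a failed transfer than a real answer.
 */
int reply_is_plausible(uint32_t protocol_versions)
{
	return protocol_versions != 0;
}
```

With a zeroed reply buffer, pick_proto_version(0) evaluates to -1 because fls_sketch(0) is 0, whereas a mask with bit 3 set gives 3. A check along the lines of reply_is_plausible() would let a caller treat the all-zero case as a transfer failure instead of a valid answer.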