Received: by 2002:a25:1985:0:0:0:0:0 with SMTP id 127csp300160ybz; Fri, 24 Apr 2020 00:12:15 -0700 (PDT) X-Google-Smtp-Source: APiQypLWdprbD6Tnnkixwq2McvJrk887FUOg6XYAR8IukCI/qibS+Pi5iumkr5SGG07vY2ztujyt X-Received: by 2002:a50:dac9:: with SMTP id s9mr6017479edj.313.1587712334916; Fri, 24 Apr 2020 00:12:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1587712334; cv=none; d=google.com; s=arc-20160816; b=jiRsFQwKccbLxTdgaGKOvTY+9E5/gtJGK0/XgaEH6iTkfixYHTa6yhzXRgvzBUFdCQ 3Ap+WtvcP0ibrpg5Skx+Z39vgwZmWvLFZsOMGq8686eODdiiOxgwrzCoD00emIeGZ3df wEqkDRaJl+qqH9pQYdXvLC5vaL9x2UM2i5IYF+jcW1yVCaQUaWilYYsyy6iFsxqDAd6w 4DYYW0VkiB7Y7SDbtloV01TigVDs7A9btvLdbpXEek71tPiV+tZoPabjX6quErGP2Kcg 8rvsZsOcpaGq6faDeQzDIGvqzyOeYmcvZFiTrZSvFrYUXAg72dJQOIzqlrj6cVivLdJD DwGQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:dkim-signature:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=Vue2wSpWKstn9zp8TZY8tWYSPrb8qcAXPds/zrgMX3A=; b=QIkuSbYSqh3+CGVQvkGcfz5e1QChHxEOnoFNPnbO3kbdFi5s/6u4J72r78SRWgTOE3 vEodowF+RaAuUQvaDHbszRW5bPXn96zkjoLWZnRkg3fz09BOy6YqZIHTP8exIIdx8zZc QJmMXRkpxDqYMu4o+w5IHsA7T+3Zb5X5OfOHhOqsyKiTwjUGbKdYEL5leGCUJJxrtwDE T3IsO+peZYH/a4PivDO2KyZ3ydYhfDRUblGkEhC36jcboNlib525kPdEEp34OWdZzDDK BkpFrnKe+weL02RiAZgj9/3Y0/c+htGyz2mnmHFaB37X5+AwnXFUOoOO/UG1XeI1sZcb TFtw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@nvidia.com header.s=n1 header.b=rUq9ByqW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=nvidia.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id ss10si493973ejb.103.2020.04.24.00.11.51; Fri, 24 Apr 2020 00:12:14 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@nvidia.com header.s=n1 header.b=rUq9ByqW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=nvidia.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726523AbgDXHKd (ORCPT + 99 others); Fri, 24 Apr 2020 03:10:33 -0400 Received: from hqnvemgate24.nvidia.com ([216.228.121.143]:4799 "EHLO hqnvemgate24.nvidia.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726008AbgDXHKc (ORCPT ); Fri, 24 Apr 2020 03:10:32 -0400 Received: from hqpgpgate102.nvidia.com (Not Verified[216.228.121.13]) by hqnvemgate24.nvidia.com (using TLS: TLSv1.2, DES-CBC3-SHA) id ; Fri, 24 Apr 2020 00:08:34 -0700 Received: from hqmail.nvidia.com ([172.20.161.6]) by hqpgpgate102.nvidia.com (PGP Universal service); Fri, 24 Apr 2020 00:10:32 -0700 X-PGP-Universal: processed; by hqpgpgate102.nvidia.com on Fri, 24 Apr 2020 00:10:32 -0700 Received: from DRHQMAIL107.nvidia.com (10.27.9.16) by HQMAIL107.nvidia.com (172.20.187.13) with Microsoft SMTP Server (TLS) id 15.0.1473.3; Fri, 24 Apr 2020 07:10:32 +0000 Received: from [10.26.73.231] (10.124.1.5) by DRHQMAIL107.nvidia.com (10.27.9.16) with Microsoft SMTP Server (TLS) id 15.0.1473.3; Fri, 24 Apr 2020 07:10:29 +0000 Subject: Re: [PATCH v2 1/2] i2c: tegra: Better handle case where CPU0 is busy for a long time To: Dmitry Osipenko , Thierry Reding , Laxman Dewangan , "Wolfram Sang" , Manikanta Maddireddy , Vidya Sagar CC: , , References: <20200324191217.1829-1-digetx@gmail.com> <20200324191217.1829-2-digetx@gmail.com> <1e259e22-c300-663a-e537-18d854e0f478@nvidia.com> <8cd085e1-f9fd-6ec0-9f7a-d5463f176a63@nvidia.com> <6f07e5c8-7916-7ea2-2fe7-d05f8f011471@nvidia.com> <77a31b2f-f525-ba9e-f1ae-2b474465bde4@gmail.com> <470b4de4-e98a-1bdc-049e-6259ad603507@nvidia.com> From: Jon Hunter Message-ID: Date: Fri, 24 Apr 2020 08:10:26 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0 MIME-Version: 1.0 In-Reply-To: X-Originating-IP: [10.124.1.5] X-ClientProxiedBy: HQMAIL111.nvidia.com (172.20.187.18) To DRHQMAIL107.nvidia.com (10.27.9.16) Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: quoted-printable DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nvidia.com; s=n1; t=1587712115; bh=Vue2wSpWKstn9zp8TZY8tWYSPrb8qcAXPds/zrgMX3A=; h=X-PGP-Universal:Subject:To:CC:References:From:Message-ID:Date: User-Agent:MIME-Version:In-Reply-To:X-Originating-IP: X-ClientProxiedBy:Content-Type:Content-Language: Content-Transfer-Encoding; b=rUq9ByqWHrbIT8vZKHE3BWZXjB7b/gBRr9zQrjvZc6KYIgHNLUZWIWIp01BDAX6VD ot9pXd7OpP74fC1qjrPZQrjcsC2Bu0QujWBGBeY/KCHlgL+4hFcn+Pauyxh9SGZig5 p8/5VDhBtSVXRJ0TIHMasHcxP0c+acBVrPXlfSBxm3d0Hi2gxym9yLU7JiT+i1DsCQ eA/QIGbGEHRv40J5ATRCR5hmVAC4g3WWk8OSVG9fgbFjb74ajawnfCQCAdBrps19wl okCE32wqG1rSqme1L3OCiePIxMT3wQ7P7IpnMPuqoik3JF6U4wlhSWNK9brKvJaqsK Y/yRHxEP8RIqQ== Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 23/04/2020 17:33, Dmitry Osipenko wrote: > 23.04.2020 13:56, Jon Hunter =D0=BF=D0=B8=D1=88=D0=B5=D1=82: >> >> On 22/04/2020 15:07, Dmitry Osipenko wrote: >> >> ... >> >>>> So I think that part of the problem already existed prior to these >>>> patches. Without your patches I see ... >>>> >>>> [ 59.543528] tegra-i2c 7000d000.i2c: i2c transfer timed out >>> >>> Does this I2C timeout happen with my patches? Could you please post ful= l >>> logs of an older and the recent kernel versions? >> >> I believe that it does, but I need to check. >> >>>> [ 59.549036] vdd_sata,avdd_plle: failed to disable >>>> >>>> [ 59.553778] Failed to disable avdd-plle: -110 >>>> >>>> [ 59.558150] tegra-pcie 3000.pcie: failed to disable regulators: -11= 0 >>>> >>>> >>>> However, now with your patches the i2c is failing to resume and this >>>> breaks system suspend completely for Tegra30. So far this is the only >>>> board that appears to break. >>>> >>>> So the above issue needs to be fixed and I will chat to Thierry about = this. >>> >>> Okay >> >> So I have been looking at this some more and this starting to all look a >> bit of a mess :-( >> >> Firstly, the Tegra PCI driver suspends late (noirq) as we know. The PCI >> driver will warn if it cannot disable the regulators when suspending but >> does not actually fail suspend. So this warning is just indicating that >> we were unable to disable the regulators. >> >> Now I don't see that we can ever disable the PCI regulators currently >> when entering suspend because ... >> >> 1. We are in the noirq phase and so we will not get the completion >> interrupt for the I2C transfer. I know that you recently added the >> atomic I2C transfer support, but we can get the regulator framework >> to use this (I have not looked in much detail so far). >=20 > That's not good :) I didn't realize that *all* interrupts of every > device are disabled before .noirq is invoked. It appeared to me that the > IRQs disabling and .noirq invocation is performed for the drivers one > after another, but now I see that it's not true. >=20 > https://elixir.bootlin.com/linux/v5.7-rc2/source/drivers/base/power/main.= c#L1446 >=20 >> 2. Even if the regulator framework supported atomic I2C transfers, we >> also have the problem that the I2C controller could be runtime- >> suspended and pm_runtime_get_sync() will not work during during this >> phase to resume it correctly. This is a problem that needs to be >> fixed indeed! >=20 > Could you please clarify why pm_runtime_get_sync() can't be used by the > I2C driver's in NOIRQ phase? Yes take a look at commit 1e2ef05bb8cf ("PM: Limit race conditions between runtime PM and system sleep (v2)"). >> 3. Then we also have the possible dependency on the DMA driver that is >> suspended during the noirq phase. >=20 > Yes, this is correct. >=20 > Again, some regulator drivers may do something on suspend too, although > looks like the current upstream Tegra devices are not affected by this > potential problem. >=20 >> It could be argued that if the PCI regulators are never turned off >> (currently) then we should not even bother and this will likely resolve >> this for now. However, really we should try to fix that correctly. >=20 > Yes, keeping PCI regulators always-enabled should be a good immediate > solution. I was thinking about that, and I am not sure it is. I don't think that the failure to send the I2C command should break suspend. > Also, the RPM's system suspend/resume needs to fixed for the pci-tegra > driver, like I already suggested before. Yes but I don't think that is the cause of the issue in this particular case. >> What I still don't understand is why your patch breaks resume. Even if >> the I2C transfer fails, and is deemed harmless by the client driver, we >> should still be able to suspend and resume correctly. >=20 > If DMA is getting synchronized after DMA driver being suspended, then it > could be a problem. So I confirmed that DMA is not the issue in this case. I tested this by ensuring that DMA is never used. However, it is a potential problem indeed. > Could you please try to apply this hunk and see if it makes any > difference (I'll probably make it as proper patch): Per my tests, I don't believe that it will as disabling DMA does not resolve the problem. > It also could be that there is more than the suspend ordering problem, > but for now it is hard to tell without having a detailed log which > includes I2C/DMA/RPM traces. I have taken a look and I don't see any issues with ordering. I2C is suspended after PCI. This did not change. Cheers Jon --=20 nvpublic