Subject: Re: Regression: spi: core: avoid waking pump thread from spi_sync instead run teardown delayed
From: Jon Hunter
To: Martin Sperl
CC: Mark Brown, linux-tegra, Linux Kernel Mailing List
Date: Tue, 15 Jan 2019 14:26:02 +0000
In-Reply-To: <7C4A5EFC-8235-40C8-96E1-E6020529DF72@martin.sperl.org>

Hi Martin,

On 14/01/2019 22:01, Martin Sperl wrote:
> Hi Jon,
>
> On 14.01.2019, at 16:35, Jon Hunter wrote:
>
>> Hi Martin, Mark,
>>
>> [   58.222033] spi_master spi1: could not stop message queue
>> [   58.222038] spi_master spi1: queue stop failed
>> [   58.222048] dpm_run_callback(): platform_pm_suspend+0x0/0x54 returns -16
>> [   58.222052] PM: Device 7000da00.spi failed to suspend: error -16
>> [   58.222057] PM: Some devices failed to suspend, or early wake event detected
>
> Unfortunately I have not been able to reproduce this in
> my test cases with the hw available to me.

Looking at both boards that fail, tegra30-cardhu-a04 and
tegra124-jetson-tk1, they both have a spi-flash.
The compatible strings for the spi flashes are "winbond,w25q32" and
"winbond,w25q32dw", respectively, which interestingly are not
documented/used anywhere in the kernel. It appears that there was a
patch to fix this a few years back, but it never got applied [0].
However, applying this patch does not fix the issue. Furthermore,
without this patch applied I see that the spi flash is detected fine ...

[    2.540395] m25p80 spi1.0: w25q32dw (4096 Kbytes)

So this is not related, but the main point is that the failure occurs
with a spi flash device.

> Looks as if there is something missing in spi_stop_queue that
> would wake the worker thread one last time without any delays
> and finish the hw shutdown immediately - it runs as a delayed
> task...
>
> One question: do you run any spi transfers in
> your test case before suspend?

No, and before suspending I dumped some of the spi stats and I see no
transfers/messages at all ...

Stats for spi1 ...
Bytes: 0
Errors: 0
Messages: 0
Transfers: 0

> /sys/class/spi_master/spi1/statistics/messages gives some
> counters on the number of spi messages processed which
> would give you an indication if that is happening.
>
> It could be as easy as adding right after the first lock
> in spi_stop_queue:
> kthread_mod_delayed_work(&ctlr->kworker,
>  &ctlr->pump_idle_teardown, 0);
> (plus maybe a yield or similar to allow the worker to
> quickly/reliably run on a single core machine)
>
> I hope that this initial guess helps.

Unfortunately, the above did not help and the issue persists. Digging a
bit deeper I see that now the 'ctlr->queue' is empty but the 'ctlr->busy'
flag is set, and this is causing the 'could not stop message queue' error.

It seems that __spi_pump_messages() is getting called several times
during boot when registering the spi-flash, then after the spi-flash has
been registered, about 1 sec later spi_pump_idle_teardown() is called
(as expected), but exits because 'ctlr->running' is true.
However, spi_pump_idle_teardown() is never called again and when we
suspend we are stuck in the busy/running state. In this case should
something be scheduling spi_pump_idle_teardown() again? Although even if
it does I don't see where the busy flag would be cleared in this path?

Cheers
Jon

[0] https://patchwork.kernel.org/patch/7021961/

-- 
nvpublic
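
For anyone following the thread, the stuck state described above can be
illustrated with a small userspace model. This is only a sketch under
stated assumptions, not the kernel code: 'struct model_ctlr',
model_idle_teardown() and model_stop_queue() are hypothetical names, the
three flags stand in for 'ctlr->queue', 'ctlr->busy' and 'ctlr->running',
and the two functions encode only the behaviour reported in this mail
(teardown exits early while running is set; the stop path refuses to
stop while the queue is non-empty or busy is set).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few controller fields discussed in
 * this thread; this is NOT the kernel's struct spi_controller. */
struct model_ctlr {
    size_t queued;   /* number of queued messages (models ctlr->queue) */
    bool busy;       /* models ctlr->busy */
    bool running;    /* models ctlr->running */
    bool torn_down;  /* set when the modelled teardown actually runs */
};

/* Modelled idle teardown: as reported above, it exits without doing
 * anything while 'running' is set. It never touches 'busy' in this
 * model, because the report does not identify where that would happen. */
bool model_idle_teardown(struct model_ctlr *c)
{
    assert(c != NULL);
    if (c->running)
        return false;        /* early exit seen in the report */
    c->torn_down = true;
    return true;
}

/* Modelled stop check: stopping succeeds only when the queue is empty
 * and the controller is not busy; otherwise it reports -EBUSY. */
int model_stop_queue(struct model_ctlr *c)
{
    assert(c != NULL);
    if (c->queued != 0 || c->busy)
        return -EBUSY;       /* the "could not stop message queue" case */
    c->running = false;
    return 0;
}
```

Starting from queued = 0, busy = true, running = true (the state seen
at suspend), model_idle_teardown() returns false and leaves busy set, so
model_stop_queue() returns -EBUSY, which on Linux is the -16 shown in
the log. Only once something clears busy does the stop succeed.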