Received: by 10.223.185.116 with SMTP id b49csp1031284wrg; Fri, 16 Feb 2018 11:08:20 -0800 (PST) X-Google-Smtp-Source: AH8x2270dZBvYwVsqe36FbL+yMdX7wL5Ii4bVLx1Nz+Z6qOD+PyWHlglyFgYzYv+BNbrm8snjcMh X-Received: by 10.101.101.142 with SMTP id u14mr6052437pgv.429.1518808100193; Fri, 16 Feb 2018 11:08:20 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518808100; cv=none; d=google.com; s=arc-20160816; b=yXncuaVzCRIQ+QgH+7JAaBFUuv02YoKkYYs9c+cW88OTIfG0W5Pzi/BJTjZQOwnIcE J4NJYQAN3u/073uH71ZJ4kP+aa9rWVL8YZKXFRsZ8Aowp+yZEoIzpKBWe+n7a4Ap3XzK GfzdIeZ2bgX7MKZ/mflrC7+dRjKhwyQXHuUSTZtdqEawd0d6SLaqtSYP5rUrLLZdADlj mE72hZ3+YDidJg3mrbtYxmPu2ye3yrcgf9M/z/HTH8iOm+KId6LqQC+buiHuePBQRtqV juqxIV7UQl0PUxDzGp2S2XTgiBrp3UANaTYIFqBC59ZI56tl4CngQbNF4NT/ks1Ar4ZS bmow== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:arc-authentication-results; bh=wqhJ9MPDV7iQEKI/1Whkx3KwN2ytTEozPzEByrb0g7E=; b=za1SwtKrr5KyvwE/jN17ukGlw0R4zxA+ylR338fW8h2jEV2yXUMkqQ4LM/SBIM/qS1 LmIJgHAoQ/6gABEWwHmbfycN0s/j5cZarMMH+LBal5pkGdgswnhU4dKRQ2a3Bap9jH9r x0imIMH85TNQlbaf6sybW0H9keEo43JfAQ+MoK07YAyoDrUIa5JlARWqz+W9cEw3QRAy j5FI8n7y8quDtLSgBuFS9PnAwetq+ylIMfaT9xm4GmYNPUNJyasRwHoF9EjPI+WAYraO EACgtS5rizmz3SR8n3WD66bd5wtcoj5X5pv4kxyqMMTqoNV7aVeLy4iPlpel39zI6RcZ ZUbw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 91-v6si92621pld.283.2018.02.16.11.08.05; Fri, 16 Feb 2018 11:08:20 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S968010AbeBPM7o (ORCPT + 99 others); Fri, 16 Feb 2018 07:59:44 -0500 Received: from mail-wm0-f67.google.com ([74.125.82.67]:37492 "EHLO mail-wm0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S967661AbeBPM7l (ORCPT ); Fri, 16 Feb 2018 07:59:41 -0500 Received: by mail-wm0-f67.google.com with SMTP id v71so3008968wmv.2; Fri, 16 Feb 2018 04:59:40 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=wqhJ9MPDV7iQEKI/1Whkx3KwN2ytTEozPzEByrb0g7E=; b=uSvUIW5KkaDHcLIposphKe2nBWq2SF2TZqZ7UQOurA1lhcGZhti7AMLtZO906Z7q5P h43ZG1149jsStFBGNnGvDEmhpC+pRByc+hduQ5om6jbPgpi53awB3S2z9B2l9dQZClxe U7ItIPeXGXck6SseRQomOvsukYOstqbWo7m6vJsM2giIAylpAxSIM+0ExmxMwzJjs/I5 xnJ69B9pf3QJiGFLy3K1HAArOyT1RP4PjP0aXOzsYeP/EqbFnQnnQg+bqIuumAP/RZMa +mrFPreUbYQLKcfXevEujEPtMfOSCg+gw1cNDMw6Kp8sgv30UAHCTAoIRNd2wOk8p7/8 vTGw== X-Gm-Message-State: APf1xPCST7NUnOu5ubMeeF82zSHI3VCzZUol8A6cPxA7jBHITLedxu6E Jvyc7FNYUZ/PmIEYMf3H2+KwXr3Y X-Received: by 10.80.150.193 with SMTP id z1mr7988559eda.175.1518785980027; Fri, 16 Feb 2018 04:59:40 -0800 (PST) Received: from mail-wm0-f41.google.com (mail-wm0-f41.google.com. [74.125.82.41]) by smtp.gmail.com with ESMTPSA id a63sm9909558ede.89.2018.02.16.04.59.38 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 16 Feb 2018 04:59:39 -0800 (PST) Received: by mail-wm0-f41.google.com with SMTP id t74so3029024wme.3; Fri, 16 Feb 2018 04:59:38 -0800 (PST) X-Received: by 10.28.51.12 with SMTP id z12mr4644297wmz.16.1518785978220; Fri, 16 Feb 2018 04:59:38 -0800 (PST) MIME-Version: 1.0 Received: by 10.223.134.148 with HTTP; Fri, 16 Feb 2018 04:59:17 -0800 (PST) In-Reply-To: <0af79515-ed4f-a4c6-6358-2053e8fd1745@gmail.com> References: <20180214135612.21356-1-embed3d@gmail.com> <20180215141154.monvwx25awwetqrc@flea.lan> <00ece3d8-52ac-c51d-1867-97f7fd84dadf@gmail.com> <0af79515-ed4f-a4c6-6358-2053e8fd1745@gmail.com> From: Chen-Yu Tsai Date: Fri, 16 Feb 2018 20:59:17 +0800 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH v3] rtc: ac100: Fix ac100 determine rate bug To: Philipp Rossak Cc: Maxime Ripard , Alessandro Zummo , alexandre.belloni@bootlin.com, linux-kernel , linux-sunxi , linux-rtc@vger.kernel.org, linux-clk , Mike Turquette , sboyd@kernel.org, linux-arm-kernel Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Feb 16, 2018 at 8:49 PM, Philipp Rossak wrote: > > > On 16.02.2018 05:10, Chen-Yu Tsai wrote: >> >> On Fri, Feb 16, 2018 at 1:53 AM, Philipp Rossak wrote: >>> >>> >>> >>> On 15.02.2018 15:11, Maxime Ripard wrote: >>>> >>>> >>>> On Wed, Feb 14, 2018 at 02:56:12PM +0100, Philipp Rossak wrote: >>>>> >>>>> >>>>> This patch fixes a bug, that prevents the Allwinner A83T and the A80 >>>>> from a successful boot. >>>>> >>>>> The bug is there since v4.16-rc1 and appeared after the clk branch was >>>>> merged. >>>> >>>> >>>> >>>> Out of curiosity, which patch has introduced this? I couldn't find any >>>> obvious match. >>>> >>> >>> I wasn't also n >> >> >> To be honest, I'm not sure why this is hitting you and not me. >> I have both A83T boards that have assigned-clock-rates set for >> the ac100 clock outputs for WiFi. I have them running 4.16-rc1 >> and have not seen this. The device tree patches that add these >> are in 4.15. >> > > Now it is getting curious ... . > I already mentioned that bug in the sunxi-irc and someone else was hitting > that problem also... > I tested it also with the same toolchain you are using (gcc 7.3.0-1 Debian), > but that didn't made any difference. > > I don't think that issue is related with the Hardware, but to be on the save > side: Which Hardware version of the BPI-M3 do you have? I have version 1.2. > > Can someone else can confirm this bug? So I might have remembered wrong, as I just realized I have your patch in my a83t branches. I don't hit this on the A80, which also has the AC100, but doesn't use assigned-clock-rates in the device tree. Could you try rolling back to 4.15 and see if you still hit it? > > >>> >>>>> You can find the shortend trace below: >>>>> >>>>> Unable to handle kernel NULL pointer dereference at virtual address >>>>> 00000000 >>>>> pgd = (ptrval) >>>>> [00000000] *pgd=00000000 >>>>> Internal error: Oops: 5 [#1] SMP ARM >>>>> Modules linked in: >>>>> CPU: 0 PID: 49 Comm: kworker/0:1 Not tainted 4.15.0-10190-gb89e32ccd1be >>>>> #2 >>>>> Hardware name: Allwinner sun8i Family >>>>> Workqueue: events deferred_probe_work_func >>>>> PC is at clk_hw_get_rate+0x0/0x34 >>>>> LR is at ac100_clkout_determine_rate+0x48/0x19c >>>>> >>>>> [ ... ] >>>>> >>>>> (clk_hw_get_rate) from (ac100_clkout_determine_rate+0x48/0x19c) >>>>> (ac100_clkout_determine_rate) from >>>>> (clk_core_set_rate_nolock+0x3c/0x1a0) >>>>> (clk_core_set_rate_nolock) from (clk_set_rate+0x30/0x88) >>>>> (clk_set_rate) from (of_clk_set_defaults+0x200/0x364) >>>>> (of_clk_set_defaults) from (platform_drv_probe+0x18/0xb0) >>>>> >>>>> To fix that bug, we first check if the return of the >>>>> clk_hw_get_parent_by_index is non zero. If it is zero we skip that >>>>> clock parent. >>>>> >>>>> The BUG report could be found here: https://lkml.org/lkml/2018/2/10/198 >>>>> >>>>> Fixes: 04940631b8d2 ("rtc: ac100: Add clk output support") >>>>> >>>>> Signed-off-by: Philipp Rossak >>>>> --- >>>>> >>>>> Changes in v3: >>>>> * add information when the bug appeared >>>>> * make the comment more clear >>>>> Changes in v2: >>>>> * add tag Fixes: ... to commit message >>>>> * add comment to if statement why we are doing this check >>>>> >>>>> drivers/rtc/rtc-ac100.c | 19 ++++++++++++++++++- >>>>> 1 file changed, 18 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/drivers/rtc/rtc-ac100.c b/drivers/rtc/rtc-ac100.c >>>>> index 8ff9dc3fe5bf..2412aa2e8399 100644 >>>>> --- a/drivers/rtc/rtc-ac100.c >>>>> +++ b/drivers/rtc/rtc-ac100.c >>>>> @@ -183,7 +183,24 @@ static int ac100_clkout_determine_rate(struct >>>>> clk_hw >>>>> *hw, >>>>> for (i = 0; i < num_parents; i++) { >>>>> struct clk_hw *parent = clk_hw_get_parent_by_index(hw, >>>>> i); >>>>> - unsigned long tmp, prate = clk_hw_get_rate(parent); >>>>> + unsigned long tmp, prate; >>>>> + >>>>> + /* >>>>> + * The clock has two parents, one is a fixed clock >>>>> which >>>>> is >>>>> + * internally registered by the ac100 driver. The other >>>>> parent >>>>> + * is a clock from the codec side of the chip, which we >>>>> + * properly declare and reference in the devicetree and >>>>> is >>>>> + * not implemented in any driver right now. >>>>> + * If the clock core looks for the parent of that >>>>> second >>>>> + * missing clock, it can't one that is registered and >>>>> + * returns NULL. >>>>> + * Thus we need to check if the parent exists before >>>>> + * we get the parent rate. >>>>> + */ >>>>> + if (!parent) >>>>> + continue; >>>> >>>> >>>> >>>> I'm sorry, but I still don't get it. When you register that clock, you >>>> will give it two parents. Why would that change during the life of the >>>> clock? >>>> >>>> This really looks like a workaround rather than an actual fix. >>>> >>>> Maxime >>>> >>> I agree this is more a workaround! >>> A proper solution/fix would be to define the devicetree correct like >>> this: >>> >>> diff --git a/arch/arm/boot/dts/sun8i-a83t-bananapi-m3.dts >>> b/arch/arm/boot/dts/sun8i-a83t-bananapi-m3.dts >>> index 6550bf0e594b..6f56d429f17e 100644 >>> --- a/arch/arm/boot/dts/sun8i-a83t-bananapi-m3.dts >>> +++ b/arch/arm/boot/dts/sun8i-a83t-bananapi-m3.dts >>> @@ -175,11 +175,18 @@ >>> compatible = "x-powers,ac100-rtc"; >>> interrupt-parent = <&r_intc>; >>> interrupts = <0 IRQ_TYPE_LEVEL_LOW>; >>> - clocks = <&ac100_codec>; >>> + clocks = <&ac100_rtc_32k>; >>> #clock-cells = <1>; >>> clock-output-names = "cko1_rtc", >>> "cko2_rtc", >>> "cko3_rtc"; >>> + >>> + ac100_rtc_32k: rtc-32k-oscillator { >>> + compatible = "fixed-clock"; >>> + #clock-cells = <0>; >>> + clock-frequency = <32768>; >>> + clock-output-names = "ac100-rtc-32k"; >>> + }; >>> }; >>> }; >>> }; >>> >>> What do you think about that solution? >> >> >> That's not quite right either. As I mentioned before, the >> RTC block has two clock inputs, one 4MHz signal from the >> codec block, and one 32.768 kHz signal from an external >> crystal. The original device tree binding describes the >> first one, and the 32.768 kHz clock was registered by >> the RTC driver internally. >> >> If you're going to add the crystal clock, you still need >> to keep the codec one. Note that this does not fix what >> Maxime is asking you. I've already provided an explanation: >> >> The clock core allows registering clocks with not-yet-existing >> clock parents. Parents are matches by string names. If no >> clock by that name is registered yet, the clock core simply >> orphans the new clock if the unregistered parent is its >> current parent or simply ignores that parent if its not the >> current parent. This is entirely valid and is what we are >> counting on here, as we haven't implemented the codec-side >> driver. >> >> Note we also sort of depend on this behavior for sunxi-ng >> clocks, where in some cases we get circular dependencies >> between clock blocks. >> >> BTW, since this is mostly related to the clk subsystem, >> please CC the clk maintainers as well. >> > > Ok, I added also now the clk mailing list and maintainers. > As reference this here is the boot log [1]. > > Im not sure if this in relation to this bug: > But, already with the 4.15 kernel I get from time to time some oops during > runtime on the a31s (bpi-m2) and all are clock related and causes a crash of > the system. Are you referring to the A31 CLK_OUT bug? That is an entirely separate bug in the A31/A31s CCU driver. The bug was there from the beginning, but did not cause a crash as the incompatible data structures had lined up in a way that the embedded data structure that was used by the incorrect clk ops was actually compatible. This changed with 4.16-rc1, as a new field was added to some of the clock types, breaking the alignment. I made some changes but have yet to write the commit log for it. Also I don't have any A31/A31s boards to test it on right now. I can send you the patch later. ChenYu > > Philipp > > > > [1]: https://pastebin.com/5c7hxjsS