The WinLink E850-96 board boot failed with Linux next-20240506 but there
is no kernel crash log on the serial [1].
Anders bisection results pointing to this commit,
# first bad commit:
[4d11c62ca8d77cb1f79054844b598e0f4e92dabe]
clkdev: report over-sized strings when creating clkdev entrie
After reverting the above patch the boot test passed [2].
Reported-by: Linux Kernel Functional Testing <[email protected]>
Links:
[1] - https://lkft.validation.linaro.org/scheduler/job/7546260
[2] - https://lkft.validation.linaro.org/scheduler/job/7548064
Failed information:
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240506/testrun/23814147/suite/boot/test/gcc-13-lkftconfig/details/
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240506/testrun/23815165/suite/boot/test/gcc-13-lkftconfig/history/
--
Linaro LKFT
https://lkft.linaro.org
On Tue, May 7, 2024, at 09:20, Naresh Kamboju wrote:
> The WinLink E850-96 board boot failed with Linux next-20240506 but there
> is no kernel crash log on the serial [1].
>
> Anders bisection results pointing to this commit,
> # first bad commit:
> [4d11c62ca8d77cb1f79054844b598e0f4e92dabe]
> clkdev: report over-sized strings when creating clkdev entrie
>
> After reverting the above patch the boot test passed [2].
>
> Reported-by: Linux Kernel Functional Testing <[email protected]>
>
> Links:
> [1] - https://lkft.validation.linaro.org/scheduler/job/7546260
> [2] - https://lkft.validation.linaro.org/scheduler/job/7548064
I assume this is an actual overflow that was previously ignored
and is now causing vclkdev_alloc() to fail. Unfortunately this
happens before the console is up, so there is no output. Can
you run the broken version again with 'earlycon' added to the
boot arguments?
Arnd
Quoting Arnd Bergmann (2024-05-07 00:44:15)
> On Tue, May 7, 2024, at 09:20, Naresh Kamboju wrote:
> > The WinLink E850-96 board boot failed with Linux next-20240506 but there
> > is no kernel crash log on the serial [1].
> >
> > Anders bisection results pointing to this commit,
> > # first bad commit:
> > [4d11c62ca8d77cb1f79054844b598e0f4e92dabe]
> > clkdev: report over-sized strings when creating clkdev entrie
> >
> > After reverting the above patch the boot test passed [2].
> >
> > Reported-by: Linux Kernel Functional Testing <[email protected]>
> >
There are two fixes on the list: [1] and [2]. Perhaps one of those
resolves this?
[1] https://lore.kernel.org/r/20240507065317.3214186-1-m.szyprowski@samsungcom
[2] https://lore.kernel.org/r/20240507064434.3213933-1-m.szyprowski@samsungcom
On Tue, May 7, 2024, at 22:26, Stephen Boyd wrote:
> Quoting Arnd Bergmann (2024-05-07 00:44:15)
>> On Tue, May 7, 2024, at 09:20, Naresh Kamboju wrote:
>> > The WinLink E850-96 board boot failed with Linux next-20240506 but there
>> > is no kernel crash log on the serial [1].
>> >
>> > Anders bisection results pointing to this commit,
>> > # first bad commit:
>> > [4d11c62ca8d77cb1f79054844b598e0f4e92dabe]
>> > clkdev: report over-sized strings when creating clkdev entrie
>> >
>> > After reverting the above patch the boot test passed [2].
>> >
>> > Reported-by: Linux Kernel Functional Testing <[email protected]>
>> >
>
> There are two fixes on the list: [1] and [2]. Perhaps one of those
> resolves this?
>
> [1] https://lore.kernel.org/r/[email protected]
> [2] https://lore.kernel.org/r/[email protected]
My guess is that either one avoids the crash, but we actually
want both of them since the problem is a combination of the two
issues.
I think we also need this one on top, to have a va_end() for
each return() statement:
--- a/drivers/clk/clkdev.c
+++ b/drivers/clk/clkdev.c
@@ -193,6 +193,7 @@ vclkdev_alloc(struct clk_hw *hw, const char *con_id, const char *dev_fmt,
cla->cl.dev_id = cla->dev_id;
}
+ va_end(ap_copy);
return &cla->cl;
fail:
Arnd
Quoting Arnd Bergmann (2024-05-07 13:52:59)
> On Tue, May 7, 2024, at 22:26, Stephen Boyd wrote:
> > Quoting Arnd Bergmann (2024-05-07 00:44:15)
> >> On Tue, May 7, 2024, at 09:20, Naresh Kamboju wrote:
> >> > The WinLink E850-96 board boot failed with Linux next-20240506 but there
> >> > is no kernel crash log on the serial [1].
> >> >
> >> > Anders bisection results pointing to this commit,
> >> > # first bad commit:
> >> > [4d11c62ca8d77cb1f79054844b598e0f4e92dabe]
> >> > clkdev: report over-sized strings when creating clkdev entrie
> >> >
> >> > After reverting the above patch the boot test passed [2].
> >> >
> >> > Reported-by: Linux Kernel Functional Testing <[email protected]>
> >> >
> >
> > There are two fixes on the list: [1] and [2]. Perhaps one of those
> > resolves this?
> >
> > [1] https://lore.kernel.org/r/[email protected]
> > [2] https://lore.kernel.org/r/[email protected]
>
> My guess is that either one avoids the crash, but we actually
> want both of them since the problem is a combination of the two
> issues.
>
> I think we also need this one on top, to have a va_end() for
> each return() statement:
Makes sense. Hopefully Russell can fold that fix in as well, or you can
send it to the patch tracker.
On Tue, 7 May 2024 at 22:26, Stephen Boyd <[email protected]> wrote:
>
> Quoting Arnd Bergmann (2024-05-07 00:44:15)
> > On Tue, May 7, 2024, at 09:20, Naresh Kamboju wrote:
> > > The WinLink E850-96 board boot failed with Linux next-20240506 but there
> > > is no kernel crash log on the serial [1].
> > >
> > > Anders bisection results pointing to this commit,
> > > # first bad commit:
> > > [4d11c62ca8d77cb1f79054844b598e0f4e92dabe]
> > > clkdev: report over-sized strings when creating clkdev entrie
> > >
> > > After reverting the above patch the boot test passed [2].
> > >
> > > Reported-by: Linux Kernel Functional Testing <[email protected]>
> > >
>
> There are two fixes on the list: [1] and [2]. Perhaps one of those
> resolves this?
I applied patch [2] ontop of next-20240506 was helpful and the e850-96
board booted.
Cheers,
Anders
>
> [1] https://lore.kernel.org/r/[email protected]
> [2] https://lore.kernel.org/r/[email protected]
On Tue, May 7, 2024 at 3:26 PM Stephen Boyd <[email protected]> wrote:
>
> Quoting Arnd Bergmann (2024-05-07 00:44:15)
> > On Tue, May 7, 2024, at 09:20, Naresh Kamboju wrote:
> > > The WinLink E850-96 board boot failed with Linux next-20240506 but there
> > > is no kernel crash log on the serial [1].
> > >
> > > Anders bisection results pointing to this commit,
> > > # first bad commit:
> > > [4d11c62ca8d77cb1f79054844b598e0f4e92dabe]
> > > clkdev: report over-sized strings when creating clkdev entrie
> > >
> > > After reverting the above patch the boot test passed [2].
> > >
> > > Reported-by: Linux Kernel Functional Testing <[email protected]>
> > >
>
> There are two fixes on the list: [1] and [2]. Perhaps one of those
> resolves this?
>
> [1] https://lore.kernel.org/r/[email protected]
> [2] https://lore.kernel.org/r/[email protected]
>
Late to the party, but FWIW here is my two cents. E850-96 board
crashes on boot when running next-20240508. Enabling earlycon reveals
the details. Here is the relevant excerpt from the backtrace:
8<-------------------------------------------------------------------->8
Unable to handle kernel NULL pointer dereference
at virtual address 0000000000000000
Call trace:
vsnprintf+0x64/0x724
...
_printk+0x60/0x84
vclkdev_alloc+0x118/0x13c
clkdev_hw_create+0x64/0x9c
do_clk_register_clkdev+0x58/0x7c
clk_hw_register_clkdev+0x30/0x54
samsung_clk_register_fixed_rate+0xac/0x104
samsung_cmu_register_clocks+0x78/0xb0
samsung_cmu_register_one+0x48/0xa4
exynos_arm64_register_cmu+0x3c/0x70
exynos850_cmu_probe+0x2c/0x40
...
8<-------------------------------------------------------------------->8
'addr2line' points at the end of vclkdev_alloc():
pr_err("%pV:%s: %s ID is greater than %zu\n",
&fmt, con_id, failure, max_size);
Applying the forementioned patch [2] ("clkdev: fix potential NULL
pointer dereference") fixes the boot for me.
I can also observe a couple of warnings like these in the kernel log:
samsung_clk_register_fixed_rate: failed to register clock lookup
for clk_rco_i3c_pmic
samsung_clk_register_fixed_rate: failed to register clock lookup
for clk_rco_apm__alv
...
The patch [1] ("clk: samsung: Don't register clkdev lookup for the
fixed rate clocks") fixes those. I think both have to be applied ASAP.
In case of E850-96, I guess [1] is more critical.
Thanks!
On Wed, May 08, 2024 at 04:07:57PM -0500, Sam Protsenko wrote:
> The patch [1] ("clk: samsung: Don't register clkdev lookup for the
> fixed rate clocks") fixes those. I think both have to be applied ASAP.
> In case of E850-96, I guess [1] is more critical.
The fixes to clkdev.c have been pushed out for a while now, so I think
you may need to update your tree. There's been one more fix to it more
recently (because of the whole va_copy() debacle).
Whether linux-next picks up the latest version depends when they pull
in relation to me pushing the changes out, which can take 48 hours
due to the timezone differences. linux-next tends not to pick stuff
up quickly if one's in the UK and pushes stuff out in the late
afternoon/evening. That said, the NULL fix has been pushed out for
a few days now.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Tue, May 07, 2024 at 01:26:17PM -0700, Stephen Boyd wrote:
> Quoting Arnd Bergmann (2024-05-07 00:44:15)
> > On Tue, May 7, 2024, at 09:20, Naresh Kamboju wrote:
> > > The WinLink E850-96 board boot failed with Linux next-20240506 but there
> > > is no kernel crash log on the serial [1].
> > >
> > > Anders bisection results pointing to this commit,
> > > # first bad commit:
> > > [4d11c62ca8d77cb1f79054844b598e0f4e92dabe]
> > > clkdev: report over-sized strings when creating clkdev entrie
> > >
> > > After reverting the above patch the boot test passed [2].
> > >
> > > Reported-by: Linux Kernel Functional Testing <[email protected]>
> > >
>
> There are two fixes on the list: [1] and [2]. Perhaps one of those
> resolves this?
>
> [1] https://lore.kernel.org/r/[email protected]
This one has (I think) ended up in the patch system last week, but it's
not clkdev, it's only related. I'm also not Cc'd on its posting, and
it's not posted to any mailing list that I'm a part of. So I've not
been following any discussion on it.
Digging in to the discussion, I see various attributations, and a final
message reporting an unused variable, and a promise to send v2. So,
I'm guessing that
http://www.home.armlinux.org.uk/developer/patches/viewpatch.php?id=9397/1
is now superseded in some way... I wouldn't have known without locating
this email and checking the links.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Wed, 15 May 2024 at 22:53, Russell King (Oracle)
<[email protected]> wrote:
>
> On Tue, May 07, 2024 at 01:26:17PM -0700, Stephen Boyd wrote:
> > Quoting Arnd Bergmann (2024-05-07 00:44:15)
> > > On Tue, May 7, 2024, at 09:20, Naresh Kamboju wrote:
> > > > The WinLink E850-96 board boot failed with Linux next-20240506 but there
> > > > is no kernel crash log on the serial [1].
> > > >
> > > > Anders bisection results pointing to this commit,
> > > > # first bad commit:
> > > > [4d11c62ca8d77cb1f79054844b598e0f4e92dabe]
> > > > clkdev: report over-sized strings when creating clkdev entrie
> > > >
> > > > After reverting the above patch the boot test passed [2].
> > > >
> > > > Reported-by: Linux Kernel Functional Testing <[email protected]>
> > > >
> >
> > There are two fixes on the list: [1] and [2]. Perhaps one of those
> > resolves this?
> >
> > [1] https://lore.kernel.org/r/[email protected]
>
> This one has (I think) ended up in the patch system last week, but it's
> not clkdev, it's only related. I'm also not Cc'd on its posting, and
> it's not posted to any mailing list that I'm a part of. So I've not
> been following any discussion on it.
>
> Digging in to the discussion, I see various attributations, and a final
> message reporting an unused variable, and a promise to send v2. So,
> I'm guessing that
> http://www.home.armlinux.org.uk/developer/patches/viewpatch.php?id=9397/1
I do not have access to this link ^.
> is now superseded in some way... I wouldn't have known without locating
> this email and checking the links.
>
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
- Naresh
On 16.05.2024 12:27, Naresh Kamboju wrote:
> On Wed, 15 May 2024 at 22:53, Russell King (Oracle)
> <[email protected]> wrote:
>> On Tue, May 07, 2024 at 01:26:17PM -0700, Stephen Boyd wrote:
>>> Quoting Arnd Bergmann (2024-05-07 00:44:15)
>>>> On Tue, May 7, 2024, at 09:20, Naresh Kamboju wrote:
>>>>> The WinLink E850-96 board boot failed with Linux next-20240506 but there
>>>>> is no kernel crash log on the serial [1].
>>>>>
>>>>> Anders bisection results pointing to this commit,
>>>>> # first bad commit:
>>>>> [4d11c62ca8d77cb1f79054844b598e0f4e92dabe]
>>>>> clkdev: report over-sized strings when creating clkdev entrie
>>>>>
>>>>> After reverting the above patch the boot test passed [2].
>>>>>
>>>>> Reported-by: Linux Kernel Functional Testing <[email protected]>
>>>>>
>>> There are two fixes on the list: [1] and [2]. Perhaps one of those
>>> resolves this?
>>>
>>> [1] https://lore.kernel.org/r/[email protected]
>> This one has (I think) ended up in the patch system last week, but it's
>> not clkdev, it's only related. I'm also not Cc'd on its posting, and
>> it's not posted to any mailing list that I'm a part of. So I've not
>> been following any discussion on it.
>>
>> Digging in to the discussion, I see various attributations, and a final
>> message reporting an unused variable, and a promise to send v2. So,
>> I'm guessing that
>> https://protect2.fireeye.com/v1/url?k=946226d9-f5e933ef-9463ad96-74fe485fffe0-28286a0026513387&q=1&e=a16c1c53-9c99-475a-b144-8adf7852ebc0&u=http%3A%2F%2Fwww.home.armlinux.org.uk%2Fdeveloper%2Fpatches%2Fviewpatch.php%3Fid%3D9397%2F1
> I do not have access to this link ^.
>
>> is now superseded in some way... I wouldn't have known without locating
>> this email and checking the links.
The fix for drivers/clk/samsung/clk.c driver has been merged to clk-next:
https://lore.kernel.org/all/[email protected]/
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
On Thu, May 16, 2024 at 12:27:20PM +0200, Naresh Kamboju wrote:
> On Wed, 15 May 2024 at 22:53, Russell King (Oracle)
> <[email protected]> wrote:
> >
> > On Tue, May 07, 2024 at 01:26:17PM -0700, Stephen Boyd wrote:
> > > Quoting Arnd Bergmann (2024-05-07 00:44:15)
> > > > On Tue, May 7, 2024, at 09:20, Naresh Kamboju wrote:
> > > > > The WinLink E850-96 board boot failed with Linux next-20240506 but there
> > > > > is no kernel crash log on the serial [1].
> > > > >
> > > > > Anders bisection results pointing to this commit,
> > > > > # first bad commit:
> > > > > [4d11c62ca8d77cb1f79054844b598e0f4e92dabe]
> > > > > clkdev: report over-sized strings when creating clkdev entrie
> > > > >
> > > > > After reverting the above patch the boot test passed [2].
> > > > >
> > > > > Reported-by: Linux Kernel Functional Testing <[email protected]>
> > > > >
> > >
> > > There are two fixes on the list: [1] and [2]. Perhaps one of those
> > > resolves this?
> > >
> > > [1] https://lore.kernel.org/r/[email protected]
> >
> > This one has (I think) ended up in the patch system last week, but it's
> > not clkdev, it's only related. I'm also not Cc'd on its posting, and
> > it's not posted to any mailing list that I'm a part of. So I've not
> > been following any discussion on it.
> >
> > Digging in to the discussion, I see various attributations, and a final
> > message reporting an unused variable, and a promise to send v2. So,
> > I'm guessing that
> > http://www.home.armlinux.org.uk/developer/patches/viewpatch.php?id=9397/1
>
> I do not have access to this link ^.
Sorry, that's my internal link, the external one is:
http://www.armlinux.org.uk/developer/patches/viewpatch.php?id=9397/1
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Thu, May 16, 2024 at 12:57:53PM +0200, Marek Szyprowski wrote:
> The fix for drivers/clk/samsung/clk.c driver has been merged to clk-next:
>
> https://lore.kernel.org/all/[email protected]/
Good to know, would be nice to be kept in the loop though, especially
as the first version was submitted to me.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!