2016-12-16 21:13:12

by Heiko Stübner

[permalink] [raw]
Subject: thermal zones break with patch "Reimplement IDR and IDA using the radix tree" (mainline+next)

Hi,

commit b05bbe3ea2db ("Reimplement IDR and IDA using the radix tree") seems to
break thermal zone allocation. This happens both on todays mainline and
linux-next-20161216 and produces errors like:

[ 9.397201] WARNING: CPU: 1 PID: 165 at ../fs/sysfs/dir.c:31 sysfs_warn_dup+0x64/0x74
[ 9.405145] sysfs: cannot create duplicate filename '/devices/virtual/thermal/thermal_zone1'
[ 9.514951] Hardware name: Rockchip (Device Tree)
[ 9.522611] [<c03104a4>] (unwind_backtrace) from [<c030ba50>] (show_stack+0x10/0x14)
[ 9.532925] [<c030ba50>] (show_stack) from [<c05a11fc>] (dump_stack+0x88/0x9c)
[ 9.542695] [<c05a11fc>] (dump_stack) from [<c0340c68>] (__warn+0xe8/0x100)
[ 9.552166] [<c0340c68>] (__warn) from [<c0340cb8>] (warn_slowpath_fmt+0x38/0x48)
[ 9.562154] [<c0340cb8>] (warn_slowpath_fmt) from [<c047e944>] (sysfs_warn_dup+0x64/0x74)
[ 9.572834] [<c047e944>] (sysfs_warn_dup) from [<c047ea20>] (sysfs_create_dir_ns+0x84/0x94)
[ 9.583687] [<c047ea20>] (sysfs_create_dir_ns) from [<c05a2cb8>] (kobject_add_internal+0xac/0x2fc)
[ 9.595149] [<c05a2cb8>] (kobject_add_internal) from [<c05a2f50>] (kobject_add+0x48/0x98)
[ 9.605829] [<c05a2f50>] (kobject_add) from [<c07ee340>] (device_add+0xd4/0x5b0)
[ 9.615714] [<c07ee340>] (device_add) from [<c0a43a98>] (thermal_zone_device_register+0x174/0x5cc)
[ 9.627173] [<c0a43a98>] (thermal_zone_device_register) from [<c0a3c588>] (__power_supply_register+0x374/0x44c)
[ 9.639772] [<c0a3c588>] (__power_supply_register) from [<c0a3c6bc>] (devm_power_supply_register+0x4c/0x7c)

System is a Rockchip rk3288-veyron and it has 4 thermal zones (3 from
the thermal ip block of the soc and one from the battery).

Reverting the patch mentioned above results in thermal zones being
registered again.

While I haven't looked to deeply into what idr exactly does, some findings:
- thermal_zone0 and thermal_zone1 are allocated correctly
- every further thermal_zone always gets allocated the number "1"
- thermal core calls idr_alloc with 0 for both start and end
- the rewrite-patch seems to change the semantics of idr_alloc
where it orignally said "@end: the maximum id (exclusive, <= 0 for max)"
the "<= 0" part is gone now, but I checked, simply setting INT_MAX
as end in the thermal_core does not help


Heiko


2016-12-16 21:34:22

by Matthew Wilcox

[permalink] [raw]
Subject: RE: thermal zones break with patch "Reimplement IDR and IDA using the radix tree" (mainline+next)

From: Heiko Stuebner [mailto:[email protected]]
> commit b05bbe3ea2db ("Reimplement IDR and IDA using the radix tree")
> seems to
> break thermal zone allocation. This happens both on todays mainline and
> linux-next-20161216 and produces errors like:

> While I haven't looked to deeply into what idr exactly does, some findings:
> - thermal_zone0 and thermal_zone1 are allocated correctly
> - every further thermal_zone always gets allocated the number "1"
> - thermal core calls idr_alloc with 0 for both start and end
> - the rewrite-patch seems to change the semantics of idr_alloc
> where it orignally said "@end: the maximum id (exclusive, <= 0 for max)"
> the "<= 0" part is gone now, but I checked, simply setting INT_MAX
> as end in the thermal_core does not help

Hi Heiko,

Thanks for the report! The problem is because the thermal subsystem calls idr_alloc() passing a NULL pointer for the data. I have fixed this problem in my git tree but haven't sent the patch to Andrew yet. This patch should fix the problem for you:

http://git.infradead.org/users/willy/linux-dax.git/commitdiff/c52eeed7b759c3fefe9b7f1b0a17a438df6950f3

Now ... thermal is actually using an IDR when it could save memory by using an IDA. Are you interested in doing that conversion?

2016-12-17 00:48:54

by Heiko Stübner

[permalink] [raw]
Subject: Re: thermal zones break with patch "Reimplement IDR and IDA using the radix tree" (mainline+next)

Hi Matthew,

Am Freitag, 16. Dezember 2016, 21:19:37 CET schrieb Matthew Wilcox:
> From: Heiko Stuebner [mailto:[email protected]]
>
> > commit b05bbe3ea2db ("Reimplement IDR and IDA using the radix tree")
> > seems to
> > break thermal zone allocation. This happens both on todays mainline and
> > linux-next-20161216 and produces errors like:
>
>
>
> > While I haven't looked to deeply into what idr exactly does, some
> > findings:
- thermal_zone0 and thermal_zone1 are allocated correctly
> > - every further thermal_zone always gets allocated the number "1"
> > - thermal core calls idr_alloc with 0 for both start and end
> > - the rewrite-patch seems to change the semantics of idr_alloc
> >
> > where it orignally said "@end: the maximum id (exclusive, <= 0 for
> > max)"
> > the "<= 0" part is gone now, but I checked, simply setting INT_MAX
> > as end in the thermal_core does not help
>
>
> Hi Heiko,
>
> Thanks for the report! The problem is because the thermal subsystem calls
> idr_alloc() passing a NULL pointer for the data. I have fixed this problem
> in my git tree but haven't sent the patch to Andrew yet. This patch should
> fix the problem for you:

> http://git.infradead.org/users/willy/linux-dax.git/commitdiff/c52eeed7b759c3
> fefe9b7f1b0a17a438df6950f3

yay, this fixes the issue. This should definitly go as fix into 4.10-rc and when
you send it to Andrew you can add my

Tested-by: Heiko Stuebner <[email protected]>


> Now ... thermal is actually using an IDR when it could save memory by using
> an IDA. Are you interested in doing that conversion?

That would more be a question for Eduardo and Rui :-) . I'm just in the
process of tracking down the smallish issues on my Rockchip board that are
poping up during the merge-window.


Anyway, thanks for pointing to the fix
Heiko

2016-12-18 19:43:22

by Andy Shevchenko

[permalink] [raw]
Subject: Re: thermal zones break with patch "Reimplement IDR and IDA using the radix tree" (mainline+next)

On Fri, Dec 16, 2016 at 11:19 PM, Matthew Wilcox <[email protected]> wrote:
> From: Heiko Stuebner [mailto:[email protected]]
>> commit b05bbe3ea2db ("Reimplement IDR and IDA using the radix tree")
>> seems to
>> break thermal zone allocation. This happens both on todays mainline and
>> linux-next-20161216 and produces errors like:
>
>> While I haven't looked to deeply into what idr exactly does, some findings:
>> - thermal_zone0 and thermal_zone1 are allocated correctly
>> - every further thermal_zone always gets allocated the number "1"
>> - thermal core calls idr_alloc with 0 for both start and end
>> - the rewrite-patch seems to change the semantics of idr_alloc
>> where it orignally said "@end: the maximum id (exclusive, <= 0 for max)"
>> the "<= 0" part is gone now, but I checked, simply setting INT_MAX
>> as end in the thermal_core does not help
>
> Hi Heiko,
>
> Thanks for the report! The problem is because the thermal subsystem calls idr_alloc() passing a NULL pointer for the data. I have fixed this problem in my git tree but haven't sent the patch to Andrew yet. This patch should fix the problem for you:
>
> http://git.infradead.org/users/willy/linux-dax.git/commitdiff/c52eeed7b759c3fefe9b7f1b0a17a438df6950f3
>
> Now ... thermal is actually using an IDR when it could save memory by using an IDA. Are you interested in doing that conversion?

+Cc: Mika, Vinod

Same here for at least DMA Engine (and perhaps another place in
thermal subsystem).

For DMA Engine case

Tested-by: Andy Shevchenko <[email protected]>

--
With Best Regards,
Andy Shevchenko

2016-12-19 16:37:55

by Matthew Wilcox

[permalink] [raw]
Subject: RE: thermal zones break with patch "Reimplement IDR and IDA using the radix tree" (mainline+next)

From: Andy Shevchenko [mailto:[email protected]]
> On Fri, Dec 16, 2016 at 11:19 PM, Matthew Wilcox
> <[email protected]> wrote:
> > Now ... thermal is actually using an IDR when it could save memory by using
> an IDA. Are you interested in doing that conversion?
>
> +Cc: Mika, Vinod
>
> Same here for at least DMA Engine (and perhaps another place in
> thermal subsystem).

Hey Andy, thanks for testing! Could you test another patch for me as well, to switch DMA Engine over to using an IDA instead of an IDR? This should save 576 bytes on a normal setup.

The patch itself is here:
http://git.infradead.org/users/willy/linux-dax.git/commitdiff/a4f4e8554639b3fba9a62c07ad0e2423ea1f5f31

but it may be better just to test the head of this tree:
http://git.infradead.org/users/willy/linux-dax.git/shortlog/refs/heads/idr-2016-12-16

as I keep fixing bugs :-)

2016-12-20 03:56:58

by Eduardo Valentin

[permalink] [raw]
Subject: Re: thermal zones break with patch "Reimplement IDR and IDA using the radix tree" (mainline+next)

Hey,

On Sat, Dec 17, 2016 at 01:48:33AM +0100, Heiko Stuebner wrote:
> Hi Matthew,
>
> Am Freitag, 16. Dezember 2016, 21:19:37 CET schrieb Matthew Wilcox:
> > From: Heiko Stuebner [mailto:[email protected]]
> >
> > > commit b05bbe3ea2db ("Reimplement IDR and IDA using the radix tree")
> > > seems to
> > > break thermal zone allocation. This happens both on todays mainline and
> > > linux-next-20161216 and produces errors like:
> >
> >
> >
> > > While I haven't looked to deeply into what idr exactly does, some
> > > findings:
> - thermal_zone0 and thermal_zone1 are allocated correctly
> > > - every further thermal_zone always gets allocated the number "1"
> > > - thermal core calls idr_alloc with 0 for both start and end
> > > - the rewrite-patch seems to change the semantics of idr_alloc
> > >
> > > where it orignally said "@end: the maximum id (exclusive, <= 0 for
> > > max)"
> > > the "<= 0" part is gone now, but I checked, simply setting INT_MAX
> > > as end in the thermal_core does not help
> >
> >
> > Hi Heiko,
> >
> > Thanks for the report! The problem is because the thermal subsystem calls
> > idr_alloc() passing a NULL pointer for the data. I have fixed this problem
> > in my git tree but haven't sent the patch to Andrew yet. This patch should
> > fix the problem for you:
>
> > http://git.infradead.org/users/willy/linux-dax.git/commitdiff/c52eeed7b759c3
> > fefe9b7f1b0a17a438df6950f3
>
> yay, this fixes the issue. This should definitly go as fix into 4.10-rc and when
> you send it to Andrew you can add my
>
> Tested-by: Heiko Stuebner <[email protected]>
>
>
> > Now ... thermal is actually using an IDR when it could save memory by using
> > an IDA. Are you interested in doing that conversion?
>
> That would more be a question for Eduardo and Rui :-) . I'm just in the
> process of tracking down the smallish issues on my Rockchip board that are
> poping up during the merge-window.

I would prefer to get the right implementation, and fixing thermal core by using IDA.

But given that this touches thermal core, we need an agreement with Rui
too.

>
>
> Anyway, thanks for pointing to the fix
> Heiko