2023-08-18 11:02:07

by Suzuki K Poulose

Subject: Re: [PATCH 2/2] coresight: core: fix memory leak in dict->fwnode_list

On 17/08/2023 16:01, James Clark wrote:
>
>
> On 17/08/2023 15:47, Suzuki K Poulose wrote:
>> On 17/08/2023 15:39, Suzuki K Poulose wrote:
>>> On 17/08/2023 09:59, Junhao He wrote:
>>>> There are memory leaks reported by kmemleak:
>>>> ...
>>>> unreferenced object 0xffff2020103c3200 (size 256):
>>>>    comm "insmod", pid 4476, jiffies 4294978252 (age 50072.536s)
>>>>    hex dump (first 32 bytes):
>>>>      10 60 40 06 28 20 ff ff 10 c0 59 06 20 20 ff ff  .`@.( ....Y.  ..
>>>>      10 e0 47 06 28 20 ff ff 10 00 49 06 28 20 ff ff  ..G.( ....I.( ..
>>>>    backtrace:
>>>>      [<0000000034ec4724>] __kmem_cache_alloc_node+0x2f8/0x348
>>>>      [<0000000057fbc15d>] __kmalloc_node_track_caller+0x5c/0x110
>>>>      [<00000055d5e34b>] krealloc+0x8c/0x178
>>>>      [<00000000a4635beb>] coresight_alloc_device_name+0x128/0x188
>>>> [coresight]
>>>>      [<00000000a92ddfee>] funnel_cs_ops+0x10/0xfffffffffffedaa0
>>>> [coresight_funnel]
>>>>      [<00000000449e20f8>] dynamic_funnel_ids+0x80/0xfffffffffffed840
>>>> [coresight_funnel]
>>>> ...
>>>>
>>>> When the driver is removed, the global variables defined by the macro
>>>> DEFINE_CORESIGHT_DEVLIST are released, and dict->nr_idx and
>>>> dict->fwnode_list are cleared to 0. The lifetime of the global
>>>> variable has ended, so the buffer pointer is lost.
>>>>
>>>> Use the callback of devm_add_action_or_reset() to free memory.
>>>
>>> Thanks for the report. But please see below:
>>>
>>>>
>>>> Fixes: 0f5f9b6ba9e1 ("coresight: Use platform agnostic names")
>>>> Signed-off-by: Junhao He <[email protected]>
>>>> ---
>>>>   drivers/hwtracing/coresight/coresight-core.c | 20 +++++++++++++++++++-
>>>>   1 file changed, 19 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/hwtracing/coresight/coresight-core.c
>>>> b/drivers/hwtracing/coresight/coresight-core.c
>>>> index 9fabe00a40d6..6849faad697d 100644
>>>> --- a/drivers/hwtracing/coresight/coresight-core.c
>>>> +++ b/drivers/hwtracing/coresight/coresight-core.c
>>>> @@ -1756,6 +1756,20 @@ bool coresight_loses_context_with_cpu(struct
>>>> device *dev)
>>>>   }
>>>>   EXPORT_SYMBOL_GPL(coresight_loses_context_with_cpu);
>>>> +void coresight_release_dev_list(void *data)
>>>> +{
>>>> +    struct coresight_dev_list *dict = data;
>>>> +
>>>> +    mutex_lock(&coresight_mutex);
>>>> +
>>>> +    if (dict->nr_idx) {
>>>> +        kfree(dict->fwnode_list);
>>>> +        dict->nr_idx = 0;
>>>> +    }
>>>> +
>>>> +    mutex_unlock(&coresight_mutex);
>>>> +}
>>>> +
>>>>   /*
>>>>    * coresight_alloc_device_name - Get an index for a given device in
>>>> the
>>>>    * device index list specific to a driver. An index is allocated for a
>>>> @@ -1766,12 +1780,16 @@
>>>> EXPORT_SYMBOL_GPL(coresight_loses_context_with_cpu);
>>>>   char *coresight_alloc_device_name(struct coresight_dev_list *dict,
>>>>                     struct device *dev)
>>>>   {
>>>> -    int idx;
>>>> +    int idx, ret;
>>>>       char *name = NULL;
>>>>       struct fwnode_handle **list;
>>>>       mutex_lock(&coresight_mutex);
>>>> +    ret = devm_add_action_or_reset(dev, coresight_release_dev_list,
>>>> dict);
>>>> +    if (ret)
>>>> +        goto done;
>>>
>>> This looks wrong. The devlist should be only released on the "driver"
>>> unload, not on every device release. The list retains the fwnode to
>>> assign the same name for a device, if it is re-probed (e.g., due to
>>> -EPROBE_DEFER error).
>>
>> The best way is to free it on module_unload and unfortunately we would
>> need to do it from all modules using the DEVLIST.
>>
>> Suzuki
>>
>
> Seems like we might also be able to move the separate lists to be one
> big list owned by the main 'coresight' module. If all the other modules
> are dependent on that one then it's always loaded first and the list is
> available. Then it persists as long as the main module is loaded and can
> be freed with the normal devm stuff.

That may not work, right? For the devm stuff to work, you need a
device. Moving this to the coresight main module doesn't give us
*a device* from which all these lists can be allocated. Also, we
need a list per device type (e.g., tmc-etf<>, tmc-etb<>, tmc-etr<>
for tmc, etc.). So the individual drivers would then need to refer
to the particular (exported!) list for allocations.

>
> That would avoid the awkward combo of the static variables in each
> module plus the non-devm kmalloc'ed list.

I think it is not too bad to add a cleanup call to the callers that use
a devlist.
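
As a rough illustration of what such a cleanup call could look like, here
is a userspace model, not kernel code. The names dev_list_model,
dev_list_model_add() and dev_list_model_clear() are made up for this
sketch. In the kernel the equivalent would presumably operate on
struct coresight_dev_list, use krealloc()/kfree() under coresight_mutex,
and be called once from each driver's module exit path rather than on
every device release.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Simplified userspace stand-in for struct coresight_dev_list.
 * The real list stores struct fwnode_handle pointers; they are
 * modelled here as opaque void pointers (assumption).
 */
struct dev_list_model {
	int nr_idx;
	const char *pfx;
	void **fwnode_list;
};

/*
 * Hypothetical model of the index allocation: return the existing
 * index if this fwnode was seen before (e.g. on a re-probe),
 * otherwise grow the list by one slot. Returns -1 on allocation
 * failure.
 */
static int dev_list_model_add(struct dev_list_model *dict, void *fwnode)
{
	void **list;
	int idx;

	for (idx = 0; idx < dict->nr_idx; idx++)
		if (dict->fwnode_list[idx] == fwnode)
			return idx;

	list = realloc(dict->fwnode_list,
		       (size_t)(dict->nr_idx + 1) * sizeof(*list));
	if (!list)
		return -1;

	list[dict->nr_idx] = fwnode;
	dict->fwnode_list = list;
	return dict->nr_idx++;
}

/*
 * Hypothetical cleanup helper: meant to run once when the owning
 * driver is unloaded, so that indices survive individual device
 * releases and re-probes.
 */
static void dev_list_model_clear(struct dev_list_model *dict)
{
	assert(dict != NULL);

	free(dict->fwnode_list);
	dict->fwnode_list = NULL;
	dict->nr_idx = 0;
}
```

The point of the model is only the lifetime: the list keeps its entries
across device release and re-probe, and is freed exactly once by the
caller that owns the devlist.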

Suzuki




>
>>
>>>
>>> Suzuki
>>>
>>