2023-07-25 10:17:10

by Qi Zheng

[permalink] [raw]
Subject: Re: [PATCH v2 03/47] mm: shrinker: add infrastructure for dynamically allocating shrinker

Hi Muchun,

On 2023/7/25 17:02, Muchun Song wrote:
>
>
> On 2023/7/24 17:43, Qi Zheng wrote:
>> Currently, the shrinker instances can be divided into the following three
>> types:
>>
>> a) global shrinker instance statically defined in the kernel, such as
>>     workingset_shadow_shrinker.
>>
>> b) global shrinker instance statically defined in the kernel modules,
>> such
>>     as mmu_shrinker in x86.
>>
>> c) shrinker instance embedded in other structures.
>>
>> For case a, the memory of shrinker instance is never freed. For case b,
>> the memory of shrinker instance will be freed after synchronize_rcu()
>> when
>> the module is unloaded. For case c, the memory of shrinker instance will
>> be freed along with the structure it is embedded in.
>>
>> In preparation for implementing lockless slab shrink, we need to
>> dynamically allocate those shrinker instances in case c, then the memory
>> can be dynamically freed alone by calling kfree_rcu().
>>
>> So this commit adds the following new APIs for dynamically allocating
>> shrinker, and add a private_data field to struct shrinker to record and
>> get the original embedded structure.
>>
>> 1. shrinker_alloc()
>>
>> Used to allocate shrinker instance itself and related memory, it will
>> return a pointer to the shrinker instance on success and NULL on failure.
>>
>> 2. shrinker_free_non_registered()
>>
>> Used to destroy the non-registered shrinker instance.
>
> At least I don't like this name. I know you want to tell others
> this function only should be called when shrinker has not been
> registed but allocated. Maybe shrinker_free() is more simple.
> And and a comment to tell the users when to use it.

OK, if no one else objects, I will change it to shrinker_free() in
the next version.

>
>>
>> 3. shrinker_register()
>>
>> Used to register the shrinker instance, which is same as the current
>> register_shrinker_prepared().
>>
>> 4. shrinker_unregister()
>>
>> Used to unregister and free the shrinker instance.
>>
>> In order to simplify shrinker-related APIs and make shrinker more
>> independent of other kernel mechanisms, subsequent submissions will use
>> the above API to convert all shrinkers (including case a and b) to
>> dynamically allocated, and then remove all existing APIs.
>>
>> This will also have another advantage mentioned by Dave Chinner:
>>
>> ```
>> The other advantage of this is that it will break all the existing
>> out of tree code and third party modules using the old API and will
>> no longer work with a kernel using lockless slab shrinkers. They
>> need to break (both at the source and binary levels) to stop bad
>> things from happening due to using uncoverted shrinkers in the new
>> setup.
>> ```
>>
>> Signed-off-by: Qi Zheng <[email protected]>
>> ---
>>   include/linux/shrinker.h |   6 +++
>>   mm/shrinker.c            | 113 +++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 119 insertions(+)
>>
>> diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
>> index 961cb84e51f5..296f5e163861 100644
>> --- a/include/linux/shrinker.h
>> +++ b/include/linux/shrinker.h
>> @@ -70,6 +70,8 @@ struct shrinker {
>>       int seeks;    /* seeks to recreate an obj */
>>       unsigned flags;
>> +    void *private_data;
>> +
>>       /* These are for internal use */
>>       struct list_head list;
>>   #ifdef CONFIG_MEMCG
>> @@ -98,6 +100,10 @@ struct shrinker {
>>   unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup
>> *memcg,
>>                 int priority);
>> +struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt,
>> ...);
>> +void shrinker_free_non_registered(struct shrinker *shrinker);
>> +void shrinker_register(struct shrinker *shrinker);
>> +void shrinker_unregister(struct shrinker *shrinker);
>>   extern int __printf(2, 3) prealloc_shrinker(struct shrinker *shrinker,
>>                           const char *fmt, ...);
>> diff --git a/mm/shrinker.c b/mm/shrinker.c
>> index 0a32ef42f2a7..d820e4cc5806 100644
>> --- a/mm/shrinker.c
>> +++ b/mm/shrinker.c
>> @@ -548,6 +548,119 @@ unsigned long shrink_slab(gfp_t gfp_mask, int
>> nid, struct mem_cgroup *memcg,
>>       return freed;
>>   }
>> +struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt,
>> ...)
>> +{
>> +    struct shrinker *shrinker;
>> +    unsigned int size;
>> +    va_list __maybe_unused ap;
>> +    int err;
>> +
>> +    shrinker = kzalloc(sizeof(struct shrinker), GFP_KERNEL);
>> +    if (!shrinker)
>> +        return NULL;
>> +
>> +#ifdef CONFIG_SHRINKER_DEBUG
>> +    va_start(ap, fmt);
>> +    shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
>> +    va_end(ap);
>> +    if (!shrinker->name)
>> +        goto err_name;
>> +#endif
>
> So why not introduce another helper to handle this and declare it
> as a void function when !CONFIG_SHRINKER_DEBUG? Something like the
> following:
>
> #ifdef CONFIG_SHRINKER_DEBUG
> static int shrinker_debugfs_name_alloc(struct shrinker *shrinker, const
> char *fmt,
>                                        va_list vargs)
>
> {
>     shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, vargs);
>     return shrinker->name ? 0 : -ENOMEM;
> }
> #else
> static int shrinker_debugfs_name_alloc(struct shrinker *shrinker, const
> char *fmt,
>                                        va_list vargs)
> {
>     return 0;
> }
> #endif

Will do in the next version.

>
>> +    shrinker->flags = flags;
>> +
>> +    if (flags & SHRINKER_MEMCG_AWARE) {
>> +        err = prealloc_memcg_shrinker(shrinker);
>> +        if (err == -ENOSYS)
>> +            shrinker->flags &= ~SHRINKER_MEMCG_AWARE;
>> +        else if (err == 0)
>> +            goto done;
>> +        else
>> +            goto err_flags;
>> +    }
>> +
>> +    /*
>> +     * The nr_deferred is available on per memcg level for memcg aware
>> +     * shrinkers, so only allocate nr_deferred in the following cases:
>> +     *  - non memcg aware shrinkers
>> +     *  - !CONFIG_MEMCG
>> +     *  - memcg is disabled by kernel command line
>> +     */
>> +    size = sizeof(*shrinker->nr_deferred);
>> +    if (flags & SHRINKER_NUMA_AWARE)
>> +        size *= nr_node_ids;
>> +
>> +    shrinker->nr_deferred = kzalloc(size, GFP_KERNEL);
>> +    if (!shrinker->nr_deferred)
>> +        goto err_flags;
>> +
>> +done:
>> +    return shrinker;
>> +
>> +err_flags:
>> +#ifdef CONFIG_SHRINKER_DEBUG
>> +    kfree_const(shrinker->name);
>> +    shrinker->name = NULL;
>
> This could be shrinker_debugfs_name_free()

Will do.

>
>> +err_name:
>> +#endif
>> +    kfree(shrinker);
>> +    return NULL;
>> +}
>> +EXPORT_SYMBOL(shrinker_alloc);
>> +
>> +void shrinker_free_non_registered(struct shrinker *shrinker)
>> +{
>> +#ifdef CONFIG_SHRINKER_DEBUG
>> +    kfree_const(shrinker->name);
>> +    shrinker->name = NULL;
>
> This could be shrinker_debugfs_name_free()
>
>> +#endif
>> +    if (shrinker->flags & SHRINKER_MEMCG_AWARE) {
>> +        down_write(&shrinker_rwsem);
>> +        unregister_memcg_shrinker(shrinker);
>> +        up_write(&shrinker_rwsem);
>> +    }
>> +
>> +    kfree(shrinker->nr_deferred);
>> +    shrinker->nr_deferred = NULL;
>> +
>> +    kfree(shrinker);
>> +}
>> +EXPORT_SYMBOL(shrinker_free_non_registered);
>> +
>> +void shrinker_register(struct shrinker *shrinker)
>> +{
>> +    down_write(&shrinker_rwsem);
>> +    list_add_tail(&shrinker->list, &shrinker_list);
>> +    shrinker->flags |= SHRINKER_REGISTERED;
>> +    shrinker_debugfs_add(shrinker);
>> +    up_write(&shrinker_rwsem);
>> +}
>> +EXPORT_SYMBOL(shrinker_register);
>> +
>> +void shrinker_unregister(struct shrinker *shrinker)
>
> You have made all shrinkers to be dynamically allocated, so
> we should prevent users from allocating shrinkers statically and
> use this function to unregister it. It is better to add a
> flag like SHRINKER_ALLOCATED which is set in shrinker_alloc(),
> and check whether it is set in shrinker_unregister(), if not
> maybe a warning should be added to tell the users what happened.

Make sense, will do.

>
>> +{
>> +    struct dentry *debugfs_entry;
>> +    int debugfs_id;
>> +
>> +    if (!shrinker || !(shrinker->flags & SHRINKER_REGISTERED))
>> +        return;
>> +
>> +    down_write(&shrinker_rwsem);
>> +    list_del(&shrinker->list);
>> +    shrinker->flags &= ~SHRINKER_REGISTERED;
>> +    if (shrinker->flags & SHRINKER_MEMCG_AWARE)
>> +        unregister_memcg_shrinker(shrinker);
>> +    debugfs_entry = shrinker_debugfs_detach(shrinker, &debugfs_id);
>
> In the internal of this function, you also could use
> shrinker_debugfs_name_free().

Yeah, will do.

Thanks,
Qi

>
> Thanks.
>
>> +    up_write(&shrinker_rwsem);
>> +
>> +    shrinker_debugfs_remove(debugfs_entry, debugfs_id);
>> +
>> +    kfree(shrinker->nr_deferred);
>> +    shrinker->nr_deferred = NULL;
>> +
>> +    kfree(shrinker);
>> +}
>> +EXPORT_SYMBOL(shrinker_unregister);
>> +
>>   /*
>>    * Add a shrinker callback to be called from the vm.
>>    */
>