Subject: Re: [PATCH 00/11] fix memory leak while kset_register() fails
To: Greg KH
CC: Luben Tuikov
References:
<20221021022102.2231464-1-yangyingliang@huawei.com> <0591e66f-731a-5f81-fc9d-3a6d80516c65@huawei.com>
From: Yang Yingliang
Message-ID: <1f3aa2ac-fba6-dc7a-d01d-7dd5331c8dc5@huawei.com>
Date: Fri, 21 Oct 2022 17:12:37 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 2022/10/21 16:36, Greg KH wrote:
> On Fri, Oct 21, 2022 at 04:24:23PM +0800, Yang Yingliang wrote:
>> On 2022/10/21 13:37, Greg KH wrote:
>>> On Fri, Oct 21, 2022 at 01:29:31AM -0400, Luben Tuikov wrote:
>>>> On 2022-10-20 22:20, Yang Yingliang wrote:
>>>>> The previous discussion link:
>>>>> https://lore.kernel.org/lkml/0db486eb-6927-927e-3629-958f8f211194@huawei.com/T/
>>>> The very first discussion on this was here:
>>>>
>>>> https://www.spinics.net/lists/dri-devel/msg368077.html
>>>>
>>>> Please use this link, and not that one up there which you quoted above,
>>>> and whose commit description is taken verbatim from this link.
>>>>
>>>>> kset_register() is currently used in some places without calling
>>>>> kset_put() in error path, because the callers think it should be
>>>>> kset internal thing to do, but the driver core can not know what
>>>>> caller doing with that memory at times. The memory could be freed
>>>>> both in kset_put() and error path of caller, if it is called in
>>>>> kset_register().
>>>> As I explained in the link above, the reason there's
>>>> a memory leak is that one cannot call kset_register() without
>>>> the kset->kobj.name being set--kobject_add_internal() returns -EINVAL,
>>>> in this case, i.e. kset_register() fails with -EINVAL.
>>>>
>>>> Thus, the most common usage is something like this:
>>>>
>>>>     kobject_set_name(&kset->kobj, format, ...);
>>>>     kset->kobj.kset = parent_kset;
>>>>     kset->kobj.ktype = ktype;
>>>>     res = kset_register(kset);
>>>>
>>>> So, what is being leaked, is the memory allocated in kobject_set_name(),
>>>> by the common idiom shown above. This needs to be mentioned in
>>>> the documentation, at least, in case, in the future this is absolved
>>>> in kset_register() redesign, etc.
>>> Based on this, can kset_register() just clean up from itself when an
>>> error happens? Ideally that would be the case, as the odds of a kset
>>> being embedded in a larger structure is probably slim, but we would have
>>> to search the tree to make sure.
>> I have searched the whole tree, the kset used in bus_register() - patch #3,
>> kset_create_and_add() - patch #4
>> __class_register() - patch #5, fw_cfg_build_symlink() - patch #6 and
>> amdgpu_discovery.c - patch #10
>> is embedded in a larger structure. In these cases, we can not call
>> kset_put() in error path in kset_register()
> Yes you can as the kobject in the kset should NOT be controlling the
> lifespan of those larger objects.
Reading through the code, the only leak in this case is the name, so can we
free it directly in kset_register():

--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -844,8 +844,11 @@ int kset_register(struct kset *k)
         kset_init(k);
         err = kobject_add_internal(&k->kobj);
-       if (err)
+       if (err) {
+               kfree_const(k->kobj.name);
+               k->kobj.name = NULL;
                 return err;
+       }
         kobject_uevent(&k->kobj, KOBJ_ADD);
         return 0;
  }

or unset the ktype of the kobject, then call kset_put():

--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -844,8 +844,11 @@ int kset_register(struct kset *k)
         kset_init(k);
         err = kobject_add_internal(&k->kobj);
-       if (err)
+       if (err) {
+               k->kobj.ktype = NULL;
+               kset_put(k);
                 return err;
+       }
         kobject_uevent(&k->kobj, KOBJ_ADD);
         return 0;
  }

>
> If it is, please point out the call chain here as I don't think that
> should be possible.
>
> Note all of this is a mess because the kobject name stuff was added much
> later, after the driver model had been created and running for a while.
> We missed this error path when adding the dynamic kobject name logic,
> thanks for looking into this.
>
> If you could test the patch posted with your error injection systems,
> that could make this all much simpler to solve.
>
> thanks,
>
> greg k-h
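
To make the leak and the first option above easier to follow, here is a minimal userspace sketch. It is not kernel code and makes several assumptions: every model_* name, the live_names counter, and the fail_add and free_name_on_error knobs are hypothetical stand-ins invented for illustration, and malloc()/free() stand in for the kernel's name allocation and kfree_const(). It only models the idea that the name is allocated before registration and must be released by someone when registration fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-ins, NOT the kernel's struct kobject / struct kset. */
struct model_kobject {
    char *name;
};

struct model_kset {
    struct model_kobject kobj;
};

/* Number of name buffers currently allocated; nonzero at the end means a leak. */
static int live_names;

/* Free the name buffer, if any, and clear the pointer. */
static void model_free_name(struct model_kobject *kobj)
{
    if (kobj->name) {
        assert(live_names > 0);
        free(kobj->name);
        kobj->name = NULL;
        live_names--;
    }
}

/* Stand-in for kobject_set_name(): the name is heap-allocated here. */
static int model_set_name(struct model_kobject *kobj, const char *name)
{
    char *copy = malloc(strlen(name) + 1);

    if (!copy)
        return -ENOMEM;
    strcpy(copy, name);
    model_free_name(kobj);      /* drop any previous name */
    kobj->name = copy;
    live_names++;
    return 0;
}

/*
 * Stand-in for kobject_add_internal(): returns -EINVAL without a name.
 * fail_add is a hypothetical knob that forces some other failure.
 */
static int model_add_internal(struct model_kobject *kobj, int fail_add)
{
    if (!kobj->name || !kobj->name[0])
        return -EINVAL;
    return fail_add ? -EEXIST : 0;
}

/*
 * Stand-in for kset_register(). With free_name_on_error set, the error
 * path releases the name, mirroring the first diff above.
 */
static int model_kset_register(struct model_kset *k, int fail_add,
                               int free_name_on_error)
{
    int err;

    if (!k)
        return -EINVAL;
    err = model_add_internal(&k->kobj, fail_add);
    if (err) {
        if (free_name_on_error)
            model_free_name(&k->kobj);
        return err;
    }
    return 0;
}
```

In this model, calling model_set_name() and then a failing model_kset_register() with free_name_on_error == 0 leaves live_names at 1 unless the caller frees the name itself, while the same sequence with free_name_on_error == 1 leaves the name pointer NULL and the counter unchanged.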