From: "taoyi.ty"
Subject: Re: [RFC PATCH 1/2] add pinned flags for kernfs node
To: Greg KH
Cc: tj@kernel.org, lizefan.x@bytedance.com, hannes@cmpxchg.org,
 mcgrof@kernel.org, keescook@chromium.org, yzaikin@google.com,
 linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, shanpeic@linux.alibaba.com
Message-ID: <3d871bd0-dab5-c9ca-61b9-6aa137fa9fdf@linux.alibaba.com>
Date: Fri, 10 Sep 2021 10:14:28 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 2021/9/8 8:35 PM, Greg KH wrote:
> Why are kernfs changes needed for this? kernfs creation is not
> necessarily supposed to be "fast", what benchmark needs this type of
> change to require the addition of this complexity?

The implementation of the cgroup pool should not need to touch kernfs
at all. During development, however, I found that when there is
background CPU load, a process waiting for the mutex takes a very long
time to go from being woken up to actually starting execution.

Creating 400 cgroups concurrently takes about 80ms with no background
CPU load, but about 700ms when CPU usage is 40%. Reducing
sched_wakeup_granularity_ns also reduces the time, and replacing the
mutex with a spinlock improves the situation considerably. So to solve
this problem, a mutex should not be used here.

The cgroup pool relies on kernfs_rename, which uses kernfs_mutex, so I
need to bypass kernfs_mutex, and I added a pinned flag for that purpose.
Because the locking of kernfs_rename has been changed, kernfs creation
and deletion have been changed accordingly to keep the data consistent.

I admit that this is not a very elegant design, but I don't know how to
make it better, so I am raising the problem here to seek help from the
community.

Thanks,
Yi Tao
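
A minimal user-space sketch of how the concurrent-creation timing described above could be measured. It is not the benchmark behind the 80ms and 700ms figures. The function name `time_concurrent_mkdir` and the directory names are hypothetical. It relies only on the fact that `mkdir` inside a cgroup hierarchy creates a cgroup, and it assumes the caller passes a writable cgroup directory as `base_dir`. The example run uses a temporary directory so the script works anywhere. Python thread overhead will also show up in the result, so the numbers are only indicative.

```python
import os
import tempfile
import threading
import time


def time_concurrent_mkdir(base_dir, count=400):
    """Create `count` directories under base_dir, one per thread.

    Returns the wall-clock seconds from the moment the main thread
    reaches the start barrier until every mkdir has returned.
    """
    # count workers plus the main thread all meet at the barrier
    barrier = threading.Barrier(count + 1)
    errors = []

    def worker(index):
        barrier.wait()  # line the threads up so the mkdirs contend
        try:
            os.mkdir(os.path.join(base_dir, f"pool-test-{index}"))
        except OSError as exc:
            errors.append(exc)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()

    start = time.perf_counter()
    barrier.wait()  # releases all workers once everyone has arrived
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start

    if errors:
        raise errors[0]
    return elapsed


if __name__ == "__main__":
    # Assumption: on a real system base_dir would be a directory inside a
    # mounted cgroup hierarchy; a temporary directory stands in for it here.
    with tempfile.TemporaryDirectory() as base:
        print(f"{time_concurrent_mkdir(base) * 1000:.1f} ms")
```

To compare the two cases from the mail, the same script would be run once on an idle machine and once with a separate CPU-bound load running in the background. Directories created in a real cgroup hierarchy have to be removed with `rmdir` afterwards.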