Date: Tue, 8 Nov 2022
 16:38:31 +0100
From: Greg Kroah-Hartman
To: Srikar Dronamraju
Cc: Vishal Chourasia, Peter Zijlstra, linux-kernel@vger.kernel.org,
 mingo@redhat.com, vincent.guittot@linaro.org, vschneid@redhat.com,
 sshegde@linux.ibm.com, linuxppc-dev@lists.ozlabs.org,
 ritesh.list@gmail.com, aneesh.kumar@linux.ibm.com
Subject: Re: sched/debug: CPU hotplug operation suffers in a large cpu systems
References: <20221108145100.GG145013@linux.vnet.ibm.com>
In-Reply-To: <20221108145100.GG145013@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 08, 2022 at 08:21:00PM +0530, Srikar Dronamraju wrote:
> * Greg Kroah-Hartman [2022-11-08 13:24:39]:
>
> > On Tue, Nov 08, 2022 at 03:30:46PM +0530, Vishal Chourasia wrote:
>
> Hi Greg,
>
> > >
> > > Thanks Greg & Peter for your direction.
> > >
> > > While we pursue the idea of having debugfs based on kernfs, we thought about
> > > having a boot time parameter which would disable creating and updating of the
> > > sched_domain debugfs files and this would also be useful even when the kernfs
> > > solution kicks in, as users who may not care about these debugfs files would
> > > benefit from a faster CPU hotplug operation.
> >
> > Ick, no, you would be adding a new user/kernel api that you will be
> > required to support for the next 20+ years. Just to get over a
> > short-term issue before you solve the problem properly.
> >
> > If you really do not want these debugfs files, just disable debugfs from
> > your system. That should be a better short-term solution, right?
> >
> > Or better yet, disable SCHED_DEBUG, why can't you do that?
>
> Thanks a lot for your quick inputs.
>
> CONFIG_SCHED_DEBUG disables a lot more stuff than just updation of debugfs
> files. Information like /sys/kernel/debug/sched/debug and system-wide and
> per process wide information would be lost when that config is disabled.
>
> Most users would still be using distribution kernels and most distribution
> kernels that I know of seem to have CONFIG_SCHED_DEBUG enabled.

Then work with the distros to remove that option if it doesn't do well
on very large systems. Odds are they really do not want that enabled
either, but that's not our issue, that's theirs :)

> In a large system, lets say close to 2000 CPUs and we are offlining around
> 1750 CPUs. For example ppc64_cpu --smt=1 on a powerpc. Even if we move to a
> lesser overhead kernfs based implementation, we would still be creating
> files and deleting files for every CPU offline. Most users may not even be
> aware of these files. However for a few users who may be using these files
> once a while, we end up creating and deleting these files for all users. The
> overhead increases exponentially with the number of CPUs. I would assume the
> max number of CPUs are going to increase in future further.

I understand the issue, you don't have to explain it again. The
scheduler developers like to see these files, and for them it's useful.
Perhaps for distros that is not a useful thing to have around, that's up
to them.

> Hence our approach was to reduce the overhead for those users who are sure
> they don't depend on these files. We still keep the creating of the files as
> the default approach so that others who depend on it are not going to be
> impacted.

No, you are adding a new user/kernel api to the kernel that you then
have to support for the next 20+ years because you haven't fixed the
real issue here.

I think you could have done the kernfs conversion already, it shouldn't
be that complex, right?

Note, when you do it, you might want to move away from returning a raw
dentry from debugfs calls, and instead use an opaque type "debugfs_file"
or something like that, which might make this easier over time.

thanks,

greg k-h