Date: Thu, 6 Jul 2023 06:20:45 +0000
Message-ID: <20230706062045.xwmwns7cm4fxd7iu@google.com>
Subject: Re: Expensive memory.stat + cpu.stat reads
From: Shakeel Butt
To: Ivan Babrou
Cc: cgroups@vger.kernel.org, Linux MM, kernel-team, Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song, Andrew Morton, linux-kernel

On Fri, Jun 30, 2023 at 04:22:28PM -0700, Ivan Babrou wrote:
> Hello,
>
> We're seeing CPU load issues with cgroup stats retrieval.
> I made a public gist with all the details, including the repro code
> (which unfortunately requires heavily loaded hardware) and some
> flamegraphs:
>
> * https://gist.github.com/bobrik/5ba58fb75a48620a1965026ad30a0a13
>
> I'll repeat the gist of that gist here. Our repro has the following
> output after a warm-up run:
>
> completed: 5.17s [manual / mem-stat + cpu-stat]
> completed: 5.59s [manual / cpu-stat + mem-stat]
> completed: 0.52s [manual / mem-stat]
> completed: 0.04s [manual / cpu-stat]
>
> The first two lines do effectively the following:
>
> for _ in $(seq 1 1000); do cat /sys/fs/cgroup/system.slice/memory.stat /sys/fs/cgroup/system.slice/cpu.stat > /dev/null; done
>
> The latter two are the same thing, but via two loops:
>
> for _ in $(seq 1 1000); do cat /sys/fs/cgroup/system.slice/cpu.stat > /dev/null; done
> for _ in $(seq 1 1000); do cat /sys/fs/cgroup/system.slice/memory.stat > /dev/null; done
>
> As you might've noticed from the output, splitting the loop into two
> makes the code run 10x faster. This isn't great, because most
> monitoring software likes to get all stats for one service before
> reading the stats for the next one, which maps to the slow and
> expensive way of doing this.
>
> We're running Linux v6.1 (the output is from v6.1.25) with no patches
> that touch the cgroup or mm subsystems, so you can assume vanilla
> kernel.
>
> From the flamegraph it just looks like rstat flushing takes longer. I
> used the following flags on an AMD EPYC 7642 system (our usual pick
> cpu-clock was blaming spinlock irqrestore, which was questionable):
>
> perf record -e cycles -g --call-graph fp -F 999 -- /tmp/repro
>
> Naturally, there are two questions that arise:
>
> * Is this expected (I guess not, but good to be sure)?
> * What can we do to make this better?
>
> I am happy to try out patches or to do some tracing to help understand
> this better.

Hi Ivan,

Thanks a lot, as always, for reporting this. This is not expected and
should be fixed.
Is the issue easy to repro, or is some specific workload or high
load/traffic required? Can you repro this with the latest Linus tree?
Also, do you see any difference in root's cgroup.stat between where this
issue happens and the good state?

BTW I am away for the next month with very limited connectivity, so
expect slow responses.

thanks,
Shakeel
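
For the cgroup.stat comparison asked about above, a minimal sketch of one way to diff two snapshots might look like the following. It assumes cgroup v2 mounted at /sys/fs/cgroup and relies on cgroup.stat being a flat-keyed file (one "key value" pair per line, e.g. nr_descendants and nr_dying_descendants). The helper names parse_flat_keyed, diff_stats and read_cgroup_stat are made up for illustration and are not part of any existing tool.

```python
from pathlib import Path


def parse_flat_keyed(text):
    """Parse a flat-keyed cgroup file ("key value" per line) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        key, value = parts
        try:
            stats[key] = int(value)
        except ValueError:
            continue  # skip non-integer values rather than guess at their meaning
    return stats


def diff_stats(good, bad):
    """Return {key: (good_value, bad_value)} for every key whose value differs."""
    keys = sorted(set(good) | set(bad))
    return {k: (good.get(k), bad.get(k)) for k in keys if good.get(k) != bad.get(k)}


def read_cgroup_stat(path="/sys/fs/cgroup/cgroup.stat"):
    """Read and parse root's cgroup.stat; the default path is an assumption."""
    return parse_flat_keyed(Path(path).read_text())
```

One possible way to use it: save the output of read_cgroup_stat() on a host in the good state and on a host showing the slow reads, then pass both dicts to diff_stats() and post whatever differs.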