From: Shakeel Butt
Date: Fri, 12 Apr 2019 13:07:17 -0700
Subject: Re: [PATCH 0/4] mm: memcontrol: memory.stat cost & correctness
To: Johannes Weiner
Cc: Andrew Morton, Linux MM, Cgroups, LKML, kernel-team@fb.com
In-Reply-To: <20190412151507.2769-1-hannes@cmpxchg.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Apr 12, 2019 at 8:15 AM Johannes Weiner wrote:
>
> The cgroup memory.stat file holds recursive statistics for the entire
> subtree. The current implementation does this tree walk on-demand
> whenever the file is read. This is giving us problems in production.
>
> 1. The cost of aggregating the statistics on-demand is high. A lot of
> system service cgroups are mostly idle and their stats don't change
> between reads, yet we always have to check them. There are also always
> some lazily-dying cgroups sitting around that are pinned by a handful
> of remaining page cache; the same applies to them.
>
> In an application that periodically monitors memory.stat in our fleet,
> we have seen the aggregation consume up to 5% CPU time.
>
> 2. When cgroups die and disappear from the cgroup tree, so do their
> accumulated vm events.
> The result is that the event counters at
> higher-level cgroups can go backwards and confuse some of our
> automation, let alone people looking at the graphs over time.
>
> To address both issues, this patch series changes the stat
> implementation to spill counts upwards when the counters change.
>
> The upward spilling is batched using the existing per-cpu cache. In a
> sparse file stress test with 5 level cgroup nesting, the additional
> cost of the flushing was negligible (a little under 1% of CPU at 100%
> CPU utilization, compared to the 5% of reading memory.stat during
> regular operation).

For whole series:

Reviewed-by: Shakeel Butt

>
>  include/linux/memcontrol.h |  96 +++++++-------
>  mm/memcontrol.c            | 290 +++++++++++++++++++++++++++----------------
>  mm/vmscan.c                |   4 +-
>  mm/workingset.c            |   7 +-
>  4 files changed, 234 insertions(+), 163 deletions(-)
>
>
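
For readers following the thread, the batched upward spilling described
in the quoted cover letter can be modeled in user space roughly as below.
This is only a sketch of the idea, not the code from the series: the
Group class, the BATCH threshold, and the method names are all
hypothetical, and the per-cpu cache is reduced to a plain dictionary
keyed by CPU id.

```python
# User-space model of batched upward spilling; NOT kernel code.
# Group, BATCH, mod_stat and read_stat are hypothetical names.

BATCH = 32  # assumed flush threshold; the cover letter does not state one


class Group:
    def __init__(self, parent=None):
        self.parent = parent
        self.total = 0     # recursive count: own events plus flushed descendants
        self.pending = {}  # cpu id -> delta not yet spilled upward

    def mod_stat(self, cpu, delta):
        """Account delta in the per-CPU slot; spill to self and all
        ancestors once the accumulated delta exceeds BATCH."""
        x = self.pending.get(cpu, 0) + delta
        if abs(x) > BATCH:
            node = self
            while node is not None:
                node.total += x
                node = node.parent
            x = 0
        self.pending[cpu] = x

    def read_stat(self):
        """Reading needs no subtree walk; the value may lag by the
        deltas still pending in per-CPU slots."""
        return self.total


root = Group()
child = Group(parent=root)
leaf = Group(parent=child)

# 100 single-event updates on the leaf, all from CPU 0.
for _ in range(100):
    leaf.mod_stat(cpu=0, delta=1)
```

In this model the root reads 99 after the loop: three flushes of 33
events each, with one event still pending in the leaf's slot for CPU 0.
Because flushed counts already live in every ancestor, dropping the leaf
afterwards does not make the ancestors' totals go backwards, which is
the property the cover letter is after.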