From: Shakeel Butt
Date: Fri, 12 Apr 2019 13:38:07 -0700
Subject: Re: [PATCH 3/4] mm: memcontrol: fix recursive statistics correctness & scalabilty
To: Johannes Weiner
Cc: Andrew Morton, Linux MM, Cgroups, LKML, kernel-team@fb.com, Roman Gushchin
References: <20190412151507.2769-1-hannes@cmpxchg.org> <20190412151507.2769-4-hannes@cmpxchg.org> <20190412201004.GA27187@cmpxchg.org>
In-Reply-To: <20190412201004.GA27187@cmpxchg.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Apr 12, 2019 at 1:10 PM Johannes Weiner wrote:
>
> On Fri, Apr 12, 2019 at 12:55:10PM -0700, Shakeel Butt wrote:
> > We also faced this exact same issue as well and had the similar solution.
> >
> > > Signed-off-by: Johannes Weiner
> >
> > Reviewed-by: Shakeel Butt
>
> Thanks for the review!
>
> > (Unrelated to this patchset) I think there should also a way to get
> > the exact memcg stats. As the machines are getting bigger (more cpus
> > and larger basic page size) the accuracy of stats are getting worse.
> > Internally we have an additional interface memory.stat_exact for that.
> > However I am not sure in the upstream kernel will an additional
> > interface is better or something like /proc/sys/vm/stat_refresh which
> > sync all per-cpu stats.
>
> I was talking to Roman about this earlier as well and he mentioned it
> would be nice to have periodic flushing of the per-cpu caches. The
> global vmstat has something similar. We might be able to hook into
> those workers, but it would likely require some smarts so we don't
> walk the entire cgroup tree every couple of seconds.
>
> We haven't had any actual problems with the per-cpu fuzziness, mainly
> because the cgroups of interest also grow in size as the machines get
> bigger, and so the relative error doesn't increase.
>

Yes, this is very machine-size dependent. We see this issue more often
on larger machines.

> Are your requirements that the error dissipates over time (waiting for
> a threshold convergence somewhere?) or do you have automation that
> gets decisions wrong due to the error at any given point in time?

Not sure about the first one, but we do have the second case. The node
controller makes decisions online based on the stats. We also
periodically collect and store stats for all jobs across the fleet.
This data is processed offline and is used in a lot of ways. The
inaccuracy in the stats does affect all of that analysis, particularly
for small jobs.
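
To make the per-cpu fuzziness discussed above concrete, here is a minimal toy model in Python. It is only a sketch, not the kernel's implementation: the class PerCpuCounter, the helper worst_case_drift, and the batch value of 32 are all hypothetical names and numbers chosen for illustration. It assumes each CPU accumulates a local delta and folds it into the shared counter only once the delta exceeds a batch threshold, so a cheap read can lag the exact value by up to nr_cpus * batch.

```python
class PerCpuCounter:
    """Toy model of a batched per-CPU counter (illustrative, not kernel code)."""

    def __init__(self, nr_cpus: int, batch: int) -> None:
        self.batch = batch
        self.shared = 0               # value returned by a cheap read
        self.local = [0] * nr_cpus    # per-CPU deltas not yet folded in

    def add(self, cpu: int, delta: int) -> None:
        self.local[cpu] += delta
        # Fold into the shared counter only once the local delta
        # exceeds the batch threshold.
        if abs(self.local[cpu]) > self.batch:
            self.shared += self.local[cpu]
            self.local[cpu] = 0

    def read_fast(self) -> int:
        """Cheap read: ignores whatever is still cached per CPU."""
        return self.shared

    def read_exact(self) -> int:
        """Exact read: also sums every per-CPU delta."""
        return self.shared + sum(self.local)

    def flush(self) -> None:
        """Fold all per-CPU deltas into the shared counter."""
        self.shared += sum(self.local)
        self.local = [0] * len(self.local)


def worst_case_drift(nr_cpus: int, batch: int) -> int:
    """Gap between exact and fast reads when every CPU sits at the threshold."""
    counter = PerCpuCounter(nr_cpus, batch)
    for cpu in range(nr_cpus):
        counter.add(cpu, batch)  # exactly `batch`, so never folded in
    return counter.read_exact() - counter.read_fast()


if __name__ == "__main__":
    for cpus in (4, 64):
        print(cpus, "cpus ->", worst_case_drift(cpus, 32), "units of drift")
```

In this model the worst-case drift is 128 units with 4 CPUs and 2048 units with 64 CPUs at the same batch size, which is why the absolute error grows with machine size and matters most for small jobs. A periodic flush() or an explicit read_exact() corresponds roughly to the two options raised in the thread: background flushing versus a separate exact interface.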