Date: Tue, 13 Aug 2019 14:27:52 -0700
From: Andrew Morton
To: Roman Gushchin
Cc: Michal Hocko, Johannes Weiner
Subject: Re: [PATCH 1/2] mm: memcontrol: flush percpu vmstats before releasing memcg
Message-Id: <20190813142752.35807b6070db795674f86feb@linux-foundation.org>
In-Reply-To: <20190812222911.2364802-2-guro@fb.com>
References: <20190812222911.2364802-1-guro@fb.com> <20190812222911.2364802-2-guro@fb.com>

On Mon, 12 Aug 2019 15:29:10 -0700 Roman Gushchin wrote:

> Percpu caching of local vmstats with the conditional propagation
> by the cgroup tree leads to an accumulation of errors on non-leaf
> levels.
>
> Let's imagine two nested memory cgroups A and A/B. Say, a process
> belonging to A/B allocates 100 pagecache pages on the CPU 0.
> The percpu cache will spill 3 times, so that 32*3=96 pages will be
> accounted to A/B and A atomic vmstat counters, 4 pages will remain
> in the percpu cache.
>
> Imagine A/B is nearby memory.max, so that every following allocation
> triggers a direct reclaim on the local CPU. Say, each such attempt
> will free 16 pages on a new cpu. That means every percpu cache will
> have -16 pages, except the first one, which will have 4 - 16 = -12.
> A/B and A atomic counters will not be touched at all.
>
> Now a user removes A/B. All percpu caches are freed and corresponding
> vmstat numbers are forgotten. A has 96 pages more than expected.
>
> As memory cgroups are created and destroyed, errors do accumulate.
> Even 1-2 pages differences can accumulate into large numbers.
>
> To fix this issue let's accumulate and propagate percpu vmstat
> values before releasing the memory cgroup. At this point these
> numbers are stable and cannot be changed.
>
> Since on cpu hotplug we do flush percpu vmstats anyway, we can
> iterate only over online cpus.
>
> Fixes: 42a300353577 ("mm: memcontrol: fix recursive statistics correctness & scalabilty")

Is this not serious enough for a cc:stable?
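
For reference, the arithmetic in the quoted changelog can be reproduced with a small userspace model. This is only a sketch, not the kernel code: the Memcg class, mod_state(), release() and scenario() are hypothetical names, the spill condition (cached value reaching 32) is inferred from the "32*3=96" figure, and the final 4-page free on a seventh CPU is an assumption added so that every charged page ends up freed.

```python
BATCH = 32  # spill threshold implied by the changelog's "32*3=96" (assumption)


class Memcg:
    """Toy model of one memory cgroup's counter for a single vmstat item."""

    def __init__(self, parent=None, nr_cpus=7):
        self.parent = parent
        self.atomic = 0               # shared counter, includes descendants
        self.percpu = [0] * nr_cpus   # per-CPU deltas not yet propagated

    def _propagate(self, delta):
        # Add delta to this cgroup and to every ancestor.
        cg = self
        while cg is not None:
            cg.atomic += delta
            cg = cg.parent

    def mod_state(self, cpu, delta):
        # Cache the delta per CPU; spill once the cached value reaches BATCH.
        x = self.percpu[cpu] + delta
        if abs(x) >= BATCH:
            self._propagate(x)
            x = 0
        self.percpu[cpu] = x

    def release(self, flush):
        # Drop the per-CPU caches, optionally propagating them first.
        if flush:
            self._propagate(sum(self.percpu))
        self.percpu = [0] * len(self.percpu)


def scenario(flush):
    """Return A's counter after A/B charges 100 pages, frees them, and is removed."""
    a = Memcg()
    b = Memcg(parent=a)
    for _ in range(100):       # 100 single-page charges on CPU 0
        b.mod_state(0, 1)
    for cpu in range(6):       # reclaim 16 pages on each of CPUs 0..5
        b.mod_state(cpu, -16)
    b.mod_state(6, -4)         # assumed: the last 4 pages are freed on CPU 6
    b.release(flush)
    return a.atomic


if __name__ == "__main__":
    print("A without flush:", scenario(flush=False))
    print("A with flush:   ", scenario(flush=True))
```

In this model A is left with 96 stale pages when A/B's per-CPU values are dropped unflushed, and with 0 when they are summed and propagated before release, which matches the numbers in the changelog.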