From: Yang Shi <yang.shi@linux.alibaba.com>
To: mhocko@suse.com, hannes@cmpxchg.org, akpm@linux-foundation.org
Cc: yang.shi@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [RFC v2 PATCH] mm: vmscan: do not iterate all mem cgroups for global direct reclaim
Date: Wed, 30 Jan 2019 06:11:17 +0800
Message-Id: <1548799877-10949-1-git-send-email-yang.shi@linux.alibaba.com>

In the current implementation, both kswapd
and direct reclaim have to iterate all mem cgroups. This was not a problem
before offline mem cgroups could be iterated, but now, with offline mem
cgroups included in the iteration, it can be very time consuming. In our
workloads we saw over 400K mem cgroups accumulated in some cases, while only
a few hundred of them were online memcgs. Although kswapd could help reduce
the number of memcgs, direct reclaim still gets hit iterating a large number
of offline memcgs in some cases, and we occasionally experienced
responsiveness problems because of it.

A simple test with perf shows it may take around 220ms to iterate 8K memcgs
in direct reclaim:

    dd 13873 [011] 578.542919: vmscan:mm_vmscan_direct_reclaim_begin
    dd 13873 [011] 578.758689: vmscan:mm_vmscan_direct_reclaim_end

So for 400K memcgs it may take around 11 seconds to iterate them all.

Just break the iteration once it reclaims enough pages, as memcg direct
reclaim already does. This may hurt fairness among memcgs, but the cached
iterator cookie helps to maintain fairness more or less.

Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
v2: Added some test data in the commit log
    Updated commit log to note the iterator cookie could maintain fairness
    Dropped !global_reclaim() since !current_is_kswapd() is good enough

 mm/vmscan.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index a714c4f..5e35796 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2764,16 +2764,15 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 				   sc->nr_reclaimed - reclaimed);
 
 			/*
-			 * Direct reclaim and kswapd have to scan all memory
-			 * cgroups to fulfill the overall scan target for the
-			 * node.
+			 * Kswapd has to scan all memory cgroups to fulfill
+			 * the overall scan target for the node.
 			 *
 			 * Limit reclaim, on the other hand, only cares about
 			 * nr_to_reclaim pages to be reclaimed and it will
 			 * retry with decreasing priority if one round over the
 			 * whole hierarchy is not sufficient.
 			 */
-			if (!global_reclaim(sc) &&
+			if (!current_is_kswapd() &&
 			    sc->nr_reclaimed >= sc->nr_to_reclaim) {
 				mem_cgroup_iter_break(root, memcg);
 				break;
--
1.8.3.1
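
[Editor's illustration, not part of the patch.] The fairness argument above
(break early, but let the cached iterator cookie spread reclaim across memcgs
over successive passes) can be sketched with a small userspace C model. All
names and numbers below are made up for illustration; this is not the kernel's
mem_cgroup_iter() implementation.

/*
 * Simplified userspace model: direct reclaim breaks out of the per-node
 * memcg walk once nr_to_reclaim is met, and a cached cursor (standing in
 * for the mem_cgroup_iter() cookie) makes the next pass resume where the
 * previous one stopped, keeping the walk roughly fair over repeated passes.
 */
#include <stdio.h>

#define NR_MEMCGS 8		/* stand-in for a node's memcg hierarchy */

static int cursor;		/* models the cached iterator cookie */

/* Pretend every memcg gives back a fixed number of pages. */
static int reclaim_from_memcg(int memcg)
{
	printf("  reclaimed from memcg %d\n", memcg);
	return 32;
}

static int reclaim_pass(int nr_to_reclaim)
{
	int nr_reclaimed = 0;
	int visited;

	for (visited = 0; visited < NR_MEMCGS; visited++) {
		nr_reclaimed += reclaim_from_memcg(cursor);
		cursor = (cursor + 1) % NR_MEMCGS;

		/* The early break this patch adds for non-kswapd reclaim. */
		if (nr_reclaimed >= nr_to_reclaim)
			break;
	}
	return nr_reclaimed;
}

int main(void)
{
	int pass;

	/* Four direct-reclaim passes, each needing only 64 pages. */
	for (pass = 0; pass < 4; pass++) {
		printf("pass %d:\n", pass);
		reclaim_pass(64);
	}
	/* Across the four passes every memcg gets visited exactly once. */
	return 0;
}

Each pass stops after two memcgs (64 / 32), but because the cursor persists,
the four passes together cover all eight memcgs instead of hammering the
first two every time.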