Received: by 2002:a05:6a10:22f:0:0:0:0 with SMTP id 15csp481591pxk; Wed, 9 Sep 2020 10:15:48 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwdhlUkLI6fL8KuLJhrRbCyL13CQpjgPrqKLUs9PMZ51wEPpJxxOYrqGnRqiqWVmGzEdCgp X-Received: by 2002:a50:ce09:: with SMTP id y9mr4984792edi.91.1599671748711; Wed, 09 Sep 2020 10:15:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1599671748; cv=none; d=google.com; s=arc-20160816; b=cBeCrrePxPHUPmQOCXPwfRjf7Ihg7xVMg23JtCBCFJ6BISbMCCzxeGYP2VoEJyOWRy NHs5u3iNRCmUuG1hjcxYBBY8QFkAe0n9yVRM/kkSb3Yul8xuiygDQaUIqtlfeYZDbeAz QV9OSgmXWIFQjknPK7XWX0GU6D5Iys53X/oWZNZojxMJoLuB5dBazVjq0bC9Gr4Gk/TW dESsLscwzVi4EnfXy6TOPu/43iMNj7x57/UwYIdHqgnruqibIgJLTmQ9rkmf9W67zAdl NRBNisvJh73pWY9ker8zej9T0a5sGQINTRtox+8cZRYEG3e9UNGLeM7AK9ezQm8CP5y5 E/XQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=nRSn5ntJRj7Oony3O2CA+XH1U2imtO9TR4JjB0ilhXo=; b=KtcNOksY2B0XvZUZhCxWYRnWwowd1+0iRoLve02Sl/yzKWLjxw7eLLJUfGh+biJyrK seccCAK6rDLgJKqQjgSMbY743P0gV0jn+WQWfR3zB9BLG5QliQQo418grMXnohRdQqDB uCMsCUBUrQaZJj/ZzIPzL2yXFF7w5oanooU9dKPx6DFE/AyqpKj/xp+FtCZH0lnK/wdy B5HqOOG3vbSi2ji8vAhmvUr35FBUg7k6SAA5hExgoFxjMTF9xCpsQiQUykCGKkp7Jifa 4oaouQZCGWqbMaMZ0qX9iisbmOfvlIV90xBSI5Fa3I8Z4ioSbSmwI0sBjKNi2imL8YBG PVIQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bytedance-com.20150623.gappssmtp.com header.s=20150623 header.b=cj5wRTpK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=bytedance.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id h23si1711695ejx.721.2020.09.09.10.15.25; Wed, 09 Sep 2020 10:15:48 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@bytedance-com.20150623.gappssmtp.com header.s=20150623 header.b=cj5wRTpK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=bytedance.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730245AbgIIROJ (ORCPT + 99 others); Wed, 9 Sep 2020 13:14:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47194 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730256AbgIIPai (ORCPT ); Wed, 9 Sep 2020 11:30:38 -0400 Received: from mail-pg1-x543.google.com (mail-pg1-x543.google.com [IPv6:2607:f8b0:4864:20::543]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5F846C0619C7 for ; Wed, 9 Sep 2020 08:20:58 -0700 (PDT) Received: by mail-pg1-x543.google.com with SMTP id l191so2302844pgd.5 for ; Wed, 09 Sep 2020 08:20:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=nRSn5ntJRj7Oony3O2CA+XH1U2imtO9TR4JjB0ilhXo=; b=cj5wRTpKPlMV7QtPieSvF+gBLFmGNSf4JanrLhXcm1B5OrnUUj1J+zxeg1fyofhtf2 3vBYnuNYVg64VaSiT3R4PNhx9T0xwRvhs40W/XjP+BU96SRg2S4JnuWl6lI1+Qdw8iEX AUEVbabpxWiwbvu1/lJ0lWGjPFOvXxMMb9/1R+HPWwajs3QCss1MktD4Y9oTy7B8/mpA UUq/uKY2NL58Qg9LHJAUgapV3HJnDZw111zdVw8Bzfk1tPbjzlco3BK3i3UtY27Y9qgd DU0MtSjR3ygVRfl7kWJ1iFcl9XC+s9WUR8B232PzYVeIlxC1ysT9H7Re6kDlZYSGP61g u61w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=nRSn5ntJRj7Oony3O2CA+XH1U2imtO9TR4JjB0ilhXo=; b=hoLa6Y/wXU2bIxeTJCirG4SSoF2cuVlRJPuQGFW8X9M5165xeBqDI5hMfOdZy9OklO 8hNOPWUACRUcbte+9+2VOF6LZyfhsM7WcCMZepAFZs4p1+3F5qmbn/8QjITANfQZKeX5 5MlKhL56CgRxUY4LUj0OpNNtV9a3NXY60wYtrUQBrVfBIQ6x5yQjF+lICenNmzfS5x3P i52x074p5UZF/ZNhqsxNyu9Dcrp3gVEcTyw5p0q5U+Vo8Z4lkDmGm+fMfI+aLValcDuy bkdaHIaj/FRFynVwYpdRKQW6bV7KQNcLL7KwjAqunpIs8sOambzG/l59ykUcMLElr+K5 yjKg== X-Gm-Message-State: AOAM531ZCmM7ehmzqv3D4iZ0w0WfbAn+unJwlfKxcLbNlhe+BA3t4h1y vcqw0stUSUs77e42DgeB51tjKg== X-Received: by 2002:a63:af01:: with SMTP id w1mr1012742pge.23.1599664855461; Wed, 09 Sep 2020 08:20:55 -0700 (PDT) Received: from Zs-MacBook-Pro.local.net ([103.136.221.66]) by smtp.gmail.com with ESMTPSA id kf10sm2160958pjb.2.2020.09.09.08.20.52 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 09 Sep 2020 08:20:54 -0700 (PDT) From: zangchunxin@bytedance.com To: akpm@linux-foundation.org Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Chunxin Zang , Muchun Song Subject: [PATCH v2] mm/vmscan: fix infinite loop in drop_slab_node Date: Wed, 9 Sep 2020 23:20:47 +0800 Message-Id: <20200909152047.27905-1-zangchunxin@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Chunxin Zang On our server, there are about 10k memcg in one machine. They use memory very frequently. When I tigger drop caches,the process will infinite loop in drop_slab_node. There are two reasons: 1.We have too many memcgs, even though one object freed in one memcg, the sum of object is bigger than 10. 2.We spend a lot of time in traverse memcg once. So, the memcg who traversed at the first have been freed many objects. Traverse memcg next time, the freed count bigger than 10 again. We can get the following info through 'ps': root:~# ps -aux | grep drop root 357956 ... R Aug25 21119854:55 echo 3 > /proc/sys/vm/drop_caches root 1771385 ... R Aug16 21146421:17 echo 3 > /proc/sys/vm/drop_caches root 1986319 ... R 18:56 117:27 echo 3 > /proc/sys/vm/drop_caches root 2002148 ... R Aug24 5720:39 echo 3 > /proc/sys/vm/drop_caches root 2564666 ... R 18:59 113:58 echo 3 > /proc/sys/vm/drop_caches root 2639347 ... R Sep03 2383:39 echo 3 > /proc/sys/vm/drop_caches root 3904747 ... R 03:35 993:31 echo 3 > /proc/sys/vm/drop_caches root 4016780 ... R Aug21 7882:18 echo 3 > /proc/sys/vm/drop_caches Use bpftrace follow 'freed' value in drop_slab_node: root:~# bpftrace -e 'kprobe:drop_slab_node+70 {@ret=hist(reg("bp")); }' Attaching 1 probe... ^B^C @ret: [64, 128) 1 | | [128, 256) 28 | | [256, 512) 107 |@ | [512, 1K) 298 |@@@ | [1K, 2K) 613 |@@@@@@@ | [2K, 4K) 4435 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [4K, 8K) 442 |@@@@@ | [8K, 16K) 299 |@@@ | [16K, 32K) 100 |@ | [32K, 64K) 139 |@ | [64K, 128K) 56 | | [128K, 256K) 26 | | [256K, 512K) 2 | | In the while loop, we can check whether the TASK_KILLABLE signal is set, if so, we should break the loop. Signed-off-by: Chunxin Zang Signed-off-by: Muchun Song --- changelogs in v2: 1) Via check TASK_KILLABLE signal break loop. mm/vmscan.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/mm/vmscan.c b/mm/vmscan.c index b6d84326bdf2..c3ed8b45d264 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -704,6 +704,9 @@ void drop_slab_node(int nid) do { struct mem_cgroup *memcg = NULL; + if (fatal_signal_pending(current)) + return; + freed = 0; memcg = mem_cgroup_iter(NULL, NULL, NULL); do { -- 2.11.0