From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Andrey Ryabinin, Shakeel Butt, Michal Hocko, Mel Gorman, Tejun Heo, Johannes Weiner, Andrew Morton, Linus Torvalds
Subject: [PATCH 4.15 062/105] mm/vmscan: wake up flushers for legacy cgroups too
Date: Tue, 27 Mar 2018 18:27:42 +0200
Message-Id: <20180327162801.438845896@linuxfoundation.org>
In-Reply-To: <20180327162757.813009222@linuxfoundation.org>
References: <20180327162757.813009222@linuxfoundation.org>

4.15-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andrey Ryabinin

commit 1c610d5f93c709df56787f50b3576704ac271826 upstream.

Commit 726d061fbd36 ("mm: vmscan: kick flushers when we encounter dirty
pages on the LRU") added flusher invocation to shrink_inactive_list()
when many dirty pages on the LRU are encountered.

However, shrink_inactive_list() doesn't wake up flushers for legacy
cgroup reclaim, so the next commit bbef938429f5 ("mm: vmscan: remove old
flusher wakeup from direct reclaim path") removed the only source of
flusher wakeups in the legacy mem cgroup reclaim path.

This leads to premature OOM if there are too many dirty pages in the cgroup:

    # mkdir /sys/fs/cgroup/memory/test
    # echo $$ > /sys/fs/cgroup/memory/test/tasks
    # echo 50M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
    # dd if=/dev/zero of=tmp_file bs=1M count=100
    Killed

    dd invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=0

    Call Trace:
     dump_stack+0x46/0x65
     dump_header+0x6b/0x2ac
     oom_kill_process+0x21c/0x4a0
     out_of_memory+0x2a5/0x4b0
     mem_cgroup_out_of_memory+0x3b/0x60
     mem_cgroup_oom_synchronize+0x2ed/0x330
     pagefault_out_of_memory+0x24/0x54
     __do_page_fault+0x521/0x540
     page_fault+0x45/0x50

    Task in /test killed as a result of limit of /test
    memory: usage 51200kB, limit 51200kB, failcnt 73
    memory+swap: usage 51200kB, limit 9007199254740988kB, failcnt 0
    kmem: usage 296kB, limit 9007199254740988kB, failcnt 0
    Memory cgroup stats for /test: cache:49632KB rss:1056KB rss_huge:0KB shmem:0KB
            mapped_file:0KB dirty:49500KB writeback:0KB swap:0KB inactive_anon:0KB
            active_anon:1168KB inactive_file:24760KB active_file:24960KB unevictable:0KB
    Memory cgroup out of memory: Kill process 3861 (bash) score 88 or sacrifice child
    Killed process 3876 (dd) total-vm:8484kB, anon-rss:1052kB, file-rss:1720kB, shmem-rss:0kB
    oom_reaper: reaped process 3876 (dd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

Wake up flushers in legacy cgroup reclaim too.

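As a rough illustration of the behaviour change described above, the wake-up decision can be modelled in userspace C. This is only a sketch, not kernel code: `struct scan_result`, the `legacy_memcg` flag, and both function names are hypothetical stand-ins. It assumes that, before the fix, the wake-up was reachable only when reclaim was not legacy memcg reclaim.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the counters the patch compares;
 * this is not the kernel's own structure. */
struct scan_result {
	unsigned long nr_taken;          /* pages isolated from the LRU */
	unsigned long nr_unqueued_dirty; /* dirty pages not yet queued for IO */
};

/* Model of the old behaviour (assumption): legacy memcg reclaim
 * never reached the flusher wake-up. */
static bool wakes_flushers_before(const struct scan_result *s, bool legacy_memcg)
{
	return !legacy_memcg && s->nr_unqueued_dirty == s->nr_taken;
}

/* Model of the new behaviour: the check no longer depends on the
 * kind of reclaim, so legacy memcg reclaim wakes flushers too. */
static bool wakes_flushers_after(const struct scan_result *s, bool legacy_memcg)
{
	(void)legacy_memcg; /* deliberately ignored after the fix */
	return s->nr_unqueued_dirty == s->nr_taken;
}
```

In this model, a legacy memcg scan in which every isolated page is dirty and unqueued, which is roughly the situation the dd reproducer creates, yields no wake-up from `wakes_flushers_before()` but does yield one from `wakes_flushers_after()`.
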
Link: http://lkml.kernel.org/r/20180315164553.17856-1-aryabinin@virtuozzo.com
Fixes: bbef938429f5 ("mm: vmscan: remove old flusher wakeup from direct reclaim path")
Signed-off-by: Andrey Ryabinin
Tested-by: Shakeel Butt
Acked-by: Michal Hocko
Cc: Mel Gorman
Cc: Tejun Heo
Cc: Johannes Weiner
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

---
 mm/vmscan.c |   31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1847,6 +1847,20 @@ shrink_inactive_list(unsigned long nr_to
 		set_bit(PGDAT_WRITEBACK, &pgdat->flags);
 
 	/*
+	 * If dirty pages are scanned that are not queued for IO, it
+	 * implies that flushers are not doing their job. This can
+	 * happen when memory pressure pushes dirty pages to the end of
+	 * the LRU before the dirty limits are breached and the dirty
+	 * data has expired. It can also happen when the proportion of
+	 * dirty pages grows not through writes but through memory
+	 * pressure reclaiming all the clean cache. And in some cases,
+	 * the flushers simply cannot keep up with the allocation
+	 * rate. Nudge the flusher threads in case they are asleep.
+	 */
+	if (stat.nr_unqueued_dirty == nr_taken)
+		wakeup_flusher_threads(WB_REASON_VMSCAN);
+
+	/*
 	 * Legacy memcg will stall in page writeback so avoid forcibly
 	 * stalling here.
 	 */
@@ -1858,22 +1872,9 @@ shrink_inactive_list(unsigned long nr_to
 		if (stat.nr_dirty && stat.nr_dirty == stat.nr_congested)
 			set_bit(PGDAT_CONGESTED, &pgdat->flags);
 
-		/*
-		 * If dirty pages are scanned that are not queued for IO, it
-		 * implies that flushers are not doing their job. This can
-		 * happen when memory pressure pushes dirty pages to the end of
-		 * the LRU before the dirty limits are breached and the dirty
-		 * data has expired. It can also happen when the proportion of
-		 * dirty pages grows not through writes but through memory
-		 * pressure reclaiming all the clean cache. And in some cases,
-		 * the flushers simply cannot keep up with the allocation
-		 * rate. Nudge the flusher threads in case they are asleep, but
-		 * also allow kswapd to start writing pages during reclaim.
-		 */
-		if (stat.nr_unqueued_dirty == nr_taken) {
-			wakeup_flusher_threads(WB_REASON_VMSCAN);
+		/* Allow kswapd to start writing pages during reclaim. */
+		if (stat.nr_unqueued_dirty == nr_taken)
 			set_bit(PGDAT_DIRTY, &pgdat->flags);
-		}
 
 		/*
 		 * If kswapd scans pages marked marked for immediate