From: Cong Wang
Date: Sun, 26 Jan 2020 11:53:55 -0800
Subject: Re: [PATCH] mm: avoid blocking lock_page() in kcompactd
To: Michal Hocko
Cc: LKML, Andrew Morton, linux-mm, Mel Gorman, Vlastimil Babka
In-Reply-To: <20200121090048.GG29276@dhcp22.suse.cz>
References: <20200109225646.22983-1-xiyou.wangcong@gmail.com> <20200110073822.GC29802@dhcp22.suse.cz> <20200121090048.GG29276@dhcp22.suse.cz>
List-ID: <linux-kernel.vger.kernel.org>

On Tue, Jan 21, 2020 at 1:00 AM Michal Hocko wrote:
>
> On Mon 20-01-20 14:48:05, Cong Wang wrote:
> > It got stuck somewhere along the call path of mem_cgroup_try_charge(),
> > and the trace events of mm_vmscan_lru_shrink_inactive() indicate this
> > too:
>
> So it seems that you are contending on the page lock. It is really
> unexpected that the reclaim would take that long though. Please try to
> enable more vmscan tracepoints to see where the time is spent.
Sorry for the delay. I have been trying to collect more data in one shot.
This is a typical round of the loop after enabling all vmscan tracepoints:

          <...>-455450 [007] .... 4048595.842992: mm_vmscan_memcg_reclaim_begin: order=0 may_writepage=1 gfp_flags=GFP_NOFS|__GFP_HIGHMEM|__GFP_HARDWALL|__GFP_MOVABLE classzone_idx=4
          <...>-455450 [007] d... 4048595.842995: mm_vmscan_lru_isolate: isolate_mode=0 classzone=3 order=0 nr_requested=1 nr_scanned=1 nr_skipped=0 nr_taken=1 lru=inactive_file
          <...>-455450 [007] .... 4048595.842995: mm_vmscan_lru_shrink_inactive: nid=0 nr_scanned=1 nr_reclaimed=0 nr_dirty=0 nr_writeback=0 nr_congested=0 nr_immediate=0 nr_activate=0 nr_ref_keep=0 nr_unmap_fail=0 priority=4 flags=RECLAIM_WB_FILE|RECLAIM_WB_ASYNC
          <...>-455450 [007] dN.. 4048595.842997: mm_vmscan_lru_isolate: isolate_mode=0 classzone=3 order=0 nr_requested=1 nr_scanned=1 nr_skipped=0 nr_taken=1 lru=inactive_file
          <...>-455450 [007] .... 4048595.843001: mm_vmscan_lru_shrink_inactive: nid=0 nr_scanned=1 nr_reclaimed=0 nr_dirty=0 nr_writeback=0 nr_congested=0 nr_immediate=0 nr_activate=0 nr_ref_keep=0 nr_unmap_fail=0 priority=3 flags=RECLAIM_WB_FILE|RECLAIM_WB_ASYNC
          <...>-455450 [007] d... 4048595.843002: mm_vmscan_lru_isolate: isolate_mode=0 classzone=3 order=0 nr_requested=5 nr_scanned=5 nr_skipped=0 nr_taken=5 lru=inactive_file
          <...>-455450 [007] .... 4048595.843004: mm_vmscan_lru_shrink_inactive: nid=0 nr_scanned=5 nr_reclaimed=0 nr_dirty=0 nr_writeback=0 nr_congested=0 nr_immediate=0 nr_activate=0 nr_ref_keep=0 nr_unmap_fail=0 priority=2 flags=RECLAIM_WB_FILE|RECLAIM_WB_ASYNC
          <...>-455450 [007] d... 4048595.843006: mm_vmscan_lru_isolate: isolate_mode=0 classzone=3 order=0 nr_requested=9 nr_scanned=9 nr_skipped=0 nr_taken=9 lru=inactive_file
          <...>-455450 [007] .... 4048595.843007: mm_vmscan_lru_shrink_inactive: nid=0 nr_scanned=9 nr_reclaimed=0 nr_dirty=0 nr_writeback=0 nr_congested=0 nr_immediate=0 nr_activate=0 nr_ref_keep=0 nr_unmap_fail=0 priority=1 flags=RECLAIM_WB_FILE|RECLAIM_WB_ASYNC
          <...>-455450 [007] d... 4048595.843009: mm_vmscan_lru_isolate: isolate_mode=0 classzone=3 order=0 nr_requested=17 nr_scanned=15 nr_skipped=0 nr_taken=15 lru=inactive_file
          <...>-455450 [007] .... 4048595.843011: mm_vmscan_lru_shrink_inactive: nid=0 nr_scanned=15 nr_reclaimed=0 nr_dirty=0 nr_writeback=0 nr_congested=0 nr_immediate=0 nr_activate=0 nr_ref_keep=0 nr_unmap_fail=0 priority=0 flags=RECLAIM_WB_FILE|RECLAIM_WB_ASYNC
          <...>-455450 [007] .... 4048595.843012: mm_vmscan_memcg_reclaim_end: nr_reclaimed=0

The whole trace output is huge (33M); I can provide it on request.

I suspect the process gets stuck in the retry loop in try_charge(), as the
_shortest_ stack trace among the perf samples indicated:

cycles:ppp:
        ffffffffa72963db mem_cgroup_iter
        ffffffffa72980ca mem_cgroup_oom_unlock
        ffffffffa7298c15 try_charge
        ffffffffa729a886 mem_cgroup_try_charge
        ffffffffa720ec03 __add_to_page_cache_locked
        ffffffffa720ee3a add_to_page_cache_lru
        ffffffffa7312ddb iomap_readpages_actor
        ffffffffa73133f7 iomap_apply
        ffffffffa73135da iomap_readpages
        ffffffffa722062e read_pages
        ffffffffa7220b3f __do_page_cache_readahead
        ffffffffa7210554 filemap_fault
        ffffffffc039e41f __xfs_filemap_fault
        ffffffffa724f5e7 __do_fault
        ffffffffa724c5f2 __handle_mm_fault
        ffffffffa724cbc6 handle_mm_fault
        ffffffffa70a313e __do_page_fault
        ffffffffa7a00dfe page_fault

But I don't see how that could happen: the only case that keeps the loop
going there is when mem_cgroup_oom() returns OOM_SUCCESS, yet I can't find
any clue in dmesg pointing to an OOM kill.
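For reference, this is roughly the control flow I have in mind, paraphrased
and heavily simplified from my reading of try_charge() in mm/memcontrol.c.
It is not a verbatim quote, so the exact conditions and local names may
differ from the source:

        /* Simplified paraphrase of the tail of try_charge(), not verbatim. */
        retry:
                if (consume_stock(memcg, nr_pages))
                        return 0;               /* covered by the per-cpu charge stock */

                if (page_counter_try_charge(&memcg->memory, batch, &counter))
                        goto done_restock;      /* the charge fits under the limit */

                /* over the limit: try direct reclaim within this memcg */
                nr_reclaimed = try_to_free_mem_cgroup_pages(mem_over_limit, nr_pages,
                                                            gfp_mask, may_swap);

                if (mem_cgroup_margin(mem_over_limit) >= nr_pages)
                        goto retry;             /* reclaim made enough room */

                if (nr_retries--)
                        goto retry;             /* bounded number of reclaim retries */

                /* retries exhausted: fall back to the memcg OOM killer */
                oom_status = mem_cgroup_oom(mem_over_limit, gfp_mask,
                                            get_order(nr_pages * PAGE_SIZE));
                switch (oom_status) {
                case OOM_SUCCESS:
                        nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
                        goto retry;             /* the only outcome that loops again */
                case OOM_FAILED:
                        goto force;             /* bypass the limit */
                default:
                        goto nomem;             /* fail the charge */
                }

If that reading is right, staying in this loop after the reclaim retries are
exhausted requires mem_cgroup_oom() to keep returning OOM_SUCCESS, which is
exactly why the lack of any OOM kill in dmesg puzzles me.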
The processes in this memcg are all either running or sleeping (that is,
not exiting or coredumping), so I don't see how or why they could be
selected as OOM-killer victims. I don't see any pending signal in their
/proc/X/status either.

Here is the status of the whole memcg:

$ sudo cat /sys/fs/cgroup/memory/system.slice/osqueryd.service/memory.oom_control
oom_kill_disable 0
under_oom 0
oom_kill 0

$ sudo cat /sys/fs/cgroup/memory/system.slice/osqueryd.service/memory.stat
cache 376832
rss 255864832
rss_huge 0
shmem 0
mapped_file 135168
dirty 135168
writeback 0
pgpgin 5327157
pgpgout 5264598
pgfault 4251687
pgmajfault 7491
inactive_anon 4096
active_anon 256184320
inactive_file 32768
active_file 0
unevictable 0
hierarchical_memory_limit 262144000
total_cache 376832
total_rss 255864832
total_rss_huge 0
total_shmem 0
total_mapped_file 135168
total_dirty 135168
total_writeback 0
total_pgpgin 5327157
total_pgpgout 5264598
total_pgfault 4251687
total_pgmajfault 7491
total_inactive_anon 4096
total_active_anon 256184320
total_inactive_file 86016
total_active_file 0
total_unevictable 0

Please let me know if I can provide anything else.

Thanks.